Unicode workaround via UTF-16 - problems
Ruslan Zasukhin
sunshine at public.kherson.ua
Wed Nov 5 20:10:09 CST 2003
on 11/5/03 7:53 PM, Dave Addey at dave.addey at dsl.pipex.com wrote:
> Hi all,
>
> I've read on this list about Unicode coming in v2.0, and I look forward to
> when it does.
>
> In the meantime, I've been trying a workaround. I'd like to store my UTF-16
> strings in Valentina, without Valentina knowing or caring that they are in
> this format. I use REALbasic 5, and can convert back and forth between text
> encodings at will, so I'm able to take a random set of bytes, of known
> string encoding, and use this string in REALbasic.
Then I think you need to use VarBinary or FixedBinary fields.
> In theory, byte-based sorting on a UTF-16 string (stored and referenced as
> bytes) is a valid sorting process.
I think this is not correct, Dave.
From what I have read in the docs of the IBM ICU library, sorting on raw bytes
does NOT work. ICU first converts each string into a special form (a sort key)
that can then be sorted.
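
To illustrate the point about sort keys, here is a minimal Python sketch (not REALbasic, and not ICU). The helper `crude_sort_key` is hypothetical: it only strips accents and case, a rough stand-in for the much richer keys a real collation library builds. It shows that raw UTF-16 byte order and a key-based order can disagree even for simple Western words.

```python
import unicodedata

def crude_sort_key(text: str) -> bytes:
    """Hypothetical, simplified sort key: drop accents and case, then encode.

    A real collation library applies language-specific rules; this only
    demonstrates the "convert first, then compare bytes" idea.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().encode("utf-16-be")

words = ["Zebra", "apple", "\u00e9clair"]  # "\u00e9" is e with acute accent

# Order obtained by comparing the raw UTF-16 (big-endian) bytes.
byte_order = sorted(words, key=lambda w: w.encode("utf-16-be"))

# Order obtained by comparing the derived keys instead.
key_order = sorted(words, key=crude_sort_key)

print(byte_order)
print(key_order)
```

With raw bytes, the uppercase word sorts first and the accented word last; with the derived keys the words come out in the order a reader would expect.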
> Because I can pretty much always assume that each character takes 2 bytes in
> UTF-16, I'm happy that any sort I do will come back accurately enough sorted
> for my liking. Sorting would only 'break' if my string contained characters
> which are outside of the UTF-16 set, and this is unlikely in normal usage (in
> my opinion).
I still think this will not work for some of the harder languages.
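
One concrete case where the "2 bytes per character" assumption fails: characters outside the Basic Multilingual Plane are stored in UTF-16 as a surrogate pair (4 bytes), and their bytes sort below some 2-byte characters. The Python sketch below (an illustration only, with arbitrarily chosen sample characters) compares code point order with UTF-16 big-endian byte order.

```python
bmp_char = "\uff5e"         # FULLWIDTH TILDE, one 16-bit unit in UTF-16
astral_char = "\U0001d11e"  # MUSICAL SYMBOL G CLEF, needs a surrogate pair

bmp_bytes = bmp_char.encode("utf-16-be")
astral_bytes = astral_char.encode("utf-16-be")

print(len(bmp_bytes), bmp_bytes.hex())
print(len(astral_bytes), astral_bytes.hex())

# Python compares str values by code point.
by_code_point = sorted([astral_char, bmp_char])

# Comparing the encoded bytes gives the opposite order, because the
# surrogate pair starts with a byte in the D8..DB range, below FF.
by_bytes = sorted([astral_char, bmp_char], key=lambda c: c.encode("utf-16-be"))

print(by_code_point == by_bytes)
```

For text that stays inside the Basic Multilingual Plane this particular mismatch does not arise, which matches Dave's expectation that it is rare in normal usage.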
> But, my problem is this. When I try and store a Unicode UTF-16 string in a
> Valentina string field, I can't do so for any strings that begin with the
> null character (&h00). Strings that start with other characters are fine
> (e.g. Japanese characters which use both bytes in UTF-16). And since most
> strings in Western alphabets contain mostly ASCII characters, most of my
> UTF-16 strings begin with &h00 .
This is why I say to use FixedBinary and VarBinary fields.
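
A short Python sketch of why the leading &h00 matters. It assumes, without confirmation from the Valentina docs, that the string field reads its value like a null-terminated string; the helper `as_c_string` is hypothetical and only simulates that behaviour. A binary field, by contrast, would keep every byte.

```python
def as_c_string(raw: bytes) -> bytes:
    """Simulate a null-terminated reader: keep only bytes before the first 0x00."""
    return raw.split(b"\x00", 1)[0]

# ASCII text in big-endian UTF-16: every character is 0x00 followed by its code.
western = "Abbey Road".encode("utf-16-be")

# Two Japanese characters whose UTF-16 units contain no zero byte.
japanese = "\u65e5\u672c".encode("utf-16-be")

print(western[:4].hex())             # first two characters as bytes
print(len(as_c_string(western)))     # what a null-terminated reader keeps
print(as_c_string(japanese) == japanese)
print(len(western))                  # what a length-aware binary field keeps
```

Under that assumption the Western title is seen as empty while the Japanese one survives intact, which is the behaviour Dave describes.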
> So, for example:
>
> LibraryDB.ImportedSongsTable.Album.Value = mysong.Album
>
> ...where LibraryDB.ImportedSongsTable.Album is a VVarChar, and mysong.Album
> is a UTF-16 string.
>
> In this example, if the first byte of mysong.Album is &h00, the value of
> LibraryDB.ImportedSongsTable.Album.Value is nil, even though there are other
> characters after the initial &h00 value in mysong.Album .
>
> All I really want is to transfer some bytes into a Valentina field (I'm
> assuming this would be a string) and get them back out again, with byte
> sorting on this field. Is there a way to do this in the current release?
>
> This would allow me to add Unicode support to my app before it is available
> 'native' in Valentina.
>
> BTW, I'm using Valentina 1.9.7, REALBasic 5.2.1 on Mac OS X 10.2.8.
--
Best regards,
Ruslan Zasukhin [ I feel the need...the need for speed ]
-------------------------------------------------------------
e-mail: ruslan at paradigmasoft.com
web: http://www.paradigmasoft.com
To subscribe to the Valentina mail list go to:
http://lists.macserve.net/mailman/listinfo/valentina
-------------------------------------------------------------