V4RB, Jon, project
jda
jda at his.com
Tue Sep 14 17:50:00 CDT 2004
>
> >
>> Why? UTF-8 is the "native" format for RB.
>
>I knew you would say this.
Of course you did! :->
>
>> It also, for Western
>> languages, usually requires only a little more storage than UTF-16.
>> Then what makes it a bad choice for Valentina Developers (especially
>> RB ones)?
>
>Right, and for e.g. Cyrillic it will eat 2 bytes per char.
And for Japanese even more than 2. But I'd like it to be my choice...
>
>So what do we get then?
>
> If they all make Vstring(50) as UTF16
> then they all can store 50 chars.
>
> A German/USA developer makes Vstring(50) as Latin1
> he can really store 50 chars of English or German here
>
> A Russian developer makes Vstring(50) as Cyrillic-win
> he can store 50 chars here.
>
> A German/USA developer makes Vstring(50) as UTF8
> he can really store 50 chars of English or German here
>
> A Russian developer makes Vstring(50) as UTF8
> he can store only 25 chars here. <<<<<<<<<<< OOPS
>
>Inconsistency.
>
>We want, and we think it is correct, to write in the docs
>
> Vstring( MaxCharsCount )
>
>
>The problem with UTF8 is that it uses a variable number of bytes per char.
>We cannot guarantee that if you make a UTF8 Vstring(50)
>you will be able to store 50 chars of any language in it.
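To put numbers on this, here is a minimal Python sketch (not Valentina or RB code). It assumes a UTF8 Vstring(50) is simply a 50-byte budget, which is what the 25-char Cyrillic example above implies, and that a UTF16 Vstring(50) is a 100-byte budget. The helper name chars_that_fit is made up for illustration.

```python
# Sketch only: treats a fixed-width field as a plain byte budget (an assumption).
# chars_that_fit is a hypothetical helper, not a Valentina or RB call.

def chars_that_fit(text: str, encoding: str, byte_budget: int) -> int:
    """Count how many leading characters of text fit in byte_budget bytes."""
    used = 0
    count = 0
    for ch in text:
        size = len(ch.encode(encoding))  # bytes this one character needs
        if used + size > byte_budget:
            break
        used += size
        count += 1
    return count

# 50 characters each of a 1-byte, 2-byte, and 3-byte (in UTF-8) character.
samples = {
    "English": "a" * 50,
    "Russian": "я" * 50,
    "Japanese": "日" * 50,
}

for name, text in samples.items():
    utf8 = chars_that_fit(text, "utf-8", 50)        # assumed 50-byte field
    utf16 = chars_that_fit(text, "utf-16-le", 100)  # assumed 100-byte field
    print(f"{name}: UTF-8 fits {utf8}, UTF-16 fits {utf16}")
```

Under those assumptions the UTF-8 field holds 50 English, 25 Russian, or 16 Japanese characters, while the UTF-16 field holds 50 of each.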
>
>In the end, why do we use Unicode?
>To be able to store any language.
>
>If you want to store only German or only English, then use Latin1.
>If you really want to store any language, then use UTF16.
Because some of us do not know in advance what users will want
to store, but *most* will be some variant of a Western language. If
hits to my web site are any indication, 90% or more are using a
Western language primarily, and 10% or so use Japanese (primarily).
But many Western language users mix in the occasional Japanese,
Greek, Hebrew, or whatever.
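As a rough illustration of that mixed case, this Python sketch compares the encoded size of one mostly-Western string containing two Japanese characters. The sample text and the helper name encoded_size are invented for the example; nothing here is Valentina-specific.

```python
# Sketch only: compares raw encoded sizes, ignoring any per-field overhead
# a real database would add. encoded_size is a hypothetical helper.

def encoded_size(text: str, encoding: str) -> int:
    """Return the number of bytes text occupies in the given encoding."""
    return len(text.encode(encoding))

# Mostly ASCII, with two Japanese characters mixed in.
sample = "Meeting notes: see 東京 office"

utf8_bytes = encoded_size(sample, "utf-8")
utf16_bytes = encoded_size(sample, "utf-16-le")  # "-le" avoids counting a BOM

print(f"{len(sample)} chars: UTF-8 = {utf8_bytes} bytes, UTF-16 = {utf16_bytes} bytes")
```

For this sample the UTF-8 form is 32 bytes and the UTF-16 form is 56 bytes, so the occasional 3-byte character costs little when most of the text is ASCII.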
>
>
>We have discussed this in depth here.
>Vstring -- causes the biggest problem for UTF8
>VarChar -- so-so. If you write strings close to the max limit, they again may
>not fit into the declared size.
>Vtext -- does not have problems.
>
Any reason, in principle, that we can't mix encodings in a single
database -- use UTF-16 for VStrings and UTF-8 for VText? Does this
have to be database-wide -- can't it be field-specific, like language
is now?
Anyway, if you are saying this as a warning, but we can still use
UTF-8 if we want, then point taken, and thanks.
Jon