Default maximal length for VarChar Re: VChar vs VText

Ed Kleban Ed at Kleban.com
Sun Nov 27 19:57:00 CST 2005




On 11/26/05 1:31 AM, "Ruslan Zasukhin" <sunshine at public.kherson.ua> wrote:

> On 11/26/05 5:35 AM, "Ed Kleban" <Ed at Kleban.com> wrote:
> 
>> I was using 504 because I was thinking about UTF-8 strings.  But UTF-8
>> doesn't work well in V4RB v2.0 correct?  So should I change to using 1022?
> 
> Yes for TF16
You meant "Yes for UTF-16", presumably.

> 
> Actually in v2 for UTF8 the best size of VarChar will be 2044.
> 

Why?  Does this have to do with the natural page size of Mac OS X or
something?

I understood the logic described in the Kernel manual which makes good
sense:

" The default maximal length for VarChar field is 504 bytes. This is the
maximum number that allows us to work with 1,024 byte pages. Indeed:
 8 bytes of header + 2 records *  (504 + 4) = 1024 bytes.
You can specify lower values for maximal length (e.g., 20), but 1,024 byte
pages will still be used. The only advantage is that you will truncate
longer strings to 20 bytes.
"
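
The manual's arithmetic can be checked with a short sketch. The function
name and parameter names below are made up for illustration and are not
part of any Valentina API; the 8-byte header, the 4 bytes of per-record
overhead, and the 2 records per page are taken from the passage quoted
above.

```python
def max_varchar_len(page_size=1024, header=8, records_per_page=2,
                    per_record_overhead=4):
    """Largest VarChar length (in bytes) that lets the records fill one page.

    Hypothetical helper: it only restates the manual's formula
    header + records * (length + overhead) = page_size, solved for length.
    """
    usable = page_size - header               # bytes left after the page header
    per_record = usable // records_per_page   # bytes available to each record
    return per_record - per_record_overhead   # bytes left for the string itself


length = max_varchar_len()
print(length)                  # 504
print(8 + 2 * (length + 4))    # 1024
```

This only confirms that 504 is consistent with a 1,024 byte page under
the manual's stated layout; it says nothing about what V2 actually does.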

Is this no longer true?  Has the change to Unicode in V2 changed the
default maximal length?  Is the Kernel manual incorrect?

If the minimal page size is still 1024 bytes, then wouldn't that mean, since
each character now takes up 2 bytes when stored in UTF-16, instead of one
byte as when previously stored in UTF-8, that the default maximal length for
a VarChar field should now be 504 / 2 = 252 bytes?
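
To make the halving concrete, here is a rough sketch. It assumes the
504 figure is a byte budget, that every character is plain ASCII (one
byte in UTF-8), and that UTF-16 uses exactly two bytes per character
(no surrogate pairs). The helper name is hypothetical and not anything
from Valentina.

```python
def chars_that_fit(max_bytes, bytes_per_char):
    """Whole characters that fit in max_bytes at a fixed width per character."""
    return max_bytes // bytes_per_char


sample = "Valentina"  # plain ASCII sample text, 9 characters
utf8_bytes = len(sample.encode("utf-8"))       # 1 byte per ASCII character
utf16_bytes = len(sample.encode("utf-16-le"))  # 2 bytes per character, no BOM

print(utf8_bytes, utf16_bytes)   # 9 18
print(chars_that_fit(504, 1))    # 504
print(chars_that_fit(504, 2))    # 252
```

Under those assumptions the same 504 bytes hold only 252 UTF-16
characters, which is where the 252 figure comes from.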

I don't have enough detail yet for this to make sense.

> 
>> Also, when I get a string into a RB variable with GetString, is there a way
>> to retrieve a UTF-8 string?
> 
> You get it as UTF8 right now !!!
> 
> REALbasic works with UTF8 by default.
> 
>> Is it slower if I do, rather than retrieving
>> UTF-16 strings?
>> 
>> Either way, I'm sure glad I'm doing my searches on string hashes rather than
>> actual strings.



