OpenBase and its sort table.

Robert Brenstein rjb at rz.uni-potsdam.de
Mon Sep 15 21:56:30 CDT 2003


>on 9/15/03 12:03, Robert Brenstein at rjb at rz.uni-potsdam.de wrote:
>
>>  Ruslan, we have discussed the pros and cons a number of times, so it
>>  may be worth your while to search the archives. IMHO you will have to
>>  maintain your own sort tables to work reliably across platforms and
>>  environments, particularly when supporting multiple encodings.
>>  But you need to support only a few languages in the kernel, and can
>>  allow extending that through plugins or another mechanism. Simple
>>  tables will not work for all languages, but you don't need to have
>>  such algorithms for all languages right away. You just need to design
>>  a good extensible architecture to handle language issues.
>
>Okay Robert,
>Then another issue: Unicode.
>
>1) Our own tables can work well for single-byte languages, where the table
>size is just 256 chars. For Unicode this does not work, of course.

Yes, of course, you need to special-case Unicode.
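
To illustrate point 1, here is a rough Python sketch of what a 256-entry
sort table for one single-byte encoding might look like. Latin-1 and the
accent- and case-folding rule are my assumptions for the example, and the
names build_latin1_sort_table and sort_key are hypothetical, not anything
in Valentina. The approach does not carry over directly to Unicode, where
a flat table would need far more than 256 entries.

```python
import unicodedata


def build_latin1_sort_table() -> bytes:
    """Return a 256-entry table mapping each Latin-1 byte to a sort weight."""
    table = []
    for byte in range(256):
        ch = bytes([byte]).decode("latin-1")
        # Assumption: accents are ignored, so keep only the base character.
        base = unicodedata.normalize("NFD", ch)[0]
        # Assumption: case is ignored, so 'A' and 'a' share one weight.
        folded = base.lower()
        # Fall back to the raw byte if folding leaves the single-byte range.
        if len(folded) == 1 and ord(folded) < 256:
            table.append(ord(folded))
        else:
            table.append(byte)
    return bytes(table)


SORT_TABLE = build_latin1_sort_table()


def sort_key(raw: bytes) -> bytes:
    # bytes.translate applies the whole 256-entry table in one pass.
    return raw.translate(SORT_TABLE)


words = [b"zebra", b"\xe9clair", b"Apple", b"eagle"]  # \xe9 is e-acute
ordered = sorted(words, key=sort_key)
```

With a plain byte comparison the accented word would sort after "zebra";
with the table it lands between "eagle" and "zebra".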

>2) It looks very likely that in, say, one or two years 99% of db users and
>db developers will prefer to use Unicode. Or not?
>Maybe implement only Unicode (UTF-8 and UTF-16) in Valentina 2.0?
>And drop support for single-byte encodings altogether?

Hmm, that could be a solution to simplify things for you, but not 
necessarily for all of us:

a) I think that 2 years is a bit optimistic for 99% demand. Look 
how many people are still using older systems that are a few years old. 
Yes, most Mac users will abandon System 8 by then (I am sure 9 will 
still be actively used) and most Win users will move up from W98.

b) Those of us who release programs for mass markets will need to 
support non-Unicode input and output. This will shift 
encoding/decoding from the kernel to the application level.

c) While Unicode support is important, I dare say critical for some 
applications/markets, I am not sure that Unicode is a solution for 
all.

d) I am not sure Unicode will solve all your problems -- you may have 
to support only two Unicode encodings but you will still need to deal 
with language-specific sorting, and relying on what the operating 
system offers may not suffice.

e) You could internally do everything in Unicode and just provide 
automatic conversion to the requested encoding upon delivery.
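
Point e) could look roughly like the Python sketch below: text is decoded
to Unicode at the input boundary and encoded again only when it is handed
back to the caller. TextStore and its put/get methods are hypothetical
names for illustration, and replacing unrepresentable characters with "?"
is just one possible policy, not a recommendation.

```python
class TextStore:
    """Hypothetical store that keeps all text as Unicode internally."""

    def __init__(self) -> None:
        self._rows = []

    def put(self, raw: bytes, encoding: str) -> int:
        # Decode once on the way in; everything inside is Unicode.
        self._rows.append(raw.decode(encoding))
        return len(self._rows) - 1

    def get(self, index: int, encoding: str) -> bytes:
        # Encode on delivery. Assumption: characters the target encoding
        # cannot represent are replaced with '?' instead of raising.
        return self._rows[index].encode(encoding, errors="replace")


store = TextStore()
row = store.put(b"\xe9t\xe9", "latin-1")   # "ete" with acute accents
as_utf8 = store.get(row, "utf-8")
as_mac = store.get(row, "mac_roman")
```

The same stored value can be delivered to a Windows client as cp1252 and
to a classic Mac client as MacRoman without the application doing any
conversion itself.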

But some of this is pure speculation since we can't reliably predict the future.

By the way, I was thinking about what you said, that most db systems 
allow specifying the encoding at the database level. While we do need 
field-level control for multilingual applications, a database-level 
setting could serve as the default for all fields that do not have 
a custom encoding/language. For some projects, this would surely 
suffice, and for others it would add welcome flexibility.
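
A minimal Python sketch of that fallback rule, assuming a field simply
inherits the database setting when it has no encoding of its own. The
Database and Field classes and the encoding_for method are made up for
the example and do not reflect any actual Valentina API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Field:
    name: str
    # None means "no custom encoding, inherit the database default".
    encoding: Optional[str] = None


@dataclass
class Database:
    default_encoding: str
    fields: Dict[str, Field] = field(default_factory=dict)

    def add_field(self, name: str, encoding: Optional[str] = None) -> Field:
        new_field = Field(name, encoding)
        self.fields[name] = new_field
        return new_field

    def encoding_for(self, name: str) -> str:
        # Field-level setting wins; otherwise use the database-level one.
        custom = self.fields[name].encoding
        return custom if custom is not None else self.default_encoding


db = Database(default_encoding="utf-8")
db.add_field("title")                       # inherits utf-8
db.add_field("title_ru", encoding="koi8-r")  # field-level override
```

Here db.encoding_for("title") falls back to the database default, while
db.encoding_for("title_ru") returns the field's own setting.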

Robert


More information about the Valentina mailing list