OpenBase and its sort table.
Robert Brenstein
rjb at rz.uni-potsdam.de
Mon Sep 15 21:56:30 CDT 2003
>on 9/15/03 12:03, Robert Brenstein at rjb at rz.uni-potsdam.de wrote:
>
>> Ruslan, we have discussed the pros and cons a number of times, so it
>> may be worth your while to search the archives. IMHO you will have to
>> maintain your own sort tables to work reliably cross-platform and
>> cross-environment, particularly when supporting multiple encodings.
>> But you need to provide support for only a few languages in the
>> kernel, and allow that to be extended through plugins or another
>> mechanism. Simple tables will not work for all languages, but you
>> don't need to have such algorithms for all languages right away. You
>> just need to design a good extensible architecture to handle language
>> issues.
>
>Okay Robert,
>Then another issue: unicode.
>
>1) Our own tables can work well for single-byte encodings, where the
>table size is just 256 entries. For Unicode this does not work, of course.
Yes, of course, you need to special-case Unicode.
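As a rough illustration of the single-byte case, a sort table can be a
256-entry list indexed by byte value. The sketch below is an assumption
for illustration only, not how Valentina or OpenBase build their tables:
it targets Latin-1 and fills the weights from Unicode decomposition,
whereas a real table would be tuned by hand for each language. The names
build_sort_table, SORT_TABLE and sort_key are hypothetical.

```python
import unicodedata

def build_sort_table():
    """Return a 256-entry list mapping each Latin-1 byte to a primary sort weight."""
    table = []
    for byte in range(256):
        ch = bytes([byte]).decode("latin-1")
        # Assumption for illustration: take the base letter from the canonical
        # decomposition and fold case, so 'E', 'e' and 'é' share one weight.
        base = unicodedata.normalize("NFD", ch)[0].lower()
        table.append(ord(base))
    return table

SORT_TABLE = build_sort_table()

def sort_key(raw):
    # Primary weights first; the raw bytes break ties deterministically.
    return (tuple(SORT_TABLE[b] for b in raw), raw)

words = [w.encode("latin-1") for w in ["zebra", "école", "Eagle", "apple"]]
ordered = [w.decode("latin-1") for w in sorted(words, key=sort_key)]
print(ordered)
```

Sorting the raw bytes would put "Eagle" first and "école" last; with the
table, the order becomes apple, Eagle, école, zebra. The same approach
cannot simply be scaled up to Unicode, which is why that case needs
separate handling.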
>2) It looks very likely that in one or two years 99% of db users and db
>developers will prefer to use Unicode. Or not?
>Maybe implement only Unicode (UTF-8 and UTF-16) in Valentina 2.0?
>And drop support for single-byte encodings altogether?
Hmm, that could be a solution to simplify things for you, but not
necessarily for all of us:
a) I think that 2 years is a bit optimistic for reaching 99% demand.
Look how many people are still using older systems that are years old.
Yes, most Mac users will abandon System 8 by then (I am sure 9 will
still be actively used) and most Win users will move up from W98.
b) Those of us who release programs for mass markets will need to
support non-Unicode input and output. This will shift
encoding/decoding from the kernel to the application level.
c) While Unicode support is important, I dare say critical for some
applications/markets, I am not sure that Unicode is a solution for
all.
d) I am not sure Unicode will solve all your problems -- you may have
to support only two Unicode encodings but you will still need to deal
with language-specific sorting, and relying on what the operating
system offers may not suffice.
e) You could internally do everything in Unicode and just provide
auto-encoding upon delivery.
But some of this is pure speculation since we can't reliably predict
the future.
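Option e) can be pictured as a thin conversion layer at the boundary:
text is held as Unicode internally and converted only on the way in or
out. This is a minimal sketch under that assumption; accept and deliver
are hypothetical names, not part of any Valentina API, and replacing
unencodable characters with "?" is just one possible policy.

```python
def accept(raw, client_encoding):
    """Decode client bytes into the internal Unicode representation."""
    return raw.decode(client_encoding)

def deliver(value, client_encoding="utf-8"):
    """Encode internal Unicode text for a client; unencodable characters become '?'."""
    return value.encode(client_encoding, errors="replace")

# Text arriving from a classic Mac client, delivered to a Windows client.
stored = accept("Grüße".encode("mac_roman"), "mac_roman")
for_windows = deliver(stored, "cp1252")

# Text that the target encoding cannot represent is degraded, not rejected.
lossy = deliver("Привет", "latin-1")
```

The lossy case shows why point b) still matters: the application has to
decide what to do when the client encoding cannot hold the stored text.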
By the way, I was thinking about what you said, that most db systems
allow specifying the encoding at the database level. While we do need
field-level control for multilingual applications, a database-level
setting would serve as the default for all fields that do not have a
custom encoding/language. For some projects this would surely suffice,
and for others it would add welcome flexibility.
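The fallback rule could look roughly like this. It is a sketch with
hypothetical names (Database, Field, encoding_for), assuming a field
stores no encoding at all unless one is set explicitly:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Field:
    name: str
    encoding: Optional[str] = None  # None means "inherit from the database"

@dataclass
class Database:
    default_encoding: str
    fields: Dict[str, Field] = field(default_factory=dict)

    def add_field(self, name, encoding=None):
        self.fields[name] = Field(name, encoding)

    def encoding_for(self, name):
        # A field-level setting wins; otherwise use the database default.
        fld = self.fields[name]
        return fld.encoding or self.default_encoding

db = Database(default_encoding="latin-1")
db.add_field("title")                 # inherits latin-1
db.add_field("title_ru", "utf-16")    # custom per-field encoding
```

Changing default_encoding later would then affect only the fields that
never set their own.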
Robert