2.0
jda
jda at his.com
Thu Mar 11 12:02:36 CST 2004
>>>I suppose I don't get the point of the entire discussion. It
>>>appears to me that UTF-16 is a better choice than UTF-8 for
>>>Valentina.
>>>
>>
>>Why?
>
>First of all, the storage requirements of UTF-8 v. UTF-16 aren't
>important; it wouldn't surprise me if the differences were dominated
>by the overhead of Valentina's own filesystem format. Disk space is
>cheap. What does matter is speed, and it appears to me that UTF-16
>is better suited for the sort of string manipulation required for
>indexing and other such database operations. You might take a look
>at <http://www.unicode.org/notes/tn12/>.
>
If Valentina can offer both, I suggest that the developer should
decide which better suits his users' needs (one could even let the
user specify what storage to use as a preference). Ruslan has already
said that all internal operations will use UTF-16 (as do Mac OS X
and Windows). The issue of UTF-8 is really only about storage. As I
said before, if supporting UTF-8 as a storage option raises
significant problems (in implementation, indexing, performance, etc.)
then let it be UTF-16 everywhere. Since Ruslan hasn't yet tackled
text indexing issues, I guess we won't know for a bit.
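
To put rough numbers on the storage side, here is a minimal Python
sketch. It is not Valentina code; the sample strings and the helper
name storage_bytes are made up for illustration. It compares the byte
counts of the same text encoded as UTF-8 and as UTF-16, without a BOM:

```python
# Illustrative only: sample strings are assumptions, not Valentina data.
samples = {
    "ascii": "Valentina database",
    "cyrillic": "Руслан",
    "cjk": "日本語",
}


def storage_bytes(text):
    """Return (utf8_bytes, utf16_bytes) for text, encoded without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for name, text in samples.items():
    u8, u16 = storage_bytes(text)
    print(f"{name}: UTF-8 = {u8} bytes, UTF-16 = {u16} bytes")
```

ASCII-heavy text takes about half the space in UTF-8, Cyrillic comes
out even, and CJK text is actually smaller in UTF-16, so which
encoding wins on disk depends on the users' data.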
Jon