Re: Question to French people about 'Œuvres' and 'Oeuvres'

Thorsten Hohage thohage at genericobjects.de
Fri Oct 26 15:12:54 CDT 2012


Hi,

On 2012-10-26, at 21:45, Robert Brenstein <rjb at robelko.com> wrote:

> On 26.10.2012 at 22:39 Uhr +0300 Ruslan Zasukhin apparently wrote:
>> Hi Guys,
>> 
>> I did work on this bug report
>>    http://www.valentina-db.com/bt/view.php?id=5780
>> 
>> And finally come to understanding that issues comes from very special case:
>>  these two French words  'Œuvres'  and 'Oeuvres'
>> 
>> Are the same.  ICU collators says they are equal.
>> 
>> But they have different length. 6 and 7 chars.
>> So our guess in code that we can compare at first lengths was wrong.
> 
> There are more such things in various languages not only French
> 
> http://en.wikipedia.org/wiki/Typographic_ligature
> 
> Many language handling programs also recognize that ü = ue for example.
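
To make the length problem concrete, here is a minimal Python sketch. It is not ICU and not Valentina's code: the EXPANSIONS table and the function names are made up for illustration, and case folding is only a crude stand-in for what a real collator does. It just shows how a length pre-check rejects two strings that an expansion-aware comparison treats as equal.

```python
# Hypothetical expansion table: the ligature is treated as its two-letter spelling.
EXPANSIONS = {"Œ": "OE", "œ": "oe"}

def comparison_key(text):
    """Expand ligatures, then case-fold (a crude stand-in for a collator)."""
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in text)
    return expanded.casefold()

def equal_with_length_shortcut(a, b):
    # The flawed optimisation: reject early when the raw lengths differ.
    if len(a) != len(b):
        return False
    return comparison_key(a) == comparison_key(b)

def equal_without_shortcut(a, b):
    # Compare the expanded keys only; raw length is never consulted.
    return comparison_key(a) == comparison_key(b)

ligature = "Œuvres"   # 6 characters
spelled = "Oeuvres"   # 7 characters
```

With these definitions, equal_with_length_shortcut(ligature, spelled) returns False while equal_without_shortcut(ligature, spelled) returns True, so the shortcut is only safe when the comparison never expands or contracts characters.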

I'm German, so my information may be wrong …

The situation gets even worse. AFAIK Sweden decided to simplify the handling of umlauts in digital media, so they now define

	Göteborg

to become

	Goteborg

by simply replacing every umlaut with the plain vowel, without the dots. So Ü = ue is not a universal rule; it depends on the locale setting.


Furthermore, decomposing umlauts can cause more issues. German has several established sort orders, and in some of them "Fü" comes after "Fuz" instead of being handled like "Fue". Btw, historically the umlauts came at the end of the alphabet: x, y, z, ä, ö, ü … really strange :)
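
A small Python sketch of how these conventions diverge. The table names, the key functions and the word list are invented for illustration and are not taken from ICU or from any official standard; the only point is that the same strings get different keys, and therefore a different order, depending on which convention is picked.

```python
# Hypothetical folding tables, one per convention mentioned above.
BASE_VOWEL = {"ä": "a", "ö": "o", "ü": "u"}    # drop the dots (Göteborg -> Goteborg)
EXPAND_E = {"ä": "ae", "ö": "oe", "ü": "ue"}   # umlaut written as vowel + e

def fold_key(table):
    """Build a sort key that lowercases a word and applies one folding table."""
    def key(word):
        return "".join(table.get(ch, ch) for ch in word.lower())
    return key

def umlauts_last_key(word):
    # Alphabet with the umlauts placed after z: ..., x, y, z, ä, ö, ü.
    # Sketch only: any character outside this alphabet raises ValueError.
    alphabet = "abcdefghijklmnopqrstuvwxyzäöü"
    return [alphabet.index(ch) for ch in word.lower()]

words = ["Fuz", "Fü", "Fue", "Fud"]

expanded_order = sorted(words, key=fold_key(EXPAND_E))
umlauts_last_order = sorted(words, key=umlauts_last_key)
```

Under the "ue" convention "Fü" sorts next to "Fue" and before "Fuz"; with the umlauts placed after z it ends up behind "Fuz". The two tables also give "Göteborg" two different keys ("goeteborg" and "goteborg"), so no single hard-coded rule covers every locale.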



regards,

Thorsten Hohage
-- 

Valentina Technology Evangelist
generic objects  GmbH - Leiter Solution Center Nord


