DRAFT of specification of Indexing By Words for strings.

Thu Sep 23 08:46:34 CDT 2004

>Also, there may be situations where a field 
>defined as, for example, English may contain 
>words from other languages, in which case the 
>IBM ICU might cause the word "rechêrche" (for 
>example) to be indexed as "rech" and "rche," as 
>(I believe) the Valentina 1.x kernel does 
>(certainly it does this with many "accent" 
>characters).  If we could have the capability of 
>adding "ê" (in this case) to a set of word-break 
>characters, we could fine-tune the ICU as 
>appropriate for our specific applications.
>
>Do others see a need for this, too, or is this my own pipe-dream?
>

I certainly do. I am in exactly the same situation 
you are in -- the default language is set to English, 
but my users are international (and even the ones 
using English often enter words/names that 
include accented characters).
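
To make the difference concrete, here is a minimal sketch in plain Python 
(not ICU and not the Valentina kernel; the function names are made up for 
illustration). It contrasts a tokenizer that only knows ASCII letters with 
one that accepts any Unicode letter as part of a word:

```python
import re
import unicodedata

def ascii_words(text):
    # Hypothetical ASCII-only tokenizer: any character outside A-Z/a-z
    # ends the current word, so an accented letter acts as a word break.
    return re.findall(r"[A-Za-z]+", text)

def unicode_words(text):
    # Hypothetical Unicode-aware tokenizer: normalize to NFC so that
    # "e" + combining circumflex becomes the single letter U+00EA,
    # then match runs of letters (word characters minus digits and "_").
    text = unicodedata.normalize("NFC", text)
    return re.findall(r"[^\W\d_]+", text)

word = "rech\u00earche"  # the example word from the quoted message
print(ascii_words(word))    # ['rech', 'rche']
print(unicode_words(word))  # one token, the whole word
```

The first function reproduces the split described above; the second keeps 
the word whole regardless of the language the field is declared as.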

Jon