Index by words and apostrophes

Ruslan Zasukhin sunshine at public.kherson.ua
Thu Nov 30 11:53:14 CST 2006


On 11/30/06 12:45 AM, "Pierre Rossel" <agora07 at prossel.com> wrote:

Hi Pierre,

I have CC'd your question to the ICU list as well, to get the best answer.

What I think is:
    Valentina gives you access to 7-9 parameters of the Locale.

I think that if French has some special rules, then once you set the correct
settings, ICU will do the correct job.

----
For the information of the ICU list: in Valentina we use just WordBreakIterator,
which searches for the boundaries of tokens according to the current Locale settings.
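
To make the question concrete for both lists, here is a minimal sketch of the
two possible tokenizing rules. It is NOT Valentina's or ICU's code: the regular
expressions and the function name index_words are hypothetical and written
only for illustration. It just shows how the indexed words differ when an
apostrophe is kept inside a word and when it is treated as a separator.

```python
import re

# Hypothetical illustration only; not Valentina's or ICU's implementation.

# Rule A: an apostrophe between two letters stays inside the word.
# [^\W\d_] matches any Unicode letter.
WORD_KEEP_APOSTROPHE = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")

# Rule B: an apostrophe always ends the word (acts as a separator).
WORD_SPLIT_APOSTROPHE = re.compile(r"[^\W\d_]+")


def index_words(text, split_on_apostrophe):
    """Return the lower-cased words that would go into a word index."""
    if split_on_apostrophe:
        pattern = WORD_SPLIT_APOSTROPHE
    else:
        pattern = WORD_KEEP_APOSTROPHE
    return [word.lower() for word in pattern.findall(text)]


if __name__ == "__main__":
    for sentence in ("The lion's pride", "L'orgueil du lion"):
        print(index_words(sentence, split_on_apostrophe=False))
        print(index_words(sentence, split_on_apostrophe=True))
```

With rule A, "orgueil" never becomes an index entry of its own, which matches
the behaviour Pierre describes below. With rule B it does, at the price of
also indexing fragments such as "l" and "s".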


> Hello,
> 
> I have noticed that the apostrophe character is not considered as a word
> separator when a text field is indexed by words.
> 
> If the text contains a sentence such as "The lion's pride", a search on the
> exact word "lion" won't match.
> 
> In French, the same sentence would translate to "L'orgueil du lion". Notice
> the apostrophe which separates "L" from "orgueil". "L" means "The" in
> English. A search on "orgueil" won't match the record as "L'orgueil" is
> considered as one word.
> 
> I have reported this as a bug in Mantis
> http://www.valentina-db.com/bt/view.php?id=2008
> 
> This is a real problem for me as some words cannot be found by my search
> engine if they are next to an apostrophe.
> 
> By the way, I have tried to search for the word "orgueil" in the text
> "L'orgueil du lion" with several word processing applications and text
> editors, with the option "whole words only". They all found it, so they all
> consider the apostrophe a word separator. Why should Valentina behave otherwise?
> 
> What do other developers think?
> Is the apostrophe part of a word or not?

-- 
Best regards,

Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc

Valentina - Joining Worlds of Information
http://www.paradigmasoft.com

[I feel the need: the need for speed]



