IndexbyWords // Vstring.SplitToWords( text as string)
Ruslan Zasukhin
sunshine at public.kherson.ua
Sun Nov 20 21:09:03 CST 2005
On 11/20/05 7:08 PM, "jda" <jda at his.com> wrote:
>>>> Unfortunately it turns out that "words" as defined by SplitToString
>>>> include "." and exclude "_".
>>>
>>> I believe they follow natural language rules.
>>>
>>>> The latter is unfortunate for my application. The
>>>> former seems to be unfortunate for most every application I can think of
>>>> that might want to otherwise use IndexedByWords.
>>>
>
> I really hate to contribute to this endless series of messages, but
> Ruslan, be careful what you do. The ICU library is smart and ignores
> periods if they end words, but not if they are in the middle of
> "words". The last time I tested this, "foo." becomes "foo", but
> "foo.bar" stays as "foo.bar". I assume this is mostly done so that
> decimal numbers aren't split into two (21.50 isn't split into 21 and
> 50).
Do not worry, Jon. I am not going to change the behavior of ICU.
The point is to give the DEVELOPER the ability to change/tune something.
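
To make the behavior quoted above concrete, here is a minimal sketch in Python. It does not call ICU or Valentina. It only imitates the three cases Jon reports ("foo." becomes "foo", "foo.bar" stays whole, 21.50 is not split) with a regular expression limited to ASCII letters and digits. The function name split_to_words_sketch and the keep_inner_periods switch are hypothetical, invented here to illustrate the kind of tuning a developer might want. Real ICU word breaking follows the Unicode text segmentation rules and differs in many other cases.

```python
import re

# Hypothetical illustration only: NOT the ICU algorithm and NOT a Valentina API.
# A "word" is a run of ASCII letters/digits; a period is kept only when it has
# letters/digits on both sides (an "inner" period).
_WITH_INNER_PERIODS = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")
_WITHOUT_PERIODS = re.compile(r"[A-Za-z0-9]+")


def split_to_words_sketch(text, keep_inner_periods=True):
    """Return the words found in text.

    keep_inner_periods=True mimics the behavior described in the message:
    trailing periods are dropped, periods inside a word are preserved.
    keep_inner_periods=False is an assumed developer-side tuning that
    treats every period as a separator.
    """
    pattern = _WITH_INNER_PERIODS if keep_inner_periods else _WITHOUT_PERIODS
    return pattern.findall(text)


if __name__ == "__main__":
    sample = "foo. foo.bar costs 21.50."
    print(split_to_words_sketch(sample))
    print(split_to_words_sketch(sample, keep_inner_periods=False))
```

With the default setting the sample yields foo, foo.bar, costs and 21.50. With keep_inner_periods=False it yields foo, foo, bar, costs, 21 and 50, which is the split Jon warns about for decimal numbers.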
--
Best regards,
Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc
Valentina - Joining Worlds of Information
http://www.paradigmasoft.com
[I feel the need: the need for speed]