splitToWords question
jda
jda at his.com
Sun Aug 7 07:44:49 CDT 2005
>> I discovered the SplitToWords function in your examples, but nowhere in
>> the documentation. Nice feature, but what is it for?
>
>This was made at the request of Jon (jda).
>
>Jon, can you describe how to use it?
This is a very helpful function that I use to obtain words for
indexed searches. It follows the same rules as the Valentina field it
is a method of, so the word boundaries it finds match those of the
field's index. That way I don't have to know all the characters that
Valentina/ICU considers word breaks. Before, I was guessing and
replacing dozens of characters with spaces (periods, commas,
semicolons, etc.), which was inefficient and in some cases wrong.
Also, there are tricky ICU exceptions that SplitToWords takes care of
for you. For example, a period is a word break character if it is
followed by a space:
...and so on. Furthermore,...
But it is NOT a word break character if it is followed by a letter or number:
$12.23
SplitToWords knows this and returns the correct words for indexed searching.
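
To illustrate the difference, here is a rough Python sketch. It does not call Valentina or ICU; naive_split and split_to_words_sketch are hypothetical names, and the regular expression only imitates the single period rule described above (a period stays inside a word when it sits directly between letters or digits). The real ICU rules cover many more cases.

```python
import re

# Hypothetical stand-ins for illustration only; neither is Valentina's SplitToWords.

def naive_split(text):
    # The old approach: replace a hand-picked set of punctuation with spaces.
    return re.sub(r"[.,;:!?$]", " ", text).split()

# A word is a run of letters/digits; a period is kept only when it is
# immediately followed by another letter or digit.
_WORD = re.compile(r"[^\W_]+(?:\.[^\W_]+)*")

def split_to_words_sketch(text):
    return _WORD.findall(text)

print(naive_split("$12.23"))                                   # ['12', '23']
print(split_to_words_sketch("$12.23"))                         # ['12.23']
print(split_to_words_sketch("...and so on. Furthermore,..."))  # ['and', 'so', 'on', 'Furthermore']
```

The naive version splits 12.23 into two words, so a search term built that way would not match a single indexed word 12.23. That is the kind of mismatch SplitToWords avoids.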
This should be documented, of course -- I'm sure others would find it useful.
Jon