splitToWords question

jda jda at his.com
Sun Aug 7 07:44:49 CDT 2005


>> I discovered the SplitToWords function in your examples, but nowhere in
>> the documentation. Nice feature, but what is it for?
>
>This was made on request of Jon (jda)
>
>Jon, can you describe how to use it ?

This is a very helpful function that I use to obtain words for 
indexed searches. It obeys the same rules as the Valentina field it 
is a method of, so the word boundaries it produces match those of the 
field's index. That way I don't have to know all the characters that 
Valentina/ICU considers word breaks. Before, I was guessing and 
replacing dozens of characters with spaces (periods, commas, 
semicolons, etc.), which was inefficient and in some cases wrong.

Also, there are tricky ICU exceptions that SplitToWords takes care of 
for you. For example, a period is a word break character if it is 
followed by a space:

...and so on. Furthermore,...

But it is NOT a word break character if it is followed by a letter or number:

$12.23

SplitToWords knows this and returns the correct words for indexed searching.
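
To make the period rule concrete, here is a rough Python sketch. It is 
not Valentina's SplitToWords and it does not call ICU; the function 
name split_to_words_sketch and the regular expression are made up for 
illustration. It assumes ASCII text, and it assumes a period is kept 
only when it sits between two letters or digits. Everything else, 
including the "$", is treated as a break here, which may not match 
what the real index does.

```python
import re

# Toy approximation of the period rule only (hypothetical, ASCII-only).
# A word is a run of letters/digits; a period is kept inside the word
# only when another letter or digit follows it immediately.
_WORD = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def split_to_words_sketch(text):
    """Return the words found in text under the toy rule above."""
    return _WORD.findall(text)


if __name__ == "__main__":
    # Period followed by a space: treated as a word break.
    print(split_to_words_sketch("...and so on. Furthermore,..."))
    # Period followed by a digit: kept inside the word.
    print(split_to_words_sketch("$12.23"))
```

The point of SplitToWords is that you never have to write or maintain 
a pattern like this yourself: the field applies its own rules, so the 
words you search for always match the words in the index.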

This should be documented, of course -- I'm sure others would find it useful.

Jon

