IndexStyle
Robert Brenstein
rjb at robelko.com
Wed Nov 3 15:59:21 CST 2004
>>-- I think maybe Valentina should by default use a style with a length
>>limit of at least 2 or 3, or maybe even 4? Or better to leave this to
>>the developer?
>
>I'd be inclined to want everything indexed unless I as a developer
>explicitly overrode that behavior. But I suppose that having a
>default length limit of 2 or maybe 3 would be OK -- as long as it
>was very clearly documented and overrideable.
While 2-3 char words in English can commonly be excluded when
indexing generic texts, this may or may not be true for other
languages. Furthermore, for some texts it may be desirable to index
2-3 char acronyms, not to mention the number of 3-char English words
that are proper nouns or verbs and should normally be included. And
for scientific texts, 1-2 char words can be perfectly legitimate (like
element symbols). A solution used by Atomz, for example, is to let us
provide an explicit list of words to be excluded.
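
To make the difference concrete, here is a rough sketch in Python. It
is not Valentina's or Atomz's API; the function index_terms and its
parameters stop_words and min_length are hypothetical names made up
for illustration, and the tokenizer is deliberately simplistic
(ASCII letters and digits only, everything lowercased).

```python
import re


def index_terms(text, stop_words=frozenset(), min_length=1):
    """Return the set of lowercased words that would be indexed.

    Hypothetical helper for illustration only: a word is kept when it
    is at least min_length characters long and is not in stop_words.
    """
    words = re.findall(r"[A-Za-z0-9]+", text)
    excluded = {w.lower() for w in stop_words}
    return {
        w.lower()
        for w in words
        if len(w) >= min_length and w.lower() not in excluded
    }


text = "The pH of Fe and Na in the UN lab is as expected"

# A length limit of 4 drops the element symbols and the acronym too.
by_length = index_terms(text, min_length=4)

# An explicit exclusion list drops only the words we name.
by_list = index_terms(text, stop_words={"the", "of", "and", "in", "is", "as"})

print(sorted(by_length))
print(sorted(by_list))
```

With the length limit, "Fe", "Na", "pH", "UN" and "lab" all vanish
from the index along with the filler words; with the explicit list
they survive.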
Robert