accented characters, SOUNDEX
David Hood
david.hood at stonebow.otago.ac.nz
Tue Jul 27 16:20:29 CDT 2004
On 27/07/2004, at 4:06 PM, Ruslan Zasukhin wrote:
> What is soundex?
>
Soundex is a set of simple rules for identifying variations in English
language names where the name is not spelt in a consistent way. It was
patented in 1918 (well before the modern computer), was later used to
index U.S. census records, and is still commonly used by genealogists
tracing family histories. Some people also use it for dictionary-style
look-ups, but I would not rate it as the best tool for this task.
Basically it codes the consonants of a word so that you can identify
possible variations on that word which share the same code.
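As a rough sketch of that coding, assuming the commonly described
American Soundex rules (keep the first letter, map the remaining
consonants to digits, collapse adjacent repeats, pad to four
characters). The names soundex and CODES are my own for illustration,
and dropping non-ASCII letters is an assumption of this sketch, not
part of the rules:

```python
# Consonant groups that share a digit in American Soundex.
CODES = {}
for digit, group in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in group:
        CODES[ch] = str(digit)


def soundex(name):
    """Return a four-character Soundex code, or "" if name has no A-Z letters."""
    # Assumption: only ASCII letters are considered; accented characters
    # are simply dropped rather than folded to a base letter.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    prev = CODES.get(letters[0], "")  # the first letter's code also blocks repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters with the same code
        code = CODES.get(ch, "")  # vowels and Y have no code and reset prev
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code
    return result.ljust(4, "0")


print(soundex("Robert"), soundex("Rupert"))  # both give R163
```

Two spellings count as possible variations when their codes are equal,
so soundex("Smith") == soundex("Smyth") holds here.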
For more detail, see some notes on phonetic matching I wrote for a
project I work on:
http://caversham.otago.ac.nz/files/working/ctp060902.pdf
This kind of thing is pretty easy to implement as a function in any
kind of database: you just have a field that stores the encoded form
of the word.
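To illustrate the stored-code idea, here is a minimal sketch using
Python's built-in sqlite3 module rather than Valentina. The table
people, the column name_code, and the helpers build_people_db and
find_similar are all hypothetical names chosen for this example, and
the compact soundex function assumes the usual American Soundex rules
on ASCII letters only:

```python
import sqlite3

_CODES = {ch: str(d)
          for d, group in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1)
          for ch in group}


def soundex(name):
    """Compact American Soundex; non-ASCII letters are dropped (an assumption)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result, prev = letters[0], _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code
    return result[:4].ljust(4, "0")


def build_people_db(names):
    """Create an in-memory table that stores each name next to its code."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, name_code TEXT)")
    # Index the encoded field so look-ups by code do not scan the table.
    conn.execute("CREATE INDEX idx_people_code ON people (name_code)")
    conn.executemany(
        "INSERT INTO people (name, name_code) VALUES (?, ?)",
        [(n, soundex(n)) for n in names],
    )
    return conn


def find_similar(conn, query):
    """Return stored names whose code matches the code of the query."""
    rows = conn.execute(
        "SELECT name FROM people WHERE name_code = ? ORDER BY name",
        (soundex(query),),
    )
    return [row[0] for row in rows]


conn = build_people_db(["Smith", "Smyth", "Schmidt", "Jones"])
print(find_similar(conn, "Smythe"))
```

The code is computed once when a row is written, so a search only has
to encode the query and compare it against the indexed field.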
David Hood