Sorting compound names

jda jda at his.com
Mon Sep 20 11:08:41 CDT 2004


>I don't wish to belabor the point -- and I promise not to write 
>further on the topic unless you specifically ask.  I'm writing again 
>only because I'm not sure I communicated my point clearly the first 
>time.

I very much appreciate your input.

>
>>"This is one of these anglocentric idiosyncrasies which horrify 
>>Germans, Dutch, Swedes and so on."
>
>Do all of your authors' names really follow the same sorting rules? 
>There are so very many different -- conflicting! -- rules and 
>customs:  sometimes varying by language, sometimes by country, by 
>region, or by culture.  The list I sent the other day was supposed 
>to show the inconsistencies inherent in the preferences of different 
>peoples, not an Anglo predilection for sorting by "whole" last names.
>
>For example, I'm sure you saw that some of the "vans" were sorted 
>under "V", such as
>
>     - Van Devanter, Willis
>     - Van Rensselaer, Stephen
>
>Yes, they're both Americans, and their last names are sorted under 
>"van" -- that's where you'd find them in a biographical dictionary, 
>too.  *But* the list also included
>
>     - Beethoven, Ludwig van
>     - Braun, Wernher von
>
>properly sorted under "B."
>
>I wasn't trying to argue that you should always sort by the whole 
>last name -- rather, that rules and conventions vary incredibly, and 
>that there is no algorithm you could use to ensure that authors are 
>alphabetized in the manner that they would wish.
>
>The db design that would give you the most flexibility for dealing 
>with this inconsistency is to have an author-name-sort non-method 
>field.  Perhaps you could initially populate it using rules that you 
>deem appropriate for the majority of your (Dutch & German?) authors, 
>but then you could easily override the sorting order on a 
>case-by-case basis.
>
>Whatever you decide, good luck with your project!
>

The method route won't work, because the Valentina functions aren't 
robust enough to handle what needs to be done (e.g. multiple words to 
ignore, multiple names in a field, etc.). The tmp table method I 
outlined before will work though, I think.
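
For reference, a minimal sketch of the stored sort-field idea quoted above (not the tmp table method, which isn't reproduced in this message). It uses Python's sqlite3 module purely as a stand-in for Valentina; the table name, column names, and the default rule are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical default rule: drop one leading particle such as "van" or "von".
PARTICLES = ("van", "von")

def default_sort_name(name):
    first, _, rest = name.partition(" ")
    if rest and first.lower() in PARTICLES:
        return rest.lower()
    return name.lower()

conn = sqlite3.connect(":memory:")
# sort_name is an ordinary stored field, so any record can be overridden by hand.
conn.execute("CREATE TABLE author (name TEXT, sort_name TEXT)")

names = ["van Beethoven", "Van Devanter", "von Braun", "Van Rensselaer"]
conn.executemany(
    "INSERT INTO author (name, sort_name) VALUES (?, ?)",
    [(n, default_sort_name(n)) for n in names],
)

# Case-by-case overrides: these two authors are filed under "V".
for n in ("Van Devanter", "Van Rensselaer"):
    conn.execute("UPDATE author SET sort_name = ? WHERE name = ?", (n.lower(), n))

ordered = [row[0] for row in conn.execute("SELECT name FROM author ORDER BY sort_name")]
```

The point of the stored field is that the rule only supplies a starting value; the override is a plain UPDATE and needs no change to the rule itself.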

I agree with your point about inconsistencies between languages, 
regions, etc. My application is international, and is used by folks 
from the US to Europe to Asia.

I would offer "German/Dutch/whatever" sorting as an option the user 
can set at runtime. It will basically ask what words are to be ignored 
when sorting records by name, so it's generic. I imagine most users 
will not want it. But for those who do (e.g. the non-anglophilic 
person you quoted above) it should suffice.
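
A rough sketch of how such a user-supplied ignore list could drive the sort, written in Python rather than Valentina. The function name `sort_key` and the sample names are assumptions for illustration only, and the sketch does not handle multiple names in one field.

```python
def sort_key(name, ignore_words):
    """Return a lowercase sort key with leading ignored words removed."""
    ignored = {w.lower() for w in ignore_words}
    words = name.split()
    # Drop every leading ignored word, but always keep at least one word.
    while len(words) > 1 and words[0].lower() in ignored:
        words.pop(0)
    return " ".join(words).lower()

names = ["van Beethoven", "Van Devanter", "von Braun", "Adams"]

# Option switched off: nothing is ignored, names sort by the whole string.
plain = sorted(names, key=lambda n: sort_key(n, []))

# Option switched on with a user-chosen ignore list.
ignoring = sorted(names, key=lambda n: sort_key(n, ["van", "von"]))
```

With the list set to "van" and "von", Van Devanter moves under "D" along with everyone else, which is exactly the kind of case that still needs a per-record override.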

Thanks again,

Jon

