Optimal Performance String Searching with Hashes Re: VChar vs VText

Ruslan Zasukhin sunshine at public.kherson.ua
Wed Nov 30 18:47:51 CST 2005


On 11/30/05 6:38 PM, "Ed Kleban" <Ed at Kleban.com> wrote:

>> Just IMHO, it is bad design to make keys on strings. :-)
> 
> Because if the indexing mechanism such as hashing is valuable, then the
> database facilities should support it so that the user doesn't have to --
> that being the whole point of a database?  Yeah, sure I can appreciate and
> even support that view.
> 
> The problem is, that off the top of my head, I don't know what "add[ing]
> HashIndex for STRING/VarChar fields" really means ... or why "[for] fields
> which are marked as Primary Key" is relevant since I might well have
> multiple string fields in a record that I want to access with hashes -- not
> likely, but possible.
> 
> hmm... 
> 
> Yeah, I guess you could do this easily enough.  It raises a few questions
> though...
> 
> 1) What syntax do you use?  I suppose you could use EVFlags.fIsHashed on a
> VarChar, String, or Text field.  As a result a separate parallel hash field
> would be created in the magic secret index table land that the user never
> sees.
> 
> Then FindSingle and the other Find commands just work transparently...  Yep.
> Sure.  That would work fine.

right
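
To make the idea concrete, here is a minimal sketch in Python of a string field with a hidden parallel hash index. It is only an illustration, not how Valentina implements or would implement it: the class HashedStringField, its methods, and the choice of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


class HashedStringField:
    """Hypothetical string field with a hidden parallel hash index."""

    def __init__(self):
        self._values = []  # record number -> stored string
        self._index = {}   # 32-bit hash -> list of record numbers

    @staticmethod
    def _hash(s):
        # CRC32 is an arbitrary choice here; any stable hash would do.
        return zlib.crc32(s.encode("utf-8"))

    def add(self, s):
        """Store the string and update the hidden hash index."""
        rec = len(self._values)
        self._values.append(s)
        self._index.setdefault(self._hash(s), []).append(rec)
        return rec

    def find_single(self, s):
        """Return the first matching record number, or None."""
        for rec in self._index.get(self._hash(s), []):
            # Re-check the string: different strings can share a hash.
            if self._values[rec] == s:
                return rec
        return None


field = HashedStringField()
field.add("alpha")
beta = field.add("beta")
print(field.find_single("beta") == beta)
```

The caller only ever passes strings; the hash values stay internal, which matches the point that the Find commands should work transparently.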
 
> The only dicey question is whether there is value in allowing the internal
> hash encodings to be exposed or usable as arguments.  I guess I'd say no,
> and if it really mattered you could simply implement hashes yourself
> manually the way I am now.

Also right.

Or only C++ developers would get access to the hashing.
 
> There is one likely hitch however.  This will only work for certain
> collation attributes unless you always cast a string into some normalized
> form before performing the hash.   So in the case of kStrength, kPrimary for
> example, you'd have to cast the string to a version with no accents and all
> lowercase before calculating the hash.  But yeah, I guess that's readily
> doable as well.

:-)

This needs checking; maybe the IBM guys have hash methods.
They certainly have something.
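
As a rough sketch of the normalization step Ed describes, the Python below strips case and accents before hashing, so that strings which compare equal at primary strength get the same hash. It is only an approximation built on the Python standard library: real collation rules are locale-dependent, and the function names normalize_primary and collation_hash are made up for this example.

```python
import unicodedata
import zlib


def normalize_primary(s):
    """Approximate primary-strength comparison: ignore case and accents."""
    # casefold() removes case distinctions; NFD splits accented letters
    # into a base letter followed by combining marks.
    decomposed = unicodedata.normalize("NFD", s.casefold())
    # Drop the combining marks (accents), keep the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collation_hash(s):
    """Hash the normalized form, so equal-under-collation strings match."""
    return zlib.crc32(normalize_primary(s).encode("utf-8"))


print(collation_hash("Café") == collation_hash("CAFE"))
```

The same normalization has to be applied both when the hash is stored and when a search value is hashed, otherwise lookups will miss.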

> Ok you've convinced me.  Or we've convinced me.  This should indeed be a
> feature added to V2, and can offer dramatic speed improvements to all.
> 
> I'll (eventually) submit a feature request to Mantis.

-- 
Best regards,

Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc

Valentina - Joining Worlds of Information
http://www.paradigmasoft.com

[I feel the need: the need for speed]



