Binary links Re: Question Backlog for Valentina mailing list.

Ruslan Zasukhin sunshine at public.kherson.ua
Wed Nov 16 20:59:41 CST 2005


On 11/16/05 5:52 PM, "Ed Kleban" <Ed at Kleban.com> wrote:

>>> They do effectively the same as you do in Relational model with help of
>>> third Table to establish M:M link.
>>> 
>>> What do this table? It just remember pairs  (id1, id2)
>>> 
>>> BinaryLink do the same, but use RecIDs and it is NOT a table.
>>> It is more effective structure.
>> 
>> Hmm, more effective than a table.  Meaning not a linear list of pairs, or
>> meaning not a list that's the same length as one of the two related tables.
>> The former I guess.  Hmm... Well, you could certainly make it more efficient
>> by keeping two separate indexed lists so you could do a binary search in
>> either direction.  And you could make it more efficient with an indexed list
>> that pointed to a "bucket" with a block of corresponding related entries so
>> that when doing a binary search you didn't have the overhead of duplicates,
>> and once you did the seach you had a linear list of corresponding records.
>> Yeah, that's probably how I'd do it... if I could avoid the temptation of
>> hashing the indexes to avoid the log2N cost of the binary search for large
>> sets of links. 
> 
> Ok, I've had some time to sleep on this, and am getting excited!
 
> And of course I have more questions.  :)

Great! I like how you think!

 
> Let's run through an example.  Say I have a table of people that has a
> recursive relationship for "MotherOf" as a M:1 relationship.  If I use an
> ObjectPtr field for the MotherOf relationship, then finding the mother for
> any given record is EXCEEDINGLY fast.  I simply go to the MotherOf field,
> and there's the record ID I need for the mother.  And if I intend to record
> a mother for every person in the table then the representation is also
> exceedingly efficient.
> 
> I could make it more space efficient, and possibly less speed efficient by
> using a UShort field instead of an ObjectPtr if I knew, say I was only going
> to have, say 20,000 records maximum.

This will speed up also, because HDD -- main break in db.
 
>     Question 1: Would use of a UShort field here for 16-bit encoding
>     of RecID actually be slower than use of an ObjectPtr?

NO.

And we have think about giving to developer such ability to specify type of
ObjectPtr. This bring few problems:
    - mistake of developer
    - type of ObjectPtr different, then how right correct code?
 
And honestly was no time to go so deeply.
But yes in Valentina 2 you can use FOREIGN KEY now, it works as link,and it
have constraints. So in general you loose nothing here.

------------------------
> In the opposite direction, if I wanted to find all the children of a given
> mother, it would take much more time since the DB would have to scan every
> record to look for a matching value in the MotherOf field for a given mother
> record RecId.  For N records the order of performance of the linear search
> would be o(N).

But usually you have INDEX for such fields. And ObjectPtr is always indexed
and FK fields also.

------------
> Now lets consider Binary Links.  And since you've not yet commented on my
> supposition above, let's assume that the implementation I postulated is
> indeed close to what Valentina implements; Namely for a M:1 relationship it
> keeps separate "info" for the "M->1" and the "1->M" directions, and that for
> the "1->M" direction in fact the execution requires a single binary search
> of an indexed list of, in this case known mother records, and results in
> obtaining a list of all the associated children.  In this direction the
> performance is much better then because: The DB does a single binary search
> on a small list of records known to be mothers and ends up immediately with
> the same result as would require scanning every record of the table if an
> ObjectPtr field was used.
> 
>     Question 2:  Am I pretty close so far?

Yes, only remember that ObjectPtr is indexed. So search will be the same -
binary by index.

ObjectPtr and Binary Link here are equal by speed.

 
> In the reverse direction, find the mother of a given child, I can't conceive
> of anything special the binary link could do that would be any faster than
> the ObjectPtr case.

Right, and now it is ieven slow, because we have implement for both
direction only the general M M way. We still can improve things to get jump
in director ToOne as fast and by ObjectPtr.

> So for the M->1 "info" retained by the link I'm
> guessing it simply does that.

In ideal yes.

> I.e. it just keeps a mapping for every record
> that a link is defined for, which in the example above whever every record
> has a valid MotherOf would essentially end up taking the same amount of
> space as, and operating in the same amount of time as the ObjectPtr case.
> 
>     Question 3:  Am I still on target with this supposition?

Absolutely correct Ed!

And now COMPARE:

      ObjectPtr           and              BinaryLink

        1 : M                              1 : M
            
         SPEED THE SAME
         
         SPACE on disk THE SAME

But !!!!!

1) Binary link do not infect Structure of Table.

2) Binary Link can be easy converted tomorrow into M:M if your customer or
you change mind.

3) IT IS possible for binary link add parameter, that DEVELOPER swear
    todo searches only in one direction, then we can even more win
    in space AND speed of modification.

We need add all this docs asap :-)


-- 
Best regards,

Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc

Valentina - Joining Worlds of Information
http://www.paradigmasoft.com

[I feel the need: the need for speed]




More information about the Valentina-beta mailing list