A plan emerges Re: Fine Tuning

Fri Nov 18 12:13:35 CST 2005

>>> 
>>> You can MIX.
>>> Some task do in API, some in SQL
>> 
>> I intend to.
>>   

Ok, here's a first pass at a plan for more optimal access.  I'm thinking
this through as I write it here, so bear with me please.

First of all, the nature of my processing is going to be as follows:

    1) I'm going to start with a big hairy parse tree derived from some data
source such as an XML file, and then render its contents into a V4RB
database for fast access.  I'll likely not parse it all at once, but
incrementally over time in the background so the user essentially sees
instantaneous response in the GUI at all times. But sooner or later,
population of the DB from the parse is going to be completed -- and in all
probability much sooner than the user can get around to interacting in a
manner that will require a serious query.  Worst case, the user has to wait
a few more seconds for the population to complete before their first deep
query gets processed.  But navigation at high levels should be available
immediately regardless.

    2) Once the population effort is completed, the tables that result -- or
perhaps more accurately, the content of the records in the range of rows
used by the populated data in the various tables -- will never be modified,
and none of the populated records will ever be deleted. Thus this
essentially becomes a static database.

    3) Any modifications that are made by the user will be represented
either in separate tables, or in newly added rows to the existing tables.
And the same static nature for these modifications is true as well.  Once a
modification is entered, it will never be deleted except perhaps by some
eventual "cleanup" routine that gets trigged priore to long-term storage
when the database tables are no longer in active use.

After the database is populated I then plan to query the hell out of it. To
facilitate that in a manner that would simulate the "lazy read" mechanism
you propose I could create a collection of VCursors for the purpose by means
of several "SELECT <someFieldNames> " SQL statements.  If there were natural
field groupings I intended to use together such as "HomePhone, WorkPhone,
Fax", I could include multiple field names in <someFieldNames>.  But in the
extreme case, or if I was clueless in advance of what my desired access
patterns were, I would essentially create a VCursor for each single field in
this manner, each field spanning every record of the table.  Let's call a
Vcursor made in this manner and for this purpose a "VFieldRef".  Since the
table will never grow in a manner that will affect the range of records I
care about, I never need to recreate or update these VFieldRef accessors.

To work with these field accessors, I first get a list of records I care
about either by using the API to come up with Vsets or by using SQL
statements resulting in a cursor over a set of desired records with a single
cursor field of RecId.   I can then iterate over the Set or Cursor, pull out
successive RecId values, and use them to access whichever particular field
values I need to access via...

    aVFieldRef.position = recIdOfInterest
    anInt = aVFieldRef.ShortFied(1).value

...and thus rapidly access the sole field I need without paying the penalty
for retrieving the the current value for every field of the record at
recIdOfInterest.

And to make it all look pretty we can create a new class for VFieldRef:

    class: 

        VFieldRef subclassOf Object

    constructors:

        VFieldRef( inFieldID as Variant , inVCursor as Vcursor )
            // Initialize VFieldRef for accessing the specified field
            // of the existing inVCursor.
            // The inFieldId may be a name String or index Integer

        VFieldRef( inFieldId as Variant, inTable as VTable )
            // Initialize VFieldRef for accessing the specified field
            // of the existing inTable.
            // The inFieldId may be a name String or index Integer

    properties:

        myVCursor as Vcursor
        fieldIndex as Integer

    methods:

        Value( inRecID as Integer ) as Variant
        Value( inRecID as Integer, Assigns inValue as Variant )

        ShortValue( inRecId as Integer ) as Integer
        ShortValue( inRecId as Integer, Assigns inValue as Integer )

        StringValue( inRecId as Integer) as String
        StringValue( inRecId as Integer, Assigns inValue as String )

        ... and many more for each of the various desired types.

The net result is that instead of using the syntax shown above...

    aVFieldRef.position = recIdOfInterest
    anInt = aVFieldRef.ShortFied(1).value

...you could instead use the prettier syntax...

    anInt = aVFieldRef.ShortValue( recIdOfInterest )

...as well as:

    aVFieldRef.ShortValue( recIdOfInterest ) = anInt

There is a minor cost for calling a VFieldRef method of course to use this
prettier syntax, but I suspect that the overhead is absolutely minimal
compared to the overhead of retrieving the current value for every field of
the record at recIdOfInterest when you only want to access one or two
fields.

This also has the benefit of completely eliminating usage of the cursor as
an explicit access mechanism, delegating its usage to a hidden abstraction
--  where I'd far rather have it be.

So then, the big questions are:

    Does this make sense?
    Will it work as intended with the efficiency gains I infer?
    What did I miss?

Assuming the above approach or something similar will work, then I'm now
back to the conclusions:

    1) There is once again no benefit, and indeed there is both a
performance and storage penalty, for using Binary Links over ObjectPtr links
for my particular usage needs where most all of the linked fields involve a
1:1 unique:unique or M:1 unique:unique mapping from my primary table records
to the records of my various ancillary support tables.

    2) I can be far less concerned about trying to predict in advance which
are the most frequently accessed fields to include in my primary table
records and use a layout that feels intuitive and comfortable rather than
minimalist for (supposed) performance optimization -- which is the way it
should be in my opinion.

    3) I still have the option of making a wide or narrow primary record and
shoveling off ancillary fields to other tables.  Which is a good thing!

Now I'm itching to start coding :)