Efficient use of text-containing field

Ed Kleban Ed at Kleban.com
Mon Dec 5 21:41:51 CST 2005




On 12/5/05 9:25 PM, "jda" <jda at his.com> wrote:

>> 
>> 
>> If not, then the String(16) approach is probably your best bet, because as I
>> noted you can't index a field declared as IndexByWords.  If speed is a total
>> non issue and you only needed do the sort once after filling the table and
>> then using it read-only you could certainly use the String(16) approach to
>> sort, then fill in a SortOrder field based upon the result, and then throw
>> the String(16) field away... Or build the String(16) array in a parallel
>> temporary table that you subsequently punt.
>> 
>> Hope that helps.
>> 
> 
> Thank you, but I need the VarChar to store the data (which can be
> large) and is indexed by word.

Great.  No problem there.  I understand.

> That is why I need a sort method that
> is not indexed by word. My question to Ruslan is whether VarChar(16)
> saves me storage space.

Since you're using IndexedByWord, as you say, then to also do a sort then
yes, you'll need to allocate some other field.  And if you only need to do
the sort on the first 16 characters, then creating a VarChar(16) field with
a copy of the first 16 characters of the VarChar(1022) field will indeed
save you a tremendous amount of storage and be much more space efficient
than having to create two VarChar(1022) fields -- one to be IndexedByWords,
and one to be Indexed for sorting.

If it is also the case that the vast majority of strings you have stored in
the VarChar(16) field are at least 16 characters in length, then you can
employ an additional optimization.  You can declare the field of 16
characters to be String(16) instead of VarChar(16).  The benefit of doing
this is that if the majority of the VarChar(1022) fields actually have
strings of at least 16 characters then this declaration will actually save
you storage and be more compact because V2 does not need to store the
per-page overhead bytes that allow it manage a VarChar field.  Furthermore
using String(16) will result in a sort that is faster because V2 can
instantly calculate the precise position of every string record when it does
it's comparisons to accomplish the sort by indexing the 16-char field, as
well as speed any lookups you perform on that field once it is sorted.

And then if storage efficiency is the most important, depending on why you
wanted the field sorted in the first place there are additional
optimizations you may be able to use to eliminate the String(16) field after
you use it for sorting by replacing it with an even smaller field, as I
alluded to in previous posts.  But those are refinements you can simply
ignore if they sound too scary or complex.

> Ruslan?

I'm sure Ruslan will chime in with a confirmation or a better idea when he
gets a chance ;-)
 
> Jon 

Good luck!
--Ed




More information about the Valentina mailing list