can I query raw bits using bit logic in a huge RS database?

Ruslan Zasukhin ruslan_zasukhin at valentina-db.com
Sun Jun 10 07:12:35 CDT 2012


On 6/10/12 2:00 PM, "Aaron Andrew Hunt" <aaronandrewhunt at gmail.com> wrote:

Hi Aaron,

> It looks like Valentina can do XOR on the bits, but maybe lack of indexing
> means the query will be slow ...

* Valentina is not the only engine that lacks an index here. Any db engine
does, I think ...

but:

1) Valentina has a columnar format, so when a query scans a non-indexed FIELD,
 Valentina loads only that field; row-oriented db engines must load the whole
 table.

2) if it is possible to split your strings into different tables, this reduces
the scan size by N times ... for any db engine, btw.

3) as I have said ... maybe it is possible to do some tricks and define some
functional indexes ... Valentina can do this.  The question is whether this
is possible for your task.  If yes, you are lucky :)
If no -- we need to think again more deeply and get a yes :)
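
To illustrate point 3, here is a minimal Python sketch of what a functional
index could look like. It assumes, purely for illustration, that the query is
"find records whose bit-string differs from the query in at most d bits" (XOR,
then count the set bits); the real task may well be different. The names
build_popcount_index and search are hypothetical, and none of this is Valentina
API. The number of set bits is precomputed per record and used as the index
key. Since the difference between two bit counts can never exceed the number
of differing bits, records whose count is too far from the query's count are
skipped without reading their bits.

```python
from collections import defaultdict

def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")

def build_popcount_index(records):
    """records: dict of record id -> bit-string stored as an int.
    Returns a map from bit count to the ids having that count."""
    index = defaultdict(list)
    for rec_id, bits in records.items():
        index[popcount(bits)].append(rec_id)
    return index

def search(records, index, query, max_diff):
    """Ids whose bits differ from query in at most max_diff positions."""
    qc = popcount(query)
    hits = []
    # Only bit counts within max_diff of the query's count can match.
    for count in range(max(0, qc - max_diff), qc + max_diff + 1):
        for rec_id in index.get(count, ()):
            if popcount(records[rec_id] ^ query) <= max_diff:
                hits.append(rec_id)
    return sorted(hits)

# Made-up sample data.
records = {1: 0b101100, 2: 0b101101, 3: 0b010011, 4: 0b111111}
index = build_popcount_index(records)
print(search(records, index, 0b101100, 1))  # prints [1, 2]
```

Record 4 is never XOR-ed at all in this run, because its bit count (6) is
outside the range 2..4 allowed for the query.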

 
> The primary data per record is this set of bits which we need to search using
> XOR, and the rest of the data per record is all integers and booleans.

This is clear ...
 
A) so read again about the columnar format

B) it is always possible to split such columns into at least 1:1 tables ...
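
To make A) and B) a bit more concrete, here is a toy Python sketch, not
Valentina code, that compares a row layout with a column layout. The field
names (bits, weight, active) and the match condition are made up. In the
column layout the XOR scan touches only the sequence of bit-strings, which is
roughly what a columnar engine, or a 1:1 table holding only the bit column,
gives you.

```python
# Row layout: every record carries all of its fields together.
rows = [
    {"id": 1, "bits": 0b1010, "weight": 7, "active": True},
    {"id": 2, "bits": 0b0110, "weight": 3, "active": False},
    {"id": 3, "bits": 0b1011, "weight": 9, "active": True},
]

# Column layout: one sequence per field, aligned by position.
columns = {
    "id": [r["id"] for r in rows],
    "bits": [r["bits"] for r in rows],
    "weight": [r["weight"] for r in rows],
    "active": [r["active"] for r in rows],
}

def scan_rows(rows, query, expected):
    """Walks whole records even though only 'bits' is needed."""
    return [r["id"] for r in rows if r["bits"] ^ query == expected]

def scan_column(columns, query, expected):
    """Walks only the 'bits' column, then maps positions back to ids."""
    positions = [i for i, b in enumerate(columns["bits"])
                 if b ^ query == expected]
    return [columns["id"][i] for i in positions]

print(scan_rows(rows, 0b1010, 0b0001))       # prints [3]
print(scan_column(columns, 0b1010, 0b0001))  # prints [3]
```

Both scans return the same answer; the difference is only in how much data
has to be read to get it.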


> Separate tables for our database is looking like good option, in which case
> searching some tables will be hundreds of millions of records while others
> will be only hundreds of thousands, some thousands, some hundreds, and some
> only tens.

Yes, each table will most probably have a different number of records.
I assume your data is spread along a line between some min and max.

But anyway, this allows the query to be sped up.

And the main magic hidden here
    is the ability to use N physical computers if needed.
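
A small Python sketch of the idea, under the assumption (mine, not from the
thread) that records are split into tables by bit-string length and that a
match means "differs in at most max_diff bits". A query then scans only the
table for its own length, and since the tables are independent, the scans could
run on separate machines. Here a thread pool merely stands in for the N
computers; all function names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def partition_by_length(records):
    """records: iterable of (record id, bit-string as a '0'/'1' str).
    Returns one 'table' (list) per bit-string length."""
    tables = defaultdict(list)
    for rec_id, bits in records:
        tables[len(bits)].append((rec_id, int(bits, 2)))
    return dict(tables)

def scan_table(table, query, max_diff):
    """Full scan of one table using XOR plus a bit count."""
    q = int(query, 2)
    return [rec_id for rec_id, bits in table
            if bin(bits ^ q).count("1") <= max_diff]

def search(tables, query, max_diff):
    # Only the table holding strings of the query's length is scanned.
    return scan_table(tables.get(len(query), []), query, max_diff)

def search_batch(tables, queries, max_diff):
    # Each query hits its own table, so the scans can run side by side.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda q: search(tables, q, max_diff), queries))

# Made-up sample data.
records = [(1, "1010"), (2, "1011"), (3, "110011"), (4, "110111"), (5, "01")]
tables = partition_by_length(records)
print(search(tables, "1010", 1))                   # prints [1, 2]
print(search_batch(tables, ["1010", "01", "111"], 0))
```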
 
> I thought about using separate fields of booleans to represent the bits, but
> then the table gets huge, the query gets weird, and also we don't want a limit
> on the number of values in this set of bits per record.

Well, on the one hand ... with Valentina DB the table will not become huge
with bit-fields.

It is just that you say you have VAR-length bit-strings ...

And the fields of a table are a FIXED-set design ...

And you have even mentioned unlimited length ...
 a table cannot have an unlimited number of fields.
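
One way around this, sketched in Python and offered only as an assumption
about what might suit the task: keep the whole bit-string in a single
variable-length binary value per record instead of one boolean field per bit.
The helper names below are hypothetical and nothing here is Valentina-specific.
The sketch right-aligns the bits and pads with zero bits on the left, so the
original length would have to be stored separately if leading zeros matter.

```python
def pack_bits(bits: str) -> bytes:
    """Pack a '0'/'1' string into bytes, left-padded with zero bits."""
    if not bits:
        return b""
    n_bytes = (len(bits) + 7) // 8
    return int(bits, 2).to_bytes(n_bytes, "big")

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two packed values; the shorter one is treated as zero-padded."""
    width = max(len(a), len(b))
    x = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return x.to_bytes(width, "big")

def count_set_bits(value: bytes) -> int:
    """Number of 1 bits in a packed value."""
    return bin(int.from_bytes(value, "big")).count("1")

short = pack_bits("1010")        # 4 bits fit in 1 byte
long_ = pack_bits("110000001")   # 9 bits need 2 bytes
diff = xor_bytes(short, long_)
print(len(short), len(long_), count_set_bits(diff))  # prints 1 2 5
```

The record keeps a fixed set of fields this way, while the bit-string itself
can be any length.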

-- 
Best regards,

Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc

Valentina - Joining Worlds of Information
http://www.paradigmasoft.com

[I feel the need: the need for speed]
