Full Text Index Search

Wed Oct 27 09:24:10 CDT 2004

On 10/26/04 2:09 PM, "realbasic-nug-request at lists.realsoftware.com"
<realbasic-nug-request at lists.realsoftware.com> wrote:

>>> I use V4RB in a product I'm developing. I'm looking to implement a
>>> full-text search on a field in one of my tables.
>> 
>> Now I'm confused - I thought from stuff that Ruslan had said here
>> before that Valentina has full text searching and also fragment
>> searching (as well as making the coffee whilst you're out taking a
>> stroll on the lake :-).
>> 
>> I'd do some speed trials with LIKE before making any heavier decisions.
>> 
> LIKE in Valentina is extremely fast (even though it's an unindexed
> search). I did some testing a while ago and unindexed searches in
> Valentina are faster than an indexed search (FULLTEXT) in other DBMS.

Hello guys!

I was away for a day, so sorry for the delay.
Let me answer.

> From: Andy Dent <dent at oofile.com.au>
> Date: Tue, 26 Oct 2004 13:18:03 +0800

>> I use V4RB in a product I'm developing. I'm looking to implement a
>> full-text search on a field in one of my tables.
> 
> Now I'm confused - I thought from stuff that Ruslan had said here
> before that Valentina has full text searching and also fragment
> searching (as well as making the coffee whilst you're out taking a
> stroll on the lake :-).

Andy, I think that by FULL TEXT indexing James means a feature like the one
in MS SQL Server, or maybe in some specialized text search engines. Such a
feature lets you, for example, run a search such as

    word1 NEAR TO word2 in 2-3 steps.

Valentina cannot do this.
Valentina does have a special INDEX BY WORDS for String, VarChar and TEXT
fields. So far most developers have been quite satisfied with this.
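To make the difference concrete, here is a minimal sketch of a word-level index in Python. It is not Valentina's implementation; the function names and sample records are hypothetical. It can answer "which records contain these words", but because it stores no word positions it cannot answer a NEAR query.

```python
import re
from collections import defaultdict

def build_word_index(records):
    """Map each lowercased word to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, text in enumerate(records):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(rec_id)
    return index

def find_all_words(index, query):
    """Return sorted ids of records that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)

# Hypothetical sample data, for illustration only.
records = [
    "Valentina is a fast database",
    "Full text search in a database",
    "A stroll on the lake",
]
index = build_word_index(records)
print(find_all_words(index, "database fast"))
```

A proximity search would need the index to keep the position of each word inside each record as well, which is the extra work a dedicated full-text engine does.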

2) About the speed of LIKE in Valentina.
Yes James, it is much faster than in mySQL or others, because of Valentina's
internal structure. Let me remind you:

If you have a mySQL table with 30 VarChar fields, then one record can be on
average, say, 30 * (15-20) = 450-600 bytes or more. To scan such a table for
LIKE, the DBMS needs to load all of this into RAM.

Valentina instead uses one file for each column.
So we need to scan only 20 bytes * N records, that is, 1/30 of the data.
For this example Valentina is expected to be 30 times faster.
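The arithmetic behind that factor of 30 can be written out as a small Python sketch. The record count and the 20-byte average are assumptions for illustration, and the model counts only bytes read, ignoring caching and everything else that affects real timings.

```python
FIELDS_PER_RECORD = 30  # VarChar fields in the table, as in the example
BYTES_PER_FIELD = 20    # assumed average field size (upper end of 15-20)

def bytes_scanned_row_store(n_records):
    """Whole records are read, so every field of every record is loaded."""
    return n_records * FIELDS_PER_RECORD * BYTES_PER_FIELD

def bytes_scanned_column_store(n_records):
    """Only the file of the searched column is read."""
    return n_records * BYTES_PER_FIELD

n = 1_000_000  # hypothetical number of records
ratio = bytes_scanned_row_store(n) / bytes_scanned_column_store(n)
print(ratio)  # equals FIELDS_PER_RECORD when all fields have the same size
```

The ratio does not depend on N or on the field size, only on how many equally sized columns the record has.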

3) In Valentina 2.0 we will SOMETIMES use the index file to be even faster.
Example: assume you have a table with a million records, and that you have
only 10,000 different words in the column. So the index will contain only
10,000 entries. Then using an INDEX SCAN instead of a COLUMN SCAN we test
100 times fewer values, so it can be up to 100 times faster.
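A rough Python sketch of the idea, under the assumption that the index maps each distinct value to the ids of the records holding it. The names and generated data are hypothetical and this is not how Valentina 2.0 is actually coded; it only shows why testing 10,000 index entries instead of 1,000,000 records gives the same answer with 100 times fewer comparisons.

```python
from collections import defaultdict

def column_scan(column, fragment):
    """Test every record value: one comparison per record."""
    return [i for i, value in enumerate(column) if fragment in value]

def index_scan(index, fragment):
    """Test every distinct value: one comparison per index entry."""
    hits = []
    for value, rec_ids in index.items():
        if fragment in value:
            hits.extend(rec_ids)
    return sorted(hits)

# Hypothetical data: 1,000,000 records but only 10,000 distinct values.
column = [f"word{i % 10_000}" for i in range(1_000_000)]
index = defaultdict(list)
for rec_id, value in enumerate(column):
    index[value].append(rec_id)

# Ratio of comparisons: records tested vs. index entries tested.
print(len(column) // len(index))
```

The gain shrinks as the number of distinct values approaches the number of records, which fits the "SOMETIMES" above.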

4) Next, in Valentina 2.0 (you can see this in the beta) we have introduced
new special STRING SEARCH functions in the V4RB API, similar to what you have
in OS X or email apps...

5) Next, in 2.0 we have made it so that not only the left() function uses
the index, but also right() and substr().
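As a general illustration of how an index can serve left() and right() style searches, here is a Python sketch using binary search over sorted values. This is an assumption about the technique, not a description of Valentina's internals: the second index over reversed strings is one common way to handle suffixes, and all names and data are hypothetical.

```python
import bisect

def prefix_search(sorted_values, prefix):
    """Return all values starting with prefix, located by binary search."""
    start = bisect.bisect_left(sorted_values, prefix)
    hits = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # matches are contiguous in sorted order
        hits.append(value)
    return hits

def suffix_search(sorted_reversed, suffix):
    """Return all values ending with suffix, via an index of reversed values."""
    reversed_hits = prefix_search(sorted_reversed, suffix[::-1])
    return sorted(value[::-1] for value in reversed_hits)

# Hypothetical column values, for illustration only.
values = ["database", "datum", "index", "reindex", "search"]
sorted_values = sorted(values)                     # serves left()-style lookups
sorted_reversed = sorted(v[::-1] for v in values)  # serves right()-style lookups

print(prefix_search(sorted_values, "dat"))
print(suffix_search(sorted_reversed, "index"))
```

A substr() at an arbitrary offset does not map onto a sorted order this simply, so this sketch leaves it out.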

So James, I think you should take a look at the Valentina 2.0 beta.
I believe you will be impressed :-)

-- 
Best regards,
Ruslan Zasukhin      [ I feel the need...the need for speed ]
-------------------------------------------------------------
e-mail: ruslan at paradigmasoft.com
web: http://www.paradigmasoft.com

To subscribe to the Valentina mail list go to:
http://lists.macserve.net/mailman/listinfo/valentina
-------------------------------------------------------------