IR, inverted-index and lists-join

Ruslan Zasukhin sunshine at public.kherson.ua
Fri Mar 14 01:02:08 CST 2003


on 3/13/03 10:39 PM, Tonio Virgilio LEVRA at levra at yahoo.com wrote:

Hi Tonio,

> I need to build a search engine for combined full-text
> and keyword-filtered queries, but I have little
> experience with databases and no experience with search
> engines, information retrieval, etc.
> 
> My source flat file has 65,000 records with one big
> text field (up to 35000 chars) and at least 8 fields
> of filtering/describing keywords (categories), plus
> other descriptive fields.
> The biggest of these 5 keyword fields can have up to 10
> different keywords per record, and there are at most 800
> possible values for these keywords.
> 
> Looking for a solution, I am thinking of doing it this
> way:
> 1. Build an inverted index for the big text field

I do not understand this point.
Do you want to build the index yourself?
Why? Valentina does this task itself.

> 2. Build 5 different inverted indexes for the 5
> category fields
> 3. Store the 6 inverted indexes in different tables in
> the db
> 4. Store the flat file in the db in one table
> 
> and then
> 
> a. Query the 6 inverted indexes according to the
> end-user's category filtering and text search.
> 
> b. Store the six resulting lists of records in a temp
> table linked to the big one (see point 4).
> 
> c. Query the temp table for duplicates (because I need to
> match only the records that are in all six lists) and
> extract as the result the records of the big table (see
> point 4).
> 
> What do you think? Does it make sense?
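
For reference, steps a to c of the plan above amount to intersecting the record lists returned by each inverted index, which can be done directly without a temp table. The following is only a minimal Python sketch of that idea, not Valentina code: the helper names `build_inverted_index` and `search`, the sample records, and the field names are all hypothetical, and words are split on whitespace only.

```python
from collections import defaultdict

def build_inverted_index(records, field):
    """Map each lowercase word found in `field` to the set of record ids containing it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for word in rec[field].lower().split():
            index[word].add(rec_id)
    return index

def search(indexes, criteria):
    """Return the sorted ids of records matching every field -> word pair in `criteria`."""
    result = None
    for field, word in criteria.items():
        postings = indexes[field].get(word.lower(), set())
        # Intersect the lists: a record must appear in all of them.
        result = postings if result is None else result & postings
    return sorted(result or [])

# Hypothetical sample data: one text field and two keyword fields.
records = {
    1: {"text": "fast inverted index search", "category1": "db index", "category2": "howto"},
    2: {"text": "slow table scan", "category1": "db", "category2": "howto"},
    3: {"text": "index tuning notes", "category1": "index", "category2": "reference"},
}

# One inverted index per searchable field.
indexes = {f: build_inverted_index(records, f) for f in ("text", "category1", "category2")}

matches = search(indexes, {"text": "index", "category1": "db", "category2": "howto"})
print(matches)
```

Set intersection here plays the role of the "query for duplicates" step: only the record ids present in every list survive.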

Maybe I am missing something, but why not simply do

    WHERE category1 = 'word' AND category2 = 'word2'

You need to mark these category fields as Index By Words, of course.
I do not think your way will be faster than this one.
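
To illustrate the shape of such a query, here is a small runnable sketch. It is an assumption-laden stand-in, not Valentina: it uses SQLite through Python's standard `sqlite3` module, ordinary column indexes take the place of Index By Words, each category column holds a single keyword, and the table name `docs`, the column `body`, and the index names are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, category1 TEXT, category2 TEXT)"
)

# Ordinary indexes on the category columns (a stand-in for Index By Words).
conn.execute("CREATE INDEX idx_cat1 ON docs (category1)")
conn.execute("CREATE INDEX idx_cat2 ON docs (category2)")

conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?, ?)",
    [
        (1, "first text", "word", "word2"),
        (2, "second text", "word", "other"),
        (3, "third text", "other", "word2"),
    ],
)

# A single query combines both filters; the engine decides how to use the indexes.
rows = conn.execute(
    "SELECT id FROM docs WHERE category1 = ? AND category2 = ? ORDER BY id",
    ("word", "word2"),
).fetchall()
ids = [row[0] for row in rows]
print(ids)

conn.close()
```

The point of the sketch is only that the combination of filters is expressed once in the WHERE clause, with no hand-built index tables or temp table in between.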

-- 
Best regards,
Ruslan Zasukhin      [ I feel the need...the need for speed ]
-------------------------------------------------------------
e-mail: ruslan at paradigmasoft.com
web: http://www.paradigmasoft.com

To subscribe to the Valentina mail list go to:
http://listserv.macserve.net/mailman/listinfo/valentina
-------------------------------------------------------------


