[ORM]: faulting
Thorsten Hohage
thohage at objectmanufactur.com
Thu Oct 11 03:27:53 CDT 2007
Hi,
I'm not sure why my mails are not going out while I receive new ones,
but ... (they'll go out as a batch sometime later, I'm afraid)
On 2007-10-11, at 09:51, Philip Mötteli wrote:
> On 11.10.2007 at 08:26, Ruslan Zasukhin wrote:
>
>> On 11/10/07 1:55 AM, "Thorsten Hohage"
>> <thohage at objectmanufactur.com> wrote:
>>
>> I was told on the Cocoa list that it's true that if table T5 has a
>> million records, then the system prepares a million small RAM objects
>> for future faulting.
>>
>> IMO this is huge overhead.
>> IMO this approach is limited by the hardware.
>>
>> So is it true that all million objects are prepared?
>
> Yes.
Sorry, IMHO: NO, or better: it should be NO.
I think ORM and all the techniques around it depend on the requirements.
If you think about a solution with only a few tables, where the main
table has some million rows, and you need to "scan" all the data
again and again, or perhaps add another set of 10,000 rows each day,
then ORM, faulting, and many other patterns are not the best
choice. Of course you could do it, but IMHO it would be good advice
not to use them.
But if your style is more about finding some kind of root object, in
this case even doing a select over the million records and receiving 20
or 200 rows, picking one of these "objects" and traversing the tree
down, e.g. looking at the full contact data with all related items,
"opening" the invoices, one year of invoices, requesting the perhaps
300 invoices, opening one invoice with maybe 20 positions, 2
reminders, and 10 related articles, changing to one article, ... THEN
you should use an ORM tool. It's fast and easy.
In this case, after doing this operation you've got
200 (i.e. contacts) + 2000 (faults of related contact information,
some of it loaded) + 10 (years) + 300 (faults of invoice information,
some of it loaded) + 20 positions + 2 reminders + 10 articles + 100
(faults of related article information, some of it loaded) ...
= 2642 faults/objects in the application cache, not a million.
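To make the faulting idea concrete, here is a minimal sketch in Python. The names (`Fault`, `fired_count`, the toy `db` dict) are hypothetical illustrations, not Valentina's or Core Data's actual API: a fault holds only a primary key until its data is first touched, at which point it "fires" and loads the real row.

```python
class Fault:
    fired_count = 0  # counts faults that actually hit the "database"

    def __init__(self, pk, loader):
        self.pk = pk            # lightweight: just a key and a loader
        self._loader = loader
        self._data = None       # nothing loaded yet

    def __getattr__(self, name):
        if self._data is None:              # first touch: fire the fault
            Fault.fired_count += 1
            self._data = self._loader(self.pk)
        return self._data[name]

# A toy "table" with two rows; a real table could hold millions.
db = {1: {"name": "ACME"}, 2: {"name": "Foo"}}
contacts = [Fault(pk, db.__getitem__) for pk in db]

assert Fault.fired_count == 0       # creating faults loads nothing
assert contacts[0].name == "ACME"   # first access fires exactly one fault
assert Fault.fired_count == 1
```

So even with a million rows behind the query, only the rows the user actually traverses turn into loaded objects; the rest stay as cheap placeholders.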
AFTER doing this "operation" (a business operation here) you've got
two options:
a) e.g. use a "database context" for each "operation" and
nullify it afterwards, so with the next "operation" you restart with 0
faults
b) keep the faults, because statistically the next operation will need
some of the already loaded data, and firing a fault is easier than
selecting the data again. So of course, after a day or a week of
continuous execution you'll probably end up with 300,000 faults in your app.
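Option (a) can be sketched as follows; `Context` and its methods are hypothetical names for illustration, not any particular ORM's API. The context owns every fault created during one business operation, so dropping the context drops the cache back to zero.

```python
class Context:
    """A per-operation "database context" that caches loaded rows."""

    def __init__(self):
        self.cache = {}            # pk -> loaded row

    def get(self, pk, loader):
        if pk not in self.cache:   # a fault fires at most once per context
            self.cache[pk] = loader(pk)
        return self.cache[pk]

# Toy "table" of 1000 rows.
db = {n: {"id": n} for n in range(1000)}

ctx = Context()
for pk in (1, 2, 3):               # one business operation touches 3 rows
    ctx.get(pk, db.__getitem__)
assert len(ctx.cache) == 3         # only the touched rows are cached

ctx = Context()                    # option (a): new operation, fresh context
assert len(ctx.cache) == 0         # restart with 0 faults
```

Option (b) is simply never replacing `ctx`, which is why the cache keeps growing over days or weeks of uptime.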
Style "b" is often used in application-server designs, and THIS is
often the reason why apps run fine during testing, but after a month of
real-world operation they break down completely and new hardware gets
bought, or the app gets redesigned ;-)
> But objects that collect other objects are collection objects. OO
> is about reusing code. So a programmer shouldn't re-implement
> collection objects. He should reuse the ones that are already
> there. Which means that the persistence library can also replace
> them, or parts of them, so that they use optimized memory management.
BUT this is nothing Ruslan likes to hear ;-) because Valentina is
fast and powerful, but becomes more and more unnecessary after the
initial load of data when you work this way.
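To illustrate Philip's point about replaceable collection classes with a sketch (the `LazyList` name and its `fetch` callback are hypothetical, not any real library's API): a persistence library can hand back its own collection in place of a plain list, one that knows only the row count up front and fetches elements on demand.

```python
class LazyList:
    """A collection that looks like a list but loads rows lazily."""

    def __init__(self, count, fetch):
        self._count = count    # cheap: one number, not a million objects
        self._fetch = fetch    # callback that loads one row by index
        self._cache = {}

    def __len__(self):
        return self._count

    def __getitem__(self, i):
        if i not in self._cache:
            self._cache[i] = self._fetch(i)   # load only what is touched
        return self._cache[i]

rows = LazyList(1_000_000, lambda i: {"id": i})
assert len(rows) == 1_000_000     # the collection "contains" a million rows
first = rows[0]                   # ... but only one has actually been loaded
assert len(rows._cache) == 1
```

Because client code just calls `len()` and indexes the object, it never notices whether it got a plain list or this optimized stand-in.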
regards
Thorsten Hohage
--
objectmanufactur.com - Hamburg,Germany
More information about the Valentina mailing list