[ORM]: faulting

Philip Mötteli philip.moetteli at econophone.ch
Thu Oct 11 07:21:40 CDT 2007


Am 11.10.2007 um 10:27 schrieb Thorsten Hohage:
>
> I'm not sure why my mails are not going out while I receive new  
> ones, but ... (they'll go out as a batch sometime later, I'm afraid)
>
> On 2007-10-11, at 09:51, Philip Mötteli wrote:
>
>> Am 11.10.2007 um 08:26 schrieb Ruslan Zasukhin:
>>
>>> On 11/10/07 1:55 AM, "Thorsten Hohage"  
>>> <thohage at objectmanufactur.com> wrote:
>>>
>>> I was told on the Cocoa list that it's true that if table T5 has  
>>> a million records, then the system prepares a million small RAM  
>>> objects for future faulting.
>>>
>>> IMO this is huge overhead.
>>> IMO this approach is limited by the hardware.
>>>
>>> So is it true that all million objects are prepared?
>>
>> Yes.
>
> Sorry, IMHO: NO, or better, it should be NO.
>
> I think ORM and all the techniques around it depend on the  
> requirements. If you think about a solution with only a few tables,  
> where the main table has some million objects and you need to  
> "scan" all the data again and again, or perhaps add another set of  
> 10000 rows each day, then ORM, faulting and many other patterns are  
> not the best choice. Of course you could do it, but IMHO it would  
> be good advice not to use them.
>
> But if your style is more about finding some kind of root object,  
> and in this case even doing a select over the million records and  
> receiving 20 or 200 rows, picking one of these "objects" and  
> traversing the tree down, e.g. looking at the full contact data  
> with all related items, "opening" the invoices, one year of the  
> invoices, requesting the perhaps 300 invoices, opening one invoice  
> with maybe 20 positions, 2 reminders and 10 related articles,  
> changing to one article, ... THEN you should use an ORM tool. It's  
> fast and easy.
>
> In this case and after doing this operation you've got
>
> 200 (contacts) + 2000 (faults of related contact information, some  
> loaded) + 10 (years) + 300 (faults of invoice information, some  
> loaded) + 20 (positions) + 2 (reminders) + 10 (articles) + 100  
> (faults of related article information, some loaded) ...
>
> = 2642 faults/objects in the application cache, not a million.
>
> AFTER doing this "operation" (a business operation here) you've got  
> two options:
>
> a) use a "database context" for each "operation" and nullify it  
> afterwards, so with the next "operation" you restart with 0 faults
>
> b) keep the faults, because statistically the next operation will  
> need some of the already loaded data, and firing a fault is easier  
> than selecting the data again. So of course after a day or a week  
> of continuous execution you'll probably end up with 300000 faults  
> in your app.
>
>
> Style "b" is often used in application server concepts, and THIS is  
> often the reason why apps run fine during tests, but after a month  
> of real-world operation they break down completely and new hardware  
> is bought, or the app is redesigned ;-)

You just implemented a different memory management for a persistent  
collection object. So there's a need either to transparently replace  
the original, non-persistent collection object, or to urge the user  
of your library to do it manually, by instantiating a special  
collection object that uses a different memory management. But if  
you just transparently take the non-persistent collection object,  
you end up with as many faults as you have elements in this  
collection object. I don't know what they call it in Ruby or  
REALbasic. But if you serialize and then deserialize an NSArray  
object, you will have an NSArray instance with as many faults in it  
as there were originally elements.
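
That problem, a deserialized plain collection holding one fault per
element, can be illustrated like this. This is a Python sketch, not
Foundation code; Fault and deserialize_plain are invented names:

```python
class Fault:
    """A small per-row placeholder object."""
    __slots__ = ("key",)
    def __init__(self, key):
        self.key = key

def deserialize_plain(keys):
    # A non-persistent collection knows nothing about faulting, so the
    # library must pre-create one fault object per element up front.
    return [Fault(k) for k in keys]

arr = deserialize_plain(range(1_000_000))
print(len(arr))  # 1000000 small fault objects sit in RAM before any is used
```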
So to solve this, you do what I described here:

>> But objects that collect other objects are collection objects.  
>> OO is about reusing code, so a programmer shouldn't re-implement  
>> collection objects. He should reuse the ones that are already  
>> there. Which means that the persistence library can also replace  
>> them, or parts of them, so that they use an optimized memory  
>> management.
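
A replacement collection of the kind quoted above could look roughly like
this: instead of pre-creating a million placeholders, it keeps only the
keys and loads a row the first time it is indexed. Again a hypothetical
Python sketch; LazyArray and load_row are invented names:

```python
class LazyArray:
    """A persistence-aware collection: elements fault in on first access."""
    def __init__(self, keys, load_row):
        self.keys = list(keys)
        self.load_row = load_row
        self._loaded = {}          # index -> row loaded so far

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, i):
        # The fault fires here, on access, not at deserialization time.
        if i not in self._loaded:
            self._loaded[i] = self.load_row(self.keys[i])
        return self._loaded[i]

rows = {k: "row %d" % k for k in range(1_000_000)}
arr = LazyArray(rows, rows.get)
print(len(arr))           # 1000000: behaves like a million-element array
print(arr[42])            # row 42: only now is one row actually loaded
print(len(arr._loaded))   # 1 row in memory, not a million
```

The user-visible interface stays that of an ordinary indexed collection,
which is what makes a transparent replacement possible.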




More information about the Valentina mailing list