What is slowing Glorp down north of 10,000 objects in the Transaction's undoMap?


jtuchel

unread,
Nov 5, 2020, 2:06:03 AM11/5/20
to glorp-group

There are days when I think I still don't understand how to use Glorp right. We are looking into performance issues where the insert of 4 rows into 3 tables takes up to a minute instead of a few milliseconds. The time is spent in Glorp, not in the database. We added a bunch of logging statements to GlorpSession and UnitOfWork, and we already know most of the time is spent in Glorp before any SQL is issued to the database.

This extreme slowdown appears only for users who have loaded lots of objects from the DB. In our current case, there are a bit more than 10,000 entries in the undoMap of the currentUnitOfWork. It seems like 10,000 is a magic number here: a few weeks ago, when less data was in play, performance was still okay for this user.
Users with just a few hundred objects have very nice performance.

I want to find out whether this is a VAST-specific problem. Glorp uses an IdentityDictionary for the undoMap on both VAST and VW (and I guess in Pharo as well). This may or may not be a problem; I simply don't know. Is there anybody on this list *not* on VA Smalltalk who has such big transactions (remember: not the number of updates, just objects loaded from the DB)?

I wonder how I can go on from here. Response times of one minute and more are not acceptable...

Any ideas?

Alan Knight

unread,
Nov 5, 2020, 8:40:57 AM11/5/20
to glorp...@googlegroups.com
My first thought when seeing a non-linear slowdown on something large is hashing performance. It also might not be in the undoMap, but in the generated row maps, or somewhere else. But if the time is spent in Glorp, then a profile should help. Or even the quick and dirty profiler of pausing execution. If something is spending 90% of its time doing something, then a random pause will probably stop in that something.

--
You received this message because you are subscribed to the Google Groups "glorp-group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to glorp-group...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/glorp-group/88932d86-9c11-4df8-b2c4-f046308a71d1n%40googlegroups.com.

jtu...@objektfabrik.de

unread,
Nov 5, 2020, 9:19:53 AM11/5/20
to glorp...@googlegroups.com
Hi Alan,

this is what is currently blocking my machine. I'm tracing a single commit of the slow kind, and the tracing alone took almost 4 hours ;-) I didn't expect it to take that long, otherwise I'd have started with sampling. I have a lot of time to answer my mails now ;-) Next time I'll sample first (which is what the manual says, btw, but who reads manuals ;-)).
The machine is now working on opening the Performance Workbench...


I'll be back with more info and very likely questions on what to do about this.

I already added some logging to the server application. The very same commit is fast for users with only a few hundred objects in their object net. The commitUOW takes between 150 and 400 msec for those users.

I am glad you suspect something similar to what I do. Shows me I am learning. Maybe it is time to look up a chapter or two in my copy of Andres' hashing book while I wait for the Workbench to open...


Joachim







On 05.11.20 at 14:40, Alan Knight wrote:


-- 
-----------------------------------------------------------------------
Objektfabrik Joachim Tuchel          mailto:jtu...@objektfabrik.de
Fliederweg 1                         http://www.objektfabrik.de
D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1


jtuchel

unread,
Nov 6, 2020, 1:51:46 AM11/6/20
to glorp-group
Hi again,

I gave up waiting for the results browser on my tracing results yesterday. The image grew to over 4 GB in size and the browser still hadn't opened after ~3 hrs. So I tried sampling at a rate of 5 ms.

The results are a bit surprising. If the main problem were inefficient hash algorithms or an inefficient IdentityDictionary, I would expect IdentityDictionary methods like at:ifAbsentPut: and such at the top of the list of methods sorted by time spent. That is not the case in a sample of 12 runs. The methods most time is spent in are isRegistered: and registeredObjectsDo:, as well as Collection>>#includes:. IdentityDictionary and IdentitySet are on the list, but with low percentages of the overall execution time.

The top of the list in my Workbench looks like this:

(50.4%) UnitOfWork>>#registeredObjectsDo:
(16.6%) Collection>>#includes:
(2.2%) IdentityDictionary>>#includesKey:
(2.0%) IdentityDictionary>>#at:ifAbsentPut:
(1.9%) IdentitySet>>#includes:
...



These methods do use #= extensively, of course, but I am not so sure this is related to hashing. The main job of these methods is to iterate over #registeredObjects, which, if I understand correctly, also does not rely on hashing, because all they do is walk through a long list of pointers, visiting each object. So I am almost sure this is not a hashing issue, but just a simple case of too much work due to too many objects in the #registeredObjects collection.
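To get a feel for whether Collection>>#includes: falling back to a linear scan could account for times like these, a quick workspace experiment can compare hashed and linear lookup at the collection size in question. This is only a sketch: Time class>>#millisecondsToRun: is the VisualWorks/Pharo-style timing API, so the call may need adjusting for VAST.

```smalltalk
"Sketch: hashed vs. linear #includes: on 20,000 elements.
OrderedCollection>>#includes: scans linearly; Set>>#includes: uses hashing.
Searching for the last element is the worst case for the linear scan."
| n coll set |
n := 20000.
coll := (1 to: n) asOrderedCollection.
set := coll asSet.
Transcript
    show: 'linear: ', (Time millisecondsToRun: [1000 timesRepeat: [coll includes: n]]) printString, ' ms'; cr;
    show: 'hashed: ', (Time millisecondsToRun: [1000 timesRepeat: [set includes: n]]) printString, ' ms'; cr
```

The absolute numbers will vary by dialect and machine; the point is only the O(n) vs. O(1) gap per lookup.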

@Alan: would you agree with this thesis?

Just to see if I can improve things with a different hash function, I tried implementing hash functions on the two classes that make up the majority of the registered objects, as

hash
  ^id hash "the send of #hash is probably not necessary, since id is an Integer anyway, but it might one day in a century or so become a LargeInteger..."

The performance wasn't affected at all; it neither improved nor got worse. There are a few questions about hashing in this context, which may be very important for the purpose of changing hashing for persistent objects, like:
  • since registeredObjects and undoMaps are IdentityDictionaries, I guess they don't rely on #hash at all, but on #basicHash instead. #basicHash is a VM primitive in VAST, so maybe there is not much point in overriding it. I am most likely not more clever than what the VM guys do for hashing...
  • If I wanted to implement another hashing algorithm, should the class be part of the hash? Most of our persistent objects have a sequence number as id, each of them created by the database for each table individually with 1, so if all persistent objects just return the id as their hash value, and if Glorp manages instances of different classes in Dictionaries, there are probably lots of collisions. So maybe the class's hash should be part of an object's hash? Something like
    self class hash * id hash
    maybe?
 But I am not so sure hashing is relevant in my case. My gut feeling is that I simply have a problem of too many registered objects in the session. This is most likely a consequence of the way we handle our transaction (see my other question about best practices in this group).

So the next thing I'll try is to change the transaction handling for this specific dialog first and see if this has an effect.

Thanks for reading, and also lots of thanks for any comments on my thinking out loud here...

Joachim







  

jtuchel

unread,
Nov 6, 2020, 2:21:12 AM11/6/20
to glorp-group
little correction:

> If I wanted to implement another hashing algorithm, should the class be part of the hash? most of our persistent objects have a sequence number as id, each of them created by the database for each table individually *starting* with 1,

jtuchel

unread,
Nov 6, 2020, 5:10:04 AM11/6/20
to glorp-group
I can already answer parts of my questions ;-)


I can easily make this perform a whole lot slower by overriding #basicHash in my persistent classes. Thus I can easily move IdentitySet>>#includes: and IdentityDictionary>>#at:ifAbsent: to the top of the list of worst performers ;-)

So basicHash clearly has an influence on the overall performance. There is only this little remaining riddle: can I use this knowledge to achieve the opposite effect? ;-))

I am a bit sceptical. In my attempts to play with #basicHash, it always showed up in the list, but it had obviously never been there with the default implementation (because sampling won't measure VM primitives, I guess). So I either chose very slow hashing algorithms, or the hashing algorithms I chose were bad. I suspect a combination of both ;-)

I chose to include the class in order to make the hash of an instance of ClassA with id 17 distinguishable from an instance of ClassB with the same id (17). Looking at the hashes of all classes in the image, they all fall in the range between 1 and 32767. So I went to Andres' book and found his chapter on VisualWorks' #hash implementation in Date. I figured a class's hash is somewhat similar to a Date's year, just that class hashes take 15 bits instead of 9.

So I tried

self class hash * 32768 + id hash
(self class hash bitShift: 15) bitXor: id hash

And a few even slower and less clever ideas. But they all just made things worse.
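Before investing more in hash tweaks, it might be worth measuring whether collisions actually occur among the registered objects. A small diagnostic sketch (assuming `uow` is bound to the current UnitOfWork and #registeredObjectsDo: is reachable from a workspace):

```smalltalk
"Count distinct hash values among the registered objects. If the number of
distinct hashes is close to the object count, collisions are not the problem."
| count hashes |
count := 0.
hashes := Set new.
uow registeredObjectsDo: [:each |
    count := count + 1.
    hashes add: each hash].
Transcript
    show: count printString, ' objects, ';
    show: hashes size printString, ' distinct hash values'; cr
```

The same loop with #basicHash instead of #hash would show what the identity collections actually see.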

So, what do I do with this new knowledge? I don't know, to be honest.

Tom Robinson

unread,
Nov 6, 2020, 7:12:09 AM11/6/20
to glorp...@googlegroups.com, jtuchel
Hi Joachim,

Is VAST available with a 64-bit VM? If so, it might be interesting to see what happens to the performance there. The problem with Smalltalk hashes in a 32-bit implementation with lots of memory (and objects) is that you get lots of duplicate hash values, and lookups can start to slow down due to sequential searching. Or are you already using a 64-bit version?

jtuchel

unread,
Nov 6, 2020, 7:53:11 AM11/6/20
to glorp-group
Tom,

we are already on the 64-bit VM.

There is one thing I hadn't noticed during all the sampling and testing: we had quite a few GCs going on in the middle of transaction commits. So I am now looking into the effects of increasing old and new space. The first few experiments show at least some effect.


Joachim

Alan Knight

unread,
Nov 6, 2020, 8:57:23 AM11/6/20
to glorp...@googlegroups.com
This sounds to me like maybe the problem isn't hashing. I assume those top lines are a hierarchical view - that is, they include the calls underneath them in the total time. So, if the 50% of time iterating registered objects includes the stuff in the do: block, that seems reasonable. That's kind of what the whole operation does. But I'm a bit suspicious of the Collection>>includes:. Does that mean it's falling back to a superclass linear search somewhere?

Also, that you're having a lot of GCs and the image is getting very large seems suspicious. RowMaps are heavy - we're iterating every registered object and every registered object's backup copy, and creating a Row object for each of them. But they're not *that* heavy with only 10K objects. Not having so many objects registered is definitely the best long-term answer. But there's something funny going on. Maybe an allocation profile would be interesting.



jtuchel

unread,
Nov 10, 2020, 1:34:50 PM11/10/20
to glorp-group
Alan,

I've spent quite a while now trying to understand this much better. The problem is definitely not hashing.

In the specific situation I am looking at, memory consumption grows by 70 MB in #createRowsForPartialWrites; the creation of RowMaps from registeredObjects is what makes this slow. This is due to the fact that #registeredObjects is a collection with more than 20,000 objects.

So clearly I need to find ways to reduce the number of registeredObjects (as you say). There is, of course, not much use in managing 20,000 objects when all you do is insert 7 rows in 3 tables. The tricky question is: how to do that in a clever way that doesn't break the program...
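One pattern worth trying is to read outside the unit of work and open a short unit of work only around the actual write, so that only the objects being changed get registered. This is only a sketch: Order/OrderLine and the query are hypothetical, and how much of the reachable graph still gets registered depends on the mappings.

```smalltalk
"Read outside any unit of work, so the result is not registered..."
order := session readOneOf: Order where: [:each | each id = 42].

"...then register only what the write touches, inside a short unit of work.
GlorpSession>>#inUnitOfWorkDo: begins and commits the UOW around the block."
session inUnitOfWorkDo: [
    session register: order.
    order addLine: (OrderLine new quantity: 3; yourself)]
```

Whether #register: pulls in more of the object net than intended would need checking against the concrete descriptors, but the commit then only has to diff the small registered graph instead of everything ever read in the session.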


Joachim