Best practice for lists

26 views
Skip to first unread message

jtuchel

unread,
Aug 21, 2023, 7:15:04 AM8/21/23
to glorp-group
Hi there,

I am wondering how people solve the problem of lists with complex subqueries. Let me explain with an example. Let's say we have a Class Customer and a Class Order. A Customer can have any number of orders and each of them has a total (sum of ordered articles).

We'd like to display a list of Customers in which there is an additional column "sum of orders".

It is relatively slow to read a list of customers and all their orders and sum the total of all orders in Smalltalk, even if we use a query with alsoFetch:. It is several orders of magintude faster to solve that in SQL with a view or even with a dynamic SQL statement. I'd like to use the power of SQL for querying the list entries. But I wonder what approach would be best or if there is something nice I should have known.

Is it better to introduce an extra class (maybe a subclass of Customer named CustomerAndSumOfTheirOrder) for such a query/view and create a duplicate set of mappings (I guess that can be done in a clever way by reusing #descirptorForCustomer: etc.) for this view? Would you rather just issue an SQL statement and use the raw SQL result Dictionaries from that SQL query? Is there a way to dynamically nail extra attributes to a mapped class that get materialize from a special query just for this one list (or maybe several use cases)?

What to do about object identities etc? Is it better to read: the real Customer once a user clicked on one of the CustomerAndSumOfTheirOrder ?

I can think of a few ways to do this, but I seem unable to decide which of those is the least stupid. So I hope someone can tell me a bit about what they do in such cases...

Thanks in advance. I hope we can get into an interesting discussion on what may or may not work.


Joachim




Henrik Sperre Johansen

unread,
Aug 21, 2023, 9:52:29 AM8/21/23
to glorp...@googlegroups.com
Materialized views were made for quick queries on this kind of aggregated data.
If available, you might consider using those, rather than a complex query, or a view whose sum() performance will still suffer with sufficiently large number of orders…
Of course, there are some downsides too:
- Data may not always be up to date (Postgres https://www.postgresql.org/docs/current/rules-materializedviews.html)
- The feature is only performant with additional setup (Oracle, using materialised view logs https://oracle-base.com/articles/misc/materialized-views).
- The feature is not available in all database versions (Older Oracle versions required more than the “Standard” version, at least)

I’d probably just do customerID, orderSum in the Materialised view (which helps in the cases where fast update is available through a log), and your new object/descriptor contain a 1-1 mapping of the instvar with customerID from DB to Customer.

If that causes too many single-instance queries to fetch Customer data, or you’re using a DB which does not do quick refresh, you could
- map all columns in the materialised view
- construct another view of the simple MaterializedView and Customer

Either way, the EmbeddedValueOneToOneMapping should then be able to map the customer instance/maintain identity correctly:

classModelForCustomerListEntry: aClassModel
aClassModel
newAttributeNamed: #customer type: Customer;
newAttributeNamed: #sumOrders

descriptorForCustomerListEntry: aDescriptor
(aDescriptor newMapping: EmbeddedValueOneToOneMapping) attributeName: #customer.
aDescriptor directMapping from: #sumOrders to: (customerMatView fieldNamed: ’SumOrders').

Cheers,
Henry 

--
You received this message because you are subscribed to the Google Groups "glorp-group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to glorp-group...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/glorp-group/920e36be-7f6c-46f9-a9cf-9a2ee14c8738n%40googlegroups.com.

Alan Knight

unread,
Aug 21, 2023, 10:50:47 AM8/21/23
to glorp...@googlegroups.com
You can use retrieve: to just fetch attributes, which is very useful for lists where the attributes are probably primitives, so nothing needs to go into the cache just to be in the list. That is you can make a list by querying for retrieve: #name; retrieve: #someComplicatedThing. And you can map someComplicatedThing to whatever SQL construct is useful. You can also create attributes which are neither readable nor writable, which means they can only be used in queries. I think you can still use retrieve: on them, but I could be wrong about that. 

jtuchel

unread,
Aug 23, 2023, 4:05:49 AM8/23/23
to glorp-group
Hi Henry,

thanks for your thoughts.
So you suggest an extra Class for List entries that uses a materialized view, am I getting this right? This means I'd end up with two Classes and Mappings to update when an attribute is added or changed (with the exception that a change may not be relevant for the list (view object) or business object, of course).
I am not judging your comment, just want to understand it. 
Would it make sense to have a read-only mapping for the lsit/view class, and if so, does Glorp support such a thing? This woul probably also make sessions / units of work leaner, because we'd have to cache a lot fewer objects.

Joachim

jtuchel

unread,
Aug 23, 2023, 7:09:53 AM8/23/23
to glorp-group
Alan,

thanks for your comment.I'm trying to get my head around both yours and Henry's ideas.
IIUC, using retrieve: leads to the results not being returned as mapped objects, but as arrays. 
Thus, there is nothing going into a Session's /UOW's cache and undoMap, which has its own interesting advantages.

I will have to play with both approaches - a dedicated mapped Class for Views/Lists vs. Arrays from Queries. I can see advantages in both. I wonder if there is a way of making the Arrays a bit more clever (like "give me your real object") but also keep these result objects out of the undoMaps and thus save lots of resources....

A lot of food for thought in just two responses ;-)

Joachim

jtuchel

unread,
Feb 21, 2025, 3:03:54 AMFeb 21
to glorp-group
Coming back to this topic, I finally tried the approach of using a simple VIEW and map that as objects in my Descriptor.

The good thing is that I can map relationships and follow object nets from a such a view object, just like with any mapped object. The retrieval of list entries can be so much faster, really blazing fast with things like "list me all customers in city x with the sum of all their orders in the last 6 months". While following the objects and summing up all orders and whatnot for a  handful of Customers can take quite a few seconds if all objects are materialized and the sum is calculated, while using SQL magic makes the query a thjing of a few dozen milliseconds.

So Iam happy with what I have thus far and plan to change a few existing screens in our application to use VIEWS (not materialized ones).

What still worries me a little is that the retrieved view entries are not treated as read-only and are still cached and whatnot which I think isn't really necessary. As long as I make sure that the referenced business objects are retrieved as "regular Glorp mapped objects" from tables and are teh only objects that are updated and need tracking of changes and registering transitive closure and stuff, these view objects could be completely transparent for a Unit Of work, couldn't they?

So I am wondering if such read-only mappings are a thing anybody has considered for Glorp?
My naive idea would be that you can probably mark the descriptor and/or class model of such a View object as read-only and thus not have them added to caches. Also my first impulse would be that a chain of #registerTransitiveClosure* actions could simply stop when such a read-only object is approached (treat the object just as if it wasn't a mapped object). Does this make sense? Anything I've forgotton?

I guess such a read-only maping would not only make sense for (non-updateable) Views, but also for constant objects (think of INI entry-like information).

Has anybody worked on such read-only mappings?


Joachim

Alan Knight

unread,
Feb 21, 2025, 7:07:01 PMFeb 21
to glorp...@googlegroups.com
Very interesting.
Mixing up objects that are cached and writable with instances of the same class that look identical but aren't cached or writable can be very confusing, and is something Glorp tries to avoid. And generally it bases the behavior off the class (via the classModel and descriptor), so it's hard to get different behavior.
One thought is that you could have a Foo and a FooView subclass of it whose descriptor says that it isn't cached. But it might be a bit tricky with its references. If you re-use the parent class descriptor then it will be to parent class readable instances, with all the problems it has. So you might end up doing a lot of parameterizing to make that work.
There might be some other stuff in there to accommodate reading views, but I'm afraid I don't remember any more.

Reference data is probably a little different. You want it read-only, but you probably want it cached forever.


Joachim Tuchel

unread,
Feb 22, 2025, 1:34:05 AMFeb 22
to glorp...@googlegroups.com

Hi Alan,


thanks for your thoughts. Seems there are quite a few models for using Views. The ones you reference are not ones I had in mind. Maybe that's a first indicator for this whole thing not being too easy...

What I had in mind was using dedicated classes that do not inherit from or reside in the same inheritence hierarchy as "real" business oobjects. In my case I want them to be used for long lists from which you can edit the "real" objects by double-clicking.

Think of a page like "Top 5o customers of last quarter". I want their name, address, sum of their purchases of the last quarter, maybe their most expensive with date and sum. These are based on a more or less complex (maybe materialized) view. I want the possibility to click on their name to edit/view more details, a link to see the complete list of orders of the quarter and a link to add a note or whatnot. These referenced objects should be the real mapped and cached business objects, so that they can be treated in the current unit of work. But the view objects are neither needed in the cache nor will they ever be updated. (Maybe I am wrong in assuming I don't want them cached, because I might want to ensure their "identitity" if the view is refreshed - my current implementation has no primary key for the view objects, let's see if that is a good decision or not...)


So this to me sounds like a much simpler scenario than what you describe. So far, I like the results without modifications to Glorp, but I am wondering if such objects tend to flood caches and make the registration of transitive closures much more expensive than needed. Glorp will have to keep all the attributes like customer name, their street and city and whatnot in its caches at least once more for the view object and for the real business objects. So the streat and customer name will be kept in the undo maps for the business object (if it's proxy is resolved from the view object) and the view. so it will be in the caches at least 4 times....


I was wondering if that can be avoided by keeping a mapping read-only, meaning: don't keep the initially read value for updates that won't occur anyways. This would eliminate 2 copies of the String in my naive understanding (old value as read froim the DB and the one that is created when the row is prepared for possible updates).


The second idea I tried to describe was that maybe such a read-only object is always an end point for registering a transitive closure of objects. 

It holds the roots of several object "trees", like the Customer and its address, orders etc. and the list of orders of the last quarter. These trees are possibly completely unrelated and will be tracked by Glorp anyways, even if a View object itself doesn't provide anything to the register transitive closure operation. Am I right? And if they are related, they will probably be found in the register transitive closure phase anyways. )Pleas note: by "related" I mean a link between the two trees through a path of resolved proxies / real objects). So my idea was (but I forgot to mention) that a view / read-only object behaves like a Proxy that is not resolved yet. It is an endpoint to the traversal of possible candidates for DB changes.

I am not completely sure about this, however. So I would like to discuss this and probably find co-warriors to extend Glorp towards read-only-mappings...

(bringing us back to the question which code base would be the one to do this in: VW / Pharo / VAST)


Not sure about your "cached forever" comment. Once a read-only object is materialized, there is no need to cache it any more. The instantiated Object is caxche enough. What may be needed is teh ability to refresh such a read-only object, but I see no need to cache it. It's read only, so I assume it doesn't change. And if it does, I assume I never want to write these changes back to the DB. So it can be detached / transient or whatever you'd like to call it. Wrong?



Joachim









Am 22.02.25 um 01:06 schrieb Alan Knight:
You received this message because you are subscribed to a topic in the Google Groups "glorp-group" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/glorp-group/Zh23rVNtOSs/unsubscribe.
To unsubscribe from this group and all its topics, send an email to glorp-group...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/glorp-group/CAGWHZ9_Gc7N6g89wC2j9kEhEHoBZjw_b103hAcSWw88C%3D05d8Q%40mail.gmail.com.
-- 

----------------------------------------------------------------------- 
Objektfabrik Joachim Tuchel              mailto:jtu...@objektfabrik.de 
Fliederweg 1                                 http://www.objektfabrik.de
D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
Telefon: +49 7141 56 10 86 0                    Fax: +49 7141 56 10 86 1

Reply all
Reply to author
Forward
0 new messages