WARNING


Aggelos Pincha

Jun 6, 2024, 3:00:31 PM
to CKAN Development Discussions
Hi All,
/home/aggelospincha/ckan/lib/default/src/ckan/ckanext/activity/model/activity.py:713: SAWarning: TypeDecorator JsonDictType() will not produce a cache key because the ``cache_ok`` attribute is not set to True.  This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions.  Set this attribute to True if this type object's state is safe to use in a cache key, or False to disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)
  results = q.all()

I came across this warning. What can I do to fix it?

william dutton

Jun 7, 2024, 11:47:49 PM
to CKAN Development Discussions, Aggelos Pincha
Hi Aggelos,

This may be better suited as an issue item on GitHub. 

To your point: the issue is that the activity system needs more Tender Loving Care (TLC). It originally had two tables, the short-form table and a table with the big JSON blobs. Since 2.8, and solidified in 2.9, there is no split, and the programmers' use case was to have the activity history cleared after 60 days or less. I do believe the majority of us who use CKAN in government want to keep it around 'forever' if possible, since it's such a useful resource for working out when things 'borked'.
My colleague and I have some enhancements we are trying to get back into core that we use on www.data.qld.gov.au, a very popular and well-updated site across Qld Government. When we upgraded to 2.9.x we had to disable email notifications due to this slowness on caching. This may have been due to the cache key, but was more down to very, very bad SQL queries being generated by SQLAlchemy.
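
As for the warning itself, the message says to set ``cache_ok`` on the type if its state is safe to use in a cache key. A minimal sketch of what that looks like, assuming a simplified stand-in for ``JsonDictType`` (the class body here is illustrative, not CKAN's actual implementation):

```python
import json

from sqlalchemy import types


class JsonDictType(types.TypeDecorator):
    """Hypothetical stand-in: stores a dict as JSON text."""

    impl = types.UnicodeText
    # The type carries no per-instance state that changes the generated SQL,
    # so (under that assumption) it is safe to take part in the cache key.
    cache_ok = True

    def process_bind_param(self, value, dialect):
        # Python dict -> JSON string on the way into the database
        return None if value is None else json.dumps(value, ensure_ascii=False)

    def process_result_value(self, value, dialect):
        # JSON string -> Python dict on the way out
        return None if value is None else json.loads(value)
```

This only silences the warning and lets SQLAlchemy cache the compiled statement; it does not change the shape of the queries themselves.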

In the activity plugin, the data column is actually a big JSON blob, so when filtering by id it should be kept out of the many joins wherever possible.

We have made many large changes to defer or omit the column until it is required for rendering, since the pages that use it only need up to 30 queries to get those records instead of the 2 million rows that could be in that table.

We also changed from ``q = model.Session.query(Activity).filter_by(object_id=package_id)``
to
```
q = model.Session.query(Activity) \
        .options(_activities_defer_data()) \
        .filter(Activity.object_id.in_(_to_list(package_id)))  # type: ignore
```

so that it was more performant, as well as changing
``Activity.object_id == group_id,`` to ``Activity.object_id.in_(groups),  # type: ignore``

This has allowed us to run the email notifications system hourly without the overlap and degradation we used to see. Could this be improved? Yes. But baby steps are needed in this space.


I am still trying to get the IUploader changes included upstream, for better integration of non-on-disk plugins to allow seamless download and 'deletion' cleanup where wanted: https://github.com/ckan/ckan/pull/6831

Anyone else is more than welcome to pick up and run with these changes, as we have found them to be very beneficial in our system with its large suite of plugins.

Regards

William Dutton
GitHub @duttonw