100% CPU and Memory utilization in DB query

30 views
Skip to first unread message

Nitin Jain

unread,
Oct 26, 2019, 1:50:53 PM10/26/19
to sqlalchemy
Hi all ,

We are observing very high memory and cpu consumption almost 100% CPU and memory for one of the API call  witch select query . 


In our model we have many tables like


TABLE A - Parent  ( recording session)


Table B - Child of A ( table a id as FK)  (video )


Table C - Child of B ( table C id as FK) ---> Very high Memory / CPU ( images)


In addition to above table we have many other related tables (parent - child ) also .


Table A is like recording session which contains many videos ( Table B) which contains many images ( Table C) . 


So for 30 min recording session we have 30 clips of videos and then 30 * 30 = 900 images i.e image metadata and not actual image. 


We have created relationships between tables using db.relationship so that we can use filter operations as well.


Once we trigger the query to fetch say 900 images metadata and not actual images then  query is almost blocked with 100 % CPU and Memory. 

.

Please let me know how to debug this issue . We suspect issue in db relationships  / some  infinite loop in query . 


class Images(db.Model):
__tablename__ = 'images'

videoclips = db.relationship('VideoClip', backref='images', lazy='joined')


class VideoClip(db.Model):
__tablename__ = 'video_clip'

images = db.relationship('Images', backref='video_clip', lazy='joined')

Similarly there are many one to many relationships are defined. Is it related to lazy=joined which i 
have used  for filter operations. 

Please let me know if anyone has faced this kind of issues. 

Regards
Nitin



Mike Bayer

unread,
Oct 26, 2019, 5:43:03 PM10/26/19
to noreply-spamdigest via sqlalchemy
Hi there -


the first thing you should be looking at is the SQL output, the code here is not very important.  have you turned on SQL logging on both the Python side, as well as enabled server-side metrics, such as if this is MySQL you want to use slow query log ?  have you done EXPLAIN on the slow queries in question ?     is the CPU blockage in the Python application side or in the database?   what kind of database?      is the slowness during the waiting for queries to complete or in the fetching of objects?   these are all questions you can get answers using to in order to start understanding the problem.

Take a  look at https://docs.sqlalchemy.org/en/13/faq/performance.html#query-profiling for some guidance on getting started on this.
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.

Nitin Jain

unread,
Oct 30, 2019, 6:56:13 PM10/30/19
to sqlalchemy
Thanks for reply . 

I am using postures 9.6 DB and CPU / Memory block is at both DB and application side. 

Thanks for suggestion and using EXPLAIN i got very important information like even though we have fetched 60 records some 40000 rows were affected and it seems like some improper joins are happening because of which if same query we perform using PostgreSQL ( pg-admin) db tool also we got memory exceptions. 

In SQL alchemy logs we can see improper joins are happening like marked in red below . It seems some circular  reference is happening in parent child relationships. 

We have example scenario like 

Table A -  Recording  - Parent  >> define tags relationships here 

Table B - Video  --> Child of Table A >> define tags and recordings relationships here 

Table C - Tags  ---> Child of both Table A and Table B . --> define both video and recordings here  

Recordings can have many videos and tags and videos can also have many tags . 

Here if we fetch videos , it seems it  join tags and then in tags it join recordings  and then again tags . 

Please note here we just define db relationships and don't explicitly write SQL queries and  this is handled by flask sqlalchemy.  

So if we define properly relationships then i think this issue will be resolved as well. 



********************************************************************************************************************************


SELECT required columns 

FROM (SELECT  videoclip  columns FROM video_clip LIMIT 100)  AS anon_1


LEFT OUTER JOIN recording_session AS recording_session_1 ON recording_session_1.recording_id = anon_1.video_clip_recording_id 

LEFT OUTER JOIN maindriver_info AS maindriver_info_1 ON maindriver_info_1.main_driver_id = recording_session_1.main_driver_id 

LEFT OUTER JOIN auxdriver_info AS auxdriver_info_1 ON auxdriver_info_1.aux_driver_id = recording_session_1.aux_driver_id 

LEFT OUTER JOIN car_info AS car_info_1 ON car_info_1.automobile_id = recording_session_1.automobile_id 

LEFT OUTER JOIN camera_info AS camera_info_1 ON camera_info_1.camera_id = recording_session_1.camera_id 

LEFT OUTER JOIN tag AS tag_1 ON recording_session_1.recording_id = tag_1.recordingsession_content_id 

LEFT OUTER JOIN driving_speed AS driving_speed_1 ON recording_session_1.recording_id = driving_speed_1.recording_id 

LEFT OUTER JOIN contextlabel AS contextlabel_1 ON recording_session_1.recording_id = contextlabel_1.recording_id

LEFT OUTER JOIN tag AS tag_2 ON anon_1.video_clip_video_clip_id = tag_2.videoclip_content_id

LEFT OUTER JOIN recording_session AS recording_session_2 ON recording_session_2.recording_id = tag_2.recordingsession_content_id

LEFT OUTER JOIN maindriver_info AS maindriver_info_2 ON maindriver_info_2.main_driver_id = recording_session_2.main_driver_id

LEFT OUTER JOIN auxdriver_info AS auxdriver_info_2 ON auxdriver_info_2.aux_driver_id = recording_session_2.aux_driver_id

LEFT OUTER JOIN car_info AS car_info_2 ON car_info_2.automobile_id = recording_session_2.automobile_id 

LEFT OUTER JOIN camera_info AS camera_info_2 ON camera_info_2.camera_id = recording_session_2.camera_id 

LEFT OUTER JOIN driving_speed AS driving_speed_2 ON recording_session_2.recording_id = driving_speed_2.recording_id 

LEFT OUTER JOIN contextlabel AS contextlabel_2 ON recording_session_2.recording_id = contextlabel_2.recording_id 

LEFT OUTER JOIN contextlabel AS contextlabel_3 ON anon_1.video_clip_video_clip_id = contextlabel_3.videoclip_content_id

 



Regards
Nitin

To unsubscribe from this group and stop receiving emails from it, send an email to sqlal...@googlegroups.com.

Mike Bayer

unread,
Oct 30, 2019, 9:56:09 PM10/30/19
to noreply-spamdigest via sqlalchemy


On Wed, Oct 30, 2019, at 6:56 PM, Nitin Jain wrote:
Thanks for reply . 

I am using postures 9.6 DB and CPU / Memory block is at both DB and application side. 

Thanks for suggestion and using EXPLAIN i got very important information like even though we have fetched 60 records some 40000 rows were affected and it seems like some improper joins are happening because of which if same query we perform using PostgreSQL ( pg-admin) db tool also we got memory exceptions. 

In SQL alchemy logs we can see improper joins are happening like marked in red below . It seems some circular  reference is happening in parent child relationships. 

We have example scenario like 

Table A -  Recording  - Parent  >> define tags relationships here 

Table B - Video  --> Child of Table A >> define tags and recordings relationships here 

Table C - Tags  ---> Child of both Table A and Table B . --> define both video and recordings here  

Recordings can have many videos and tags and videos can also have many tags . 

Here if we fetch videos , it seems it  join tags and then in tags it join recordings  and then again tags . 

Please note here we just define db relationships and don't explicitly write SQL queries and  this is handled by flask sqlalchemy.  

So if we define properly relationships then i think this issue will be resolved as well. 


even if those joins in red are removed, that's still vastly too many joins.

SQLAlchemy never emits LEFT OUTER JOIN unless explicitly told to do so.   The directives which may be emitting these LEFT OUTER JOINs include:

with_polymorphic="*" on a joined table inheritance mapping

lazy="joined" on relationship()

joinedload()  in Query.options

with_polymorphic(BaseClass, "*") with Query.

and of course query.outerjoin() and query.select_from(some join).

These join appear to be emitted from relationships that likely have lazy="joined" at the mapping level; these should be removed.    Instead, use joinedload() at Query time to optimize only those relationship paths which you actually need to load.  Additionally, use joinedload() only for many-to-one relationships, and try to use it only for non-nullable foreign keys, along with the innerjoin=True flag so that an inner join is used, not an outer join.  For one-to-many relationships, use the selectin() loader at query time which is much more efficient than joinedload().






To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.

Nitin Jain

unread,
Nov 5, 2019, 2:06:11 PM11/5/19
to sqlalchemy
Thanks a lot Mike . After using selectin loader (as per understanding in flask-sqlalchemy lazy=true/select is same as selectin ) instead of joined , previous problem related to high cpu / memory is resolved . 

I have one more query related to exclusion of nested relationships e.g. for joining the tables , can we choose level of nested relationships  in case of one to many relationships ? 


Reply all
Reply to author
Forward
0 new messages