How can I improve serialization performance? It took 12 s to serialize all the WorldBorder data from the GeoDjango tutorial to GeoJSON.

Abirafdi Raditya Putra

Jan 3, 2018, 8:36:49 AM
to geodjango

I'm developing a GeoDjango app that uses the WorldBorder model provided in the tutorial.

I have extended WorldBorder so that each instance has reverse sets nested two levels deep, though they are not large (query time is insignificant compared to the total load time).

I built the API for it with DRF, but it's very slow: it takes 16 seconds to load all WorldBorder and Regions in GeoJSON format.

The returned JSON is 10 MB, though. Is that reasonable?

I even switched the serializer to serpy, which is much faster than the DRF GIS serializer, but that only gave a 10% performance improvement.

After profiling, it turns out that most of the time is spent in the GIS functions that convert the geometry type stored in the database to a list of coordinates, as opposed to WKT.

If I use WKT, serialization is much faster (1.7 s compared to 11.7 s; WKT is used only for the WorldBorder MultiPolygon, everything else is still GeoJSON).

I also tried simplifying the MultiPolygon using ST_SimplifyVW with a low tolerance (0.005) to preserve accuracy, which brings the JSON size down to 1.7 MB.

This brings the total load time down to 3.5 s. Of course, I can still look for the tolerance that best balances accuracy and speed.
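
To illustrate what the tolerance controls, here is a minimal pure-Python sketch of the Visvalingam-Whyatt idea that ST_SimplifyVW is based on. It is not the PostGIS implementation: it assumes the tolerance acts as a minimum triangle area, works on a simple open line rather than a MultiPolygon, and the names `triangle_area` and `simplify_vw` are made up for this example.

```python
def triangle_area(a, b, c):
    """Area of the triangle formed by three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def simplify_vw(points, tolerance):
    """Naive Visvalingam-Whyatt on an open line.

    Repeatedly drops the interior point whose triangle with its two
    neighbours has the smallest area, until every remaining triangle has
    an area of at least `tolerance`. The two endpoints are always kept.
    """
    pts = list(points)
    while len(pts) > 2:
        # Effective area of every interior point.
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(areas)
        if smallest >= tolerance:
            break
        # areas[0] belongs to pts[1], hence the +1 offset.
        del pts[areas.index(smallest) + 1]
    return pts


# A nearly straight segment followed by a sharp peak.
line = [(0, 0), (1, 0.001), (2, 0), (3, 2), (4, 0)]
simplified = simplify_vw(line, 0.005)
```

With this input only the almost-collinear point `(1, 0.001)` is dropped, while the peak at `(3, 2)` survives, which is why a low tolerance shrinks the payload without visibly changing the shape.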

Below is the profiling data.

(Ignore the sudden increase in queries when using ST_SimplifyVW; that was due to bad usage of the QuerySet API. I have already fixed it so that it stays at 75 queries, and the performance is still roughly the same.)

The 75 queries are also not what I want. I already use prefetch_related and take the len of the reverse set, but somehow it still queries the database.

Anyway, the database query time is not significant here compared to the profiled block of code.

The profiling was done on my local machine with a very good CPU (i7 6700K @ 4.4 GHz) and an SSD.

[Screenshot: profiling data]

Below is the profile of the function calls. I highlighted the part that takes most of the time. This one uses the vanilla DRF GIS implementation.

[Screenshot: function call profile, vanilla DRF GIS]

Below is the profile when I use WKT for one of the MultiPolygon fields, without ST_SimplifyVW. Notice that the previously slow functions have disappeared, because it's just pulling WKT from the DB, I assume?

[Screenshot: function call profile, WKT for the MultiPolygon field]

So the question is: is GeoDjango not as fast as I expected, or are these performance numbers expected? (Sorry, I don't mean to be condescending by saying GeoDjango is slow.)

What can I do to improve performance while still outputting GeoJSON, i.e. not WKT?

Is fine-tuning the tolerance the only way?
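
One idea that might be worth testing, given that pulling WKT as plain text was fast: have the database return the geometry already encoded as a GeoJSON string (for example via PostGIS ST_AsGeoJSON, which Django exposes as `AsGeoJSON` in `django.contrib.gis.db.models.functions`) and splice that string into the response without ever building coordinate lists in Python. The sketch below is untested against my actual models; `feature_collection` and the sample rows are hypothetical and only show the string-splicing step.

```python
import json


def feature_collection(rows):
    """Build a GeoJSON FeatureCollection string.

    `rows` is an iterable of (properties, geometry_json) pairs, where
    geometry_json is ALREADY a GeoJSON geometry string (assumed to come
    from the database). It is inserted as text, so it is never parsed.
    """
    features = []
    for properties, geometry_json in rows:
        features.append(
            '{"type": "Feature", "properties": %s, "geometry": %s}'
            % (json.dumps(properties), geometry_json)
        )
    return '{"type": "FeatureCollection", "features": [%s]}' % ", ".join(features)


# Hypothetical rows standing in for an annotated queryset result.
rows = [
    ({"name": "A"}, '{"type": "Point", "coordinates": [1.0, 2.0]}'),
    ({"name": "B"}, '{"type": "Point", "coordinates": [3.0, 4.0]}'),
]
output = feature_collection(rows)
```

The trade-off is that this bypasses the DRF serializer for the geometry field, so the view would have to return the string as a raw response instead of a Python dict.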

Abirafdi Raditya Putra

Jan 3, 2018, 8:40:18 AM
to geodjango
Because of the late approval, I already got my answer here. Basically, the solution is to implement caching for the serialized result.
But are these performance numbers normal?