Old description:
> I have observed consistent GEOSContextHandle leaks when using Django
> Geometry features in temporary threads. And I can avoid the leaks by
> manually clearing all attributes of
> `django.contrib.gis.geos.prototypes.io.thread_context`. My theory is that
> destructors of attributes in `io.thread_context` call some GEOSFunc
> objects, and that can create new GEOSContextHandle while Python is
> clearing thread local storage.
>
> 1. threadsafe.thread_context.handle is cleared
> 2. io.thread_context attributes are cleared
> 3. io.thread_context attributes are destructed, and then created new
> threadsafe.thread_context.handle.
>
> When I am trying a minimized sample, I also found that it got double free
> or corruption error very often if I do GC in main thread right after the
> thread using GEOS is joined. So this may be another big issue
>
> BTW, I am using Python 2.7.12 and django 1.11. But after checking the
> latest Django code, I think the issue is still there.
>
> My sample code is:
>
> {{{
> #!div style="font-size: 80%"
> Code highlighting:
> {{{#!python
> #!/usr/bin/env python
> import gc
> import threading
> from django.contrib.gis.geos import GEOSGeometry
>
> _old_objs = None
> _new_objs = None
> _first_time = True
>
> def gc_objects():
>     gc.collect()
>     objs_counts = {}
>     gc_objs = {}
>     for obj in gc.get_objects():
>         key = str(type(obj))
>         if key in objs_counts:
>             objs_counts[key] += 1
>         else:
>             gc_objs[key] = obj
>             objs_counts[key] = 1
>     return objs_counts, gc_objs
>
> def dump_memory_leaks():
>     global _old_objs
>     global _new_objs
>     global _first_time
>     if _first_time:
>         _old_objs, _ = gc_objects()
>         _first_time = False
>     else:
>         _new_objs, gc_objs = gc_objects()
>         leaked = {}
>         for k, v in _new_objs.iteritems():
>             old_v = _old_objs.get(k)
>             if old_v:
>                 diff = _new_objs[k] - old_v
>                 if diff > 0:
>                     leaked[str(k)] = diff
>             else:
>                 leaked[str(k)] = v
>
>         print "Leaks: {}".format(leaked)
>
>         _new_objs = None
>
> def use_geos():
>     GEOSGeometry('POINT(5 23)')
>     # These lines can get rid of the GEOSContextHandle leak
>     # from django.contrib.gis.geos.prototypes.io import thread_context as io_thread_context
>     # io_thread_context.__dict__.clear()
>     dump_memory_leaks()
>
> if __name__ == '__main__':
>     for i in xrange(10):
>         t = threading.Thread(target=use_geos)
>         t.start()
>         t.join()
>         # If I do GC here, it will crash with "double free or corruption" at random `i`
>         # dump_memory_leaks()
>
> }}}
> }}}
New description:
I have observed consistent GEOSContextHandle leaks when using Django
geometry features in temporary threads. I can avoid the leaks by
manually clearing all attributes of
`django.contrib.gis.geos.prototypes.io.thread_context`. My theory is that
the destructors of the attributes in `io.thread_context` call some GEOSFunc
objects, and that can create a new GEOSContextHandle while Python is
clearing thread-local storage.
1. `threadsafe.thread_context.handle` is cleared.
2. `io.thread_context` attributes are cleared.
3. `io.thread_context` attributes are destructed, which creates a new
`threadsafe.thread_context.handle`.
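The suspected ordering problem can be illustrated with a toy model. This is only a sketch of the theory above, not Django's code: the plain dicts stand in for the two thread-local stores, and `simulate_teardown`, `get_handle` and `FakeWriter` are hypothetical names. It also relies on CPython running `__del__` as soon as the last reference is dropped.

```python
def simulate_teardown(clear_io_first):
    """Toy model: return True if a handle exists after both stores are cleared."""
    handle_store = {}  # stands in for threadsafe.thread_context
    io_store = {}      # stands in for io.thread_context

    def get_handle():
        # Lazily create the per-thread handle, as a GEOS call is assumed to do.
        if "handle" not in handle_store:
            handle_store["handle"] = object()
        return handle_store["handle"]

    class FakeWriter:
        def __del__(self):
            # Assumption: the real destructor calls into GEOS and
            # therefore needs (and lazily recreates) the handle.
            get_handle()

    get_handle()
    io_store["writer"] = FakeWriter()

    if clear_io_first:
        io_store.clear()      # destructor runs while the handle still exists
        handle_store.clear()
    else:
        handle_store.clear()  # step 1: handle is cleared
        io_store.clear()      # steps 2-3: destructor recreates the handle
    return "handle" in handle_store


print(simulate_teardown(clear_io_first=False))
print(simulate_teardown(clear_io_first=True))
```

In this model, a handle survives only when the handle store is cleared before the I/O store, which matches the order described in the three steps.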
While putting together a minimal sample, I also found that it very often
fails with a "double free or corruption" error if I run GC in the main
thread right after the thread using GEOS is joined. So this may be another
serious issue.
BTW, I am using Python 2.7.12 and Django 1.11, but after checking the
latest Django code, I think the issue is still there.
My sample code is:
{{{
#!div style="font-size: 80%"
Code highlighting:
{{{#!python
#!/usr/bin/env python
import gc
import threading
from django.contrib.gis.geos import GEOSGeometry

_old_objs = None
_new_objs = None
_first_time = True

def gc_objects():
    gc.collect()
    objs_counts = {}
    gc_objs = {}
    for obj in gc.get_objects():
        key = str(type(obj))
        if key in objs_counts:
            objs_counts[key] += 1
        else:
            gc_objs[key] = obj
            objs_counts[key] = 1
    return objs_counts, gc_objs

def dump_memory_leaks():
    global _old_objs
    global _new_objs
    global _first_time
    if _first_time:
        _old_objs, _ = gc_objects()
        _first_time = False
    else:
        _new_objs, gc_objs = gc_objects()
        leaked = {}
        for k, v in _new_objs.iteritems():
            old_v = _old_objs.get(k)
            if old_v:
                diff = _new_objs[k] - old_v
                if diff > 0:
                    leaked[str(k)] = diff
            else:
                leaked[str(k)] = v

        print "Leaks: {}".format(leaked)

        _new_objs = None

def use_geos():
    GEOSGeometry('POINT(5 23)')
    # These lines can get rid of the GEOSContextHandle leak
    # from django.contrib.gis.geos.prototypes.io import thread_context as io_thread_context
    # io_thread_context.__dict__.clear()
    dump_memory_leaks()

if __name__ == '__main__':
    for i in xrange(10):
        t = threading.Thread(target=use_geos)
        t.start()
        t.join()
        # If I do GC here, it will crash with "double free or corruption" at random `i`
        # dump_memory_leaks()
}}}
}}}
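The commented-out workaround in `use_geos()` could be packaged as a small wrapper for thread targets. This is a minimal sketch, written for Python 3; `run_and_clear_locals` is a hypothetical helper, not part of Django, and it only automates the manual `__dict__.clear()` call from the sample. It does not address the underlying teardown order.

```python
import threading


def run_and_clear_locals(target, *thread_locals):
    """Wrap target so the given threading.local objects are emptied
    for the current thread once target returns or raises."""
    def wrapper(*args, **kwargs):
        try:
            return target(*args, **kwargs)
        finally:
            for local_obj in thread_locals:
                # Same operation as the manual workaround in the sample.
                local_obj.__dict__.clear()
    return wrapper


# Intended use with the sample (requires Django with GEOS installed):
#
#   from django.contrib.gis.geos.prototypes.io import thread_context as io_thread_context
#   t = threading.Thread(target=run_and_clear_locals(use_geos, io_thread_context))
#   t.start()
#   t.join()
```

Clearing in a `finally` block means the attributes are released while the thread is still running normally, rather than during thread-local teardown.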
Output:
{{{
Leaks: {"<type 'dict'>": 2, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 1, "<type 'weakref'>": 1, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 1, "<type 'frame'>": 1}
Leaks: {"<type 'weakref'>": 2, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 2, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 2, "<type 'dict'>": 4}
Leaks: {"<type 'weakref'>": 3, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 3, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 3, "<type 'dict'>": 6}
Leaks: {"<type 'weakref'>": 4, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 4, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 4, "<type 'dict'>": 8}
Leaks: {"<type 'frame'>": 1, "<type 'weakref'>": 5, "<type 'instancemethod'>": 1, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 5, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 5, "<type 'dict'>": 10}
Leaks: {"<type 'weakref'>": 6, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 6, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 6, "<type 'dict'>": 12}
Leaks: {"<type 'weakref'>": 7, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 7, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 7, "<type 'dict'>": 14}
Leaks: {"<type 'weakref'>": 8, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 8, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 8, "<type 'dict'>": 16}
Leaks: {"<type 'weakref'>": 9, "<class 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 9, "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 9, "<type 'dict'>": 18}
}}}
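The per-type counting behind this output can also be written more compactly with `collections.Counter`. This is a sketch for Python 3 with hypothetical function names (`count_objects_by_type`, `diff_counts`), not a replacement for the script in the ticket. Subtracting one `Counter` from another keeps only positive differences, so the result lists the types whose instance count grew.

```python
import gc
from collections import Counter


def count_objects_by_type():
    """Count GC-tracked objects, keyed by the string form of their type."""
    gc.collect()
    return Counter(str(type(obj)) for obj in gc.get_objects())


def diff_counts(old, new):
    """Return {type_name: increase} for every type whose count grew."""
    # Counter subtraction drops zero and negative results.
    return dict(new - old)


baseline = count_objects_by_type()
# ... run the code under investigation here ...
print("Leaks: {}".format(diff_counts(baseline, count_objects_by_type())))
```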
--
Ticket URL: <https://code.djangoproject.com/ticket/29878#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Old description:
> }}}
> }}}
>
New description:
My sample code is:
    for obj in gc.get_objects():
        key = str(type(obj))
        if key in objs_counts:
            objs_counts[key] += 1
        else:
            objs_counts[key] = 1
    return objs_counts, gc_objs
def dump_memory_leaks():
    global _old_objs
    global _new_objs
    global _first_time
    if _first_time:
        _old_objs, _ = gc_objects()
        _first_time = False
    else:
        _new_objs = gc_objects()
        print "Leaks: {}".format(leaked)
        _new_objs = None
}}}
}}}
Output:
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29878#comment:2>
* type: Uncategorized => Bug
--
Ticket URL: <https://code.djangoproject.com/ticket/29878#comment:4>
* cc: Sergey Fedoseev (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/29878#comment:5>
* stage: Unreviewed => Accepted
Old description:
>     for obj in gc.get_objects():
>         key = str(type(obj))
>         if key in objs_counts:
>             objs_counts[key] += 1
>         else:
>             objs_counts[key] = 1
>     return objs_counts
>
> def dump_memory_leaks():
>     global _old_objs
>     global _new_objs
>     global _first_time
>     if _first_time:
>         _old_objs = gc_objects()
New description:
My sample code is:
def dump_memory_leaks():
    global _old_objs
    global _new_objs
    global _first_time
    if _first_time:
        _old_objs = gc_objects()
        _first_time = False
    else:
        _new_objs = gc_objects()
        leaked = {}
        for k, v in _new_objs.items():
            old_v = _old_objs.get(k)
            if old_v:
                diff = _new_objs[k] - old_v
                if diff > 0:
                    leaked[str(k)] = diff
            else:
                leaked[str(k)] = v
        print("Leaks: {}".format(leaked))
        _new_objs = None
def use_geos():
    GEOSGeometry('POINT(5 23)')
    # These lines can get rid of the GEOSContextHandle leak
    # from django.contrib.gis.geos.prototypes.io import thread_context as io_thread_context
    # io_thread_context.__dict__.clear()
    dump_memory_leaks()
if __name__ == '__main__':
    for i in range(10):
}}}
}}}
Output:
}}}
--
Comment:
Tentatively accepting. I'm not a GeoDjango expert. I updated the ticket
description to make the script compatible with Python 3 as Python 2 is no
longer supported as of Django 2.0.
--
Ticket URL: <https://code.djangoproject.com/ticket/29878#comment:6>
Comment (by Yong Li):
Thanks!
Replying to [comment:6 Tim Graham]:
> Tentatively accepting. I'm not a GeoDjango expert. I updated the ticket
> description to make the script compatible with Python 3 as Python 2 is no
> longer supported as of Django 2.0.
--
Ticket URL: <https://code.djangoproject.com/ticket/29878#comment:7>