Cython classes and attribute cache

Nils Bruin

Apr 6, 2013, 1:55:02 PM
to cython-users
At present it seems that (at least on Python 2.7) Cython-generated
extension classes do not have the "Py_TPFLAGS_HAVE_VERSION_TAG" flag
set. That means they opt out of the attribute cache and therefore
do not benefit from the speedups that cache can produce. Classes
derived from them don't benefit either, and the slowdown can be
considerable with a deep MRO.

The following code snippet illustrates the issue. We produce two
deeply nested subclass hierarchies, one deriving from `object`, the
other from a no-op cython class. We then test attribute access on
instances of either (as well as instances of classes higher up in the
hierarchy for reference).

------------------------ code ---------------------------
cython("""
cdef class cython_object(object):
    pass

cdef extern from "Python.h":
    ctypedef struct PyTypeObject_ext "PyTypeObject":
        void * tp_getattr
        void * tp_setattr
        void * tp_getattro
        void * tp_setattro
        long tp_version_tag
        long tp_flags

def get_type_flags(T):
    cdef PyTypeObject_ext * V
    V = <PyTypeObject_ext *><void *> T
    print "tp_getattr:",<long> V.tp_getattr
    print "tp_setattr:",<long> V.tp_setattr
    print "tp_getattro:",<long> V.tp_getattro
    print "tp_setattro:",<long> V.tp_setattro
    print "tp_version_tag:",V.tp_version_tag
    return V.tp_flags
""")
def shallow_and_deep(base):
    shallow = type("T0",(base,),{"t":100})
    P=shallow
    for i in range(1,1000):
        P=type("T%d"%i,(P,),{"a%i"%i:1,"b%i"%i:2})
    deep = P
    return (shallow,deep)

S_py,D_py=shallow_and_deep(object)
s_py=S_py()
s_py.u=100
d_py=D_py()
d_py.u=100
print "python shallow class attribute:"
timeit("s_py.t",number=20000)
print "python deep class attribute:"
timeit("d_py.t",number=20000)
print "python shallow instance attribute:"
timeit("s_py.u",number=20000)
print "python deep instance attribute:"
timeit("d_py.u",number=20000)

S_cy,D_cy=shallow_and_deep(cython_object)
s_cy=S_cy()
s_cy.u=100
d_cy=D_cy()
d_cy.u=100
print "cython shallow class attribute:"
timeit("s_cy.t",number=20000)
print "cython deep class attribute:"
timeit("d_cy.t",number=20000)
print "cython shallow instance attribute:"
timeit("s_cy.u",number=20000)
print "cython deep instance attribute:"
timeit("d_cy.u",number=20000)

print "pure python object fields:"
py_flags=get_type_flags(D_py)
print "cython derived object fields:"
cy_flags=get_type_flags(D_cy)

py_V=py_flags & (1<<18) # Py_TPFLAGS_HAVE_VERSION_TAG
print "pure python object has version tag:",bool(py_V)
cy_V=cy_flags & (1<<18) # Py_TPFLAGS_HAVE_VERSION_TAG
print "cython derived object has version tag:",bool(cy_V)
--------------------------------------------------------------

Output:

python shallow class attribute:
20000 loops, best of 3: 53.7 ns per loop
python deep class attribute:
20000 loops, best of 3: 54 ns per loop
python shallow instance attribute:
20000 loops, best of 3: 52.7 ns per loop
python deep instance attribute:
20000 loops, best of 3: 52.7 ns per loop
cython shallow class attribute:
20000 loops, best of 3: 66.8 ns per loop
cython deep class attribute:
20000 loops, best of 3: 38.2 µs per loop
cython shallow instance attribute:
20000 loops, best of 3: 83.9 ns per loop
cython deep instance attribute:
20000 loops, best of 3: 23.3 µs per loop
pure python object fields:
tp_getattr: 0
tp_setattr: 0
tp_getattro: 139723747864304
tp_setattro: 139723747864976
tp_version_tag: 2440
cython derived object fields:
tp_getattr: 0
tp_setattr: 0
tp_getattro: 139723747864304
tp_setattro: 139723747864976
tp_version_tag: 0
pure python object has version tag: True
cython derived object has version tag: False

As you can see, for instances of the pure Python classes MRO depth is
irrelevant, because the cache absorbs the lookup. This holds for the
instance attribute too: the fact that nothing higher up in the MRO
(a data descriptor, for instance) shadows the instance attribute
also gets cached.

The rest of the code confirms that the cython-derived class does not
have the "HAVE_VERSION_TAG" flag.
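
The same flag check can be sketched from plain Python, without the
Cython helper, by reading a type's `__flags__` attribute, which exposes
tp_flags. This is only an illustration under assumptions: it uses
Python 3 syntax, the helper name `has_version_tag` is made up here, and
the bit position 1<<18 is simply the one used in the snippet above.

```python
# Bit position used for Py_TPFLAGS_HAVE_VERSION_TAG in the snippet above.
HAVE_VERSION_TAG = 1 << 18

def has_version_tag(cls):
    """Return True if the type's tp_flags (exposed as __flags__) has the bit set."""
    return bool(cls.__flags__ & HAVE_VERSION_TAG)

class Plain(object):
    pass

# Report the flag for a builtin type and for an ordinary Python class.
print("object:", has_version_tag(object))
print("Plain: ", has_version_tag(Plain))
```

Passing a class derived from a Cython extension type to
`has_version_tag` should give the same answer as the `cy_V` check above.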

The 1000 level deep hierarchy is of course an exaggeration, but
already with 4 levels (which can easily happen in real-world code),
lookup takes about twice the time.
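
To try other depths, the hierarchy builder can take the depth as a
parameter. The following is a rough sketch, not the code used for the
timings above: it is written in Python 3 syntax, uses the standard
`timeit` module instead of Sage's `timeit`, and the name
`time_class_attribute` and the `depth` parameter are assumptions made
for illustration.

```python
import timeit

def shallow_and_deep(base, depth):
    """Build a chain of `depth` classes on top of `base`; return (first, last)."""
    shallow = type("T0", (base,), {"t": 100})
    cls = shallow
    for i in range(1, depth):
        cls = type("T%d" % i, (cls,), {"a%d" % i: 1, "b%d" % i: 2})
    return shallow, cls

def time_class_attribute(base, depth, number=20000):
    """Seconds taken by `number` lookups of class attribute `t` on a deep instance."""
    _, deep = shallow_and_deep(base, depth)
    inst = deep()
    return timeit.timeit("inst.t", globals={"inst": inst}, number=number)

# Compare a few depths; pass a Cython extension class as `base` to see
# the effect of a missing version tag.
for depth in (1, 4, 16):
    print(depth, time_class_attribute(object, depth))
```

With `object` as the base the timings should stay roughly flat as
`depth` grows; the interesting comparison is against a base class that
lacks the flag.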

----

Cython seems to initialize tp_flags with

Py_TPFLAGS_DEFAULT|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER|Py_TPFLAGS_BASETYPE

The following note from python's "object.h" seems relevant:

"""
NOTE: when building the core, Py_TPFLAGS_DEFAULT includes
Py_TPFLAGS_HAVE_VERSION_TAG; outside the core, it doesn't. This is so
that extensions that modify tp_dict of their own types directly don't
break, since this was allowed in 2.5. In 3.0 they will have to
manually remove this flag though!
"""

This suggests that on Python 3.*, the difference observed above would
not be present. It would be nice if Cython on 2.7 also had an
option (probably turned on by default) to include the
"HAVE_VERSION_TAG" flag, because in most cases it should be entirely
valid to set it and it should lead to significant performance gains.

Nils Bruin

Apr 6, 2013, 3:37:54 PM
to cython-users
I can confirm that changing Compiler/TypeSlots.py, line 380 from:

value = "Py_TPFLAGS_DEFAULT|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER"

to:

value = "Py_TPFLAGS_DEFAULT|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER|Py_TPFLAGS_HAVE_VERSION_TAG"

indeed makes the difference in timing disappear. For extension classes
that don't modify their tp_dict directly, having this flag should be
entirely safe.

See http://bugs.python.org/issue1700288 for background on the
introduction of the method cache in python and http://bugs.python.org/issue1878
for the reason why the flag is not included in Py_TPFLAGS_DEFAULT on
2.7, but is in 3.*.

Stefan Behnel

Apr 6, 2013, 3:58:12 PM
to cython...@googlegroups.com, Cython-devel
Nils Bruin, 06.04.2013 21:37:
I wouldn't mind making it an "on by default" compiler directive. That would
mean that you could switch it off with a class decorator at need.

Any objections?

Stefan

Stefan Behnel

Apr 6, 2013, 4:33:22 PM
to cython...@googlegroups.com, Cython-devel
Stefan Behnel, 06.04.2013 21:58:
Proposed patch:

https://github.com/scoder/cython/commit/b1f10fa2c1bbb0a076807a9f391204c6a997ba9a


> Any objections?

Stefan

Robert Bradshaw

Apr 6, 2013, 6:11:33 PM
to cython...@googlegroups.com
+1 from me.

- Robert

Nils Bruin

Apr 6, 2013, 6:21:32 PM
to cython-users
Doesn't "value^Py_FLAG" just toggle the flag? In order to reliably
(i.e., in both Py2 and Py3) turn it off, don't you need to do
"value&~Py_FLAG"?

Coming up with a test that verifies the caching/noncaching behaviour
is rather hard, since Python goes out of its way to make the class
attribute dictionary read-only, by only exposing a dictproxy to it,
and proxyobjects are deliberately opaque. This suggests that turning on
attribute caching by default should really be pretty safe.
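
The read-only proxy is easy to see from plain Python. A small sketch,
assuming Python 3 syntax (where the proxy type is called `mappingproxy`
rather than `dictproxy`); the class `P` and the variable
`proxy_is_writable` are only illustrative:

```python
class P(object):
    t = 100

p = P()

# Direct item assignment on the class __dict__ proxy is rejected.
try:
    P.__dict__["t"] = 200
    proxy_is_writable = True
except TypeError:
    proxy_is_writable = False

# The supported route is setattr on the class. The interpreter sees this
# change, so the lookup through the instance picks up the new value.
setattr(P, "t", 300)
print(proxy_is_writable, p.t)
```

Because every supported write path goes through the type, a stale cache
entry can only be produced by reaching the underlying dict some other
way.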

I've come up with the following, which is probably not quite legal,
since it has to recreate a non-exported typedef. I didn't see how to
do that in cython, due to the PyObject_HEAD macro.

-----------------------------------------
cython("""
#//contents of proxyobject.h
#//see Objects/descrobject.c:658
#typedef struct {
#    PyObject_HEAD
#    PyObject *dict;
#} proxyobject;

cdef extern from "proxyobject.h":
    ctypedef struct proxyobject:
        PyObject *dict

def unwrap(P):
    cdef proxyobject * V
    V=<proxyobject *>P
    return <object>V.dict
""")

class P(object):
    t=100
p=P()
print "before meddling p.t=",p.t
unwrap(P.__dict__)['t']=200
print "after meddling (but with cache) p.t=",p.t
print "P.__dict__['t']=",P.__dict__['t']
for i in range(10000):
    try:
        c=getattr(P,'%s'%i)
    except:
        pass
print "after flushing cache p.t=",p.t
------------------------------------------

Anyway, on P, a normal pure-Python class that does have attribute
caching, we get the output:

before meddling p.t= 100
after meddling (but with cache) p.t= 100
P.__dict__['t']= 200
after flushing cache p.t= 200

which confirms the caching. By replacing the `object` from which P
derives with a Cython class decorated with
@cython.type_version_tag(True) or @cython.type_version_tag(False),
you should be able to confirm that caching is or isn't taking place.

Nils Bruin

Apr 6, 2013, 7:26:44 PM
to cython-users
It's not easy to fool the caching system! The original dictionary is
only available through a rather opaque dictproxy. Getting at the
writeable dictionary underneath only seems possible via bad hackery:

cython("""
#//contents of proxyobject.h
#//see Objects/descrobject.c:658
#typedef struct {
#    PyObject_HEAD
#    PyObject *dict;
#} proxyobject;

cdef extern from "proxyobject.h":
    ctypedef struct proxyobject:
        PyObject *dict

def unwrap(P):
    cdef proxyobject * V
    V=<proxyobject *>P
    return <object>V.dict

cdef class cython_object(object):
    pass
""")

def testcache(base):
    class P(base):
        t=100
    p=P()
    print "before meddling p.t=",p.t
    unwrap(P.__dict__)['t']=200
    print "after meddling (but with cache) p.t=",p.t
    print "P.__dict__['t']=",P.__dict__['t']
    setattr(P,'v',0)
    print "after flushing cache p.t=",p.t

Anyway, with testcache(object) and testcache(cython_object) it should
be possible to confirm caching behaviour (or its absence).

Incidentally, in your patch you propose flag removal via (value^flag).
Doesn't that just toggle the flag? That means it wouldn't turn off the
flag in Py2. Shouldn't you use (value&~flag) (i.e., and with "not" the
flag)?
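
The difference between the two expressions can be checked with plain
integers. A minimal sketch in Python, using 1<<18 as the flag bit (the
position used earlier in this thread) and made-up stand-in values for
the two defaults:

```python
FLAG = 1 << 18                 # bit position used for the version-tag flag above

with_flag = 0b1011 | FLAG      # stand-in for a default that already contains the flag
without_flag = 0b1011          # stand-in for a default that lacks it

# XOR toggles: it clears the bit where it was set, but sets it where it was not.
assert ((with_flag ^ FLAG) & FLAG) == 0
assert ((without_flag ^ FLAG) & FLAG) == FLAG

# AND with the complement clears the bit regardless of the starting value,
# and leaves all other bits untouched.
assert ((with_flag & ~FLAG) & FLAG) == 0
assert ((without_flag & ~FLAG) & FLAG) == 0
assert (with_flag & ~FLAG) == without_flag
```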

Stefan Behnel

Apr 7, 2013, 1:24:26 AM
to cython...@googlegroups.com
Nils Bruin, 07.04.2013 00:21:
> doesn't "value^Py_FLAG" just toggle the flag? In order to reliably
> (i.e., in both Py2 and Py3) turn it off, don't you need to do
> "value&~Py_FLAG"?

Right, thanks. I even had that first, but then got confused with other
little details in the implementation.


> Coming up with a test that verifies the caching/noncaching behaviour
> is rather hard, since python goes out of its way to make the class
> attribute dictionary read-only, by only exposing a dictproxy to it,
> and proxyobjects are deliberately opaque. It confirms that turning on
> attribute caching by default should really be pretty safe.

Good to know, and thanks for figuring this out.

However, I don't think we need to test the caching behaviour here, since
the directive doesn't refer to the cache and the cache itself is not part
of Cython. All we need to make sure is that the directive turns on or off
the flag on the extension type. I'd rather leave the exact details of what
CPython does with the flag to CPython itself.

Stefan
