I am trying to do some custom packing/unpacking of column names in pycassa and am not having much luck. My column family is defined like this:
create column family CF
with comparator = 'CompositeType(UTF8Type, LongType, UTF8Type)'
and key_validation_class=LongType
and default_validation_class=UTF8Type;
And pycassa happily marshals the data like so:
>>> CF.get(123, column_count=1)
OrderedDict([((u'astring', 123, u'anotherstring'), {'ajson': 'blob'})])
I am manually converting the column names from a tuple such as (u'astring', 123, u'anotherstring') into a delimited str (e.g. "astring,123,anotherstring") for use in a URL. When storing new columns, I have to change the string back into a tuple with tuple(s.split(',')).
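For illustration, the manual round trip described above can be sketched as the two helpers below. The names `name_to_str` and `str_to_name` are hypothetical, not part of pycassa, and the sketch assumes that no component contains the separator. Note that a plain `tuple(s.split(','))` leaves the middle component as a string, so it is converted back to an integer here to match the LongType component.

```python
def name_to_str(name, sep=','):
    """Join a composite column name tuple into a delimited string.

    Hypothetical helper; assumes no component contains `sep`.
    """
    return sep.join(str(part) for part in name)


def str_to_name(s, sep=','):
    """Split a delimited string back into a (str, int, str) tuple.

    The middle component is cast to int because the comparator
    declares it as LongType.
    """
    first, middle, last = s.split(sep)
    return (first, int(middle), last)
```

With these, `str_to_name(name_to_str(name))` should give back the original tuple for names shaped like the one in the example.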
What I want is to have these conversion steps on the column names be transparent. I fiddled with a couple of different techniques:
class MyType(CassandraType):

    @staticmethod
    def pack(val):
        return val.split(',')

    @staticmethod
    def unpack(val):
        return ','.join(['%s'] * len(val)) % val
This obviously doesn't work because by the time these methods are called, the value passed in is somewhat unpacked:
'^TastringP*i!^Tanotherstring'
It looks like there is some padding character, ^T, in front of each string, and I can unpack the number in the middle using struct.unpack('>I', 'P*i!'), but I have no way of telling where the LongType in the middle of the CompositeType starts and ends.
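For reference, here is a minimal sketch of the layout that Cassandra's CompositeType is documented to use, as far as I can tell: each component is a 2-byte big-endian length, then the component bytes, then one end-of-component byte, and a LongType value is an 8-byte big-endian signed integer. If that holds, the leading bytes would be length prefixes rather than padding. The functions `encode_composite` and `decode_composite` are hypothetical names and do not use pycassa at all; they only assume the (UTF8Type, LongType, UTF8Type) comparator from the schema above.

```python
import struct


def encode_composite(first, number, last):
    """Serialize (str, int, str) using the assumed CompositeType layout."""
    parts = [
        first.encode('utf-8'),
        struct.pack('>q', number),  # LongType: 8-byte big-endian signed
        last.encode('utf-8'),
    ]
    out = b''
    for part in parts:
        # 2-byte length, the value itself, then an end-of-component byte
        out += struct.pack('>H', len(part)) + part + b'\x00'
    return out


def decode_composite(raw):
    """Walk the length prefixes to split the raw bytes into components."""
    parts = []
    pos = 0
    while pos < len(raw):
        (length,) = struct.unpack_from('>H', raw, pos)
        pos += 2
        parts.append(raw[pos:pos + length])
        pos += length + 1  # skip the end-of-component byte
    first, number, last = parts
    return (
        first.decode('utf-8'),
        struct.unpack('>q', number)[0],
        last.decode('utf-8'),
    )
```

Because every component carries its own length, the position of the LongType falls out of the walk and never has to be guessed from the bytes themselves.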
If I try inheriting from CompositeType instead of CassandraType but try the same static methods, I get the same results. It seems like the pack and unpack methods aren't running at the very last step, at least in the case of CompositeTypes?
btw, I am on pycassa 1.6.0, but I tried in 1.7.2 also and got the same results.
Thanks, hopefully there is enough info here...I tried tracing the marshaling code for a bit but I don't know enough about Thrift or the underlying storage protocol for Cassandra so it was difficult to follow :-\