What's the proper way to insert a key of type `blob` when using the ByteOrderedPartitioner?
Using Python 2, I can insert a key just fine with the following:
    session.execute(
        session.prepare('INSERT INTO keys (key) VALUES (?)'),
        [bytearray('key_a')])
However, the insert fails if the byte array contains non-ASCII data:
    session.execute(
        session.prepare('INSERT INTO keys (key) VALUES (?)'),
        [bytearray('key_\x9a')])
    ...
      File "cassandra/metadata.py", line 1434, in cassandra.metadata.TokenMap.get_replicas (cassandra/metadata.c:30612)
        point = bisect_right(self.ring, token)
      File "cassandra/metadata.py", line 1467, in cassandra.metadata.Token.__lt__ (cassandra/metadata.c:31398)
        return self.value < other.value
    UnicodeDecodeError: 'ascii' codec can't decode byte 0x9a in position 4: ordinal not in range(128)
It seems the `bytearray` is converted to a `str` for the `BytesToken`, and Python 2 implicitly decodes `str` objects as ASCII when comparing them to `unicode` strings (the `BytesToken` values in `self.ring`).
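For what it's worth, the same failure can be reproduced without the driver. This is a minimal sketch in Python 3, where the decode has to be spelled out explicitly because Python 3 never does it implicitly. The helper name `implicit_ascii_decode` is made up for illustration and is not part of the driver.

```python
# Sketch only: mimics the ASCII decode that Python 2 applies to a byte
# string (str) before comparing it with a unicode string.
def implicit_ascii_decode(raw):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return raw.decode('ascii'), None
    except UnicodeDecodeError as exc:
        return None, str(exc)

ok_text, ok_err = implicit_ascii_decode(b'key_a')
bad_text, bad_err = implicit_ascii_decode(b'key_\x9a')

print(ok_text)   # the ASCII-only key decodes cleanly
print(bad_err)   # the 0x9a byte cannot be decoded as ASCII
```

The second call fails on the byte at position 4, the same byte and position reported in the traceback above.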
With Python 3, the `bytes` object is not converted to a `str` (which makes sense, since the driver would need to know which codec to use), so I can't figure out how to insert even ASCII-only values.
    session.execute(
        session.prepare('INSERT INTO keys (key) VALUES (?)'),
        [bytearray('key_a', encoding='utf-8')])
    ...
    TypeError: Tokens for ByteOrderedPartitioner should be strings (got <class 'bytes'>)
    session.execute(
        session.prepare('INSERT INTO keys (key) VALUES (?)'),
        ['key_a'])
    ...
    TypeError: Received an argument of invalid type for column "key". Expected: <class 'cassandra.cqltypes.BytesType'>, Got: <class 'str'>; (string argument without an encoding)
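The parenthetical in that last error, "string argument without an encoding", is the message Python 3 itself raises when `bytearray()` or `bytes()` is given a `str` with no encoding. That suggests, though this is only an inference from the message, that the driver attempts such a conversion. A small sketch of the built-in behaviour, using a hypothetical helper `to_blob` that is not a driver function:

```python
# Sketch only: shows how Python 3's bytearray() treats a str argument
# with and without an explicit encoding.
def to_blob(value, encoding=None):
    """Return (bytearray, None) on success or (None, error message) on failure."""
    try:
        if encoding is None:
            return bytearray(value), None
        return bytearray(value, encoding), None
    except TypeError as exc:
        return None, str(exc)

blob, err = to_blob('key_a')                 # str with no encoding: rejected
encoded, enc_err = to_blob('key_a', 'utf-8') # str with an encoding: accepted

print(err)
print(encoded)
```

So a plain `str` can never reach the column as a blob in Python 3 unless it is encoded first, which is what the `bytearray('key_a', encoding='utf-8')` attempt above does.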
Was it a mistake to use a `blob` as the primary key with the ByteOrderedPartitioner, or should I be passing in something other than a `bytearray` during the inserts?
Also, is there a reason `BytesToken` values are stored as strings? The values are validated against `six.string_types`, which in Python 2 accepts both encoded bytes (`str`) and decoded text (`unicode`). It seems like it would be easier to order those tokens if they were always raw bytes, but I'm probably missing something.
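To illustrate what I mean by ordering raw bytes, here is a minimal sketch in Python 3. It is not the driver's `TokenMap`. The ring contents and the `ring_position` helper are invented for the example, and real token handling surely involves more than this.

```python
from bisect import bisect_right

# Hypothetical ring of raw-bytes tokens, kept sorted in plain byte order.
ring = sorted([b'key_a', b'key_m', b'key_\x9a', b'key_z'])

def ring_position(ring, token):
    """Return the index just past the last ring entry that is <= token."""
    return bisect_right(ring, bytes(token))

# bytes objects compare byte by byte, so no codec is involved at any point.
print(ring)
print(ring_position(ring, b'key_b'))
print(ring_position(ring, bytearray(b'key_\x9a')))
```

Here the non-ASCII key simply sorts last, since 0x9a is greater than any ASCII byte, and the lookup never has to decode anything.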
I'm using Cassandra 2.0.7 with version 3.3.0 of the cassandra-driver.
Thanks,
Chris