Exclusive get count

Cato Yeung

Sep 18, 2012, 11:23:26 PM
to pycassa...@googlegroups.com
Dear Experts,

I know that with composite columns, pycassa can do an exclusive get like this:
result = cf.get('haha', column_start=('xyz','abc', ('123', False)))
How about normal (non-composite) columns?
I tried the following, but it does not work:
EventMessageByTime.count_one(event_id, column_start=(time_to_dt(timestamp),False))

Is it possible to do this at all? I think it should be.
If it is not possible, does that mean I need to use a composite comparator in my case?

Cheers,
Cato

Tyler Hobbs

Sep 19, 2012, 12:55:21 PM
to pycassa...@googlegroups.com
Unfortunately, due to a limitation of the Cassandra Thrift API, you cannot have exclusive slice ends without CompositeType comparators. However, for most types, it is easy to construct a slice end that is effectively exclusive. For example, with IntegerType, if you wanted to count everything where the column name was < 30, you could instead use <= 29, i.e., a slice end of 29. For some types, this is more difficult (or impossible) to construct, and in those cases you would have to use CompositeType with one component.
--
Tyler Hobbs
DataStax
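
As a rough illustration of the "effectively exclusive" idea in the reply above, here is a minimal sketch. The helper names are made up for this example and are not part of pycassa. The datetime helper assumes the column names are stored at millisecond resolution and that sub-millisecond precision is simply truncated on write, which may not match every pycassa version.

```python
import datetime

# Assumption: column names are stored with millisecond resolution.
ONE_MILLISECOND = datetime.timedelta(milliseconds=1)


def inclusive_end_before(exclusive_end):
    """For integer column names: '< n' is the same as '<= n - 1'."""
    return exclusive_end - 1


def truncate_to_millis(dt):
    """Drop the sub-millisecond part of a datetime (assumed storage behavior)."""
    return dt.replace(microsecond=(dt.microsecond // 1000) * 1000)


def inclusive_start_after(dt):
    """Smallest millisecond-resolution datetime strictly after dt's millisecond.

    Using this as an inclusive slice start behaves like an exclusive start at dt.
    """
    return truncate_to_millis(dt) + ONE_MILLISECOND


print(inclusive_end_before(30))  # 29

ts = datetime.datetime(2012, 9, 18, 23, 23, 26, 123456)
print(inclusive_start_after(ts))
```

The value returned by inclusive_start_after could then be passed as an ordinary (inclusive) column_start when getting or counting columns, provided the rounding assumption holds for your setup.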

Cato Yeung

Sep 23, 2012, 11:04:27 PM
to pycassa...@googlegroups.com
Tyler,
I found a bug when using DateType in Composite.
Here is how I create the column family:
create column family test2
with comparator = 'CompositeType(DateType)'
and key_validation_class = 'UTF8Type'
and default_validation_class = 'UTF8Type';


Here is my code:
    pool = Connection()
    cf = pycassa.ColumnFamily(pool, 'test2')
    attr = {(datetime.datetime.now()): "colval"}
    cf.insert('haha', attr)

Here is my error:
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1518, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1506, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1504, in wsgi_app
    response = self.full_dispatch_request()
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1264, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1262, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1248, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/zensis/inviteapp_python/runserver.py", line 413, in test
    cf.insert('haha', attr)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 950, in insert
    colval = self._pack_value(columns.values()[0], colname)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 453, in _pack_value
    packed_col_name = self._pack_name(col_name, False)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 415, in _pack_name
    return self._name_packer(value, slice_start)
  File "/Library/Python/2.7/site-packages/pycassa/marshal.py", line 91, in pack_composite
    last_index = len(items) - 1
TypeError: object of type 'datetime.datetime' has no len()

How can I solve this problem?
Cheers,
Cato

Tyler Hobbs

Sep 24, 2012, 12:23:05 PM
to pycassa...@googlegroups.com
To create a single-item tuple in Python, you have to write "(foo,)". Note the comma in there. Without it, Python treats the parentheses as plain expression grouping, so you're just inserting a datetime object, not a datetime inside of a tuple.
--
Tyler Hobbs
DataStax
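
The point about the trailing comma can be checked in plain Python without Cassandra. This is only a small sketch with a fixed example datetime; the dictionary names are illustrative.

```python
import datetime

now = datetime.datetime(2012, 9, 24, 12, 0, 0)

grouped = (now)   # parentheses only group the expression: still a datetime
single = (now,)   # the trailing comma makes a one-element tuple

print(type(grouped).__name__)  # datetime
print(type(single).__name__)   # tuple
print(len(single))             # 1

# len() on a bare datetime fails, which matches the TypeError in the traceback.
try:
    len(grouped)
    has_len = True
except TypeError:
    has_len = False
print(has_len)  # False

# The column dict from the question, before and after the fix.
bad_columns = {(now): "colval"}    # key is a datetime
good_columns = {(now,): "colval"}  # key is a 1-tuple holding the datetime
```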

Cato Yeung

Sep 24, 2012, 9:33:48 PM
to pycassa...@googlegroups.com
I also tried that, but it returns an error too:
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1518, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1506, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1504, in wsgi_app
    response = self.full_dispatch_request()
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1264, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1262, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1248, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/zensis/inviteapp_python/runserver.py", line 413, in test
    cf.insert('haha', attr)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 950, in insert
    colval = self._pack_value(columns.values()[0], colname)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 453, in _pack_value
    packed_col_name = self._pack_name(col_name, False)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 415, in _pack_name
    return self._name_packer(value, slice_start)
  File "/Library/Python/2.7/site-packages/pycassa/marshal.py", line 115, in pack_composite
    s += ''.join((len_packer(len(packed)), packed, eoc))
Cheers,
Cato

Tyler Hobbs

Sep 24, 2012, 9:51:46 PM
to pycassa...@googlegroups.com
What was the exception and exception message?
--
Tyler Hobbs
DataStax

Cato Yeung

Sep 25, 2012, 12:12:20 AM
to pycassa...@googlegroups.com
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1518, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1506, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1504, in wsgi_app
    response = self.full_dispatch_request()
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1264, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1262, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1248, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/zensis/inviteapp_python/runserver.py", line 413, in test
    cf.insert('haha', attr)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 950, in insert
    colval = self._pack_value(columns.values()[0], colname)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 453, in _pack_value
    packed_col_name = self._pack_name(col_name, False)
  File "/Library/Python/2.7/site-packages/pycassa/columnfamily.py", line 415, in _pack_name
    return self._name_packer(value, slice_start)
  File "/Library/Python/2.7/site-packages/pycassa/marshal.py", line 115, in pack_composite
    s += ''.join((len_packer(len(packed)), packed, eoc))
UnboundLocalError: local variable 'eoc' referenced before assignment

Sorry. I forgot to include the last line.
Cheers,
Cato

Tyler Hobbs

Sep 25, 2012, 12:39:20 AM
to pycassa...@googlegroups.com
Looks like it was a bug related to inserting single-component composites, but it's fixed here: https://github.com/pycassa/pycassa/commit/7e63cd6bf94016d397084c1a7593314e0231d2b8

You can either use the latest master branch or apply the first half of that patch to your local copy.

Thanks for reporting this!
--
Tyler Hobbs
DataStax

Cato Yeung

Sep 25, 2012, 3:39:07 AM
to pycassa...@googlegroups.com
The fix works great. Thanks for your fast response.

Cato Yeung

Sep 26, 2012, 11:53:54 PM
to pycassa...@googlegroups.com
I found that when I do an exclusive get on a DateType composite column, the result is not correct.
I have made a simplified version of what I want to do here:
import pycassa
import datetime
        
if __name__ == '__main__':
    # initialize pool and column family
    pool = pycassa.ConnectionPool('InviteAppRevamp2', server_list=['192.168.1.41:9160'], pool_size=1)
    cf = pycassa.ColumnFamily(pool, 'test2')
    
    # get current datetime
    dt = datetime.datetime.now()
    print 'current datetime: '+str(dt)
    
    # insert into cassandra database
    attr = {(dt,): "dummy_value"}
    cf.insert('dummy_key', attr)
    
    # get entry from cassandra database
    result = cf.get('dummy_key')
    
    # get entry from cassandra database, exclusively
    key, value = result.popitem()
    print 'datetime in cassandra: '+str(key[0])
    result2 = cf.get('dummy_key', column_start=((key[0], False)))
    print '----------------------------------------------------------------'
    print 'I still can get the entry when I specify getting it exclusively:'
    print result2
    
    # get entry from cassandra database, inclusively
    result3 = cf.get('dummy_key', column_start=((key[0], True)))
    print '----------------------------------------------------------------'
    print 'I still can get the entry when I specify getting it inclusively:'
    print result3
        
    # delete entry
    cf.remove('dummy_key')

or you can get by gist:

What I want is to be able to do both get and get count exclusively.
Cheers,
Cato

Cato Yeung

Sep 26, 2012, 11:55:31 PM
to pycassa...@googlegroups.com
Here is how I created my column family:
create column family test2
with comparator = 'CompositeType(DateType)'
and key_validation_class = 'UTF8Type'
and default_validation_class = 'UTF8Type';


Cheers,
Cato

Tyler Hobbs

Sep 27, 2012, 1:56:55 PM
to pycassa...@googlegroups.com
I haven't had time to test this, but it looks like you have another problem with single-element tuples.

Your version: result2 = cf.get('dummy_key', column_start=((key[0], False)))

The correct version:  result2 = cf.get('dummy_key', column_start=((key[0], False),))
--
Tyler Hobbs
DataStax
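
To make the nesting in the corrected call easier to get right, one option is a small helper that builds the slice bound. This is a sketch only: exclusive_bound is a hypothetical name, not a pycassa function, and it simply follows the tuple shape used in this thread, where the last component is wrapped as a (value, False) pair.

```python
import datetime


def exclusive_bound(*components):
    """Build a composite slice bound whose last component is marked exclusive.

    Hypothetical helper: returns a tuple with one entry per component, with
    the final entry wrapped as (value, False).
    """
    if not components:
        raise ValueError("at least one component is required")
    return tuple(components[:-1]) + ((components[-1], False),)


dt = datetime.datetime(2012, 9, 26, 23, 53, 54)

# The extra parentheses add nothing: this is just the 2-tuple (dt, False).
mistaken = ((dt, False))
print(len(mistaken))  # 2

# One component, which is itself a (value, inclusive) pair.
bound = exclusive_bound(dt)
print(len(bound))  # 1

# Same shape as the multi-component example from the first post.
print(exclusive_bound('xyz', 'abc', '123') == ('xyz', 'abc', ('123', False)))  # True
```

With such a helper, the corrected call would read cf.get('dummy_key', column_start=exclusive_bound(key[0])).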
