Hi Samuel,
First off, thanks for your interest in adding some useful features! There are a few things worth mentioning:
The first is that I'm currently laying the groundwork for a new Python driver for Cassandra that's similar to the new Java driver (https://github.com/datastax/java-driver). It uses the new native protocol instead of Thrift and only presents the cql3 API; additionally, it will have some good improvements in connection pooling, request pipelining, node discovery, and failure/retry policies over what pycassa offers. I'm actually planning to include a query-builder interface that is similar to sqlalchemy's, just like you mention, so I'm glad to see there is interest in having that. I expect to have a beta-quality version of the new driver available on GitHub in roughly 1.5 to 2 months. (I'm not working on it full-time at the moment.)
The second is that I am definitely interested in getting cql3 support into pycassa regardless of developments on the new driver, primarily because it provides a nice upgrade path for those who are currently using pycassa but are looking to use cql3. At a minimum, proper cql3 support includes:
* Being able to execute queries through a ConnectionPool with proper retries, etc.
* Encoding and decoding cql3 types
The first point shouldn't be very difficult at all. It primarily depends on deciding what the API looks like (i.e. whether it's a method on the ConnectionPool, on a ColumnFamily, or comes from a different module); I don't necessarily have a strong preference there yet.
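To make the "method on the ConnectionPool" option a bit more concrete, here is a rough sketch of how retries could wrap a cql3 call. Everything in it is made up for illustration: SketchPool, execute_cql3, and NodeUnavailable are hypothetical names, connections are modeled as plain callables, and none of this is pycassa's actual API.

```python
class NodeUnavailable(Exception):
    """Hypothetical error raised when a server cannot be reached."""


class SketchPool:
    """Toy stand-in for a ConnectionPool; not pycassa's real class."""

    def __init__(self, connections, max_retries=3):
        # Each "connection" is just a callable taking (query, params).
        self.connections = list(connections)
        self.max_retries = max_retries

    def execute_cql3(self, query, params=None):
        """Run the query, moving on to the next connection after a failure."""
        last_error = None
        for attempt in range(self.max_retries + 1):
            conn = self.connections[attempt % len(self.connections)]
            try:
                return conn(query, params or {})
            except NodeUnavailable as exc:
                last_error = exc
        # Every attempt failed; surface the most recent error.
        raise last_error


def down(query, params):
    raise NodeUnavailable("node down")


def up(query, params):
    return [("ok", query, params)]


pool = SketchPool([down, up])
rows = pool.execute_cql3("SELECT * FROM users WHERE id = :id", {"id": 1})
```

The same loop would work just as well behind a method on a ColumnFamily or a function in a separate module; the only thing the sketch commits to is that the retry logic lives next to the pool.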
The second point can also be pretty easily accomplished by copying code from the dbapi2 driver.
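For a sense of scale, the type handling amounts to something like the sketch below. It is not the dbapi2 driver's code: encode_cql3, decode_cql3, and the _CODECS table are hypothetical names, only a handful of types are covered, and it assumes the usual big-endian wire encodings (4-byte int, 8-byte bigint and double, single-byte boolean, UTF-8 text).

```python
import struct

# (encoder, decoder) pairs keyed by cql3 type name; only a few types shown.
_CODECS = {
    "int": (lambda v: struct.pack(">i", v), lambda b: struct.unpack(">i", b)[0]),
    "bigint": (lambda v: struct.pack(">q", v), lambda b: struct.unpack(">q", b)[0]),
    "double": (lambda v: struct.pack(">d", v), lambda b: struct.unpack(">d", b)[0]),
    "boolean": (lambda v: b"\x01" if v else b"\x00", lambda b: b != b"\x00"),
    "text": (lambda v: v.encode("utf-8"), lambda b: b.decode("utf-8")),
}


def encode_cql3(type_name, value):
    """Turn a Python value into bytes for the given cql3 type."""
    return _CODECS[type_name][0](value)


def decode_cql3(type_name, raw):
    """Turn bytes returned by the server back into a Python value."""
    return _CODECS[type_name][1](raw)


packed = encode_cql3("int", 1)
unpacked = decode_cql3("int", packed)
```

A real implementation would also need the collection types, uuid, timestamp, decimal, and so on, which is where reusing existing code saves the most effort.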
I would be quite happy to have just those two things in pycassa and will gladly accept pull requests. Anything extra is a bonus.
Regarding a mapper and query-builder interface, I feel like extending the ColumnFamily class is the wrong way to go simply because the two APIs are essentially mutually incompatible. I haven't thought too deeply about the topic, but my gut says a combination of a subset of the sqlalchemy core expression api (http://docs.sqlalchemy.org/en/rel_0_8/core/expression_api.html) and declarative mappings (similar to http://docs.sqlalchemy.org/en/rel_0_8/orm/tutorial.html#declare-a-mapping) where execute() and query() are methods on a ConnectionPool would be a good start.
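Very roughly, and only as a starting point for discussion, the combination might look something like this. Column, Select, and User are hypothetical names invented for the sketch; it does not use sqlalchemy or pycassa, and it only handles equality conditions on a single table.

```python
class Column:
    """Hypothetical declarative column; comparing it builds a WHERE condition."""

    def __init__(self, cql_type, primary_key=False):
        self.cql_type = cql_type
        self.primary_key = primary_key
        self.name = None  # filled in by __set_name__

    def __set_name__(self, owner, name):
        self.name = name

    def __eq__(self, value):
        # Return a (clause, parameter) pair instead of a boolean.
        return (self.name + " = ?", value)


class Select:
    """Hypothetical expression object that compiles to a CQL string."""

    def __init__(self, model):
        self.table = model.__table__
        self.clauses = []
        self.params = []

    def where(self, condition):
        clause, value = condition
        self.clauses.append(clause)
        self.params.append(value)
        return self

    def compile(self):
        cql = "SELECT * FROM " + self.table
        if self.clauses:
            cql += " WHERE " + " AND ".join(self.clauses)
        return cql, tuple(self.params)


class User:
    """Declarative mapping of a 'users' table."""

    __table__ = "users"
    id = Column("uuid", primary_key=True)
    name = Column("text")


cql, params = Select(User).where(User.name == "samuel").compile()
```

The compiled (cql, params) pair is what an execute() method on the ConnectionPool would receive, which keeps the builder independent of how connections and retries are handled.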
I'm definitely interested in discussing the API further. Any work done for this in pycassa will likely be closely translated to the new python driver (and perhaps vice-versa).