|Hybrid schema & schemaless tables||Ian Campbell||9/16/12 6:22 AM|
I came across FluentCassandra today, and am very interested to know if the following can be done via API.
I have a use case where I'm storing product definitions. Speaking in OO terms, some fields are common to all product types, and others are dynamic. Speaking in Cassandra terms, short of creating a column family (or table in CQL terms) for each product type, can I create and query a set schema that is extended with dynamic columns on a row-to-row basis (in my case depending on product type)?
Something like this (I know it's not possible in CQL, but maybe via API):
CREATE TABLE Products (
    rowID int,
    productTypeID int,
    price float,
    mass float,
    manufacturerID varint,
    data varchar,
    PRIMARY KEY (rowID)
);
... and then also be able to define additional columns per each productTypeID.
A SELECT could look something like this (again, not possible in CQL, but perhaps via API):
SELECT data FROM Products WHERE rowID IN (?, ?) AND productTypeID = ? AND price >= 100 AND price <= 200 AND mass <= 5 AND grade = 5;
Is this possible? If not, what is recommended to achieve this functionality in the most elegant manner possible?
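To pin down the semantics being asked for, here is a minimal sketch in plain Python. It is only an in-memory model with client-side filtering, not Cassandra or FluentCassandra code; the `PRODUCTS` data and the `select_data` helper are hypothetical names made up for illustration. Each row carries the common columns plus whatever extra columns its product type needs (here `grade` or `length`).

```python
# Hypothetical in-memory model of "common + dynamic" columns.
# This is NOT Cassandra or FluentCassandra code; it only illustrates
# what the SELECT above is meant to return.

PRODUCTS = {
    # rowID -> columns; "grade" and "length" are per-type dynamic columns
    1: {"productTypeID": 7, "price": 150.0, "mass": 4.2,
        "manufacturerID": 42, "data": "widget", "grade": 5},
    2: {"productTypeID": 7, "price": 250.0, "mass": 3.0,
        "manufacturerID": 42, "data": "gadget", "grade": 5},
    3: {"productTypeID": 9, "price": 120.0, "mass": 1.0,
        "manufacturerID": 17, "data": "cable", "length": 2.5},
}


def select_data(rows, row_ids, product_type_id,
                min_price, max_price, max_mass, **dynamic_filters):
    """Return the 'data' column of every row matching all predicates."""
    results = []
    for row_id in row_ids:
        row = rows.get(row_id)
        if row is None:
            continue  # unknown key
        if row.get("productTypeID") != product_type_id:
            continue
        if not (min_price <= row["price"] <= max_price):
            continue
        if row["mass"] > max_mass:
            continue
        # A dynamic column that is absent from the row never matches.
        if any(row.get(col) != val for col, val in dynamic_filters.items()):
            continue
        results.append(row["data"])
    return results


matches = select_data(PRODUCTS, [1, 2], 7, 100, 200, 5, grade=5)
```

Whether the server can evaluate such predicates itself, rather than the client, is exactly the open question.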
|Re: [FluentCassandra:115] Hybrid schema & schemaless tables||Nick Berardi||9/16/12 6:30 AM|
I encourage you to bone up a little more on Cassandra and how it differs from SQL. What you are asking for can be done, but perhaps not in the manner you are used to.
In your case the rowID is called a KEY in Cassandra and you can ask for KEY Ranges in Cassandra. You can even ask for specific keys like you are doing in the SQL statement. But every part of the query currently requires an index to be queried upon. So if you add a new column value that you need to query against, then you will also have to create an index for it.
I encourage you to give it a try.
|Re: [FluentCassandra:115] Hybrid schema & schemaless tables||Ian Campbell||9/16/12 6:50 AM|
Thanks for your reply. Maybe I should reword my question a bit.
I know that Cassandra uses composite columns under the hood, and that data is stored in wide rows, with the partition key being the row key. A CQL composite primary key creates composite columns. Depending on who answers the question (and this is the confusing part), those columns do or don't have to be part of the PK in order to be queryable (I suspect they DO). I also know that CQL re-represents Cassandra to look a little like a SQL-based DB.
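A rough way to picture this is the sketch below. It is a simplified in-memory model in Python, not Cassandra's storage engine and not the FluentCassandra API; `WideRow`, `price_slice`, and the component order (productTypeID, price, rowID) are all assumptions chosen for illustration. It relies on one fact: columns within a row are kept sorted by name, so a slice is a contiguous run of columns. With composite names, that means equality on the leading components and a range on the next one.

```python
from bisect import bisect_left, bisect_right


class WideRow:
    """Toy wide row: composite column names (tuples) kept in sorted order."""

    def __init__(self):
        self._names = []    # sorted list of composite names
        self._values = {}   # composite name -> column value

    def insert(self, name, value):
        if name not in self._values:
            self._names.insert(bisect_left(self._names, name), name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return all columns whose names fall in [start, finish]."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


def price_slice(row, product_type_id, min_price, max_price):
    """Equality on the first component, range on the second."""
    # A shorter tuple sorts before any longer tuple sharing its prefix,
    # so (type, min_price) is a valid inclusive lower bound.
    start = (product_type_id, min_price)
    # float("inf") as the last component makes the upper bound inclusive
    # for every rowID at max_price.
    finish = (product_type_id, max_price, float("inf"))
    return row.slice(start, finish)


row = WideRow()
row.insert((7, 150.0, 1), "widget")
row.insert((7, 250.0, 2), "gadget")
row.insert((9, 120.0, 3), "cable")
row.insert((7, 100.0, 4), "bolt")

in_range = price_slice(row, 7, 100.0, 200.0)
```

In this model a range on price works only because price is part of the composite name and the component before it is fixed; a value outside the name (like data here) cannot be sliced this way.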
Since I'm not running queries on the actual data (i.e. the final field that is not part of the composite columns), I don't need secondary indexes.
The online references for the sort of query I want are all in Hector, which doesn't help me much. Is there something you can point me to that will show me how this could be done using your C# driver?
|Re: [FluentCassandra:117] Hybrid schema & schemaless tables||Nick Berardi||9/16/12 8:46 AM|
Documentation is where I am lacking right now. A good resource I point people to is the Sandbox application that I include.
This program shows many of the different ways to execute a statement either through the RPC methods in Cassandra or through CQL.
Hope this helps. If you have a specific Hector statement you are looking at, please send it along and I will try to translate it.
To view this discussion on the web visit https://groups.google.com/d/msg/fluentcassandra/-/sDygFVCS9xYJ.
|Re: [FluentCassandra:117] Hybrid schema & schemaless tables||Ian Campbell||9/16/12 10:55 AM|
Thanks Nick. I looked through the example. I think this boils down to the one question that every time I ask, I get opposing answers:
Can Cassandra (let's talk API) allow me to have x static + y dynamic columns that I can filter in a query without having to add indexes?
If the answer is yes, I can proceed using FluentCassandra. If not, I need to re-think how to use Cassandra.
|Re: [FluentCassandra:119] Hybrid schema & schemaless tables||Nick Berardi||9/16/12 11:16 AM|
Never tried it. But it is easy to test using the sandbox project I have created. My guess is yes.
|Re: [FluentCassandra:119] Hybrid schema & schemaless tables||Ian Campbell||9/26/12 1:17 PM|
Nick, I posted a question on SO re. range queries (which is part of this question thread). Would you mind taking a look please?
|Re: [FluentCassandra:133] Hybrid schema & schemaless tables||Nick Berardi||9/26/12 1:21 PM|
I don't know the answer to your question. The biggest question is why so many primary keys? Primary Keys don't work the same way in Cassandra as they do in relational databases.
|Re: [FluentCassandra:133] Hybrid schema & schemaless tables||Ian Campbell||9/26/12 1:41 PM|
OK. Only because the documentation in one place states that range queries can only be done on primary key composites.
As you can see, I tried it both ways.
I know PKs work differently, but they essentially deliver the same thing.
|Re: [FluentCassandra:133] Hybrid schema & schemaless tables||Ian Campbell||9/26/12 1:48 PM|
I'd like to compliment you on your driver. It's clear you've put a lot of work into it, and it looks and works professionally. Well done.
|Re: [FluentCassandra:136] Hybrid schema & schemaless tables||Nick Berardi||9/26/12 1:50 PM|
Thank you. It has been a work in progress for a couple of years.