2014/01/14 00:20:25 ERROR lmdb txn put error: MDB_BAD_VALSIZE: Too big key/data, key is empty, or wrong DUPFIXED size
This was a very recent change we made. Previously, Sky used one long byte array to store all the events for a single object, but this made appending new events slower and slower. We switched to LMDB's DUPSORT feature to get O(log n) insertion time, but one downside is that LMDB limits DUPSORT values to the max key size (511 bytes by default). You can change that when you compile LMDB, but I think the upper bound is still ~2048 bytes.
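To spot the rows that will trip this limit before loading them, a rough pre-check along these lines might help. This is only a sketch: `estimate_event_size` and `oversized_events` are hypothetical helpers, events are assumed to be plain dicts, and the estimate counts only the UTF-8 bytes of string values, so it is a lower bound rather than the size Sky actually writes to LMDB.

```python
# Default LMDB max key size; with DUPSORT it also caps each stored value.
MAX_VALUE_SIZE = 511


def estimate_event_size(event):
    """Lower-bound size estimate: UTF-8 bytes of the string values only.

    Hypothetical helper. Sky's real event encoding adds its own overhead,
    which is not modeled here.
    """
    return sum(
        len(value.encode("utf-8"))
        for value in event.values()
        if isinstance(value, str)
    )


def oversized_events(events, limit=MAX_VALUE_SIZE):
    """Return the events whose estimated size already exceeds the limit."""
    return [event for event in events if estimate_event_size(event) > limit]
```

Anything this flags is certain to be too big; events just under the limit may still fail once the real encoding overhead is added.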
As small as this limit sounds, it's usually not a problem if you're using Factor types (instead of Strings). Factors get stored in the event as only a couple of bytes, so you need quite a few to fill up 511 bytes. I'm thinking about just removing support for string types entirely, since they're horribly inefficient to query against and Sky is really meant for fields that have low cardinality, so factors are always a better choice.
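The general idea behind a factor can be illustrated like this. It is not Sky's implementation, just a minimal sketch of mapping each distinct string to a small integer id so the event only has to carry the id; `FactorTable` is a made-up name for the example.

```python
class FactorTable:
    """Maps each distinct string to a small integer id and back.

    Illustrative only; this is an assumption about the general technique,
    not how Sky stores factors internally.
    """

    def __init__(self):
        self._ids = {}      # string -> id
        self._values = []   # position (id - 1) -> string

    def encode(self, value):
        """Return the id for value, assigning the next free id if it is new."""
        if value not in self._ids:
            self._values.append(value)
            self._ids[value] = len(self._values)  # ids start at 1
        return self._ids[value]

    def decode(self, factor_id):
        """Return the original string for a previously assigned id."""
        return self._values[factor_id - 1]
```

With a table like this, a repeated value such as a campaign id costs its full string length once in the lookup table and only a small integer in each event.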
On January 14, 2014 at 1:31:33 AM, harry...@gmail.com (harry...@gmail.com) wrote:
Hello,
Among the test data I am trying to load, some of the rows (events) fail with the following error
That said, if you are removing support for strings altogether, that wouldn't work for me at all. I plan to store URL fragments with query arguments, and those are not low cardinality. Even if I parse out the arguments, I might have a campaign-id, catalog-id, or sku-id with cardinality in the millions.
High cardinality isn't necessarily the problem. Sky can still handle high cardinality; what I meant to say is that Sky isn't meant for dumping a bunch of string data that needs to be parsed later. If you're using string data, then Sky works well if you're grouping by it or doing an equality filter (== or !=). There's no support for anything like regex or LIKE clauses. Sorry for the confusion.
On January 14, 2014 at 12:11:03 PM, harry...@gmail.com (harry...@gmail.com) wrote:
Ben,
Thank you for the quick reply. I will look at my data and see what I can do to make it fit within 511 bytes.