Queries - cutting by timeframe/date

james.butler

Apr 5, 2013, 10:00:19 AM
to sk...@googlegroups.com
What is the sanest way to only find data/events for a given timespan?
At the moment I'm just storing a Unix timestamp as an event property and then adding subconditions that timestamp > X and timestamp < Y.
This seems really clunky to me, and like I'm missing a trick. It also means running multiple queries to retrieve separate days, rather than being able to pull out 7 consecutive days from one run through the table.

Am I being daft?

Ben Johnson

Apr 5, 2013, 1:01:42 PM
to sk...@googlegroups.com
No, you're not being daft. :)

I made some changes to the 'unstable' branch to fix your issue:

System property support (936ac2640)
The 'timestamp' property is now available to the query engine. You can use it like a normal property except you can't delete it.

Additional conditional operators (d92bf941b)
I was only supporting "==". I forgot to add the other ones in. You can now do ">", ">=", "<", "<=" and "!=" as well.
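
To make the timespan case concrete, here is a minimal sketch in Python (not Sky's own code or query syntax) of how the bounds X and Y for 7 consecutive UTC days could be computed, and how a single pass could bucket each event into its day. The helper names `day_range` and `day_index` are hypothetical, and the range is assumed to be half-open (lower <= timestamp < upper).

```python
from datetime import datetime, timezone
from typing import Tuple

SECONDS_PER_DAY = 86400


def day_range(start_day: datetime, days: int) -> Tuple[int, int]:
    """Return (lower, upper) Unix timestamps covering `days` consecutive UTC days.

    The range is half-open: lower <= timestamp < upper.
    """
    # Treat the given date as midnight UTC, ignoring any time-of-day component.
    start = start_day.replace(
        hour=0, minute=0, second=0, microsecond=0, tzinfo=timezone.utc
    )
    lower = int(start.timestamp())
    return lower, lower + days * SECONDS_PER_DAY


def day_index(timestamp: int, lower: int) -> int:
    """Bucket an event timestamp into a zero-based day offset from `lower`."""
    return (timestamp - lower) // SECONDS_PER_DAY


lower, upper = day_range(datetime(2013, 4, 1), 7)
```

With these bounds, one condition pair (timestamp >= lower and timestamp < upper) selects the whole week, and `day_index` splits the matching events into days without a separate query per day.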

Simple multiple conditions (b4d6ecf48)
Conditions previously supported only a single comparison (e.g. "age == 12"). Now you can string multiple AND conditions together (e.g. "age == 12 && gender == 'male'"). It doesn't do OR and it doesn't do nesting with parentheses. For now it does a regex match to validate the expression and convert it to Lua. I need to integrate a PEG parser for conditions before I can support anything more complex.
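
As a rough illustration of the regex approach described above, the sketch below validates a chain of AND conditions and rewrites it as a Lua expression. It is written in Python and is not Sky's implementation: the function name `to_lua`, the accepted value types (integers and single-quoted strings), and the exact pattern are all assumptions. The only Lua facts it relies on are that `and` is the logical AND operator and `~=` is the inequality operator.

```python
import re

# One comparison: identifier, operator, then an integer or a single-quoted string.
# (Assumed grammar for illustration; Sky's real pattern may differ.)
_CONDITION = re.compile(
    r"\s*([A-Za-z_]\w*)\s*(==|!=|>=|<=|>|<)\s*(-?\d+|'[^']*')\s*",
    re.ASCII,
)


def to_lua(expression: str) -> str:
    """Validate an '&&'-joined condition string and convert it to Lua.

    Raises ValueError for anything else, including OR and parentheses.
    """
    parts = []
    pos = 0
    while True:
        match = _CONDITION.match(expression, pos)
        if match is None:
            raise ValueError(f"invalid condition at offset {pos}: {expression!r}")
        name, op, value = match.groups()
        # Lua spells "not equal" as "~="; the other operators are unchanged.
        parts.append(f"{name} {'~=' if op == '!=' else op} {value}")
        pos = match.end()
        if pos == len(expression):
            return " and ".join(parts)
        if not expression.startswith("&&", pos):
            raise ValueError(f"expected '&&' at offset {pos}: {expression!r}")
        pos += 2
```

For example, `to_lua("timestamp >= 1364774400 && timestamp < 1365379200")` yields a Lua expression joining the two comparisons with `and`, while an expression containing `||` or parentheses is rejected, which is where a real parser would be needed.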

Let me know if you run into any other issues! Thanks, James.



Ben

James Butler

Apr 5, 2013, 1:12:23 PM
to sk...@googlegroups.com
Excellent stuff, I'll try and have a play with this today/tomorrow.

With regard to system properties: apart from timestamp, object name and maybe table name (that seems dim in my head though), what other potential ones could there be? Do the system names warrant some kind of namespacing to make it clear they aren't user-supplied?

Just throwing ideas out here... 


Ben Johnson

Apr 5, 2013, 1:28:47 PM
to sk...@googlegroups.com
Currently the timestamp is an unsigned 32-bit integer representing the seconds since the Unix epoch (UTC). The only other system property besides timestamp & object id would be fractional seconds on the timestamp. LuaJIT doesn't support 64-bit integers right now (although I think it's coming) so I'll push the fractional microseconds into something like 'utimestamp' or 'timestamp_us'.

Sky stores the whole second and fractional second in separate masks on the same 64-bit int internally so it's easy (and fast) to extract those out separately. Also, it'll need to be microseconds instead of milliseconds because that's how Sky stores it and I don't want to incur a division operation on every event to convert it.
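
To show why masking avoids a division, here is a small Python sketch of packing whole seconds and microseconds into one 64-bit integer. The bit layout is purely an assumption for illustration (seconds in the high bits, microseconds in the low 20 bits); the thread does not say which bits Sky actually uses, and the names `pack` and `unpack` are hypothetical.

```python
# Assumed layout: low 20 bits hold microseconds, the bits above hold seconds.
# 2**20 = 1,048,576, which is enough to hold 0..999,999 microseconds.
MICROS_BITS = 20
MICROS_MASK = (1 << MICROS_BITS) - 1


def pack(seconds: int, micros: int) -> int:
    """Combine whole seconds and fractional microseconds into one integer."""
    if not 0 <= seconds < (1 << 32):
        raise ValueError("seconds must fit in an unsigned 32-bit integer")
    if not 0 <= micros < 1_000_000:
        raise ValueError("micros must be in the range 0..999999")
    return (seconds << MICROS_BITS) | micros


def unpack(packed: int) -> tuple:
    """Split a packed value back into (seconds, micros) with a shift and a mask."""
    return packed >> MICROS_BITS, packed & MICROS_MASK


packed = pack(1365181727, 250000)
seconds, micros = unpack(packed)
```

Both halves come out with one shift or one mask each. Converting the fractional part to milliseconds would add a division per event, which is the cost mentioned above.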

I'm not too worried about namespacing right now. The nice thing about the current setup is that Sky basically sticks a property in the property file with an id of zero and the rest of the server doesn't really have to know it's a system property. It just treats it like any other property.

I think I'll need to add a 'date' data type soon. It'll just be stored as an integer, but it'll give me the metadata I need to perform cohort analysis and some other nifty stuff.


Ben
