Querying date strings

530 views
Skip to first unread message

Julian May

unread,
Dec 14, 2015, 9:56:02 AM12/14/15
to ArangoDB
Hey

Just wondering if some could explain to my why:


return DATE_ISO8601("2015-11-25") == "2015-11-25" //Is false??
return DATE_ISO8601("2015-11-25") > "2015-11-25" //Is true??

I will be refactoring to use a long integer value instead for performance, but while red/green/refactor'ing, it's nice to have a concise model we can easier discuss and verify with our domain expert.
I was having problems understanding why i could never get the date-filters correct, and it came down to this. Understanding why, will help me help the next developer falling into this trap :)

Thanks!


Julian May

unread,
Dec 14, 2015, 10:07:01 AM12/14/15
to ArangoDB
The reason I'm comparing with DATE_ISO8601("2015-11-25") is because it's the value that "2015-11-25" is interpreted as when comparing. We we're baffled when we first saw this:

Documents in "RoomAvailabilityOnDate" collection:
{
   "roomNumber":"102",
   "available": true,
   "date": "2015-11-25"
},
{
   "roomNumber":"103",
   "available": true,
   "date": "2015-11-26"
},


Query for availablity in period (actually just for the 25th of november):

FOR r IN RoomAvailabilityOnDate
FILTER r.date >= "2015-11-25" && r.date < "2015-11-26"
RETURN { r.roomNumber }


....Results in ['103'], and not ['102']

Jan Steemann

unread,
Dec 14, 2015, 11:18:06 AM12/14/15
to aran...@googlegroups.com
Hi,


> return DATE_ISO8601("2015-11-25") == "2015-11-25" //Is false??

This is because DATE_ISO8601("2015-11-25") actually returns "2015-11-25T00:00:00.000Z", which is unequal to the compared string "2015-11-25".
You can check this by running the query `return DATE_ISO8601("2015-11-25")`.


> return DATE_ISO8601("2015-11-25") > "2015-11-25" //Is true??

This is for the same reason: "2015-11-25T00:00:00.000Z" is "greater" than the string "2015-11-25".

Note that as DATE_ISO8601 returns a string and the other operand is also a string, a string comparison will be done here.


Finally, when looking at these two documents:

  { "roomNumber":"102", "available": true, "date": "2015-11-25" }
  { "roomNumber":"103", "available": true, "date": "2015-11-26" }

Running the query on the above data results in (only) the document with room number 102 here, which is the expected result:


  FOR r IN RoomAvailabilityOnDate
  FILTER r.date >= "2015-11-25" && r.date < "2015-11-26"
  RETURN { value: r.roomNumber }

Note that I had to adjust the query slightly as `RETURN { r.roomNumber }` was syntactically invalid. It parses when `RETURN { value: r.roomNumber }` is used, or simply `RETURN r.roomNumber`.
But in both cases only 102 should be and is returned whenever I try.

If you are sure your query returns an invalid value, can you please supply the actual query (the above one seemed slightly invalid) plus the environment it ran on (I tried current devel, 2.8, 2.7), plus any index definitions for the collection?

Thanks and best regards
Jan

--
You received this message because you are subscribed to the Google Groups "ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to arangodb+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Julian May

unread,
Dec 14, 2015, 11:54:51 AM12/14/15
to ArangoDB
Hey Jan

Yea, these queries where just typed in as a pseudo example.
I'll rectreate actual scenarios tomorrow morning and get back to you with something testable.
Thanks!

/Julian

Jan Steemann

unread,
Dec 14, 2015, 11:55:47 AM12/14/15
to aran...@googlegroups.com
Thanks!

CoDEmanX

unread,
Dec 14, 2015, 9:06:44 PM12/14/15
to ArangoDB
Hey Julian,

I reworked the documentation regarding AQL date functions not too long ago :

Should it explain the sort properties of ISO dates even more?

Also have a look at DATE_COMPARE, you can use it like this:

DATE_COMPARE(DATE_NOW(), "2015-11-25", "years", "days")

It essentially takes the first 10 characters of the first ISO string and compares it to the second date (also 10 chars).

Julian May

unread,
Dec 15, 2015, 3:37:52 AM12/15/15
to ArangoDB
Hey Jan

The query looks like this:

for `r` in `RoomAvailabilityOnDate`  
filter   `r`.`Date`  >=  "2015-11-25T00:00:00.0000000" 
and    `r`.`Date`  <  "2015-11-26T00:00:00.0000000" 
return r

generated from this .net code
var start = new DateTime(2015,11,25);
var end = new DateTime(2015,11,26);

var result = _client.Query<RoomAvailabilityOnDate>().Where(x=>x.Date >= start && x.Date < end).ToArray();


I did not realize (until you explained it in the last entry) that these are just string comparisons... Maybe that would be a good sidenote to the following text in https://docs.arangodb.com/Aql/DateFunctions.html :
"You can directly compare ISO date strings if you want to find dates before or after a certain date, or in between two dates (>=><<=). No special date function is required"

Well, the mystery is solved, and I'm off to start comparing on dates as number values instead of strings :)
Thanks!

/Julian



Den mandag den 14. december 2015 kl. 17.18.06 UTC+1 skrev Jan Steemann:

Wilfried Gösgens

unread,
Dec 15, 2015, 4:10:53 AM12/15/15
to ArangoDB
Hi,
the DATE_COMPARE section of https://docs.arangodb.com/Aql/DateFunctions.html
already has a similar Quote:

  You can directly compare ISO date strings if you want to find dates before or after a certain date, or in between two dates (>=, >, <, <=). No special date function is required. Equality tests (== and !=) will only match the exact same date and time however. You may use SUBSTRING() to compare partial date strings, DATE_COMPARE() is basically a convenience function for that. However, neither is really required to limit a search to a certain day for instance:

However, the document structure with the functions being a list is a bit bulky, maybe inserting some more headlines (which then also would provide HTML-anchors)  would be a more reader friendly representation.

Cheers,
Willi

CoDEmanX

unread,
Dec 15, 2015, 5:51:02 AM12/15/15
to ArangoDB
Yeah, I'll rework that, headlines never hurt ;)

I think we should also add a sentence regarding index utilization. Julian's approach "date >= today && date < tomorrow" can make use of a skiplist index, whereas DATE_COMPARE() or other SUBSTRING()-style queries can not.

Would be cool if one could define custom indexes, actually. One could compress an ISO date to something like 20151215 and use a hash index for super fast exact date matches. On the other hand, one could just add another field in this format and derive it from existing ISO dates and use a regular hash index for exact matches, and a skiplist index for range matches. Difference: the former approach wouldn't require an extra field and it's can't get out of sync with the actual date field. Database-level triggers would be a middle way.

Jan Steemann

unread,
Dec 15, 2015, 6:06:35 AM12/15/15
to aran...@googlegroups.com
If dates are stored as date strings without time and fractional components (e.g. "2015-12-15", "2016-01-01"), then comparing two values is also easy and fast.
I would assume that for many applications it's not necessary to have full time and millisecond precision for date values, at least for applications such as reservation apps that reserve resources on a per-day basis. For anything that needs hourly precision or more, then of course that won't work.

Best regards
Jan


--

CoDEmanX

unread,
Dec 15, 2015, 7:58:51 AM12/15/15
to ArangoDB
Of course. One should take a few things into account however:
If you do reservations, you might need to know the hour (e.g. guest cancels reservation and needs to pay a fee if <24h, or police needs to check an alibi...).
On the technical side, you might want to know the exact time to detect accidental double reservations.
Some coders might have written a timestamp or full ISO date to the database
(possibly because it was easier to do than just the date part) and your current data has already millisecond precision.

My point: there are probably reasons to store a timestamp or full ISO date string, but you might only need the date part for certain queries.
Managing redundant date fields on application level can be problematic however.

Another interesting topic are time zones. ISO dates encode timezone offsets, but only a date + timezone offset, without the time part, may not be valid
(it's not in Firefox's JavaScript implementation for instance, there must be at least an hour).
One should probably use UTC server time and omit +00:00 / Z in the date string to avoid this altogether.

Jan Steemann

unread,
Dec 15, 2015, 8:17:18 AM12/15/15
to aran...@googlegroups.com
I agree: it depends on the application whether you can get away with storing just the date part.
If you can, it makes implementation a lot simpler, and that's why I wanted to point it out.
If not, then there'll be date-fiddling required, either with strings or timestamps, and even more when using (multiple) timezones, DST etc.
And there it makes sense as you wrote to use UTC server times for everything and try to get rid of at least timezone issues.

Best regards
Jan

--

Julian May

unread,
Dec 29, 2015, 3:00:32 AM12/29/15
to ArangoDB
Hey

Just a followup. "Time of day" is extremely relevant for hotels business logic, but only as a local concept - "14:00" is the same time of day in Japan and UK, although they are different points in time (timezones and all that noise).
That aside, I'm very happy with the solution i ended up with, relying on a simple user-function to change the iso8601 string into a number for fast and relyably sortable results:

aqlfunctions.register("xpectra::availability::iso8601todatenumber",function(iso8601Date) {var str = iso8601Date.substr(0,4) + iso8601Date.substr(5,2) + iso8601Date.substr(8,2);   return Number(str); });


The first iterations of the code (when i started this thread) was executed from C#, but moving the queries to a Foxx-service really helped encapsulate the query and insert/update logic.
Just leaving this here for others who might have similar challanges.

/Julian

Jan Steemann

unread,
Dec 29, 2015, 10:52:28 AM12/29/15
to aran...@googlegroups.com
Note that you could do the conversion in plain AQL too:

  LET dt = '2015-12-29T16:51:00' /* insert actual date here */
  RETURN TO_NUMBER(CONCAT(SUBSTRING(dt, 0, 4), SUBSTRING(dt, 5, 2), SUBSTRING(dt, 8, 2)))

Julian May

unread,
Dec 30, 2015, 7:57:03 AM12/30/15
to ArangoDB
I did not know that, thanks!

/Julian
Reply all
Reply to author
Forward
0 new messages