timestamp problem

Kévin Garro

Jun 20, 2016, 9:47:00 AM
to DataStax Python Driver for Apache Cassandra User Mailing List
Hi everyone

In my project I try to save the current date, and I do it in two different ways (to experiment).

The first way is to call

datetime.datetime.now()

in Python and send the result (a datetime) as a query parameter. That produces a datetime object such as

datetime.datetime(2016, 6, 20, 15, 38, 16, 510234)

for example.

The second way is to send the string 'dateof(now())', which actually works and stores a timestamp in Cassandra.

The problem is that cqlsh shows me a different timestamp (+2h) from what I sent.

Yet when I read the data back from Cassandra, it returns the correct timestamp!

If I send

datetime.datetime(2016, 6, 20, 15, 38, 16, 510234)

it is saved as

'2016-6-20 17:38:16+0200'
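
Roughly, the two ways look like this (a minimal sketch; the contact point and keyspace name are placeholders, and the real session setup is in my project):

import datetime
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')

# First way: send a Python datetime as a bound parameter.
# datetime.now() returns local wall-clock time with no tzinfo.
session.execute(
    "INSERT INTO test (id, date) VALUES (%s, %s)",
    ('ok', datetime.datetime.now()))

# Second way: let Cassandra generate the timestamp server-side.
session.execute(
    "INSERT INTO test (id, date) VALUES ('testTemps', dateof(now()))")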

I ran an experiment: I created a test table and did an INSERT with dateof(now()) by hand, and that showed me the correct timestamp in Cassandra, so I don't understand what's happening, even though it works in my project.

Could you explain it to me, please?


Thank you in advance, and sorry for my poor English.

Alan Boudreault

Jun 20, 2016, 11:30:00 AM
to python-dr...@lists.datastax.com
Hello Kevin,

Some points to keep in mind when playing with datetimes:

1) The driver implicitly converts a datetime to UTC if timezone info is present.
2) If not, the time is assumed to already be UTC.
3) cqlsh assumes the stored value is UTC and converts it to local time for display in the REPL.
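
For example, a minimal sketch of points 2) and 3) (the contact point, keyspace, and table names are placeholders, not your setup):

import datetime
from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('mykeyspace')

# Point 2: a naive datetime is assumed to already be UTC.
# datetime.now() is local wall-clock time, so on a UTC+2 machine the
# stored instant ends up 2 hours in the future.
right = datetime.datetime.utcnow()   # naive, but genuinely UTC

session.execute("INSERT INTO test (id, date) VALUES (%s, %s)", ('a', right))

# Point 3: cqlsh then shows the stored UTC value converted to local time,
# e.g. 15:38:16 UTC is displayed as 2016-06-20 17:38:16+0200.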

Can you give us more details on what you tried with dateof()/now()? (table schema, CQL statements, etc.)

Here is good documentation for dates and times with the Python driver: http://datastax.github.io/python-driver/dates_and_times.html

Regards,
Alan


--

Alan Boudreault
Software Engineer (Drivers) | alan.bo...@datastax.com


Kévin Garro

Jun 20, 2016, 12:08:58 PM
to DataStax Python Driver for Apache Cassandra User Mailing List
Hello Alan,
Thanks for your answer and the information!
Before this, I had a problem with datetimes: when I read the data back, my timestamps were +2h off, so I wanted to store and read back the exact Cassandra timestamp.
But since the error "Failed to decode ... as int: unpack requires a string argument of length 4" this morning, when I rebooted my server, I now get the correct timestamp in both of my tables. I thought the error was caused by my disk space, so I only truncated the table. The error occurred on my table "unite". But also, since that error, cqlsh no longer shows the correct timestamp. And when I test dateof(now()) on my table test, it works and shows the correct timestamp.

Here are my tables:

CREATE TABLE unite (
  unite int,
  date timestamp,
  humidite float,
  lumiere float,
  temperature float,
  PRIMARY KEY ((unite), date)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

CREATE TABLE gaiadata (
  unite int,
  date timestamp,
  humidite float,
  lumiere float,
  temperature float,
  PRIMARY KEY ((unite), date)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

CREATE TABLE test (
  id text,
  date timestamp,
  PRIMARY KEY ((id))
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};



Here are some of my results, using the following query:
select * from X;


table test:

 id        | date
-----------+--------------------------
        ok | 2016-06-20 15:43:32+0200
 testTemps | 2016-06-20 14:57:25+0200

table unite:

 unite | date                     | humidite | lumiere | temperature
-------+--------------------------+----------+---------+-------------
     1 | 2016-06-20 15:53:49+0200 |       35 |    1023 |          26
     1 | 2016-06-20 15:54:09+0200 |       36 |    1023 |          26
     1 | 2016-06-20 15:54:29+0200 |       36 |    1023 |          26
     1 | 2016-06-20 15:57:13+0200 |       36 |    1023 |          26
     1 | 2016-06-20 15:57:33+0200 |       35 |    1023 |          26
     1 | 2016-06-20 15:57:53+0200 |       35 |    1007 |          26
     1 | 2016-06-20 15:59:38+0200 |       36 |    1023 |          26
     1 | 2016-06-20 15:59:58+0200 |       35 |    1023 |          26
     1 | 2016-06-20 16:00:18+0200 |       35 |    1023 |          26
     1 | 2016-06-20 16:00:38+0200 |       35 |    1023 |          26
     1 | 2016-06-20 16:03:28+0200 |       35 |    1022 |          26
     1 | 2016-06-20 16:03:48+0200 |       35 |    1023 |          26


table gaiadata:

 unite | date                     | humidite | lumiere | temperature
-------+--------------------------+----------+---------+-------------
     1 | 2016-06-20 16:46:44+0200 |       33 |    1023 |          27
     1 | 2016-06-20 16:47:04+0200 |       33 |    1023 |          27
     1 | 2016-06-20 16:47:24+0200 |       33 |    1023 |          27
     1 | 2016-06-20 16:47:45+0200 |       33 |    1023 |          27
     1 | 2016-06-20 16:48:05+0200 |       33 |    1023 |          27
     1 | 2016-06-20 16:48:25+0200 |       33 |    1023 |          27
     1 | 2016-06-20 16:48:46+0200 |       40 |    1023 |          27
     1 | 2016-06-20 16:49:06+0200 |       33 |    1023 |          27
     1 | 2016-06-20 16:49:26+0200 |       33 |    1023 |          27
     1 | 2016-06-20 16:49:47+0200 |       33 |     988 |          27
     1 | 2016-06-20 16:50:07+0200 |       34 |    1023 |          27

I use dateOf(now()) with gaiadata, and I send the value of datetime.datetime.now() to unite.
My current version of Cassandra is 2.0.17.
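
For reference, the insert into unite looks roughly like this in my code (a sketch; the session setup is omitted and the sensor values are examples):

import datetime

session.execute(
    "INSERT INTO unite (unite, date, humidite, lumiere, temperature) "
    "VALUES (%s, %s, %s, %s, %s)",
    (1, datetime.datetime.now(), 35.0, 1023.0, 26.0))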

In my project I use only one node; we are currently at the proof-of-concept stage.

Thank you in advance for your help.
Best regards,
Kévin (a student)

Max Campos

Jun 22, 2016, 10:46:03 PM
to DataStax Python Driver for Apache Cassandra User Mailing List
This is very timely; I was just about to post a comment on JIRA issue PYTHON-145.

One of the most confusing things I find about the current implementation is that if you take a datetime with a non-UTC timezone, and then insert and select it, you don't get the same time back.  Instead you get the original datetime converted to UTC but not tagged with any timezone.

For my particular situation, I use UTC for all timestamps going into/out of C*, but that still means I have to manually add the UTC timezone for any timestamps selected from Cassandra.  

Also: for Python 3, you can add a UTC timezone without introducing a pytz dependency (datetime.timezone.utc), though if this were applied by default to selected timestamp columns, I suppose it could break people who today consistently insert and select naive datetimes.
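
Concretely, that manual step looks something like this (a sketch; the table is the one from earlier in the thread, and the session setup is omitted):

from datetime import timezone  # Python 3

rows = session.execute("SELECT id, date FROM test")
for row in rows:
    # The driver hands back a naive datetime; re-attach the UTC timezone
    # so later arithmetic and formatting are unambiguous.
    aware = row.date.replace(tzinfo=timezone.utc)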

I'll give this some more thought and come up with a proposal for adding timezone support to the driver, and then, if you guys at DataStax like it, I could submit a patch?

- Max

Alan Boudreault

Jun 23, 2016, 11:07:51 AM
to python-dr...@lists.datastax.com
Hi Max,

The reasoning behind this is explained in the following document, in case you hadn't seen it yet: http://datastax.github.io/python-driver/dates_and_times.html

Although our preference is to let this be handled at the application level, for the several reasons stated in the document above, I understand that it can be confusing for new users. I don't think we plan to change the way it works for the moment, but we would be interested to see your work if you have anything done.

Regards,
Alan


--

Alan Boudreault
Software Engineer (Drivers) | alan.bo...@datastax.com



Max Campos

Jun 24, 2016, 3:21:13 AM
to DataStax Python Driver for Apache Cassandra User Mailing List
Hi Alan,

Yes, I've read that document.  IMO, there are a few issues with the current behavior:

1) If timezone handling should be the sole responsibility of the application, then the driver shouldn't convert timezone-aware datetimes to UTC during encoding (see the 'before hack pst' case below). When the driver changes the datetime during encoding, it effectively puts itself into the timezone business.

2) The doc says "We follow this approach because the datetime API has deficiencies around daylight saving time, and the defacto package for handling this is a third-party package (we try to minimize external dependencies and not make decisions for the integrator)." Yes, it does have deficiencies, but those deficiencies do not apply when converting an already timezone-aware datetime to UTC, or when taking a naive datetime and applying a UTC timezone, since UTC doesn't use DST. Furthermore, a UTC timezone can be applied without introducing a dependency on pytz.

I think the easiest way to fix this would be to always attach a UTC timezone to columns decoded by cassandra.util.datetime_from_timestamp. To avoid the pytz dependency, you can get a UTC tzinfo from datetime.timezone.utc in Python 3, or by defining the timezone using the sample code from the Python 2.7 docs. The consequence, however, is that it breaks people who are using naive timestamps: suddenly a selected timestamp has a UTC timezone applied when it previously did not.
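
For reference, the Python 2.7 docs define a UTC tzinfo along these lines:

from datetime import tzinfo, timedelta

ZERO = timedelta(0)

class UTC(tzinfo):
    """UTC tzinfo, adapted from the example in the Python 2.7 docs."""
    def utcoffset(self, dt):
        return ZERO
    def tzname(self, dt):
        return "UTC"
    def dst(self, dt):
        return ZERO

utc = UTC()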

My method of hot-hacking (err... patching) the driver to do this is simply:

import cassandra.util
from datetime import timezone  # Python 3

cassandra.util.DATETIME_EPOC = cassandra.util.DATETIME_EPOC.replace(tzinfo=timezone.utc)

Here are the before/after results from a little test application I wrote:

BEFORE HACK:
       What I inserted...           What I selected...       
utc  : 2016-01-01 04:44:44+00:00 -> 2016-01-01 04:44:44       OOPS!
naive: 2016-01-01 04:44:44       -> 2016-01-01 04:44:44       OK!
pst  : 2016-01-01 04:44:44-08:00 -> 2016-01-01 12:44:44       OOPS!

AFTER HACK:
       What I inserted...           What I selected...       
utc  : 2016-01-01 04:44:44+00:00 -> 2016-01-01 04:44:44+00:00 OK!
naive: 2016-01-01 04:44:44       -> 2016-01-01 04:44:44+00:00 OOPS!
pst  : 2016-01-01 04:44:44-08:00 -> 2016-01-01 12:44:44+00:00 OK!
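
A round trip like this can be reproduced in a few lines (a Python 3 sketch, not the attached file itself; the contact point and keyspace name are placeholders):

import datetime
from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('mykeyspace')

utc = datetime.timezone.utc
pst = datetime.timezone(datetime.timedelta(hours=-8))  # fixed -08:00 offset

cases = [
    ('utc',   datetime.datetime(2016, 1, 1, 4, 44, 44, tzinfo=utc)),
    ('naive', datetime.datetime(2016, 1, 1, 4, 44, 44)),
    ('pst',   datetime.datetime(2016, 1, 1, 4, 44, 44, tzinfo=pst)),
]

for name, dt in cases:
    # Insert each case, read it back, and compare what comes out.
    session.execute("INSERT INTO test (id, date) VALUES (%s, %s)", (name, dt))
    row = list(session.execute("SELECT date FROM test WHERE id = %s", (name,)))[0]
    print('%-5s: %s -> %s' % (name, dt, row.date))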

I've attached my test application. That's my opinion, anyway. If you guys become interested in changing the current behavior, let me know and I'd be happy to work on a patch.

Thanks for all of your work on the driver.  :-)

- Max
Attachment: timezone_test.py