Re: How to import POSIXct variables in a h2o data frame


Tom Kraljevic

Dec 9, 2015, 11:43:19 AM
to Stéphane Tufféry, H2O Open Source Scalable Machine Learning - h2ostream

[ Sending to h2ostream, the open source community forum… ]


Hi,

Convert the column to a Date first using as.Date().
H2O does not understand POSIXct...
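
For illustration, here is a minimal base-R sketch of that conversion (it runs without H2O). The sample timestamp is taken from the question below, and the UTC time zone is an assumption. Note that as.Date() keeps only the calendar day, so the time of day is lost:

```r
# A POSIXct value like the ones in the Date_Ok column (UTC is assumed here)
x <- as.POSIXct("2015-02-11 19:41:26", tz = "UTC")

# Convert to Date; the tz argument tells as.Date() which zone defines the day
d <- as.Date(x, tz = "UTC")

print(class(d))   # "Date"
print(format(d))  # "2015-02-11" (hours, minutes and seconds are dropped)
```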

Tom


On Dec 9, 2015, at 4:28 AM, Stéphane Tufféry <tuffery....@gmail.com> wrote:

How can I import POSIXct variables into an H2O data frame?
Example below:
 
> str(Test2)
'data.frame':   500000 obs. of  3 variables:
$ num_carte   : num  4.98e+15 4.98e+15 4.98e+15 4.98e+15 4.98e+15 ...
$ code_reponse: num  0 0 0 0 0 0 0 0 0 0 ...
$ Date_Ok     : POSIXct, format: "2015-02-11 19:41:26" "2015-02-12 19:30:32" "2015-02-13 12:19:19" "2015-02-20 15:05:25" ...
> Test_h2o <- as.h2o(Test2)
 
ERROR: Unexpected HTTP Status code: 412 Precondition Failed (url = http://localhost:54321/3/Parse)
 
water.exceptions.H2OIllegalArgumentException
 [1] "water.parser.ParseSetup.strToColumnTypes(ParseSetup.java:135)"
 [2] "water.api.ParseHandler.parse(ParseHandler.java:14)"
 [3] "sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)"
 [4] "sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)"
 [5] "sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)"
 [6] "java.lang.reflect.Method.invoke(Method.java:497)"
 [7] "water.api.Handler.handle(Handler.java:64)"
 [8] "water.api.RequestServer.handle(RequestServer.java:644)"
 [9] "water.api.RequestServer.serve(RequestServer.java:585)"
[10] "water.JettyHTTPD$H2oDefaultServlet.doGeneric(JettyHTTPD.java:617)"
[11] "water.JettyHTTPD$H2oDefaultServlet.doPost(JettyHTTPD.java:565)"
[12] "javax.servlet.http.HttpServlet.service(HttpServlet.java:755)"
[13] "javax.servlet.http.HttpServlet.service(HttpServlet.java:848)"
[14] "org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)"
 
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page,  :
  Provided column type c("POSIXct", "POSIXt") is unknown.  Cannot proceed with parse due to invalid argument.
>
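
The "column type" quoted in the error message looks like the R class vector of the Date_Ok column. A small base-R check (no H2O needed; the sample value and the UTC time zone are assumptions) shows that a POSIXct object carries exactly those two classes:

```r
# Build a POSIXct value similar to the Date_Ok column (UTC is assumed)
x <- as.POSIXct("2015-02-11 19:41:26", tz = "UTC")

# A POSIXct object has two classes, matching the vector named in the error
print(class(x))  # "POSIXct" "POSIXt"
```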
 
Thanks a lot in advance!
 
Best regards,
 
Stéphane
 

Stéphane Tufféry

Dec 9, 2015, 3:26:24 PM
to Tom Kraljevic, H2O Open Source Scalable Machine Learning - h2ostream

Hi,

 

Thanks, but I want to use functions such as difftime on date-time variables, not only on date variables.

 

Stéphane

 

 

From: Tom Kraljevic [mailto:to...@h2o.ai]
Sent: Wednesday, December 9, 2015 17:43
To: Stéphane Tufféry
Cc: H2O Open Source Scalable Machine Learning - h2ostream
Subject: Re: How to import POSIXct variables in a h2o data frame

Tom Kraljevic

Dec 9, 2015, 4:09:28 PM
to Stéphane Tufféry, H2O Open Source Scalable Machine Learning - h2ostream

H2O stores the value as a millis number.
So, you can do math on seconds if you need to.

I’m not sure as.Date() can handle seconds, but the string form definitely does.
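
As a rough illustration of what a "millis number" looks like, the base-R sketch below turns a POSIXct value into milliseconds and back. It assumes the count starts at the Unix epoch (1970-01-01 UTC) and it does not call H2O at all; the sample timestamp comes from the original question:

```r
# A sample timestamp from the question (UTC is assumed)
t1 <- as.POSIXct("2015-02-11 19:41:26", tz = "UTC")

# POSIXct is stored as seconds since 1970-01-01 UTC; scale to milliseconds
millis <- as.numeric(t1) * 1000

# Convert the millisecond count back to a date-time
back <- as.POSIXct(millis / 1000, origin = "1970-01-01", tz = "UTC")

print(format(back, "%Y-%m-%d %H:%M:%S"))  # "2015-02-11 19:41:26"
```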

Tom

Stéphane Tufféry

Dec 10, 2015, 7:00:13 AM
to Tom Kraljevic, H2O Open Source Scalable Machine Learning - h2ostream

Hi Tom,

 

Thank you for your quick reply :)

 

So I get something like this:

 

> class(Test$Date_Heure)
[1] "character"
> head(Test$Date_Heure)
[1] "2015-02-11 19:41:26" "2015-02-12 19:30:32" "2015-02-13 12:19:19" "2015-02-20 15:05:25" "2015-02-22 15:36:58"
[6] "2015-02-28 08:22:49"
> Test_h2o <- as.h2o(Test[, c("num_carte", "code_reponse", "Date_Heure")])
> head(Test_h2o)
    num_carte code_reponse          Date_Heure
1 4.97783e+15            0 2015-02-11 19:41:26
2 4.97783e+15            0 2015-02-12 19:30:32
3 4.97783e+15            0 2015-02-13 12:19:19
4 4.97783e+15            0 2015-02-20 15:05:25
5 4.97783e+15            0 2015-02-22 15:36:58
6 4.97783e+15            0 2015-02-28 08:22:49
>
> Test_h2o$Date_Heure[2]-Test_h2o$Date_Heure[1]
  Date_Heure
1        NaN

[1 rows x 1 columns]
> Test_h2o$code_reponse[2]-Test_h2o$code_reponse[1]
  code_reponse
1            0

[1 rows x 1 columns]
>

 

What do you mean by “H2O stores the value as a millis number”?

How can I compute the number of seconds between 2015-02-12 19:30:32 and 2015-02-11 19:41:26?
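
For reference, this is the result the subtraction should give, worked out in plain R rather than in H2O. It is only a sketch: UTC is assumed for both timestamps, and the millisecond variant assumes values counted from the Unix epoch:

```r
# The two timestamps from the question (UTC is assumed)
t1 <- as.POSIXct("2015-02-11 19:41:26", tz = "UTC")
t2 <- as.POSIXct("2015-02-12 19:30:32", tz = "UTC")

# Direct difference in seconds with difftime()
diff_secs <- as.numeric(difftime(t2, t1, units = "secs"))

# Same difference computed from epoch milliseconds, then scaled to seconds
millis1 <- as.numeric(t1) * 1000
millis2 <- as.numeric(t2) * 1000
diff_from_millis <- (millis2 - millis1) / 1000

print(diff_secs)         # 85746 (one day minus 10 min 54 s)
print(diff_from_millis)  # 85746
```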

 

Thanks a lot!

 

Stéphane

 

 

From: Tom Kraljevic [mailto:to...@h2o.ai]
Sent: Wednesday, December 9, 2015 22:09

Tom Kraljevic

Dec 10, 2015, 10:15:51 AM
to Stéphane Tufféry, Tom Kraljevic, H2O Open Source Scalable Machine Learning - h2ostream
Use h2o.importFile("myfile.csv") rather than as.h2o() so the H2O parser can guess the right types.

tom


Matthew Landowski

Dec 19, 2015, 9:57:33 PM
to H2O Open Source Scalable Machine Learning - h2ostream, tuffery....@gmail.com
My hack for this is to convert my date/time objects to character objects.

df$datetime <- as.character(df$datetime)
df.h2o <- as.h2o(df)


Another option, which I think someone mentioned, is to write the data frame to a CSV file and then upload the CSV file.
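
A minimal sketch of the CSV route, under a few assumptions: the toy data frame and the temporary file path are made up for illustration, the timestamps are treated as UTC, and the date-time column is written with an explicit format string so the seconds are always present. The H2O call is left as a comment because it needs a running cluster:

```r
# Toy data frame with a POSIXct column (values borrowed from the thread)
df <- data.frame(
  code_reponse = c(0, 0),
  Date_Ok = as.POSIXct(c("2015-02-11 19:41:26", "2015-02-12 19:30:32"),
                       tz = "UTC")
)

# Write the date-times as plain "YYYY-MM-DD HH:MM:SS" strings
df$Date_Ok <- format(df$Date_Ok, "%Y-%m-%d %H:%M:%S")

# Save to a temporary CSV file (hypothetical path, for illustration only)
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)

# With a running H2O cluster, the file could then be loaded like this
# (not executed here):
#   library(h2o); h2o.init()
#   Test_h2o <- h2o.importFile(csv_path)

# Inspect what was written
csv_lines <- readLines(csv_path)
print(csv_lines)
```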
