DistributedException from /<ip>:54321, caused by java.lang.IllegalArgumentException: 0 > -2147483648

daiy...@gmail.com

Jun 15, 2016, 5:37:18 PM
to H2O Open Source Scalable Machine Learning - h2ostream
I am using h2o version 3.8.2.9 on a 5-node h2o cluster and get this error when parsing a file uploaded from R (3.2.2, 64-bit, on Ubuntu 14.04 LTS).

tmp = h2o.uploadFile(path = "test.csv.gz", parse = FALSE)
e2 = h2o.parseRaw(tmp, header = TRUE, col.types = c("Numeric", "Enum", "Enum", "String",
"Numeric", "Enum", "Enum", "Enum", "Enum", "Numeric",
"Enum", "Enum", "Enum", "Enum", "String", "Enum"), destination_frame = "test")

After the progress bar reaches 100%, an error appears on the R console:
DistributedException from /<IP>:54321, caused by java.lang.IllegalArgumentException: 0 > -2147483648

The h2o log on the node at that IP shows the following:

06-15 13:43:16.463 <IP>:54321 2782 #46969-15 INFO: Method: POST , URI: /3/Parse, route: /3/Parse, parms: {number_columns=16, source_frames=["test.csv.gz_sid_a58d_21"], single_quotes=FALSE, column_types=["Numeric","Enum","Enum","String","Numeric","Enum","Enum","Enum","Enum","Numeric","Enum","Enum","Enum","Enum","String","Enum"], parse_type=CSV, destination_frame=test, delete_on_done=TRUE, column_names=["EID","DID","NAME","DESCRIPTION","TIMESTAMP","ACK","a","b","c","d","d","f","g","h","i","j"], check_header=1, separator=9, blocking=FALSE, na_strings=[], chunk_size=4194304}
06-15 13:43:16.466 <IP>:54321 2782 #46969-15 INFO: Total file size: 420.2 MB
06-15 13:43:16.467 <IP>:54321 2782 #46969-15 INFO: Parse chunk size 4194304
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: DistributedException from /<IP>:54321, caused by java.lang.IllegalArgumentException: 0 > -2147483648
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.RPC.result(RPC.java:241)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.RPC.get(RPC.java:257)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.parser.ParseDataset.parseAllKeys(ParseDataset.java:303)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.parser.ParseDataset.access$000(ParseDataset.java:27)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.parser.ParseDataset$ParserFJTask.compute2(ParseDataset.java:190)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.H2O$H2OCountedCompleter.compute(H2O.java:1194)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: Caused by: java.lang.IllegalArgumentException: 0 > -2147483648
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at java.util.Arrays.copyOfRange(Arrays.java:2549)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.MemoryManager.malloc(MemoryManager.java:235)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.MemoryManager.malloc(MemoryManager.java:207)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.MemoryManager.arrayCopyOfRange(MemoryManager.java:269)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.AutoBuffer.expandByteBuffer(AutoBuffer.java:682)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.AutoBuffer.putA4(AutoBuffer.java:1253)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.AutoBuffer.putAA4(AutoBuffer.java:1364)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.parser.ParseDataset$CategoricalUpdateMap$Icer.write86(ParseDataset$CategoricalUpdateMap$Icer.java)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.parser.ParseDataset$CategoricalUpdateMap$Icer.write(ParseDataset$CategoricalUpdateMap$Icer.java)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.Iced.write(Iced.java:61)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.Iced.asBytes(Iced.java:42)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.Value.<init>(Value.java:317)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.Value.<init>(Value.java:312)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.DKV.put(DKV.java:55)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.parser.ParseDataset$CreateParse2GlobalCategoricalMaps.compute2(ParseDataset.java:408)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.H2O$H2OCountedCompleter.compute1(H2O.java:1197)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.parser.ParseDataset$CreateParse2GlobalCategoricalMaps$Icer.compute1(ParseDataset$CreateParse2GlobalCategoricalMaps$Icer.java)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: at water.H2O$H2OCountedCompleter.compute(H2O.java:1193)
06-15 13:46:15.218 <IP>:54321 2782 FJ-1-1 ERRR: ... 5 more

Parsing through the Flow interface gives the same error.
Not specifying col.types gets rid of the error.
Changing "String" to "Enum" still gives the same error.
This error did not occur on 3.8.0.6.

Tomas Nykodym

Jun 15, 2016, 8:42:11 PM
to H2O Open Source Scalable Machine Learning - h2ostream, daiy...@gmail.com
Thx for reporting. 
Looks like the error is caused by an enum column with too many levels (or by too many levels in all categorical columns combined).
You should be able to get around it for now by setting the type of the highest cardinality categorical to String.
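
For context, the message "0 > -2147483648" is what java.util.Arrays.copyOfRange reports when its "to" argument is Integer.MIN_VALUE, which is consistent with a buffer capacity overflowing a 32-bit int while it grows. The sketch below only reproduces that arithmetic in plain Java. The class and method names are hypothetical, and the doubling rule is an assumption for illustration, not H2O's actual AutoBuffer code.

```java
import java.util.Arrays;

// Hypothetical illustration; not taken from the H2O sources.
class CapacityOverflowSketch {
    // Assumed growth rule: double the current capacity using int arithmetic.
    static int doubledCapacity(int current) {
        return current * 2; // wraps to a negative value once current >= 2^30
    }

    // Attempts the copy a buffer expansion would perform and returns the
    // exception message, or null if the copy succeeded.
    static String tryGrow(byte[] buf, int newCapacity) {
        try {
            Arrays.copyOfRange(buf, 0, newCapacity);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Doubling a 1 GiB capacity overflows to Integer.MIN_VALUE.
        int wrapped = doubledCapacity(1 << 30);
        System.out.println(wrapped);                       // -2147483648
        System.out.println(tryGrow(new byte[8], wrapped)); // 0 > -2147483648
    }
}
```

If something like this is what happens, the serialized categorical maps would be growing past what a single int-sized byte buffer can hold, which would fit the workaround of moving the highest-cardinality column to String.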

Were all the "Enum" columns parsed correctly, or were some turned into NAs in 3.8.0.6?

Thx,

Tomas

daiy...@gmail.com

Jun 15, 2016, 8:57:30 PM
to H2O Open Source Scalable Machine Learning - h2ostream, daiy...@gmail.com
h2o version 3.8.1.4 works. On this version I ran:

h2o.table(is.na(test[,i]))

for every column, where i is the column number. None of the enum columns had any NAs.

btw, when i is greater than the number of columns, it gives the following error:

ERROR MESSAGE:

Column must be an integer from 0 to 15

Since R indexes columns starting at 1, the message in R should say 1 to 16 instead.
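
One way the message could account for the client's indexing convention is sketched below in Java. The class name, method name, and the indexBase parameter are all hypothetical and are not part of H2O; the sketch only shows the range being shifted by the client's index base.

```java
// Hypothetical illustration; not taken from the H2O sources.
class ColumnRangeMessage {
    // indexBase is 0 for zero-based clients and 1 for one-based clients such as R.
    static String forClient(int numColumns, int indexBase) {
        int first = indexBase;
        int last = numColumns - 1 + indexBase;
        return "Column must be an integer from " + first + " to " + last;
    }

    public static void main(String[] args) {
        System.out.println(forClient(16, 0)); // Column must be an integer from 0 to 15
        System.out.println(forClient(16, 1)); // Column must be an integer from 1 to 16
    }
}
```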
