Dataset already in use exception


vraman...@gmail.com

Sep 9, 2014, 2:08:55 PM
to h2os...@googlegroups.com
Here's the use case

Single server/instance h2o
1. I import dataset using REST API
2. Kick off training (DL) using REST API
3. Launch R
->
-> data.train = h2o.getFrame(conn, '...')
I get this exception. Isn't getFrame(..) read-only, or is it trying to update something while reading?
Would this also prevent running multiple models on the same data frame? Thanks.

java.lang.IllegalArgumentException: Dataset .... is already in use. Unable to use it now. Consider using a different destination name.
+ at water.Lockable$PriorWriteLock.atomic(Lockable.java:85)
+ at water.Lockable$PriorWriteLock.atomic(Lockable.java:74)
+ at water.TAtomic.atomic(TAtomic.java:19)
+ at water.Atomic.compute2(Atomic.java:58)
+ at water.Atomic.fork(Atomic.java:42)
+ at water.Atomic.invoke(Atomic.java:34)
+ at water.Lockable.write_lock(Lockable.java:60)
+ at water.exec.Env.remove_and_unlock(Env.java:349)
+ at water.api.Exec2.serve(Exec2.java:71)
+ at water.api.Request.serveGrid(Request.java:165)
+ at water.Request2.superServeGrid(Request2.java:481)
+ at water.api.Exec2.serveGrid(Exec2.java:78)
+ at water.api.Request.serve(Request.java:142)
+ at water.api.RequestServer.serve(RequestServer.java:479)
+ at water.NanoHTTPD$HTTPSession.run(NanoHTTPD.java:424)
+ at java.lang.Thread.run(Thread.java:724)

Tom Kraljevic

Sep 9, 2014, 6:35:56 PM
to vraman...@gmail.com, h2os...@googlegroups.com

Hi Venkatesh,


This is actually the same issue that you reported on August 27.
We are tracking it in Jira PUB-1018.

Note that this is just an IllegalArgumentException, so no damage was done. It’s just a bit ugly.


Thanks,
Tom

vraman...@gmail.com

Sep 9, 2014, 8:30:59 PM
to h2os...@googlegroups.com, vraman...@gmail.com
Thanks Tom. My memory is short-lived :)

ke...@0xdata.com

Sep 11, 2014, 5:05:31 AM
to h2os...@googlegroups.com, vraman...@gmail.com
Not sure what you're trying to do, but the source of the operations doesn't matter as long as things are synchronized correctly.
So you ask: how can I synchronize when I wasn't the one who initiated the work request?

Well, most lengthy operations are considered "jobs" and enter the jobs list.

So you can poll the jobs list for any jobs that aren't done yet, and not use any data frames until then.

Or you might pattern match on certain types of jobs (parse, GLM, etc.).

So you may be able to achieve what you want to do; it's just that you might need some synchronization you currently don't have.
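
A minimal sketch of that polling idea, in Python. How the jobs list is fetched and how a job signals completion are left as caller-supplied functions, since the exact REST endpoint and response fields depend on the H2O version. The names `wait_for_jobs`, `fetch_jobs`, `is_done` and `matches` are hypothetical and are not part of any H2O API:

```python
import time


def wait_for_jobs(fetch_jobs, is_done, matches=lambda job: True,
                  poll_seconds=1.0, timeout_seconds=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until no matching job is still running; return the number of polls made.

    fetch_jobs: callable returning the current jobs list (assumed: an iterable of dicts).
    is_done:    callable(job) -> bool, True once that job has finished.
    matches:    callable(job) -> bool, selects which jobs to wait for
                (for example only parse or GLM jobs).
    """
    deadline = clock() + timeout_seconds
    polls = 0
    while True:
        polls += 1
        # Jobs we care about that have not finished yet.
        pending = [job for job in fetch_jobs() if matches(job) and not is_done(job)]
        if not pending:
            return polls
        if clock() >= deadline:
            raise TimeoutError(
                f"{len(pending)} job(s) still running after {timeout_seconds}s")
        sleep(poll_seconds)
```

A client would call this before touching a data frame, passing a `fetch_jobs` that reads the server's jobs list and, if only some job types matter, a `matches` predicate that filters on the job description.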

For instance, I run tests that alternate between requests from a Python feed and a browser feed. So there's nothing inherently wrong with it, other than messing with things while they're changing.

I notice the stack trace up above mentions "Exec":


at water.api.Exec2.serve(Exec2.java:71)
+ at water.api.Request.serveGrid(Request.java

Exec is a special land that does locking and unlocking in a different way than the base algos.

There are some restrictions about doing Exec things in parallel with other things, and you might be hitting that.

The intent was "no restrictions" as long as independent names were used. But there may be some issues.
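
One way to keep names independent is to generate a fresh destination name per request. This is only a sketch: `unique_key` is a hypothetical helper, the prefix-plus-UUID format is an assumption, and it does not rule out the Exec locking issues mentioned above:

```python
import uuid


def unique_key(prefix: str) -> str:
    """Return a destination name that is very unlikely to clash with an existing key."""
    # uuid4().hex is 32 random hex characters, so two calls practically never collide.
    return f"{prefix}_{uuid.uuid4().hex}"
```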

Prior discussion seemed to suggest that you were doing unsynchronized "same key" use. Is that true?

-kevin
