worksheets aren't saved using docker on OS X


Peter Kerpedjiev

Feb 9, 2015, 3:46:18 AM
to spark-not...@googlegroups.com
Hi Andy,

When I create a new worksheet, save and checkpoint, exit and shut down docker (using ctrl-c), and restart it, the new worksheet isn't there. Have you encountered something like this before?

The command to start the instance:

docker run -v /Users/pkerp/projects/chairliftplot/:/mnt -p 9000:9000 andypetrella/spark-notebook:0.2.0-spark-1.2.0-hadoop-1.0.4

The log messages after closing the browser:

15/02/09 08:38:12 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://Rem...@127.0.0.1:41602]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /127.0.0.1:41602
15/02/09 08:38:12 INFO remote.RemoteActorRefProvider$RemoteDeadLetterActorRef: Message [akka.remote.RemoteWatcher$Heartbeat$] from Actor[akka://NotebookServer/system/remote-watcher#-457307005] to Actor[akka://NotebookServer/deadLetters] was not delivered. [8] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

The stack-overflow question:

http://stackoverflow.com/questions/28405699/spark-notebook-worksheets-not-saved-with-docker

cheers,

-Peter

andy petrella

Feb 9, 2015, 4:53:43 AM
to Peter Kerpedjiev, spark-not...@googlegroups.com
Hello Peter,

Okay, it sounds like the notebook wasn't created in the notebooks dir, I guess. Did you provide the dir to the notebook manually?

If not, it will be stored in the current spark-notebook/conf/notebooks directory, which might not be kept between runs, unless (I hope) the previous container instance is restarted.
Since docker run starts a new container every time, that might be the problem. Could you try a docker start on the previous instance?
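Something like this (the container id is a placeholder):

docker ps -a                 # lists all containers, including stopped ones
docker start <container-id>  # restarts that same container, filesystem intact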

So a good thing to do is to use the folder you mapped to also hold the notebooks (but you won't have access to the default ones unless you copy them from the fs).
This can be done with an extra parameter in the launch command: -Dmanager.notebooks.dir="<absolute path to the notebooks dir>"
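For example, reusing the mount you already have (untested on my side):

docker run -v /Users/pkerp/projects/chairliftplot/:/mnt -p 9000:9000 andypetrella/spark-notebook:0.2.0-spark-1.2.0-hadoop-1.0.4 -Dmanager.notebooks.dir="/mnt"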

I'll answer the question when I'm sure :-D

Thanks for reporting,
cheers,
andy


Peter Kerpedjiev

Feb 9, 2015, 6:07:18 AM
to andy petrella, spark-not...@googlegroups.com
Hey,

OK, so it doesn't save the newly created notebooks because I don't commit the docker image before killing the instance. This is explained in the answers to the stack overflow question linked above.
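For the record, the commit route goes roughly like this (the tag name is made up):

docker ps -l                                                # id of the most recently run container
docker commit <container-id> spark-notebook:my-worksheets   # bake its filesystem into a new image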

You're right that a better thing to do would be to store the notebooks in a local folder. Running it like so:

docker run -v /Users/pkerp/projects/chairliftplot/:/mnt -p 9000:9000 andypetrella/spark-notebook:0.2.0-spark-1.2.0-hadoop-1.0.4 -Dmanager.notebooks.dir="/mnt"

yields the error below. Apparently the /mnt directory is mounted read-only. I'll try and fix this later today and report what I find. If you've encountered this or have an idea of how to fix it, let me know.

cheers,

-Peter

Caused by: java.io.FileNotFoundException: /mnt/Untitled1.snb (Permission denied)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
        at org.apache.commons.io.FileUtils.openOutputStream(FileUtils.java:367)
        at org.apache.commons.io.FileUtils.writeStringToFile(FileUtils.java:1928)
        at org.apache.commons.io.FileUtils.writeStringToFile(FileUtils.java:1962)
        at notebook.server.NotebookManager.save(NotebookManager.scala:107)
        at notebook.server.NotebookManager.newNotebook(NotebookManager.scala:50)
        at controllers.Application$$anonfun$newNotebook$1.apply(Application.scala:154)
        at controllers.Application$$anonfun$newNotebook$1.apply(Application.scala:153)
        at play.api.mvc.ActionBuilder$$anonfun$apply$17.apply(Action.scala:464)
        at play.api.mvc.ActionBuilder$$anonfun$apply$17.apply(Action.scala:464)
        at play.api.mvc.ActionBuilder$$anonfun$apply$16.apply(Action.scala:433)
        at play.api.mvc.ActionBuilder$$anonfun$apply$16.apply(Action.scala:432)
        at play.api.mvc.Action$.invokeBlock(Action.scala:556)
        at play.api.mvc.Action$.invokeBlock(Action.scala:555)
        at play.api.mvc.ActionBuilder$$anon$1.apply(Action.scala:518)
        at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130)
        at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130)
        at play.utils.Threads$.withContextClassLoader(Threads.scala:21)
        at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:129)
        at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:128)
        at scala.Option.map(Option.scala:145)
        at play.api.mvc.Action$$anonfun$apply$1.apply(Action.scala:128)
        at play.api.mvc.Action$$anonfun$apply$1.apply(Action.scala:121)
        at play.api.libs.iteratee.Iteratee$$anonfun$mapM$1.apply(Iteratee.scala:483)
        at play.api.libs.iteratee.Iteratee$$anonfun$mapM$1.apply(Iteratee.scala:483)
        at play.api.libs.iteratee.Iteratee$$anonfun$flatMapM$1.apply(Iteratee.scala:519)
        at play.api.libs.iteratee.Iteratee$$anonfun$flatMapM$1.apply(Iteratee.scala:519)
        at play.api.libs.iteratee.Iteratee$$anonfun$flatMap$1$$anonfun$apply$14.apply(Iteratee.scala:496)
        at play.api.libs.iteratee.Iteratee$$anonfun$flatMap$1$$anonfun$apply$14.apply(Iteratee.scala:496)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)

andy petrella

Feb 9, 2015, 6:25:42 AM
to Peter Kerpedjiev, spark-not...@googlegroups.com
Mmh, I'll give it a try as well. It might be something else entirely... Java's file API is very bad at reporting errors in tricky cases.

Regarding your question, there is another answer which is interesting as well, the one with 54 stars that talks about start and attach. It should be easier than committing the image :-/
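That is, something along these lines:

docker start <container-id>   # bring the stopped container back up
docker attach <container-id>  # re-attach to its output
# or both in one go:
docker start -a <container-id>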

Tell me if you have any news.
cheers

andy petrella

Feb 9, 2015, 11:54:19 AM
to Peter Kerpedjiev, spark-not...@googlegroups.com
Argl, I tried some stuff but it didn't work out. It's actually more a Docker usage thingy than the notebook itself.

Maybe creating a data volume would be easier and safer?
I'll try that tonight ^^
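Something in this vein maybe (untested, and assuming the notebooks live under /opt/docker/notebooks in the image):

docker create -v /opt/docker/notebooks --name notebook-data busybox   # data-only container holding the volume
docker run --volumes-from notebook-data -p 9000:9000 andypetrella/spark-notebook:0.2.0-spark-1.2.0-hadoop-1.0.4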


Peter Kerpedjiev

Feb 9, 2015, 6:01:22 PM
to andy petrella, spark-not...@googlegroups.com
Thanks for the help!

Apparently, this is part of a much larger issue that a lot of people are having trouble with:

https://github.com/boot2docker/boot2docker/issues/581

I finally got it to work, but in an extremely hacky way by following this comment:

/Users is a bit special as it's mounted by the boot2docker script (any share matching the names listed in release notes), changing it requires customising the code. I just created a custom share anyway as I don't want to share the whole /Users structure with containers.

1) Overriding the /Users default share on boot2docker start:

boot2docker --vbox-share=$(pwd)/share-location=share-name up

2) boot2docker ssh in and mount the custom share:

sudo mount -t vboxsf -o uid=1,gid=1 share-name [SHARE-FULL-PATH]/share-location


I was trying to set up a data volume, but I'm having the same problem and it's just about bedtime. For future reference, where can I find the Dockerfile for the container you created? Is it required that it be run with a uid:gid of 1:1? If it can be run as 1000:staff (whatever staff is...), then the /mnt directory might be accessible.

cheers,

-Peter

andy petrella

Feb 9, 2015, 6:09:52 PM
to Peter Kerpedjiev, spark-not...@googlegroups.com
Waow good catch indeed!

I tried some stuff on my side too, but I keep hitting this permissions issue. I guess that creating a dedicated VOLUME with the right permissions should work better.
Actually, the user running the notebook will be `daemon`, hence we can create a volume that lets it write in the shared folder.

Regarding the Dockerfile, it's published on the hub, and I'm using the sbt native packager for that. But I never looked at the management of these ids... I first need to figure out what their purpose is as well :-/.

More generally, I'd love to add a section to the README telling the story of creating this shareable repo :-/, hopefully we're close...

Thanks for your help and explorations!
cheers,
andy

Peter Kerpedjiev

Feb 11, 2015, 5:41:02 PM
to andy petrella, spark-not...@googlegroups.com
Hi Andy,

This is driving me crazy :) Every road seems blocked by some obstacle. The problem is really simple.

The spark-notebook container runs with a user:group of 1:1 (daemon:daemon). The mounted folder is owned by 1000:staff. Hence, permission denied. To make it even better, the mounted folder only has user write access, so even adding the daemon user to the staff group wouldn't help. Oh well...
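You can see the mismatch with a throwaway busybox container (numeric ids via ls -n):

docker run --rm -v /Users/pkerp/projects/chairliftplot/:/mnt busybox ls -ln /mnt   # owner shows up as 1000, not 1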

Until then, it's good to know that after <ctrl-c>-ing a running container, I can find its id using

docker ps -l

and then start it again using:

docker start <id>

where <id> is either the container ID or the auto-generated name (something like 'thirsty_lovelace') that appears in the last column of docker ps.

Hopefully I remember to commit before I restart my computer :)

I'll keep my eye on this issue and see if something comes of it.

cheers,

-Peter


On the bright side I'm learning a lot about docker.

andy petrella

Feb 12, 2015, 5:28:37 AM
to Peter Kerpedjiev, spark-not...@googlegroups.com
mmmmh good catch!
However... there should be a way to add something better regarding this daemon thingy. 
I have little knowledge in docker (I mean I use the basics...) but I'll also try to ask someone with a better background in ops than I have...

You know what, wouldn't it be interesting to update the question on SO with these discoveries? So that we can easily ask friends :-D
I can do it if you've no time; I guess you gave me the nitty-gritty details I need :-)

Cheers and Thanks!

andy

Peter Kerpedjiev

Feb 12, 2015, 6:59:12 AM
to andy petrella, spark-not...@googlegroups.com

On 02/12/2015 11:28 AM, andy petrella wrote:
> mmmmh good catch!
> However... there should be a way to add something better regarding
> this daemon thingy.
> I have little knowledge in docker (I mean I use the basics...) but
> I'll also try to ask someone with a better background in ops than I have...
Me neither. I hope you find out something good :)
>
> You know what, wouldn't it be interesting to update the question on SO
> with these discoveries? So that we can easily ask friends :-D
Done!

It just occurred to me that having the notebooks stored in one container
would be a problem when you release an update. This would come in a new
container which would lack the notebooks from the old. Maybe it's easy
to transfer data between two containers, but the ordeal so far has left
me skeptical :-/

-Peter


andy petrella

Feb 12, 2015, 7:14:44 AM
to Peter Kerpedjiev, spark-not...@googlegroups.com
Indeed, keeping the notebooks separate should be the preferred way.

I'm discussing with a friend what we should do. I think I'm getting closer :-).
But I'm interleaving this task with plenty of others :-$

andy petrella

Feb 12, 2015, 3:26:05 PM
to Peter Kerpedjiev, spark-not...@googlegroups.com
a small update: this shit drives me crazy.

that being said, I'm trying something like this:
1/ https://github.com/andypetrella/sbt-native-packager → I've updated it (with a PR) to allow some more customization.
2/ https://github.com/andypetrella/spark-notebook/tree/docker → I've customized the Dockerfile to use a different user than daemon, the user guilty of our frustration

It's not working yet (a problem when starting the application, but I think it's an env problem).
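The gist of it is something like this in the Dockerfile (a sketch only, not the actual file; the user name is made up):

# run the app as a dedicated user instead of daemon
RUN useradd --uid 1000 --create-home notebook
RUN chown -R notebook /opt/docker
USER notebook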

I'll keep you up to date.

Cheers,
andy

andy petrella

Feb 15, 2015, 10:17:07 AM
to Peter Kerpedjiev, spark-not...@googlegroups.com
So... I got this very helpful comment: https://github.com/sbt/sbt-native-packager/pull/488#issuecomment-74222697

I guess waiting for the coming release of the packager will make life with docker a lot easier :-S.
I'm trying to find out when it'll land!

There is hope :-D

Peter Kerpedjiev

Feb 15, 2015, 11:59:05 AM
to andy petrella, spark-not...@googlegroups.com
Great!

Everything would be so much easier if the daemon ran as root, but that's just a sloppy workaround.

I've been trying to get data volumes to work as well, but they also seem beset by these user permission issues. I'll just work locally for now and take a look in a week or so to see what's new :-)

Thanks a lot for your help and for actively working to make this work.

-Peter

andy petrella

Aug 4, 2015, 8:35:05 PM
to Peter Kerpedjiev, spark-not...@googlegroups.com

Hey,
It’s been a while, I know, but I kept your mail unread until now: loading & saving notebooks to the outside world can finally work ;-).

This is going to bind your local /tmp/ttt folder to the ext folder in your notebooks:

docker run --rm -v /tmp/ttt:/opt/docker/notebooks/ext -p 9000:9000 andypetrella/spark-notebook:0.6.0-scala-2.10.4-spark-1.4.1-hadoop-2.6.0

So all notebooks created in this ext folder will end up in /tmp/ttt on your host \o/.

HT(finally)H

cheers,

--
andy

Peter Kerpedjiev

Aug 11, 2015, 3:41:39 AM
to andy petrella, spark-not...@googlegroups.com
Hey Andy,

Wow! I'm impressed with your dedication to this topic :)

For me it works as long as I don't power down the boot2docker virtual machine.

Have you tried doing this and then restarting the virtual machine?

So all notebooks created in this ext folder will end up in /tmp/ttt on your host \o/.

I presume that 'host' here means the boot2docker virtual machine? When I save a notebook, I don't see a /tmp/ttt directory on my local file system.
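If I mount a folder under /Users instead (which boot2docker shares with the VM by default), it should land on the Mac itself, assuming the container user can write to the share. Something like this (path made up):

docker run --rm -v /Users/pkerp/notebooks:/opt/docker/notebooks/ext -p 9000:9000 andypetrella/spark-notebook:0.6.0-scala-2.10.4-spark-1.4.1-hadoop-2.6.0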

cheers,

-Peter