device-mapper based docker backend


Alexander Larsson

Sep 6, 2013, 7:08:16 AM
to docke...@googlegroups.com
The last few weeks I've been working on a device-mapper based backend for docker, as AUFS is not generally available (for instance in Fedora or RHEL). I've now got something that seems to mostly work on my Fedora 19 install.

The code is in my device-mapper2 branch at https://github.com/alexlarsson/docker/commits/device-mapper2

There are some outstanding issues on Fedora not related to the backend:
  • You have to run "mount --make-rprivate /" as the Fedora default of shared doesn't work with lxc-start.
  • The ubuntu 12.10 image doesn't work by default, as the image contains a /dev/shm that is a symlink to /run/shm, and the lxc-start scripts dereference this at mount-time. /run/shm doesn't exist on Fedora, so this fails. "mkdir /run/shm" on the host will "solve" this for now.
But other than these it does seem to work in simple tests. Right now I'm working on getting the test suite running; it's currently a bit slow because it keeps setting up the device mapper stuff for each test.

The way it works is that we set up a device-mapper thin provisioning pool with a single base device containing an empty ext4 filesystem. Then each time we create an image we take a snapshot of the parent image (or the base image) and manually apply the AUFS layer to this. Similarly we create snapshots of images when we create containers and mount these as the container filesystem.
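
To make the snapshot chain concrete, here is a toy in-memory model in Python. It is only a sketch of the copy-on-write idea, not how libdevmapper or the branch works: the ThinPool class and its methods are made-up names, and a real thin pool shares blocks through its metadata device rather than copying a Python dict.

```python
class ThinPool:
    """Toy model: every device maps block numbers to data."""

    def __init__(self):
        self._block_maps = {}
        self._next_id = 0

    def _new_device(self, block_map):
        dev_id = self._next_id
        self._next_id += 1
        self._block_maps[dev_id] = block_map
        return dev_id

    def create_base(self):
        # The base device starts out empty (think: freshly formatted fs).
        return self._new_device({})

    def snapshot(self, origin):
        # Only the block mapping is duplicated; the data is shared.
        return self._new_device(dict(self._block_maps[origin]))

    def write(self, dev, block, data):
        # A write only changes the mapping of the device written to.
        self._block_maps[dev][block] = data

    def read(self, dev, block):
        return self._block_maps[dev].get(block, b"")


pool = ThinPool()
base = pool.create_base()
pool.write(base, 0, b"empty ext4")

image = pool.snapshot(base)          # one snapshot per image
pool.write(image, 1, b"image layer")

container = pool.snapshot(image)     # one snapshot per container
pool.write(container, 1, b"container change")
```

Writes to the container snapshot leave the image device untouched, which is the property the backend relies on.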

"docker diff" is implemented by just scanning the container filesystem and the parent image filesystem, looking at the metadata for changes. Theoretically this can be fooled if you do in-place editing of a file (not changing the size) and reset the mtime/ctime, but in practice I think this will be good enough.

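A minimal sketch of that kind of metadata scan, in Python rather than the Go used by docker. The function names are hypothetical, and it compares two ordinary directory trees, so it only looks at type, mode, owner, size and mtime (ctime cannot be preserved when copying a plain directory, unlike with a block-level snapshot).

```python
import os
import stat


def _scan(root):
    """Map every path below root to the metadata used for comparison."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            rel = "/" + os.path.relpath(full, root).replace(os.sep, "/")
            # Directory sizes depend on the filesystem, so ignore them.
            size = 0 if stat.S_ISDIR(st.st_mode) else st.st_size
            entries[rel] = (st.st_mode, st.st_uid, st.st_gid,
                            size, st.st_mtime_ns)
    return entries


def diff_trees(parent_root, container_root):
    """Return a sorted list of (kind, path); kind is 'A', 'C' or 'D'."""
    old = _scan(parent_root)
    new = _scan(container_root)
    changes = []
    for path in sorted(old.keys() | new.keys()):
        if path not in old:
            changes.append(("A", path))      # added in the container
        elif path not in new:
            changes.append(("D", path))      # deleted in the container
        elif old[path] != new[path]:
            changes.append(("C", path))      # metadata differs
    return changes
```

An in-place edit that keeps the size and restores the mtime slips past this comparison, which is the blind spot mentioned above.
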
"docker commit" uses the above diff command to get a list of changed files which are used to construct a tarball with files and AUFS whiteouts (for deletes). This means you can commit containers to images, run new containers based on the image, etc. You should be able to push them to the index too (although I've not tested this yet).

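As an illustration of the tarball step, the sketch below turns a change list into a layer archive. It is Python, not the branch's Go code, and write_layer and whiteout_name are hypothetical names. The one firm convention it relies on is that an AUFS whiteout for a deleted file is an empty file named ".wh." plus the original name, in the same directory.

```python
import os
import posixpath
import tarfile

WHITEOUT_PREFIX = ".wh."


def whiteout_name(path):
    """etc/gone -> etc/.wh.gone"""
    head, tail = posixpath.split(path)
    return posixpath.join(head, WHITEOUT_PREFIX + tail)


def write_layer(container_root, changes, fileobj):
    """Write a tar layer for a list of (kind, path) changes.

    kind is 'A' (added), 'C' (changed) or 'D' (deleted); paths are
    absolute inside the container, e.g. '/etc/hosts'.
    """
    with tarfile.open(fileobj=fileobj, mode="w") as tar:
        for kind, path in changes:
            arcname = path.lstrip("/")
            if kind == "D":
                # Deletions become empty whiteout entries.
                info = tarfile.TarInfo(whiteout_name(arcname))
                info.size = 0
                tar.addfile(info)
            else:
                # Added or changed entries are stored as they are now.
                tar.add(os.path.join(container_root, arcname),
                        arcname=arcname, recursive=False)
```

The change list has the same shape a metadata diff would produce, so the two steps can be chained.
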
Docker looks for a "docker-pool" device-mapper device (i.e. /dev/mapper/docker-pool) when it starts up. If none exists, it automatically creates two sparse files (100GB for the data and 2GB for the metadata), attaches them as loopback devices, and sets these up as the block devices for docker-pool, with a 10GB ext4 fs as the base image.

This means that there is no need for manual setup of block devices, and that generally there should be no need to pre-allocate large amounts of space (the sparse files are small, and we set things up so that discards are passed through all the way back to the sparse loopbacks, so deletes in a container should fully reclaim space).
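
To see why the sparse backing files are cheap, here is a small Python sketch that creates a sparse file and compares its apparent size with the space actually allocated. The function names and any paths passed in are hypothetical; it only demonstrates the sparse-file property, not the loopback or device-mapper setup, and it assumes a filesystem with sparse file support.

```python
import os


def create_sparse_file(path, size):
    """Create a file of the given apparent size without writing data."""
    with open(path, "wb") as f:
        f.truncate(size)


def disk_usage(path):
    """Return (apparent size, bytes actually allocated)."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on POSIX systems.
    return st.st_size, st.st_blocks * 512
```

With the sizes from the post, the data file would be created with a size of 100GB and the metadata file with 2GB, and both would only grow on disk as blocks are written.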

You can also manually set up a docker-pool device-mapper device if you want to actually use raw block devices. This might be more interesting when deploying docker.

Anyway, I'm hoping interested people can start testing this, and we could start figuring out the path towards merging this into docker.

Alexander Larsson

Sep 6, 2013, 11:06:52 AM
to docke...@googlegroups.com


On Friday, 6 September 2013 13:08:16 UTC+2, Alexander Larsson wrote:
Right now I'm working on getting the test suite running; it's currently a bit slow because it keeps setting up the device mapper stuff for each test.

So, one problem with the testsuite was that the test binary "docker.test" picked up the libdevmapper.so dependency, so it didn't run properly as .dockerinit. I added a hack where you can point to a real client binary via an env var, and now this seems to work a lot better, although the tests take a lot of time...

> _DOCKER_CLIENT_PATH=/home/alex/vcs/go/bin/docker go test -v
2013/09/06 16:56:00 Listening for HTTP on 127.0.0.1:4270 (tcp)
=== RUN TestGetBoolParam
--- PASS: TestGetBoolParam (0.00 seconds)
=== RUN TestGetVersion
--- PASS: TestGetVersion (14.47 seconds)
=== RUN TestGetInfo
--- PASS: TestGetInfo (12.04 seconds)
=== RUN TestGetEvents
--- PASS: TestGetEvents (11.00 seconds)
=== RUN TestGetImagesJSON
--- PASS: TestGetImagesJSON (35.91 seconds)
=== RUN TestGetImagesViz
--- PASS: TestGetImagesViz (17.55 seconds)
=== RUN TestGetImagesHistory
--- PASS: TestGetImagesHistory (12.32 seconds)
=== RUN TestGetImagesByName
--- PASS: TestGetImagesByName (12.05 seconds)
=== RUN TestGetContainersJSON
--- PASS: TestGetContainersJSON (14.26 seconds)
=== RUN TestGetContainersExport
--- PASS: TestGetContainersExport (23.19 seconds)
=== RUN TestGetContainersChanges
--- PASS: TestGetContainersChanges (22.85 seconds)
=== RUN TestGetContainersTop
2013/09/06 16:59:15 lxc-kill: failed to get the init pid
2013/09/06 16:59:15 Failed to send SIGTERM to the process, force killing
2013/09/06 16:59:15 error killing container 8ca431b284b5725559bb25cf6c90ba8dc807211b349c64d923bddc822a4558ba (lxc-kill: failed to get the init pid
, exit status 255)
--- PASS: TestGetContainersTop (21.89 seconds)
=== RUN TestGetContainersByName
--- PASS: TestGetContainersByName (14.84 seconds)
=== RUN TestPostCommit
--- PASS: TestPostCommit (25.08 seconds)
=== RUN TestPostContainersCreate
--- FAIL: TestPostContainersCreate (26.35 seconds)
    api_test.go:661: The test file has not been created
=== RUN TestPostContainersKill
--- PASS: TestPostContainersKill (24.66 seconds)
=== RUN TestPostContainersRestart
2013/09/06 17:01:17 Container 657fe57ff34fb7e17ffc426a10e695f666ecaeef4a3441dd563e1b9c4365ba9e failed to exit within 1 seconds of SIGTERM - using the force
--- PASS: TestPostContainersRestart (28.49 seconds)
=== RUN TestPostContainersStart
--- PASS: TestPostContainersStart (24.96 seconds)
=== RUN TestPostContainersStop
2013/09/06 17:02:01 Container 21f49dd3a5cb58ce9e14be2e1957624c5a49f72c33fc5e0f106588ae721bb9d4 failed to exit within 1 seconds of SIGTERM - using the force
--- PASS: TestPostContainersStop (22.02 seconds)
=== RUN TestPostContainersWait
--- PASS: TestPostContainersWait (28.73 seconds)
=== RUN TestPostContainersAttach
--- PASS: TestPostContainersAttach (25.91 seconds)
=== RUN TestDeleteContainers
--- PASS: TestDeleteContainers (26.64 seconds)
=== RUN TestOptionsRoute
--- PASS: TestOptionsRoute (16.28 seconds)
=== RUN TestGetEnabledCors
--- PASS: TestGetEnabledCors (17.27 seconds)
=== RUN TestDeleteImages
--- PASS: TestDeleteImages (21.25 seconds)
=== RUN TestJsonContentType
--- PASS: TestJsonContentType (0.00 seconds)
=== RUN TestPostContainersCopy
--- PASS: TestPostContainersCopy (28.07 seconds)
=== RUN TestCmdStreamLargeStderr
--- PASS: TestCmdStreamLargeStderr (0.02 seconds)
=== RUN TestCmdStreamBad
--- PASS: TestCmdStreamBad (0.00 seconds)
=== RUN TestCmdStreamGood
--- PASS: TestCmdStreamGood (0.00 seconds)
=== RUN TestTarUntar
--- PASS: TestTarUntar (0.11 seconds)
=== RUN TestBuild
--- FAIL: TestBuild (59.03 seconds)
    buildfile_test.go:235: The command [/bin/sh -c [ "$(cat /usr/lib/baz/quux)" = 'world!' ]] returned a non-zero code: 1
=== RUN TestVolume
*** Test killed: ran too long (10m0s).
FAIL    github.com/dotcloud/docker    602.814s




Jérôme Petazzoni

Sep 6, 2013, 1:06:57 PM
to Alexander Larsson, docker-dev
On Fri, Sep 6, 2013 at 8:06 AM, Alexander Larsson <alexande...@gmail.com> wrote:


On Friday, 6 September 2013 13:08:16 UTC+2, Alexander Larsson wrote:
Right now I'm working on getting the test suite running; it's currently a bit slow because it keeps setting up the device mapper stuff for each test.


When I worked on the BTRFS branch, I used BTRFS itself to snapshot and setup each new test environment.
I.e. after pulling the test image, I created a snapshot of the whole test environment, and created a new clone of that snapshot for each test.
I wonder if we could do something similar with device-mapper, by using nesting.

I.e., the test suite would:
- use the devmapper package to create a thin pool over a set of loopback files
- in that thin pool, create a filesystem containing itself a new set of loopback files; and turn those files into a pool
- that "inside" thin pool would be prepared to be used for the tests (pull the test image), then taken a snapshot
- then, before each test, just create a clone of this snapshot: it gives us a new set of loopback files, prepared and ready to use

Is this possible, or will the device-mapper and/or the loopback devices refuse to perform such an inception?

Guillaume Charmes

Sep 6, 2013, 9:20:25 PM
to Jérôme Petazzoni, Alexander Larsson, docker-dev
Alex,

I managed to statically link docker with libdevmapper using your branch. Now the tests all pass without dirty env hack :)

I made a Dockerfile here: https://github.com/creack/docker_devmapper or you can pull the image creack/docker_devmapper.

The catch is that the 'netgo' tag was introduced after the current Go 1.1.2 release. We'll need to wait for the next release in order to have something 100% stable. Without the tag, I get random segfaults from the net package.

Regards,


--
Guillaume J. Charmes

Alexander Larsson

Sep 9, 2013, 4:28:22 AM
to docke...@googlegroups.com, Jérôme Petazzoni, Alexander Larsson
Fedora doesn't have any static libdevmapper, and in general frowns on static dependencies. Furthermore, we will likely run into more similar issues, for instance with libvirt. It would be nicer to have a more generic solution to this problem.

Ideally we'd like "go test" to also build the client binary and put it next to the temporary docker.test image. I don't know how easy it is to hook into the go build tools for something like this though...

The easier solution is to add a small test script that builds the client first in some temporary location and then passes this to go test in some way.
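
Such a wrapper could look roughly like the Python sketch below. It is an assumption-laden outline, not something from the branch: the function names are made up, the "./docker" package path for the client is assumed, and the only detail taken from this thread is the _DOCKER_CLIENT_PATH variable.

```python
import os
import subprocess
import tempfile


def plan_test_run(tmp_dir):
    """Return the build command, test command and environment."""
    client = os.path.join(tmp_dir, "docker")
    # Assumed location of the client's main package in the repo.
    build_cmd = ["go", "build", "-o", client, "./docker"]
    test_cmd = ["go", "test", "-v"]
    # Point the test binary at the freshly built client.
    env = dict(os.environ, _DOCKER_CLIENT_PATH=client)
    return build_cmd, test_cmd, env


def run_tests(repo_dir):
    """Build the client in a temporary directory, then run go test."""
    with tempfile.TemporaryDirectory(prefix="docker-test-") as tmp_dir:
        build_cmd, test_cmd, env = plan_test_run(tmp_dir)
        subprocess.check_call(build_cmd, cwd=repo_dir)
        return subprocess.call(test_cmd, cwd=repo_dir, env=env)
```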

Alexander Larsson

Sep 9, 2013, 10:02:53 AM
to docke...@googlegroups.com
Pushed a bunch of test fixes to the branch. We now pass all tests, although I do get some warnings:

[root@localhost docker]# _DOCKER_CLIENT_PATH=/home/alex/vcs/go/bin/docker go test -v
2013/09/09 15:58:52 Listening for HTTP on 127.0.0.1:4270 (tcp)

=== RUN TestGetBoolParam
--- PASS: TestGetBoolParam (0.00 seconds)
=== RUN TestGetVersion
--- PASS: TestGetVersion (0.02 seconds)
=== RUN TestGetInfo
--- PASS: TestGetInfo (0.02 seconds)
=== RUN TestGetEvents
--- PASS: TestGetEvents (0.52 seconds)
=== RUN TestGetImagesJSON
--- PASS: TestGetImagesJSON (0.02 seconds)
=== RUN TestGetImagesViz
--- PASS: TestGetImagesViz (0.02 seconds)
=== RUN TestGetImagesHistory
--- PASS: TestGetImagesHistory (0.02 seconds)
=== RUN TestGetImagesByName
--- PASS: TestGetImagesByName (0.03 seconds)
=== RUN TestGetContainersJSON
2013/09/09 15:58:54 Creating loopback file /var/lib/docker/unit-tests-devices/loopback/data for device-manage use
2013/09/09 15:58:54 Creating loopback file /var/lib/docker/unit-tests-devices/loopback/metadata for device-manage use
2013/09/09 15:58:54 Initializing base device-manager snapshot
2013/09/09 15:58:54 Creating filesystem on base device-manager snapshot
--- PASS: TestGetContainersJSON (2.61 seconds)
=== RUN TestGetContainersExport
--- PASS: TestGetContainersExport (0.32 seconds)
=== RUN TestGetContainersChanges
--- PASS: TestGetContainersChanges (0.38 seconds)
=== RUN TestGetContainersTop
--- PASS: TestGetContainersTop (0.34 seconds)
=== RUN TestGetContainersByName
--- PASS: TestGetContainersByName (0.03 seconds)
=== RUN TestPostCommit
--- PASS: TestPostCommit (0.41 seconds)
=== RUN TestPostContainersCreate
--- PASS: TestPostContainersCreate (0.30 seconds)
=== RUN TestPostContainersKill
--- PASS: TestPostContainersKill (0.74 seconds)
=== RUN TestPostContainersRestart
2013/09/09 15:59:01 Container 13cbd431662034cb1f2ffc5949b1d9fbc392d3fb1d1a0d446f3557d92363d1a5 failed to exit within 1 seconds of SIGTERM - using the force
--- PASS: TestPostContainersRestart (2.29 seconds)
=== RUN TestPostContainersStart
--- PASS: TestPostContainersStart (0.73 seconds)
=== RUN TestPostContainersStop
2013/09/09 15:59:04 Container 28b0059afe4f41500f39b8a95f543f2623a4014a5886433bbab409d00ffbf218 failed to exit within 1 seconds of SIGTERM - using the force
--- PASS: TestPostContainersStop (1.74 seconds)
=== RUN TestPostContainersWait
--- PASS: TestPostContainersWait (1.35 seconds)
=== RUN TestPostContainersAttach
--- PASS: TestPostContainersAttach (0.85 seconds)
=== RUN TestDeleteContainers
--- PASS: TestDeleteContainers (0.35 seconds)
=== RUN TestOptionsRoute
--- PASS: TestOptionsRoute (0.03 seconds)
=== RUN TestGetEnabledCors
--- PASS: TestGetEnabledCors (0.04 seconds)
=== RUN TestDeleteImages
--- PASS: TestDeleteImages (0.03 seconds)

=== RUN TestJsonContentType
--- PASS: TestJsonContentType (0.00 seconds)
=== RUN TestPostContainersCopy
--- PASS: TestPostContainersCopy (0.30 seconds)
=== RUN TestCmdStreamLargeStderr
--- PASS: TestCmdStreamLargeStderr (0.00 seconds)

=== RUN TestCmdStreamBad
--- PASS: TestCmdStreamBad (0.00 seconds)
=== RUN TestCmdStreamGood
--- PASS: TestCmdStreamGood (0.00 seconds)
=== RUN TestTarUntar
--- PASS: TestTarUntar (0.03 seconds)
=== RUN TestBuild
--- PASS: TestBuild (11.56 seconds)
=== RUN TestVolume
--- PASS: TestVolume (0.43 seconds)
=== RUN TestBuildMaintainer
--- PASS: TestBuildMaintainer (0.28 seconds)
=== RUN TestBuildUser
--- PASS: TestBuildUser (0.29 seconds)
=== RUN TestBuildEnv
--- PASS: TestBuildEnv (0.31 seconds)
=== RUN TestBuildCmd
--- PASS: TestBuildCmd (0.31 seconds)
=== RUN TestBuildExpose
--- PASS: TestBuildExpose (0.29 seconds)
=== RUN TestBuildEntrypoint
--- PASS: TestBuildEntrypoint (0.29 seconds)
=== RUN TestBuildEntrypointRunCleanup
--- PASS: TestBuildEntrypointRunCleanup (0.69 seconds)
=== RUN TestBuildImageWithCache
--- PASS: TestBuildImageWithCache (0.29 seconds)
=== RUN TestBuildImageWithoutCache
--- PASS: TestBuildImageWithoutCache (0.40 seconds)
=== RUN TestForbiddenContextPath
--- PASS: TestForbiddenContextPath (0.41 seconds)
=== RUN TestRunHostname
2013/09/09 15:59:23 error killing container 99f0209efdcffeaebd24e385ec19bf4721ecc3a687d9d6729dcce44888991237 (lxc-kill: failed to get the init pid
, exit status 255)
--- PASS: TestRunHostname (0.33 seconds)
=== RUN TestRunWorkdir
2013/09/09 15:59:23 error killing container b10ac90412eca6a99a4c4b3ddf9fc7992ea8d1c06f282596b166cc54ca3632f8 (lxc-kill: failed to get the init pid
, exit status 255)
--- PASS: TestRunWorkdir (0.34 seconds)
=== RUN TestRunWorkdirExists
2013/09/09 15:59:23 error killing container 450341ace73bdde8b9d7d10d2d88623d8ecc3bf08762f380324271e9ef457c70 (lxc-kill: failed to get the init pid
, exit status 255)
--- PASS: TestRunWorkdirExists (0.35 seconds)
=== RUN TestRunExit
--- PASS: TestRunExit (0.35 seconds)
=== RUN TestRunDisconnect
--- PASS: TestRunDisconnect (0.35 seconds)
=== RUN TestRunDisconnectTty
--- PASS: TestRunDisconnectTty (0.59 seconds)
=== RUN TestRunAttachStdin
--- PASS: TestRunAttachStdin (5.34 seconds)
=== RUN TestAttachDisconnect
--- PASS: TestAttachDisconnect (0.87 seconds)
=== RUN TestIDFormat
--- PASS: TestIDFormat (0.04 seconds)
=== RUN TestMultipleAttachRestart
--- PASS: TestMultipleAttachRestart (0.67 seconds)
=== RUN TestDiff
--- PASS: TestDiff (1.16 seconds)
=== RUN TestCommitAutoRun
--- PASS: TestCommitAutoRun (0.69 seconds)
=== RUN TestCommitRun
--- PASS: TestCommitRun (0.73 seconds)
=== RUN TestStart
--- PASS: TestStart (0.77 seconds)
=== RUN TestRun
--- PASS: TestRun (0.36 seconds)
=== RUN TestOutput
--- PASS: TestOutput (0.34 seconds)
=== RUN TestKillDifferentUser
--- PASS: TestKillDifferentUser (0.36 seconds)
=== RUN TestCreateVolume
--- PASS: TestCreateVolume (0.35 seconds)
=== RUN TestKill
--- PASS: TestKill (0.76 seconds)
=== RUN TestExitCode
--- PASS: TestExitCode (0.82 seconds)
=== RUN TestRestart
--- PASS: TestRestart (1.17 seconds)
=== RUN TestRestartStdin
--- PASS: TestRestartStdin (0.68 seconds)
=== RUN TestUser
--- PASS: TestUser (2.13 seconds)
=== RUN TestMultipleContainers
--- PASS: TestMultipleContainers (0.85 seconds)
=== RUN TestStdin
--- PASS: TestStdin (0.61 seconds)
=== RUN TestTty
--- PASS: TestTty (0.35 seconds)
=== RUN TestEnv
--- PASS: TestEnv (0.36 seconds)
=== RUN TestEntrypoint
--- PASS: TestEntrypoint (0.36 seconds)
=== RUN TestEntrypointNoCmd
--- PASS: TestEntrypointNoCmd (0.35 seconds)
=== RUN TestLXCConfig
--- PASS: TestLXCConfig (0.04 seconds)
=== RUN TestCustomLxcConfig
--- PASS: TestCustomLxcConfig (0.04 seconds)
=== RUN TestBindMounts
--- PASS: TestBindMounts (0.66 seconds)
=== RUN TestVolumesFromReadonlyMount
--- PASS: TestVolumesFromReadonlyMount (0.70 seconds)
=== RUN TestRestartWithVolumes
--- PASS: TestRestartWithVolumes (0.68 seconds)
=== RUN TestVolumesFromWithVolumes
--- PASS: TestVolumesFromWithVolumes (1.06 seconds)
=== RUN TestOnlyLoopbackExistsWhenUsingDisableNetworkOption
--- PASS: TestOnlyLoopbackExistsWhenUsingDisableNetworkOption (0.35 seconds)
=== RUN TestPrivilegedCanMknod
--- PASS: TestPrivilegedCanMknod (0.33 seconds)
=== RUN TestPrivilegedCanMount
--- PASS: TestPrivilegedCanMount (0.35 seconds)
=== RUN TestPrivilegedCannotMknod
--- PASS: TestPrivilegedCannotMknod (0.36 seconds)
=== RUN TestPrivilegedCannotMount
--- PASS: TestPrivilegedCannotMount (0.37 seconds)
=== RUN TestInit
--- PASS: TestInit (0.00 seconds)
=== RUN TestInterruptedRegister
--- PASS: TestInterruptedRegister (0.20 seconds)
=== RUN TestGraphCreate
--- PASS: TestGraphCreate (0.00 seconds)
=== RUN TestRegister
--- PASS: TestRegister (0.01 seconds)
=== RUN TestMount
--- PASS: TestMount (0.15 seconds)
=== RUN TestDeletePrefix
--- PASS: TestDeletePrefix (0.00 seconds)
=== RUN TestDelete
--- PASS: TestDelete (0.01 seconds)
=== RUN TestByParent
--- PASS: TestByParent (0.01 seconds)
=== RUN TestTCP4Proxy
--- PASS: TestTCP4Proxy (0.00 seconds)
    network_proxy_test.go:47: EchoServer listening on tcp/127.0.0.1:37875
    network_proxy_test.go:59: TCP client accepted on the EchoServer
=== RUN TestTCP6Proxy
--- PASS: TestTCP6Proxy (0.00 seconds)
    network_proxy_test.go:47: EchoServer listening on tcp/[::1]:37363
    network_proxy_test.go:59: TCP client accepted on the EchoServer
=== RUN TestTCPDualStackProxy
--- SKIP: TestTCPDualStackProxy (0.00 seconds)
    network_proxy_test.go:149: No support for dual stack yet
=== RUN TestUDP4Proxy
--- PASS: TestUDP4Proxy (0.00 seconds)
    network_proxy_test.go:47: EchoServer listening on udp/127.0.0.1:38003
    network_proxy_test.go:82: Writing UDP datagram back
=== RUN TestUDP6Proxy
--- PASS: TestUDP6Proxy (0.00 seconds)
    network_proxy_test.go:47: EchoServer listening on udp/[::1]:34469
    network_proxy_test.go:82: Writing UDP datagram back
=== RUN TestUDPWriteError
--- PASS: TestUDPWriteError (0.00 seconds)
    network_proxy_test.go:47: EchoServer listening on udp/127.0.0.1:25587
    network_proxy_test.go:82: Writing UDP datagram back
=== RUN TestIptables
--- PASS: TestIptables (0.06 seconds)
=== RUN TestParseNat
--- PASS: TestParseNat (0.00 seconds)
=== RUN TestPortAllocation
--- PASS: TestPortAllocation (0.00 seconds)
=== RUN TestNetworkRange
--- PASS: TestNetworkRange (0.00 seconds)
=== RUN TestConversion
--- PASS: TestConversion (0.00 seconds)
=== RUN TestIPAllocator
--- PASS: TestIPAllocator (0.00 seconds)
=== RUN TestNetworkOverlaps
--- PASS: TestNetworkOverlaps (0.00 seconds)
=== RUN TestCheckRouteOverlaps
--- PASS: TestCheckRouteOverlaps (0.00 seconds)
=== RUN TestRuntimeCreate
--- PASS: TestRuntimeCreate (0.05 seconds)
=== RUN TestDestroy
--- PASS: TestDestroy (0.04 seconds)
=== RUN TestGet
--- PASS: TestGet (0.04 seconds)
=== RUN TestAllocateTCPPortLocalhost
--- PASS: TestAllocateTCPPortLocalhost (0.78 seconds)
    runtime_test.go:313: Trying port 5555
=== RUN TestAllocateUDPPortLocalhost
--- PASS: TestAllocateUDPPortLocalhost (0.76 seconds)
    runtime_test.go:313: Trying port 5555
=== RUN TestRestore
--- PASS: TestRestore (1.04 seconds)
=== RUN TestContainerTagImageDelete
--- PASS: TestContainerTagImageDelete (0.03 seconds)
=== RUN TestCreateRm
--- PASS: TestCreateRm (0.04 seconds)
=== RUN TestCommit
--- PASS: TestCommit (0.27 seconds)
=== RUN TestCreateStartRestartStopStartKillRm
2013/09/09 15:59:53 lxc-kill: failed to get the init pid
2013/09/09 15:59:53 Failed to send SIGTERM to the process, force killing
2013/09/09 15:59:53 error killing container 631b451a806d29e4734e6f6c8d54d93669d32ba6282e6bae4f3ddafbe0667633 (lxc-kill: failed to get the init pid
, exit status 255)
2013/09/09 15:59:53 lxc-kill: failed to get the init pid
2013/09/09 15:59:53 Failed to send SIGTERM to the process, force killing
2013/09/09 15:59:53 error killing container 631b451a806d29e4734e6f6c8d54d93669d32ba6282e6bae4f3ddafbe0667633 (lxc-kill: failed to get the init pid
, exit status 255)
2013/09/09 15:59:54 error killing container 631b451a806d29e4734e6f6c8d54d93669d32ba6282e6bae4f3ddafbe0667633 (lxc-kill: failed to get the init pid
, exit status 255)
--- PASS: TestCreateStartRestartStopStartKillRm (1.01 seconds)
=== RUN TestRunWithTooLowMemoryLimit
--- PASS: TestRunWithTooLowMemoryLimit (0.03 seconds)
=== RUN TestContainerTop
--- SKIP: TestContainerTop (0.00 seconds)
    server_test.go:209: Fixme. Skipping test for now. Reported error: 'server_test.go:236: Expected 2 processes, found 1.'
=== RUN TestPools
--- PASS: TestPools (0.04 seconds)
=== RUN TestLogEvent
--- PASS: TestLogEvent (0.44 seconds)
=== RUN TestRmi
2013/09/09 15:59:55 error killing container 005a6c475348b9b064b9069ed6b1d5ee90ef7b726099bac22508dd5fd1852ec7 (lxc-kill: failed to get the init pid
, exit status 255)
--- PASS: TestRmi (0.70 seconds)
=== RUN TestServerListOrderedImagesByCreationDate
--- PASS: TestServerListOrderedImagesByCreationDate (0.04 seconds)
=== RUN TestServerListOrderedImagesByCreationDateAndTag
--- PASS: TestServerListOrderedImagesByCreationDateAndTag (0.04 seconds)
=== RUN TestLookupImage
--- PASS: TestLookupImage (0.03 seconds)
=== RUN TestCompareConfig
--- PASS: TestCompareConfig (0.00 seconds)
=== RUN TestMergeConfig
--- PASS: TestMergeConfig (0.00 seconds)
=== RUN TestMergeConfigPublicPortNotHonored
--- PASS: TestMergeConfigPublicPortNotHonored (0.00 seconds)
=== RUN TestParseLxcConfOpt
--- PASS: TestParseLxcConfOpt (0.00 seconds)
=== RUN TestFinal
--- PASS: TestFinal (0.18 seconds)
    z_final_test.go:15: Start Fds: 6, Start Goroutines: 10
    z_final_test.go:10: Fds: 14, Goroutines: 361
PASS
ok      github.com/dotcloud/docker    63.721s

Solomon Hykes

Sep 9, 2013, 4:25:20 PM
to Alexander Larsson, docker-dev, Jérôme Petazzoni
Hi guys,

On Mon, Sep 9, 2013 at 1:28 AM, Alexander Larsson <alexande...@gmail.com> wrote:
Fedora doesn't have any static libdevmapper, and in general frowns on static dependencies. Furthermore, we will likely run into more similar issues, for instance with libvirt. It would be nicer to have a more generic solution to this problem.

I understand that static dependencies are not a distro maintainer's best friend, and don't have any philosophical attachment to static dependencies. Furthermore, as far as Fedora is concerned it's "your house, your rules".

However :) I believe in this case sticking to a static binary is the right thing for Fedora to do.

I think that's the case simply because being statically linked is an actual *feature* of Docker, and removing that feature will alter the product in a meaningful way. Dynamically linking libdevmapper (or any other library) means that the container's version of the lib will be loaded - or fail to load if it's not available. This destroys the consistency of behavior across all deployments of the docker runtime, which is something that we promise in the documentation and that docker users expect. A dynamically linked docker is simply less useful.

I understand we could work around this by breaking down the binary into multiple parts, or other hacks. But that seems like overcomplicating the codebase and user experience for the sake of policy.

A good comparison is Busybox. It seems to be distributed by Fedora as a static binary [1], which seems like the right thing to do, because a dynamically linked busybox would simply be less useful. I hope we can benefit from the same jurisprudence :)



I realize this adds a little more work on the packaging side, but I sincerely believe this is the best approach for Fedora users. We will gladly help in this effort in every way we can!

Alexander Larsson

Sep 10, 2013, 3:53:51 AM
to docke...@googlegroups.com, Alexander Larsson, Jérôme Petazzoni
I don't think that having the daemon and the client in the same binary is a good idea in general. Separating the two is good hygiene (it limits the amount of privileged code that the client runs), not a hack. I mean, it's not like we're *using* the daemon code inside the container. So, even if there was no problem with the static/dynamic linking I would be for a client/daemon split. And, given such a split, I don't see a general problem with the daemon being dynamically linked: it is part of the host system, not the container, and as such it's up to the builder of the host system to match it with the right libraries and such.

The comparison to busybox imho works for the client, which is the thing that needs to be able to run anywhere (similar to busybox), but doesn't really give any reason for merging that with the daemon code.

The problem with the testcase still exists in the split binaries case though, but that is really imho an issue in how we build docker in the testsuite rather than some conceptual issue.

Alexander Larsson

Sep 10, 2013, 4:40:20 PM
to docke...@googlegroups.com, Alexander Larsson, Jérôme Petazzoni
I just pushed a new device-mapper3 branch which is rebased on latest master. Apart from some rebase cleanups the main difference from the old branch is the way the binaries are split up.

In the new branch the main docker binary has not been changed much at all, it works fine as a .dockerinit inside the container if it is statically linked. However, we also build (optionally/separately) a docker-init binary which is always static (no libdevmapper dependency), and if docker-init exists next to the main docker binary then the docker-init file will be used as .dockerinit bind mount source instead.

This means that a fully static build can just ignore building docker-init, whereas someone building a dynamic docker will need to also build and install the static docker-init next to it.

There is still the problem with the testsuite: I don't know of any good way to have "go test" build docker-init and put it next to the docker.test binary, so there is still this hack with a _DOCKER_INIT_PATH env var you can use to make the tests pass when using dynamic linking.
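
The lookup described in this message can be sketched as follows. This is Python for illustration only (the branch is Go), dockerinit_source is a hypothetical name, and the precedence of the env var over a docker-init file sitting next to the binary is an assumption rather than something stated here.

```python
import os


def dockerinit_source(self_path, environ=None):
    """Pick the file to bind mount into the container as .dockerinit."""
    environ = os.environ if environ is None else environ

    # Test-suite escape hatch (assumed to take precedence).
    override = environ.get("_DOCKER_INIT_PATH")
    if override:
        return override

    # Prefer a static docker-init installed next to the docker binary.
    candidate = os.path.join(os.path.dirname(self_path), "docker-init")
    if os.path.exists(candidate):
        return candidate

    # Fully static build: the docker binary doubles as .dockerinit.
    return self_path
```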

Alexander Larsson

Sep 13, 2013, 10:12:23 AM
to docke...@googlegroups.com, Alexander Larsson, Jérôme Petazzoni
In addition to some fixes I also pushed a new "trivial" CoW backend that just copies the layers into the container when it is launched. It's currently not enabled; you have to hack Runtime.GetMountMethod() at the moment to enable it.

This is obviously not as nice as a real CoW solution, but it may be nice for some use cases. For instance, it could fit if you don't want to use the devmapper backend on your deployment system (but it's ok on the development systems) and you either don't create a lot of containers (few long-running containers), or you don't mind the extra space and slower container startup.
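
For illustration, a copy-based backend of this kind boils down to something like the Python sketch below. It is not the code from the branch: materialize is a hypothetical name, the whiteout handling assumes AUFS-style ".wh." files in the layers, and symlinked directories, device nodes and ownership are ignored to keep it short.

```python
import os
import shutil

WHITEOUT_PREFIX = ".wh."


def materialize(layers, target):
    """Copy layers (base first) on top of each other into target."""
    os.makedirs(target, exist_ok=True)
    for layer in layers:
        for dirpath, dirnames, filenames in os.walk(layer):
            rel = os.path.relpath(dirpath, layer)
            dest_dir = os.path.normpath(os.path.join(target, rel))
            os.makedirs(dest_dir, exist_ok=True)
            for name in filenames:
                if name.startswith(WHITEOUT_PREFIX):
                    # A whiteout removes what a lower layer provided.
                    victim = os.path.join(
                        dest_dir, name[len(WHITEOUT_PREFIX):])
                    if os.path.isdir(victim) and not os.path.islink(victim):
                        shutil.rmtree(victim)
                    elif os.path.lexists(victim):
                        os.remove(victim)
                else:
                    # Later layers overwrite files from earlier ones.
                    shutil.copy2(os.path.join(dirpath, name),
                                 os.path.join(dest_dir, name),
                                 follow_symlinks=False)
```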

Container startup is surprisingly fast though, since the base images are generally pretty small, and it starts running before all the copied data is on disk:

$ sudo time docker run -t -i ubuntu /bin/echo hello world
hello world
0.01user 0.00system 0:01.45elapsed 1%CPU (0avgtext+0avgdata 6068maxresident)k
0inputs+0outputs (0major+1596minor)pagefaults 0swaps

So, less than 2 seconds to start and run hello world....

Also, if we could make the copy use the btrfs reflink (BTRFS_IOC_CLONE) it would actually be a CoW solution...
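
A reflink copy with a fallback can be sketched like this in Python (Linux only). The function name is hypothetical; the ioctl number is the kernel's FICLONE request, which has the same value as BTRFS_IOC_CLONE. On filesystems without reflink support the ioctl fails and the sketch falls back to an ordinary copy.

```python
import fcntl
import shutil

# _IOW(0x94, 9, int): FICLONE, numerically equal to BTRFS_IOC_CLONE.
FICLONE = 0x40049409


def clone_or_copy(src, dst):
    """Reflink src to dst if possible, else copy the data.

    Returns "reflink" or "copy" to say which path was taken.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to share src's extents with dst.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            pass  # not supported here; fall through to a plain copy
    shutil.copyfile(src, dst)
    return "copy"
```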

Alexander Larsson

Sep 13, 2013, 10:49:37 AM
to docke...@googlegroups.com, Alexander Larsson, Jérôme Petazzoni
I just pushed a version that uses reflink on btrfs, which makes it perform a bit better on my system:


$ sudo time docker run -t -i ubuntu /bin/echo hello world
hello world
0.01user 0.00system 0:00.81elapsed 2%CPU (0avgtext+0avgdata 8116maxresident)k
0inputs+0outputs (0major+1597minor)pagefaults 0swaps

Of course, it's still not the same speed as the devmapper version:


$ sudo time docker run -t -i ubuntu /bin/echo hello world
hello world
0.01user 0.00system 0:00.14elapsed 12%CPU (0avgtext+0avgdata 6076maxresident)k
0inputs+0outputs (0major+1598minor)pagefaults 0swaps

Victor Vieux

Sep 14, 2013, 8:04:59 PM
to Alexander Larsson, docke...@googlegroups.com, Jérôme Petazzoni
Hi Alexander,

I tried your device-mapper3 branch, forcing the use of device mapper,
and I got this issue on docker start: 

2013/09/15 00:03:19 Initializing base device-manager snapshot
device-mapper: message ioctl failed: File exists
[debug] image.go:392 Creating device-mapper device for image id e9aa60c60128cad1
[debug] api.go:1032 Error: Error starting container 5a3e0919ef0d: Unknown base hash
[debug] api.go:71 [error 500] Error starting container 5a3e0919ef0d: Unknown base hash

Do you have any idea?

Alexander Larsson

Sep 15, 2013, 4:07:30 AM
to docke...@googlegroups.com, Alexander Larsson, Jérôme Petazzoni
Did you run a previous version of the branch? Try unmounting any leftover mounts and run "dmsetup remove docker-..." on all docker devmapper devices.

This shouldn't happen normally but the format and stuff on disk changed a bit.

Jérôme Petazzoni

Oct 14, 2013, 6:33:05 PM
to docke...@googlegroups.com, Alexander Larsson, Jérôme Petazzoni
In case anyone hits this later — this is because the devicemapper doesn't support (yet) docker-in-docker.
Of course, if you're seeing this message but aren't running docker within docker, the cause is different.

Solomon Hykes

Oct 14, 2013, 6:44:40 PM
to Jérôme Petazzoni, docker-dev, Alexander Larsson
See https://github.com/dotcloud/docker/tree/dm-dind for an ongoing fix to that issue (not yet finished).

James Turnbull

Oct 14, 2013, 6:34:32 PM
to docke...@googlegroups.com
Jérôme Petazzoni wrote:
> In case anyone hits this later -- this is because the devicemapper
> doesn't support (yet) docker-in-docker. Of course, if you're seeing
> this message but aren't running docker within docker, the cause is
> different.

What's the plan for docker-in-docker on DeviceMapper? For 0.7?

Regards

James

--
* The Docker Book (http://dockerbook.com)
* The LogStash Book (http://logstashbook.com)
* Pro Puppet (http://tinyurl.com/ppuppet)
* Pro Linux System Administration (http://tinyurl.com/linuxadmin)
* Pro Nagios 2.0 (http://tinyurl.com/pronagios)
* Hardening Linux (http://tinyurl.com/hardeninglinux)

Jérôme Petazzoni

Oct 14, 2013, 11:15:49 PM
to James Turnbull, docker-dev



On Mon, Oct 14, 2013 at 3:34 PM, James Turnbull <ja...@lovedthanlost.net> wrote:

Jérôme Petazzoni wrote:
> In case anyone hits this later — this is because the devicemapper

> doesn't support (yet) docker-in-docker. Of course, if you're seeing
> this message but aren't running docker within docker, the cause is
> different.

What's the plan for docker-in-docker on DeviceMapper? For 0.7?

I don't know if d-in-d will work on 0.7.0; but it will very probably work on 0.7.1 or something like that.
I mean -- there is already some work in a branch, to fix the d-in-d issues on device mapper; I just don't know if we will want to release 0.7 with this regression or wait (since 0.7 is already late due to the complexity of the devmapper integration). Stay tuned ;-)



--

James Turnbull

Oct 14, 2013, 11:20:46 PM
to Jérôme Petazzoni, docker-dev
> I don't know if d-in-d will work on 0.7.0; but it will very probably
> work on 0.7.1 or something like that. I mean -- there is already some
> work in a branch, to fix the d-in-d issues on device mapper; I just
> don't know if we will want to release 0.7 with this regression or
> wait (since 0.7 is already late due to the complexity of the
> devmapper integration). Stay tuned ;-)

Thanks mate! Helpful to know as I use a lot of d-in-d for my Jenkins
builds and if I am going to break everything it'd be good to know when
to upgrade. :)

James