Revisiting Jepsen/Debian/Docker with latest versions


JS

May 12, 2022, 6:23:26 PM
to Jepsen Talk
Following up on the Docker discussions, tried the latest versions of:

  • Debian 11 (fresh install, with updates)
  • Docker (repository from docker.com, using current docs)

and it still fails, requiring a power cycle to recover.
The error logs and behavior are basically the same as the initial Issue 532: docker problematic.

From following issues and commits, I think the current status with Jepsen + Docker:

  • Debian, Ubuntu, Windows, confirmed fail
  • macOS, working for some when privileged=true

I happily use Debian/LXC to develop with Jepsen and recommend it to others.
Docker is only of interest as a convenient way to distribute test environments and results to others, particularly developers.

Although not the same as Docker, would there be interest in a GitHub runner/action that provided the Jepsen Debian/LXC environment for running tests? Would it actually be useful and used by application developers when presented with Jepsen results that indicate a likely anomaly?

P.S. docker-compose is now a plugin for the docker CLI vs a standalone command so Jepsen's bin/up errors with a request to install it. Would Jepsen like a PR for bin/up?
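
For illustration only, here is a minimal sketch of the detection logic such a change might use: prefer the `docker compose` plugin and fall back to a standalone `docker-compose` binary if one is on the PATH. It is written in Python rather than as a patch to bin/up, and the function names (`plugin_available`, `standalone_available`, `compose_command`) are hypothetical, not anything from Jepsen.

```python
import shutil
import subprocess
from typing import Callable, List, Optional


def plugin_available() -> bool:
    """Return True if `docker compose version` exits successfully."""
    if shutil.which("docker") is None:
        return False
    result = subprocess.run(
        ["docker", "compose", "version"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def standalone_available() -> bool:
    """Return True if a standalone `docker-compose` binary is on the PATH."""
    return shutil.which("docker-compose") is not None


def compose_command(
    has_plugin: Callable[[], bool] = plugin_available,
    has_standalone: Callable[[], bool] = standalone_available,
) -> Optional[List[str]]:
    """Pick the argv prefix for Compose, preferring the CLI plugin.

    The two probes are injectable so the choice can be exercised
    without Docker installed. Returns None if neither form is found.
    """
    if has_plugin():
        return ["docker", "compose"]
    if has_standalone():
        return ["docker-compose"]
    return None
```

A caller would append the subcommand to whatever prefix comes back (for example `["up"]`) and report a clear error when the result is `None`.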

Kyle Kingsbury

May 13, 2022, 10:36:11 AM
to ta...@jepsen.io
On Thu, 2022-05-12 at 15:23 -0700, JS wrote:
> Following up on the Docker discussions, tried the latest versions of:
>
>  * Debian 11 (fresh install, with updates)
>  * Docker (repository from docker.com, using current docs)
>
> and it still fails requiring a power cycle to recover.
> The error logs and behavior are basically the same as the initial
> Issue 532: docker problematic.
>
> From following issues and commits, I think the current status with
> Jepsen + Docker:
>
>  * Debian, Ubuntu, Windows, confirmed fail
>  * macOS, working for some when privileged=true

Ughhghhg. If you can find a way to get this working, I think folks
would probably appreciate it. Feels like the Docker setup is one of
those things which is constantly bitrotting as Docker itself evolves,
and every time a client asks about it I spend a week trying to un-break
a system I don't really understand.

> Although not the same as Docker, would there be interest in a GitHub
> runner/action that provided the Jepsen Debian/LXC environment for
> running tests? Would it actually be useful and used by application
> developers when presented with Jepsen results that indicate a likely
> anomaly?

I don't really know much personally about Github runners/actions, so I
can't say that I'd find it helpful--though I don't necessarily know
what I'm missing. Anyone else?

> P.S. docker-compose is now a plugin for the docker CLI vs a
> standalone command so Jepsen's bin/up errors with a request to
> install it. Would Jepsen like a PR for bin/up?

Yes, please!

--Kyle

JS

May 16, 2022, 4:45:56 PM
to Jepsen Talk, ap...@jepsen.io
> Ughhghhg. If you can find a way to get this working, I think folks
> would probably appreciate it.

Did some research. TL;DR: current host/container/systemd/Docker configs and behavior have evolved to the point where systemd containers can only be configured with `docker run` and not with `docker compose`. The Jepsen issue has been updated with specifics, links to Docker issues, etc.

The workaround of using `docker run` vs `docker compose` is problematic.
I was able to manually create a Jepsen environment using individual `docker` commands for containers/volumes/networks and poking at the Docker environment.
I was not able to create a bash script that reliably worked. 
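
To make the shape of that decomposition concrete, here is a rough sketch in Python that only builds the command lines and does not run them: one `docker network create`, then one `docker run` per container. The network and image names are hypothetical placeholders, and `--privileged` is the only extra flag included because it is the only setting mentioned in this thread. Whatever else systemd containers need, along with volumes, is deliberately left out, so this is not a working environment.

```python
from typing import List


def docker_commands(
    nodes: List[str],
    network: str = "jepsen-net",            # hypothetical network name
    node_image: str = "jepsen-node",        # hypothetical image name
    control_image: str = "jepsen-control",  # hypothetical image name
) -> List[List[str]]:
    """Build argv lists for a compose-free setup: network first, then containers."""
    commands = [["docker", "network", "create", network]]

    # One container per DB node, plus a single control container.
    containers = [(name, node_image) for name in nodes]
    containers.append(("control", control_image))

    for name, image in containers:
        commands.append([
            "docker", "run", "--detach",
            "--name", name,
            "--hostname", name,
            "--network", network,
            "--privileged",  # assumption: taken from the privileged=true note above
            image,
        ])
    return commands


if __name__ == "__main__":
    # Print the commands for inspection instead of executing them.
    for argv in docker_commands(["n1", "n2", "n3"]):
        print(" ".join(argv))
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine where Docker is already in a bad state.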

> > P.S. docker-compose is now a plugin for the docker CLI vs a
> > standalone command so Jepsen's bin/up errors with a request to
> > install it. Would Jepsen like a PR for bin/up?
>
> Yes, please!

Going to wait for something to change with Docker before suggesting any changes.
Linux and Windows users are currently blocked from docker composing systemd containers.
macOS users are hanging on by a thread.

Kyle Kingsbury

May 16, 2022, 5:09:10 PM
to JS, Jepsen Talk
On Mon, 2022-05-16 at 13:45 -0700, JS wrote:
>
> The workaround of using `docker run` vs `docker compose` is
> problematic.
> I was able to manually create a Jepsen environment using individual
> `docker` commands for containers/volumes/networks and poking at the
> Docker environment.
> I was not able to create a bash script that reliably worked. 


A valiant effort, I'm sure!

> Going to wait for something to change with Docker before suggesting
> any changes.
>  Linux and Windows users are currently blocked from docker composing
> systemd containers.
> macOS users are hanging on by a thread.

Argh! I'm so sorry, what a mess. Thanks for looking into this Jeff--
hopefully Docker stabilizes. In the meantime I may need to add more
language to the docs steering folks away from Docker, cuz this has been
a persistent pain point. :(

--Kyle

JS

Jun 16, 2022, 11:45:32 PM
to Jepsen Talk, ap...@jepsen.io, JS
I did end up trying to go all the way with decomposing Jepsen's docker compose into individual docker commands:


It's temporary (until Docker is fixed) and not really designed for development (e.g. no smart build caching or volume management).

It was made to be able to share Jepsen tests, e.g. to submit a bug report.
In other words, being able to share Jepsen test results today is worth hacking around docker compose while the timing of its fix is indeterminate.

The etcd tests, https://github.com/jepsen-io/etcd, were used to test the workaround environment.

The environment expects a current Docker from the docker.com repository.
It was developed/tested on Debian with docker-desktop in the hope of a better chance of success with macOS and Windows.

jepsen-docker-workaround also uses AntidoteDB as an example. The tests are still at the beginning stages of development but can be used to partition/pause/kill inter/intra data center nodes for g-set and pn-counter data types.