Distributed systems enthusiasts (and those who want to see those systems burn),
I have good news! A new release of Jepsen is at hand!

I've been taking a long-overdue break from contract work this fall, and took the
opportunity to rewrite many parts of Jepsen that were slow or limited to small
histories. The result is Jepsen 0.3.0, available now on Clojars.

https://github.com/jepsen-io/jepsen/releases/tag/v0.3.0

This release replaces many of Jepsen's internals with faster or more scalable
data structures. It introduces significant new datatypes and adds new support
libraries. Core generators are much faster, thanks to new Context and Op types.
Running and analyzing tests can be 1-2 orders of magnitude faster: Jepsen can
now run list-append tests at ~45,000 ops/sec and check them at ~30,000 ops/sec.
Histories are streamed and loaded incrementally, which improves crash recovery,
allows for histories larger than RAM, and speeds up REPL work. Histories in the
hundreds of millions or even billions of operations are now tractable. Most
checkers are parallelized and take advantage of sophisticated multi-query
optimization for reductions over histories. A new dependency-aware executor
allows checkers to run in parallel without starvation. New `nemesis.combined`
packages support file truncation and bitflips, as well as network latency and
packet loss.
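
To make the crash-recovery point concrete, here is a minimal Python sketch of the general idea: operations are appended to a file in sealed, checksummed chunks, and a reader keeps every intact chunk while discarding a torn tail. This is not Jepsen's actual on-disk format. The framing (a length and CRC32 header followed by a JSON payload), the tiny chunk size, and the names `write_chunks` and `recover` are all assumptions made for illustration.

```python
import io
import json
import struct
import zlib

# Hypothetical framing: payload length, then CRC32 of the payload.
HEADER = struct.Struct(">II")

def write_chunks(out, ops, chunk_size):
    """Append ops to `out` in sealed chunks of at most `chunk_size` ops."""
    for i in range(0, len(ops), chunk_size):
        payload = json.dumps(ops[i:i + chunk_size]).encode("utf-8")
        out.write(HEADER.pack(len(payload), zlib.crc32(payload)))
        out.write(payload)

def recover(data):
    """Return every op from the longest prefix of intact chunks in `data`."""
    ops, pos = [], 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        payload = data[pos + HEADER.size:pos + HEADER.size + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or corrupt chunk: stop, keeping what we have
        ops.extend(json.loads(payload))
        pos += HEADER.size + length
    return ops

# Simulate a crash partway through writing the third chunk.
ops = [{"index": i, "type": "invoke", "f": "read"} for i in range(10)]
buf = io.BytesIO()
write_chunks(buf, ops, chunk_size=4)
torn = buf.getvalue()[:-5]  # lose the tail of the last chunk
recovered = recover(torn)
print(len(recovered))
```

In this toy run the first two chunks survive, so eight of the ten operations can still be analyzed. The same shape of argument is why a sealed-chunk layout lets a crashed test keep most of its history.
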
As usual, most things should be API compatible, and we try to issue Obvious
Warnings when they're not--but this is a big enough change that we're bumping
the minor version from 0.2.7 to 0.3.0. Users integrating tightly with histories
and generators should test their code carefully.
## New Features
- A new library, [jepsen.history](https://github.com/jepsen-io/history),
provides support for writing efficient checkers. It includes a transactional
dependency-aware concurrent executor, concurrent and linear folds with
multi-query optimization, and lazy datatypes for working with large histories.
- Operations are now represented by an Op defrecord (jepsen.history.Op) instead
of maps. This yields significant performance improvements. Ops have mandatory
`:index` and `:time` fields, both longs. See jepsen.history for more details.
- Histories are incrementally streamed to the `test.jepsen` file, and sealed in
16384-operation chunks. If a test crashes during the run or analysis phase, you
can likely recover some of its history and re-analyze it.
- Histories are now represented by subtypes of jepsen.history.History. These
should be compatible with vectors, but stream their contents lazily from disk.
Mapping between invocations and completions is now built in to histories, rather
than being an external pair-index structure. Histories support efficient linear
and concurrent folds with stream fusion and multi-query optimization, and
directly support [Tesser](https://github.com/aphyr/tesser) folds. Analyses may
be 1-2 orders of magnitude faster, depending on hardware. See jepsen.history for
details.
- dom-top.core has a new `reducer` macro which roughly doubles performance for
reductions with multiple accumulator variables.
- Elle can catch new classes of anomalies, especially anti-dependency cycles
that include realtime or process edges.
- `lein run analyze` now pulls the test arguments out of the test; you don't
have to pass them every time.
- A new `nemesis.combined/file-corruption-package` provides support for bitflips
and truncation of files.
- A new `nemesis.combined/packet-package` induces network latency and packet loss.
- A new `tests.kafka` namespace supports tests for Kafka-style append-only
ordered logs.
- `util/rand-distribution` supports picking random numbers drawn from a
specified distribution.
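
The multi-query optimization mentioned above can be pictured as follows: when several checkers each want to fold over the same history, the folds are fused so the history is traversed once rather than once per checker. The Python sketch below illustrates that idea only. It is not the jepsen.history API; `Fold`, `run_fused`, and the sample history are hypothetical, and the chunks are reduced sequentially here even though they could be handed to separate workers.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Fold:
    """A fold: an initial accumulator, a per-op reducer, and a combiner
    that merges the accumulators of two adjacent chunks."""
    init: Callable[[], Any]
    reduce: Callable[[Any, dict], Any]
    combine: Callable[[Any, Any], Any]

def run_fused(folds, history, chunk_size):
    """Run every fold in `folds` with a single traversal of `history`.

    Each chunk is reduced independently, then the per-chunk results are
    combined in order."""
    totals = [f.init() for f in folds]
    for start in range(0, len(history), chunk_size):
        accs = [f.init() for f in folds]
        for op in history[start:start + chunk_size]:  # one pass feeds all folds
            accs = [f.reduce(a, op) for f, a in zip(folds, accs)]
        totals = [f.combine(t, a) for f, t, a in zip(folds, totals, accs)]
    return totals

# Two independent "checkers": count :ok ops, and find the latest timestamp.
count_oks = Fold(lambda: 0,
                 lambda n, op: n + (op["type"] == "ok"),
                 lambda a, b: a + b)
max_time = Fold(lambda: -1,
                lambda m, op: max(m, op["time"]),
                max)

history = [
    {"index": 0, "time": 10, "type": "invoke", "f": "read"},
    {"index": 1, "time": 25, "type": "ok",     "f": "read"},
    {"index": 2, "time": 30, "type": "invoke", "f": "write"},
    {"index": 3, "time": 42, "type": "fail",   "f": "write"},
    {"index": 4, "time": 50, "type": "invoke", "f": "read"},
    {"index": 5, "time": 61, "type": "ok",     "f": "read"},
]
oks, latest = run_fused([count_oks, max_time], history, chunk_size=4)
print(oks, latest)
```

Because each fold carries its own combiner, the result does not depend on where the chunk boundaries fall, which is what makes the per-chunk work safe to parallelize.
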
## Significant API Changes
- Operations are now jepsen.history.Ops, not maps. `:index` and `:time` fields
are now mandatory.
- Histories are now subtypes of jepsen.history.History, not vectors. They should
be mostly API compatible, and will transparently promote themselves to vectors
on certain operations (for instance, conj).
- Generator contexts are now jepsen.generator.context.Contexts, rather than
maps. Accessing their old fields will throw and warn you to use the new
polymorphic functions in jepsen.generator.context.
- `lein run analyze` now takes `-t path-to-test` or `-t test-index`, rather than
the full arguments to recreate the test map.
- `test.fressian` files, deprecated in 0.2.x, are no longer generated. Use
`test.jepsen` instead.
## Performance Improvements
- Accessing operations is much faster thanks to jepsen.history.Op.
- jepsen.generator is roughly an order of magnitude faster, especially for
high-concurrency tests (~thousands of threads), thanks to the new
jepsen.generator.context.Context type.
- Generators can now dynamically compile context-filtering operations to BitSet
intersections, which speeds up `reserve`, `on-threads`, `clients`, `nemesis`,
and other generators.
- Reductions over histories (e.g. basically every checker) are 1-2 orders of
magnitude faster, thanks to jepsen.history.
- Elle is roughly an order of magnitude faster, thanks to jepsen.history and
careful parallelization.
- Assorted optimizations to generator/fill-in-op, soonest-op-mop, and reserve
make them significantly faster.
- Tests no longer need to wait for history writing at the end of the test, since
it's streamed to disk.
- Using functions as generators is now faster; we perform arity reflection only
once rather than on every op.
- store.fressian decodes lists as vectors directly, rather than post-processing
them. This makes Fressian decoding significantly faster.
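
As a rough illustration of why compiling context filters to BitSet intersections helps, the Python sketch below represents sets of threads as bitmasks, so that restricting a context to "threads this generator may use" is a single AND rather than a scan over every thread. It uses Python integers in place of a BitSet and is not Jepsen's implementation; the helper names and the example thread ids are assumptions.

```python
def bitset(thread_ids):
    """Encode a collection of small non-negative thread ids as an int bitmask."""
    mask = 0
    for t in thread_ids:
        mask |= 1 << t
    return mask

def members(mask):
    """Decode a bitmask back into a sorted list of thread ids."""
    return [i for i in range(mask.bit_length()) if mask >> i & 1]

# Hypothetical context: threads 0-5 exist; threads 1 and 4 are busy.
all_threads = bitset(range(6))
free_threads = all_threads & ~bitset([1, 4])

# A filter such as "only threads 0-2" is precomputed once as a mask...
first_three = bitset([0, 1, 2])

# ...so applying it to any context is one AND, not a walk over threads.
eligible = free_threads & first_three
print(members(eligible))
```

The payoff grows with thread count: the filter mask is built once, and each later application costs a handful of word-sized operations regardless of how many threads the filter names.
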
## Minor Improvements
- Jepsen and Elle used knossos.history and knossos.op extensively. These have
been almost entirely replaced with jepsen.history.
- Most checkers have been rewritten to use jepsen.history; many reductions are
now concurrent folds.
- Knossos 0.3.9
- Tools.cli 1.0.214
- Unilog 0.7.31
- Ring 1.9.6
- SSHJ 0.34.0
- Elle 0.1.6
- Lazyfs c16518f6
- Assorted type hints and compiler warnings resolved
- Contexts are deterministic again, rather than stochastic. This may break tests
that depended on specific nondeterministic orders.
**Full Changelog**:
https://github.com/jepsen-io/jepsen/compare/v0.2.7...v0.3.0
Happy testing!
--Kyle