Subject: Feedback wanted: Java 21 static graph runtime for low-latency in-process pipelines

Leonidas Kanarelis

May 7, 2026, 8:17:03 AM
to mechanical-sympathy
Hi all,

I would appreciate technical feedback on a Java 21 runtime I have been building:

https://github.com/ElevatedDev/Lattice

Lattice is aimed at a specific shape of service that I kept seeing in low-latency JVM work: the processing topology is known before startup, the work stays inside one JVM, and the runtime shape is effectively fixed — ingest, validate, enrich, route, join, sink.

In practice, I often saw those systems implemented in one of three ways:

- hand-rolled queues and worker conventions,
- one ordered ring with application-specific structure built around it,
- or a stream processor / broker that was larger than the in-process problem.

Lattice is an attempt to make the middle case explicit.
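To make the first pattern concrete, here is a minimal sketch of a hand-rolled two-stage pipeline over a bounded BlockingQueue. Everything here (stage names, the poison-pill convention) is illustrative, not Lattice code: the point is that the topology and the shutdown protocol live only in conventions between threads, invisible to any runtime.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hand-rolled "queues and worker conventions" baseline: each stage is a
// thread, each edge is a bounded BlockingQueue, and the wiring is manual.
public class HandRolledPipeline {
    public static List<String> run(List<Integer> inputs) throws InterruptedException {
        BlockingQueue<Integer> validated = new ArrayBlockingQueue<>(64); // bounded edge
        List<String> out = new ArrayList<>();

        // "Enrich" worker: drains the edge until it sees a poison value.
        Thread enrich = new Thread(() -> {
            try {
                for (Integer v; (v = validated.take()) != Integer.MIN_VALUE; ) {
                    out.add("enriched:" + v);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        enrich.start();

        // "Validate" stage runs on the caller thread and publishes downstream.
        for (int v : inputs) {
            if (v >= 0) validated.put(v); // drop negatives
        }
        validated.put(Integer.MIN_VALUE); // poison pill: a worker convention, not runtime knowledge
        enrich.join(); // join() gives the happens-before edge for reading `out`
        return out;
    }
}
```

The graph here is real (validate → enrich → sink) but exists only in the reader's head, which is exactly the reconstruction problem described below.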

The model is a typed static graph. You declare sources, stages, routing nodes, joins, sinks, and bounded SPSC/MPSC edges. Before workers start, Lattice validates the graph and compiles it into a fixed worker plan. The compiler can then make decisions around source specialization, edge-local backpressure, eligible linear-chain fusion, worker placement, and preallocated payload reuse. It also emits a compilation report so the user can see what fused, what stayed physical, and why.
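As a toy illustration of the "validate, then compile into a fixed plan" idea (class and method names are hypothetical, not Lattice's actual API): a linear chain of same-typed stages is checked up front, then fused into a single operator, so the hot path is one call site with no queue hop between stages.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Toy sketch of linear-chain fusion: validation happens before any worker
// starts, and the compiled plan is one composed function, not N queue hops.
public final class LinearChainCompiler {
    static UnaryOperator<Integer> compile(List<UnaryOperator<Integer>> stages) {
        if (stages.isEmpty()) {
            throw new IllegalArgumentException("empty chain"); // rejected before startup
        }
        return x -> {
            Integer v = x;
            for (UnaryOperator<Integer> s : stages) v = s.apply(v); // fused: no per-stage edge
            return v;
        };
    }

    static int demo(int x) {
        // "validate" then "enrich", declared as a chain and fused at compile time.
        return compile(List.of(v -> v + 1, v -> v * 2)).apply(x);
    }
}
```

A real compiler would of course decide fusion per edge from the declared graph (ordering constraints, worker placement); here it is unconditional just to show the shape of the transformation.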

The scope is intentionally narrow:

- Java 21
- one JVM
- static topology
- bounded edges
- explicit ownership
- no durability
- no replay
- no distributed delivery
- no runtime topology changes
- no exactly-once external effects

So this is not trying to be Kafka, Flink, a broker, a persistence layer, or a general queue replacement.

The obvious comparison is LMAX Disruptor. I still think Disruptor is excellent when the problem is fundamentally one ordered ring. Lattice is making a different bet: if the application is really a fixed typed DAG, the runtime should be able to see that DAG directly and specialize around it instead of reconstructing it from queue plumbing and handler conventions.
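For readers less familiar with the "one ordered ring" framing, here is a minimal single-producer/single-consumer bounded ring, the primitive underneath both a Disruptor-style ring and a bounded SPSC edge. This is a sketch of the general technique only (padding, wait strategies, and batching all omitted), not Disruptor's or Lattice's actual implementation:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal SPSC bounded ring: one writer thread, one reader thread.
// lazySet publishes the counters with store ordering; get reads are volatile.
public final class SpscRing {
    private final Object[] buffer;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next slot to read
    private final AtomicLong tail = new AtomicLong(); // next slot to write

    public SpscRing(int capacityPow2) {       // capacity must be a power of two
        buffer = new Object[capacityPow2];
        mask = capacityPow2 - 1;
    }

    /** Producer side; returns false when the edge is full (backpressure). */
    public boolean offer(Object e) {
        long t = tail.get();
        if (t - head.get() == buffer.length) return false; // full: caller backs off
        buffer[(int) (t & mask)] = e;
        tail.lazySet(t + 1); // publish only after the slot is written
        return true;
    }

    /** Consumer side; returns null when the edge is empty. */
    public Object poll() {
        long h = head.get();
        if (h == tail.get()) return null; // empty
        int idx = (int) (h & mask);
        Object e = buffer[idx];
        buffer[idx] = null; // release the slot for reuse
        head.lazySet(h + 1);
        return e;
    }
}
```

When the whole application is one such ring with ordered consumers, Disruptor's model fits directly; the DAG bet above is about what becomes possible when there are many typed edges like this and the runtime can see all of them.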

I included a Disruptor comparison, but tried to keep the claim scoped rather than promotional:

https://github.com/ElevatedDev/Lattice/blob/master/docs/disruptor-comparison.md

The current checked-in benchmark snapshot is here:

https://github.com/ElevatedDev/Lattice/tree/master/docs/benchmark-results/2026-05-02-per-graph-refresh

That baseline uses OpenJDK 21.0.10, JMH 1.36, Disruptor 4.0.0, Intel Core i9-14900HX under WSL2, with:

-Xms2g -Xmx2g -XX:+AlwaysPreTouch -XX:+UnlockDiagnosticVMOptions -XX:+UseParallelGC

Some of the static-graph rows are favorable to Lattice:

- physical three-stage publish: 31.9M vs 21.7M ops/s
- inline-fused publish: 127.9M vs 35.7M ops/s
- equal-call-site reference row: 209.2M vs 31.1M ops/s
- completion-gated source-inline path: 77.9M vs 3.62M ops/s

But the matrix also keeps Disruptor-favorable rows visible. Disruptor wins or does better on some simple source/sink completed-operation rows, physical pipeline completed rows, broadcast, and dependency/join completed cases. So the intended claim is not “Lattice is always faster than Disruptor.” The claim is narrower: a static typed graph gives the runtime extra information, and that information can be useful in specific shapes.

I would particularly value criticism on:

- whether the benchmark shapes are fair and useful,
- which benchmark cases are missing,
- whether the Disruptor comparison is framed correctly,
- whether the ownership/backpressure model is clear enough,
- whether the fusion and source-specialization rules make sense,
- and where this abstraction would break down in real low-latency JVM systems.

The core runtime is Java. There is an optional Rust JNI backend, but it is only for placement/topology diagnostics and is not required for normal Java use. The project is Apache 2.0 and includes runnable examples, Javadocs, JCStress tests, JMH JSON/stdout artifacts, and release checks.

It is also on Maven Central:

io.github.elevateddev:lattice:1.0.0

Repository:

https://github.com/ElevatedDev/Lattice

I am not looking for soft launch feedback so much as technical pushback. If you have maintained JVM pipelines built from raw queues, Disruptor, Aeron-adjacent components, executor graphs, or heavier stream processors, I would be very interested in where you think this design is useful, where it is confused, and what you would test before trusting it.