Finnian Reilly

May 22, 2026, 7:36:15 AM

to Eiffel Users

Finding a billion-user project for Eiffel

How DbC catches the security flaws that Rust misses

Summary of article on eiffel.org

The industry is currently celebrating Rust as a breakthrough in software safety. But Eiffel had a more complete answer to software correctness before Rust's creator was born. This essay argues that the moment has arrived for the Eiffel community to demonstrate that publicly — with a specific project, a real deployment strategy, and a compelling security argument.

https://www.eiffel.org/blog/Finnian%20Reilly/2026/05/finding-billion-user-project-eiffel-how-dbc-catches-security-flaws-rust-misses


Friedrich Dominicus

May 22, 2026, 10:34:36 AM

to eiffel...@googlegroups.com

I built DbC into VBA for my own use, because a) it works and b) the tools for improving the codebase cannot cope with the size of the codebase.
DbC is still massively underrated (for whatever reason).

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/eiffel-users/4e728a1c-4efd-40cf-b397-87f334030a0an%40googlegroups.com.

lar...@eiffel.com

May 22, 2026, 10:51:28 AM

to eiffel...@googlegroups.com

Thank you Finnian!


Ulrich Windl

May 23, 2026, 10:48:04 AM

to eiffel...@googlegroups.com

Hi!

I have not studied the latest Eiffel compiler, nor did I try the latest Rust compiler, but from what I saw in the past, the Rust compiler was quite good at static analysis during compilation (without the user having to add additional assertions), while Eiffel may detect violated assertions during runtime. So it seems that Rust uses the more advanced compiler technology. However, I'm aware that many errors the Rust compiler complains about cannot happen in Eiffel. This is another performance against comfort issue.
I'd advise the Eiffel community to look at the good things in Rust, rather than bashing at it with a "we are so much better". However, the syntax of Rust is really ugly, maybe even uglier than that of C.

Kind regards,
Ulrich


Finnian Reilly

May 23, 2026, 11:25:41 AM

to eiffel...@googlegroups.com

On 23/05/2026 15:47, Ulrich Windl wrote:
> I'd advise the Eiffel community to look at the good things in Rust, rather than bashing at it with a "we are so much better". However, the syntax of Rust is really ugly, maybe even uglier than that of C.

Hi Ulrich, the essay isn't bashing Rust — it explicitly acknowledges
what Rust does well. The specific claim is narrower: that Design by
Contract catches logic errors like the sudo-rs CVEs of 2025, which
Rust's static analysis cannot address because they're not memory errors.

--
SmartDevelopersUseUnderScoresInTheirIdentifiersBecause_it_is_much_easier_to_read

Eric Bezault

May 23, 2026, 5:41:44 PM

to eiffel...@googlegroups.com, Ulrich Windl

On 23/05/2026 16:47, Ulrich Windl wrote:
> I have not studied the latest Eiffel compiler, nor did I try the latest Rust compiler, but from what I saw in the past, the Rust compiler was quite good at static analysis during compilation (without the user having to add additional assertions), while Eiffel may detect violated assertions during runtime.

What if Eiffel could detect assertion violations
during compilation? This is the promise of AutoProof.
After many years as a research project, AutoProof is
already able to detect many kinds of assertion violations.
There is a (several year old) prototype/demo available
online:

http://comcom.csail.mit.edu/comcom/#AutoProof

Can you find why class ACCOUNT is broken in this example?
Just fix it and click on Run again...

Having such a tool in production would be a killer feature in
the era of AI-generated code. The assertions would be
the specification provided to the AI, and AutoProof
would automatically verify that the generated code
satisfies these assertions.

If there are some companies interested in such technology
and ready to invest money to support this work, this
sponsorship would allow me to join the team and help turn
this research project into a mature production tool
integrated into the Eiffel ecosystem. Please contact me
or Eiffel Software for further discussion.

--
Eric Bezault <er...@gobosoft.com>
Eiffel expert - available for freelance work
https://www.gobosoft.com

Ian Joyner

May 23, 2026, 8:35:35 PM

to eiffel...@googlegroups.com

Hello Ulrich.

"I'm aware that many errors the Rust compiler complains about cannot happen in Eiffel. This is another performance against comfort issue."

I'm not sure about the "performance against comfort issue" point. There are many issues in C-like languages that a compiler must detect. When they go undetected, they become traps for the programmer, ranging from mildly irritating to serious, and they can then be found only by run-time testing and debugging. That is poor for development performance. Many of these issues just won't happen in Eiffel. Eiffel is not stopping the programmer from expressing anything legitimate; it simply makes it impossible to express something illegitimate in the first place. That certainly is superior to needing to detect such issues in a compiler or by debugging.

I don’t see that this affects run-time performance in general.

Remember also that Rust is meant for low-level programming. Its proponents seem to have forgotten that and now compete in areas that C and C++ should never have been used for but illegitimately encroached on.

I keep telling people on the net that system and general programming are entirely different. System programming is to define the abstract platform on which other software runs. System programming should take care of lower-level platform details so others don’t have to.

System programming languages include everything general programming languages do, plus those extras, which should not be in general languages. Unfortunately, CS students are taught how things work underneath, like memory management, and then think they should deal with these issues themselves. Rust is a language meant to handle those issues.

Where system programmers insist that all should use their system language, they have failed to do their job. Many don't even understand that is their job.

"I'd advise the Eiffel community to look at the good things in Rust, rather than bashing at it with a 'we are so much better'."

I agree. But we must also be able to express why Eiffel is better. Not exposing system issues is one such place. So actually, things that can't happen in Eiffel are sometimes due to not being able to make system mistakes (because we should not be programming at that level), but there are also many non-system traps and issues that aren't possible in Eiffel.

Sure we should leave the airs-and-graces of superiority to the C/C++ people who consider themselves superior to other programmers. Those other programmers might realise that computing is about computation, not computers.

Ian


javier...@gmail.com

May 24, 2026, 12:11:09 PM

to Eiffel Users

I agree that AutoProof is a potential killer app for Eiffel in the era of AI-generated code.

With AutoProof (a code verifier), we move bug detection from runtime exceptions to compile-time verification.

Spec + DbC

--> LLM generates an implementation to satisfy them

--> AutoProof: run static proof (SMT solvers)
--> Proof passes?
Yes --> [Execute]
No --> Violation details back to LLM | Review the spec
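The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the integer square root specification, the two hard-coded candidates (standing in for successive LLM outputs) and the `verify` function are all hypothetical, and `verify` checks the contract over a bounded input range rather than running a real static proof as AutoProof would.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical specification: integer square root.
def precondition(n: int) -> bool:
    return n >= 0

def postcondition(n: int, result: int) -> bool:
    return result * result <= n < (result + 1) * (result + 1)

def verify(impl: Callable[[int], int], domain: Iterable[int]) -> Optional[str]:
    """Stand-in for the proof step: None on success, else violation details."""
    for n in domain:
        if not precondition(n):
            continue
        result = impl(n)
        if not postcondition(n, result):
            return f"postcondition violated for n={n}: got {result}"
    return None

# Two hard-coded candidates play the role of successive LLM outputs.
def candidate_1(n: int) -> int:
    return n // 2                      # wrong for most inputs

def candidate_2(n: int) -> int:
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r

def generate_until_verified(
    candidates: List[Callable[[int], int]], domain: range
) -> Tuple[Optional[Callable[[int], int]], List[str]]:
    feedback: List[str] = []           # reports that would be sent back to the LLM
    for impl in candidates:
        report = verify(impl, domain)
        if report is None:
            return impl, feedback      # check passed: safe to execute
        feedback.append(report)
    return None, feedback              # nothing passed: review the spec

accepted, feedback = generate_until_verified([candidate_1, candidate_2], range(200))
```

The first candidate is rejected with a concrete violation report, and only the second one is accepted for execution.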

For a related, complementary approach that applies formal verification to AI agent workflows, see https://dl.acm.org/doi/epdf/10.1145/3777544.
That article, Guardian of Agents, works at a higher level: it targets the orchestration of agents, as a kind of workflow verifier.

--Javier

Anders Persson

May 26, 2026, 7:56:23 AM

to eiffel...@googlegroups.com

FYI: I am trying to build it

https://github.com/andersoxie/xpact

Kind regards

Anders Persson

BSharp AB

Linked in profile

+46 763 17 23 25


Ian Joyner

May 27, 2026, 7:55:05 AM

to eiffel...@googlegroups.com

Thanks Finnian. I finally found time to read your article. It is great to see useful projects for Eiffel being proposed.

When it comes to security though, we cannot rely on languages to ensure security on a platform level. The hackers will just opt to not use such a language.

Yes, a language can ensure correctness and security above the platform level, but hackers want to subvert that. They could even opt for assembler (something that should have disappeared decades ago).

Thus security must be built into the platform itself. A language should reflect that and have such security in its semantics. If a language has secure semantics, its programs will run on a secure platform.

Languages without secure semantics will most probably break. Designers of platforms are scared to define secure platforms because that would break a lot of C programs.

C Is Not a Low-level Language, ACM Queue (queue.acm.org)

Insecure language on insecure platform: results in security breaches that responsible programmers must be careful to avoid. This is not really a programmer responsibility but a programmer burden.

Secure language on insecure platform: an improvement, but it can still be bypassed.

Secure language on a secure platform: results in secure platforms and ease of programming.

Insecure language on secure platform: can result in better programs in that language, but will also break many existing programs, at least forcing them to be updated (this is the problem with C++: it has never forced programmers to abandon old flaky C methods and update).

Ian


Finnian Reilly

May 28, 2026, 9:34:53 AM

to eiffel...@googlegroups.com

Two New Sections 13 and 14

For completeness I have inserted two new sections into the eiffel.org essay. You can read them below.

regards

Finnian

13. Eiffel's static verification ecosystem: AutoTest and AutoProof

AutoTest is a contract-driven automated test generation tool integrated into EiffelStudio. Rather than requiring developers to write test cases manually, AutoTest reads existing DbC contracts and generates test inputs automatically — using random and systematic strategies to find inputs that satisfy preconditions, executing routines, and checking whether postconditions and invariants hold. A contract violation becomes an automatically discovered bug without a human having written the test that found it. For xpact, AutoTest would automatically generate both valid and malformed XML inputs, exercising the parser's contract boundaries continuously as development proceeds. The contracts serve double duty: specification and test oracle simultaneously.
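The idea of a contract acting as test oracle can be shown with a small Python sketch. It is not AutoTest and does not use any EiffelStudio API; the routine `clamp_length`, its contract and the input ranges are hypothetical, chosen only to show random inputs being filtered by the precondition and judged by the postcondition.

```python
import random
from typing import Callable, Optional, Tuple

def pre(requested: int, available: int) -> bool:
    return requested >= 0 and available >= 0

def post(requested: int, available: int, result: int) -> bool:
    return 0 <= result <= available and result <= requested

def clamp_length(requested: int, available: int) -> int:
    """Hypothetical routine: how many bytes may be copied into a buffer."""
    return min(requested, available)

def buggy_clamp_length(requested: int, available: int) -> int:
    return requested                   # forgets the buffer limit

def auto_test(routine: Callable[[int, int], int],
              trials: int = 1000, seed: int = 0) -> Optional[Tuple[int, int]]:
    """Return the first input pair that violates the postcondition, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        args = (rng.randint(-5, 50), rng.randint(-5, 50))
        if not pre(*args):
            continue                   # outside the contract: not a valid test
        if not post(*args, routine(*args)):
            return args                # counterexample found without a hand-written test
    return None
```

No test case is written by hand here: the precondition decides which inputs count, and the postcondition decides whether the routine passed.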

AutoProof goes further still — it attempts to prove Eiffel contracts correct at compile time using SMT solvers (Satisfiability Modulo Theories), most notably Microsoft Research's Z3 engine. Rather than finding violations empirically, AutoProof seeks mathematical proof that violations are impossible for all inputs. If a proof succeeds, the guarantee is not "we tested this extensively" but "this is provably correct." AutoProof translates Eiffel contracts into logical propositions and attempts to determine their truth mathematically. A successful proof eliminates an entire class of bug not just for tested inputs but for every possible input. AutoProof currently remains a research prototype — scaling to production codebases of arbitrary complexity is an unsolved problem, and many correct programs cannot yet be automatically proven within practical time bounds. However xpact's bounded, specification-driven domain — a streaming parser implementing a formal grammar with well-defined invariants derived directly from the XML specification — represents close to an ideal case for AutoProof. The possibility of mathematically proving xpact's core parsing contracts correct is realistic in a way that it would not be for a large general-purpose application.
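The difference between "tested" and "true for every input" can be illustrated without an SMT solver by exhaustively checking a contract over a small finite domain. The Python sketch below is a deliberately crude stand-in, not how AutoProof works: it enumerates all pairs of 8-bit values instead of reasoning symbolically, and the two length routines are hypothetical.

```python
from itertools import product
from typing import Callable, Optional, Tuple

BITS = 8
MASK = (1 << BITS) - 1                 # largest 8-bit value, 255

def unchecked_total(header: int, payload: int) -> int:
    return (header + payload) & MASK   # wraps like fixed-width C arithmetic

def checked_total(header: int, payload: int) -> int:
    if payload > MASK - header:
        raise OverflowError("length overflow")
    return header + payload

def holds_for_all(
    routine: Callable[[int, int], int]
) -> Tuple[bool, Optional[Tuple[int, int]]]:
    """Check `result >= header and result >= payload` on every 8-bit input pair."""
    for header, payload in product(range(MASK + 1), repeat=2):
        try:
            result = routine(header, payload)
        except OverflowError:
            continue                   # input rejected up front: nothing to check
        if not (result >= header and result >= payload):
            return False, (header, payload)
    return True, None
```

For the checked routine the property holds on all 65,536 input pairs, which is a guarantee for that whole domain rather than for a sample of it. A solver-based tool aims for the same kind of statement without enumerating the inputs.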

The layered correctness picture for xpact is therefore:

Void safety — static, always on, eliminates null dereference at compile time regardless of contract mode
DbC contracts — verified against the complete libexpat test suite including all CVE regression tests, with full contract checking enabled
AutoTest — automatic test generation driven by contracts, continuously exercising contract boundaries without manual test authorship
AutoProof — mathematical proof of correctness for provable components, eliminating entire bug classes for all possible inputs
Fuzzing — adversarial input coverage via OSS-Fuzz for the unprovable remainder

No other language or toolchain currently offers this combination for a security-critical parser. Rust eliminates memory corruption. Eiffel eliminates memory corruption, proves logic correctness, generates tests from specifications, and offers a path to mathematical verification — all from the same codebase, with the same annotations serving every layer.

14. A living contract suite: learning from the entire XML parser ecosystem

One of the less obvious advantages of xpact's contract-driven architecture is that it can learn continuously from security research happening across the entire XML parsing ecosystem — not just libexpat's CVE history.

Every XML parser CVE ever published represents a logic error that a correctly written contract would have prevented. libexpat, libxml2, Xerces-C++, MSXML, RapidXML — each has its own CVE history, and many of those vulnerabilities share common root causes: unbounded recursion depth, unconstrained entity expansion ratios, integer overflow on length calculations, negative length acceptance, algorithmic complexity attacks from crafted input structures. These are not libexpat-specific failures. They are recurring patterns in XML parser implementations across languages, decades, and organisations.
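Several of these root causes can be written down as explicit preconditions. The Python sketch below is a hypothetical illustration, not xpact code: the class `ParserState`, the limit values and the contract labels are assumptions, and integer overflow is left out because Python integers are unbounded.

```python
class ContractViolation(Exception):
    pass

def require(condition: bool, label: str) -> None:
    """Hand-rolled precondition check; Eiffel expresses this natively with `require`."""
    if not condition:
        raise ContractViolation(label)

MAX_DEPTH = 256              # assumed nesting limit
MAX_EXPANSION_RATIO = 100    # assumed limit: expanded bytes per input byte

class ParserState:
    def __init__(self, input_size: int) -> None:
        require(input_size > 0, "input_size_positive")
        self.input_size = input_size
        self.depth = 0
        self.expanded_bytes = 0

    def open_element(self) -> None:
        require(self.depth < MAX_DEPTH, "depth_within_limit")
        self.depth += 1

    def close_element(self) -> None:
        require(self.depth > 0, "has_open_element")
        self.depth -= 1

    def expand_entity(self, length: int) -> None:
        require(length >= 0, "length_non_negative")
        budget = MAX_EXPANSION_RATIO * self.input_size
        require(self.expanded_bytes + length <= budget, "expansion_within_ratio")
        self.expanded_bytes += length
```

Each limit that was implicit in the vulnerable parsers becomes a named, checkable condition at the point where it matters.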

This collective CVE history is, in effect, a collaboratively written specification of what an XML parser must constrain — contributed inadvertently by security researchers and attackers worldwide. xpact can treat it as such.

The AI-assisted contract derivation workflow

Claude and similar AI tools are well suited to systematic CVE analysis. The workflow is straightforward:

Fetch CVE descriptions and root cause analyses for all major XML parsers
Categorise each CVE by the class of constraint that was missing — recursion depth, expansion ratio, buffer length, integer range, complexity bound
Map each category to a specific Eiffel precondition or invariant
Identify which constraints xpact already expresses as contracts and which are absent
Suggest specific Eiffel contract annotations for the gaps
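The categorise, map and compare steps amount to a small gap analysis. The Python sketch below assumes the CVE records have already been fetched and categorised; the records, parser names, category keys and contract texts are all placeholders invented for illustration, not real CVE data or xpact contracts.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Placeholder records; real entries would come from CVE feeds.
cve_records: List[Dict[str, str]] = [
    {"id": "EXAMPLE-0001", "parser": "parser_a", "missing": "recursion_depth"},
    {"id": "EXAMPLE-0002", "parser": "parser_b", "missing": "expansion_ratio"},
    {"id": "EXAMPLE-0003", "parser": "parser_a", "missing": "integer_range"},
    {"id": "EXAMPLE-0004", "parser": "parser_c", "missing": "expansion_ratio"},
]

# Hypothetical mapping from constraint category to an Eiffel-style assertion.
CONTRACT_TEMPLATES: Dict[str, str] = {
    "recursion_depth": "depth_within_limit: depth < max_depth",
    "expansion_ratio": "ratio_within_limit: expanded_count <= max_ratio * input_count",
    "buffer_length": "length_non_negative: length >= 0",
    "integer_range": "no_overflow: count <= max_count - increment",
}

# Categories assumed to be covered already by existing contracts.
existing_contracts: Set[str] = {"recursion_depth", "buffer_length"}

def contract_gaps(records: List[Dict[str, str]],
                  existing: Set[str]) -> List[Tuple[str, int, str]]:
    """List (category, CVE count, suggested contract) for uncovered categories."""
    counts = Counter(record["missing"] for record in records)
    return [
        (category, counts[category], CONTRACT_TEMPLATES[category])
        for category in sorted(counts)
        if category not in existing
    ]

gaps = contract_gaps(cve_records, existing_contracts)
```

Re-running the same function as new records arrive gives the updated list of suggested contracts.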

This turns the entire XML parser CVE database into a continuously updated contract specification. As new CVEs are published for any XML parser, the same analysis applies — a new vulnerability anywhere in the ecosystem becomes a contract suggestion for xpact within days of disclosure rather than years after exploitation.

The inversion worth noting

This is a remarkable inversion of the usual security dynamic. Normally attackers discover vulnerabilities and defenders react. In xpact's model, every vulnerability discovered anywhere in the XML parsing ecosystem — past, present, and future — automatically strengthens xpact's contract suite. Attackers inadvertently contribute to xpact's correctness story. The CVE database becomes a collaborative specification written by the world's security researchers.

The other significant XML parsers

libexpat is the most widely embedded XML parser but the CVE landscape is broader. libxml2 — used by GNOME, Python's lxml, PHP, and many others — has an even longer CVE history and supports XPath, XPointer, and full validation, making it a natural Phase 2 target once xpact establishes its libexpat-compatible core. Xerces-C++ (Apache) has a significant enterprise deployment footprint with its own history of denial-of-service vulnerabilities from crafted inputs. RapidXML and pugixml are widely used in game development and embedded systems — less security scrutiny means their CVE surface may be underreported rather than absent.

The living contract suite

Combined with AutoTest and OSS-Fuzz continuous fuzzing, xpact's contract suite would be updated from three complementary sources: systematic analysis of historical CVEs across all XML parsers, new CVEs as they are published, and AutoTest-discovered violations during active development. This is a specification that grows more complete over time automatically — the opposite of how most security-critical libraries evolve, where specifications are static and vulnerabilities accumulate. xpact's contracts do not just reflect what we know today. They are designed to incorporate what the security community will discover tomorrow.

-- 
SmartDevelopersUseUnderScoresInTheirIdentifiersBecause_it_is_much_easier_to_read

Finnian Reilly

May 28, 2026, 9:41:49 AM

to eiffel...@googlegroups.com

Impressive progress Anders, six commits on day one speaks for itself. I
noticed the loop invariants already in place. That is exactly the spirit
the essay was hoping to inspire. Looking forward to watching xpact take
shape.
Kind regards
Finnian

> FYI: I am trying to build it
>
> https://github.com/andersoxie/xpact
>

--
SmartDevelopersUseUnderScoresInTheirIdentifiersBecause_it_is_much_easier_to_read

Finnian Reilly

May 28, 2026, 11:45:29 AM

to eiffel...@googlegroups.com

Hi Ian,
thanks for taking the time. You're right that the essay addresses only one layer of a multi-layer problem. Platform security is the foundation everything else rests on. But if attackers will always find the weakest layer, then making each layer as fortified as possible is the only rational response. Within the application layer, where most CVEs that affect end users actually originate, DbC addresses a class of logic errors that platform security cannot prevent and that Rust's model doesn't reach. The sudo-rs authentication bypass was a logic error that would have executed just the same on even the most secure platform. That's the specific gap the essay is addressing. The two concerns are complementary: platform security and application-layer correctness both matter. xpact addresses the latter.
Regards

Finnian

Finnian Reilly

May 30, 2026, 7:50:41 AM

to Eiffel Users

New section added: xpact vs expat — bridging the performance gap

A new section has been added to the eiffel.org essay addressing the performance question that any serious xpact proposal must answer: can an Eiffel XML parser compete with a highly optimised C library on raw throughput?

The section introduces C_STRING_8, a lightweight proof-of-concept class that wraps C-allocated memory directly via MANAGED_POINTER inheritance, with string operations delegated to memcmp and memcpy. The benchmarks comparing it against STRING_8 (SPECIAL-backed) show some striking results:

starts_with — C buffer 59.7% faster
occurrences — C buffer 22.3% faster
CSV parsing — C buffer 20.0% faster

The starts_with result is the most significant for xpact — token prefix recognition is the dominant operation in XML parsing, and a 59.7% advantage on that operation compounds substantially across a full document parse.

Beyond raw speed, the section discusses a zero-copy parse architecture using shared substrings — every token the parser recognises becomes a lightweight window into the original input buffer with no character copying and no additional heap allocation. For large documents this halves the GC object count per token compared to STRING_8, since C_STRING_8 requires only one heap object per string rather than the two that STRING_8 requires (instance + SPECIAL [CHARACTER_8]).
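The zero-copy idea can be shown with Python's built-in `memoryview`, which plays a role loosely analogous to a shared substring here. This is an analogy only, not C_STRING_8: the helper names `window` and `starts_with` and the hard-coded token offsets are assumptions for illustration.

```python
document = b"<root><item>text</item></root>"
buffer = memoryview(document)          # one view over the whole input

def window(view: memoryview, start: int, end: int) -> memoryview:
    """Return a token as a window into the input; no characters are copied."""
    return view[start:end]

def starts_with(token: memoryview, prefix: bytes) -> bool:
    return token[:len(prefix)] == prefix

# Offsets a tokenizer would compute; hard-coded here for brevity.
tokens = [window(buffer, 1, 5), window(buffer, 7, 11), window(buffer, 12, 16)]
names = [bytes(t) for t in tokens]     # copying happens only if a caller asks for it
```

Every token still refers to the original `document` object, so the character data exists once no matter how many tokens are recognised.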

The full section including Eiffel source listings is at:

https://www.eiffel.org/blog/Finnian%20Reilly/2026/05/finding-billion-user-project-eiffel-how-dbc-catches-security-flaws-rust-misses#xpact_vs_expat:_bridging_the_performance_gap

Benchmark source code + C_STRING_8 attached

Finnian

c_string_8.e

string_8_vs_c_string_8.e

Finnian Reilly

May 30, 2026, 10:30:57 AM

to Eiffel Users

Re: xpact vs expat — bridging the performance gap

Benchmark source code + C_STRING_8 attached

Definitive versions are on GitHub.

Ulrich Windl

May 31, 2026, 4:11:48 PM

to eiffel...@googlegroups.com

Well, the truth is that the contract would have prevented the bugs only if every case it covers had actually been exercised during testing, unless you ship production code with assertions enabled.
Contracts don't make code correct; they just flag bad code.

Ulrich

28.05.2026 15:34:14 Finnian Reilly <fin...@eiffel-loop.com>:

Finnian Reilly

Jun 1, 2026, 10:17:22 AM

to Eiffel Users

XML name interning: a zero-copy, cache-efficient strategy for xpact callbacks

New section added to the "Billion User" essay — 1,700 words

This new section addresses one of the subtler engineering challenges in making xpact a genuine libexpat drop-in replacement — satisfying the null-termination contract of the C callback API without abandoning the zero-copy parse architecture.

The solution turns out to be more elegant than expected, eliminating the need for GC pinning entirely through a new class C_NULLED_STRING_8 that allocates directly in C memory. The section covers:

The intern table — mapping non-null-terminated parse tokens to cached null-terminated C strings
C_NULLED_STRING_8 — why C-allocated memory sidesteps the GC pinning problem entirely
Why binary search rather than a hash table — the case for a sorted arrayed map list for typical XML vocabularies
Bucketing by first character — scaling gracefully to larger vocabularies
Linear or binary search chosen automatically — with a tunable threshold constant of 10
The unit test — proving identity rather than equality, with 12 'a' words exercising the binary search threshold
A C programmer's perspective — why the same architecture is hazardous to maintain in C but self-verifying in Eiffel
What this means for xpact — the complete Phase 2 performance roadmap
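As a rough illustration of the interning scheme outlined in the list above, here is a Python sketch. It is not the Eiffel-Loop implementation: the class `NameCache`, its layout (two parallel sorted lists per first-character bucket) and the exact way the threshold of 10 is applied are assumptions, and Python `bytes` objects stand in for C-allocated null-terminated strings.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

LINEAR_SEARCH_THRESHOLD = 10   # value from the summary; how it is applied is assumed

class NameCache:
    def __init__(self) -> None:
        self._names: Dict[bytes, List[bytes]] = {}     # first byte -> sorted names
        self._interned: Dict[bytes, List[bytes]] = {}  # same order, null-terminated

    @staticmethod
    def _find(names: List[bytes], token: bytes) -> Tuple[int, bool]:
        """Return (insert position, found): linear for short buckets, else binary."""
        if len(names) <= LINEAR_SEARCH_THRESHOLD:
            for i, name in enumerate(names):
                if name >= token:
                    return i, name == token
            return len(names), False
        i = bisect_left(names, token)
        return i, i < len(names) and names[i] == token

    def intern(self, token: bytes) -> bytes:
        """Return the one cached null-terminated copy of `token`."""
        key = token[:1]                                # bucket by first character
        names = self._names.setdefault(key, [])
        interned = self._interned.setdefault(key, [])
        i, found = self._find(names, token)
        if not found:
            names.insert(i, token)
            interned.insert(i, token + b"\0")
        return interned[i]
```

Calling `intern` twice with equal tokens returns the very same object, which is the identity (rather than equality) property the unit test in the list is described as checking.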

All classes — C_NULLED_STRING_8, C_NULLED_STRING_8_NAME_CACHE, and the unit test — are implemented and tested in Eiffel-Loop under the MIT licence, ready for Anders to draw on when Phase 2 begins.

https://www.eiffel.org/blog/Finnian%20Reilly/2026/05/finding-billion-user-project-eiffel-how-dbc-catches-security-flaws-rust-misses#XML_name_interning:_a_zero-copy,_cache-efficient_strategy_for_xpact_callbacks

Finnian

Finnian Reilly

Jun 2, 2026, 4:06:49 AM

to Eiffel Users

Ulrich, agreed: contracts don't make code correct, and the essay says so explicitly in the section "Addressing the honest objections". The specific claim is narrower: libexpat's CVEs were predominantly missing boundary specifications rather than incorrect implementations of stated specifications. A precondition on recursion depth doesn't fix bad logic; it makes the required constraint explicit and automatically verifiable during development. That's a different claim from 'contracts guarantee correctness'.

Eric Bezault

Jun 2, 2026, 5:31:51 AM

to eiffel...@googlegroups.com, Finnian Reilly

Hi Finnian,

Is the mapping C_STRING_8 -> C_NULLED_STRING_8 only needed for
element and attribute names? The advantages that you describe
in using C_NULLED_STRING_8 are based on the small size of the
vocabulary typically used in XML files, namely element and
attribute names. But what about attribute values, user text
between <foo> and </foo>, comments, etc. They are often unique
and can sometimes be large and/or numerous. Do these strings
also need to be mapped from C_STRING_8 to C_NULLED_STRING_8?

--
Eric Bezault <er...@gobosoft.com>
Eiffel expert - available for freelance work
https://www.gobosoft.com


Finnian Reilly

Jun 2, 2026, 12:11:35 PM

to Eiffel Users

xpact vs expat: bridging the performance gap (revised)

Hi Eric

thank you for your question, which has got me thinking more deeply about how an optimized xpact architecture might work. I have rewritten the section "xpact vs expat: bridging the performance gap" with substantially more technical detail, and I argue there that for large documents xpact might actually end up being faster. Here is the short answer to your question, quoted from the revised section:

Callback string types and the appropriate Eiffel class

Not all callback strings benefit equally from the name cache. The following table maps each libexpat callback type to its string representation, the appropriate Eiffel class, and whether caching is beneficial:

Finnian Reilly

Jun 2, 2026, 12:31:58 PM

to Eiffel Users

Eiffel's practical advantages for this specific task

I have added one extra item to the case for Eiffel as a more secure language in which to implement a widely used XML parser. It may make you smile :-)

Incidental Obfuscation

There is one unintended security benefit worth noting with a smile: EiffelStudio's generated C is so thoroughly transformed from the original Eiffel source that conventional C vulnerability research tools and techniques produce largely meaningless results against it. The security-relevant behaviour is expressed in the Eiffel contracts, not in identifier-mangled, macro-expanded generated C that bears no resemblance to hand-written code. This is not a security strategy anyone would recommend deliberately. But as an accidental consequence of the compilation model, it is not nothing.

Finnian Reilly

Jun 2, 2026, 12:39:29 PM

to Eiffel Users

Hi Friedrich,
your VBA experience is one of the most compelling endorsements of DbC I have seen — not because VBA is a great language for it, but precisely because it isn't. You found DbC valuable enough to implement it by hand in a language with no native support, which says everything about the practical worth of the idea.

Your point b) is particularly insightful and deserves more attention than it usually gets. Conventional quality tools — static analysers, code review, test coverage — all struggle to scale as codebases grow. DbC scales naturally because the contracts live in the code itself rather than in external tooling. A precondition on a routine is just as enforceable in a million-line codebase as in a hundred-line one.
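To make the "DbC by hand" idea concrete, here is a minimal sketch of what hand-rolled contracts can look like in a language with no native support. It is written in Python rather than VBA or Eiffel, and the names `require`, `ensure`, `ContractViolation` and `parse_percentage` are hypothetical helpers invented for this illustration; they are not taken from Friedrich's code or from any library.

```python
import functools


class ContractViolation(AssertionError):
    """Raised when a hand-rolled precondition or postcondition fails."""


def require(predicate, message):
    """Attach a precondition: predicate(*args, **kwargs) must hold on entry."""
    def decorate(routine):
        @functools.wraps(routine)
        def checked(*args, **kwargs):
            if not predicate(*args, **kwargs):
                raise ContractViolation(f"precondition failed: {message}")
            return routine(*args, **kwargs)
        return checked
    return decorate


def ensure(predicate, message):
    """Attach a postcondition: predicate(result) must hold on exit."""
    def decorate(routine):
        @functools.wraps(routine)
        def checked(*args, **kwargs):
            result = routine(*args, **kwargs)
            if not predicate(result):
                raise ContractViolation(f"postcondition failed: {message}")
            return result
        return checked
    return decorate


# The contracts sit directly on the routine, so they travel with the code
# however large the surrounding codebase becomes.
@require(lambda text: text.endswith("%"), "text ends with '%'")
@ensure(lambda result: 0 <= result <= 100, "result in 0..100")
def parse_percentage(text):
    """Convert a string such as '42%' to an integer percentage."""
    return int(text[:-1])
```

Calling `parse_percentage("42")` fails the precondition and `parse_percentage("150%")` fails the postcondition, in both cases at the routine boundary rather than somewhere downstream. Unlike Eiffel, nothing here gives you class invariants, contract inheritance or static verification; it only shows the discipline.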

This is exactly why xpact — the Eiffel XML parser proposed in the essay — uses DbC not just as a debugging aid but as the primary correctness argument. The contracts derived from the XML specification scale with the parser's complexity by construction.

In Eiffel of course you don't have to implement any of this by hand. It is native to the language, supported by the IDE, enforced automatically during testing, and statically verifiable via AutoProof. Everything you built in VBA by discipline is just part of the syntax.

Finnian

Ulrich Windl

Jun 12, 2026, 12:48:53 PM

to eiffel...@googlegroups.com

Hi!

A somewhat random comment: Some years ago (it was the time of the 200 MHz Pentium Pro) I wrote a parser for MIME e-mail messages in order to scan for malware. Using the absolute evil language (C), I thought about how that could be done efficiently, and (considering the small CPU cache) I came to the conclusion that I would have to avoid copying strings whenever possible and use references to substrings instead (MIME's multi-part messages may be nested just like XML can be, and the content can be encoded). The fortunate part was that the decoded string would always be shorter than the original, so I could replace it "in-place". You can guess memory management was a bit tricky, but it worked well.
When thinking about doing something similar for XML, the problem is entity references: a kind of encoding whose decoded form may be LARGER than the original, so working with zero-copy and substring references won't work.
Also, today's CPUs have caches so large that the whole of Windows 95 would fit into them, and memory bandwidth is probably 100 times as fast.
Still: Can an OO solution be time and space efficient?
(One of my experiences with SGML parsers was that they would need 32 MB of RAM (that was all I had at that time) to process a 1 MB input file. That made me think about the design.)
When processing MIME, one could save the message body to a string, then create an array for all the parts the main message defines, copying the parts, and then recursively parse each of the sub-parts for further sub-divisions, eventually applying the decoding into new strings, keeping the garbage collector busy, or having multiple equivalent copies of those parts in memory (or on disk). Obviously that was not what I wanted to do, even if that's the straightforward and clean design...
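The difference between substring references and copies, and the reason entity references break in-place decoding, can be sketched in a few lines. This is only an illustration in Python, not the original C code: the sample document and the `co` entity with its replacement text are made up, and `memoryview` stands in for a pointer-plus-length reference into the input buffer.

```python
# Input buffer; in the zero-copy approach everything else refers back to it.
document = bytearray(b"<note>Hello &co;</note>")
view = memoryview(document)

# A "substring reference": slicing a memoryview does not copy any bytes.
start = document.index(b">") + 1
end = document.index(b"</")
content = view[start:end]

# Proof that the slice shares storage: a change made through the buffer
# is visible through the slice.
document[start] = ord("J")
shared_text = bytes(content)

# Hypothetical internal entity whose replacement is longer than "&co;".
ENTITIES = {b"co": b"Example Corporation Ltd"}


def expand(reference: bytes) -> bytes:
    """Return the replacement text for a reference such as b'&co;'."""
    return ENTITIES[reference[1:-1]]  # strip the leading '&' and trailing ';'


reference = b"&co;"
replacement = expand(reference)

# In-place decoding only works when the decoded form is no longer than the
# encoded form; here it is not, so the expansion needs storage of its own.
fits_in_place = len(replacement) <= len(reference)
```

Here `shared_text` starts with `J`, showing that `content` is a window onto `document`, while `fits_in_place` is false because the 23-byte replacement cannot overwrite the 4-byte reference.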

Regards,
Ulrich

Finnian Reilly

Jun 24, 2026, 9:50:05 AM

to eiffel...@googlegroups.com

Xpact-core benchmark milestone: 2x eXpat throughput, same parsing window

Hi everyone,

I want to share a benchmark result that I think marks a genuine milestone for the Xpact-core project, not just an incremental tuning win.

For background, Xpact-core (https://github.com/finnianr/Xpact-core) is an Eiffel port of the eXpat XML parser. Both the C reference implementation and the Eiffel benchmark are published here, so that anyone can verify the comparison independently rather than take my word for it.

The benchmark counts how many full parsing passes each program completes in a fixed 2000 ms window, across three real test documents from the libexpat test corpus:

Passes completed per 2000 ms window (large XML files from the expat test data)

File                             eXpat (C)   Xpact-core (Eiffel)   Xpact-core relative to eXpat
ns_att_test.xml                  14          30                    2.14x
recset.xml                       17          35                    2.06x
wordnet_glossary-20010201.rdf    34          64                    1.88x
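For anyone checking the arithmetic, the last column is simply the Xpact-core pass count divided by the eXpat pass count for the same file. A small Python sketch that recomputes it from the figures above (the function name `relative_throughput` is just a label chosen for this example):

```python
# Pass counts per 2000 ms window, copied from the table: (eXpat, Xpact-core).
passes = {
    "ns_att_test.xml": (14, 30),
    "recset.xml": (17, 35),
    "wordnet_glossary-20010201.rdf": (34, 64),
}


def relative_throughput(expat_passes: int, xpact_passes: int) -> float:
    """Ratio of passes completed in the same fixed window."""
    return xpact_passes / expat_passes


for name, (expat, xpact) in passes.items():
    print(f"{name}: {relative_throughput(expat, xpact):.2f}x")
```

This prints 2.14x, 2.06x and 1.88x, matching the table.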

Methodology

Xpact-core is completing roughly twice as many parsing passes as eXpat across all three documents, in the same window, on the same machine. But there is one methodology detail worth being upfront about: garbage collection is disabled during each parse call in this benchmark, and a single GC pass is run after each parse to reclaim the small number of additional objects created. This is not a case of leaking memory to inflate the numbers. Xpact-core is deliberately designed to allocate extremely few additional objects during parsing, so there is very little for the collector to do regardless, and nothing allocated during the parse survives past the single cleanup pass afterward. Any time spent reclaiming memory is included in the benchmark.

Disabling GC during the parse and re-enabling it immediately afterwards is also how Xpact-core is intended to work in production use as a drop-in libexpat replacement. It adds an extra measure of safety, since we don't want arrays moving around while the callbacks are being processed on the client side. But this is entirely optional, and a caller is free to leave the collector running throughout if they prefer.
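For readers who want to see the shape of this measurement, here is a minimal sketch of a fixed-window pass counter with the collector off during each parse and one collection afterwards. It is not the published benchmark code: it is written in Python against the standard `xml.parsers.expat` binding, the names `parse_once` and `count_passes` are invented for the example, and Python's cycle collector behaves quite differently from EiffelStudio's GC, so only the structure carries over.

```python
import gc
import time
import xml.parsers.expat


def parse_once(data: bytes) -> None:
    """One full parsing pass over an in-memory document."""
    parser = xml.parsers.expat.ParserCreate()
    parser.Parse(data, True)  # True marks this as the final chunk


def count_passes(data: bytes, window_ms: float = 2000.0) -> int:
    """Count full parses started within a fixed time window.

    GC is disabled for the duration of each parse, and a single collection
    runs after it; that cleanup time falls inside the window.
    """
    gc_was_enabled = gc.isenabled()
    deadline = time.perf_counter() + window_ms / 1000.0
    passes = 0
    try:
        while time.perf_counter() < deadline:
            gc.disable()
            try:
                parse_once(data)
            finally:
                gc.enable()
            gc.collect()  # reclaim whatever the parse allocated
            passes += 1
    finally:
        if not gc_was_enabled:
            gc.disable()  # restore the caller's original setting
    return passes
```

Note one simplification: a pass that starts just before the deadline is counted even if it finishes slightly after it, so a careful harness would also record the actual elapsed time.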

Removing the main obstacle to adoption

I think this result matters for reasons beyond just "Eiffel is fast enough." When I started this port, the goal was to demonstrate that an Eiffel implementation, written with Design by Contract throughout, did not have to pay a performance tax against a mature, heavily optimized C library that has been the de facto standard XML parser for over two decades. Being merely competitive with eXpat would already have been a reasonable outcome. Being roughly twice as fast is the kind of result that changes the conversation from "is Eiffel viable for this kind of work" to "why would you reach for the C library at all."

More than just an AI port

Part of the reason Xpact-core gets there is that the port was not a mechanical line-by-line translation. A number of inefficiencies in expat's own internal design were identified and removed in the Eiffel rewrite. For strategic reasons I will leave the specifics out of this post for now. It would be to our disadvantage if the expat team fixed those inefficiencies before we have a chance to gift xpact to the Python community. We want people to "come for the speed, and stay for the DbC (fewer CVEs)".

Compiles out of the box

Some usability highlights worth mentioning: Xpact-core itself compiles fully void-safe, and has no dependencies beyond EiffelStudio's own library classes. Anyone wanting to use it this way, as a standalone, so-called native Eiffel parser, can clone the repository and build it out of the box.

The Eiffel-C bridge: gateway to Python adoption

Separately, Anders Persson is managing the C integration aspect of this project, aimed at making xpact a drop-in replacement for expat inside Python (https://github.com/andersoxie/xpact). That layer does introduce C dependencies, since it has to bridge into Python's C extension interface. The two efforts are complementary rather than identical: Xpact-core is the dependency-free Eiffel core, benchmarked above, and Anders' xpact project builds the C bridging layer on top of it for Python consumption.
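As a reminder of what "drop-in" has to preserve on the Python side, here is a short sketch of the callback interface that Python's standard `xml.parsers.expat` module exposes today. It says nothing about how the xpact bridge is or will be implemented; the helper name `collect_events` and the sample document are made up for the illustration.

```python
import xml.parsers.expat


def collect_events(data: bytes) -> list:
    """Parse a document and record the handler callbacks in order."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # merge adjacent character data into one call

    # These handler attributes are the existing pyexpat interface that
    # client code is written against.
    parser.StartElementHandler = lambda name, attrs: events.append(
        ("start", name, attrs)
    )
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = lambda text: events.append(("text", text))

    parser.Parse(data, True)  # True marks this as the final chunk
    return events


events = collect_events(b'<doc id="1">hi</doc>')
```

For this input the recorded events are a start event for `doc` with the attribute dictionary `{"id": "1"}`, a text event for `hi`, and an end event for `doc`. Any replacement sitting behind this module would presumably have to deliver the same sequence to existing client code.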

The unique security selling point

This result also connects to the broader case I made in an essay published on eiffel.org in May, on why Design by Contract catches classes of security flaw that Rust's type system does not (https://www.eiffel.org/blog/Finnian%20Reilly/2026/05/finding-billion-user-project-eiffel-how-dbc-catches-security-flaws-rust-misses). XML parsing is exactly the kind of attack surface where that argument has teeth: it is a security-sensitive, performance-sensitive, widely-deployed piece of infrastructure. Xpact-core is becoming a concrete demonstration that you do not have to trade safety for speed to get there, in either direction.

There is more migration and benchmarking work ahead, including extending these comparisons across a wider range of document shapes and sizes, but I wanted to share this number now since it is the first result solid enough to call a real milestone rather than a promising early sign.