Size of Metamath/set.mm

Fabian Huch

Jun 5, 2025, 1:08:30 PM
to Metamath
I am currently creating a comparison of large libraries of ITPs. I am only considering libraries with more than 1M (nonempty) LOC, of which I know (in alphabetical order): ACL2, HOL4, Isabelle, Lean, Metamath, Mizar, PVS, Rocq.
To that end, I want to approximate the number of definitions (incl. axioms; excl. abbreviations, instances, fixed values/constants, examples) and proven proper theorems (excl. automatically generated ones) by simple analysis of the sources (without running the system).

For Metamath, my approach would be to look at tokens in `.mm` files and count $a for axioms and $p for theorems. Is this reasonable or is there a better way?
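
E.g., a rough first sketch of that token count (note that matches inside comments are counted too):

grep -o ' \$a ' set.mm | wc -l    # axioms and definitions
grep -o ' \$p ' set.mm | wc -l    # proven theorems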

David A. Wheeler

Jun 5, 2025, 3:09:15 PM
to Metamath Mailing List
That makes *overall* sense but there are some details you'll want to address:

* A comment can refer to an "$a" or "$p" statement. It doesn't happen *often*,
and checking for statement-like patterns should make that unimportant
(a comment-stripping sketch follows this list).
* In Metamath, definitions are also considered $a statements. The largest
Metamath databases use naming conventions to distinguish them
("ax-" means axiom, "df-" means definition). I would provide numbers for
those 3 cases (true axioms, definitions, and proven statements).
* Metamath has several databases. We normally distribute them compressed, which
through the magic of compression makes them smaller. When uncompressed,
several are *easily* larger than 1M nonempty LOC. In size order (starting from largest)
they are set.mm, iset.mm, nf.mm, and ql.mm. I would include at least
set.mm and iset.mm, as they're the largest, in especially active development,
and each has proofs the other doesn't. I'm not sure I'd include nf.mm or ql.mm
given your goals.
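
If you want to rule out comment matches entirely, you could strip comments
before counting. A quick sketch (using perl; Metamath comments are $( ... $)
and, per the spec, never nest):

perl -0777 -pe 's/\$\(.*?\$\)//gs' set.mm > set-nocomments.mm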

Here's a quick shell script I wrote to do those estimates
(including nf.mm and ql.mm in case you want them):

# Count per database, using the label conventions: "ax-" = axiom, "df-" = definition.
for f in set.mm iset.mm nf.mm ql.mm; do
    echo "$f:"
    echo -n "Axioms: "; grep -E ' *ax-.* \$a ' "$f" | wc -l
    echo -n "Definitions: "; grep -E ' *df-.* \$a ' "$f" | wc -l
    echo -n "Proofs: "; grep -E ' [^ ]+ \$p ' "$f" | wc -l
    echo
done

set.mm:
Axioms: 124
Definitions: 1401
Proofs: 45488

iset.mm:
Axioms: 89
Definitions: 349
Proofs: 14462

nf.mm:
Axioms: 43
Definitions: 159
Proofs: 5981

ql.mm:
Axioms: 21
Definitions: 29
Proofs: 1178

The number of axioms shown is a little misleading (on the high side).
For example, in set.mm, ax-1cn is an axiom stating that 1 is a complex number.
This statement is proven in set.mm, but we restate it as an axiom so that later
proofs that depend on it won't necessarily depend on the construction of 1.
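
For reference, the statement itself is just (quoting from memory, modulo whitespace):

ax-1cn $a |- 1 e. CC $.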

Still, that will give you numbers to work with.

(This is a resend, sorry if you got both.)

--- David A. Wheeler

David A. Wheeler

Jun 5, 2025, 3:11:00 PM
to Metamath Mailing List


> On Jun 5, 2025, at 11:56 AM, 'Fabian Huch' via Metamath <meta...@googlegroups.com> wrote:
>
> I am currently creating a comparison of large libraries of ITPs. I am only considering libraries with more than 1M (nonempty) LOC, of which I know (in alphabetical order): ACL2, HOL4, Isabelle, Lean, Metamath, Mizar, PVS, Rocq.
> To that end, I want to approximate the number of definitions (incl. axioms; excl. abbreviations, instances, fixed values/constants, examples) and proven proper theorems (excl. automatically generated ones) by simple analysis of the sources (without running the system).

Quick clarification: I didn't actually uncompress these databases (set.mm etc.)
and measure their uncompressed length. That said,
I'm pretty confident that uncompressed set.mm and iset.mm would be >1M nonempty
LOC just from eyeballing them.
I'm not as sure about nf.mm and ql.mm, though that's plausible.

If it's vital, I could uncompress and measure them. However, there's a *reason*
we normally work with compressed proofs :-). Most other systems, like Lean and
Isabelle, are happy with statements like "blast" that say "go rediscover the
proof" without actually showing the steps. A key differentiator of Metamath is
that it *never* skips proof steps, so its proofs contain steps other systems
would skip over. That doesn't mean we hate other systems, but it's important to
understand: in Metamath, *all* steps of a proof are always included.

By "uncompress" I simply mean the sequence of statements used in each proof.
If we truly uncompressed the proofs to show what humans can consider
(that is, showing the resulting statement after each proof step), all 4 of those
databases would *trivially* clear 1M lines, even if we skipped the syntax ("non-essential")
proof steps that are in the database.
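
If you did want the first kind of uncompressed length, the metamath-exe CLI can
convert proofs to the uncompressed ("normal") format. A sketch from memory
(untested; the output filename is just an example):

./metamath 'read set.mm' 'save proof * /normal' 'write source set-normal.mm' exit
wc -l set-normal.mm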

(This is a resend, sorry if you got this twice.)

--- David A. Wheeler

Mario Carneiro

Jun 5, 2025, 7:59:07 PM
to meta...@googlegroups.com
I think uncompressing is not really a good idea for this test: it just bloats the file, and for comparison with other ITPs it would not be fair, since you would need to similarly pretty-print the proof terms there. I think LOC of the regular file (which in our case means proofs are in compressed blocks) is the closest measure we have, representing approximately how much human typing went into the proof. (Humans don't type the compressed proof, of course, but the text they do type, in a proof assistant like mmj2, is probably of a similar order of magnitude to the compressed proof.)
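
(For what it's worth, nonempty LOC of the regular file is easy to count, e.g.:

grep -c '[^[:space:]]' set.mm

which counts the lines containing at least one non-whitespace character.)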

Besides, I don't think it matters much since the 1M cutoff was simply used to select the systems to consider, and the numbers which are reported are not about LOC but about number of theorems and definitions.

Fabian Huch

Jun 6, 2025, 6:07:01 PM
to Metamath
Thanks for the suggestions. In my comparison, I don't differentiate between axioms and definitions (those are all 'defined things' in my methodology, whereas theorems are 'proven things'), so counting $a fits well. I am aware that $a/$p may occur in comments as well, but I would opt not to filter them out: this happens in other ITPs too, and for some systems it is not feasible to tell the difference.
I agree with Mario that counting the lines of regular files, with compressed proof blocks, is the closest match to other ITPs; but yes, LOC is not great for comparing ITPs, since proofs can be of vastly different sizes across systems.
Of course this is still a quantitative comparison and won't tell you much about how interesting the theorems are ;) the 1000+ project would be better for that once widely adopted.

I am counting all databases in the set.mm repository even though they're separate. I think that's comparable to how some of the other ITP libraries are organized.
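
E.g., a rough sketch of the count over the whole repository (assuming GNU grep; comment matches included, as discussed):

git clone https://github.com/metamath/set.mm
grep -hoE ' \$(a|p) ' set.mm/*.mm | sort | uniq -c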

Fabian Huch

Jun 6, 2025, 6:07:01 PM
to Metamath
Thank you for the suggestions. Yes, keywords can occur in comments, but that is consistent with other ITPs. As I am counting definitions and axioms in one bin, $a seems fine.

I agree with Mario that the compressed proofs are the right ones to count, though LOC is not a very helpful metric.

(apologies if you receive multiple responses, my first response never appeared (?))