Question about mmj2 style unifiers

Marlo Bruder

unread,

Sep 27, 2025, 5:35:06 PMSep 27

to Metamath

Hello everyone,
I'm currently in the process of creating a mmj2 style proof assistant and I had another question. So far my unifier is able to:

1. Fill in step refs if the assertion and hypotheses are given
2. Unify the assertion and hypotheses if the step ref is given

Now I noticed that yammas unifier is able to do the following:

When given the following mmp file:

h1::test.2 |- ( ps -> ch )
h2::test.1 |- ps
qed:: |- ch

The unifier is able to do the following:

h1::test.1 |- ( ps -> ch )
h2::test.2 |- ps
qed:2,1:ax-mp |- ch

As you can see, despite the fact that the qed step previously had no hypotheses or step ref, the unifier was able to fill in both at the same time. Now my question is: How is this possible in a short period of time? Wouldn't the proof assistant have to look at every theorem in the database, and then try a whole lot of possible combinations of previous lines and hypotheses? That sounds like it would take forever. There are obviously some optimizations to be done (like when one hypothesis does not match a line, don't try it again), but even then my intuition tells me that it would still take practically forever, since multiple hypotheses could match multiple lines and each combination would have to be tried. Is there a smarter, faster way (than what is essentially a smart brute force) of achieving this that I am overlooking?

Best regards,
Marlo Bruder

Matthew House

unread,

Sep 27, 2025, 6:04:49 PMSep 27

to meta...@googlegroups.com

My understanding is that it really is just a smart brute force: the worst-case time complexity is something like O((# of theorems) × (# of preceding steps) ^ (# of hypotheses per theorem)). That's why 4syl is discouraged, since both its hypotheses and conclusion are very broad and can potentially match many different steps. In my experience with metamath-lamp's unifier, it tends to time out on theorems like syl333anc with loads of hypotheses, since there are simply too many possibilities for matching.

In practice, there are a few mitigating factors:

A unifier can first check that the conclusion matches the final line of the candidate theorem. Most theorems are very particular, so heuristics like (e.g.) "the candidate theorem's conclusion must have no constant symbols not present in the proof step" can quickly filter out the vast majority of theorems, without having to scan for hypotheses at all.
Of the remaining theorems, most have few hypotheses, and it works to eat the cost of scanning through all prior steps. It can still become an issue if you have a very large proof, which is one of the reasons to break them up into lemmas past a certain point.
Of the theorems that do have lots of hypotheses, most have very particular hypotheses, such as "|- F/ x ph" or "|- A = ( blah ` x )" that cannot match very many prior steps. So the overall combinatorial explosion can be broken up into hypotheses that match smaller subsets of the prior steps.

Matthew House

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/metamath/ddc0991a-7f2d-4f50-92f4-e504e4f038c9n%40googlegroups.com.

Glauco

unread,

Sep 27, 2025, 7:09:00 PMSep 27

to Metamath

Hi Matthew,

do you have a specific partial .mmp file where you would expect unification to fail or take forever so that I can test it with Yamma? (so far I've not been able to find one; with every new major update I try unification in the proof of ~ fouriersw that's more than 1500 steps and contains some steps with "many" hyps)

I've loaded the proof of ~ segconeq that contains two steps justified by ~ syl333anc . If I remove the labels and the refs from both steps, Yamma finds them with no lag. Maybe I've not understood the example you were referring to?

Thanks in advance

Glauco

p.s.
before trying what I call "step derivation" (not sure if mmj2 uses the same nomenclature), all parsing nodes for all theorems in set.mm need to be computed and in my slow VM it takes about one minute (it is done by a separate working thread in the background)

Matthew House

unread,

Sep 27, 2025, 7:20:33 PMSep 27

to Metamath Mailing List

I don't have any particular experience with Yamma or mmj2. I was mainly thinking back to my experiences with metamath-lamp, which runs in the browser and will obviously be more limited in just how far it can brute force. It has a set threshold past which it will give up on unifying a blank step, and instead spit out a warning with the last theorem it was attempting. When it does give up, it's usually stuff like syl32anc with lots of general hypotheses.

If you wanted a pathological test case that should mix up any unifier, I'd try writing a theorem with maybe a dozen hypotheses, some combinations of which share variables that don't appear in the final assertion, alongside a WIP proof with a lot of steps that can individually match any hypothesis, but with only one combination causing all the shared variables to line up correctly. The goal would be to rule out almost all possible shortcuts so it actually has to check most combinations.

--

You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/metamath/61681cd8-f663-4bd3-b650-06d076be97b1n%40googlegroups.com.

Glauco

unread,

Sep 27, 2025, 8:54:39 PMSep 27

to Metamath

If anyone has a particularly challenging step in a proof in set.mm where the step derivation could fail, please let me know. I'd be happy to try it.

As I've written, I usually check the largest proof I'm aware of and perform step derivation for a step that has many hypotheses and it's in the final part of the proof. Since it's at the end of the proof, the backtracking has 1500+ candidates for every hypothesis. Furthermore, the correct label is in the last part of set.mm, so tens of thousands of labels have to be tried before a solution is found.

But since I wrote the algorithm, I'm the worst person to try to test it, so adversarial tests are welcome.

Antony Bartlett

unread,

Sep 28, 2025, 5:10:49 AMSep 28

to meta...@googlegroups.com

While we're talking about unifiers, I'm still in the market for one which is api based with a thin command line wrapper. For all the reasons people usually ask for api or cli:

1. The usual way I write code is with a text editor and a terminal, and I like that seperation

2. A command line lets a programmer see what an api can do before getting their hands dirty

3. A unifier api would allow a programmer to focus on the front end of a proof assistant (my personal conceit, since 2018, is that I'd like to see/write an Incredible Proof Machine (vertex-edge-graph drag and drop toy proof assistant) with a Metamath back end, and, oh, maybe a maxGraph front end).

4. The collection of dockerised Metamath command line tools I (occasionally) maintain, is currently little more than a glorified collection of verifiers, with the sole exception of metamath.exe which can do pretty much everything except allow you to start a proof at the beginning or in the middle.

Thanks for listening to this aside. I should stop procrastinating and get back to whatever it was I was doing with checkmm.cpp & .ts ;-)

Best regards,

Antony

--

You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/metamath/e96ad325-e3c6-4689-9ad7-b5a25a2860a9n%40googlegroups.com.

Marlo Bruder

unread,

Sep 28, 2025, 6:29:03 AMSep 28

to Metamath

Hey Antony,

While we're talking about unifiers, I'm still in the market for one which is api based with a thin command line wrapper.

Could you elaborate on your proposal a bit? When you say "thin command line wrapper", do you mean a fully functional proof assistant (like a minimalist metamath.exe) or simply a command line tool that let's you call the proof assistant functionality (unify step, renumber, etc.)?

3. A unifier api would allow a programmer to focus on the front end of a proof assistant (my personal conceit, since 2018, is that I'd like to see/write an Incredible Proof Machine (vertex-edge-graph drag and drop toy proof assistant) with a Metamath back end, and, oh, maybe a maxGraph front end).

Another interesting proposal. Are you suggesting a drag and drop proof assistant where the proof lines are represented as vertices and hypothesis dependencies as edges? What would the advantage of such a proof assistant be in your opinion?

Until the end of October I'll still be busy with the first version of mmt1 (the proof assistant I am currently developing) and if there is interest from people in using mmt1 I'll probably be busy until next year developing future versions of mmt1. But after that I would be open to new ideas if you need any help.

Best regards,
Marlo Bruder

Glauco

unread,

Sep 28, 2025, 7:12:40 AMSep 28

to Metamath

Hi Antony,

I'm a bit busy, but if you have the time to write the spec of the API, it would be helpful.

Just a couple of unit tests in any language would probably be enough for us to understand the goal.

BR
Glauco

Antony Bartlett

unread,

Sep 28, 2025, 7:17:17 AMSep 28

to meta...@googlegroups.com

On Sun, Sep 28, 2025 at 11:29 AM Marlo Bruder <marlo.j...@gmail.com> wrote:

Hey Antony,

While we're talking about unifiers, I'm still in the market for one which is api based with a thin command line wrapper.

Could you elaborate on your proposal a bit? When you say "thin command line wrapper", do you mean a fully functional proof assistant (like a minimalist metamath.exe) or simply a command line tool that let's you call the proof assistant functionality (unify step, renumber, etc.)?

At its most fundamental level, a unifier takes a .mm file and an incomplete .uue file and returns a hopefully slightly more complete .uue file, doesn't it? That's what I'd like to see from a command line tool. And if it could do more, that would be a bonus. Combined with your preferred text editor, this does indeed become the bare bones of a proof assistant. And I suggested it would also be nice to have a unifier as an api, in which case the cli just becomes a thin wrapper over that api.

This is functionality which exists in at least three places - mmj2 (Mel O'Cat, Java) , yamma (Glauco, JavaScript), and metamath-vspa (Thierry, Rust). But to the best of my knowledge it has not been componentised and exposed in the kind of manner I was asking for, but it'd be nice to be wrong about that.

3. A unifier api would allow a programmer to focus on the front end of a proof assistant (my personal conceit, since 2018, is that I'd like to see/write an Incredible Proof Machine (vertex-edge-graph drag and drop toy proof assistant) with a Metamath back end, and, oh, maybe a maxGraph front end).

Another interesting proposal. Are you suggesting a drag and drop proof assistant where the proof lines are represented as vertices and hypothesis dependencies as edges? What would the advantage of such a proof assistant be in your opinion?

I just found The Incredible Proof Machine to be a fun introduction to formal verifiers, and think it would be great with Metamath behind it. It might not have any utility beyond engaging beginners, given that the Scratch programming language gives a visual introduction to programming which people quickly move beyond to something which is text file based, but the proof diagrams would still be pretty.

Antony Bartlett

unread,

Sep 30, 2025, 3:17:53 AMSep 30

to meta...@googlegroups.com

Sorry to anyone I confused by saying .uue when I meant .mmp, or misled by referring to yamma's platform (JavaScript) rather than its programming language (TypeScript).

Glauco made the excellent suggestion that I code a spec for this, and I am in the process of writing it. We first discussed the possibility of spinning off a unifier component from yamma at the end of 2022, I believe, so I don't have any sense of urgency at all about wanting this api if you are busy.

Best regards,

Antony

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/metamath/33412bc3-bde8-489a-a69b-ed6fa7f0b958n%40googlegroups.com.

Igor Ieskov

unread,

Sep 30, 2025, 4:26:38 PMSep 30

to Metamath

> My understanding is that it really is just a smart brute force: the worst-case time complexity is something like O((# of theorems) × (# of preceding steps) ^ (# of hypotheses per theorem)). That's why 4syl is discouraged, since both its hypotheses and conclusion are very broad and can potentially match many different steps. In my experience with metamath-lamp's unifier, it tends to time out on theorems like syl333anc with loads of hypotheses, since there are simply too many possibilities for matching.

Yes, that’s correct, in metamath-lamp it is just a smart brute force. I need to check, but I feel like the inability to unify syl333anc and similar assertions in some cases is not the browser limitation, it is rather not enough optimized implementation.

Maybe this will be useful:

In this comment Mario summarized his implementation of the unification algorithm in MM1.
This comment contains a few interesting examples of unification.

-
Igor

Igor Ieskov

unread,

Sep 30, 2025, 5:06:52 PMSep 30

to Metamath

I forgot to add, this comment contains another interesting example of unification that I wasn’t familiar with because I hadn’t used mmj2 enough.

-
Igor

Marlo Bruder

unread,

Oct 1, 2025, 12:53:59 AMOct 1

to Metamath

Thanks for confirming that it is just a smart brute force, erveryone!

The algorithm is not implemented.

Best regards,

Marlo Bruder

unread,

Oct 1, 2025, 12:57:04 AMOct 1

to Metamath

Nooo, I meant to say "The algorithm is now implemented". What an unfortunate typo. I shouldn't be sending emails this early.

Antony Bartlett

unread,

Oct 1, 2025, 4:55:01 PMOct 1

to meta...@googlegroups.com

For Glauco, and anyone else interested in how a simple API for performing unification might be specified, please find the promised spec in this repo:

https://github.com/Antony74/yamma-unifier-cli

I hope that is helpful, and I'm happy to receive feedback if it could be more so. I've written the command line interface (CLI) too - the API is mocked so it all runs but there's just the one file that can be unified (via hardcoding) at the moment.

I'm tempted to add code to create requested .mmp files from .mm files, as I don't believe yamma can do that yet? That would be horribly inefficient because I'd be using a different parser... well, verifier actually, which would make it even less efficient, but I think it would be quite easy to write.

Best regards,

Antony

On Sun, Sep 28, 2025 at 12:12 PM Glauco <glaform...@gmail.com> wrote:

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/metamath/33412bc3-bde8-489a-a69b-ed6fa7f0b958n%40googlegroups.com.

Glauco

unread,

Oct 1, 2025, 6:19:22 PMOct 1

to Metamath

Hi Antony,

here https://github.com/glacode/yamma/blob/master/server/src/__test__/MmpUnifier.test.ts there are dozens of unit tests along the lines of your repo.

You should be able to easily serve them using Express. Here's a simple example (but there are not more "complex" examples from your standpoint: only the .mm and the .mmp will be more complex)

First you create the mmParser

const mp2Theory = '$c ( $. $c ) $. $c -> $. $c wff $. $c |- $. $v ph $. ' +
'$v ps $. $v ch $. wph $f wff ph $. wps $f wff ps $. wch $f wff ch $.\n' +
'wi $a wff ( ph -> ps ) $.\n' +
'${ min $e |- ph $. maj $e |- ( ph -> ps ) $. ax-mp $a |- ps $. $}';
const mp2MmParser: MmParser = new MmParser(globalState);
mp2MmParser.ParseText(mp2Theory);
mp2MmParser.createParseNodesForAssertionsSync();

Then you can unify:

test('Unify 2 ax-mp', () => {
const mmpSource = `
* test comment

qed::ax-mp |- ph`;
const mmpParserParams: IMmpParserParams = {
textToParse: mmpSource,
mmParser: mp2MmParser,
workingVars: new WorkingVars(kindToPrefixMap)
};
const mmpParser: MmpParser = new MmpParser(mmpParserParams);
mmpParser.parse();
const mmpUnifier: MmpUnifier = new MmpUnifier(
{
mmpParser: mmpParser,
proofMode: ProofMode.normal,
maxNumberOfHypothesisDispositionsForStepDerivation: 0,
renumber: false,
removeUnusedStatements: false
});
mmpUnifier.unify();
const textEditArray: TextEdit[] = mmpUnifier.textEditArray;
expect(textEditArray.length).toBe(1);
const newTextExpected = `
* test comment

d1:: |- &W1
d2:: |- ( &W1 -> ph )
qed:d1,d2:ax-mp |- ph
`;
const textEdit: TextEdit = textEditArray[0];
expect(textEdit.newText).toEqual(newTextExpected);
});

That's it!

Glauco

unread,

Oct 1, 2025, 6:21:00 PMOct 1

to Metamath

here is the GlobalState I use for all the theories in the test suite

const variableKindsConfiguration: Map<string, IVariableKindConfiguration> = new Map<string, IVariableKindConfiguration>();
variableKindsConfiguration.set('wff', { workingVarPrefix: 'W', lspSemantictokenType: 'variable' });
variableKindsConfiguration.set('setvar', { workingVarPrefix: 'S', lspSemantictokenType: 'string' });
variableKindsConfiguration.set('class', { workingVarPrefix: 'C', lspSemantictokenType: 'keyword' });
export const lastFetchedSettings: IExtensionSettings = {
maxNumberOfProblems: 100, mmFileFullPath: '',
disjVarAutomaticGeneration: DisjVarAutomaticGeneration.GenerateNone,
proofMode: ProofMode.normal,
labelsOrderInCompressedProof: LabelsOrderInCompressedProof.mostReferencedFirstAndNiceFormatting,
diagnosticMessageForSyntaxError: DiagnosticMessageForSyntaxError.short,
variableKindsConfiguration: variableKindsConfiguration
};

export const globalState: GlobalState = new GlobalState();
globalState.lastFetchedSettings = lastFetchedSettings;

Glauco

unread,

Oct 1, 2025, 6:23:42 PMOct 1

to Metamath

As you see:

- you create a GlobalState
- then you create the MmParser
- then you create the MmpParser
- finally you create the MmpUnifier

Thierry Arnoux

unread,

Oct 10, 2025, 10:49:46 AMOct 10

to meta...@googlegroups.com, Antony Bartlett

About an API for a proof assistant, the LSP (Language Server Protocole) is a protocole for interactions between editors/IDE's and language servers.

Language servers typically provide syntax highlighting, contextual help, code completion, etc.

Unification could simply be code completion for a partial MMP file, or a special language server command.

An advantage of this approach is that this is an industry standard, and there is a long list of editors which support them.

Just my 2c!

BR,
_
Thierry

The advantage of a client-server architecture compared to a command-line tool is that the server can perform computing intense tasks once (when started, when a file is loaded) and keep that information. A command-line tool does not have any context and might have to reload/re-parse a lot of information at each call.

Antony Bartlett

unread,

Oct 10, 2025, 12:01:09 PMOct 10

to Thierry Arnoux, meta...@googlegroups.com

Yes, a language server does sound like the best approach. I'm trying to aim for the simplest.

Glauco - thank you very much for your help with that - sorry, I meant to say sooner - I got that working OK, but haven't got much further. Am planning on working on that this weekend.