mmverify.py bug

83 views
Skip to first unread message

Jason Orendorff

unread,
Feb 12, 2007, 12:57:38 PM2/12/07
to Metamath Mailing List
Hey, I was looking at mmverify.py today (hoping to strip out the
parser for my own nefarious purposes) and I think I found a bug. See
the file test.mm at:

http://groups.google.com/group/metamath

Norm's metamath program rejects it, but mmverify.py passes it.

Also, if the input file contains an incomplete comment (just "$("),
mmverify.py gets into an infinite loop. It's the innermost loop in
MM.readc().

This isn't a problem for what I'm trying to do. Just posting this in
case Raph wants to know about it.

-j

Glauco

unread,
May 8, 2022, 8:17:26 AMMay 8
to Metamath
I can't find test.mm , but I'm using this old post to keep mmverify.py 'issues' in one place.

I've written a TypeScript parser/verifier (one for .mm and one for .mmp format) along the lines of mmverify.py , and I noticed that both my verifier (and mmverify.py) don't catch an error that metamat.exe instead does.

If you remove the line

vx.wal $f setvar x $.

from the block of

wal $a wff A. x ph $.

mmverify.py doesn't rise an error (of course, the first theorem that uses wal, that is trujust, will rise an error).

Instead, metamath.exe, when reading set.mm (then, even before 'verify proof *' ) returns this error:

The source has 200287 statements; 2648 are $a and 39659 are $p.

?Error on line 23725 of file "set.mm" at statement 4161, label "wal", type
"$a":
    wal $a wff A. x ph $.
                  ^
This variable does not occur in any active "$e" or "$f" hypothesis.  All
variables in "$a" and "$p" statements must appear in at least one such
hypothesis.

?Error on line 23725 of file "set.mm" at statement 4161, label "wal", type
"$a":
    wal $a wff A. x ph $.
The variable "x" does not appear in an active "$f" statement.

2 errors were found.


I've not looked at the spec in the metamath book, yet, to see if actually any var is required to have a kind (when used in assertions).

If I'm not wrong, other verifiers are "based" on mmverify.mm , so I decided to write this post.

Glauco



Mario Carneiro

unread,
May 8, 2022, 8:22:40 AMMay 8
to metamath
On Sun, May 8, 2022 at 8:17 AM Glauco <glaform...@gmail.com> wrote:
I've not looked at the spec in the metamath book, yet, to see if actually any var is required to have a kind (when used in assertions).

To this question specifically: Yes, every variable appearing in a $a statement must have an $f in scope. From the book (p.114):

> A $f, $e, or $d statement is active from the place it occurs until the end of the block it occurs in.
> A $a or $p statement is active from the place it occurs through the end of the database.
> There may not be two active $f statements containing the same variable.
> Each variable in a $e, $a, or $p statement must exist in an active $f statement.

Glauco

unread,
May 8, 2022, 9:54:35 AMMay 8
to Metamath
Thank you, Mario

There's also this footnote:
This requirement can greatly simplify the unification algorithm (substitution calculation) required by proof verification.

(I need to check if I was anyway assuming this to be the case, or if my verification / unification code can improve using this assumption)

Jim Kingdon

unread,
May 8, 2022, 11:08:39 AMMay 8
to meta...@googlegroups.com

Thanks for asking about this.

Here's a pull request to add this case to the metamath-exe testsuite: https://github.com/metamath/metamath-exe/pull/83

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/metamath/d6dd174f-b563-4adb-ad06-5c2531e0520en%40googlegroups.com.

Benoit

unread,
May 8, 2022, 11:53:00 AMMay 8
to Metamath
Thanks Glauco.  I'll try to fix this in mmverify.py next week (I'm currently away from my main computer), as well as the problem mentioned in the original post about the infinite loop in case of a non-closed comment. For the latter, it is probably sufficient to replace " while tok != '$)' " with " while tok and tok != '$)' ".  This would avoid the infinite loop, but would ignore a non-closing comment at the end of a file.  There are a few similar cases with other statements.  These could be fixed by handling "EOF" separately.  There is also a problem when a file inclusion statement is followed by some text in the same line.  So mmverify.py may currently accept some files which should be rejected, but I think it does not reject correct files (with complete proofs).

By the way: you were looking for a place to keep such issues.  A natural choice is https://github.com/david-a-wheeler/mmverify.py/issues

Benoît

Antony Bartlett

unread,
May 8, 2022, 12:11:26 PMMay 8
to meta...@googlegroups.com
Hi Glauco,

> I've written a TypeScript parser/verifier (one for .mm and one for .mmp format) along the lines of mmverify.py

Can I see, please?  My interest is that I may have almost completed a port of Eric Schmidt's C++ checkmm verifier to TypeScript


(I say "may" because I previously attempted this in 2019 and had a problem with the performance not being linear compared to the original.  It passed the test suite for all the smaller files, but had I left it running on set.mm I'm not sure whether it would have completed yet).

    Best regards,

        Antony


Glauco

unread,
May 9, 2022, 3:39:06 PMMay 9
to Metamath
Hi Antony,

if I remember correctly, my first implementation was going n^2, because I used something like

tokens = tokens.slice(1)

and

linesToParse = linesToParse.slice(1)

Are you still experiencing the non-linear behavior? (as a first step, I would look into your 'nexttoken' function)


I'm a complete newbie to Typescript and Node.js and my first parser (without verification) took 10 minutes to parse set.mm

Then I looked into Node.js profiling:

https://nodejs.org/en/docs/guides/simple-profiling/

used with

flame graphs
https://github.com/brendangregg/FlameGraph

and it was enough to spot the .slice(1) problem: the parse time (without verification) went down from 10 minutes to 5 seconds.

You may wish to try those two profiling tools in your project.


I've not published the parser, yet. It's a "component" of a larger project (a TypeScript language server for mmp files).
Unfortunately, the .mm parse logic is distributed in several files:
- the parser is a class with a dedicated file (it has a couple of bare-bones parse methods, one synchronous and one asynchronous)
- the verifier is a class living in its own file
- a tokenizer (every symbol is annotated with a "Range", info needed for producing meaningful language server diagnostics) lives in its own file
- every mm statement type has its own class (with its own methods); in particular, I have a BlockStatement class that has several properties and methods, used by the parser/verifier (and it lives in yet another file)
- the parser also generates a theory specific grammar for Nearly.js (a general purpose Early parser); and GrammarManager.ts is yet another file

I have zero experience in managing a github repository, that's why I prefer to get as close as possible to a "production" level, before publishing the repository on github.

HTH
Glauco

Antony Bartlett

unread,
May 9, 2022, 4:55:19 PMMay 9
to meta...@googlegroups.com
Thanks Glauco,

> (as a first step, I would look into your 'nexttoken' function)

Nice guess, you had me wondering!  But having tried it the performance of nexttoken() seems nicely linear, and I had to give it ten million tokens before I noticed it taking time.  https://github.com/Antony74/checkmm-ts/commit/cbccfd3305266531d11258a8e35ef909fd858b91

My latest port is not quite ready to parse an mm file, so I don't know if I'm going to experience non-linear behaviour this time or not, and I didn't manage to isolate the source of my problem last time round.  Maybe what I ported std::set to was a problem, maybe it was something else.  My instinct is that this time it will be fine, but if it isn't then I intend to get to the bottom of it!  The main reason I started again is that I'm a better TypeScript programmer than I was in 2019 and I don't feel like my original attempt at a port was sufficiently faithful to checkmm.cpp.

> I'm a complete newbie to Typescript and Node.js

That's cool, I was too once!  I might have a slight advantage in that I also use this tech-stack in my professional work.

> I have zero experience in managing a github repository, that's why I prefer to get as close as possible to a "production" level, before publishing the repository on github.

github specifically, or version control in general?  Once you get used to version control, you'll never want to do anything without it again!

I'm expecting the port to be some multiple slower than the original, by the way - speed isn't that important to me, having a very expressive language which can create programs that can be accessed from pretty much any device is what I like.

    Best regards,

        Antony


Benoit

unread,
May 11, 2022, 6:08:32 PMMay 11
to Metamath
The bug reported in the original post of this thread (checking unclosed comments) is fixed in https://github.com/david-a-wheeler/mmverify.py/pull/10.
I'll fix the other bug (checking for untyped active variables) tomorrowish.
Benoît
Reply all
Reply to author
Forward
0 new messages