Yet another new proof verifier

131 views
Skip to first unread message

Stefan O'Rear

unread,
Jun 18, 2016, 6:08:26 AM6/18/16
to metamath list
I went ahead and did the thing Mario and I discussed on the list back
in April; we now have a working example of a proof verifier that works
at a segment level, incrementally and taking advantage of parallelism.
The code and build/execution instructions are
https://github.com/sorear/smetamath-rs.

The more advanced features from SMM2, the grammatical parser and the
outline extraction, are not yet in place. Since the point of
parallelism is to be as fast as possible, the stats:

On my machine it's 6 times faster than the reference impl. All tests
using set.mm develop/0c32b31f. Unfortunately there's no overlap with
Dan Getz' list and I don't know his benchmark machine specs, so can't
directly compare the two.

uname -a: Linux sorear6 4.4.9-300.fc23.x86_64 #1 SMP Wed May 4
23:56:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

/sys/devices/virtual/dmi/id/board_version: MacBookAir7,2

/proc/cpuinfo: model name : Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
(2 cores, 2-way SMT ("hyperthreading"); I'd like to include memory
latency, but I can't figure out how to get those specs)

SMM3 (rustc 1.11.0-nightly (0554abac6 2016-06-10))

Performance counter stats for 'target/release/smetamath --verify
--split --jobs 4 /home/sorear/dlcode/set.mm/set.mm' (30 runs):

1955.445112 task-clock (msec) # 2.679 CPUs
utilized ( +- 0.15% )
366 context-switches # 0.187 K/sec
( +- 6.28% )
59 cpu-migrations # 0.030 K/sec
( +- 2.75% )
64,295 page-faults # 0.033 M/sec
( +- 0.09% )
4,915,730,304 cycles # 2.514 GHz
( +- 0.16% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
5,811,148,177 instructions # 1.18 insns per
cycle ( +- 0.00% )
1,179,294,370 branches # 603.082 M/sec
( +- 0.00% )
36,216,379 branch-misses # 3.07% of all
branches ( +- 0.04% )

0.729840747 seconds time elapsed
( +- 0.70% )

or, limited to a single thread:

Performance counter stats for 'target/release/smetamath --verify
--jobs 1 /home/sorear/dlcode/set.mm/set.mm' (30 runs):

1238.270068 task-clock (msec) # 0.999 CPUs
utilized ( +- 0.23% )
16 context-switches # 0.013 K/sec
( +- 22.09% )
0 cpu-migrations # 0.000 K/sec
( +- 36.27% )
63,299 page-faults # 0.051 M/sec
( +- 0.00% )
3,288,685,107 cycles # 2.656 GHz
( +- 0.17% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
5,819,668,862 instructions # 1.77 insns per
cycle ( +- 0.00% )
1,183,317,488 branches # 955.621 M/sec
( +- 0.00% )
34,112,392 branch-misses # 2.88% of all
branches ( +- 0.24% )

1.239175774 seconds time elapsed
( +- 0.23% )

metamath.exe (gcc version 5.3.1 20160406 (Red Hat 5.3.1-6) (GCC)):

Performance counter stats for '/home/sorear/dlcode/metamath/metamath
r '/home/sorear/dlcode/set.mm/set.mm v p * exit' (10 runs):

4811.725531 task-clock (msec) # 1.000 CPUs
utilized ( +- 0.42% )
31 context-switches # 0.006 K/sec
( +- 24.00% )
1 cpu-migrations # 0.000 K/sec
( +- 31.18% )
41,349 page-faults # 0.009 M/sec
( +- 0.00% )
12,782,471,756 cycles # 2.657 GHz
( +- 0.41% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
24,731,912,028 instructions # 1.93 insns per
cycle ( +- 0.01% )
4,994,967,694 branches # 1038.082 M/sec
( +- 0.01% )
128,510,117 branch-misses # 2.57% of all
branches ( +- 0.20% )

4.813443077 seconds time elapsed
( +- 0.42% )

SMM2 (node.js 5.11.0; 6.x is measurably slower :( )

Performance counter stats for '/home/sorear/dlcode/metamath/metamath
r '/home/sorear/dlcode/set.mm/set.mm v p * exit' (10 runs):

Performance counter stats for 'node
/home/sorear/active/smm/out/misc/verify.js
/home/sorear/dlcode/set.mm/set.mm' (10 runs):

7261.357665 task-clock (msec) # 1.043 CPUs
utilized ( +- 0.34% )
337 context-switches # 0.046 K/sec
( +- 3.58% )
38 cpu-migrations # 0.005 K/sec
( +- 5.13% )
114,217 page-faults # 0.016 M/sec
( +- 0.11% )
19,221,842,218 cycles # 2.647 GHz
( +- 0.33% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
35,100,750,527 instructions # 1.83 insns per
cycle ( +- 0.04% )
7,773,372,796 branches # 1070.512 M/sec
( +- 0.04% )
96,023,227 branch-misses # 1.24% of all
branches ( +- 0.13% )

6.960570593 seconds time elapsed
( +- 0.29% )

The incremental mode is much harder to show with benchmarks, but a
minimal recalculation with no cascading effects can be under 100ms.
Much of this is in the read() system call; if you split set.mm into
one file per section, it can notice unmodified sections with stat()
bringing the baseline down to 18ms.

-sorear

fl

unread,
Jun 18, 2016, 9:56:26 AM6/18/16
to Metamath


What would help is a new compression algorithm. I suspect that the current one reads
set.mm again every time it finds a compression. I wonder if it is not possible to have
a more linear algorithm. Using cache maybe.

--
FL

Mario Carneiro

unread,
Jun 18, 2016, 3:53:03 PM6/18/16
to metamath
You read and verified set.mm in 0.7 seconds?! That's phenomenal! That could completely change the way we interact with the database, since it's comparable to an mmj2 unify action.

A while ago you said you had a new verification algorithm for metamath that could unify long steps quickly - a.k.a. "METAMATH e. P". Could you give more details on how this works? Are you using it in this impl?

FL, I don't think Stefan's test is doing any compression. The only existing compression algorithms I am aware of are mmj2's (one proof at a time) and metamath.exe's (which is indeed not very efficient due to internally expanding the compressed proof to uncompressed before recompressing). The compression of a proof is essentially unique, so there is no need to look for new compressions after the initial compression of a proof. A compression algorithm is only needed when exporting proofs from some internal proof format.

If you meant minimization instead of compression, then yes there is a market for an algorithm that leverages the caches to minimize only the proofs that need it. However, there are a lot of things that invalidate a minimized proof (that is, running minimize again on it may minimize it still further); in particular, it is very difficult to tell if a new theorem is likely to be useful in a minimize run - potentially any new theorem may improve any later theorem. Possible improvements: any constant which appears in the consequent of a new theorem must also appear in some step formula of a later minimizable theorem. If we restrict to the present minimize algorithm, which never introduces new step formulas, this also applies to constants in the hypotheses of the new theorem. For grammatical systems, we can use syntax axioms in place of constants here.

(For clarification, I define "compression" as the interning of duplicated subproofs, and "minimization" as a search for shorter proofs of the statement, usually by local perturbation of the original proof and applying new theorems to skip some steps.)

Mario

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.
To post to this group, send email to meta...@googlegroups.com.
Visit this group at https://groups.google.com/group/metamath.
For more options, visit https://groups.google.com/d/optout.

Stefan O'Rear

unread,
Jun 18, 2016, 5:18:04 PM6/18/16
to metamath list
On Sat, Jun 18, 2016 at 12:53 PM, Mario Carneiro <di....@gmail.com> wrote:
> You read and verified set.mm in 0.7 seconds?! That's phenomenal! That could
> completely change the way we interact with the database, since it's
> comparable to an mmj2 unify action.
>
> A while ago you said you had a new verification algorithm for metamath that
> could unify long steps quickly - a.k.a. "METAMATH e. P". Could you give more
> details on how this works? Are you using it in this impl?

The "new verification algorithm" is of mostly theoretical interest.
While it's polynomial time in all cases (woo!) and deterministic, it's
only more efficient than naive verification using native strings once
the step length reaches about one million symbols (measured in SMM2).

I can talk more about it, it's quite interesting in its own right, but
it's not relevant to quick processing of set.mm.

-sorear

Stefan O'Rear

unread,
Jun 19, 2016, 5:28:54 AM6/19/16
to metamath list
On Sat, Jun 18, 2016 at 2:18 PM, Stefan O'Rear <sor...@gmail.com> wrote:
> The "new verification algorithm" is of mostly theoretical interest.
> While it's polynomial time in all cases (woo!) and deterministic, it's
> only more efficient than naive verification using native strings once
> the step length reaches about one million symbols (measured in SMM2).
>
> I can talk more about it, it's quite interesting in its own right, but
> it's not relevant to quick processing of set.mm.

To elaborate a bit more on this, verification in SMM3 mostly winds up
being a branch predictor benchmark, and everything I've tried that
increases the number of cases to consider has turned out to be a
pessimization.

Things that might help the verification loop are a branch-free varint
decoder and using fixed-length universal hashes instead of
variable-length steps.

Right now where I think optimization work would go best is improving
the parallelizable fraction.

-sorear

Stefan O'Rear

unread,
Jun 25, 2016, 4:34:58 AM6/25/16
to metamath list
On Sun, Jun 19, 2016 at 2:28 AM, Stefan O'Rear <sor...@gmail.com> wrote:
> Right now where I think optimization work would go best is improving
> the parallelizable fraction.

I was able to make a big improvement on that bit. The problem was bin
packing; Norm's mathbox is processed as a single chunk, and it's at
the end, which means that when we start 4 threads, usually 1 of them
will be stuck processing said mathbox while the other 3 are done.
Processing chunks in descending size order makes the parallel fraction
significantly larger. With that and a number of other improvements
I'm now getting:

Performance counter stats for 'target/release/smetamath --verify
--jobs 1 /home/sorear/dlcode/set.mm/set.mm' (40 runs):

1006.992560 task-clock (msec) # 1.000 CPUs
utilized ( +- 1.03% )
3 context-switches # 0.003 K/sec
( +- 19.66% )
0 cpu-migrations # 0.000 K/sec
( +- 38.12% )
42,386 page-faults # 0.042 M/sec
( +- 0.00% )
2,685,083,797 cycles # 2.666 GHz
( +- 1.02% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
4,627,729,291 instructions # 1.72 insns per
cycle ( +- 0.01% )
932,567,410 branches # 926.092 M/sec
( +- 0.01% )
28,340,960 branch-misses # 3.04% of all
branches ( +- 0.09% )

1.007464978 seconds time elapsed
( +- 1.03% )

Performance counter stats for 'target/release/smetamath --verify
--jobs 4 --split /home/sorear/dlcode/set.mm/set.mm' (100 runs):

1667.470675 task-clock (msec) # 3.405 CPUs
utilized ( +- 0.05% )
365 context-switches # 0.219 K/sec
( +- 4.12% )
46 cpu-migrations # 0.028 K/sec
( +- 4.03% )
48,961 page-faults # 0.029 M/sec
( +- 0.05% )
4,162,179,419 cycles # 2.496 GHz
( +- 0.05% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
4,646,942,880 instructions # 1.12 insns per
cycle ( +- 0.00% )
933,363,010 branches # 559.748 M/sec
( +- 0.00% )
30,894,271 branch-misses # 3.31% of all
branches ( +- 0.03% )

0.489711218 seconds time elapsed
( +- 0.19% )

( note that the operating clock frequency is substantially lower for
the parallel run, due to thermal management )

Has anyone other than me tried to build this yet?

-sorear

Dan Getz

unread,
Jun 25, 2016, 10:22:19 AM6/25/16
to Metamath
Just noticed this thread and haven't digested it yet, but I did benchmark SMM2 and SMM3 (nodejs and rust, respectively) and SMM2 was blazing fast (the fastest). SMM3 had speed like my own implementation. I will update the table in my README and then post the revised benchmark table.

David A. Wheeler

unread,
Jun 25, 2016, 10:42:31 AM6/25/16
to metamath
On Sat, 25 Jun 2016 01:34:56 -0700, "Stefan O'Rear" <sor...@gmail.com> wrote:
> I was able to make a big improvement on that bit...
> With that and a number of other improvements [verifying set.mm takes]
> Performance counter stats for 'target/release/smetamath --verify
> --jobs 4 --split /home/sorear/dlcode/set.mm/set.mm' (100 runs):
> 0.489711218 seconds time elapsed

Wow.

I propose that *all* the .mm files be routinely checked by at least 3 verifiers.
In particular, I think the .travis.yml file in set.mm should be modified to *also* run
Stefan O'Rear's smm (in addition to Norm's metamath in C and mmj2).
At these speeds, the download time will dominate the verification time :-).

Adding 1 or 2 more besides those 3 would even would be fine, e.g.,
mmverify (Python by Raph Levien) and/or checkmm.cpp (C++ by Eric Schmidt).
If nothing else, it'd be nice to have a relatively-easy command that could be used
to automatically download, install, and run them all on all the .mm files.

Once there's at least 3, I think the metamath web page should *trumpet* that the proofs
are verified by at least N independently-implemented verifiers.
A *big* advantage of metamath over many other systems is the availability of
many independent verifiers (due to the simplicity of its fundamentals),
which is a great counter to bugs in verifiers.

Stefan O'Rear: Could you modify smm/smetamath so it could be installed just by
running "npm install smetamath" or similar? You use npm to
install its *dependencies*, but it'd be easier if it could be installed directly with npm.
Also, the io.js / node.js fork has healed, so no need to talk about that.

--- David A. Wheeler

Stefan O'Rear

unread,
Jun 25, 2016, 3:06:34 PM6/25/16
to metamath list
On Sat, Jun 25, 2016 at 7:22 AM, Dan Getz <get...@gmail.com> wrote:
> Just noticed this thread and haven't digested it yet, but I did benchmark
> SMM2 and SMM3 (nodejs and rust, respectively) and SMM2 was blazing fast (the
> fastest). SMM3 had speed like my own implementation. I will update the table
> in my README and then post the revised benchmark table.

You appear to be saying SMM2 is faster than SMM3 for you, which is odd
as SMM3 is faster for me and by a huge margin (presently 15x).

1. Typo in your comment?
2. Did you forget `--release` when compiling SMM3? The rust compiler
generates … code worse than most splat compilers … with optimizations
disabled.
3. Something _very interesting_ is going on with your computer. If
the first two options are excluded, can you tell me your CPU model,
used versions of rustc and node, any easily obtainable
performance-counter data, and command lines used to obtain it?

-sorear

Stefan O'Rear

unread,
Jun 25, 2016, 3:18:47 PM6/25/16
to metamath list
On Sat, Jun 25, 2016 at 7:42 AM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
> I propose that *all* the .mm files be routinely checked by at least 3 verifiers.
> In particular, I think the .travis.yml file in set.mm should be modified to *also* run
> Stefan O'Rear's smm (in addition to Norm's metamath in C and mmj2).
> At these speeds, the download time will dominate the verification time :-).

It already did, back when set.mm was 22MB and could be verified in 5
seconds, at least with San Diego's ISPs (10Mbps on a good day…)

> Adding 1 or 2 more besides those 3 would even would be fine, e.g.,
> mmverify (Python by Raph Levien) and/or checkmm.cpp (C++ by Eric Schmidt).
> If nothing else, it'd be nice to have a relatively-easy command that could be used
> to automatically download, install, and run them all on all the .mm files.

Agreed. I want that too. Hoping someone else beats me to it, but I could.

> Stefan O'Rear: Could you modify smm/smetamath so it could be installed just by
> running "npm install smetamath" or similar? You use npm to
> install its *dependencies*, but it'd be easier if it could be installed directly with npm.
> Also, the io.js / node.js fork has healed, so no need to talk about that.

SMM2 is tricky to deal with because I started it during the ES6
transpiler era, and so it requires an extra build step that I don't
know how to automate with npm. I'm sure it can be done. Or, I could
translate SMM2 to the subset of ES6 which is supported by Node 5.x
(Node 6.x is measurably slower so I'm not sure I want to depend on
it). Either way it's more work than I really want to spend on an
architecture I've already abandoned, but I could probably be convinced
if you keep trying.

As for SMM3, it appears to already work:

[12:14:20 S6:~]$ cargo install --git https://github.com/sorear/smetamath-rs
Updating git repository `https://github.com/sorear/smetamath-rs`
Compiling unicode-width v0.1.3
Compiling vec_map v0.6.0
Compiling fnv v1.0.2
Compiling bitflags v0.5.0
Compiling strsim v0.4.1
Compiling libc v0.2.11
Compiling ansi_term v0.7.2
Compiling clap v2.5.2
Compiling filetime v0.1.10
Compiling smetamath v3.0.0-1
(https://github.com/sorear/smetamath-rs#c116c194)
Installing /home/sorear/.cargo/bin/smetamath
[12:16:20 S6:~]$ smetamath --verify --split --jobs 4 --timing
~/dlcode/set.mm/set.mm
parse 86ms
nameck 33ms
scopeck 116ms
verify 222ms
diag 0ms
/home/sorear/dlcode/set.mm/set.mm:443131-443133:Warning:This comment
marker must be the first token in the comment to be effective

-sorear

Mario Carneiro

unread,
Jun 25, 2016, 6:23:37 PM6/25/16
to metamath
I tried to download and install, and I was unable to link successfully:

$ cargo build --verbose
       Fresh bitflags v0.5.0
       Fresh strsim v0.4.1
       Fresh fnv v1.0.2
       Fresh libc v0.2.11
       Fresh vec_map v0.6.0
       Fresh ansi_term v0.7.2
       Fresh unicode-width v0.1.3
       Fresh filetime v0.1.10
       Fresh clap v2.5.2
   Compiling smetamath v3.0.0-1 (file:///C:/metamath/smetamath-rs)
     Running `rustc src\\main.rs --crate-name smetamath --crate-type bin -C opt-level=1 -C codegen-units=4 -g -C debug-assertions=on --out-dir C:\\metamath\\smetamath-rs\\target\\debug --emit=dep-info,link -L dependency=C:\\metamath\\smetamath-rs\\target\\debug -L dependency=C:\\metamath\\smetamath-rs\\target\\debug\\deps --extern fnv=C:\\metamath\\smetamath-rs\\target\\debug\\deps\\libfnv-4e449da685eb30c8.rlib --extern filetime=C:\\metamath\\smetamath-rs\\target\\debug\\deps\\libfiletime-7100ae4a7fd471a6.rlib --extern clap=C:\\metamath\\smetamath-rs\\target\\debug\\deps\\libclap-87fff2d3efdcc5bf.rlib -Z orbit`
error: linking with `link.exe` failed: exit code: 1
note: "link.exe" "/NOLOGO" "/NXCOMPAT" "/LIBPATH:C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "C:\\metamath\\smetamath-rs\\target\\debug\\smetamath.0.o" "C:\\metamath\\smetamath-rs\\target\\debug\\smetamath.1.o" "C:\\metamath\\smetamath-rs\\target\\debug\\smetamath.2.o" "C:\\metamath\\smetamath-rs\\target\\debug\\smetamath.3.o" "/OUT:C:\\metamath\\smetamath-rs\\target\\debug\\smetamath.exe" "/OPT:REF,ICF" "/DEBUG" "/LIBPATH:C:\\metamath\\smetamath-rs\\target\\debug" "/LIBPATH:C:\\metamath\\smetamath-rs\\target\\debug\\deps" "/LIBPATH:C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "C:\\Users\\Mario\\Documents\\metamath\\smetamath-rs\\target\\debug\\deps\\libclap-87fff2d3efdcc5bf.rlib" "C:\\Users\\Mario\\Documents\\metamath\\smetamath-rs\\target\\debug\\deps\\libbitflags-1efefdc447eb53f2.rlib" "C:\\Users\\Mario\\Documents\\metamath\\smetamath-rs\\target\\debug\\deps\\libansi_term-60226d3f8fc96ac9.rlib" "C:\\Users\\Mario\\Documents\\metamath\\smetamath-rs\\target\\debug\\deps\\libvec_map-d843695eff67d492.rlib" "C:\\Users\\Mario\\Documents\\metamath\\smetamath-rs\\target\\debug\\deps\\libstrsim-cb9b0c8013c1be6f.rlib" "C:\\Users\\Mario\\Documents\\metamath\\smetamath-rs\\target\\debug\\deps\\libfnv-4e449da685eb30c8.rlib" "C:\\Users\\Mario\\Documents\\metamath\\smetamath-rs\\target\\debug\\deps\\libfiletime-7100ae4a7fd471a6.rlib" "C:\\Users\\Mario\\Documents\\metamath\\smetamath-rs\\target\\debug\\deps\\liblibc-6b483f9a7097e9a4.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libstd-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcollections-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libpanic_unwind-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\librustc_unicode-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libunwind-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\librand-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\liblibc-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\liballoc-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\liballoc_system-8102e29f.rlib" "C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcore-8102e29f.rlib" "msvcrt.lib" "ws2_32.lib" "userenv.lib" "shell32.lib" "advapi32.lib" "msvcrt.lib" "compiler-rt.lib"
note: /usr/bin/link: extra operand '/LIBPATH:C:\\Program Files\\Rust nightly MSVC 1.11\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib'
Try '/usr/bin/link --help' for more information.

error: aborting due to previous error
error: Could not compile `smetamath`.

Caused by:
  Process didn't exit successfully: `rustc src\main.rs --crate-name smetamath --crate-type bin -C opt-level=1 -C codegen-units=4 -g -C debug-assertions=on --out-dir C:\metamath\smetamath-rs\target\debug --emit=dep-info,link -L dependency=C:\metamath\smetamath-rs\target\debug -L dependency=C:\metamath\smetamath-rs\target\debug\deps --extern fnv=C:\metamath\smetamath-rs\target\debug\deps\libfnv-4e449da685eb30c8.rlib --extern filetime=C:\metamath\smetamath-rs\target\debug\deps\libfiletime-7100ae4a7fd471a6.rlib --extern clap=C:\metamath\smetamath-rs\target\debug\deps\libclap-87fff2d3efdcc5bf.rlib -Z orbit` (exit code: 101)


I'm running on Windows via msys-git. Any ideas? I think it is related to https://github.com/rust-lang/rust/issues/27438 .

Mario


-sorear

Stefan O'Rear

unread,
Jun 25, 2016, 6:36:02 PM6/25/16
to metamath list
Judging from the error message, rustc appears to be invoking GNU link,
which … creates hard links. It is not a linker at all.

http://openjdk.5641.n7.nabble.com/OpenJDK-Windows-build-with-Visual-Studio-2010-Express-Edition-td81647.html#none
implies a possible PATH issue?

How did you install rust?
https://github.com/rust-lang-nursery/rustup.rs/#other-installation-methods
?

What happens if you try using the standard shell rather than the
msysgit bash shell? (I have no idea if this will help)

-sorear

Mario Carneiro

unread,
Jun 25, 2016, 7:07:38 PM6/25/16
to metamath
On Sat, Jun 25, 2016 at 6:36 PM, Stefan O'Rear <sor...@gmail.com> wrote:
Judging from the error message, rustc appears to be invoking GNU link,
which … creates hard links.  It is not a linker at all.

http://openjdk.5641.n7.nabble.com/OpenJDK-Windows-build-with-Visual-Studio-2010-Express-Edition-td81647.html#none
implies a possible PATH issue?

I think the answer is https://github.com/rust-lang/rust/issues/29479 - rustc is trying to use MSVC link.exe as the linker, and it is finding GNU link instead. I think I will have to install VS to get the right version (rustc used to use mingw's linker on Windows, but I don't know how to configure it).
 
Mario

Stefan O'Rear

unread,
Jun 25, 2016, 7:11:40 PM6/25/16
to metamath list
The impression I have from the documentation is that if you install
Rust using the official installer on rustup.rs, it will automatically
install and correctly configure the mingw linker for you, avoiding any
VS dependency. But I am not an expert on Windows environment
configuration.

(rustup.rs gives me a curl command; I _hope_ it's doing some kind of
user-agent sniffing and will point you to the windows installer. If
not, there's another link at
https://github.com/rust-lang-nursery/rustup.rs/#other-installation-methods
.)

-sorear

Mario Carneiro

unread,
Jun 26, 2016, 3:07:09 AM6/26/16
to metamath
FYI, installing visual studio fixed the link.exe problem, and it builds and runs now.

$ time target/release/smetamath --timing --jobs 4 --split --verify ../mm/set.mm
parse 96ms
nameck 42ms
scopeck 197ms
verify 311ms
diag 0ms
../mm/set.mm:443151-443153:Warning:This comment marker must be the first token in the comment to be effective

real    0m0.784s
user    0m0.015s
sys     0m0.015s

God, that speed is amazing.

Mario


-sorear

Stefan O'Rear

unread,
Jun 26, 2016, 3:40:36 AM6/26/16
to metamath list
On Sun, Jun 26, 2016 at 12:07 AM, Mario Carneiro <di....@gmail.com> wrote:
> FYI, installing visual studio fixed the link.exe problem, and it builds and
> runs now.

Great to hear that! Let me know if you have anything you think should
be added to the README.

> $ time target/release/smetamath --timing --jobs 4 --split --verify
> ../mm/set.mm
> parse 96ms
> nameck 42ms
> scopeck 197ms
> verify 311ms
> diag 0ms
> ../mm/set.mm:443151-443153:Warning:This comment marker must be the first
> token in the comment to be effective
>
> real 0m0.784s
> user 0m0.015s
> sys 0m0.015s
>
> God, that speed is amazing.

If I could trouble you to get measurements for these cases, it'd help
with my mental modeling:

1. metamath.exe or smm2 for a baseline of comparison

2. --jobs 1

3. --jobs (physical core count) --split

4. --jobs (logical core count) --split

(--split should be used if --jobs > 1 or --repeat is passed. This
will be handled automatically soon,
https://github.com/sorear/smetamath-rs/issues/13).

On my machine, (logical core count) is the fastest overall option, but
you never know.

(I'd also like to see your reaction to --repeat :) I seem to have
neglected proper documentation; it will read a line from standard
input, then do an incremental verify, then repeat, so you could try
making small changes to set.mm. --trace-recalc may provide more
insight, though it will throw off the timing.)

-sorear

Mario Carneiro

unread,
Jun 26, 2016, 4:15:43 AM6/26/16
to metamath
I have two cores, 4 logical processors.

$ time target/release/smetamath --timing --jobs 1 --verify ../mm/set2.mm
parse 153ms
nameck 36ms
scopeck 308ms
verify 562ms
diag 0ms

real    0m1.101s

$ time target/release/smetamath --timing --jobs 1 --split --verify ../mm/set2.mm
parse 213ms
nameck 49ms
scopeck 411ms
verify 624ms
diag 0ms

real    0m1.362s

$ time target/release/smetamath --timing --jobs 2 --split --verify ../mm/set2.mm
parse 97ms
nameck 38ms
scopeck 219ms
verify 372ms
diag 0ms

real    0m0.771s


$ time target/release/smetamath --timing --jobs 4 --split --verify ../mm/set2.mm
parse 84ms
nameck 37ms
scopeck 152ms
verify 227ms
diag 0ms

real    0m0.545s

$ time target/release/smetamath --timing --jobs 8 --split --verify ../mm/set2.mm
parse 94ms
nameck 35ms
scopeck 197ms
verify 274ms
diag 0ms

real    0m0.643s

$ time target/release/smetamath --timing --jobs 16 --split --verify ../mm/set2.mm

parse 81ms
nameck 34ms
scopeck 177ms
verify 247ms
diag 0ms

real    0m0.581s

$ time target/release/smetamath --timing --jobs 1024 --split --verify ../mm/set2.
mm
parse 76ms
nameck 42ms
scopeck 175ms
verify 246ms
diag 0ms

real    0m0.617s


Not much change above 4 jobs.  Repeat test:

$ time target/release/smetamath --timing --jobs 1 --split --repeat --verify ../mm
/set2.mm
parse 171ms
nameck 62ms
scopeck 331ms
verify 638ms
diag 0ms

<break 1stcrest>

parse 42ms
nameck 1ms
scopeck 96ms
verify 296ms
diag 0ms
../mm/set2.mm:11953911-11953914:Error:Token used here must be active in the current scope
../mm/set2.mm:12013571-12016547:Error:Step {step} referenced by proof does not correspond to a $p statement (or is malformed) step=1stcrest

<break 1stcrest differently>

parse 50ms
nameck 1ms
scopeck 92ms
verify 281ms
diag 0ms
../mm/set2.mm:11953882-11956291:Error:Final step statement does not match assertion
../mm/set2.mm:12013571-12016547:Error:Step used for $e hypothesis does not match statement

<do nothing>

parse 0ms
nameck 0ms
scopeck 11ms
verify 12ms
diag 0ms
../mm/set2.mm:11953882-11956291:Error:Final step statement does not match assertion
../mm/set2.mm:12013571-12016547:Error:Step used for $e hypothesis does not match statement

<restore 1stcrest>

parse 48ms
nameck 1ms
scopeck 118ms
verify 272ms
diag 0ms


I'm seeing about 0.3 seconds for the non-null changes, and basically nothing for an unchanged file. These times are even better than Eclipse's javac recompile + syntax highlighting response time. An Eclipse or emacs plugin is definitely feasible with this as a backend.

How are you planning the interface to SMM3? Are you going to keep it strictly input via command line args or are you going to add a metamath.exe style REPL? With this kind of read time it's certainly feasible to avoid holding on to memory for long periods, and just read, execute, quit, instead of acting as a service. (Now that I come to think about it, I guess --repeat is doing exactly this. Is it possible to write the cache info on disk so that you get --repeat level speed but with a separate process for each job?)

I was thinking of adding mmj2-compatible mmp file output for SMM3, as a starting project to get me acquainted with the architecture. Do you have any tips? (Also, what editor/IDE do you use for rust?)


-sorear

Mario Carneiro

unread,
Jun 26, 2016, 4:19:17 AM6/26/16
to metamath
Oops, forgot the baseline test, with metamath.exe:

$ time ../metamath.exe "r ..\mm\set2.mm" "ve p *" exit/f

real    0m10.426s

looks like you are about 20 times faster. (The above includes about 3s for reading and 7s for verification.)

Mario

Stefan O'Rear

unread,
Jun 26, 2016, 5:02:48 AM6/26/16
to metamath list
On Sun, Jun 26, 2016 at 1:19 AM, Mario Carneiro <di....@gmail.com> wrote:
> On Sun, Jun 26, 2016 at 4:15 AM, Mario Carneiro <di....@gmail.com> wrote:
>>
>> I have two cores, 4 logical processors.
[snip timings which are, in relative proportion, essentially the same
as mine; thank you]

>> $ time target/release/smetamath --timing --jobs 1024 --split --verify
I'd have recommended against the forkbomb personally, but I guess it
didn't break too much. :D

>> Not much change above 4 jobs. Repeat test:
>> <break 1stcrest>
[snip]
>> I'm seeing about 0.3 seconds for the non-null changes, and basically
>> nothing for an unchanged file. These times are even better than Eclipse's
>> javac recompile + syntax highlighting response time. An Eclipse or emacs
>> plugin is definitely feasible with this as a backend.

That change currently forces reverification of all proofs in 25
segments because it thinks all the topology-related math symbols were
potentially renamed. Assuming by "break" you meant changes to the
proof, it sounds like doing #1 and #3 from
https://github.com/sorear/smetamath-rs/issues/11 would speed up your
test dramatically.

>> How are you planning the interface to SMM3? Are you going to keep it
>> strictly input via command line args or are you going to add a metamath.exe
>> style REPL? With this kind of read time it's certainly feasible to avoid

That's the problem now; I haven't decided.

>> holding on to memory for long periods, and just read, execute, quit, instead
>> of acting as a service. (Now that I come to think about it, I guess --repeat
>> is doing exactly this. Is it possible to write the cache info on disk so
>> that you get --repeat level speed but with a separate process for each job?)

The cache info is 109 MB. My laptop SSD is ~ 200 MB/s sequential
reads (modern non-SSDs are nearly as fast, the difference is only huge
for random access); if you have to actually hit disk, it'd be
marginally faster to recompute.

Now, there are various forms of ultra-lightweight compression that
could be applied here (we don't _really_ need to dedicate 4 bytes to
every math symbol atom); we can also declare that some parts, like the
hash tables for statement labels (there are two of them, for a mixture
of good and bad reasons), are fast enough to rebuild after reading,
etc.

We can also use memory-mapped files here, so that if the data is hot
in your OS-level page cache, it can be linked into smetamath's address
space ~ instantaneously. This conflicts with using a compacted
representation, though.

Ultimately it's not clear how compelling an improvement a persistent
cache is over recomputing. During the threads last month, I was
talking about this as a possibility, but back then I was thinking a de
novo read would be at best twice as fast as metamath.exe, so the
question was more "4000ms to recompute or 5-600ms to read". I was
also underestimating the size of the cache data; it was 150MB this
morning, and _might_ still be improvable.

>> I was thinking of adding mmj2-compatible mmp file output for SMM3, as a
>> starting project to get me acquainted with the architecture. Do you have any

You mean something in the conceptual space of
https://github.com/sorear/smetamath-rs/issues/9 ? That's not a bad
idea, I expect the first non-verification feature additions to be
bumpy because of how focused the small-scale design has been on
verifying to this point. Even stuff like "get the names of all $e
hyps of a named statement" is a bit tricky right now.

Anyway, I'll help whereever needed; I'll be focusing on documenting
the architecture now that things have mostly settled down.

>> tips? (Also, what editor/IDE do you use for rust?)

Visual Studio Code (don't be fooled by Microsoft's co-branding, it's
FOSS, runs on Linux/OSX/Windows, and AFAIK shares no code with Visual
Studio Proper)

-sorear
Reply all
Reply to author
Forward
0 new messages