> On Aug 28, 2022, at 7:19 AM, Antony Bartlett <
a...@akb.me.uk> wrote:
>
> I've written a short program (
https://github.com/Antony74/alt-mm) that lists out assertions (axioms, definitions, and proofs) that are identical (and therefore interchangeable).
This is really interesting!! Thanks for doing this!
...
> When I run it on
set.mm I get 661 lines of output. I'll reproduce the first 20 here. It's impossible to know if I've got this completely correct, but the output is very plausible, grouping similarly named labels suffixed with ALT or OLD.
>
> Each line contains the labels representing a group of repeated assertions, ordered by the first appearance
> a1ii, dummylink
> ax-mp, nic-stdmp, re1axmp, e0a
> ax-1, minimp-ax1, luklem5, ax1, tbw-ax2, retbwax2, re1tbw2, tb-ax2, wl-ax1, ax-frege1
> ax-2, minimp-ax2, ax2, re1ax2, wl-ax2, ax-frege2
> ax-3, con4, ax3, wl-ax3, ax3h
> mp2, e00
> a1i, wl-a1i, wl-impchain-a1-1
> imim2i, wl-imim2i
> id, idALT, id1, wl-id, frege27
> idd, frege26
Interesting results. I took a look at a few of these.
Sadly, the few cases I reviewed seem to be intentional:
* ax1 is a proof of axiom ax-1 starting from the *Lukasiewicz* axioms, showing
that our chosen axiom system is at least as powerful as the *Lukasiewicz* axioms
(in the sense of being able to prove at least as much).
There's a naming convention here: We generally know that names-with-hyphens
can be duplicated by same-name-drop-last-hypen, e.g., ax-1 <-> ax1.
Also, note that ax1 is marked
"(Proof modification is discouraged.)" and "(New usage is discouraged.)", suggesting
that this is a pretty special thing.
* idALT is an alternative proof of id. You can tell because its name adds "ALT" to another name.
* frege27 is specifically *commented* as being the same as id :
"Identical to ~ id". It's also marked as "(Proof modification is discouraged.)" - suggesting
that something weird/special is going on.
* con4 notes in its comments that it's an "Alias for ~ ax-3 to be used instead of it for labeling consistency."
So its comment specifically notes the duplicate. You could argue that this duplication
isn't helpful (why not just consistently use the name of the axiom directly?), but
it certainly isn't hidden.
I suggest adding a few heuristics to omit the *intentional* duplicates, so we
can see just the *unintentional* duplicates. I suggest these heuristics:
1. Omit names with ALT or OLD that appear to duplicate one without the marking;
we already know they're alternatives.
2. Omit names with comments that refer to the other duplicate proof, e.g., "~ id" in frege27.
Obviously we *knew* about the duplications in those cases, since they're expressly noted
in the comments.
3. Omit names that only different because of a hyphen (ax-1, ax1).
4. MAYBE: Omit names with comments "(Proof modification is discouraged.)" or
"(New usage is discouraged.)"; these are already highly specialized.
This specific heuristic might omit too much, so I suggest experimenting.
Maybe just omit those that say "(Proof modification is discouraged.)", since that
says we don't want to replace the proof during minimization?
Or maybe the other heuristics are good enough.
I suspect the tool will still find things with some heuristics, but its
results will then be focused on unintentional duplications.
What do you think?
> It uses the checkmm validator to parse the .mm file. Kind of wish I'd asked here first because it's slightly more involved than I'd anticipated to check if two assertions are the same. It's not just the expressions that have to match, it's any hypothesis expressions (unordered), and any disjoint variable conditions (unordered). What I realised but haven't explored is that an assertion lacking a particular hypothesis or disjoint variable condition is presumably still compatible, but only in one direction.
That would be interesting! But I think you'd have even more "false positives"
in the sense of looking for unintentional duplicates, so I think it'd be even more
important to filter out some cases.
I think there's a kernel of a good idea here, but it needs some refining/tweaking
to keep the output from being overwhelming.
--- David A. Wheeler