(probable) bugs in javascript verifier

29 views
Skip to first unread message

Jim Kingdon

unread,
Oct 31, 2013, 11:40:11 AM10/31/13
to ghil...@googlegroups.com
OK, I found this months ago, and should have sent something to the mailing list, but I always thought I was on the verge of fixing it myself (which still could happen, but which hasn't happened yet).

Background:

* there is a testsuite and the python verifier passes all the tests (run "python verify_test.py testsuite" where python is python2, not python3).

* you can run the javascript verifier through the tests with "rhino js/verify_test.js testsuite" (don't know whether it would run on node too, but this is unlikely to be an issue of one javascript versus another).

The problem:

The javascript verifier shows five failures. Based on a quick glance, they would appear to be bugs that I fixed in the python verifier around Dec 2012 (for example, commit 3c24cf374363f0123584c19d753cb64594ee377c, but I think there was at least one more). Since the javascript verifier seems to have been (at least at one time) a straight translation of the python verifier, it may have carried over the bugs.

How you can help:

* fix these bugs in the javascript verifier

* write more tests (especially !reject ones; the !accept cases are probably reasonably well covered by the proofs at https://ghilbert-app.appspot.com ).

How I found this:

* there were proofs in the ghilbert-app repo which weren't verifying with the python repo, and I was assuming they had been written with the javascript verifier. Perhaps this is what was fixed by 2681c105b6adbeca22db7d4269241f9b761c9898 in that repo, but I'm not too sure.

Please do speak up. Don't know about anyone else, but I get more motivated when we talk about stuff on mailing lists as well as writing code.

Paul Merrell

unread,
Oct 31, 2013, 10:12:02 PM10/31/13
to ghil...@googlegroups.com
I have touched the javascript verify code, but I've tried to be careful to not change any behavior. I've been adding code to help display the proofs properly. I will try to look at this, but I've got several things almost done that I plan to finish first.

In the interest of speaking up, I wanted to explain what I've been working on the past several months and how excited I am about this. All this stuff is on the edit page for the proofs. (At some point I'll try to combine the edit page and the other page for the proofs.) I've added the ability to type in arbitrary expressions and Ghilbert will evaluate them like you type in (745 + 23) * 3 mod 7 and it will generate that (745 + 23) * 3 mod 7 = 1. I have it working for a lot of mathematical operations now. It also works for wffs like 12 < 347, in theory you can type in any wff that doesn't contain variables and it will give you a proof. It will generate a proof that it's true if it's true or if you type in 6 < 3 it will generate a proof that it's false. I also have this working for set operations. I'm still missing a bunch of operations, but I have the basic ones. (Also, this is only working in specific files in the peano directory.)

I've also added a way that Ghilbert can automatically suggest proofs you might want to use. All you do is add a ## <suggest> comment before the proof. Then when you're working on another proof and you type in any expression in the form that the theorem can use, it will suggest the theorem to use. It's super useful. All the commutative, associative, distributive properties get suggested. I was planning to add a bunch of tutorials explaining how to use all the tools I've added and then announce this to the group, but maybe I've waiting too long to tell people about it. I'm in the middle of working on a major UI upgrade and then I plan to work on some tutorials.

I've added a bunch of proofs related to recursion in the peano directory and I have also written one proof I'm particularly proud of. It will get better after the UI upgrade, but here's Euclid's theorem:


Paul



--
You received this message because you are subscribed to the Google Groups "Ghilbert" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ghilbert+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Adam Bliss

unread,
Nov 1, 2013, 5:07:36 AM11/1/13
to ghil...@googlegroups.com


On Oct 31, 2013 10:40 PM, "Jim Kingdon" <jkingd...@gmail.com> wrote:
> * you can run the javascript verifier through the tests with "rhino js/verify_test.js testsuite" (don't know whether it would run on node too, but this is unlikely to be an issue of one javascript versus another).

I just pushed an update with a little shim that allows it to work under node.js. I hope that my change does not disrupt your rhino workflow, but I did not have time to test it, so I apologize in advance if it does.

> Please do speak up. Don't know about anyone else, but I get more motivated when we talk about stuff on mailing lists as well as writing code.

+1 to this.

You can consider me also "on the verge of fixing" these JS verify bugs.

For a long time, and for reasons I can't quite explain, I have been obsessed with making a ghilbert theorem-prover with a point-and-click interface and a gamification layer. My latest working version is nicknamed Orcat and is hosted at http://orcat.org . One can currently get up to 86 points, by going as far as defining prime(n). However I doubt that anyone but me could actually do so, since the interface has been drifting, and has never been "good enough to stop and document".

In the course of developing it, I have been shearing away from ghilbert in a few philosophical areas, which I will now spew forth on rather than take the time to write a considered document. These do have a bit of causality flowing throw them; they are in roughly the order I came to them and each is partly motivated by the previous ones.

1. The user/player should be able to operate instantly on any node of an sexp-tree, without having to first prove a result about leaves and then "climb the tree" ("anim1i orim2i bibi1i" etc). That drudgery should be automated.
2. The user should see only deductions, and not inferences. I have retained modus ponens and a couple corresponding biconditional inferences under the hood, but they are invisible to the user. The main advantage of inferences is the reduction of boilerplate, which can be machine-generated.
3. .gh files should be considered machine-readable, not human-readable. Once you accept the machine-generated ghilbert code from #1 and #2, and once you see all the mandhyp text as pollution, this follows quickly.
4. Theorems should not be named. Naming things is famously one of the two hard problems in computer science (along with synchronization and off-by-one errors). There's a clear bifurcation between the metamath camp (short and cryptic) and the ProofWiki camp (long and verbose). I see that in the current ghilbert-app repository, we now have the worst of both worlds: an inconsistent agglomeration of the two. In practice, there is no naming system that doesn't suck. The variety of theorems is too great for a system which is consistent, general, easily memorized, and easily taught, and which provides quick mental access for more than a few people worldwide. What is needed instead is a good theorem search-engine. For compatability, Orcat generates theorem names from their statement sexps (e.g. "_dv_A_y___rarr_forall_z_rarr_equals_z_Oslash_A_rarr_forall_y_rarr_forall_z_rarr_equals_z_y_A_forall_z_rarr_equals_z_sect_y_A_forall_z_A".
5. Nor should there be names for variables, operators, kinds, or interfaces. There should be one universal content-addressable namespace for all ghilbert objects. If you and I prove the same theorem, in the same way, from the same axioms, we should be able to generate the same object. But this is impossible if we each choose our own names for everything (and our own ordering of axioms within an interface). Right now, I have only a vision for this and I don't know whether it's practical. We already have a pseudo-canonical naming scheme for variables within a stmt. My next step is going to be generating names for operators based on their defthms. Of course, these will be uncomfortably long, so I'm thinking about a compact binary encoding for sexps and/or a hashing function (but what about collisions?). For operators defined by axioms and not dethms, I want to come up with some name computed from "what axioms are true about this operator". Kinds, it seems to me, can be eliminated entirely and inferred at verify-time. An interface will be named after the sorted list of statements it contains.

My hope is to eventually have a script that can parse set.mm into theorem-objects, and parse ProofWiki and our own peano_thms.gh as well. All propositional-logic theorems will collide, even if the names are different. The peano theorems will be able to be automatically kind-bound to metamath's cardinals, since the theorems proved in metamath about cardinals will collide with the peano axioms required by our proofs. This will make practical a "global theorem search engine" which can search many different corpuses and convert between them. It will also be able to spit out verifiable Ghilbert versions of all objects; however the names will be opaque and one will need a kind of naming-layer (something like a stylesheet -- each camp can choose its own) in order to read the proofs.

So far I've gotten just far enough to give this harebrained scheme a name: "caghni" for "Content-Addressable GHilbert Namespace and Index". I'll be trying to develop it at http://github.com/abliss/caghni if my motivation stays high enough.

Paul Merrell

unread,
Nov 1, 2013, 2:44:00 PM11/1/13
to ghil...@googlegroups.com
On Fri, Nov 1, 2013 at 2:07 AM, Adam Bliss <abl...@gmail.com> wrote:

In the course of developing it, I have been shearing away from ghilbert in a few philosophical areas, which I will now spew forth on rather than take the time to write a considered document. These do have a bit of causality flowing throw them; they are in roughly the order I came to them and each is partly motivated by the previous ones.

1. The user/player should be able to operate instantly on any node of an sexp-tree, without having to first prove a result about leaves and then "climb the tree" ("anim1i orim2i bibi1i" etc). That drudgery should be automated.

I totally agree. In fact, I've already automated this. As you edit proofs on the edit page, you will see the last statement appear in the bottom right corner. You can click on parts of that statement to traverse the tree and move to the node you want then you can apply different theorems to that node like commuting, associating, and distributing. Any theorem with a <suggest> tag before front of it will appear as suggestion if it is applicable.

2. The user should see only deductions, and not inferences. I have retained modus ponens and a couple corresponding biconditional inferences under the hood, but they are invisible to the user. The main advantage of inferences is the reduction of boilerplate, which can be machine-generated.

I agree that gh files don't need to be human-readable if we have tools to automate the process of writing them, but I think inferences are useful in some cases because when we display the proofs they can make the proofs more human-readable. Consider the following examples:

I'd argue that using imim2i like this:
             φ  →       ψ
      (χ → φ) → χ → ψ

is much more readable than using imim2:
     (φ → ψ) → (χ → φ) → χ → ψ

And using eqcomi like this
     A = B
     B = A

is more readable than eqcom
     A = B ↔ B = A

I should add that I'm not 100% sure I understand what you're proposing, so I'm I may be misunderstanding something.

3. .gh files should be considered machine-readable, not human-readable. Once you accept the machine-generated ghilbert code from #1 and #2, and once you see all the mandhyp text as pollution, this follows quickly.
4. Theorems should not be named. Naming things is famously one of the two hard problems in computer science (along with synchronization and off-by-one errors). There's a clear bifurcation between the metamath camp (short and cryptic) and the ProofWiki camp (long and verbose). I see that in the current ghilbert-app repository, we now have the worst of both worlds: an inconsistent agglomeration of the two. In practice, there is no naming system that doesn't suck. The variety of theorems is too great for a system which is consistent, general, easily memorized, and easily taught, and which provides quick mental access for more than a few people worldwide. What is needed instead is a good theorem search-engine. For compatability, Orcat generates theorem names from their statement sexps (e.g. "_dv_A_y___rarr_forall_z_rarr_equals_z_Oslash_A_rarr_forall_y_rarr_forall_z_rarr_equals_z_y_A_forall_z_rarr_equals_z_sect_y_A_forall_z_A".

We do need a good theorem search-engine. I've added some theorem search ability in the archiveSearcher which suggests theorems you can use based on the s-expression. I think we should also keep in mind that just because two proofs have the same inputs and outputs doesn't mean they're the same. As Ghilbert develops, I think we would want to keep alternative proofs of the same theorem. For example prop.gh/id and prop.gh/id1.

5. Nor should there be names for variables, operators, kinds, or interfaces. There should be one universal content-addressable namespace for all ghilbert objects. If you and I prove the same theorem, in the same way, from the same axioms, we should be able to generate the same object. But this is impossible if we each choose our own names for everything (and our own ordering of axioms within an interface). Right now, I have only a vision for this and I don't know whether it's practical. We already have a pseudo-canonical naming scheme for variables within a stmt. My next step is going to be generating names for operators based on their defthms. Of course, these will be uncomfortably long, so I'm thinking about a compact binary encoding for sexps and/or a hashing function (but what about collisions?). For operators defined by axioms and not dethms, I want to come up with some name computed from "what axioms are true about this operator". Kinds, it seems to me, can be eliminated entirely and inferred at verify-time. An interface will be named after the sorted list of statements it contains.

My hope is to eventually have a script that can parse set.mm into theorem-objects, and parse ProofWiki and our own peano_thms.gh as well. All propositional-logic theorems will collide, even if the names are different. The peano theorems will be able to be automatically kind-bound to metamath's cardinals, since the theorems proved in metamath about cardinals will collide with the peano axioms required by our proofs. This will make practical a "global theorem search engine" which can search many different corpuses and convert between them. It will also be able to spit out verifiable Ghilbert versions of all objects; however the names will be opaque and one will need a kind of naming-layer (something like a stylesheet -- each camp can choose its own) in order to read the proofs.

So far I've gotten just far enough to give this harebrained scheme a name: "caghni" for "Content-Addressable GHilbert Namespace and Index". I'll be trying to develop it at http://github.com/abliss/caghni if my motivation stays high enough.

--

Adam Bliss

unread,
Nov 1, 2013, 6:22:18 PM11/1/13
to ghil...@googlegroups.com


On Nov 2, 2013 1:44 AM, "Paul Merrell" <pa...@merrells.org> wrote:
> I totally agree. In fact, I've already automated this. As you edit proofs on the edit page, you will see the last statement appear in the bottom right corner. You can click on parts of that statement to traverse the tree and move to the node you want then you can apply different theorems to that node like commuting, associating, and distributing. Any theorem with a <suggest> tag before front of it will appear as suggestion if it is applicable.

Neat! Looking forward to your tutorial of all the WhizzBang new features.

<suggest> is good for now, but one day I hope we can display all known theorems which unify, ordered by a TheoremRank which takes into account the popularity of the theorem as well as its relevance to the current situation.

> I think inferences are useful in some cases because when we display the proofs they can make the proofs more human-readable. Consider the following examples:
>
> I'd argue that using imim2i like this:
>              φ  →       ψ
>       (χ → φ) → χ → ψ
>
> is much more readable than using imim2:
>      (φ → ψ) → (χ → φ) → χ → ψ
>
> And using eqcomi like this
>      A = B
>      B = A
>
> is more readable than eqcom
>      A = B ↔ B = A

I agree, but this is something that the proof displayer can detect and format appropriately. It's not worth the extra cognitive load, extra theorem names, and extra code in the verifier for this to be encoded in the .GH file.  (eqcomi is just one special case of "applying eqcom to a node", having chosen the root node of the sexp tree.)

( I forgot to mention: once you do away with inferences, you can also remove the proof stack . at every step of the proof you simply have one valid theorem.)

Paul Merrell

unread,
Nov 1, 2013, 7:12:53 PM11/1/13
to ghil...@googlegroups.com
Interesting. I'm still not sure I fully understand. Maybe it would be good to consider an example. Say I'm writing an inductive proof. First, I prove the basis case and then I proof the inductive step. As it stands now, I have two valid theorems on the proof stack and then I use them both together. How would this work without a proof stack? In many proofs, it's useful to split things into many different cases, prove each case, and then combine them.



--

Adam Bliss

unread,
Nov 1, 2013, 7:28:32 PM11/1/13
to ghil...@googlegroups.com

You would probably still prove the base case, inductive step, and alpha-glue as helper theorems. Then you would start a new proof with the "induction deduction", which is just like the induction inference but rephrased to use -> instead of hypotheses. Then you'd manipulate that theorem until it looked like your goal, by "detatching the antecedents in place" using your helper theorems.

Okay, I realize that's not a very good explanation, so I'll copy an example from orcat. We're going to prove 

Jim Kingdon

unread,
Nov 2, 2013, 6:34:47 PM11/2/13
to ghil...@googlegroups.com
On 11/01/2013 02:07 AM, Adam Bliss wrote:
> I just pushed an update with a little shim that allows it to work
> under node.js. I hope that my change does not disrupt your rhino
> workflow, but I did not have time to test it, so I apologize in
> advance if it does.

Cool, thanks! Still works with rhino (and if it didn't, I would be glad
to switch to node).

> For a long time, and for reasons I can't quite explain, I have been
> obsessed with making a ghilbert theorem-prover with a point-and-click
> interface and a gamification layer.

Much the same lines along which I have been thinking. Even without
anything explicitly game-like, I find writing proofs to be addictive, so
there's gotta be a way to enhance that experience one way or another.

As for the rest, a lotta stuff here, so I'll try to be somewhat concise.

> 1. The user/player should be able to operate instantly on any node of
> an sexp-tree, without having to first prove a result about leaves and
> then "climb the tree" ("anim1i orim2i bibi1i" etc). That drudgery
> should be automated.

Definitely agree on this one, the only question is how. Generating the
part of the proof which invokes orim2i and such is the most
straightforward (which I guess is what Paul Merell did). Principia
Mathematica avoids this issue by having the metalogic have built-in
substitution rules. Unfortunately, they don't clearly state what those
rules are, which I believe is what Gödel is complaining about at
https://en.wikipedia.org/wiki/Principia_Mathematica#G.C3.B6del_1944 .
Margaris has much the same situation (assumes substitution, doesn't
clearly define it). In JHilbert, the automatic folding/unfolding of
definitions provides one kind of built-in substitution, which is what
makes
http://www.wikiproofs.org/w/index.php?title=Principia_Mathematica_propositional_logic
work in JHilbert and not ghilbert (for ghilbert, I took an alternate
route/axiomization more similar to what metamath does).

> 2. The user should see only deductions, and not inferences. I have
> retained modus ponens and a couple corresponding biconditional
> inferences under the hood, but they are invisible to the user. The
> main advantage of inferences is the reduction of boilerplate, which
> can be machine-generated.
>

I guess you'd also retain generalization (
http://us.metamath.org/mpeuni/ax-gen.html )?
>
> 3. .gh files should be considered machine-readable, not
> human-readable. Once you accept the machine-generated ghilbert code
> from #1 and #2, and once you see all the mandhyp text as pollution,
> this follows quickly.
>
Well, .gh is too low-level, yes. But it is nice to have something which
is human-readable, and it would be better if it were higher-level. One
of the really cool things about a metamath/ghilbert style system is that
the proofs are more readable than the dump of what something like coq or
isabelle deduces behind the scenes for you (my impression is that a
coq/isabelle style system can produce such a dump, just that it is even
less readable than a .gh file).

This is for debugging, but also because one of the big benefits for
proof systems is in education.

> 4. Theorems should not be named.
>

My inspiration here is software engineering (with a bias towards, say,
Ruby, or I guess other languages with object-oriented naming and some
kind of hierarchical module system). My take on it is that names are
quite important to humans, but that a single flat namespace (like
ghilbert's) leads to there being no good choices. Without having really
tried to design such a thing, I'm imagining a situation where you can
say "transitivity" and the system will notice whether you are dealing
with x < y and y < z (where x, y, and z are reals), A ⊆ B and B ⊆ C (for
sets), A = B and B = C, x > y and y > z, or what.

Making it possible to name things in hierarchical and context-dependent
ways is one of the key things which distinguishes the programming
environments of the 2000s from the programming environments of the 1960s.

> We already have a pseudo-canonical naming scheme for variables within
a stmt.

Meaning the way that the variables of kind formula are always φ, ψ, χ,
θ, etc? In my opinion that breaks down once you have more than about
three of them, which is why I prefer semantic names when I can think of
them (antecedent, common, etc). Or in a theorem like
http://us.metamath.org/mpeuni/imim12i.html I would prefer p0, p1, q0,
and q1 instead of p, q, r, and s.

> The peano theorems will be able to be automatically kind-bound to
metamath's cardinals, since the theorems
> proved in metamath about cardinals will collide with the peano axioms
required by our proofs.

Hmm, sounds ambitious.

> An interface will be named after the sorted list of statements it
contains.

I suspect you'll want to be able to add a statement without renaming the
interface.

Adam Bliss

unread,
Oct 18, 2014, 6:09:00 PM10/18/14
to ghil...@googlegroups.com
On Thu, Oct 31, 2013 at 8:40 AM, Jim Kingdon <jkingd...@gmail.com> wrote:
> How you can help:
>
> * fix these bugs in the javascript verifier

At long last, I found time to track these down, and I have just pushed
a couple of fixes which allow the JS verifier to pass all 95 tests.

--Adam

Jim Kingdon

unread,
Oct 30, 2014, 11:56:39 AM10/30/14
to ghil...@googlegroups.com
Hey, all that work looks awesome! Love all the new tests, and I learned
about istanbul-js (which
I hadn't been familiar with before).

I was able to get passing results on Ubuntu 14.04 using nodejs 0.10.25
and python 2.7.6.

I was also playing around with python3 at
https://github.com/jkingdon/ghilbert/tree/python3 but that branch
doesn't even pass the tests on python2 and the sexp_to_string code there
doesn't work. Given that appengine seems to mostly still be a python2
thing, I don't know if there is any strong reason to make this happen.
Some of the changes on that branch would be pretty easy to pull in,
especially if we assume python 2.6+ rather than something older.

On 10/18/2014 03:08 PM, Adam Bliss wrote:
> On Thu, Oct 31, 2013 at 8:40 AM, Jim Kingdon <jkingd...@gmail.com> wrote:
>> How you can help:s
Reply all
Reply to author
Forward
0 new messages