Technical Difficulties

Paul Merrell

unread,

Dec 18, 2013, 3:53:45 AM12/18/13

to ghil...@googlegroups.com

I'm flying to Virginia tomorrow, so I have to use my laptop to use Ghilbert. My laptop has worked fine in the past, but now it's not running Ghilbert and it's driving me crazy. First, the app engine wasn't even coming up, so I downloaded the new version 1.8.8. Now the app engine comes up, but it's acting like everything in the repository is missing.

For all the theorems, it says it can't find them. Same with the wiki. For the home page, the logo comes up, but it says the wiki template is missing. Also, I can't update my local version of the git directory. When I try to push, it says I'm behind and I need to pull, when I try to pull, it says the other end hung up. I really don't know how to debug this. I'm going to go crazy if I can't work on this the next two weeks. Here's some stuff it spit out:

2013-12-18 00:37:27 Running command: "['C:\\Python27\\pythonw.exe', 'C:\\Program Files (x86)\\Google\\google_appengine\\dev_appserver.py', '--skip_sdk_update_check=yes', '--port=8080', '--admin_port=8000', 'C:\\theorem\\ghilbert']"

INFO 2013-12-18 00:37:30,414 devappserver2.py:660] Skipping SDK update check.

WARNING 2013-12-18 00:37:30,426 api_server.py:331] Could not initialize images API; you are likely missing the Python "PIL" module.

INFO 2013-12-18 00:37:30,437 api_server.py:138] Starting API server at: http://localhost:61967

INFO 2013-12-18 00:37:30,443 dispatcher.py:171] Starting module "default" running at: http://localhost:8080

INFO 2013-12-18 00:37:30,450 dispatcher.py:171] Starting module "gc" running at: http://localhost:8081

INFO 2013-12-18 00:37:30,451 admin_server.py:117] Starting admin server at: http://localhost:8000

INFO 2013-12-18 08:37:56,526 appengine.py:211] getpacks, len(packs) = 0

INFO 2013-12-18 00:37:56,588 module.py:617] default: "GET /edit/peano/peano_thms.gh/a4i HTTP/1.1" 200 221

INFO 2013-12-18 00:37:56,664 module.py:617] default: "GET /favicon.ico HTTP/1.1" 304 -

Adam Bliss

unread,

Dec 18, 2013, 5:09:56 AM12/18/13

to ghil...@googlegroups.com

What version of python are you running? I had some problems (don't remember exactly what) when I updated to the new appengine SDK and it wanted a newer version of python.

You could try moving your git repo away and cloning a fresh one. Note that you generally can't clone a repo from ghilbert-app.appspot.com because its implementation of git is too slow. Instead you can clone from (e.g.) g...@github.com/abliss/ghilbert-app and then manually add ghilbert-app.appspot.com as a remote. I think Jim wrote a wiki page on the exact commands to achieve this.

--
You received this message because you are subscribed to the Google Groups "Ghilbert" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ghilbert+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Paul Merrell

unread,

Dec 18, 2013, 10:53:44 AM12/18/13

to ghil...@googlegroups.com

I'm using Python 2.7.4. What version are you using?

Jim Kingdon

unread,

Dec 18, 2013, 11:51:12 AM12/18/13

to ghil...@googlegroups.com

Not sure I follow exactly what you tried and what the error message is, but the directions for git are at:

https://ghilbert-app.appspot.com/wiki/Git

and I just expanded the section on pushing to a local ghilbert. Having the theorems missing is what I've seen whenever I restart appengine (might be something I could fix by setting up appengine more, but I just push from git whenever I start it).

Paul Merrell

unread,

Mar 23, 2014, 3:31:26 AM3/23/14

to ghil...@googlegroups.com

Somehow all the theorems have gone missing again for me locally. Looks like a similar issue to the one on this thread that I was having in December. I think I fixed it last time by redoing some of the steps in https://ghilbert-app.appspot.com/wiki/Git, but that's not working for me. When I try just following the last few steps "git push local" it says

"You are pushing to remote 'local' which is not the upstream of your current branch 'master', without telling me what to push to update which remote branch."

I have no idea what that sentence means. Can anyone translate that from nonsense into English?

When I start over with the instructions on https://ghilbert-app.appspot.com/wiki/Git and run git clone git://github.com/jkingdon/ghilbert-app.git, I get a version of theorems that is at least several months old.

Paul

Adam Bliss

unread,

Mar 23, 2014, 5:26:14 AM3/23/14

to ghil...@googlegroups.com

Try "git push local -f master:master"

For more options, visit https://groups.google.com/d/optout.

Jim Kingdon

unread,

Mar 23, 2014, 9:02:39 PM3/23/14

to ghil...@googlegroups.com

Yeah, that sounds right. I've streamlined the instructions at https://ghilbert-app.appspot.com/wiki/Git a bit which also might help.

Paul Merrell

unread,

Mar 23, 2014, 10:33:48 PM3/23/14

to ghil...@googlegroups.com

Still didn't work. It doesn't give me the error message it did before, but now it never finishes doing whatever it is doing. It says:

$ git push local -f master:master
Counting objects: 18816, done.

Delta compression using up to 8 threads.

Compression objects 100% (6867/6867), done.

Writing objects: 100% (18816/18816), 4.98 MiB, done.

Total 18816 (delta 11930, reused 18748 (delta 11883)

It's been sitting there for over an hour.

Paul

Adam Bliss

unread,

Mar 24, 2014, 4:06:28 AM3/24/14

to ghil...@googlegroups.com

My guess is that you have hit a scalability limit in the babygit python implementation.

I tried your command on my device and saw similarly poor performance. It seemed to get better if I changed "blocksize" and "nblocks" in "babygit/appengine.py". I can't say I understand exactly what they do, but I multiplied both by 100 and the "git push" command was able to finish.

I also noticed (by doing an strace) that the python app does an awful lot of file churning in /tmp. So you might get better performance by tweaking that filesystem (e.g. moving it to ramfs if it isn't already; or giving it more memory to work with).

One might hope for a way to skip all this git->appengine->blobstore stuff by simply copying files from .git/objects to /tmp/dev_appserver.blobstore/dev~ghilbert-app/, but I was not able to casually deduce the relationship between their structures.

Raph Levien

unread,

Mar 24, 2014, 9:13:38 AM3/24/14

to ghil...@googlegroups.com

It's unfortunately not all that hard to hit the scalability limits in babygit. I've been talking with Paul separately about this and it's clear that I either need to pour serious work into babygit performance (at which point I'd still run into problems such as the 32M limit in appengine, which would be a problem for clone), or strategically retreat. At this point, I think the simplest thing to do is to just use the main ghilbert repo to store both code and the repo (so from the purpose of the serving code it's pretty much just static). I'll send another mail out shortly expanding on this.

The nblocks and blocksize parameters control the in-process cache for reading packfiles. It's LRU otherwise. From this usecase, I think it might be more limited by individual loose object db writes.

Blocking for an hour seems wrong - I can imagine a couple minutes with 18k objects. But this is considerably bigger than the ~3k objects in my current repo, so it's possible you're running into dev app engine server issues.

Raph

Adam Bliss

unread,

Mar 25, 2014, 4:50:37 AM3/25/14

to ghil...@googlegroups.com

Perhaps this would be an appropriate time for me to pitch the storage model I'm pursuing with caghni ( http://github.com/abliss/caghni ):

1. Rather than the unit of storage being a file (.gh or ghi), the unit of storage is a single thm or stmt command.
a. This means taking a bit more space, since each thm must essentially redeclare the interface consisting of all its dependencies.
b. But there is enough information to pass between this storage system and the .gh/.ghi file format, in case some users want to interact with the system that way.

2. Rather than a diff-oriented DVCS with branching and history, the store is simply a flat collection of content-addressable documents.
a. This means replication should be pretty straightforward, even for very large repositories. See couchdb or camlistore for examples.
b. There is no need to ever delete from the datastore. Each thm that goes in is verified upon insertion, and is always guaranteed to be useful. If someone submits a newer, better proof of a given theorem, it can simply be ranked higher. If space becomes a concern, we can GC the old ones away at leisure, but there is no need to ever propagate deletes.

3. Naturally, this system lends itself to the main goal of caghni: mixing and matching theorems from different ghilbert domains (even ones that are based on different sets of axioms and use different spelling for their terms and kinds).
a. This problem has proven surprisingly thorny but I have a system more or less working now.

Any thoughts from the peanut gallery?

Raph Levien

unread,

Mar 29, 2014, 9:06:19 PM3/29/14

to ghil...@googlegroups.com

It sounds very interesting. I'm reluctant to push it in the mainline because I can see ranking of formally identical theorems as a very hard problem. To my way of thinking, the text around the theorems is just as important to manage as the theorems themselves.

So I think we have something of a fork. We can:

1. Go forward with babygit, pushing harder on scaling so we can get decent performance after, say, importing set.mm.

2. Use the main ghilbert project repo as the authoritative source for theorems, and do most serving statically out of that.

2a. Same as 2, but also allow saving theorems into the database, and serving those as an overlay over the static. We'd want to periodically merge theorems out of the database into the main repo.

3. Go forward with a git-based repo, but integrate a real git implementation. This would require moving off App Engine, because App Engine can't host real git.

4. Adopt a content addressible scheme such as caghni.

I really want to hear from people who actually contribute theorems which of these approaches are viable and which have serious problems. I would ideally like to get two-way bridging reestablished between Ghilbert and Metamath (for a while, a few years ago, I was fairly prolifically contributing theorems to set.mm which were actually developed with Ghilbert), and I feel that the current scaling issues would be a serious blocker.

So, what do people think?

Raph

Choice 2 clearly

Jim Kingdon

unread,

Mar 30, 2014, 4:26:26 PM3/30/14

to ghil...@googlegroups.com

Good summary of the issue, Raph. I'm not sure how many words to attempt on "people who contribute theorems" and how I have contributed theorems (or tried to), but I guess I'll stick to the git question for now.

With respect to content addressable stores, if caghni helps with the problem of tying together theorems written with different axioms, kinds, and terms, it is worth looking at. That's a problem I assumed would be more a matter of kindbind, proving one axiom set from another, and such things, but it is something I think is worth solving, one way or another. My biggest concern with caghni as an approach is readability: whether everything ends up hard to debug and understand, well, and Raph's concern about formally identical theorems not really being identical in all ways we care about.

As for babygit, whether to keep banging our head against that particular wall is in significant part up to Raph, since his is the head which has spent the most time in contact with that particular wall. But it does seem a bit dead-end-y.

Using a real git and hosting somewhere other than AppEngine seems potentially appealing, although that's especially true if we end up wanting a lot of non-AppEngine stuff other than just git. And it could end up bringing a new set of headaches.

Which leaves option 2. It seems to solve our problem without a bunch of new work.

Reply all

Reply to author

Forward