Artifact Evaluation

Joe Gibbs Politz

May 24, 2013, 8:52:23 PM
to lamb...@googlegroups.com
No TL;DR, please do read in full.

For next Saturday, June 1, we need to submit the code for \Py to be
evaluated alongside our paper. The link for the artifact evaluation
submission info is at

http://splashcon.org/2013/cfp/665

(Aside: There are a few reasons to do this:

- It encourages reproducible science
- It forces us to think about how we should distribute our research software
- It helps us clearly state what the software does and what it is for)


The main, most important, most absolutely crucial thing is that we
make sure the process for reproducing the results we present in the
paper is *flawless*. We don't want to cause any headaches for anyone
who wants to get started running our stuff, and from the point of view
of reproducing results, that's the baseline that we are responsible
for first.

The second important thing is to show that we (in large part thanks to
Alejandro's consistently fantastic efforts) have been *adding to* the
project since it was originally submitted. It's a continuing effort
to cover all of Python and more, and it's good for us to show that
lambda-py is still improving.

I think there are several things to produce:

1. A VM that contains both today's lambda-py, and the tag of lambda-py
at submission time. We'll encourage reviewers to review what we have
today, but be sure to include the old version so they can verify what
the paper claims. We should also make sure that the old tests are in
their own directory so reviewers can verify that the *new* code works
on the *submission-time* tests.

2. Detailed instructions on what commands to run to:
* Start the VM and log in
* Run the test scripts
* Interpret the output of the test scripts
* Check the line-count and type of files that we pass tests on

3. Good instructions for building things from source, which helps
reviewers see how easy it would be to contribute to and build upon the
project.

4. A clear description of what we've added since the submission, to
highlight what we've been working on. This is probably best expressed
by comparing the test suites at present day and at the submission tag.

5. Any code cleanups that folks feel strongly about getting in --- I
know that Matthew had some opinions about how scope desugaring was
done. I'd like to have all of these done by *Wednesday* at the
latest, so there are still two days in which to test the VM.

6. Any low-hanging fruit test cases that people want to tackle, again,
to be done by Wednesday. Don't bite off anything that requires any
infrastructural changes in the core.

5 and 6 are clearly for all of us, if we have the time available.

I'm happy to try and tackle 1 & 2 over the weekend/Monday, and I may
try to steal some of Daniel's time for that (he's sitting next to me this
summer). Alejandro, do you have the time to provide a summary for 4?
Can one of the others produce instructions for #3 if there are any
beyond what's in the repo's README already?

Finally, can someone produce instructions for running the tests and
understanding the output in detail *independently of me* so we can
compare notes (basically, the non-VM parts of 2)? If we can, we
should include instructions like showing reviewers how to break test
cases and read the output, and showing them how to modify a part of
lambda-py and understand what broke/changed.

We have an awesome submission to present, so let's package it up nice
for our reviewers!

Shriram Krishnamurthi

May 24, 2013, 8:56:33 PM
to lamb...@googlegroups.com
The most important thing: don't make anything WORSE! Do not break the
build! Do not try some quick hack that has bad unintended
consequences! We've come this far, it'd be a tragedy for some silly
typo to cause the artifact to fail muster.

Shriram

Alejandro Martinez

May 25, 2013, 10:00:56 AM
to lamb...@googlegroups.com
> Alejandro, do you have the time to provide a summary for 4?
Yes, I will write a summary of the features newly supported since the original submission, including which tests exercise them.

I could probably also look through the standard suite for tests related to these new features that don't require changes to the code to pass (item 6, the low-hanging fruit).

There were a couple of tests with commented-out sections (due to the use of unsupported features) that now pass fully. Is it enough to document those changes/additions, or would it be better to leave the originals as they were at submission time and create new ones?



2013/5/24 Shriram Krishnamurthi <s...@cs.brown.edu>

--
You received this message because you are subscribed to the Google Groups "lambda-py" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lambda-py+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.





--
Alejandro.

Joe Gibbs Politz

May 25, 2013, 1:00:50 PM
to lamb...@googlegroups.com
On Sat, May 25, 2013 at 10:00 AM, Alejandro Martinez
<amtri...@gmail.com> wrote:
>> Alejandro, do you have the time to provide a summary for 4?
> Yes, I will write a summary of the features newly supported since the
> original submission, including which tests exercise them.

Thanks!

> There were a couple of tests with commented-out sections (due to the use of
> unsupported features) that now pass fully. Is it enough to document those
> changes/additions, or would it be better to leave the originals as they
> were at submission time and create new ones?

I think it's fine to just document that they changed, without
generating whole new tests.

I used http://cloc.sourceforge.net/ to produce the line counts for the
paper, which doesn't count comments, so a new count will reflect the
updates we've made.
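
As a rough illustration of why un-commenting a section changes the count, here is a minimal sketch of a comment-skipping line counter for Python files. It is not how cloc works internally (cloc does considerably more, across many languages); `count_code_lines` is a hypothetical helper that only skips blank lines and lines whose first non-space character is `#`.

```python
def count_code_lines(source: str) -> int:
    """Count lines that are neither blank nor pure '#' comments."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


# A fragment with one commented-out line: only the other three lines count.
example = """\
class PropertyDocBase(object):
    _spam = 1
    # spam = property(_get_spam, doc="spam spam spam")
    spam = property(_get_spam)
"""
print(count_code_lines(example))
```

Once a commented-out line is restored to real code, it adds to the total, which is why a fresh count would differ from the one in the paper.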

Alejandro Martinez

May 26, 2013, 5:49:20 PM
to lamb...@googlegroups.com
Here is an initial version with my own additions. It doesn't mention the new native parser option, which I think should be added: https://github.com/brownplt/lambda-py/blob/master/CHANGES.txt



2013/5/25 Joe Gibbs Politz <joe.p...@gmail.com>





--
Alejandro.

Junsong Li

May 26, 2013, 8:41:42 PM
to lamb...@googlegroups.com
Building everything from scratch goes well until the "make" step:

for foo in `git grep -l 'lang plai-typed$' | grep '\.rkt$'`; do sed -i s/'lang plai-typed$'/'lang plai-typed\/untyped'/g $foo; done && git ls-files | grep '\.rkt$' | xargs raco make -j 5 && grep -r -l --color='auto' -P -n "[\x80-\xFF]" | sed s/^/"Warning: found non-ascii characters: "/g | grep -v pyc$ | grep -v zo$ | grep -v rkt$; make clean-python; true
usage: grep [-abcDEFGHhIiJLlmnOoPqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
[-e pattern] [-f file] [--binary-files=value] [--color=when]
[--context[=num]] [--directories=action] [--label] [--line-buffered]
[--null] [pattern] [file ...]
rm: ./__pycache__: is a directory
make[1]: *** [clean-python] Error 1

This is caused by differences in the "sed" and "grep" commands between FreeBSD (Mac) and Linux. What platform will be used when the artifact is evaluated?
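
One way to sidestep the GNU-versus-BSD differences would be to do the non-ASCII check in a small script instead of `grep -P`. This is only a minimal sketch, assuming Python 3 is available on the build machine. `find_non_ascii` is a hypothetical helper, not part of the current Makefile, and the skipped suffixes are my reading of the `grep -v` filters in the command that failed.

```python
import os

# Assumption: mirrors the `grep -v pyc$ | grep -v zo$ | grep -v rkt$` filters.
SKIPPED_SUFFIXES = (".pyc", ".zo", ".rkt")


def find_non_ascii(root):
    """Return sorted paths under root whose contents include any byte >= 0x80."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(SKIPPED_SUFFIXES):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            if any(byte >= 0x80 for byte in data):
                hits.append(path)
    return sorted(hits)


def report(root):
    """Print one warning per offending file, in the style of the Makefile message."""
    for path in find_non_ascii(root):
        print("Warning: found non-ascii characters: " + path)
```

Calling `report(".")` from the repository root would walk everything, including `.git`, so a real version would probably want to restrict itself to tracked files.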

Matthew Milano

May 26, 2013, 11:45:24 PM
to lamb...@googlegroups.com
The answer here is actually that we just need a different default
target. We should ideally have a fancy one that also does things like
check to make sure the tools we need to build and run *exist* before
building the project.

Would it be fine to have the code build easily on Linux and
less-easily on Windows and Mac, or are we going to try to make the
build on all platforms work nicely?

~matthew

Joe Gibbs Politz

May 27, 2013, 12:26:00 AM
to lamb...@googlegroups.com

Can't we just have them raco make python-main.rkt? Who cares about all the type checking options. That should work cross-platform.

Junsong Li

May 27, 2013, 1:05:47 AM
to lamb...@googlegroups.com
Then we need to update the README to reflect this.

Matthew Milano

May 27, 2013, 2:03:26 AM
to lamb...@googlegroups.com
I was thinking it would be nice to have a script that checks for the right racket version and installs the plt file for the language, though this is far from essential. 
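
A minimal sketch of the "do the tools exist" half of such a script, written in Python for illustration. The tool list is an assumption (`racket` and `raco` are the two commands mentioned in this thread), `missing_tools` is a hypothetical helper, and the version check and .plt installation are deliberately left out.

```python
import shutil

# Assumption: the commands the build instructions rely on.
REQUIRED_TOOLS = ("racket", "raco")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing required tools: " + ", ".join(missing))
else:
    print("All required tools found.")
```

`shutil.which` behaves the same on Linux, Mac, and Windows, which is the main reason to prefer it here over shelling out to `which`.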

Matthew Milano

May 30, 2013, 4:29:35 PM
to lamb...@googlegroups.com
I find myself with some time to devote to this tonight. What is the
biggest priority item that I could take on in an evening? I presume
there are some outstanding items that are currently being tackled by
Joe + others, and I don't want to step on toes/needlessly replicate
work.

~matthew

Joe Gibbs Politz

May 30, 2013, 4:33:10 PM
to lamb...@googlegroups.com
OK, sorry for the delay, this ended up taking much longer than I
thought. Lots of big files and careful running of tests and creating
things in the right format.

Here's what I want to present to the AEC:

http://cs.brown.edu/research/plt/dl/lambda-py/ae/index.html

For anyone who has time, PLEASE go through as many of the instructions
as you can and let me know if anything doesn't work.

If you have interesting concrete examples in the "Playing around with
lambda-py" section, maybe we could add those. Other than that, I only
want bugfixes to the prose and whatnot. My goal is to submit tomorrow
morning/afternoon while I have a fast Internet connection (it's roughly
a 1 GB file).

On Mon, May 27, 2013 at 2:03 AM, Matthew Milano <mat...@cs.brown.edu> wrote:

Matthew Milano

May 30, 2013, 4:40:35 PM
to lamb...@googlegroups.com
One minor correction - unless there has been substantial work done on
phase1/phase2 since the paper, you've got the purposes of those files
flipped. I've corrected it and pushed it to the git repo.

In general, should we make changes to the index.html in the git repo
(assuming it will be synchronized), or should we email you with
changes that need to happen?

~matthew

Shriram Krishnamurthi

May 30, 2013, 4:46:17 PM
to lamb...@googlegroups.com
It's really crucial that this be double/triple checked with the most
anal reading of every word Joe has written. I caught/improved a few
things yesterday. Others should take a whack at it too.

If you only have time to read, please at least do that; if you can, please
also download and run through the instructions.

At this point, it is far more important that the instructions and VM
and whatnot be absolutely perfect, than that we get a handful of
additional tests through. A few more tests are a small win that nobody
will notice; but any mistakes in the artifact/bundling/instructions
will be a huge loss that will be noticed in a big way.

Anyone reading this paper will absolutely want to see us say "the
artifact for this paper met/exceeded expectations", or else be very
suspicious (since the whole point of the paper is a *tested*
semantics). So any flaws in the bundling will undo the months
(person-*years*) of great work that have gone into this.

Joe Gibbs Politz

May 30, 2013, 6:18:32 PM
to lamb...@googlegroups.com
Make changes to index.html in the artifactevaluation branch and push.
I have a script to rebuild the tarball for now.

Daniel Patterson

May 30, 2013, 6:20:33 PM
to lamb...@googlegroups.com
Is it intended that the current number of tests says [FILL]?

Joe Gibbs Politz

May 30, 2013, 6:32:59 PM
to lamb...@googlegroups.com
It shouldn't any more. I was waiting until all tests were in, but
we're past that point now. It should say 206 (at least that's how
many tests I count), and we should verify that with running the tests
in the VM.

Shriram Krishnamurthi

May 30, 2013, 6:35:24 PM
to lamb...@googlegroups.com
Agreed. Let's seal up and move on. More important to get the bundling right.

Daniel Patterson

May 30, 2013, 6:40:31 PM
to lamb...@googlegroups.com
Typo, I think:

The instructions refer to running:

racket python-main.rkt --test ../../lambda-py-oopsla2013/tests/python-reference/

But I believe that should be

racket python-main.rkt --test ../../lambda-py-28-march-2013/tests/python-reference/



On Thu, May 30, 2013 at 6:35 PM, Shriram Krishnamurthi <s...@cs.brown.edu> wrote:
> Agreed. Let's seal up and move on. More important to get the bundling right.
>

Joe Gibbs Politz

May 30, 2013, 6:41:35 PM
to lamb...@googlegroups.com
Yup that's a typo. Thanks.

Daniel Patterson

May 30, 2013, 6:47:33 PM
to lamb...@googlegroups.com
Also, some cross platform results:

On an ubuntu machine I have, I was able to run the machine from the
version of virtualbox that's in the repos (this was true at least
yesterday). The laptop is on campus now, so I don't know what version
it is, but when I try to run it from the version that is currently in
debian testing, 4.1.18, it never starts up - with both the .vbox file
and manually creating a machine with that hard disk, it never gets
past "Starting Virtual Machine 0%".

On a mac laptop I was able to test with, all works fine with the
latest version from virtualbox.org (4.2.12). But we might want to tell
people to get the latest version (4.2), in case older versions are
getting confused (unless it is just a weird thing with my machine,
which it could be).

Alejandro Martinez

May 30, 2013, 7:21:09 PM
to lamb...@googlegroups.com
I am running the VM on 32-bit Windows Vista with VirtualBox 4.2.12, without issues on the basic tests.

The command line to run the full test suite is missing:
$ cd ~/lambda-py/lambda-py-artifact-submission/base
[FILL] tests succeeded
0 tests failed
Another minor detail: the screen captures do not show when index.html is opened from the folder created from the tarball.

I will try to run the full suite.

2013/5/30 Daniel Patterson <daniel_p...@brown.edu>



--
Alejandro.

Daniel Patterson

May 30, 2013, 7:21:12 PM
to lamb...@googlegroups.com
Hmm. On "test_property_getter_doc_override.py" from the property
folder of the 28 march 2013 tests, I get a failure:

TypeError: unexpected keyword argument(s): {doc: spam spam spam}

This is from the first step in the "Step By Step Instructions".

Daniel Patterson

May 30, 2013, 7:27:29 PM
to lamb...@googlegroups.com
Note that my previous failure was from, as the instructions specified,
using the CURRENT code to run the OLD set of tests (which should be a
subset of the current tests).

I've just run it again, on just the property tests, and get the same
error - 8 tests pass, 1 fails, with the message given.

I'm running the full current suite now, we'll see what I get.

Joe Gibbs Politz

May 30, 2013, 7:32:34 PM
to lamb...@googlegroups.com
Run on just the property tests in the new directory to see if the test
has changed?

Matthew Milano

May 30, 2013, 7:40:22 PM
to lamb...@googlegroups.com
I'm using VirtualBox 4.1.26 and am able to successfully run the VM
without incident. I'm using 64-bit gentoo linux. 4.1.26 is the
most-recent version marked as stable in gentoo.

So it looks like the VM does not require VirtualBox 4.2 or later. I
think trying to pinpoint which versions of VirtualBox best run our VM
may not be an optimal use of our time; I think if we say "tested in VB
<versions we've tested in>" that will be good enough.

~matthew

Alejandro Martinez

May 30, 2013, 7:47:43 PM
to lamb...@googlegroups.com
Yes, the test has changed. It used a keyword argument that was previously silently ignored; it now generates an error because we have keyword argument support in calls, but that support was not extended to the object instantiation logic.

I left this as TODO but never found time to complete it (and it is not easy)...

class PropertyDocBase(object):
    _spam = 1
    def _get_spam(self):
        return self._spam
    # TODO(Alejandro): add doc to property as named argument with default value
    #spam = property(_get_spam, doc="spam spam spam")
    spam = property(_get_spam)



2013/5/30 Joe Gibbs Politz <joe.p...@gmail.com>



--
Alejandro.

Daniel Patterson

May 30, 2013, 7:49:38 PM
to lamb...@googlegroups.com
So we should have them run the old tests with the old implementation, right?

(and yes, all the current property tests pass with the current implementation)

On Thu, May 30, 2013 at 7:47 PM, Alejandro Martinez

Daniel Patterson

May 30, 2013, 7:55:15 PM
to lamb...@googlegroups.com
(I just pushed minor updates for that, and for the missing command
Alejandro mentioned).

I've now run through all the commands in the VM section of the paper,
and modulo these changes, everything worked perfectly.

On Thu, May 30, 2013 at 7:49 PM, Daniel Patterson

Alejandro Martinez

May 30, 2013, 8:05:13 PM
to lamb...@googlegroups.com
Since that default argument was silently ignored, I propose making the same change to the original test (including the TODO comment) so that it also passes with the new code, while making clear that it is a pending issue.
I tested this in the VM and it works. I would like to have a quick solution that doesn't change the test, but I don't... sorry


2013/5/30 Daniel Patterson <daniel_p...@brown.edu>



--
Alejandro.

Matthew Milano

May 30, 2013, 8:20:02 PM
to lamb...@googlegroups.com
Running

$ cd ~/lambda-py/lambda-py-artifact-submission/base
$ racket python-main.rkt --python-path
~/install-stuff/Python-3.2.3/python --test-py
../tests/python-reference

In the virtual machine produces 201 tests succeeded and 4 tests failed.

The tests that fail with the error "no module named support" are:

test_failing_support_sticks.py
test_in_modules.py
test_scope.py

One test fails with the error "ZeroDivisionError: division by zero" :

generated4test.py

I'm looking into why these fail under Python now, but I would appreciate any
insight from the list.

~matthew

Daniel Patterson

May 30, 2013, 8:24:13 PM
to lamb...@googlegroups.com
This is a problem that I _thought_ Joe and I fixed. The issue is that
when we exec Python we don't set its current directory, so its
Python path does not contain the directory where support is.

generated4test.py is a temporary file that is supposed to be deleted
(and is created) in the module tests. Were you running multiple sets
of tests simultaneously?

Daniel Patterson

May 30, 2013, 8:25:50 PM
to lamb...@googlegroups.com
Hrm. Followup:

My full test run finished, and I got all the tests to pass (no support
issues), but got the generated4test.py failure too. Is this a
concurrency problem with the test runner? Or is this appearing every
time?

Matthew Milano

May 30, 2013, 8:30:29 PM
to lamb...@googlegroups.com
I am not sure. I will start with a clean VM to check.

If this is a concurrency thing, we should strongly emphasize that
tests cannot be run concurrently, else we risk the reviewers falling
into that trap (an hour is a long time to wait, so I could easily see
them trying to pipeline the tests).

I was somewhat surprised by the name and content of generated4test, so
I'm glad to hear it wasn't intended this way.

~matthew

On Thu, May 30, 2013 at 8:25 PM, Daniel Patterson

Matthew Milano

May 30, 2013, 8:34:58 PM
to lamb...@googlegroups.com
I re-extracted the VM image and ran the tests against python 3.2.3
(and did nothing else). I see the exact same results as I saw before,
complete with generated4test.py

Is there something I'm missing here?

~matthew

Daniel Patterson

May 30, 2013, 8:44:03 PM
to lamb...@googlegroups.com
The support errors are really strange. I'm running other tests now,
but will try to figure them out (as I did the other day... :(...)

As for generated4test, I think this is just a broken test (this one:
tests/python-reference/modules/test_failing_support_sticks.py ). It
doesn't clean up after itself! Joe - you've been testing this more,
are you always seeing this error?

Assuming we have to fix it (which it seems like we do), here are a few options:

a. We can add code to delete the file. I'm not sure if file deletion works.

b. We can create the file elsewhere. If this were *nix only, I'd just
make the path "/tmp/generated4test.py". Barring that, is
"../../generated4test.py" reliable cross platform? That'll put it in
the top level tests/ directory, so it won't get picked up.

Matthew Milano

May 30, 2013, 8:44:48 PM
to lamb...@googlegroups.com
The generated4test.py mystery might be solved - it's actually already
on the VM at boot time. Removing it seems to work fine. The three "no
module named support" failures are still happening though.

This may be a hack, but we could just put the "support" module on the
system's python path for the VM. It is better to *actually fix* this
bug, but if we can't find it, then putting support on the system path
will avoid the distraction of this failure (since it has nothing to do
with our research).

~matthew

Joe Gibbs Politz

May 30, 2013, 8:55:23 PM
to lamb...@googlegroups.com
I haven't seen any of these errors, but I also haven't tried to run
tests concurrently. I saw 175 tests succeeded on the VM (running new
implementation on old tests, so that error confuses me), and 20X
succeeded/none failed as well on new tests, both yesterday/this
morning. I agree with Daniel that I think we fixed the Python
test-running problem.

generated4test.py is quite silly, but I don't think we should try any
fixes beyond locating what is off kilter. We all know that the tests
pass on our own machines when we run them (right?), so we should
figure out what's different between there and the VM.

I pulled in changes from master to the artifactevaluation branch most
recently, and I did not run all the tests again in the VM after doing
that, so maybe something in that merge caused a problem?

I'm doing some of my own checking to see what's up.

Joe Gibbs Politz

May 30, 2013, 9:04:38 PM
to lamb...@googlegroups.com
I think what happened is the path fix for running the Python test
caused the generated file to be added in a different place than
before, so now it's hit by normal directory traversal when we re-run
the tests (I didn't notice because I kept running on fresh checkouts).
That's sort of silly. Maybe we should have the test script remove it
as part of its setup.

Junsong, you wrote that code to create a test module. Do you have any thoughts?

Matthew Milano

May 30, 2013, 9:06:10 PM
to lamb...@googlegroups.com
The generated4test problem is solved. It was just that the file is
accidentally on the VM. It's not that something fails to clean it up;
remove it once, and it never shows up again. I am running the full
suite again to double-check this, but I think this isn't a bug.

I am only running tests on the VM (not locally), and I am not running
them concurrently. The three failures I'm seeing are not from running our
implementation against our tests; they are from running Python against our
tests.

I'm now trying to run our implementation against our tests. But to
avoid bugs due to running tests concurrently, I'm doing nothing else
with the environment while they run.

~matthew

Joe Gibbs Politz

May 30, 2013, 9:11:18 PM
to lamb...@googlegroups.com
On Thu, May 30, 2013 at 9:06 PM, Matthew Milano <mat...@cs.brown.edu> wrote:
> The generated4test problem is solved. It was just that the file is
> accidentally on the VM. It's not that something fails to clean it up;
> remove it once, and it never shows up again. I am running the full
> suite again to double-check this, but I think this isn't a bug.

OK, thanks, let's see what happens there.

> I am only running tests on the VM (not locally), and I am not running
> them concurrently. The three failures I'm seeing are not from running our
> implementation against our tests; they are from running Python against our
> tests.

I'm checking into this next. Daniel and I noticed this back on Monday
and thought we had it fixed, so I'll see if that's not true or what.

Daniel Patterson

May 30, 2013, 9:14:00 PM
to lamb...@googlegroups.com
To follow up. Running the python tests with the test runner from the
submission time and the tests from the submission time, I don't get
support errors.

Which is actually confusing, because the fixes I made for the current
directory were made in the artifactevaluation branch (and thus aren't
in the old code) - this was the commit:
https://github.com/brownplt/lambda-py/commit/6b8a41c091636931b7e84860aa68ddc977a2820d

With the current code, current tests, I'm still running it with
python, but it's taking forever... Bleh.

Daniel Patterson

May 30, 2013, 9:29:51 PM
to lamb...@googlegroups.com
Ignore my previous message - I'm getting the support errors on both
versions. (I ran the full tests by accident instead... no wonder it
took so long).

On Thu, May 30, 2013 at 9:14 PM, Daniel Patterson

Joe Gibbs Politz

May 30, 2013, 9:30:16 PM
to lamb...@googlegroups.com
Found the problem. The fix was only partial, and we must have only
run on the modules/ directory. run-tests.rkt line 75 cannot use
`dirname', that's the base directory of the tests. It needs to find
the directory of the path defined by `path' and use that as the
(current-directory) so Python has the right module path.
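
To make the distinction concrete, here is a minimal sketch of the same idea in Python rather than Racket. It assumes the runner hands the test source to Python on standard input, in which case imports are resolved relative to the working directory; that is an assumption about the harness, not something confirmed in this thread. `run_with_python` is a hypothetical name.

```python
import subprocess
import sys
from pathlib import Path


def run_with_python(test_path, python=sys.executable):
    """Run one test file under Python with cwd set to the test's own directory."""
    test_path = Path(test_path).resolve()
    return subprocess.run(
        [python, "-"],                # "-" tells Python to read the program from stdin
        input=test_path.read_text(),
        capture_output=True,
        text=True,
        cwd=test_path.parent,         # the nested directory, not the top-level --test-py argument
    )
```

With `cwd` set to the top-level tests directory instead, a test in modules/ that does `import support` would fail with "no module named support", which matches the three failures reported.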

On Thu, May 30, 2013 at 9:14 PM, Daniel Patterson

Joe Gibbs Politz

May 30, 2013, 9:34:33 PM
to lamb...@googlegroups.com
Fix... gosh this was stupid. Thank you guys for catching this.
When testing modules, the current user path matters deeply; stands to
reason.

https://github.com/brownplt/lambda-py/commit/519bb56fa6847056721eaf511b0d3147618907b2

Matthew Milano

May 30, 2013, 9:41:27 PM
to lamb...@googlegroups.com
Let us know when the changes have hit the VM image + tarball so we can
go through it again.

~matthew

Joe Gibbs Politz

May 30, 2013, 9:51:23 PM
to lamb...@googlegroups.com
Just so I write out the whole state for everyone, I know much more
precisely what happened. Sorry for this, it's my fault with stale
state on the VM.

1. Daniel and I noticed that when running the modules tests with
Python, they failed (though *our* modules work on them)
2. This was because Python wasn't looking in the right place for
imported modules in the modules/ test directory
3. We changed the tester to set the (current-directory) parameter
from the dirname of each test, but we *made a mistake*, and set it
only to the dirname that you pass to --test or --test-py, *not* the
nested directory of the currently-running test
4. We managed to not notice this, and I managed to run the --test-py
on modules/ this afternoon, which had three horrible consequences:
- It passed, because that was the relevant nested directory for
Python to be in, and it was somewhat accidentally set to the right
thing
- It created generated4test.py in that directory, when previously it
had been in a different, safer place (it should still be cleaned up
after, but this hadn't been a problem before)
- I saved the VM in that state
5. Everyone who went to run tests this evening followed the
directions, and since they passed a different top-level directory
(just python-reference, not python-reference/modules), they all got
*both* the error that support.py wasn't found (because Python was
looking in the wrong place), and that generated4test.py was somehow
present in a test directory.
6. I cried, inside and out
7. I fixed the directory in run-tests.rkt to the actual correct location
8. This now causes generated4test.py to appear in modules/ again,
breaking multiple runs of testing with --test-py
9. We *do* need a solution for cleaning up generated4test.py before a
test run --- I'm open to suggestions

I'll be pushing a new VM shortly, but I'd also like a solution for #9 first.

Matthew Milano

May 30, 2013, 9:52:09 PM
to lamb...@googlegroups.com
I've modified the Makefile in the artifactevaluation branch so that
the default target just runs "raco make python-main.rkt" as was
discussed earlier on this list. I did this mostly because the
instructions for building lambda-py said to run "make" and I wanted to
avoid dependence on GNU coreutils (grep, sed and friends).

Speak up if you'd rather revert the Makefile to the old default target.

~matthew

Joe Gibbs Politz

May 30, 2013, 9:52:53 PM
to lamb...@googlegroups.com
On Thu, May 30, 2013 at 9:52 PM, Matthew Milano <mat...@cs.brown.edu> wrote:
> I've modified the Makefile in the artifactevaluation branch so that
> the default target just runs "raco make python-main.rkt" as was
> discussed earlier on this list. I did this mostly because the
> instructions for building lambda-py said to run "make" and I wanted to
> avoid dependence on GNU coreutils (grep, sed and friends).

Great change. Thanks.

Daniel Patterson

May 30, 2013, 9:58:20 PM
to lamb...@googlegroups.com
So the sledgehammer approach is to somehow run this:

find ./tests -name generated4test.py | xargs rm

But that might be a tad dangerous....

Other idea, already mentioned, is to just create that test file
elsewhere, so it doesn't get run as a test. This is probably a better
idea.
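
For comparison, a slightly less blunt version of the sledgehammer, sketched in Python so that it does not depend on `find` or `xargs` behaving the same everywhere. `remove_generated` is a hypothetical helper; it only deletes files whose name matches exactly, and it does nothing when no such file exists.

```python
from pathlib import Path


def remove_generated(tests_root, name="generated4test.py"):
    """Delete every regular file called `name` under tests_root; return the removed paths."""
    removed = []
    # Materialize the matches first so nothing is deleted mid-traversal.
    for path in list(Path(tests_root).rglob(name)):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed
```

Running something like this as part of test setup would make repeated runs start from the same state, though it still treats the symptom rather than moving the generated file out of the test directories.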

Matthew Milano

May 30, 2013, 9:58:32 PM
to lamb...@googlegroups.com
I see two easy solutions:

1 - We can establish a naming convention that all generated files
start with "generated" (or some other prefix) and have the tester
clear files that meet that heuristic from its directories after it
runs tests. None of our real tests start with "generated"

2 - We could also put this file in the temporary directory returned by
(find-system-path 'temp-dir)

If it's possible, I'd prefer 2. We could create a lambda-py-scratch
directory in the temp dir, make files only in that directory, and then
nuke that directory when we are all done.

Would either of these work?

~matthew

Daniel Patterson

May 30, 2013, 10:03:25 PM
to lamb...@googlegroups.com
2. sounds good, but there is a slight issue. The path is stored in a
python file - support.py. So I'm not sure how we can get the output of
(find-system-path 'temp-dir) into that file.

Here's a stranger idea: I assume that the test runner doesn't run
non-py files. Do we know that __import__ needs a file that has a ".py"
extension? Could the generated4test.py file actually be
generated4test.someotherextension?

Joe Gibbs Politz

May 30, 2013, 10:12:46 PM
to lamb...@googlegroups.com
On Thu, May 30, 2013 at 10:03 PM, Daniel Patterson
<daniel_p...@brown.edu> wrote:
> 2. sounds good, but there is a slight issue. The path is stored in a
> python file - support.py. So I'm not sure how we can get the output of
> (find-system-path 'temp-dir) into that file.
>
> Here's a stranger idea: I assume that the test runner doesn't run
> non-py files. Do we know that __import__ needs a file that has a ".py"
> extension? Could the generated4test.py file actually be
> generated4test.someotherextension?

Sadly, this can't work because it's testing whether or not importing a
file with a runtime error causes the module to be added to
sys.modules. It's a really subtle test...
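
For anyone who wants to see the behavior being tested in isolation, here is a minimal standalone sketch for plain CPython. It is not the lambda-py test itself: the module name and the `failed_import_sticks` helper are made up for illustration, and the module is written to a throwaway temporary directory rather than a test directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def failed_import_sticks(module_name="failing_module_for_demo"):
    """Import a module whose body raises, then report whether it stayed in sys.modules."""
    with tempfile.TemporaryDirectory() as tmp:
        # The module body raises ZeroDivisionError as soon as it is executed.
        (Path(tmp) / (module_name + ".py")).write_text("1 / 0\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()  # make sure the freshly written file is seen
        try:
            __import__(module_name)
        except ZeroDivisionError:
            pass
        finally:
            sys.path.remove(tmp)
        return module_name in sys.modules


print(failed_import_sticks())
```

On CPython this should report False: a module whose top-level code raises during import is not left behind in `sys.modules`. The import only happens at all because the file has a `.py` extension that the import system recognizes, which is why renaming it is not an option.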

Also, it's *Python* that creates the module, not Racket, so Rackety
solutions are out.

I think that "files starting with generated" is reasonable, but we
should do "files starting with ___lambdapy-generated" are skipped, to
be awfully clear about what's going on.

Joe Gibbs Politz

May 30, 2013, 10:28:13 PM
to lamb...@googlegroups.com
Fix:

https://github.com/brownplt/lambda-py/commit/86229cb1f46fa97b12af9469d07f89bd39d17073

The spec is: files containing ___lambdapy_generated in their path,
anywhere, are skipped. Passes for me when run repeatedly.
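
Restated as a Python sketch, for readers who do not want to read the Racket: the marker string is the one from the spec, while `collect_tests` and the directory walk are hypothetical and are not the code in the commit.

```python
import os

GENERATED_MARKER = "___lambdapy_generated"


def collect_tests(root):
    """Return sorted .py paths under root, skipping any path that contains the marker."""
    tests = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.endswith(".py") and GENERATED_MARKER not in path:
                tests.append(path)
    return sorted(tests)
```

Because the check is on the whole path, a marker in a directory name excludes everything beneath that directory as well.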

Matthew Milano

May 30, 2013, 10:36:39 PM5/30/13
to lamb...@googlegroups.com
I'm seeing the same results. Looks like all the bugs with test-py are
fixed. Good work! Looking forward to testing it in a VM.

~matthew

Joe Gibbs Politz

May 30, 2013, 10:47:19 PM5/30/13
to lamb...@googlegroups.com
Uploading is a little slow from home, but it looks like it'll take
about as long as walking into the department. ETA 20 minutes.

Joe Gibbs Politz

May 30, 2013, 10:53:03 PM5/30/13
to lamb...@googlegroups.com
Also, sorry for wasting everyone's time with this. I was expecting
typo fixes and minor cleanup suggestions, not this!

Alejandro, I think the img/ paths should be fixed on this next upload,
too; thanks for noticing.

Joe Gibbs Politz

May 30, 2013, 11:09:41 PM5/30/13
to lamb...@googlegroups.com
Aaannd.... done.

scp gave a surprisingly good time estimate. Tarball updated.

Junsong Li

May 31, 2013, 6:14:33 AM5/31/13
to lamb...@googlegroups.com
GOD, terribly sorry about the "generate4test" thing. I was out for a U.S. visa appointment today. Sorry again for the confusion.

The problem is in "python-reference/modules/test_failing_support_sticks.py", which generates generate4test.py but doesn't clean it up afterward. I didn't expect this to be such a problem.

As a slightly tricky workaround, we can have test_failing_support_sticks.py clear the contents of "generate4test" afterwards (it should have been written this way given the current situation, where lambda-py has no function that can delete files):

try:
    # import the new file, which contains an error
    ___assertRaises(ZeroDivisionError, __import__, support.TESTFN)
    # sys.modules shouldn't contain TESTFN
    ___assertNotIn(support.TESTFN, sys.modules)
finally:
    # empty the generated file, since lambda-py cannot delete it
    f = open(source, "w")
    f.write("")
    f.close()

This causes only one minor problem: the number of test files changes from 205 to 206. To solve this, we can put a blank generate4tests.py in the modules/ directory.

The problem is solved now anyway. I think the current method is better, as the number of test files doesn't change at all.


Junsong

Daniel Patterson

May 31, 2013, 9:33:24 AM5/31/13
to lamb...@googlegroups.com
Hey I just tested with the current VM (just downloaded it) and I still
get the generated4test.py error. This is in the 28 march directory - I
assume (though I didn't check before running the tests, sorry!) that
the file was left over from earlier.

Daniel

Joe Gibbs Politz

May 31, 2013, 9:40:12 AM5/31/13
to lamb...@googlegroups.com
I think that's right, I noticed that this morning, too; I can make
another VM much faster now that I'm in the CIT.

There's also the regression in that directory to handle (you're seeing
a failure on test_property_decorator_baseclass.py, right?) which we
need to decide what to do with:

1. Make a note in CHANGES.txt, and give instructions for running old
tests with old code
2. Still run old tests with new code, and explain the problem
- ...?

I think the first is better; along with this Python path fix, our test
suite is drifting a little bit from what it was at submission time, so
they are becoming less comparable with the same code.

Joe Gibbs Politz

May 31, 2013, 9:43:29 AM5/31/13
to lamb...@googlegroups.com
On Fri, May 31, 2013 at 9:40 AM, Joe Gibbs Politz <joe.p...@gmail.com> wrote:
> I think that's right, I noticed that this morning, too; I can make
> another VM much faster now that I'm in the CIT.
>
> There's also the regression in that directory to handle (you're seeing
> a failure on test_property_decorator_baseclass.py, right?) which we
> need to decide what to do with:
>
> 1. Make a note in CHANGES.txt, and give instructions for running old
> tests with old code
> 2. Still run old tests with old code, and explain the problem
> - ...?
>
> I think the first is better; along with this Python path fix, our test
> suite is drifting a little bit from what it was at submission time, so
> they are becoming less comparable with the same code.


Never mind, I forgot that we made this change in index.html already. I
just need to make the change on the VM.

Matthew Milano

May 31, 2013, 9:46:40 AM5/31/13
to lamb...@googlegroups.com
I would do the first one.

I'm seeing the same things - when running the current tests with
python, everything is fine, but when running the old tests with the
current lambda-py I still see test_property_getter_doc_override.py and
generated4test.py failing.

Since both of these errors were fixed in current by changing the
tests, I don't really see a way our current bug fixes can make them
pass. I think it makes a lot of sense to have reviewers run the old
code on the old tests and the new code on the new tests; for
test_property_getter_doc_override, it's arguably a bug that it passes
the submission-time lambda-py.

~matthew

Junsong Li

May 31, 2013, 11:00:54 AM5/31/13
to lamb...@googlegroups.com
I only get 5~10 kb/s (estimated time 33 h) downloading the VM... It seems that I don't have enough time to download the VM and test it :-(

Joe Gibbs Politz

May 31, 2013, 11:45:03 AM5/31/13
to lamb...@googlegroups.com
OK new and improved VM/index.html copied. One more round of testing
should confirm.

Alejandro Martinez

May 31, 2013, 2:07:53 PM5/31/13
to lamb...@googlegroups.com
I ran the tests on the latest VM; here are some results:

Running 28-march-2013 code on 28-march-2013 tests (according to index.html):
--test: 175 ok
--test-py: 172 ok, 3 tests failed in modules with "ImportError: No module named support"

Running artifact-submission code on artifact-submission tests (according to index.html):
--test-py: 205 ok
--test: not finished yet, but I hope it will be 205 ok.
BTW, index.html says 206 instead of 205

Running artifact-submission code on 28-march-2013 tests (just to see what happens):
--test: 174 ok, 1 test failed (the one in properties, due to a keyword argument in a constructor call)
--test-py: 175 ok

One minor detail that may confuse someone: all references to running tests are to python-reference, but the script counting LOC runs on all test folders.
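
To illustrate why the starting directory matters, here is a hypothetical line counter (not the script shipped in the VM, whose contents are not shown in this thread). It counts whatever lies under the root it is given, so running it from the top-level tests folder and from python-reference gives different totals.

```python
import os


def count_lines(root, extension=".py"):
    """Return (number_of_files, total_lines) for files under root
    whose names end with the given extension."""
    files = 0
    lines = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extension):
                files += 1
                full = os.path.join(dirpath, name)
                with open(full, encoding="utf-8", errors="replace") as f:
                    lines += sum(1 for _ in f)
    return files, lines
```

For example, count_lines("python-reference") would cover only that folder, while count_lines(".") run one level up would include every test folder.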




--
Alejandro.

Joe Gibbs Politz

May 31, 2013, 2:17:51 PM5/31/13
to lamb...@googlegroups.com
On Fri, May 31, 2013 at 2:07 PM, Alejandro Martinez
<amtri...@gmail.com> wrote:
> I run this tests on the this last VM, here some results:
>
> Running 28-march-2013 code on 28-march-2013 tests (according to index.html):
> --test: 175 ok
> --test-py: 172Ok, 3 tests failed on modules with "ImportError: No module
> named support"

OK, this is expected. I'll put a note into index.html that this is a
fixed bug in the test harness.

>
> Running artifact-submission code on artifact-submission tests (according to
> index.html):
> --test-py: 205 ok
> --test: not finished yet, but I hope it will be 205 ok.
> BTW, index.html says 206 instead of 205

Fixed this in the last push of index.html just now, thanks for noticing.

>
> Running artifact-submission code on 28-march-2013 tests (just to see what
> happens):
> --test: 174 ok, 1 test failed (the one in properties due to keyword argument
> on constructor call)
> --test-py: 175 ok

Great!

>
> One minor detail which may confuse someone: all reference to running tests
> are to python-reference, but the script counting LOC run on all tests
> folders.

I think the `cd' command in the instructions makes it clear; I've also
updated index.html to have the output, so they can tell if anything is
off.

Thanks for testing!

Joe Gibbs Politz

May 31, 2013, 2:21:09 PM5/31/13
to lamb...@googlegroups.com
> OK, this is expected. I'll put a note into index.html that this is a
> fixed bug in the test harness.
>

Just wrote this note and put it up live.

Alejandro Martinez

May 31, 2013, 2:25:20 PM5/31/13
to lamb...@googlegroups.com
Excellent, it is crystal clear now!


2013/5/31 Joe Gibbs Politz <joe.p...@gmail.com>
> > OK, this is expected. I'll put a note into index.html that this is a
> > fixed bug in the test harness.
>
> Just wrote this note and put it up live.

--
Alejandro.

Joe Gibbs Politz

May 31, 2013, 4:17:35 PM5/31/13
to lamb...@googlegroups.com
OK, I'm going to submit, because I'm going offline for a bit tonight
and tomorrow. We can submit until really late Saturday night
(actually around 4am Sunday morning EDT:
http://cyberchair.acm.org/oopslaaec/submit/), so let's see if others
can do some testing and let me know if anything goes wrong.

We just submit the link to index.html, not the whole file, so it's
easy to update until then.

Alejandro Martinez

May 31, 2013, 4:27:38 PM5/31/13
to lamb...@googlegroups.com
--test finished with 205 ok, as expected.

Daniel Patterson

May 31, 2013, 9:29:00 PM5/31/13
to lamb...@googlegroups.com
I just ran both the submission time code on the submission time tests
and the artifact time code on artifact time tests, and they both
worked flawlessly.

So I think we are totally good to go.

Thanks so much to Joe for doing all of the painful work and writing
for this artifact, and thanks to Alejandro and Matthew for lots of
testing hours!

Matthew Milano

May 31, 2013, 11:39:46 PM5/31/13
to lamb...@googlegroups.com
And thanks to you as well!

~matthew