Learning Track Update (finally!)


Mark "mak" Roberts

Jan 26, 2014, 11:56:26 PM1/26/14
to ipc2014-...@googlegroups.com
Hi!

I'll ask a few questions for feedback in this update.  My general thought is: I'll treat no response as a "sure, that works for me/us".  If you don't squeak your wheel now, I won't know to apply oil  :-)

It seems clear to me that the schedule needs to slide to incorporate spring deadlines, the usual reviewing we all face, and the publication process.  I'd like to revise the schedule for both the quality and execution tracks:
  • Jan 26 - Initial Domains Posted
  • Jan 31-Mar 3 - Domain Feedback period 
  • (Feb 20 ICAPS Workshop)
  • (Feb 28 Camera Ready)
  • Mar 9 - Test domains ready (participants expected to test Cloud setup before Mar 23, see below for cloud proposal)
  • Mar 30 - Final domains published
  • Mar 30 - Beginning of 6 week Learning period
  • May 3 - End of 6 week Learning Period 
  • May 4-18 - Execution of the final domains with/without DCK  (please see below for cloud proposal)
  • May 18-June 5 - Mak is neck deep in data getting the results done in time for the conference!
Please, can you let me know if this will work for your team or if you have other feedback for this timeline?  

Regarding an initial list of domains for the quality track, I am considering the following (with the latest version I'm aware of in parenthesis):
  • Barman (IPCL-2011)
  • Spanner (IPCL-2011)
  • Depots (IPCL-2011)
  • Rovers (IPCL-2011)
  • Parking (IPCL-2011)
  • Gold-miner (IPCL-2008)
  • Sokoban (IPCL-2008)
  • Thoughtful (IPCL-2008)
As far as I can tell, all these require :action-costs, but please correct me if I misread their textual descriptions as I did not have time to look in the PDDL files to verify this.  I'd like to stick with :action-cost domains, and may decide to add some from the IPC-2008 or IPC-2011 after further review/feedback.  If you have a domain that you'd like to see (or bring to my attention in case I forgot a conversation/email we already had)... this is your chance to do so.  (I reserve the right to add/remove domains from this list, but you already knew that ;-) )

For the execution track, I have already posted a preliminary version of one domain and am awaiting feedback.  I do plan to add a more comprehensive "grid-based domain" similar to the one I proposed by the test domain deadline.

As mentioned in a previous email, I'm leaning toward using a compute cloud (e.g., Google, Amazon) for the competition.  Amazon EC2 was used by Scott Sanner with some success in the Probabilistic IPC for 2011 (see http://users.cecs.anu.edu.au/~ssanner/IPPC_2011/index.html).  Not only does this allow for speedy collection of the compute results (a week or less) but it also may provide a path for future (or ongoing) competitions.  I am thinking of asking Google or Amazon to sponsor this and am hopeful I can find funding for any gap in costs.  I currently do not have sponsorship for the competition and, thus, am in need of computing resources.  My best guess is it would require running about 60 runs per domain (30 with, and 30 without DCK) to deal with the inherent variability of running planning jobs in the cloud.  What are your thoughts on this approach?  Please, respond to the whole group so we can move toward consensus.
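As a rough sketch of why the repeated runs help, here is one possible way to aggregate quality scores per (domain, DCK-condition) pair — the aggregation method is my assumption, not something stated in the thread:

```python
from statistics import mean, stdev

def summarize_runs(scores):
    """Aggregate a planner's quality scores over repeated cloud runs.

    Repeating each (domain, with/without-DCK) condition ~30 times damps
    the per-run variance of shared cloud hardware, so the mean becomes a
    more stable basis for comparison.  (Hypothetical aggregation sketch.)
    """
    avg = mean(scores)
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return avg, spread
```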

No doubt you have a question that needs answering.  I'd like to hear how things are going on your end.  Even if you just say "hey, things are fine" or "sorry, I can't compete", I'd like to see a note.

Lastly, I want to thank everyone for your patience.  Life is finally settling in a new job/city and I'm looking forward to a spring filled with planning and learning!

Cheers,
mak




Mauro Vallati

Jan 27, 2014, 4:34:10 AM1/27/14
to ipc2014-...@googlegroups.com, mrob...@cs.colostate.edu
Hi Mark,

thank you for updating us and for organising this track of the IPC :)

Just a few comments on your last email.
1. I am missing when the planner submission deadline is. Is it March the 9th?
2. The domains list seems fine to me. Should we expect a single domain model per domain, or one per problem?
3. Will the learning process run on participants' machines, or will we use the cloud also for that step?
4. Could you please let us know the per-track total number of registered systems?

Cheers,
Mauro

Mark "mak" Roberts

Jan 28, 2014, 8:29:53 AM1/28/14
to ipc2014-...@googlegroups.com
Hi all,

Thanks very much for each of your replies!  I received feedback from most of the competitors on a private channel and will attempt to summarize my answers in one shot.  Please view my attempt here as simply trying to avoid repeating conversations ... and if I still don't tackle a question you asked me feel free to ask again.  I always appreciate reminders!  :-)


Q: When is the planner submission deadline?
A: Yes, I intended that you would submit on 9 Mar, before the official domain list is released.


Q: How will you run our planner?  
Q: (related) Should we expect a single domain model per domain, or one per problem?
A:  Most of you were fine with the cloud proposal.  I'll run your planner with a script similar to previous competitions.  I believe the general format for the script is "plan domain.pddl problem.pddl [dckFile]".  Several of you are concerned about this aspect, and deservedly so.  I'll work on this clarification once I secure some funding for the cloud platform.  
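As a concrete sketch of that invocation convention — the harness below is my own illustration, and the planner path and DCK file name are hypothetical:

```python
import subprocess

def build_plan_command(planner, domain, problem, dck_file=None):
    """Build the argument list for the assumed convention:
    plan domain.pddl problem.pddl [dckFile]
    where the DCK file is optional (omitted in the no-DCK condition)."""
    cmd = [planner, domain, problem]
    if dck_file is not None:
        cmd.append(dck_file)
    return cmd

def run_planner(planner, domain, problem, dck_file=None, timeout=1800):
    """Invoke the planner and capture its output (hypothetical harness)."""
    cmd = build_plan_command(planner, domain, problem, dck_file)
    return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
```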

Q: Where will the learning process be run?  By us? By you?  On our machines? On the cloud?
A: The learning computations are run by participants on their own computing platforms (I'd encourage you to look at cloud if you haven't...).  You'll be expected to submit initial versions of the planner (9 Mar) and a mechanism by which I can verify you didn't change the learning algorithm during the learning phase such that it becomes a "hand tuned" planner.  So I should be able to produce the same DCK files at the start and end of the learning phase for any problem in the initial set.  I'll clarify the process I intend to use soon.

Q: How many competitors are there?
A:  Sorry, I keep forgetting to send this out.  At the moment, I show 12 planners from 7 teams for the Quality track and 3 planners from 2 teams for the Execution track.  

Comment: uh, two weeks to run the competition... are you sure?!?
A: Well, I didn't see another way to handle the delay without encroaching on the paper deadlines, while remaining fair to your respective efforts.  I think it will work out because the cloud platform allows massively parallel execution.  

Comment: No new domains... 
A: I was absolutely open to new domains and I did my best to solicit promising applications for this track.  Alas, none were submitted.  I hesitate to add my own hand-crafted domains, to avoid injecting "my research" or "my view" into what should be a community effort on building benchmark domains.  One might argue that the execution track consists of new domains I'm writing, but in fact, I'm carefully constructing it to use "domains" from the existing literature on the topic.  

Q: Haven't you primed everyone to the specific domains/problems by providing us a list?
A: My apologies.  I should have been clearer.  The final problem distribution(s) and the final list of domains are not set.  I provided what I thought of as a "preliminary" list so that folks could make sure their planner supports key features (e.g., PDDL requirements, kinds of problems, scope).  At the least, it was known that the competition would heavily build on previous competitions -- I recall saying this to several participants in Rome.  (Sigh, Rome... that was a nice trip!)  I think being upfront about this levels the field a bit for anyone not in those conversations.  It was my understanding this format was followed in the last learning competition without ill effect.  Maybe I misunderstood?

Comment: I already know my planner won't support DomainXYZ because of RequirementABC.
A: Ah, the classic n-armed bandit problem.  You could work really hard to fix your planner for DomainXYZ (i.e., explore an arm you think might pay off).  Or ... you could take a chance I won't use that domain and focus on the domains you know really well first (i.e., exploit the arm with the highest payoff) and leave a little time to extend into RequirementABC (i.e., explore new arms with a decaying probability).  As with most things in life, strategic decisions are challenging.  :-)
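The decaying-exploration strategy above can be sketched as an epsilon-greedy policy — all names and parameters here are hypothetical, purely to illustrate the analogy:

```python
import random

def choose_domain(payoffs, step, epsilon0=0.5, decay=0.99):
    """Epsilon-greedy arm selection with decaying exploration:
    mostly exploit the domain with the highest expected payoff,
    but explore a random domain with probability epsilon0 * decay**step,
    which shrinks as the submission deadline approaches."""
    epsilon = epsilon0 * decay ** step
    if random.random() < epsilon:
        return random.choice(list(payoffs))   # explore: try any domain
    return max(payoffs, key=payoffs.get)      # exploit: best-known payoff
```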

Q: Okay.  I need a list of PDDL requirements you expect our planners to support.
A: A quick answer is the union of the requirements in the example set I gave you.  Some problems may hit that union at different levels.  I'll make sure to get you the expected list soon.  (Keep in mind that you don't necessarily need to cover the union to have an effective approach.  I think challenging such assumptions is where scientific thought guides us the most.  )

I hope this helps clarify.  Please ask more questions!

mak