Parallel groups and Stage I & II

Tijn van der Zant

Nov 21, 2010, 10:24:09 AM

to atHome2011

Hi,

In stage I we have:
RIPS
Follow Me
Go Get It
Who's Who?
Open Challenge

In stage II we have:
Enhanced Who's Who (EWW)
General Purpose Service Robot (GPSR)
Shopping Mall
Demo Challenge

We are going to have a parallel league, which means a maximum of 16
teams per group (there are 2 groups). Teams get more time to practice
and the schedule is more relaxed. The trustees have requested more tests
in stage I, so that teams have more time to test their systems.
I agree: if a team is willing to fly around the planet to participate
in @Home, we should maximize the number of tests that the team can
participate in. So here is my proposal.

I think that we should have the following schedule:

In Stage I (S1):
RIPS
Follow Me
Go Get It
Demo Challenge (with evaluation by the team leaders, normalized score)
Who's Who?
General Purpose Service Robot
Open Challenge (with evaluation by the team leaders, normalized score)

Best 50% go to stage II

In Stage II (S2):
Enhanced Who's Who (good teams can change Who's Who into EWW or
already prepared this)
Shopping Mall (requires good performance, too dangerous otherwise)
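
The "best 50% go to stage II" cut can be sketched as below. This is only an illustration under assumptions that the proposal does not state: teams from both groups are ranked together on their Stage I totals, and an odd team count is rounded up. The function name and the example totals are hypothetical.

```python
import math


def teams_advancing(stage_one_totals, fraction=0.5):
    """Return the teams that advance, ordered from best to worst Stage I total.

    Assumption: both groups are ranked in one combined list and the cut
    is rounded up when the number of teams is odd.
    """
    ranked = sorted(stage_one_totals, key=stage_one_totals.get, reverse=True)
    cutoff = math.ceil(len(ranked) * fraction)
    return ranked[:cutoff]


# Hypothetical Stage I totals for five teams.
totals = {"team_a": 2300, "team_b": 900, "team_c": 1750, "team_d": 400, "team_e": 1200}
print(teams_advancing(totals))  # ['team_a', 'team_c', 'team_e']
```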

My reasoning is the following:
The demo challenge in S1:
This is a good test for good and not-so-good teams to give a nice
demonstration. The demo challenge is often an inspiring test for the
teams. But with 32 teams it is impossible, time wise, to have every
team leader evaluate every team. So the evaluation by the team leaders
should be per group. If we normalize the score (for example, 1500 for
the best team, 1400 for the 2nd best, etc) then we don't have any
problems between the two groups. A problem could arise if in one group
there is a bias towards giving higher or lower ratings. If the average
of points in one group is much lower than the other group this would
be unfair.
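
A minimal sketch of the rank-based normalization, assuming each group's team leaders produce one raw jury score per team and that points drop by 100 per rank (1500, 1400, ...). The function name, the raw score values, and the tie handling (ties are broken by input order) are assumptions made for illustration.

```python
def normalize_by_rank(raw_scores, top=1500, step=100):
    """Convert raw jury scores within ONE group into rank-based points.

    The best team gets `top`, the next `top - step`, and so on, never
    going below zero. Only the ordering inside the group matters, so a
    jury that rates everything low does not penalize its group.
    """
    ranked = sorted(raw_scores.items(), key=lambda item: item[1], reverse=True)
    return {team: max(top - step * rank, 0) for rank, (team, _) in enumerate(ranked)}


# Hypothetical raw scores: group B's jury rates much lower than group A's.
group_a = {"team_a1": 8.7, "team_a2": 6.1, "team_a3": 7.4}
group_b = {"team_b1": 4.9, "team_b2": 3.2, "team_b3": 4.0}

normalized_a = normalize_by_rank(group_a)
normalized_b = normalize_by_rank(group_b)
print(normalized_a)
print(normalized_b)
```

With these defaults, the 16th team of a full group of 16 would receive 0 points.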

GPSR in S1:
I think that it is so important to get away from state-like
programming that every team should focus on these capabilities, or at
least think about them. If it is in S2, then I'm afraid that only very
few teams will be willing to tackle this problem.

Keeping EWW in S2:
EWW is a good example where many capabilities have to be integrated
into a difficult test. I think that this is a really complex test and
should remain in S2.

Shopping Mall in S2:
Too dangerous for poor performing teams.

So what do you think? Which test(s) should go to stage I, and why?

Cheers,

Tijn

Javier Ruiz-del-Solar

Nov 21, 2010, 4:39:28 PM

to athom...@googlegroups.com, Javier Ruiz-del-Solar

Dear Tijn:

I agree with your proposal, with "who is who" in stage I.

I think the second best option is to leave the "Demo Challenge" in stage II, for having 3 tests in this stage.

Best regards,

Javier

------------------------------------------------------------
Prof. Dr. Javier Ruiz-del-Solar
Universidad de Chile
Department of Electrical Engineering
Av. Tupper 2007, 837-0451 Santiago, Chile

Ph. +56-2-978 4207
Fax +56-2-6720162
Email: jru...@ing.uchile.cl
WWW: http://www.cec.uchile.cl/~jruizd

Dirk Holz

Nov 22, 2010, 12:49:54 PM

to athom...@googlegroups.com

On Sun, 2010-11-21 at 07:24 -0800, Tijn van der Zant wrote:
> Hi,
>
> In stage I we have:
> RIPS
> Follow Me
> Go Get It
> Who's Who?
> Open Challenge
>
> In stage II we have:
> Enhanced Who's Who (EWW)
> General Purpose Service Robot (GPSR)
> Shopping Mall
> Demo Challenge
>
> We are going to have a parallel league. So this means that there is a
> maximum of 16 teams per group (there are 2 groups). This means that
> there is more time for teams to practice and the schedule is more
> relaxed. There is a request from the trustees that we have more tests
> in stage I, so that there is more time for teams to test their system.
> I agree, also because I also think that we should give teams the
> opportunity to do more tests. If a team is willing to fly around the
> planet to participate in @Home, we should maximize the amount of tests
> that a team can participate in. So here is my proposal.
>

Two arenas and two groups are definitely the way to go if we want to
have 32 teams. YES.

> I think that we should have the following schedule:
>
> In Stage I (S1):
> RIPS
> Follow Me
> Go Get It
> Demo Challenge (with evaluation by the team leaders, normalized score)
> Who's Who?
> General Purpose Service Robot
> Open Challenge (with evaluation by the team leaders, normalized score)
>
> Best 50% go to stage II
>
> In Stage II (S2):
> Enhanced Who's Who (good teams can change Who's Who into EWW or
> already prepared this)
> Shopping Mall (requires good performance, too dangerous otherwise)
>

That doesn't leave many tests in stage II.

> My reasoning is the following:
> The demo challenge in S1:
> This is a good test for good and not-so-good teams do give a nice
> demonstration. The demo challenge is often an inspiring test for the
> teams. But with 32 teams it is impossible, time wise, to have every
> team leader evaluate every team. So the evaluation by the team leaders
> should be per group. If we normalize the score (for example, 1500 for
> the best team, 1400 for the 2nd best, etc) then we don't have any
> problems between the two groups. A problem could arise if in one group
> there is a bias towards giving higher or lower ratings. If the average
> of points in one group is much lower than the other group this would
> be unfair.
>

Up to now, Demo Challenge evaluation was done by TC, not by the team
leaders. And since we wanted to see out-of-scope really good stuff
(completely unrelated to any other test) and giving only some kind of a
bonus score, we should keep it as it is, in my opinion. 1500 is also
something that should only be given in exceptional cases, e.g.,

max 500 pts: a team shows a nice demo that 1.) corresponds to the
challenge description and 2.) shows some nice stuff by maybe combining
abilities from other tests.
max 1000 pts (cum laude): 1.) + shows some robot abilities that are not
addressed in any other test, e.g., really ironing the clothes etc.
up to 1400 pts (magna cum laude): really good demo, awesome performance,
really new nice features etc.
the way to 1500 (summa cum laude): exceptionally good performance. I do
not expect any team to reach this score ever. I'd give it to a robot that
I would directly buy just to have it in my own household ...

TC evaluation plus the above (actually what we had in the last years) is
what we should try to keep as "demo challenge" _and_ in Stage II, where
the top teams can show the really good and new stuff they are doing.
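
The point bands listed above can be written as a small lookup. This is only a sketch: the band limits are the ones from the list, the upper limits are assumed to be inclusive, and the label for the lowest band ("standard demo") is a name made up here because the list does not give it one.

```python
def demo_challenge_band(points):
    """Map a Demo Challenge score to the band it falls into."""
    if not 0 <= points <= 1500:
        raise ValueError("Demo Challenge scores range from 0 to 1500")
    if points <= 500:
        return "standard demo"      # hypothetical label for the unnamed band
    if points <= 1000:
        return "cum laude"
    if points <= 1400:
        return "magna cum laude"
    return "summa cum laude"


print(demo_challenge_band(450))
print(demo_challenge_band(1200))
```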

> GPSR in S1:
> I think that it is so important to get away from the state like
> programming that every team should focus on these capabilities or at
> least think about it. If it is S2 then I'm afraid that only a very few
> teams are willing to tackle this problem.
>

I would like to keep GPSR in Stage II, especially regarding all the new
stuff that we might add for 2012 (and that all new (stage I) things are
automatically in GPSR). Furthermore, the changes on GPSR itself, like
getting mental age tests in there etc. Whatever we do with the other
tests, I think that GPSR should always be a Stage II test.

> Keeping EWW in S2:
> EWW is a good example where many capabilities have to be integrated
> into a difficult test. I think that this is a really complex test ad
> should remain in S2
>

If we do not merge Enhanced Who is Who and Who is Who, then YES. If the
tests are not merged, they definitely belong in two different stages.

> Shopping Mall in S2:
> Too dangerous for poor performing teams.
>

Totally agree.

But in a similar fashion, I'd also like to keep GPSR in stage II. In
GPSR teams may be told to solve three tests at the same time or
sequentially (but at least within 10 minutes). A low scoring team in
stage I might not be able to solve a single one and according to the
scoring (which I do not want to change), score comes at 50% and more.

> So what do you think? Which test(s) should go to stage I, and why?
>

Hard decision. Looking at my comments above, I'd like to keep the stage
II tests in stage II. Merging the Who is Who tests is OK; if we do not,
Enhanced Who is Who should also be in Stage II. We could add an additional
(easy) test to stage I, but that's a step backwards, and that is _really_
something that I want to avoid.

> Cheers,
>
> Tijn

Jesus Savage

Dec 2, 2010, 2:17:26 PM

to athom...@googlegroups.com

Hello,

> GPSR in S1:
> I think that it is so important to get away from the state like
> programming that every team should focus on these capabilities or at
> least think about it. If it is S2 then I'm afraid that only a very few
> teams are willing to tackle this problem.
>

> I would like to keep GPSR in Stage II, especially regarding all the new
> stuff that we might add for 2012 (and that all new (stage I) things are
> automatically in GPSR). Furthermore, the changes on GPSR itself, like
> getting mental age tests in there etc. Whatever we do with the other
> tests, I think that GPSR should always be a Stage II test.

I agree that we should have the GPSR test in Stage II, but we should first have a test in Stage I that really grades the minimum capabilities the robots need in order to do the GPSR well in Stage II.

Other leagues have something called a "Technical Challenge" that consists of several tests that robots need to perform. For instance, this year the Humanoid League had the following challenges: Throw-In Challenge, Obstacle Avoidance and Dribbling, Double Pass, and The Footrace, and every year they increase the complexity of these challenges.

Thus in our league we could have a test that is divided into small challenges
that would prove that the robots have the capabilities that we want:

• Navigation
• Mapping
• Person Recognition
• Person Tracking
• Object Recognition
• Object Manipulation
• Speech Recognition
• Gesture Recognition

Then in a Technical Challenge test we could have five challenges that test the capabilities above,
and a team can choose which of these challenges it will attempt; the teams that pick and solve more challenges earn more points. The test could last 10 minutes, giving 2 minutes to solve each challenge.
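
One way such a Technical Challenge could be scored is sketched below. The five-challenge limit and the 2-minute slot come from the proposal; the points per challenge, the `Attempt` record, and the function name are assumptions made for illustration only.

```python
from dataclasses import dataclass

MAX_CHALLENGES = 5            # from the proposal: five challenges per test
CHALLENGE_TIME_LIMIT_S = 120  # from the proposal: 2 minutes per challenge
POINTS_PER_CHALLENGE = 200    # hypothetical value, not from the proposal


@dataclass
class Attempt:
    challenge: str        # e.g. "Navigation", "Speech Recognition"
    solved: bool
    seconds_used: float


def technical_challenge_score(attempts):
    """Sum points for every challenge solved within its 2-minute slot."""
    if len(attempts) > MAX_CHALLENGES:
        raise ValueError("a team can attempt at most five challenges")
    return sum(
        POINTS_PER_CHALLENGE
        for attempt in attempts
        if attempt.solved and attempt.seconds_used <= CHALLENGE_TIME_LIMIT_S
    )


# A team that picks three challenges and solves one of them in time.
attempts = [
    Attempt("Navigation", solved=True, seconds_used=90),
    Attempt("Speech Recognition", solved=True, seconds_used=130),   # too slow
    Attempt("Object Recognition", solved=False, seconds_used=120),
]
print(technical_challenge_score(attempts))  # 200
```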

Best

Jesus

Dirk Holz

Dec 2, 2010, 3:05:41 PM

to athom...@googlegroups.com

Yes, that's something that I proposed last year already. To have some
mini challenges within a first stage I test. Something like having the
robot enter the scenario and then accomplishing some simple tasks, like
"follow me", "stop", "move to the kitchen table". Basically the GPSR has
arisen from that, but I agree. We could come up with some minimal set of
capabilities that every robot should have and test them in something
like a technical challenge. The most important thing here is (see
examples above), to have very simple commands (no GPSR here) and, more
important, tasks that can be done independent of each other, e.g.
recognizing a person doesn't make sense without training at least two
persons (and that's more or less a complete test again just like who is
who). But if we manage to extract some minimal set of tasks (defining
the "challenges" described by Jesus above) that can easily be tested and
evaluated as it is done in other leagues, then "YES!", let's do it!

> Best
>
> Jesus
>
>

Jörg Stückler

Dec 23, 2010, 4:37:54 PM

to atHome2011

hi,

when i think back to the average state of the systems at robocup 2010,
only few teams were able to score significantly even in the stage I
tests..
we should add some simplifications to the tests in stage I, such that
first years competitors can show some adequate performance.
we should not forget that @home systems need many complex but BASIC
capabilities such as navigation, person perception, object perception,
mobile manipulation, human-robot interaction, and that it takes a lot
of time to get such single things running on a robot with a good level
of performance.
such simplifications would also have the important consequence, that
the league will demonstrate better performance to the audience in
general.

when a team chooses to simplify a test, the test becomes basically a
"technical challenge" like the tests until 2009:

- go get it: the object is located at a specific location, like for
instance the "kitchen table" (=> fetch and carry)

- who is who: fewer persons are introduced, persons always face the
robot, or are only standing with sufficient distance to walls and
furniture (=> old who is who).

- follow me: skip some of the waypoints, but follow to the end of the
course.

of course, only partial points are given for the simplified tests.

> GPSR in S1:
> I think that it is so important to get away from the state like
> programming that every team should focus on these capabilities or at
> least think about it. If it is S2 then I'm afraid that only a very few
> teams are willing to tackle this problem.

when half the teams advance to S2 and a team seriously wants to
achieve something in the competition, it has to tackle *ALL* the
tests, also GPSR in stage II.
this is the most complex test in the competition.
since gpsr builds on all the basic skills in stage 1, i suggest to
keep it in stage 2 for the roughly 50% of the teams that can cope with
the basic skills.

> My reasoning is the following:
> The demo challenge in S1:
> This is a good test for good and not-so-good teams do give a nice
> demonstration. The demo challenge is often an inspiring test for the
> teams. But with 32 teams it is impossible, time wise, to have every
> team leader evaluate every team. So the evaluation by the team leaders
> should be per group. If we normalize the score (for example, 1500 for
> the best team, 1400 for the 2nd best, etc) then we don't have any
> problems between the two groups. A problem could arise if in one group
> there is a bias towards giving higher or lower ratings. If the average
> of points in one group is much lower than the other group this would
> be unfair.

when we keep the old voting system for the demo challenge (tc votes
alone), then it is much simpler to put the demo challenge into stage
I.
i also think, that a second "open" challenge with a theme could be
interesting for all the teams.

best,
jörg

Mohan Rajesh Elara

Dec 24, 2010, 12:51:59 AM

to athom...@googlegroups.com

Hi All,

With the number of teams becoming 32, we need to have two venues. The LOC should be informed in advance so that it can prepare for the infrastructure needs (field space, team space, etc.) of the two venues. I also agree with:

Having Demo Challenge in S1 being evaluated by the team leaders per group
Having separate Who's Who in S1 and Enhanced Who's Who in S2
Shopping mall in S2 considering the safety issues in public space

Since GPSR involves most tests in S1, I would recommend having it in S2.

I wish you a Wonderful Christmas and Very Happy New Year!

Best Regards,
Mohan

Javier Ruiz-del-Solar

Dec 24, 2010, 5:25:13 AM

to athom...@googlegroups.com, Javier Ruiz-del-Solar

Seems ok to me.

My only doubt is the Demo Challenge. Should it be in stage I or in stage II? ...

Jesus Savage

Dec 24, 2010, 7:33:43 PM

to athom...@googlegroups.com

Hello,

I agree with Jörg that in the last competitions few teams really showed the basic capabilities that a service
robot should have. The technical challenge test would remind the teams which
capabilities we expect their robots to have. This technical challenge test is not a new one; it is just a compilation
of robot capabilities that we have already been asking for in previous years.

Jesus

P.S. By the way, Happy Holidays!

Nico Hochgeschwender

Dec 25, 2010, 6:17:13 AM

to atHome2011

Hello all,

> GPSR in S1:
> I think that it is so important to get away from the state like
> programming that every team should focus on these capabilities or at
> least think about it. If it is S2 then I'm afraid that only a very few
> teams are willing to tackle this problem.

I agree that we should move away from state-based programming. But,
for GPSR almost all capabilities
from S1 are required, so let's keep it in S2 in order to have a logical
and step-wise improvement of complexity.

Regards and Merry Xmas,
Nico

Luca Iocchi

Dec 27, 2010, 4:30:16 AM

to athom...@googlegroups.com

Dear all,
I prefer the schedule below (with GPSR in Stage I and Demo Challenge
in Stage II).

Stage I:

RIPS
Follow Me
Go Get It
Who's Who?

Open Challenge (with evaluation by the team leaders, normalized score)
General Purpose Service Robot

Stage II (Best 50%):
Enhanced Who's Who

Demo Challenge (with evaluation by the team leaders, normalized score)

Shopping Mall

This also satisfies the RCF request of allowing teams to participate in
more tests.
With this scheme every team participates in 1 registration test, 3
standard tests, 1 general test, and 1 open test,
which is surely worth the trip to the competition.

It is certainly true that GPSR is the most difficult test and that we
will have several 0 points scores,
however, I am sure that being in Stage I will be a benefit for the
overall performance.

GPSR will be the last test of Stage I for two reasons:
1) in this way all the basic capabilities will be tested before and the
only difference between having it in Stage II
is to allow for just a few additional time slots.
2) during GPSR OC will have more time to put together Open Challenge
results, which is quite time consuming,
as we have experienced last year.

This year we will have two similar and fully equipped apartments each
one of 15 m x 7.5 m,
that are enough to run tests in parallel. So we should have enough time
slots to do more tests in Stage I.

Finally, before defining this as final, I suggest to wait a few days to
have an idea on the number of
pre-registered teams.

Happy Holidays,
Luca.

Javier Ruiz-del-Solar

Dec 27, 2010, 5:40:51 AM

to athom...@googlegroups.com, Javier Ruiz-del-Solar

I agree.

Luca Iocchi

Jan 13, 2011, 5:45:44 AM

to athom...@googlegroups.com

Dear all,
now that pre-registration is over, the number of teams in Istanbul will be less than 30.
This means that the plan of dividing into two groups and having more tests in Stage I
is doable.

So I suggest to finalize this structure (if there are no other objections)
and start updating the rulebook accordingly.

Stage I:
RIPS
Follow Me
Go Get It
Who's Who?
Open Challenge (with evaluation by the team leaders, normalized score)
General Purpose Service Robot

Stage II (Best 50%):
Enhanced Who's Who
Demo Challenge (with evaluation by the team leaders, normalized score)
Shopping Mall

Best,
Luca.

Komei Sugiura

Jan 13, 2011, 6:20:01 AM

to athom...@googlegroups.com

Dear all,

Last year, we had 31 preregistered teams, 26 teams qualified, and
24 teams participated in the competition. So, do we have to divide the
qualified teams? Why not a single group? If two arenas were used, we would
have some problems:
- noise from the other arena (moderation, robot's utterances,...)
- fairness (some referees gave a 50cm bonus to every team using an on-board
microphone in 2009)

Also, I prefer to keep the stage structure as it is. Namely, GPSR should
be in Stage II in my opinion, because GPSR is:
- time consuming for referees, since the referees collect and distribute
objects listed in the team's manipulable objects whenever a sentence
is randomly generated.
- difficult even for high-level teams; only a couple of teams got a
non-zero score in 2010. If GPSR took place in Stage I, most teams
would get 0 points, which is not attractive for audiences.

Thus, I guess the following structure is better, since major changes are
made every two years in RoboCup@Home, and we should not make them this
year. What do you think?

Stage I:
RIPS
Follow Me
Go Get It
Who's Who?
Open Challenge (with evaluation by the team leaders, normalized score)

Stage II (Best 50%):
Enhanced Who's Who

General Purpose Service Robot

Demo Challenge (with evaluation by the team leaders, normalized score)
Shopping Mall

Best,

Komei


Dirk Holz

Jan 13, 2011, 3:00:40 PM

to athom...@googlegroups.com

On Thu, 2011-01-13 at 20:20 +0900, Komei Sugiura wrote:
> Dear all,
>
>
> Last year, we had 31 preregistered teams, 26 teams were qualified, and
> 24 teams participated in the competition. So, do we have to divide
> qualified teams? Why not single group? If two arenas were used, we would
> have some problems:
> - noise from the other arena (moderation, robot's utterances,...)
> - fairness (some referees had 50cm bonus to every team using on-board
> microphone in 2009)
>

I guess we do not need to decide now whether or not we split up the
teams in two groups, but could do that in Istanbul once we see how many
teams made it to RoboCup. So we should just add both possibilities in
the rulebook (having one group of teams and having two groups). That
also makes it more "compatible" with local competitions like
japan/german/iran etc. open.

>
> Also, I prefer to make the stage structure as it is. Namely, GPSR should
> be in Stage II in my opinion, because GPSR is:
> - time consuming for referees since the referees collect and distribute
> objects listed in the team's manipulatable objects whenever the sentence
> is randomly generated.
> - difficult even for high level teams, and only a couple of teams got
> non-zero score in 2010. If GPSR was taken place in Stage I, most teams
> would get 0 point, which is not attractive for audiences.
>
> Thus, I guess the following structure is better since major changes are
> made bi-anualy in RoboCup@Home, and we should not this year. What do you
> think?
> Stage I:
> RIPS
> Follow Me
> Go Get It
> Who's Who?
> Open Challenge (with evaluation by the team leaders, normalized score)
>
>
> Stage II (Best 50%):
> Enhanced Who's Who
> General Purpose Service Robot
> Demo Challenge (with evaluation by the team leaders, normalized score)
> Shopping Mall
>

I completely agree. GPSR is a test for stage II because of the
above points and the fact that it explicitly includes "all the
capabilities" from stage I. Having GPSR in stage I would lead to some
recursive definition :)
It is designed as a stage II test, and it should stay a stage II test
imho.

Cheers,
Dirk

Tijn van der Zant

Jan 13, 2011, 5:51:22 PM

to athom...@googlegroups.com

>> Last year, we had 31 preregistered teams, 26 teams were qualified, and
>> 24 teams participated in the competition. So, do we have to divide
>> qualified teams? Why not single group? If two arenas were used, we would
>> have some problems:
>> - noise from the other arena (moderation, robot's utterances,...)
>> - fairness (some referees had 50cm bonus to every team using on-board
>> microphone in 2009)
>>
> I guess we do not need to decide now whether or not we split up the
> teams in two groups, but could do that in Istanbul once we see how many
> teams made it to RoboCup. So we should just add both possibilities in
> the rulebook (having one group of teams and having two groups). That
> also makes it more "compatible" with local competitions like
> japan/german/iran etc. open.
>

We have to test with 2 groups, because in the future we will have 32
teams. Also, we have about 30 pre-registered teams now, and last year the
schedule was too crowded and mistakes were made. I do not like mistakes,
especially if we can avoid them. So let's be a learning organization.
Also, the argument about noise does not hold. Whether there is one arena
that is used almost 100% of the time or 2 arenas interleaving about 50%
of the time each --> there is always noise.
About the referees --> better referee instructions and training should
solve this. If the referees for both arenas are trained at the same time,
there should be no differences.

>>
>> Thus, I guess the following structure is better since major changes are
>> made bi-anualy in RoboCup@Home, and we should not this year. What do you
>> think?
>> Stage I:
>> RIPS
>> Follow Me
>> Go Get It
>> Who's Who?
>> Open Challenge (with evaluation by the team leaders, normalized score)
>>
>>
>> Stage II (Best 50%):
>> Enhanced Who's Who
>> General Purpose Service Robot
>> Demo Challenge (with evaluation by the team leaders, normalized score)
>> Shopping Mall

This is not a solution. The trustees want more tests in Stage I. So I
guess it is either the Demo Challenge in stage I, or the GPSR. Pick
either one, but one needs to be chosen...

Cheers,

--Tijn

Jesus Savage

Jan 13, 2011, 7:08:16 PM

to athom...@googlegroups.com

Hi,

>> Thus, I guess the following structure is better since major changes are
>> made bi-anualy in RoboCup@Home, and we should not this year. What do you
>> think?
>> Stage I:
>> RIPS
>> Follow Me
>> Go Get It
>> Who's Who?
>> Open Challenge (with evaluation by the team leaders, normalized score)
>>
>> Stage II (Best 50%):
>> Enhanced Who's Who
>> General Purpose Service Robot
>> Demo Challenge (with evaluation by the team leaders, normalized score)
>> Shopping Mall
>
> This is not a solution. The trustees want more tests in Stage I. So I guess it is either the Demo Challenge in stage I, or the GPSR. Pick any one, but one needs to be chosen...

I think that we should have the Demo Challenge in stage I and the GPSR in stage II.

Jesus

> Cheers,
>
> --Tijn

Mohan Rajesh Elara

Jan 13, 2011, 7:44:37 PM

to athom...@googlegroups.com

Dear All,

I agree with Jesus: we should have the Demo Challenge in Stage 1 and GPSR in Stage 2. We could standardize the two-venue approach, as the number of teams might increase in the following years, and minimize hassle for the OC/LOC.

Best Regards,

Mohan

Luca Iocchi

Jan 14, 2011, 4:05:41 AM

to athom...@googlegroups.com

Dear all,
the division in two groups is not relevant for the rulebook, it is just
a matter of internal
organization, for deciding if we have time to allocate all the tests.
This will not affect any local
competition. Also I suggest to write in the rulebook that local
competitions may choose a different
order or a subset of the tests, depending on organizing constraints.

As for the structure of tests, let me summarize my point of view:
1) we have an explicit request from the RoboCup Federation to increase the
minimum number of tests for each team
2) the two-group scheme with <= 30 pre-registered teams should not cause
any problem in allowing more teams to do one test (i.e. moving a test from
Stage II to Stage I)
3) moving a test from Stage II to Stage I is not a major change in the
rules, since all teams are expected to prepare all tests
4) having GPSR as the last test of Stage I or the first of Stage II does
not make any difference in terms of preparation for the teams. It will be
done in any case after basic functionalities have been tested.
The only difference is in the number of teams that will do this test.
It is true that we will have more zero-point scores if we allow all
teams to participate; however, I think that if teams know that this test
is in Stage I, they will prepare it more carefully and the average score
of this test will be better than if it is declared as a Stage II test.
5) GPSR is very important from the scientific viewpoint, and an increase
in performance in it will be very valuable for the teams (I can see many
publications coming out of it) and for the RoboCup@Home League (nobody is
doing this kind of test anywhere in the world!)

Anyway, we have to come to a conclusion, because it is urgent to start
writing the rulebook.
I propose that TC vote for these three options

1) Leave as in 2010
2) Move GPSR to Stage I
3) Move Demo Challenge to Stage I

My vote is for 2

Best,
Luca.

Javier Ruiz-del-Solar

Jan 14, 2011, 5:52:41 AM

to athom...@googlegroups.com, Javier Ruiz-del-Solar

>> Stage II (Best 50%):
>> Enhanced Who's Who
>> General Purpose Service Robot
>> Demo Challenge (with evaluation by the team leaders, normalized score)
>> Shopping Mall
>
> This is not a solution. The trustees want more tests in Stage I. So I guess it is either the Demo Challenge in stage I, or the GPSR. Pick any one, but one needs to be chosen...

I prefer to have GPSR in stage I. The main reason is to give all teams the possibility to start working on this test. Otherwise, teams that normally don't reach stage II will not consider this test an important one and will not put any significant effort into it.

Javier Ruiz-del-Solar

Jan 14, 2011, 5:54:41 AM

to athom...@googlegroups.com, Javier Ruiz-del-Solar

> I propose that TC vote for these three options
>
> 1) Leave as in 2010
> 2) Move GPSR to Stage I
> 3) Move Demo Challenge to Stage I
>
> My vote is for 2

I also vote for option 2.

Tijn van der Zant

Jan 14, 2011, 8:15:44 AM

to athom...@googlegroups.com

I have another reason to have GPSR in stage I: If teams also focus on this test, then other performances should also increase, because this steers teams away from the state-based systems we see all the time. It is essential for the progress of intelligent and social robotics. Also, I'd rather see a small number of good demo challenges than a large number with more 'not-so-good' performances.
1) is not really an option...

My vote is also for 2).

Just for the record for new tc-members:
We have the habit of voting. So at the moment there are 3 votes for 2). Please make your vote heard.

Seyed Mohammad Ghaffarian

Jan 14, 2011, 10:15:15 AM

to athom...@googlegroups.com

Dear all,

Let's take a look at the situation:

1- On one hand, we have an explicit request from the RoboCup Federation to increase the minimum number of tests for each team ... in other words, the trustees want more tests in Stage-I.

2- On the other hand, we have a RoboCup@Home tradition that major modifications to the rulebook are made every two years ... and this year is one in which we shouldn't make major modifications to the rulebook.

So we have a problem and we need a solution. One suggestion was that we move one of the tests from Stage-II to Stage-I, and that test could be either the GPSR or Demo-Challenge. Personally, I don't really like this solution! But we want people to do more tests ... so here's a suggestion:

Instead of moving a test from Stage-II to Stage-I, how about we let each team decide to try one additional test? What I have in mind is that this test could be either from Stage-I or Stage-II. Officially, we have 5 tests in Stage-I (RIPS, Follow me, Go Get it, Who is who, Open-Challenge), and each team would have 6 slots in the first stage. If a team performs poorly in a test (maybe through bad luck), it gets another chance to do better and earn a higher score (of course, this replaces the poor score instead of being added to the total score so far). Or, if a team has done well in all tests of Stage-I, it can choose a test from Stage-II to give it a try and earn more points (which are added to the total score so far).
Note that in Stage-II, teams should only participate in 3 tests (from 4 available choices: GPSR, Demo, Enhanced who is who, Shopping mall). Another note is that if a team has tried a test from the 2nd stage in Stage-I, they cannot repeat that test in Stage-II and thus they will have to participate in the other 3 tests of Stage-II.
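
The bookkeeping for this extra slot can be sketched as follows. The test lists and the replace-versus-add rule are taken from the suggestion above; the function names and the example scores are hypothetical, and the sketch assumes a retry always overwrites the earlier score, since the suggestion does not say what happens when the retry is worse.

```python
STAGE_ONE_TESTS = {"RIPS", "Follow Me", "Go Get It", "Who is Who", "Open Challenge"}
STAGE_TWO_TESTS = {"GPSR", "Demo Challenge", "Enhanced Who is Who", "Shopping Mall"}


def stage_one_total(scores, extra_test, extra_score):
    """Total after the sixth slot; `scores` maps each Stage-I test to points."""
    totals = dict(scores)
    if extra_test in STAGE_ONE_TESTS:
        # Retry of a Stage-I test: the new score replaces the old one.
        totals[extra_test] = extra_score
        return sum(totals.values())
    if extra_test in STAGE_TWO_TESTS:
        # Early attempt at a Stage-II test: the score is added on top.
        return sum(totals.values()) + extra_score
    raise ValueError(f"unknown test: {extra_test}")


def stage_two_choices(extra_test):
    """Stage-II tests still open to the team (a test cannot be repeated)."""
    return sorted(STAGE_TWO_TESTS - {extra_test})


scores = {"RIPS": 500, "Follow Me": 100, "Go Get It": 300,
          "Who is Who": 400, "Open Challenge": 600}
print(stage_one_total(scores, "Follow Me", 700))  # retry replaces 100 with 700
print(stage_one_total(scores, "GPSR", 250))       # Stage-II score is added
print(stage_two_choices("GPSR"))
```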

This is my suggestion. It doesn't solve the problem of major modification, but I think it's more flexible than moving a specific test from Stage-II to Stage-I. Of course, if the TC believes there is no other choice but to move a test from Stage-II to Stage-I, I think Demo-Challenge is a more reasonable choice.

About dividing the teams in Stage-I into 2 parallel groups, I'm concerned about fairness. We have two tests in Stage-I that are scored by the team-leaders (RIPS + Open-Challenge) ... dividing into two groups means: two different refereeing audience! Which could have quite different levels of satisfaction!! This is an important issue. I really don't have a good solution right now but it should be considered by the TC.

Kind Regards
---
Seyed Mohammad Ghaffarian
Computer Engineering Department
Amirkabir University of Technology

Luca Iocchi

Jan 14, 2011, 10:43:17 AM

to athom...@googlegroups.com

Dear Seyed,
thank you for your post.

I have two replies to your suggestion:
1) I do not think that moving a test from Stage II to Stage I is a major change in the rules.
Can you explain why you think so?
2) adding a new free slot where teams can decide what to do is very nice (we had this a couple
of years ago), but also very difficult from the organization viewpoint: can you imagine
30 teams each one deciding a different test, having to organize referees, arrange the arenas,
etc. on the fly? Also this increases substantially the number of slots needed and I am not sure
that in this way we will have enough.

Best regards,
Luca.

Jesus Savage

Jan 14, 2011, 11:11:25 AM

to athom...@googlegroups.com

On Fri, Jan 14, 2011 at 4:52 AM, Javier Ruiz-del-Solar <jru...@ing.uchile.cl> wrote:

> Stage II (Best 50%):
> Enhanced Who's Who
> General Purpose Service Robot
> Demo Challenge (with evaluation by the team leaders, normalized score)
> Shopping Mall
>
> This is not a solution. The trustees want more tests in Stage I. So I guess it is either the Demo Challenge in stage I, or the GPSR. Pick any one, but one needs to be chosen...
>
> I prefer to have GPSR in stage I. The main reason is to give all teams the possibility to start working on this test. Otherwise, teams that normally don't go to stage II will not consider this test as an important one, and will not focus any important effort on it.

I vote for option 3. The reason is that we need to be sure that the robots have the basic capabilities that
we want for service robots, and the teams will start working on that.

If the robots do not have these basic capabilities, then obviously in the GPSR test in stage one they would only perform tricks that simulate having them, and the test becomes nothing more than a robots' beauty contest.

Jesus

Tijn van der Zant

Jan 14, 2011, 11:24:25 AM

to athom...@googlegroups.com

Hi,

> Instead of moving a test from Stage-II to Stage-I, how about we let the teams decide to try another test? What I have in mind is that this test could be either from Stage-I or Stage-II. Officially, we have 5 tests in Stage-I (RIPS, Follow me, Go Get it, Who is who, Open-Challenge), so each team would have 6 chances in the first stage. If a team performs poorly in a test (maybe through bad luck), they get another chance to do better and earn more points (of course, the new score replaces the poor score instead of being added to the total so far). Or, if a team has done well in all tests of Stage-I, they can choose a test from Stage-II to give it a try and earn more (which is added to the total score so far).
> Note that in Stage-II, teams should only participate in 3 tests (from 4 available choices: GPSR, Demo, Enhanced who is who, Shopping mall). Another note is that if a team has tried a test from the 2nd stage in Stage-I, they cannot repeat that test in Stage-II and thus they will have to participate in the other 3 tests of Stage-II.

I think that this is a bigger change to the rules than giving teams the opportunity to participate in one more advanced test, although that is debatable.
I have to agree with Luca: although we like the suggestion and have had it before, organizing it is infeasible. The burden on the organization would be too large.
Also, having a more advanced task in Stage I will (probably) make it clearer which teams are fit to go to Stage II.

> About dividing the teams in Stage-I into 2 parallel groups, I'm concerned about fairness. We have two tests in Stage-I that are scored by the team-leaders (RIPS + Open-Challenge) ... dividing into two groups means two different refereeing audiences, which could have quite different levels of satisfaction! This is an important issue. I really don't have a good solution right now but it should be considered by the TC.

We have been thinking about this and have not come up with a final solution. There are several options. One thing we usually do is to eliminate the (two) lowest and highest scores (as in, for example, ice skating).
Then, for example, after calculating the score we could normalize the points using ranking, and distribute the scoring based on the ranking.

But I think that this will not be needed. A team gets scores from a dozen or more other team leaders. With that many ratings, statistics start to work, and I doubt that the average scores of the two parallel groups will be very far apart.
Two different 'levels of satisfaction' could also arise because, on average, one group had better performances than the other...
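
As a rough illustration of the two ideas above (dropping the extreme ratings, then awarding points by rank within each group), here is a minimal Python sketch. The function names, the sample ratings and the 1500/100 point scale are assumptions for illustration only (the scale echoes the example in the opening post of this thread); none of this is an agreed rule.

```python
def trimmed_mean(ratings, drop=2):
    """Average the ratings after discarding the `drop` lowest and highest."""
    if len(ratings) <= 2 * drop:
        raise ValueError("not enough ratings to trim")
    kept = sorted(ratings)[drop:len(ratings) - drop]
    return sum(kept) / len(kept)


def rank_points(raw_scores, top=1500, step=100):
    """Award points by rank within one group: best team gets `top`,
    the next gets `top - step`, and so on (never below zero).

    Ties are not treated specially in this sketch; tied teams simply
    keep the order in which they appear in `raw_scores`.
    """
    ordered = sorted(raw_scores, key=raw_scores.get, reverse=True)
    return {team: max(top - step * i, 0) for i, team in enumerate(ordered)}


# Hypothetical ratings given to one team by seven team leaders.
print(trimmed_mean([1, 2, 5, 6, 7, 9, 10]))

# Hypothetical trimmed averages for three teams in the same group.
group_a = {"Team A": 6.9, "Team B": 8.1, "Team C": 5.0}
print(rank_points(group_a))
```

Because the points depend only on the rank inside a group, a group whose team leaders rate more generously or more harshly overall ends up with the same point distribution as the other group.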

Regards,

Tijn

Luca Iocchi

Jan 14, 2011, 12:16:51 PM

to athom...@googlegroups.com

>> About dividing the teams in Stage-I into 2 parallel groups, I'm concerned about fairness. We have two tests in Stage-I that are scored by the team-leaders (RIPS + Open-Challenge) ... dividing into two groups means two different refereeing audiences, which could have quite different levels of satisfaction! This is an important issue. I really don't have a good solution right now but it should be considered by the TC.
>
> We have been thinking about this and have not come up with a final solution. There are several options. One thing we usually do is to eliminate the (two) lowest and highest scores (as in, for example, ice skating).
> Then, for example, after calculating the score we could normalize the points using ranking, and distribute the scoring based on the ranking.
>
> But I think that this will not be needed. A team gets scores from a dozen or more other team leaders. With that many ratings, statistics start to work, and I doubt that the average scores of the two parallel groups will be very far apart.
> Two different 'levels of satisfaction' could also arise because, on average, one group had better performances than the other...
>
> Regards,
>
> Tijn

Dear all,
in addition to Tijn's reply, I would add that this is the same kind of variability you have in soccer leagues
when teams are split into groups for the round-robin phase.

Best regards,
Luca.

Mohan Rajesh Elara

Jan 15, 2011, 12:52:42 AM

to athom...@googlegroups.com

Dear All,

> 1) Leave as in 2010
> 2) Move GPSR to Stage I
> 3) Move Demo Challenge to Stage I

I vote for the 2nd choice too.

Best Regards,
Mohan

Seyed Mohammad Ghaffarian

Jan 15, 2011, 5:20:44 AM

to athom...@googlegroups.com

On 01/14/2011 07:13 PM, Luca Iocchi wrote:

> Dear Seyed,
> thank you for your post.
>
> I have two replies to your suggestion:
> 1) I do not think that moving a test from Stage II to Stage I is a major change in the rules.
> Can you explain why you think so?

Dear Luca and Tijn,

The main problem with moving GPSR from Stage-II to Stage-I is not about major modification. We are moving a test from Stage-II to Stage-I so that teams can do more tests in the first stage. But the GPSR is a pretty difficult test and I think most teams will withdraw from it or if not, they won't be able to do much. So if our purpose is to provide people with more chances, I think the GPSR is not a good choice.

> 2) adding a new free slot where teams can decide what to do is very nice (we had this a couple
> of years ago), but also very difficult from the organization viewpoint: can you imagine
> 30 teams each one deciding a different test, having to organize referees, arrange the arenas,
> etc. on the fly? Also this increases substantially the number of slots needed and I am not sure
> that in this way we will have enough.

As we are all engineers, we well know that nothing is achievable without cost. And the more you want to achieve, the greater the cost is. The suggestion of additional time slots is more flexible than moving the GPSR to the first stage and also I believe that an additional time slot will truly provide the teams with more opportunities, while I believe that the GPSR will not (mainly because of its difficulty).

But as I said, achieving this flexibility has costs (organizational costs). I have complete trust in the opinions of the RoboCup@Home TC (including Luca and Tijn), so if they say that this solution is not practical and it is really hard to organize, then I accept and agree with you. But I really believe that the GPSR is not the answer to our problem.

As I mentioned in my previous post, my suggestion does NOT solve the problem of major modification and it's more difficult than simply moving the GPSR to Stage-I and it has organizational costs; but these costs are not without achievements, and I think this is a better solution if we really want people to do more in the first stage.

Tijn van der Zant

Jan 15, 2011, 9:42:24 AM

to athom...@googlegroups.com

Dear Seyed,

> As we are all engineers, we well know that nothing is achievable
> without cost. And the more you want to achieve, the greater the cost
> is. The suggestion of additional time slots is more flexible than
> moving the GPSR to the first stage and also I believe that an
> additional time slot will truly provide the teams with more
> opportunities, while I believe that the GPSR will not (mainly because
> of its difficulty).
>

It would be nice to have additional time slots for teams to redo a test.
But it is really not possible to organize (we've tried). Also that would
imply that we have to actually skip another test, because the schedule
is already very full. So although the idea is good, we simply can't
implement it.

--Tijn

btw, I'm not an engineer, but an AI guy ;-)

Komei Sugiura

Jan 17, 2011, 4:46:15 AM

to athom...@googlegroups.com

Dear all,

@parallel group

> We have to test with 2 groups, because in the future we will have 32
> teams. Also we have about 30 pre-registered teams now and last year
> the schedule was too crowded and mistakes were made. I do not like
> mistakes especially if we can avoid them. So let's have a learning
> organization.

I have a different opinion here. The point is that this year there's
not going to be 32 teams.
I completely agree that we should have a learning organization. Imho,
thinking about potential problems is our duty. So, please clarify the
mistakes so that we can discuss the matter.

> Also the argument of noise does not hold. Whether there is one area
> which is used almost 100% of the time or 2 areas interleaving about
> 50% of the time --> there is always noise.

I don't understand the point. I mean by the word "noise" any source
which is not the user utterance, such as ambient noise, announcement,
moderator's speech, etc.
It is clear that most teams failed to handle noise in Singapore. Even
referees sometimes failed to catch the robots' words since the arena was
too noisy.

>> About dividing the teams in Stage-I into 2 parallel groups, I'm
>> concerned about fairness. We have two tests in Stage-I that are
>> scored by the team-leaders (RIPS + Open-Challenge) ... dividing into
>> two groups means: two different refereeing audience! Which could have
>> quite different levels of satisfaction!! This is an important issue. I
>> really don't have a good solution right now but it should be
>> considered by the TC.

> We have been thinking about this and have not come up with a final
> solution. There are several options. One thing we usually do is to
> eliminate the (two) lowest and highest score (as in, for example, ice
> skating).

I'm also anxious about RIPS (and GPSR) since TC members give partial
scores in RIPS and GPSR. I believe TC members will try to be fair but
it is going to be difficult since the population is small.

@GPSR

> Otherwise, teams that
> normally don't go to stage II will not consider this test as an
> important one, and will not focus any important effort on it.

That's true, Javier. But last year, Dirk, David and I were refereeing
in the GPSR test, and we felt re-arranging the objects was quite
time-consuming. Maybe my explanation was not so clear, though...

My vote is

> 1) Leave as in 2010

since
- GPSR is designed for Stage II.
- Last year most teams scored 0 points.
- Major changes would be necessary since GPSR is a ten-minute, 2000-point test.

Best,

Komei

Tijn van der Zant

Jan 17, 2011, 5:44:26 PM

to athom...@googlegroups.com

Dear all,

> - Major changes would be necessary since GPSR is a ten-minute, 2000-point test.

I do not really understand the arguments about the major changes. The
"not making major changes except for the demo challenge" is about the
tests. It is not even a rule but a general agreement.
Fiddling with the schedule and allowing teams to either do the demo
challenge or the GPSR test in stage I is not a major change to any of
the tests. So let's get past this part of the discussion and decide to
either do the demo challenge or the GPSR in stage one, since we have to
do more in stage I. I have not seen another solution which is also
organizable. The proposal of not having an extra test in stage I is not
a solution to the problem of needing an extra test in stage I...

Best,

Tijn

Tijn van der Zant

Jan 17, 2011, 5:57:29 PM

to athom...@googlegroups.com

Dear all,

> @parallel group
>
>> We have to test with 2 groups, because in the future we will have 32
>> teams. Also we have about 30 pre-registered teams now and last year
>> the schedule was too crowded and mistakes were made. I do not like
>> mistakes especially if we can avoid them. So let's have a learning
>> organization.
>
> I have a different opinion here. The point is that this year there's
> not going to be 32 teams.
> I completely agree to have a learning organization. Imho, thinking
> about potential problems are our duty. So, please clarify the mistakes
> so that we can discuss the matter.

The mistakes (which were all solved) were that we were totally
overwhelmed with the organizational aspect. So to relieve the
organization and make it manageable, from our experience we know that up
to 16 teams is no problem. So the best solution is to have 2 times 16
teams as a maximum, in two different groups.

>> Also the argument of noise does not hold. Whether there is one area
>> which is used almost 100% of the time or 2 areas interleaving about
>> 50% of the time --> there is always noise.
>
> I don't understand the point. I mean by the word "noise" any source
> which is not the user utterance, such as ambient noise, announcement,
> moderator's speech, etc.
> It is clear that most teams failed to handle noise in Singapore. Even
> referees sometimes failed to catch robot's words since the arena was
> too noisy.

That's a problem that we always have at the RoboCup. So what the
technical committee should ensure is that there is a good sound system
and write down the requirements for the local organization so that we
can get the sound system and hear the robots. Perhaps we need several
wireless head sets like they have on those "silent parties". Would that
be a solution?

>>> About dividing the teams in Stage-I into 2 parallel groups, I'm
>>> concerned about fairness. We have two tests in Stage-I that are
>>> scored by the team-leaders (RIPS + Open-Challenge) ... dividing into
>>> two groups means: two different refereeing audience! Which could have
>>> quite different levels of satisfaction!! This is an important issue. I
>>> really don't have a good solution right now but it should be
>>> considered by the TC.
>> We have been thinking about this and have not come up with a final
>> solution. There are several options. One thing we usually do is to
>> eliminate the (two) lowest and highest score (as in, for example, ice
>> skating).
>
> I'm also anxious about RIPS (and GPSR) since TC members give partial
> scores in RIPS and GPSR. I believe TC members will try to be fair but
> it is going to be difficult since the population is small.

Small? We have 30 pre-registered teams and will probably end up with
approx. 26-28 teams. This will create two groups of 13-14 teams. This means
that the @Home league is the largest senior league. But besides this
point: do you have a better solution? It's good to worry, but please
provide a solution or alternative so we can weigh the pros and cons.

> @GPSR
>
>> Otherwise, teams that
>> normally don't go to stage II will not consider this test as an
>> important one, and will not focus any important effort on it.
>
> That's true, Javier. But last year, Dirk, David and I were refereeing
> in the GPSR test, and we felt re-arranging the objects was quite
> time-consuming. Maybe my explanation was not so clear, though...

And that is why we have to organize ourselves and create a schedule
where teams have to provide assistants. Whether they are referees or
people who move objects around is not important, but a schedule is
something we definitely need :-) This is one of the things we learned
from last year.

Cheers,

Tijn

Javier Ruiz-del-Solar

Jan 17, 2011, 6:02:16 PM

to athom...@googlegroups.com, athom...@googlegroups.com

Let us continue with the voting process of the TC.

----------------------
Javier @ iPhone

Komei

Jan 18, 2011, 3:19:32 AM

to atHome2011

Dear all,

> >> We have to test with 2 groups, because in the future we will have 32
> >> teams. Also we have about 30 pre-registered teams now and last year
> >> the schedule was too crowded and mistakes were made. I do not like
> >> mistakes especially if we can avoid them. So let's have a learning
> >> organization.
>
> > I have a different opinion here. The point is that this year there's
> > not going to be 32 teams.
> > I completely agree to have a learning organization. Imho, thinking
> > about potential problems are our duty. So, please clarify the mistakes
> > so that we can discuss the matter.
>
> The mistakes (which were all solved) were that we were totally
> overwhelmed with the organizational aspect. So to relieve the
> organization and make it manageable, from our experience we know that up
> to 16 teams is no problem. So the best solution is to have 2 times 16
> teams as a maximum, in two different groups.

I guess there are going to be many problems besides parallelizing,
since this year the LOC has no experience in holding @home
events. So, how about parallelizing the tests from 2012? We will also be
able to change the rules in 2012 so that the scoring system is based
more on objective evaluation. Then @home is going to be a more
reliable benchmarking test and the fairness matter is (partly) solved.

@ speech processing in noisy environments

> >> Also the argument of noise does not hold. Whether there is one area
> >> which is used almost 100% of the time or 2 areas interleaving about
> >> 50% of the time --> there is always noise.
>
> > I don't understand the point. I mean by the word "noise" any source
> > which is not the user utterance, such as ambient noise, announcement,
> > moderator's speech, etc.
> > It is clear that most teams failed to handle noise in Singapore. Even
> > referees sometimes failed to catch robot's words since the arena was
> > too noisy.
>
> That's a problem that we always have at the RoboCup. So what the
> technical committee should ensure is that there is a good sound system
> and write down the requirements for the local organization so that we
> can get the sound system and hear the robots. Perhaps we need several
> wireless head sets like they have on those "silent parties". Would that
> be a solution?

What I meant is
- The @home environment is already tough for speech recognition
systems.
- Parallelizing the competition introduces more noise sources.
- The solution has not been fully discussed.

@ fairness

> > I'm also anxious about RIPS (and GPSR) since TC members give partial
> > scores in RIPS and GPSR. I believe TC members will try to be fair but
> > it is going to be difficult since the population is small.
>
> Small? We have 30 pre-registered teams and will probably end up with
> approx. 26-28 teams. This will create two groups of 13-14 teams.

It's team leader voting...
In RIPS and GPSR, TC members (2 or 3 persons) give partial scores. So,
the population is small.

Best,

Komei

Luca Iocchi

Jan 18, 2011, 5:34:11 AM

to athom...@googlegroups.com

On 18/01/2011 0.02, Javier Ruiz-del-Solar wrote:
> Let us continue with the voting process of the TC.
>
>

The current status of voting is

1) Leave as in 2010

Komei

2) Move GPSR to Stage I

Luca, Tijn, Javier, Mohan

3) Move Demo Challenge to Stage I

Jesus

Dirk and Anne-Lise are missing

If option 1 is dropped, can Komei vote again for either 2) or 3)?

Please let's close this discussion as soon as possible, because we have
to work on the rulebook, which is also important.

Best,
L.

Luca Iocchi

Jan 18, 2011, 5:46:03 AM

to athom...@googlegroups.com

Dear Komei,
I understand your concerns about parallel groups and I believe we have to take them very seriously.
However, let's wait for the submission of qualification material and the final decision of the OC
about qualification of teams, before finalizing the decision to adopt parallel groups in 2011 or in 2012.

I want to add anyway that other leagues have the same problems with parallel groups, but they live with them.
For example, in the soccer leagues, since there are no walls between the fields, in some cases robots can see
the elements (e.g., goals) of another field. This is sensor noise that teams have to take care of.

In @Home, we can certainly try as much as possible to minimize the sound noise when a test is running,
by using the two @Home fields to interleave the actual runs of the tests, or by defining a schedule such that
when a test critical for speech is running in Field A, Group B will do something not so noisy.

As for the votes of the TC in the tests, I believe that we can guarantee that all TC members
who are required to vote will do so in both groups. So in this matter, there will be no
difference between a single group and parallel groups.

Best regards,
Luca.

Komei

Jan 18, 2011, 8:58:32 PM

to atHome2011

Dear Luca,

OK. Let's wait:)

Best,

Komei

Komei

Jan 18, 2011, 8:59:20 PM

to atHome2011

Dear all,

> If option 1 is dropped, can Komei vote again for either 2) or 3)?

I prefer 3).

Best,

Komei
