combining team files


Jonas

Dec 7, 2011, 9:56:00 AM12/7/11
to opennero
How does one combine agents from different team files into one file?
What part of the team file defines one agent? How does one determine
which agents to copy from a team file?

Igor Karpov

Dec 7, 2011, 10:11:08 AM12/7/11
to open...@googlegroups.com
Hi Jonas,

On Wed, Dec 7, 2011 at 8:56 AM, Jonas <boxnu...@gmail.com> wrote:
> How does one combine agents from different team files into one file?

You should just be able to combine agents through careful use of a text editor.
Open the two files you are starting with, let's say team1.rtneat and
team2.qlearning, and then create a new file, team3.mixed, which is
initially empty.

> What part of the team file defines one agent?

For evolved networks, this is one genome, starting with:

genomestart ID

And ending with:

genomeend ID

Where ID is the ID of the agent.

For RL-trained Q-tables, each agent is two (very long) lines that look
something like this:

22 serialization::archive 7 0 0 ... OpenNero::TableApproximator 1 0
0 0 0 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
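To make the genome case concrete, here is a rough Python sketch of how you could pull one genome out of a population file by ID, based only on the `genomestart ID` / `genomeend ID` markers described above (the helper name and file names are just examples, not part of OpenNERO):

```python
# Hypothetical helper: extract one genome block
# (genomestart ID ... genomeend ID, markers included)
# from an rtNEAT population file such as team1.rtneat.
def extract_genome(lines, genome_id):
    """Return the lines of the genome with the given ID."""
    block, inside = [], False
    for line in lines:
        if line.split() == ["genomestart", str(genome_id)]:
            inside = True
        if inside:
            block.append(line)
        if line.split() == ["genomeend", str(genome_id)]:
            break
    return block

# Usage: copy genome 42 from team1.rtneat, then paste the
# returned lines into team3.mixed with your text editor.
# with open("team1.rtneat") as f:
#     genome = extract_genome(f.read().splitlines(), 42)
```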

> How does one determine which agents to copy from a team file?

This is easy to do for the neural networks, because the ID mentioned
above corresponds to the agent ID you see after hitting F2 a few
times to display the genome ID above each agent while training. So, if
you see a behavior you like, remember its ID, and after you save
the population, grab that ID's genome for your team. Two things about
this: if your team has fewer genomes than the number of agents on the
field, the loading function will keep cycling over the networks until
every body on the field has a brain. So if you only put one network
into a team, your entire team will be clones of that one network; if
you put two, half will be one and half the other. They won't do
exactly the same thing because they will see slightly different
things, but they should be pretty close.
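The cycling rule described above can be modeled in a few lines of Python (this is an illustration of the described behavior, not OpenNERO's actual loader code):

```python
from itertools import cycle

# Rough model of the described loading rule: if the team file has
# fewer genomes than bodies on the field, cycle over the genomes
# until every body has a brain.
def assign_brains(genomes, num_bodies):
    gen = cycle(genomes)
    return [next(gen) for _ in range(num_bodies)]
```

For example, `assign_brains(["A", "B"], 4)` yields `["A", "B", "A", "B"]`: half the team are copies of the first network and half of the second, just as described.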

I actually don't know how to answer this question for the Q-learning
agents - this is something we haven't thought about yet. We should
really have a similar mechanism where you can see the ID of each agent
and then find it in the file later, or at least sort the agents by
something. I will try to wrap this into tonight's build.

Igor Karpov

Dec 7, 2011, 10:13:39 AM12/7/11
to open...@googlegroups.com

Oh and the second thing about this is that you can't really mix and
match different rtneat populations for training - only for battle. The
reason is because training assumes that all the networks came from the
same history, and their innovation numbers match up - this allows
rtNEAT to make meaningful crossover operations on networks with
different structure.

boxnumber03

Dec 7, 2011, 10:27:36 AM12/7/11
to open...@googlegroups.com
Thank you for fleshing out those details! I look forward to seeing what you can provide for identifying Q-learning agents.

bill.mybiz

Dec 8, 2011, 5:27:06 PM12/8/11
to opennero
I just posted source code for utils to do this more quickly. They can
prune and duplicate genomes, as well as "hybridize" teams. Check out
my other post here:
http://groups.google.com/group/opennero/t/8d921e58026d3921

-Bill

On Dec 7, 9:27 am, boxnumber03 <boxnumbe...@gmail.com> wrote:
> Thank you for fleshing out those details!  I look forward to seeing what
> you can provide for identifying qlearning agents.

bill.mybiz

Dec 8, 2011, 5:28:11 PM12/8/11
to opennero
What would be expected problems if you do mix and match training?



Igor Karpov

Dec 8, 2011, 5:37:29 PM12/8/11
to open...@googlegroups.com
Well, if you mix and match genomes, they would be from different histories, and so the crossover operation wouldn't be as meaningful.

If you mix and match Q-learning and evolving agents, who knows what would happen - the Q-learning agents' performance depends on what else is going on in the environment, and vice versa. I think it would be an interesting case to study. Incidentally there has been previous work that combines Q-learning and neuroevolution - see for example Shimon Whiteson's thesis work:


--Igor.