Let's develop a brain


Thorsten Kiefer

unread,
Jan 26, 2015, 2:42:01 AM1/26/15
to becca...@googlegroups.com
Hi,
a brain could be constructed from several reinforcement-learning regions.
Each region is connected to other regions by interfaces, one specialized interface per connection.
Each region receives its own reward.

By "connection" I mean that a region gets its input from other regions, and its reward either from the outside world
or from other regions.

We could genetically search for "super" brains.
Let the genetic algorithm decide which outputs of certain regions are connected to the inputs of other
regions. It could even decide which outputs (including inputs from the outside world)
are connected to the reward input of other regions.

Each region could be a SARSA agent with a Fourier basis.
A brain can be an array of SARSA agents plus a set of inputs from the outside world.
These inputs can be rewards and the sensed state.

Each SARSA agent can receive a mixture of inputs: values from the outside world, and rewards and actions from
other regions / SARSA agents inside the brain.
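A minimal sketch of one such region, a SARSA agent with a Fourier cosine basis, might look like the following. This is illustrative only (the class and parameter names are my own, not from any existing library), and it assumes a bounded state normalized to [0, 1]^d and a discrete action set:

```python
import numpy as np

class FourierSarsaAgent:
    """Linear SARSA with a Fourier cosine basis over states in [0, 1]^d.
    One weight vector per discrete action. Illustrative sketch only."""

    def __init__(self, state_dim, n_actions, order=3, alpha=0.01,
                 gamma=0.99, epsilon=0.1, seed=0):
        # All integer coefficient vectors c with entries in 0..order.
        self.coeffs = np.array(
            np.meshgrid(*[range(order + 1)] * state_dim)
        ).T.reshape(-1, state_dim)
        self.w = np.zeros((n_actions, len(self.coeffs)))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.n_actions = n_actions
        self.rng = np.random.default_rng(seed)

    def features(self, state):
        # phi_c(s) = cos(pi * c . s), one feature per coefficient vector c.
        return np.cos(np.pi * self.coeffs @ np.asarray(state, dtype=float))

    def q(self, state, action):
        return self.w[action] @ self.features(state)

    def act(self, state):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.n_actions))
        return int(np.argmax([self.q(state, a) for a in range(self.n_actions)]))

    def update(self, s, a, reward, s_next, a_next, done=False):
        # SARSA target uses the action actually taken in the next state.
        target = reward if done else reward + self.gamma * self.q(s_next, a_next)
        td_error = target - self.q(s, a)
        self.w[a] += self.alpha * td_error * self.features(s)
```

A brain would then be an array of these agents, with each agent's reward and state inputs wired either to the world or to other agents' outputs.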

Best wishes
Thorsten Kiefer

Brandon Rohrer

unread,
Jan 26, 2015, 9:12:16 AM1/26/15
to becca...@googlegroups.com
Thorsten,

That sounds like a fantastic idea! I strongly encourage you to build it. In my experience, even implementing an extremely simplified version of a concept yields a lot of insight. I would love to see what you create.

Brandon


Thorsten Kiefer

unread,
Jan 26, 2015, 4:51:18 PM1/26/15
to becca...@googlegroups.com
Hi Brandon,
my first approach would be "multi-layer learning", where the lowest layer is directly connected to the world
and the upper layer learns something abstract.
There could be more layers, each learning something more abstract the higher up it sits.

And later on, I could implement "multi-region learning".

But at the moment I don't know exactly how to build it.
I need to do some drawings first.
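One way to sketch that layering is below. The details are my own assumptions (each layer here is just a fixed random projection with a nonlinearity, standing in for whatever each layer would actually learn); the point is only the wiring, with the lowest layer seeing the raw world and each higher layer seeing a more abstract representation:

```python
import numpy as np

class AbstractionLayer:
    """One layer of the stack: a fixed random projection plus a nonlinearity.
    In the real design each layer would be a learned encoder; this is only a
    placeholder to show the wiring."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(out_dim, in_dim)) / np.sqrt(in_dim)

    def encode(self, x):
        return np.tanh(self.W @ np.asarray(x, dtype=float))

def build_stack(dims, seed=0):
    """dims = [world_dim, layer1_dim, layer2_dim, ...]; the lowest layer sees
    the raw world, higher layers see increasingly abstract features."""
    return [AbstractionLayer(dims[i], dims[i + 1], seed + i)
            for i in range(len(dims) - 1)]

def forward(stack, observation):
    """Return the representation at every level, raw world first."""
    reps = [np.asarray(observation, dtype=float)]
    for layer in stack:
        reps.append(layer.encode(reps[-1]))
    return reps
```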

-Thorsten

Brandon Rohrer

unread,
Jan 26, 2015, 5:02:19 PM1/26/15
to becca...@googlegroups.com
Sounds totally plausible. I hope you go for it. 

Thorsten Kiefer

unread,
Jan 26, 2015, 6:14:31 PM1/26/15
to becca...@googlegroups.com
With my current thoughts I arrived at 5 brain regions:
1. region: learns to map state-action pairs to a "successor state" (learns the underlying physics)
2. region: learns to map a state to a reward
3. region: the agent, which learns to map a state-reward pair to an action
4. region: learns to map a state to an action

Each region can run in a separate thread.

If we want to reuse the brain across multiple domains,
we need an autoencoder (which I do not understand completely):

5. region: if we have 3 domains with n, m and k elements in the state vector:
    for domain 1: use an autoencoder that encodes an n-element vector into a min(m,n,k)-element vector
    for domain 2: use an autoencoder that encodes an m-element vector into a min(m,n,k)-element vector
    for domain 3: use an autoencoder that encodes a k-element vector into a min(m,n,k)-element vector
   
Input and output of region 1 are connected to the real world.
Input and output of region 2 are connected to the real world.
The input of region 3 is connected to the outputs of regions 1 and 2.
The input of region 4 is connected to the outputs of regions 1 and 3.

If we want behaviour, we simply query region 4 during or after training.

I'm not sure how to connect region 5. Maybe between 1 and 3?
But if so, how do we train region 5?
To train region 5, we would need input-output pairs to present to it, but we only know the input.
People here use autoencoders. How do they work? Where do we get the output portion that must be presented
to region 5, which needs both input and output for training?


Conclusion: 1 RL agent and 4 abstractors (i.e. neural nets or Fourier bases).

Agree?
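On the autoencoder question: the trick is that the training "output" is just the input itself. The network is asked to reproduce its input through a narrower code layer, so the reconstruction error supplies the training signal and no external labels are needed. A minimal sketch (linear layers, plain gradient descent; names and sizes are illustrative):

```python
import numpy as np

class TinyAutoencoder:
    """Single-hidden-layer linear autoencoder. The training target is the
    input itself, so it needs no labels: it learns to compress an n-vector
    into a smaller code and reconstruct it. Illustrative sketch only."""

    def __init__(self, n_in, n_code, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(scale=0.1, size=(n_code, n_in))
        self.W_dec = rng.normal(scale=0.1, size=(n_in, n_code))
        self.lr = lr

    def encode(self, x):
        return self.W_enc @ x

    def reconstruct(self, x):
        return self.W_dec @ self.encode(x)

    def train_step(self, x):
        x = np.asarray(x, dtype=float)
        code = self.W_enc @ x
        x_hat = self.W_dec @ code
        err = x_hat - x                          # reconstruction error
        # Gradients of the loss 0.5 * ||x_hat - x||^2
        grad_dec = np.outer(err, code)
        grad_enc = np.outer(self.W_dec.T @ err, x)
        self.W_dec -= self.lr * grad_dec
        self.W_enc -= self.lr * grad_enc
        return float(0.5 * err @ err)
```

For region 5, each domain would get its own autoencoder trained this way on that domain's states, with all code layers sharing the min(m,n,k) size.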

Brandon Rohrer

unread,
Jan 26, 2015, 6:23:58 PM1/26/15
to becca...@googlegroups.com
Thorsten, that sounds completely plausible. There's no way to know for sure what the gotchas are until you implement it. I've learned that lesson the hard way over and over again. The devil, as they say, is in the details. I recommend you start putting the pieces together. Best of luck!
Brandon

Thorsten Kiefer

unread,
Jan 26, 2015, 6:34:15 PM1/26/15
to becca...@googlegroups.com
Next thought: the agent MUST NOT be connected to the outside world directly.
It simply receives its inputs from autoencoders / neural nets (or whatever else) and feeds its output (an action) to a neural net.
The 3 or 4 surrounding neural nets or Fourier bases talk to the outside world, NOT the agent itself!

Furthermore, as far as I can tell, the autoencoder can be a randomly initialized Fourier basis and need not be trained.

Correct?
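A fixed, untrained encoder along those lines could look like the following random Fourier feature map, z(x) = cos(Wx + b). This is my reading of "randomly initialized Fourier basis" (the function name is illustrative): it projects any input vector to a fixed-size feature vector with no training at all, so each domain can get its own encoder while sharing one output size:

```python
import numpy as np

def make_random_fourier_encoder(in_dim, out_dim, scale=1.0, seed=0):
    """A fixed, untrained encoder using random Fourier features:
    z(x) = cos(W x + b) with random W and b. Maps any in_dim vector
    to an out_dim feature vector in [-1, 1]. Illustrative sketch."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=scale, size=(out_dim, in_dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=out_dim)

    def encode(x):
        return np.cos(W @ np.asarray(x, dtype=float) + b)

    return encode
```

With a shared out_dim, domains with n-, m- and k-element states would each get their own encoder, and the same RLAgent could sit behind all three.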

SeH

unread,
Jan 26, 2015, 6:42:04 PM1/26/15
to becca...@googlegroups.com
Thorsten: if all the components can plug into a common interface, maybe genetic programming could evolve expression trees of the connections, in a similar way to how math expressions are used at a fine-grained level in JuRLs.
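At its simplest, such a search could evolve a region-to-region connection matrix rather than full expression trees. A bare-bones sketch (the fitness function here is a stand-in; in practice it would run the wired brain on a task and return its reward, and all names are my own):

```python
import random

def random_wiring(n_regions):
    """A candidate brain: adjacency matrix where wiring[i][j] = 1 means
    region i's output feeds region j's input."""
    return [[random.randint(0, 1) for _ in range(n_regions)]
            for _ in range(n_regions)]

def mutate(wiring, rate=0.1):
    # Flip each connection bit with the given probability.
    return [[(1 - bit) if random.random() < rate else bit for bit in row]
            for row in wiring]

def evolve(fitness, n_regions=4, pop_size=20, generations=30, seed=0):
    """Elitist genetic search over wirings: keep the best half, refill the
    population with mutated copies of the survivors."""
    random.seed(seed)
    pop = [random_wiring(n_regions) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        pop = elite + [mutate(random.choice(elite)) for _ in elite]
    return max(pop, key=fitness)
```

Swapping the matrix for an expression-tree genome would give the finer-grained search SeH describes; the selection loop stays the same.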

Thorsten Kiefer

unread,
Jan 26, 2015, 7:30:35 PM1/26/15
to becca...@googlegroups.com
At the moment I have 4 autoencoders (I call them NeuroMaps).
The only thing that can be reused between different domains is the RLAgent.
Maybe the number of inputs to the RLAgent could be something for the genetic algorithm to optimize.
But maybe here "bigger is better"?
The connection configuration between the regions, as I described it in the first post, is not variable, as I found out.
Between the first post and this one, I changed my view of the brain.
At the moment it is not generic enough to be optimized with genetic programming.
It is pretty fixed for now, but might change in the future and be generalized.

Valentin Carausan

unread,
Jan 26, 2015, 7:35:36 PM1/26/15
to becca...@googlegroups.com
I'm sorry, but I saw the messages very late; I only just noticed one and checked my mail now.

S.C. ATOM SCP S.R.L.
1. Web: http://atomscp.byethost7.com
2. Web: http://www.atomscp.aaz.ro
Tel: 0721541806
Fix/Fax: 0244530606

Valentin Carausan

unread,
Jan 26, 2015, 7:35:49 PM1/26/15
to becca...@googlegroups.com
ok

Valentin Carausan

unread,
Jan 26, 2015, 7:39:33 PM1/26/15
to becca...@googlegroups.com

I also disagree with you.

Valentin Carausan

unread,
Jan 26, 2015, 7:43:25 PM1/26/15
to becca...@googlegroups.com
I agree with you;
it does not have the correct variables required.

Thorsten Kiefer

unread,
Jan 26, 2015, 7:45:09 PM1/26/15
to becca...@googlegroups.com

Yes, the brain is not variable enough.

Valentin Carausan

unread,
Jan 26, 2015, 7:47:47 PM1/26/15
to becca...@googlegroups.com
Variable enough, ok.

Valentin Carausan

unread,
Jan 26, 2015, 7:51:42 PM1/26/15
to becca...@googlegroups.com
I will try to connect them over an IP network to see if they can identify and learn a number from each other, and I'll try to see whether, exchanging over IP, they recognize simple numbers they were taught.


Valentin Carausan

unread,
Jan 26, 2015, 7:55:09 PM1/26/15
to becca...@googlegroups.com
but now I go to sleep :)

Thorsten Kiefer

unread,
Jan 26, 2015, 7:57:24 PM1/26/15
to becca...@googlegroups.com

Me too :-)
