Question about "Implementing Inverse Reinforcement Learning for 5x5 grid world problem"

nomolos

unread,

Dec 5, 2009, 2:04:42 PM12/5/09

to Reinforcement Learning Mailing List

Hello RL members~!

My name is Bong and I am studying "Inverse Reinforcement Learning"
algorithm
by Andrew Y. Ng, and Stuart Russell which is described on
"Algorithms for inverse reinforcement learning, ICML '00",
and I have something unclear in terms of implementing.

For the case where the policy is known and tabulated reward function,
they presented the following algorithm to find the unknown reward
function R.

(in Latex syntax)
maximize \sum_{i=1}^{N}\min_{a\in\{a_2,\dots,a_k \}} \{(P_{a1}(i)-P_a
(i))(I-\gamma P_{a1})^{-1}R\}-\lambda \|R\|_1

I was trying to solve this using linear program, but found a problem;

General linear programming form is like [maximize c^{T}x], s.t.
Ax<=b.
But for the above equation, it looks like different.
it's like [maximize sum min c^{T}x].
it requires to find "min" value of \{(P_{a1}(i)-P_a(i))(I-\gamma P_
{a1})^{-1}R\}
which can be calculated when R is known only (R is unknown though)
- this is because P_{a1}(i) and P_{a}(i) are vectors, so without
calculate the whole objective function,
it's impossible to know which action gives the minimum value
(scalar),
before matrix and vector (including unknown R) calculation is
complete.

I guess I am missing something,
so is anybody could help out a novice RL member please? Many thanks!

Best regards,
Bong

Abdeslam Boularias

unread,

Dec 6, 2009, 12:27:31 PM12/6/09

to rl-...@googlegroups.com

Hello Bong,

-1- For each state i add a variable y_i.
-2- For each state i and action a different from a_1 (the expert's action) add a constraint y_i \leq \{(P_{a1}(i)-P_a (i))(I-\gamma P_{a1})^{-1}R\}-\lambda \|R\|_1.
-3- The objective function to be maximized is \sum_{i=1}^{N} y_1.

When this function is maximized, the variables y_i are set to \min_{a\in\{a_2, \dots,a_k \}} \{(P_{a1}(i)-P_a (i))(I-\gamma P_{a1})^{-1}R\}-\lambda \|R\|_1

Best,

--
You received this message because you are subscribed to the "Reinforcement Learning Mailing List" group.
To post to this group, send email to rl-...@googlegroups.com
To unsubscribe from this group, send email to
rl-list-u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/rl-list?hl=en

--
Abdeslam Boularias,

Département d'Informatique et de Génie Logiciel
Université Laval
Ste-Foy, Québec, Canada.

nomolos

unread,

Dec 6, 2009, 10:51:09 PM12/6/09

to Reinforcement Learning Mailing List

Thanks for the quick reply.
But still I can't formulate it as a linear program.
* I don't know what is y_i (how to add y_i?)
* 1-norm of the unkowns in the constraint inequality doen's look like
standart linear programming constraints (\|R\|_1)
Does this one can be solved Matlab's "linprog" command?
I'm new with optimization problems so Matlab code will be very
helpful, if available one.
Thanks again~!

Best regards,
Bong

On 12월7일, 오전2시27분, Abdeslam Boularias <boula...@damas.ift.ulaval.ca>
wrote:

Reply all

Reply to author

Forward