offline matched RL?


knowledgem...@gmail.com

Oct 13, 2016, 7:46:15 AM10/13/16
to BURLAP Discussion
Hi,

Does BURLAP support offline matched RL learning?
LSPI supports learning from saved episodes.

What about the other learning algorithms, such as TD(lambda)?
From what I have read, the other learning algorithms interact with an environment to get a state, execute an action, and receive the reward and next state.
Do they support offline training?
How would I modify them to train offline while still using value function approximation (VFA)?

Thanks in advance!

knowledgem...@gmail.com

Oct 13, 2016, 7:47:41 AM10/13/16
to BURLAP Discussion
Sorry, I meant batch RL, not matched. :)

James MacGlashan

Nov 8, 2016, 1:18:40 PM11/8/16
to BURLAP Discussion
Hi,

While there is no single pre-defined method that lets you pass in a list of episodes, you can update a TD algorithm from saved data: iterate through each step of an episode, construct an EnvironmentOutcome for that transition, pass it to the critiqueAndUpdate method, and then call endEpisode at the end of each episode you feed it.
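To make the replay loop concrete, here is a minimal, self-contained sketch of that idea. The class and method names (Episode, EnvironmentOutcome, critiqueAndUpdate, endEpisode) mirror BURLAP's, but they are simplified stand-ins so the example compiles on its own; in real code you would use BURLAP's own Episode and EnvironmentOutcome types and the critic from your TD learning agent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchTDReplay {

    /** Stand-in for BURLAP's EnvironmentOutcome: one (s, a, r, s') transition. */
    static class EnvironmentOutcome {
        final String o, a, op;   // observed state, action, next state
        final double r;          // reward received on entering op
        final boolean terminated;
        EnvironmentOutcome(String o, String a, String op, double r, boolean terminated) {
            this.o = o; this.a = a; this.op = op; this.r = r; this.terminated = terminated;
        }
    }

    /**
     * Stand-in for a recorded episode. Convention (as in BURLAP): states has n
     * entries, actions has n-1, and rewards.get(t+1) is the reward received on
     * entering states.get(t+1); rewards.get(0) is a placeholder (0.0).
     */
    static class Episode {
        final List<String> states = new ArrayList<>();
        final List<String> actions = new ArrayList<>();
        final List<Double> rewards = new ArrayList<>();
        int numTimeSteps() { return states.size(); }
    }

    /** A toy tabular TD(0) critic exposing a critiqueAndUpdate/endEpisode interface. */
    static class TabularTDCritic {
        final Map<String, Double> v = new HashMap<>();
        final double alpha, gamma;
        TabularTDCritic(double alpha, double gamma) { this.alpha = alpha; this.gamma = gamma; }

        void critiqueAndUpdate(EnvironmentOutcome eo) {
            double vs  = v.getOrDefault(eo.o, 0.0);
            double vsp = eo.terminated ? 0.0 : v.getOrDefault(eo.op, 0.0);
            double tdError = eo.r + gamma * vsp - vs;   // TD(0) error
            v.put(eo.o, vs + alpha * tdError);
        }

        void endEpisode() { /* a TD(lambda) critic would reset its traces here */ }
    }

    /** Replay saved episodes through the critic, one transition at a time. */
    static void learnFromEpisodes(List<Episode> episodes, TabularTDCritic critic) {
        for (Episode e : episodes) {
            for (int t = 0; t < e.numTimeSteps() - 1; t++) {
                boolean last = (t == e.numTimeSteps() - 2);
                EnvironmentOutcome eo = new EnvironmentOutcome(
                        e.states.get(t), e.actions.get(t),
                        e.states.get(t + 1), e.rewards.get(t + 1), last);
                critic.critiqueAndUpdate(eo);
            }
            critic.endEpisode();  // called once per replayed episode
        }
    }
}
```

The same driving loop works unchanged with a value-function-approximation critic, since the critic only ever sees one EnvironmentOutcome at a time.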

James

Vishma Dias

May 31, 2017, 4:07:45 AM5/31/17
to BURLAP Discussion
Hi,

Can you do Q-learning offline using batch processing? Since runLearningEpisode in the QLearning class selects actions based on the current Q-values, is there any way to feed it an episode of my own for the agent to learn from?

Best Regards,
Vishma.