RLLib 1.0 released

Abeyruwan, Saminda Wishwajith

May 17, 2013, 11:32:53 PM

to rl-...@googlegroups.com

I am pleased to announce the first full release of RLLib: The Lightweight Standard On/Off Policy Reinforcement Learning Library (C++).

RLLib is a simple and highly effective implementation of incremental standard and gradient temporal-difference learning (GTDL) algorithms for robotics platforms, written in C++. The design of this optimized, lightweight library is inspired by the RLPark API (temporal-difference learning algorithms implemented in Java). RLLib is optimized for very fast duty cycles and has been extensively tested in simulation and on a humanoid robotic platform (e.g., with culling traces, on the RoboCup 3D simulator and on the cognition thread of the Nao V4). The library is released under the Apache License (Version 2.0) and is free to use in research, education, and benchmarking.
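For readers new to incremental temporal-difference learning, here is a minimal sketch of a linear TD(lambda) update with accumulating traces. It is an illustration only and does not use RLLib's classes; the name TDLambdaSketch and its members are hypothetical, and dense feature vectors are assumed for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical illustration; this is NOT RLLib's API.
// Linear TD(lambda) with accumulating eligibility traces over dense features.
struct TDLambdaSketch {
  double alpha;   // step size
  double gamma;   // discount factor
  double lambda;  // trace decay
  std::vector<double> w;  // weight vector
  std::vector<double> e;  // eligibility traces

  TDLambdaSketch(std::size_t n, double alpha_, double gamma_, double lambda_)
      : alpha(alpha_), gamma(gamma_), lambda(lambda_), w(n, 0.0), e(n, 0.0) {}

  // Value estimate v(x) = w . x
  double predict(const std::vector<double>& x) const {
    double v = 0.0;
    for (std::size_t i = 0; i < w.size(); ++i) v += w[i] * x[i];
    return v;
  }

  // One incremental update for the transition (x, r, xNext).
  // Pass an all-zero xNext for a terminal state. Returns the TD error.
  double update(const std::vector<double>& x, double r,
                const std::vector<double>& xNext) {
    const double delta = r + gamma * predict(xNext) - predict(x);
    for (std::size_t i = 0; i < w.size(); ++i) {
      e[i] = gamma * lambda * e[i] + x[i];  // decay, then accumulate
      w[i] += alpha * delta * e[i];         // move weights along the trace
    }
    return delta;
  }
};
```

Each call costs time linear in the number of features, which is the property that makes this family of algorithms suitable for fast duty cycles.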

RLLib has been used successfully in RoboCup 3D soccer simulation, in particular for role assignment in formations:
Saminda Abeyruwan, Andreas Seekircher, and Ubbo Visser. Dynamic Role Assignment using General Value Functions.
In AAMAS 2013, ALA, 2013.

The main features of RLLib include:

    Off-policy prediction algorithms: GTD(lambda), and GQ(lambda),
    Off-policy control algorithms: Greedy-GQ(lambda), Softmax-GQ(lambda), and Off-PAC (which can also be used in an on-policy setting),
    On-policy algorithms: TD(lambda), SARSA(lambda), Expected-SARSA(lambda), Actor-Critic, and Average Actor-Critic (continuous and discrete actions),
    Policies: Random, Random50%Bias, Greedy, Epsilon-greedy, Boltzmann, Normal, and Softmax,
    Efficient dot product implementation for tile-coding-based feature representations (with culling traces),
    Benchmarks: Mountain Car, Mountain Car 3D, Swinging Pendulum, Helicopter and Continuous grid world (Off-PAC paper) environments,
    Optimized for very fast duty cycles (e.g., with culling traces, tested on the RoboCup 3D simulator and on the NAO V4 (cognition thread)),
    Algorithm usage closely follows RLPark, so the learning curve is short for RLPark users, and
    A plethora of examples demonstrating on-policy and off-policy control experiments.
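To make the tile-coding and culling-trace items above more concrete, here is a small sketch of the general idea. It is not taken from RLLib: sparseDot and CulledTraceSketch are hypothetical names, and "culling" is assumed here to mean dropping trace entries whose magnitude falls below a threshold. RLLib's actual implementation may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical illustration; this is NOT RLLib's API.
// Tile coding yields binary, sparse features, so a feature vector can be
// stored as the list of active tile indices. The dot product w . x then
// reduces to summing the weights at those indices.
double sparseDot(const std::vector<double>& w,
                 const std::vector<std::size_t>& active) {
  double sum = 0.0;
  for (std::size_t idx : active) sum += w[idx];
  return sum;
}

// Sparse accumulating trace that discards ("culls") small entries.
// The threshold-based rule is an assumption made for this sketch.
struct CulledTraceSketch {
  std::unordered_map<std::size_t, double> e;  // index -> trace value
  double threshold;

  explicit CulledTraceSketch(double threshold_) : threshold(threshold_) {}

  // decay is typically gamma * lambda; active lists the current tiles.
  void update(double decay, const std::vector<std::size_t>& active) {
    for (auto it = e.begin(); it != e.end();) {
      it->second *= decay;
      if (std::fabs(it->second) < threshold) {
        it = e.erase(it);  // cull negligible entries to bound the work
      } else {
        ++it;
      }
    }
    for (std::size_t idx : active) e[idx] += 1.0;  // accumulate active tiles
  }
};
```

With this representation, the per-step cost depends on the number of active tiles and live trace entries rather than on the total number of tiles.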

RLLib configuration:

   The test cases are executed using:
   ./configure
   make
   ./RLLibTest

RLLib Documentation:

   http://web.cs.miami.edu/home/saminda/rllib.html

RLLib sources:

   git clone https://github.com/samindaa/RLLib.git

Contact:

   Saminda Abeyruwan (sam...@cs.miami.edu)

Thank you!

Saminda Abeyruwan
PhD Student
Dept. of Computer Science,
University of Miami

s.abe...@umiami.edu
(305) 457 9753
http://saminda.org/
