Article: Decision-Theoretic Planning with non-Markovian Rewards

jai...@ptolemy.arc.nasa.gov

Jan 29, 2006, 2:14:35 PM

JAIR is pleased to announce the publication of the following article:

Thiebaux, S., Gretton, C., Slaney, J., Price, D. and Kabanza, F. (2006)
"Decision-Theoretic Planning with non-Markovian Rewards",
Volume 25, pages 17-74.

For quick access via your WWW browser, use this URL:
http://www.jair.org/abstracts/thiebaux06a.html

Abstract:
A decision process in which rewards depend on history rather than
merely on the current state is called a decision process with
non-Markovian rewards (NMRDP). In decision-theoretic planning, where
many desirable behaviours are more naturally expressed as properties
of execution sequences rather than as properties of states, NMRDPs
form a more natural model than the commonly adopted fully Markovian
decision process (MDP) model. While the more tractable solution
methods developed for MDPs do not directly apply in the presence of
non-Markovian rewards, a number of solution methods for NMRDPs have
been proposed in the literature. These all exploit a compact
specification of the non-Markovian reward function in temporal logic,
to automatically translate the NMRDP into an equivalent MDP which is
solved using efficient MDP solution methods. This paper presents
NMRDPP (Non-Markovian Reward Decision Process Planner), a software
platform for the development and experimentation of methods for
decision-theoretic planning with non-Markovian rewards. The current
version of NMRDPP implements, under a single interface, a family of
methods based on existing as well as new approaches which we describe
in detail. These include dynamic programming, heuristic search, and
structured methods. Using NMRDPP, we compare the methods and identify
certain problem features that affect their performance. NMRDPP's
treatment of non-Markovian rewards is inspired by the treatment of
domain-specific search control knowledge in the TLPlan planner, which
it incorporates as a special case. In the First International
Probabilistic Planning Competition, NMRDPP was able to compete and
perform well in both the domain-independent and hand-coded tracks,
using search control knowledge in the latter.
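
As a rough illustration of the translation idea mentioned in the abstract
(not NMRDPP's actual method, which works from temporal logic reward
specifications), the sketch below hand-codes a single bit of history for a
toy domain. The domain, the names STATES, ACTIONS, GOAL, GAMMA, transitions,
step_expanded and value_iteration, and all probabilities are assumptions made
up for this example. The reward "1 the first time the goal state is reached"
is non-Markovian over the base states, but becomes Markovian once each state
is paired with a flag recording whether the goal has already been seen, so
ordinary value iteration applies.

```python
from itertools import product

# Hypothetical toy domain: three base states on a cycle, two actions.
STATES = [0, 1, 2]
ACTIONS = ["stay", "go"]
GOAL = 2
GAMMA = 0.9  # discount factor (assumed)


def transitions(s, a):
    """Return [(probability, next_state)] for the base dynamics."""
    if a == "stay":
        return [(1.0, s)]
    nxt = (s + 1) % len(STATES)
    return [(0.8, nxt), (0.2, s)]  # "go" succeeds with probability 0.8


def step_expanded(es, a):
    """Dynamics and rewards over expanded states (s, seen).

    `seen` records whether GOAL was reached earlier in the history.
    With that flag in the state, the reward depends only on the
    current expanded state and its successor, i.e. it is Markovian.
    """
    s, seen = es
    out = []
    for p, s2 in transitions(s, a):
        first_visit = (s2 == GOAL) and not seen
        reward = 1.0 if first_visit else 0.0
        out.append((p, (s2, seen or s2 == GOAL), reward))
    return out


def value_iteration(tol=1e-9):
    """Standard value iteration on the expanded (Markovian) model."""
    expanded = list(product(STATES, [False, True]))
    V = {es: 0.0 for es in expanded}
    while True:
        delta = 0.0
        for es in expanded:
            best = max(
                sum(p * (r + GAMMA * V[es2])
                    for p, es2, r in step_expanded(es, a))
                for a in ACTIONS
            )
            delta = max(delta, abs(best - V[es]))
            V[es] = best
        if delta < tol:
            return V


V = value_iteration()
for es in sorted(V):
    print(es, round(V[es], 4))
```

In this sketch the same base state gets different values depending on
history: (0, False) still has the reward ahead of it, while (0, True) is
worth nothing because the one-off reward has already been collected. This
version enumerates every (state, flag) pair up front, including pairs that
may never be reached from a given start state.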

The article is available via:

-- comp.ai.jair.papers (also see comp.ai.jair.announce)

-- World Wide Web: The URL for our World Wide Web server is
http://www.jair.org/
For direct access to this article and related files try:
http://www.jair.org/abstracts/thiebaux06a.html

-- Anonymous FTP from Carnegie Mellon University (USA):
ftp://ftp.cs.cmu.edu/project/jair/volume25/thiebaux06a.ps
The compressed PostScript file is named thiebaux06a.ps.Z

For more information about JAIR, visit our WWW or FTP sites, or
contact jai...@isi.edu

--
Steven Minton
JAIR Managing Editor
