Thank you.
Alex
Discounted properties are not implemented for DTMCs (or MDPs) in PRISM.
But if you are interested in experimenting with this, it should be quite
straightforward to adapt the code. For MDPs, you would just need to add
the discount factor into the default value iteration code (and also
disable precomputation algorithms). For DTMCs, you should probably be
able to use the power method to solve them and then proceed as I just
mentioned for MDPs. If you want to pursue this, ask and I can point you
to the relevant parts of the code.
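To give a rough idea of what I mean, below is a minimal standalone
sketch of value iteration with a discount factor added. This is
illustrative Java only, not PRISM's actual source: the array-based MDP
encoding, the names gamma and epsilon, and the convergence test are
all made up for the example, and it skips the precomputation step
entirely (as mentioned above, you would need to disable that anyway).
For a DTMC, the maximisation over actions disappears and the loop is
just the power method:

  // Illustrative sketch only, not PRISM's actual source: value
  // iteration for an MDP with a discount factor gamma added.
  public class DiscountedVI {
      // trans[s][a][t] = probability of moving from s to t under
      //                  action a (every state is assumed to have at
      //                  least one action available)
      // reward[s]      = state reward
      static double[] solve(double[][][] trans, double[] reward,
                            double gamma, double epsilon) {
          int n = reward.length;
          double[] v = new double[n];
          double diff;
          do {
              diff = 0.0;
              double[] vNew = new double[n];
              for (int s = 0; s < n; s++) {
                  double best = Double.NEGATIVE_INFINITY;
                  for (int a = 0; a < trans[s].length; a++) {
                      double sum = 0.0;
                      for (int t = 0; t < n; t++)
                          sum += trans[s][a][t] * v[t];
                      best = Math.max(best, sum);
                  }
                  // The only change w.r.t. undiscounted value
                  // iteration: scale the successor values by gamma.
                  vNew[s] = reward[s] + gamma * best;
                  diff = Math.max(diff, Math.abs(vNew[s] - v[s]));
              }
              v = vNew;
          } while (diff > epsilon);
          return v;
      }
  }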
For either model, the only reward-based infinite-horizon property we
support is the expected cumulated reward until reaching a class of
states (i.e. R=? [ F target ] in PRISM property terminology). Other
reward-based infinite-horizon properties would (I guess) require
discounting.
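If it helps to see the shape of that computation, here is a rough
standalone sketch of how R=? [ F target ] can be computed for a DTMC
by iterating to a fixed point. Again, this is illustrative Java, not
PRISM's code; for simplicity it assumes the target is reached with
probability 1 from every state, and the array encoding and epsilon
test are invented for the example. For an MDP you would additionally
maximise or minimise over the available actions:

  // Illustrative sketch: expected cumulated reward to reach a target
  // set in a DTMC, via iteration to a fixed point. Assumes every
  // state reaches the target with probability 1.
  public class ReachReward {
      // prob[s][t] = transition probability
      // reward[s]  = state reward
      // target[s]  = true if s is in the target set
      static double[] solve(double[][] prob, double[] reward,
                            boolean[] target, double epsilon) {
          int n = reward.length;
          double[] v = new double[n];
          double diff;
          do {
              diff = 0.0;
              double[] vNew = new double[n];
              for (int s = 0; s < n; s++) {
                  // No further reward once the target is reached.
                  if (target[s]) continue;
                  double sum = 0.0;
                  for (int t = 0; t < n; t++)
                      sum += prob[s][t] * v[t];
                  vNew[s] = reward[s] + sum;
                  diff = Math.max(diff, Math.abs(vNew[s] - v[s]));
              }
              v = vNew;
          } while (diff > epsilon);
          return v;
      }
  }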
Hope that helps,
Dave.
On 1/27/2010 12:16 AM, Dave Parker wrote:
> Hi Alex,
>
> [...]
> For either model, the only reward-based infinite-horizon property we
> support is the expected cumulated reward until reaching a class of
> states (i.e. R=? [ F target ] in PRISM property terminology).
But is this really an infinite-horizon property? Reachability reward properties compute
the reward until a set of (reachable) states is reached, which implies the paths are finite.
Or should one specify a reachability reward property for an unreachable state? But
when I do this in PRISM, independently of the value of the reward (e.g., even if the
reward is always 0), the returned expected reward is infinity.
Thank you.
Any path that reaches the goal is finite - yes. But there is no bound
imposed on the lengths of these paths, unlike for finite-horizon
properties. So I'm not sure whether you'd class it as infinite-horizon
or not.
> Or should one specify a reachability reward property for an unreachable
> state? But when I do this in PRISM, independently of the value of the
> reward (e.g., even if the reward is always 0), the returned expected
> reward is infinity.
Yes - this is the definition of such properties in PRISM. E.g., for a
DTMC, any state from which the probability of reaching the goal is less
than 1 has an infinite expected reward of reaching it, even if all the
rewards are 0.
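In code terms, the definition amounts to a step like the following,
applied on top of the numerical solution (a sketch only; reachProb is
assumed to come from a separate probabilistic reachability
computation, e.g. a graph-based precomputation):

  // Sketch of the definition: any state from which the goal is
  // reached with probability < 1 gets expected reward +infinity,
  // regardless of the reward values themselves.
  class ReachRewardDef {
      static double[] applyInfinityRule(double[] v, double[] reachProb) {
          double[] result = v.clone();
          for (int s = 0; s < result.length; s++)
              if (reachProb[s] < 1.0)
                  result[s] = Double.POSITIVE_INFINITY;
          return result;
      }
  }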
Dave.