Discounted rewards and infinite time horizon in PRISM

Alex Susu

Jan 25, 2010, 6:02:36 AM
to prismmod...@googlegroups.com
Hello.
Could you please tell me if it is possible to specify discounted rewards for a (DTMC)
model in PRISM.
If the answer is yes, then is there a way to compute cumulative path reward
properties with infinite time bound (horizon)?

Thank you.
Alex

Dave Parker

Jan 26, 2010, 5:16:26 PM
to prismmod...@googlegroups.com
Hi Alex,

Discounted properties are not implemented for DTMCs (or MDPs) in PRISM.
But if you are interested in experimenting with this, it should be quite
straightforward to adapt the code. For MDPs, you would just need to add
the discount factor into the default value iteration code (and also
disable precomputation algorithms). For DTMCs, you should probably be
able to use the power method to solve them and then proceed as I just
mentioned for MDPs. If you want to pursue this, ask and I can point you
to the relevant parts of the code.
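
To make the shape of the change concrete, here is a minimal, self-contained
sketch of discounted expected total reward for a DTMC, computed by iterating
the Bellman operator until convergence. All of the names below are invented
for the example; they are not PRISM classes, and in PRISM itself you would
make the analogous change inside the existing solution code.

// Sketch: discounted expected total reward for a DTMC, computed by
// iterating  v <- r + discount * P v  until successive vectors are close.
// Purely illustrative; none of these names come from PRISM.
public class DiscountedDtmc {

    // Sparse row of the transition matrix for one state.
    static class Row {
        int[] succ;     // successor state indices
        double[] prob;  // corresponding probabilities
    }

    static double[] solve(Row[] dtmc, double[] stateReward,
                          double discount, double epsilon, int maxIters) {
        int n = dtmc.length;
        double[] v = new double[n];
        double[] vNew = new double[n];
        for (int iter = 0; iter < maxIters; iter++) {
            double maxDiff = 0.0;
            for (int s = 0; s < n; s++) {
                double sum = 0.0;
                for (int k = 0; k < dtmc[s].succ.length; k++)
                    sum += dtmc[s].prob[k] * v[dtmc[s].succ[k]];
                // the discount factor multiplies the expected future value
                vNew[s] = stateReward[s] + discount * sum;
                maxDiff = Math.max(maxDiff, Math.abs(vNew[s] - v[s]));
            }
            double[] tmp = v; v = vNew; vNew = tmp;
            if (maxDiff < epsilon)
                break; // converges geometrically for discount < 1
        }
        return v;
    }
}

For an MDP, the state loop would take the max (or min) over the available
choices of reward + discount * sum, and the usual precomputation step would
be skipped, since with a discount factor strictly below 1 every state gets a
finite value anyway.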

For either model, the only reward-based infinite-horizon property we
support is the expected cumulated reward until reaching a class of
states (i.e. R=? [ F target ] in PRISM property terminology). Other
reward-based infinite-horizon properties would (I guess) require
discounting.

Hope that helps,

Dave

Alex Susu

Jan 27, 2010, 5:48:12 AM
to prismmod...@googlegroups.com
Hello.

On 1/27/2010 12:16 AM, Dave Parker wrote:
> Hi Alex,
>
> [...]


> For either model, the only reward-based infinite-horizon property we
> support is the expected cumulated reward until reaching a class of
> states (i.e. R=? [ F target ] in PRISM property terminology).

But is this really an infinite-horizon property? Reachability reward properties compute
the reward until a set of (reachable) states is reached, which implies the paths are finite.
Or should one specify a reachability reward property to an unreachable state? But
when I do this in PRISM, the returned expected reward is infinity regardless of the
reward values (e.g., even if the reward is always 0).

Thank you.

Dave Parker

Jan 27, 2010, 6:08:49 AM
to prismmod...@googlegroups.com

> But is this really an infinite-horizon property? Reachability reward
> properties compute the reward until a set of (reachable) states is
> reached, which implies the paths are finite.

Any path that reaches the goal is finite - yes. But there is no bound
imposed on the lengths of these paths, unlike for finite-horizon
properties. So I'm not sure whether you'd class it as infinite-horizon
or not.

> Or should one specify a reachability reward property to an unreachable
> state? But when I do this in PRISM, the returned expected reward is
> infinity regardless of the reward values (e.g., even if the reward is
> always 0).

Yes - this is the definition of such properties in PRISM. E.g., for a
DTMC, any state from which the probability of reaching the goal is less
than 1 has an infinite expected reward to reach it, even if all rewards
are 0.

Dave.

Mohammad Abdulaziz

Mar 30, 2022, 4:32:48 PM
to PRISM model checker
Hi David,

This seems like an old thread by now, so I apologise if my question was answered elsewhere.

Could you please point me to the places in the code base which would allow me to use discounted rewards and to disable the pre-computations?

best,
Mohammad

Dave Parker

Apr 1, 2022, 3:26:57 AM
to prismmod...@googlegroups.com, Mohammad Abdulaziz
Hi Mohammad,

If you want a place to experiment with this (rather than adding it
cleanly to the tool), I would suggest (a) using the (pure Java) explicit
engine, and (b) hijacking the finite-horizon cumulative reward
computation (i.e., Rmax=?[C<=k]). The code for expected reward up to a
target (Rmax=?[F "goal"]) is a bit complicated, and the code for total
expected reward (Rmax=?[C]) is incomplete.

Look at computeCumulativeRewards in explicit/MDPModelChecker.java. This
ultimately calls mvMultRewSingle() in explicit/MDP.java. I guess you
want to prepend "discount *" to sumOverTransitions().
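
Purely as an illustration of where the factor goes, the computation for a
single state/choice then has this shape (the name and signature below are
made up for the example; the real method is mvMultRewSingle() in
explicit/MDP.java):

// Illustrative stand-in for a single state/choice backup with discounting:
// the choice's reward plus discount times the probability-weighted sum of
// successor values. The "discount *" in front of the sum is the change
// suggested above.
static double backupOneChoice(double reward, int[] succ, double[] prob,
                              double[] vect, double discount) {
    double sum = 0.0;
    for (int k = 0; k < succ.length; k++)
        sum += prob[k] * vect[succ[k]];
    return reward + discount * sum;
}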

If you want to add a convergence check, look for calls to
PrismUtils.doublesAreClose() elsewhere in MDPModelChecker.java.
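
Roughly, such a check compares successive value vectors and stops once they
agree to within some epsilon. A stand-in with the same flavour (not PRISM's
actual helper, whose exact signature and handling of relative vs. absolute
differences may differ):

// Stand-in for a doublesAreClose()-style convergence check: true when two
// successive value vectors agree to within epsilon, absolutely or relatively.
static boolean closeEnough(double[] v1, double[] v2, double epsilon, boolean absolute) {
    for (int i = 0; i < v1.length; i++) {
        double diff = Math.abs(v1[i] - v2[i]);
        if (!absolute)
            diff /= Math.max(Math.abs(v2[i]), 1e-12); // avoid division by zero
        if (diff > epsilon)
            return false;
    }
    return true;
}

The hijacked finite-horizon loop would then terminate on this condition
rather than (or as well as) after a fixed number of steps.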

Let me know if you need further guidance.

Best wishes,

Dave

Mohammad Abdulaziz

Apr 13, 2022, 9:09:05 AM
to PRISM model checker
Hi Dave,

Thanks for your response.

I am unable to follow your instructions, unfortunately. I changed the function mvMultRewSingle() based on the instructions below:

> Look at computeCumulativeRewards in explicit/MDPModelChecker.java. This
> ultimately calls mvMultRewSingle() in explicit/MDP.java. I guess you
> want to prepend "discount *" to sumOverTransitions().

I call prism with the following command, but the change has effects.

prism simple.nm simple.pctl -explicit

I have not done (b), mainly because I do not understand what hijacking the code means, and that is probably why I see no effect. Could you please explain this:

> (b) hijacking the finite-horizon cumulative
> reward computation (i.e., Rmax=?[C<=k]). The code for expected reward up
> to a target (Rmax=?[F "goal"]) and for total expected reward (Rmax=?[C])
> is a bit complicated and incomplete, respectively.

I am attaching the inputs, just in case that makes a difference.

best,
Mohammad
simple.nm
simple.pctl

Mohammad Abdulaziz

Apr 13, 2022, 9:34:14 AM
to PRISM model checker
Just a small correction: the change I made has NO effect.
Mohammad

Mohammad Abdulaziz

Apr 13, 2022, 12:46:19 PM
to PRISM model checker
OK, sorry. Now I see what you mean by hijacking the code -- just implement my function in it. It works!

Many thanks!

best,
Mohammad

Mohammad Abdulaziz

Apr 24, 2022, 1:30:19 PM
to PRISM model checker
Hi Dave,

I have a follow-up question: I would like to compute infinite-horizon discounted rewards for sparsely represented MDPs. Unfortunately, the explicit representation is very slow, and I know that the models I have are actually sparse, so they could benefit from the sparse engine.

I figured out that I need to make the changes in PS_NondetCumulReward.cc. I suspect the discounting should be applied at line 199. Is that correct?

I cannot, however, find the convergence check in C++. Could you please point to an example of it being used?

best,
Mohammad

Dave Parker

Apr 26, 2022, 11:50:46 AM
to prismmod...@googlegroups.com, Mohammad Abdulaziz
Hi Mohammad,

Yes, line 199 is the comparable line in the sparse engine code.

That code has no convergence check, because it performs a fixed number
of iterations. See for example line 300 in PS_NondetReachReward.cc for a
convergence check in similar code for R[F] properties.

Best wishes,

Dave