MDP with positive and negative rewards

gabrie...@gmail.com

Jun 12, 2014, 3:10:39 PM
to prismmod...@googlegroups.com
Hi all,
I have an MDP model in which a single reward structure, used to track utility, contains both positive and negative reward items. That is, some actions increase utility while others decrease it. I'd like to use verification to quantify the maximum utility that can be achieved, with a property like Rmax=? [ F s=3 ], where s=3 is some final state. When I try to run the verification, I get the following error: "Reward structure item contains negative rewards..."

Is there any way or workaround to do this? I've seen an example with two reward structures, one for positive rewards and one for negative rewards, but AFAIK PRISM cannot find the maximum of the difference between the two rewards, which is what I would need if I split the rewards into positive and negative reward structures.

Thanks

atna...@gmail.com

Jun 13, 2014, 2:15:59 AM
to prismmod...@googlegroups.com, gabrie...@gmail.com
You can normalize your rewards to a positive range.
If you know the upper and lower bounds of your rewards, you can rescale them to a range such as 0 to 1.

You can use a formula like: ((max_new_range - min_new_range) * (current_value - min_value) / (max_value - min_value)) + min_new_range

I believe that the effect will be the same.
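
As a rough illustration of the arithmetic only, the formula above can be sketched in Python. The function name `rescale` and the sample bounds of -5 and 5 are assumptions made up for this example; nothing here is part of PRISM, and the sketch says nothing about whether the rescaled model has the same optimal strategy.

```python
def rescale(current_value, min_value, max_value,
            min_new_range=0.0, max_new_range=1.0):
    """Linearly map current_value from [min_value, max_value]
    to [min_new_range, max_new_range]."""
    if max_value == min_value:
        raise ValueError("max_value and min_value must differ")
    return ((max_new_range - min_new_range)
            * (current_value - min_value)
            / (max_value - min_value)) + min_new_range


# Hypothetical per-action rewards, assumed to lie in [-5, 5].
rewards = [-5.0, 0.0, 5.0]
scaled = [rescale(r, -5.0, 5.0) for r in rewards]
print(scaled)
```

Note that a reward of 0 does not stay at 0: with these bounds it becomes 0.5, so every step now contributes a positive amount.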

On Thursday, June 12, 2014 10:10:39 PM UTC+3, gabrie...@gmail.com wrote:

Gethin Norman

Jun 13, 2014, 8:37:48 AM
to prismmod...@googlegroups.com, Gethin Norman
Sorry, but I do not see how "normalising" is going to work. I do not have the time to look into this today, but my understanding is that having both positive and negative rewards makes the problem harder to solve, and it cannot simply be reduced to the non-negative case.

For example, adding a negative value to a cumulated reward will decrease it, while after normalising, adding the corresponding value will increase the cumulated reward; these are clearly different effects.
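
A small Python sketch of this effect, using a made-up example rather than a PRISM model: two hypothetical paths to the target, with rewards assumed to lie in [-1, 1] and normalised to [0, 1]. The path names and reward values are invented purely for illustration.

```python
def normalise(r, lo=-1.0, hi=1.0):
    """Map a reward from [lo, hi] to [0, 1]."""
    return (r - lo) / (hi - lo)


# Hypothetical reward sequences along two paths that both reach the target.
paths = {
    "short": [0.0],              # one step, no gain or loss
    "long": [-1.0, -1.0, 1.0],   # three steps, net loss of 1
}

# Cumulated reward per path, before and after normalising each step.
original = {name: sum(rs) for name, rs in paths.items()}
normalised = {name: sum(normalise(r) for r in rs) for name, rs in paths.items()}

best_original = max(original, key=original.get)
best_normalised = max(normalised, key=normalised.get)

print(original, best_original)
print(normalised, best_normalised)
```

With the original rewards the short path is better (0 against -1), but after normalising the long path accumulates more (1.0 against 0.5), because every extra step now adds a non-negative amount. So the maximising choice can change.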

thanks

Gethin