We're working on our final project for AA228, and we'd like to find a way to make our reward a function of the current belief state. Is there any way to do this in POMDPs.jl?
As background, our problem is robot scent search: we are trying to localize a stationary person in a grid world. Our reward function is currently the inverse of the distance between the robot and the person. Since the robot does not actually know where the person is, we would instead like the reward to be based on the distances between the robot and several samples drawn from the belief over the person's location.
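For concreteness, here is a rough sketch of the kind of reward we have in mind. It is written in plain Julia and does not use any POMDPs.jl functions. Every name in it (`ParticleBelief`, `inv_dist_reward`, `belief_reward`, `expected_reward`) is hypothetical. It also makes assumptions of our own: the belief is a weighted set of particles, distance is Manhattan, and 1 is added to the distance to avoid dividing by zero.

```julia
using Random

# Hypothetical helper: positions are (x, y) grid cells, distance is Manhattan.
manhattan(a, b) = abs(a[1] - b[1]) + abs(a[2] - b[2])

# Inverse-distance reward for one hypothesized person position.
# The +1 is an assumption that keeps the reward finite when the
# robot stands on the person's cell.
inv_dist_reward(robot, person) = 1.0 / (1.0 + manhattan(robot, person))

# Hypothetical belief representation: weighted particles over the
# person's position. Weights are assumed nonnegative and to sum to 1.
struct ParticleBelief
    particles::Vector{Tuple{Int,Int}}
    weights::Vector{Float64}
end

# Draw a particle index with probability proportional to its weight.
function sample_index(rng::AbstractRNG, w::Vector{Float64})
    u = rand(rng)
    c = 0.0
    for (i, wi) in enumerate(w)
        c += wi
        u <= c && return i
    end
    return length(w)  # guard against floating-point round-off
end

# Monte Carlo estimate: average the inverse-distance reward over
# n_samples person positions drawn from the belief.
function belief_reward(rng::AbstractRNG, robot, b::ParticleBelief; n_samples=100)
    total = 0.0
    for _ in 1:n_samples
        person = b.particles[sample_index(rng, b.weights)]
        total += inv_dist_reward(robot, person)
    end
    return total / n_samples
end

# Exact expectation over the particle set, for comparison.
expected_reward(robot, b::ParticleBelief) =
    sum(w * inv_dist_reward(robot, p) for (p, w) in zip(b.particles, b.weights))
```

For example, `belief_reward(MersenneTwister(1), (1, 1), ParticleBelief([(1, 1), (3, 1)], [0.5, 0.5]))` would estimate the reward for a robot at `(1, 1)` when the person is equally likely to be at either of two cells. What we don't know is how to hook a function like this into a POMDPs.jl problem definition.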
UndefVarError: Sim not defined
Stacktrace:
[1] include_string(::String, ::String) at .\loading.jl:515