Setting up action/observation/reward functions for multiple

wj...@stanford.edu

Dec 4, 2017, 7:15:58 PM
to julia-pomdp-users

Hello, 


I am encountering this error when I run my code:


LoadError: MethodError: Cannot `convert` an object of type Tuple{Int64,Int64} to an object of type FState
This may have arisen from a call to the constructor FState(...),
since type constructors fall back to convert methods.
while loading C:\Users\willi\OneDrive\Documents\Stanford\Research\Faucet\Python Faucet Model\Learning Julia\Faucetv2.jl, in expression starting on line 41
setindex!(::Dict{FState,Int64}, ::Int64, ::Tuple{Int64,Int64}) at dict.jl:412
Dict{FState,Int64}(::Base.Generator{Enumerate{Array{Tuple{Int64,Int64},1}},##23#24}) at dict.jl:118
include_string(::String, ::String) at loading.jl:522
include_string(::Module, ::String, ::String) at Compat.jl:174
(::LastMain.Atom.##57#60{String,String})() at eval.jl:74
withpath(::LastMain.Atom.##57#60{String,String}, ::String) at utils.jl:30
withpath(::Function, ::String) at eval.jl:38
macro expansion at eval.jl:72 [inlined]
(::LastMain.Atom.##56#59{Dict{String,Any}})() at task.jl:80

 


I think I am close to figuring this out… but having a bit of trouble with knowing whether to use tuples or arrays.

 

Do I need some function that breaks 2D index values into a pairing of values accessible by the observation, reward and transition functions?


Would function observation(p::FPOMDP, a::Int, sp::FState) be a::Int still, or a::Tuple{Int, Int}, or does it just depend on if I want to use indices or tuples? 


If using tuples, I am assuming:


sp.time > 2 && a != DTEMP[sp.task]

 

Would it now be: if sp.time > 2 && a[1] != DTEMP[sp.task] && a[2] != DFLOW[sp.task]?


Link to code: https://gist.github.com/WilliamJou/98a30e1be430680f805329311a35db2a

Zachary Sunberg

Dec 4, 2017, 10:44:39 PM
to julia-pomdp-users

> I am encountering this error when I run my code:
> while loading C:\Users\willi\OneDrive\Documents\Stanford\Research\Faucet\Python Faucet Model\Learning Julia\Faucetv2.jl, in expression starting on line 41

On line 41, by specifying the type parameters (in the curly braces), you are asserting that the AINDEX dictionary maps FState objects to Ints. This doesn't work because it should map actions to integers. You can just remove the parameters and let the Dict constructor infer them:
const AINDEX = Dict(a=>i for (i,a) in enumerate(actions(p)))
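
To make the difference concrete, here is a minimal self-contained sketch. It does not use POMDPs.jl; the FState fields and the `acts` vector of (temperature, flow) tuples are hypothetical stand-ins for the real problem definition.

```julia
# Hypothetical stand-in for the state type in the thread.
struct FState
    time::Int
    task::Int
end

# Hypothetical action space: (temperature, flow) pairs.
acts = [(t, f) for t in 1:2 for f in 1:2]

# Forcing FState keys fails, because the keys being inserted are tuples
# and there is no convert method from a tuple to FState.
typed_failed = try
    Dict{FState,Int}(a => i for (i, a) in enumerate(acts))
    false
catch err
    err isa MethodError
end

# Letting Dict infer its parameters gives tuple keys and integer values.
aindex = Dict(a => i for (i, a) in enumerate(acts))
```

With this sketch, `aindex[(2, 1)]` looks up the index of the action (2, 1) directly, with no conversion involved.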

> having a bit of trouble with knowing whether to use tuples or arrays.

Tuples are immutable and are the fastest choice when the collection is small. Arrays are mutable and are the fastest choice when the collection is large. Arrays have math operations defined; Tuples do not. StaticArrays.jl provides small immutable arrays that are like tuples but have math operations defined.
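
A small sketch of those differences, using only Base Julia (StaticArrays.jl is left out so the snippet has no dependencies). The values are arbitrary, and the behavior shown is that of current Julia versions.

```julia
t = (1, 2)        # tuple: immutable, fixed length
v = [1, 2]        # array: mutable

v[1] = 10         # assigning into an array is fine

# Assigning into a tuple raises a MethodError (no setindex! for tuples).
tuple_is_immutable = try
    t[1] = 10
    false
catch err
    err isa MethodError
end

# Arrays support elementwise addition with +.
sum_v = v + [1, 1]

# There is no + method for two tuples...
tuple_plus_missing = !applicable(+, t, (3, 4))

# ...although broadcasting with .+ does work on tuples and returns a tuple.
sum_t = t .+ (3, 4)
```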

> Do I need some function that breaks 2D index values into a pairing of values accessible by the observation, reward and transition functions?

I'm not sure exactly what you mean here. Perhaps an example or pointing to a line number in your code would help.

> Would function observation(p::FPOMDP, a::Int, sp::FState) be a::Int still, or a::Tuple{Int, Int}, or does it just depend on if I want to use indices or tuples?

a::Tuple{Int, Int} is correct if you stick with struct FPOMDP <: POMDP{FState, Tuple{Int,Int}, Tuple{Int, Int}}

The annotation should always match action_type(p).
 
> Would it now be: if sp.time > 2 && a[1] != DTEMP[sp.task] && a[2] != DFLOW[sp.task]?

Yes
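
As a rough illustration of that condition with a tuple action, here is a self-contained sketch. The FState fields, the DTEMP and DFLOW values, and the helper name `both_wrong` are all hypothetical; only the condition itself comes from the question above.

```julia
# Hypothetical stand-in for the state type.
struct FState
    time::Int
    task::Int
end

# Hypothetical desired temperature and flow for each task.
const DTEMP = [3, 1]
const DFLOW = [2, 4]

# The condition from the thread, with the action as a (temp, flow) tuple.
function both_wrong(a::Tuple{Int,Int}, sp::FState)
    return sp.time > 2 && a[1] != DTEMP[sp.task] && a[2] != DFLOW[sp.task]
end
```

Note that with && this is true only when both components differ from the desired values. If a mismatch in either component should count, the last && would need to be ||.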

Zachary Sunberg

Dec 4, 2017, 10:44:59 PM
to julia-pomdp-users
Thanks for reposting on this forum btw

wj...@stanford.edu

Dec 5, 2017, 6:04:53 AM
to julia-pomdp-users
No problem, thanks for your help! I am now encountering an issue with generate_sor.

(Sorry for the big error blob)

LoadError: MethodError: no method matching generate_sor(::FPOMDP, ::FState, ::Tuple{Int64,Int64}, ::MersenneTwister)
Closest candidates are:
generate_sor(::POMDPs.POMDP, ::Any, ::Any, ::AbstractRNG) at C:\Users\willi\.julia\v0.6\POMDPs\src\generative_impl.jl:120
while loading C:\Users\willi\OneDrive\Documents\Stanford\Research\Faucet\Python Faucet Model\Learning Julia\Faucetv2.jl, in expression starting on line 76
simulate(::BasicPOMCP.POMCPPlanner{FPOMDP,BasicPOMCP.SolvedPORollout{POMDPToolbox.RandomPolicy{MersenneTwister,FPOMDP,POMDPToolbox.VoidUpdater},POMDPToolbox.VoidUpdater,MersenneTwister},MersenneTwister}, ::FState, ::BasicPOMCP.POMCPObsNode{Tuple{Int64,Int64},Tuple{Int64,Int64}}, ::Int64) at solver.jl:79
search(::BasicPOMCP.POMCPPlanner{FPOMDP,BasicPOMCP.SolvedPORollout{POMDPToolbox.RandomPolicy{MersenneTwister,FPOMDP,POMDPToolbox.VoidUpdater},POMDPToolbox.VoidUpdater,MersenneTwister},MersenneTwister}, ::ParticleFilters.ParticleCollection{FState}, ::BasicPOMCP.POMCPTree{Tuple{Int64,Int64},Tuple{Int64,Int64}}) at solver.jl:23
action(::BasicPOMCP.POMCPPlanner{FPOMDP,BasicPOMCP.SolvedPORollout{POMDPToolbox.RandomPolicy{MersenneTwister,FPOMDP,POMDPToolbox.VoidUpdater},POMDPToolbox.VoidUpdater,MersenneTwister},MersenneTwister}, ::ParticleFilters.ParticleCollection{FState}) at solver.jl:5
next(::POMDPToolbox.POMDPSimIterator{(:b, :s, :a, :r, :o),FPOMDP,BasicPOMCP.POMCPPlanner{FPOMDP,BasicPOMCP.SolvedPORollout{POMDPToolbox.RandomPolicy{MersenneTwister,FPOMDP,POMDPToolbox.VoidUpdater},POMDPToolbox.VoidUpdater,MersenneTwister},MersenneTwister},ParticleFilters.SimpleParticleFilter{FState,ParticleFilters.LowVarianceResampler,MersenneTwister},MersenneTwister,ParticleFilters.ParticleCollection{FState},FState}, ::Tuple{Int64,FState,ParticleFilters.ParticleCollection{FState}}) at stepthrough.jl:82
anonymous at <missing>:?
include_string(::String, ::String) at loading.jl:522
include_string(::Module, ::String, ::String) at Compat.jl:174
(::LastMain.Atom.##57#60{String,String})() at eval.jl:74
withpath(::LastMain.Atom.##57#60{String,String}, ::String) at utils.jl:30
withpath(::Function, ::String) at eval.jl:38
macro expansion at eval.jl:72 [inlined]
(::LastMain.Atom.##56#59{Dict{String,Any}})() at task.jl:80

Do I need to write a new generate_sor function now that my action is a tuple?

Zachary Sunberg

Dec 5, 2017, 2:48:06 PM
to julia-pomdp-users
POMDPs.jl should be able to synthesize generate_sor() automatically if the functions from the explicit interface are implemented correctly. In the last code you posted, you had transition with a::Int, so it won't be able to synthesize generate_sor() for a::Tuple{Int, Int}.

The error messages right now are not very informative, but the latest master version of POMDPs, which doesn't have a registered release yet, has more helpful warnings. You can get that with Pkg.checkout("POMDPs").
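
The underlying mechanism is ordinary multiple dispatch. The toy sketch below is not how POMDPs.jl synthesizes generate_sor(); `toy_transition` and the FState fields are hypothetical, and the point is only that a method annotated with a::Int is not applicable to a tuple action.

```julia
# Hypothetical stand-in for the state type.
struct FState
    time::Int
    task::Int
end

# A toy transition defined only for integer actions.
toy_transition(s::FState, a::Int) = FState(s.time + 1, s.task)

s = FState(1, 1)
a = (2, 3)   # the action is now a tuple

# No method accepts a tuple action yet.
ok_before = applicable(toy_transition, s, a)

# Adding a method whose annotation matches the action type fixes that.
toy_transition(s::FState, a::Tuple{Int,Int}) = FState(s.time + 1, s.task)

ok_after = applicable(toy_transition, s, a)
```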

wj...@stanford.edu

Dec 6, 2017, 6:41:24 PM
to julia-pomdp-users
Hello,

We spoke in person, and the working code is attached here:

wj...@stanford.edu

Dec 7, 2017, 3:36:37 AM
to julia-pomdp-users
Hey Zach,

I have updated a lot of the problems. But I am now running into an error when I enter the stepthrough function at the end.

The error is on line 112 when the initial_user value is greater than 2. I am not sure why this happens, as I try to initialize the states to any possible value in the array [1,2,3,4].

I have added StatsBase as a package and am using some of its weighted sampling functions; I originally thought the error had to do with method overwriting. Thanks!



Zachary Sunberg

Dec 7, 2017, 1:56:27 PM
to julia-pomdp-users
The problem is that observation returns nothing sometimes.

If it doesn't hit one of the returns inside the if statements in lines 75-95, it reaches the end of the function and returns nothing. Then rand is called in line 77 of generative_impl.jl on nothing, which has type Void, and there is no method for that. Don't be too scared to look inside generative_impl.jl. Even if you don't understand the code completely, you can see that line 77 has something to do with calling rand on the return value of observation.

One way to catch this kind of error is with the @inferred test: https://docs.julialang.org/en/stable/stdlib/test/#Base.Test.@inferred. If the compiler can't infer a single concrete return type, it likely means that the function returns something you don't intend on some code path.
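
Here is a minimal sketch of that check. The two functions are hypothetical stand-ins for an observation function with and without a final else branch. It is written for current Julia, where the macro lives in the Test standard library and nothing has type Nothing (called Void in Julia 0.6).

```julia
using Test

# Hypothetical function with a missing else branch: when x <= 2 it falls
# off the end of the if block and implicitly returns nothing.
function obs_missing_else(x::Int)
    if x > 2
        return 1.0
    end
end

# Same function with every path returning a Float64.
function obs_with_else(x::Int)
    if x > 2
        return 1.0
    else
        return 0.0
    end
end

fell_through = obs_missing_else(1) === nothing

# Passes: the inferred return type is the concrete type Float64.
good = @inferred obs_with_else(1)

# Throws: the inferred type is a Union of Float64 and Nothing, which does
# not match the concrete type of the value actually returned.
inferred_failed = try
    @inferred obs_missing_else(3)
    false
catch
    true
end
```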


wj...@stanford.edu

Dec 7, 2017, 4:30:20 PM
to julia-pomdp-users
Ah, I was missing else statements in my observations code that made it return nothing. I will look into base test and generative_impl.jl. Thanks!

wj...@stanford.edu

Dec 8, 2017, 6:47:44 AM
to julia-pomdp-users
Hello!

I have code up and running now, mostly due to your help. 

For validating purposes, I am comparing the POMCP method with a RandomPolicy, and a Policy that selects a state from a random belief state. Currently I am letting the stepthrough function iterate 100 times through each policy, and the random selection from the belief state is outperforming the POMCP algorithm. Do you have any insight into why this may be? 


Sincerely,
William

Zachary Sunberg

Dec 8, 2017, 1:55:27 PM
to julia-pomdp-users
How much is it outperforming it by? It may be that the heuristic policy is near-optimal for this problem since it is pretty simple. You also might need to adjust the POMCP parameters (like the number of iterations and the exploration constant) to get good performance. Looking at the trees that POMCP produces (via D3Trees) can help with this: are the trees consistently valuing actions correctly?
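
To give a feel for what the exploration constant does, here is a toy sketch of a UCB1-style score, the kind of rule POMCP uses to pick actions inside the tree. It is not code from BasicPOMCP; the function names, value estimates and visit counts are made up for illustration.

```julia
# UCB1-style score: value estimate plus an exploration bonus scaled by c.
ucb(q, n, N, c) = q + c * sqrt(log(N) / n)

# Hypothetical statistics for two actions at one tree node.
q = [1.0, 0.8]     # current value estimates
n = [90, 10]       # visit counts per action
N = sum(n)         # total visits to the node

# Index of the action with the highest score for a given c.
best(c) = argmax([ucb(q[i], n[i], N, c) for i in eachindex(q)])

greedy_choice = best(0.0)     # no bonus: picks the higher estimate
exploring_choice = best(1.0)  # bonus favors the rarely tried action
```

With c = 0 the search always follows the current best estimate, while a larger c pushes it toward rarely visited actions. If c is poorly scaled relative to the rewards, the tree can end up exploring too little or too much.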
