I'm not sure how much messier that project is than the download you have. At least somewhat messier. Good luck, keep me posted.
On 2010-10-07, at 2:15 PM, Luis Carlos Cobo Rus wrote:
> On Thu, Oct 7, 2010 at 2:37 PM, Brian Tanner <br...@tannerpages.com> wrote:
>> Hi Luis. Sorry I didn't respond to your other email yet, a bit busy here.
>> But I'll respond to this one now, inline :)
>
> Heh, well, I think a 1-day response time to a complete stranger is
> actually pretty good :)
>
>> On 2010-10-07, at 1:13 PM, Luis Carlos Cobo Rus wrote:
>> This is probably true. I do most of my work in undiscounted problems and I
>> might have been lazy on initial implementation and never went back.
>> Apologies if that's true. Scanning source code... I agree about Sarsa0 but
>> disagree about SarsaLambda.
>> In SarsaLambdaLearningModule gamma is set in the constructor (right?):
>> this.gamma=theTaskObject.getDiscountFactor();
>
> :O Not in the copy I have. I got the package from the rl-library web
> (http://library.rl-community.org/wiki/Sarsa_Lambda_Tile_Coding_(Java)),
> which is version R-30. Where can I get the latest version? (Sorry for
> bugging you about the old version.)
>
>> I would be very pleased to discuss anything you find and to have your
>> changes in the repository. If someone is using my code, that makes me
>> happy. Send me your Google account e-mail address and I'll add you to the
>> project.
>
> luisc...@gmail.com
>
>> Oh, and about your serializing to a file and loading it back up... that's
>> not a trivial thing to get working, so congrats if it is. I don't mean for
>> that to sound condescending: but you wouldn't believe the sorts of questions
>> I get about RL-Glue on a semi-regular basis. It's nice to talk to competent
>> people :P
>
> Haha, well it is not completely working yet, but hopefully will be
> soon. Thanks for the kudos :)
>
>> I'd be happy to have those changes in the repository as well (subject to my
>> limited scrutiny).
>
> Great, let me know where I can find the most up-to-date version, I'll
> be glad to contribute.
>
> Thanks, I owe you a beer!
>
> --
> Luis Carlos Cobo Rus GnuPG ID: 44019B60
--
Brian Tanner
PhD Student, University of Alberta
br...@tannerpages.com
They are not implemented in a very elegant way (e.g. EGSL now supports
a few ad hoc commands besides the rlViz ones, and I had to modify an
interface, probably breaking all other classes that implement that
interface), but the changes are clean and easy to understand. The
problem is that saving the policy does not really work yet. I am sure
I am saving and loading the 'theWeights' array in each function
approximator properly, but the performance is not right, so I must be
missing something. I will keep looking around, but any clues about
other state I should save/load would be welcome.
One parameter that I am ignoring is minimumTrace in
SarsaLambdaLearningModule, but that should not deteriorate agent
performance, and furthermore, I believe the parameter should be reset
every time the traces are cleared.
Brian: there is a lot more stuff in the repository than in the copy I
was working with, but for what matters to this agent, everything is
pretty much the same, except that the bug I was concerned about is
fixed, which had a huge impact on performance in my tests. I think
it would be great if you removed the outdated package from the
rl-library website and included a pointer to the svn repository
instead.
Thanks!
I finally solved this issue:
the static attribute rndseq[] in TileCoder.java is initialized the
first time the hash_UNH() method is called (controlled by a first_call
attribute). Saving and restoring that array and then setting the
first_call variable to false does the trick. Basically, I was saving
the weights, but the coordinate-to-tiles mapping was changing on every
run...
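
For anyone hitting the same problem, here is a minimal sketch of the idea
in Java. It is not the rl-library TileCoder: HashedTilerSketch,
TileStateDemo, snapshot() and restore() are hypothetical names, the hash
is a simplified stand-in for hash_UNH(), and the random table is an
instance field rather than a static one so that two "runs" can be
simulated inside one JVM. Unlike the fix described above, it also writes
the first-call flag to the stream instead of forcing it to false on load.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;

/** Hypothetical stand-in for a hashing tile coder; NOT the rl-library class. */
class HashedTilerSketch {
    private static final int TABLE_SIZE = 2048;
    // Random table, filled lazily on the first hash() call (like rndseq[]).
    private final int[] rndseq = new int[TABLE_SIZE];
    private boolean firstCall = true;
    private final Random rng = new Random(); // unseeded: differs between runs

    /** Maps integer coordinates to an index in [0, memorySize). */
    int hash(int[] coords, int memorySize) {
        if (firstCall) {
            for (int k = 0; k < TABLE_SIZE; k++) {
                rndseq[k] = rng.nextInt();
            }
            firstCall = false;
        }
        long sum = 0;
        for (int i = 0; i < coords.length; i++) {
            // Simplified mixing step (an assumption, not the real hash_UNH).
            int index = Math.floorMod(coords[i] + 449 * i, TABLE_SIZE);
            sum += rndseq[index];
        }
        return (int) Math.floorMod(sum, (long) memorySize);
    }

    /** Writes the flag and the random table; call this alongside the weights. */
    void save(DataOutputStream out) throws IOException {
        out.writeBoolean(firstCall);
        for (int v : rndseq) {
            out.writeInt(v);
        }
    }

    /** Restores the flag and the table so hash() will not regenerate them. */
    void load(DataInputStream in) throws IOException {
        firstCall = in.readBoolean();
        for (int k = 0; k < TABLE_SIZE; k++) {
            rndseq[k] = in.readInt();
        }
    }
}

class TileStateDemo {
    static byte[] snapshot(HashedTilerSketch tiler) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                tiler.save(out);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void restore(HashedTilerSketch tiler, byte[] saved) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(saved))) {
            tiler.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] coords = {3, -7, 12};
        HashedTilerSketch firstRun = new HashedTilerSketch();
        int before = firstRun.hash(coords, 4096);
        byte[] saved = snapshot(firstRun);

        HashedTilerSketch secondRun = new HashedTilerSketch(); // "new run"
        restore(secondRun, saved);
        // Same tile index as before, so the saved weights line up again.
        System.out.println(before == secondRun.hash(coords, 4096)); // true
    }
}
```

Without the restore() call, the second instance would fill its own table
on first use and the same coordinates would almost always land on a
different index, which is the mismatch I was seeing with the weights.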