Tutorial 7 Exercise 1 Problem


David Braune

Jun 7, 2021, 2:08:10 PM
to Machine Learning for Physicists
Hi there!

When we tried exercise 1 in the breakout room, our code did not work as intended. We introduced reward2=reward and only let the program change reward2. Somehow reward was changed as well, so after a few trajectories there were no rewards left and there was no training effect. I have no idea how this can happen. The notebook is found here:

Thanks for reading!

Florian Marquardt

Jun 7, 2021, 3:05:17 PM
to Machine Learning for Physicists
Remember that in Python, if you have an array A and say "B=A", this does NOT copy the array A into a new array B. Rather, B is just a second name for the same data as A. So if you now say B[5]=42, then afterwards A[5]==42 as well! The way around this is to say B=np.copy(A).
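
Here is a minimal sketch of the difference, using the names reward and reward2 from the question. The array size and values are made up for illustration and are not taken from the notebook:

```python
import numpy as np

# Hypothetical reward array: a single reward at position 5.
reward = np.zeros(10)
reward[5] = 1.0

# Plain assignment: reward2 is only a second name for the same array.
reward2 = reward
same_object = reward2 is reward      # both names refer to one object
reward2[5] = 0.0                     # "collecting" the reward via reward2 ...
aliased_value = reward[5]            # ... also removes it from reward

# Start again, this time with an explicit copy.
reward = np.zeros(10)
reward[5] = 1.0
reward2 = np.copy(reward)            # independent data
reward2[5] = 0.0                     # only the copy changes
copied_value = reward[5]             # the original reward is still there

print(same_object, aliased_value, copied_value)
```

If the copy is made at the start of each trajectory, every trajectory should begin with the full set of rewards again.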

I hope this helps!