Tutorial 7 Exercise 1 Problem

David Braune

Jun 7, 2021, 2:08:10 PM

to Machine Learning for Physicists

Hi there!

When we tried exercise 1 in the breakout room, our code did not work as intended. We introduced reward2=reward and only let the program modify reward2. Somehow reward was changed as well, so after a few trajectories there were no rewards left and there was no training effect. I have no idea how this can happen. The notebook can be found here:

https://drive.google.com/file/d/1iCYpHQEUBkZWJaSOv1dzcV_JN6xcOeRg/view?usp=sharing

Thanks for reading!

Florian Marquardt

Jun 7, 2021, 3:05:17 PM

to Machine Learning for Physicists

Remember that in Python, if you have an array A and write "B=A", this does NOT copy A into a new array B. Rather, B is just a second name for the same data as A. So if you now set B[5]=42, then afterwards A[5]==42 as well! The way around this is to write B=np.copy(A).
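
Here is a minimal sketch of the difference, assuming reward is a NumPy array. The array values are made up for illustration and are not taken from the notebook:

```python
import numpy as np

# Hypothetical reward array (illustrative values only)
reward = np.array([0.0, 1.0, 0.0, 1.0])

# Plain assignment: reward2 is a second name for the SAME array
reward2 = reward
reward2[1] = 0.0                      # "collect" the reward at index 1
alias_changed_original = bool(reward[1] == 0.0)

# Explicit copy: reward2 owns its own data
reward = np.array([0.0, 1.0, 0.0, 1.0])
reward2 = np.copy(reward)
reward2[1] = 0.0                      # only the copy is modified
copy_left_original_intact = bool(reward[1] == 1.0)

print("assignment modified the original:", alias_changed_original)
print("copy left the original intact:", copy_left_original_intact)
print("copy shares memory with original:", np.shares_memory(reward, reward2))
```

Note that reward.copy() does the same thing as np.copy(reward), while a slice such as reward[:] only gives a view of a NumPy array and would still modify the original.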