Hello everyone,
We have simulated a simple policy for the UR3 robot to reach a point in a Gazebo simulation. We then save the TensorFlow model in update_policy at each iteration.
However, what we get looks like the execution of part of the sampling rather than the final policy.
Is there any way to send the robot the final trajectory that corresponds to the final policy? Is the gps code meant to execute a saved TF model that contains the final policy, or is it only written to perform sampling?
Hope to get your feedback soon.
Thank you so much for your help.
Best regards,
Risto