TensorFlow estimator with multiple graph executions


alexis....@googlemail.com

Mar 16, 2018, 5:54:13 AM
to Discuss

I'm trying to implement a basic RL model as a custom estimator, and I'm facing some issues regarding the control flow and graph handling of estimators.


I have to run 2 passes on the graph, one for the current state and one for the next state.


Using the core API, I would execute a pass with the next state as input, and use the result to calculate the loss of the current state and update the graph. But as Estimators handle the session and graph themselves, I can't find a way to execute these steps in a model_fn.
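
With the core API, the flow would look something like this (a simplified sketch; the placeholder shapes and the *_batch numpy arrays are just illustrative):

import tensorflow as tf

state = tf.placeholder(tf.float32, shape=[None, 4])
target = tf.placeholder(tf.float32, shape=[None, 1])
value = tf.layers.dense(state, 1, activation=None)
loss = tf.losses.mean_squared_error(labels=target, predictions=value)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # first pass: evaluate the network on the next state
    next_value = sess.run(value, feed_dict={state: next_state_batch})
    # second pass: use that result as a fixed target for the current state
    sess.run(train_op, feed_dict={state: current_state_batch,
                                  target: next_value})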


I had a look at the tf.train.SessionRunHook interface, but it doesn't seem to fit my needs.


Is there a way to customize the Estimator execution flow to support this kind of scenario?


(I cannot use any additional library like Keras)

Martin Wicke

Mar 16, 2018, 10:22:28 AM
to alexis....@googlemail.com, Discuss
Estimators are not great for this kind of thing. If hooks don't work for you, you'd have to encode both passes in a single graph. That doesn't sound terrible to me, but it can get complicated. 

I see no good reason why you wouldn't be able to use tf.keras though (assuming that would solve your problem). It comes with TensorFlow and is part of the TensorFlow API.

Martin


alexis....@googlemail.com

Mar 19, 2018, 3:29:42 AM
to Discuss
Thanks Martin for your answer.

I kept experimenting over the weekend and have some additional questions.
I ended up using "reuse" on my network layer and creating two tensors from the different inputs.
Below is a simplified example of the result (the code doesn't really make sense by itself, but it illustrates my reasoning: the same operation is applied to two different inputs, and both results are used in the loss function):

import tensorflow as tf

# input_x, input_x_ and input_y are numpy arrays (contents omitted here)

# Define input function
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"current_state": input_x, "next_state": input_x_},
    y=input_y,
    shuffle=True)

# Define model function
def model_fn(features, labels, mode, params):

    # two passes through the same dense layer
    # (variables shared via reuse=tf.AUTO_REUSE)
    predictions = tf.layers.dense(features["current_state"], 1,
                                  activation=None, reuse=tf.AUTO_REUSE)
    predictions_ = tf.layers.dense(features["next_state"], 1,
                                   activation=None, reuse=tf.AUTO_REUSE)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            predictions={"y_H": predictions})

    # the output of the second pass is used as the regression target
    loss = tf.losses.mean_squared_error(labels=predictions_,
                                        predictions=predictions)

    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
    train_op = optimizer.minimize(
        loss=loss, global_step=tf.train.get_global_step())

    return tf.estimator.EstimatorSpec(
        mode=mode,
        loss=loss,
        train_op=train_op)

regression = tf.estimator.Estimator(model_fn=model_fn)

I was a bit surprised this code ran, and it raised a couple of questions:
- How is the gradient calculated in this situation? Is it with respect to both inputs? Since the layer is shared by both inputs, are there two rounds of updates, or are the gradients accumulated? From what I see in TensorBoard, there is only one update operation on the dense layer.
- If the gradients are indeed accumulated, how can I prevent the second input from contributing to the update? Would assigning the output of my second tensor (predictions_) to a non-trainable variable fix the issue?



Martin Wicke

Mar 19, 2018, 1:19:20 PM
to alexis....@googlemail.com, Discuss
I suggest you use the OO layers (Dense), which make it more explicit what ends up being shared. In this case, I think you want to create a single layer, and use it twice. 

In your code, it seems to me that you are creating two layers which do not share variables, and training both. In any case, the gradients are computed with respect to the variables, not the inputs: by default all trainable variables, but you can select the ones you want by passing an explicit list (var_list) to the optimizer.
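
Roughly, slotting into your model_fn above (a sketch; the layer name is illustrative):

# create a single layer object and apply it to both inputs
shared = tf.layers.Dense(1, activation=None, name="shared_dense")
predictions = shared(features["current_state"])
predictions_ = shared(features["next_state"])  # same kernel/bias

# optionally restrict the update to an explicit list of variables
train_op = optimizer.minimize(
    loss, global_step=tf.train.get_global_step(),
    var_list=shared.trainable_variables)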


alexis....@googlemail.com

Mar 21, 2018, 8:51:56 AM
to Discuss
Martin,

My code creates a second node which reads the kernel/bias of the first dense layer and applies them to the second input.
The variables are shared, but during the gradient computation both "current_state" and "next_state" contribute to the kernel/bias gradient.

To make it clearer, let's say:

x = 1
x' = x

gradient(current_state*x + next_state*x') ==> (current_state + next_state), because TensorFlow considers x' a reference to x.

In my scenario I would like:

gradient(current_state*x + next_state*x') ==> (current_state), because (next_state*x') is supposed to be static.

So at the beginning of every iteration I need to assign the current values of my variables to a constant (a hard copy, versus the reference in my current code).
It doesn't seem I can solve this inside the model_fn definition. I will have another look at the hooks, but I didn't find an easy way to get the values of my variables from the previous iteration and inject them into the new one.
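
In the meantime, I'm wondering whether wrapping the second pass in tf.stop_gradient would give exactly the gradient I describe above (a sketch, untested; it still reads the current variable values, so it's not a hard copy from the previous iteration):

# predictions_ is treated as a constant when the loss is differentiated,
# so only the first pass contributes to the kernel/bias gradient
predictions_ = tf.stop_gradient(
    tf.layers.dense(features["next_state"], 1,
                    activation=None, reuse=tf.AUTO_REUSE))
loss = tf.losses.mean_squared_error(labels=predictions_,
                                    predictions=predictions)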

If you have any ideas....

Martin Wicke

Mar 21, 2018, 11:48:41 AM
to alexis....@googlemail.com, Discuss
To assign variable values, use tf.assign. You can use tf.control_dependencies to control the order in which things are executed (for instance, to enforce that a variable update runs before some other part of the computation).

I am not sure I have fully understood your use case, but a variable is static if you do not mention it in the list of variables to update for any optimizer.
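
Roughly, in terms of your x / x' example (a sketch; the variable names are illustrative):

current_state = tf.constant(2.0)
next_state = tf.constant(3.0)

x = tf.get_variable("x", initializer=1.0)
# non-trainable hard copy, so no gradient update is ever applied to it
x_frozen = tf.get_variable("x_frozen", initializer=1.0, trainable=False)

# copy the current value of x into the frozen copy, and make sure the
# copy runs before the loss is computed
copy_op = tf.assign(x_frozen, x)
with tf.control_dependencies([copy_op]):
    loss = current_state * x + next_state * x_frozen

# minimize only updates trainable variables, i.e. x
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)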

Martin

