How does tf.data.Dataset.interleave AUTOTUNE work?

cláudia correia

Jul 15, 2020, 2:13:53 PM
to TensorFlow Developers
Hello,

When using the interleave method it is possible to set num_parallel_calls to AUTOTUNE. While analyzing the parallel interleave op (parallel_interleave_dataset_op.cc), I understood that during training the autotuning may increase or decrease the num_parallel_calls value; however, I couldn't find the code responsible for the autotuning.
Given this, I would like to know how the autotuning works, more precisely in which situations it changes the value of num_parallel_calls, and where the code responsible for it lives.
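For reference, a minimal sketch of the usage I mean (the file pattern is just a placeholder):

    import tensorflow as tf

    # AUTOTUNE asks tf.data to choose the degree of parallelism at runtime;
    # in current releases the constant lives under tf.data.experimental.
    AUTOTUNE = tf.data.experimental.AUTOTUNE

    files = tf.data.Dataset.list_files("/path/to/*.tfrecord")  # placeholder pattern
    dataset = files.interleave(
        tf.data.TFRecordDataset,      # open each matched file as a nested dataset
        cycle_length=4,               # read from four files concurrently
        num_parallel_calls=AUTOTUNE,  # let the autotuner pick the parallelism
    )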

Thank you in advance for your help!

Jiri Simsa

Jul 15, 2020, 2:28:11 PM
to cláudia correia, TensorFlow Developers
The bulk of the autotuning implementation can be found here:
- https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/model.h
- https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/model.cc

A high-level summary: tf.data collects runtime information about which transformations the input pipeline performs and how much time is spent in each of them. This information is used to periodically run a background computation that combines an analytical model of the input pipeline's performance with a hill-climbing technique to decide how to divide the available CPU across the parallel transformations; the result of this computation is then propagated to the running pipeline.
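To make the idea concrete, below is an illustrative Python sketch of this kind of hill climbing. It is not the model.cc implementation; the latency model and budget are stand-ins:

    # Greedily raise whichever parallelism knob most reduces the modeled
    # latency, until the CPU budget is spent or no single increment helps.
    def hill_climb(parallelism, modeled_latency, cpu_budget):
        # parallelism: dict of node name -> current parallelism
        # modeled_latency: callable mapping such a dict to estimated latency
        # cpu_budget: maximum total parallelism across all nodes
        while sum(parallelism.values()) < cpu_budget:
            best_name, best_latency = None, modeled_latency(parallelism)
            for name in parallelism:
                trial = dict(parallelism)
                trial[name] += 1
                latency = modeled_latency(trial)
                if latency < best_latency:
                    best_name, best_latency = name, latency
            if best_name is None:  # no increment improves the modeled latency
                break
            parallelism[best_name] += 1  # commit the best move and keep climbing
        return parallelism

    # Toy model: a slow "map" stage and a cheaper "interleave" stage.
    # The climber feeds CPU to "map" first: prints {'map': 4, 'interleave': 2}.
    print(hill_climb({"map": 1, "interleave": 1},
                     lambda p: 8.0 / p["map"] + 2.0 / p["interleave"],
                     cpu_budget=6))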

Best,

Jiri  


cláudia correia

Jul 15, 2020, 7:23:50 PM
to TensorFlow Developers, claudiac...@gmail.com
Jiri, thank you very much; the information you gave was very helpful!

After analyzing model.cc, I concluded that the autotuning changes multiple parameters, including the buffer size; is that correct? And what exactly does the interleave method store in this buffer: the elements that still need to be processed, or the interleaved output? I'm still a bit confused about how it all works...

Ilya Persky

Jul 16, 2020, 8:31:54 AM
to cláudia correia, TensorFlow Developers
Hey Cláudia, I don't know if this helps, but I recall watching Jiri's
very clear and interesting talk on tf.data, which briefly covers the
motivation and approach of autotuning at the end:
https://www.youtube.com/watch?v=kVEOCfBy9uY

--
Thank you,
Ilya.

Jiri Simsa

Jul 16, 2020, 12:12:53 PM
to Ilya Persky, cláudia correia, TensorFlow Developers
Hi Cláudia,

The implementation in model.cc is capable of autotuning both (a) the degree of parallelism of parallel transformations (in particular, parallel map, parallel interleave, and fused map + batch) and (b) the buffer size of prefetch. Having said that, the autotuning of the prefetch buffer size currently defaults to a legacy implementation (we hope to migrate away from the legacy code to the model.cc-based autotuning later this year). The parallel transformations I mentioned each also have a buffer, since the results of the parallel invocations have to be stored somewhere until the consumer requests them, and its size is determined by the degree of parallelism. Let me know if you have further questions.
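To illustrate, here is a sketch of a pipeline in which every knob mentioned above is subject to autotuning; parse_fn and the file pattern are placeholders:

    import tensorflow as tf

    AUTOTUNE = tf.data.experimental.AUTOTUNE

    def parse_fn(record):
        # Placeholder parse step; a real pipeline would decode features here.
        return tf.io.decode_raw(record, tf.uint8)

    files = tf.data.Dataset.list_files("/path/to/*.tfrecord")  # placeholder
    dataset = (
        files
        .interleave(tf.data.TFRecordDataset, cycle_length=4,
                    num_parallel_calls=AUTOTUNE)     # autotuned parallelism
        .map(parse_fn, num_parallel_calls=AUTOTUNE)  # autotuned parallelism
        .prefetch(AUTOTUNE)                          # autotuned buffer size
    )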

Best,

Jiri

cláudia correia

Jul 27, 2020, 10:09:06 AM
to TensorFlow Developers, ilya....@gmail.com, claudiac...@gmail.com
Hello,

Ilya, I've watched Jiri's talk and it was really helpful, thank you!

Jiri, thank you very much for your explanation; everything is much clearer now. However, I still have a question regarding the autotuning algorithm. In your talk you say that the algorithm is greedy and that you assume the optimization is monotonic. So the number of threads and the buffer sizes never decrease, right? Is there a reason for this? Won't this strategy lead to a waste of resources?


Best regards,

Cláudia

Jiri Simsa

Jul 27, 2020, 6:09:42 PM
to cláudia correia, TensorFlow Developers, Ilya Persky
Hi Cláudia,

The autotuning algorithm assumes that you give it a CPU and RAM budget to use. The optimization performed is to minimize the input pipeline latency subject to the CPU and RAM budget constraints.

To account for variance in the data and in the input pipeline structure over time, the optimization is executed repeatedly. Different runs of the optimization can choose different parallelism and buffer size values for different nodes of the input pipeline; in particular, the parallelism or buffer size of a node may decrease between runs. Having said that, the optimization will generally aim to "spend" the CPU and RAM budget it was provided.
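For reference, the budgets can be supplied from user code through tf.data options. A sketch follows; the option names come from the TF 2.x experimental API and have moved between releases, so treat them as an assumption to verify against your installed version:

    import tensorflow as tf

    dataset = tf.data.Dataset.range(10)  # stand-in for a real input pipeline

    # Assumed TF 2.x experimental option names; verify against your version.
    options = tf.data.Options()
    options.experimental_optimization.autotune = True
    options.experimental_optimization.autotune_cpu_budget = 8        # cores the tuner may use
    options.experimental_optimization.autotune_ram_budget = 4 << 30  # bytes for buffers (4 GiB)

    dataset = dataset.with_options(options)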

