TF-TRT failed to build Engine in JetsonNano


Roberto Canale

Feb 25, 2021, 8:17:35 AM
to TensorFlow Developers

Hello, I am using a Jetson Nano with JetPack 4.3, TensorFlow 2.3.1 and TensorRT 7.1.3.
I have a Keras model that I converted to a TF-TRT model.

When performing inference on the model, I get the following error:

TF-TRT Warning: Engine creation for PartitionedCall/TRTEngineOp_0_0 failed. The native segment will be used instead. Reason: Internal: Failed to build TensorRT engine

During inference I get:

W tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:629] TF-TRT Warning: Engine retrieval for input shapes: [[1,100,68,3]] failed. Running native segment for PartitionedCall/TRTEngineOp_0_0

What does this mean?

It seems like TRT is not building engines, yet inference still works. I have performed the same inference on another PC (TF 2.4.1 and TRT 7.2) and I do not get this error. However, I have compared the inference results between the Keras and TF-TRT models and they are identical (both on the Jetson Nano with the error and on the PC without it).

Why are my results the same? How do I solve this? Thank you!

Sanjoy Das

Feb 25, 2021, 6:44:38 PM
to Roberto Canale, Bixia Zheng, Jonathan Dekhtiar, TensorFlow Developers


Matthew Conley US

Feb 26, 2021, 11:45:34 AM
to TensorFlow Developers, Sanjoy Das, TensorFlow Developers, Roberto Canale, Bixia Zheng, Jonathan Dekhtiar
Hi Roberto,

As you expected, the warning message you received means that there was something preventing proper execution of your TF-TRT converted model.  However, the inference results are the same because TF-TRT uses the original model segment as a fallback when it encounters such an issue, so you can still get valid results.  This means that your model is being run, but is not receiving the benefit of TF-TRT acceleration.  There are a few things which could cause this to happen, but more information would be needed to determine the specific problem.

Can you share any more details about the model you're trying to run?  A link if it's publicly available, or description of the type of model / another public model it is similar to if not would be helpful.
Did you convert the model with TF-TRT on the Nano itself, or on another PC (potentially with a different TensorRT version)?  Were there any warning messages output during the conversion itself? (Any reproducer here would also be great.)

Thanks,
Matt Conley

Roberto Canale

Mar 1, 2021, 4:03:54 AM
to TensorFlow Developers, mco...@nvidia.com, Sanjoy Das, TensorFlow Developers, Roberto Canale, Bixia Zheng, Jonathan Dekhtiar
Hello,
The model I am using is HandsNet (described in Table 1 of the HandsNet paper). I am converting the model according to the TF 2 workflow described in the TF-TRT user guide (not TF 1).
I converted the model both on my PC and on the Jetson Nano; while the converted models work fine on the PC, I still get the error on the Jetson Nano (even with the model converted on the Jetson itself).
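For clarity, the conversion follows the standard TF 2 API, roughly like this (a sketch; the SavedModel directory names are placeholders for my actual paths):

```python
def convert_saved_model(input_dir, output_dir, precision="FP16"):
    """Convert a TF 2 SavedModel with TF-TRT (sketch; paths are placeholders)."""
    # Imported lazily so the sketch stays self-contained.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode=precision)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=input_dir,
        conversion_params=params)
    converter.convert()         # segments the graph into TRT-compatible subgraphs
    converter.save(output_dir)  # writes the converted SavedModel
```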

During conversion I get:
W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at trt_engine_resource_ops.cc:195 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_0_0) 
but I read in the TF release notes that this is a known issue, is not a problem, and should be ignored.

Furthermore, when performing inference I get:

PC:
Linked TensorRT version: 7.1.3
Loaded TensorRT version: 7.2
Jetson Nano:
Linked TensorRT version: 7.1.3
Loaded TensorRT version: 7.1.3

Specifications:
PC: Ubuntu 20, TF 2.4.1
Jetson: Ubuntu 18, TF 2.3.1, TensorRT 7.1.3

Lastly, on the PC, if I try to import TensorRT in Python it says no module found, but when I launch the TF-TRT module it says it loads and uses it.

Thank you, I hope you can help me out!
Best,
Roberto Canale

Roberto Canale

Mar 1, 2021, 4:15:03 AM
to TensorFlow Developers, Roberto Canale, mco...@nvidia.com, Sanjoy Das, TensorFlow Developers, Bixia Zheng, Jonathan Dekhtiar
Sorry, I should correct the fact that I am using JetPack 4.4.1.

Roberto Canale

Mar 1, 2021, 5:16:09 AM
to TensorFlow Developers, Roberto Canale, mco...@nvidia.com, Sanjoy Das, TensorFlow Developers, Bixia Zheng, Jonathan Dekhtiar
A small appendix on classification and timing performance:

I ran the Keras (TensorFlow) model on the Jetson Nano and get an inference time of about 250 ms per image.
A TF-TRT model on the Jetson Nano gives an inference time of about 1.1 ms (FP32 precision) and 0.9 ms (FP16 precision) per image.
(Images are 100x68x3.)
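For reference, a minimal sketch of how such per-image latency can be measured (`infer_fn` stands in for the loaded model's inference call; warm-up iterations are excluded so any engine build or caching cost is not counted in the average):

```python
import time

def mean_latency_ms(infer_fn, sample, warmup=10, iters=100):
    """Average per-call latency in milliseconds, excluding warm-up runs."""
    for _ in range(warmup):      # warm-up: engine build / caching happens here
        infer_fn(sample)
    start = time.perf_counter()
    for _ in range(iters):
        infer_fn(sample)
    return (time.perf_counter() - start) / iters * 1e3
```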

And again, the recognition accuracy is unchanged. You mentioned I am not receiving the benefits of TF-TRT acceleration, yet experimentally I see a speed improvement of about two orders of magnitude. How can that be? I wonder whether this is perhaps a bug, and I actually am getting the acceleration despite the warning (not being presumptuous here, just thinking out loud). Or would inference be even faster with the acceleration?

Waiting for your thoughts on this,
Roberto

Matthew Conley US

Mar 4, 2021, 4:10:42 PM
to TensorFlow Developers, Roberto Canale, Matthew Conley US, Sanjoy Das, TensorFlow Developers, Bixia Zheng, Jonathan Dekhtiar
Hi Roberto,

Thanks for your response.  Based on the information provided, it looks like the model has successfully been converted by TF-TRT, and is executing faster as a result.  The native fallback option of TF-TRT is implemented for these types of situations, where there may be certain portions of the graph which are unsupported at runtime but their execution does not interrupt the inference itself.
The warning message you received does mean that a section of the graph was executed in its native form, but this does not seem like a bug or error; rather, one portion of the graph may not yet be fully supported by TF-TRT on the Jetson platform. TF-TRT support is constantly expanding, as are the various models themselves, so the warning you received (and the incompatibility responsible for it) will likely not be present in a future release.
Because you are still seeing accurate results and are also experiencing an increase of performance, I would recommend continuing to work with the model as you are, and posting again or filing an issue if you have any trouble in the future.
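If you want to double-check how much of the graph actually landed in TensorRT, one quick sanity check is counting `TRTEngineOp` nodes in the converted GraphDef; a sketch (the `graph_def` argument could come from, e.g., the loaded model's `signatures["serving_default"].graph.as_graph_def()`):

```python
def count_trt_engine_ops(graph_def):
    """Count TRTEngineOp nodes in a GraphDef, including library functions."""
    total = sum(1 for node in graph_def.node if node.op == "TRTEngineOp")
    # Converted segments may also live inside the function library.
    for func in graph_def.library.function:
        total += sum(1 for node in func.node_def if node.op == "TRTEngineOp")
    return total
```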

I hope this helps, let me know if there are any more issues.

Thanks,
Matt Conley

Sanjoy Das

Mar 5, 2021, 12:04:26 AM
to Matthew Conley US, TensorFlow Developers, Roberto Canale, Bixia Zheng, Jonathan Dekhtiar
Would it be better to print these kinds of messages via VLOG() rather than as warnings? My experience (in other contexts as well) is that such warnings tend to confuse more than they help.

-- Sanjoy

Jonathan DEKHTIAR

Mar 5, 2021, 2:43:19 AM
to Sanjoy Das, Bixia Zheng, Jonathan Dekhtiar, Matthew Conley US, Roberto Canale, TensorFlow Developers
It's not easy to find the ideal solution. Technically, we do want the user to see this message: falling back to the native segment can cause a serious performance regression, and it is good practice to ask why it is happening and to have that information at hand. So I would argue it should stay a warning.

Now, if you tell me that warning the user is good but telling the user what to do about the warning is better, I would agree. If you have any suggestions there, we can definitely work on it 👍

Jonathan 

Roberto Canale

Mar 5, 2021, 4:14:45 AM
to TensorFlow Developers, jonathan...@gmail.com, Bixia Zheng, Jonathan Dekhtiar, mco...@nvidia.com, Roberto Canale, TensorFlow Developers, Sanjoy Das
Hi,
Everything seems pretty clear to me then.
Thank you for your thorough replies and suggestions, they have definitely been a great help!

Kind regards,
Roberto

Sanjoy Das

Mar 5, 2021, 7:14:46 PM
to Roberto Canale, TensorFlow Developers, jonathan...@gmail.com, Bixia Zheng, Jonathan Dekhtiar, mco...@nvidia.com

On Friday, 5 March 2021 at 08:43:19 UTC+1 jonathan...@gmail.com wrote:
It's not easy to find the ideal solution. Technically, we do want the user to see this message: falling back to the native segment can cause a serious performance regression, and it is good practice to ask why it is happening and to have that information at hand. So I would argue it should stay a warning.

Now, if you tell me that warning the user is good but telling the user what to do about the warning is better, I would agree. If you have any suggestions there, we can definitely work on it 👍

Do you have a sense of how often users read these warnings and do something about them?  Would it be reasonable to gate these warnings behind a `TF_PERFORMANCE_WARNINGS=1` env var?

I'm not singling out TF/TensorRT, IMO TensorFlow as a whole logs too much and I'm wondering if this is a problem in practice.
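In code terms, the proposed gating would amount to something like this (a sketch only, not the actual TF logging path; the env-var name is the one suggested above):

```python
import logging
import os

def emit_performance_warning(message, env_var="TF_PERFORMANCE_WARNINGS"):
    """Surface `message` as a warning only when the user opts in via env_var."""
    if os.environ.get(env_var) == "1":
        logging.warning(message)
        return True
    logging.debug(message)  # still visible at verbose log levels
    return False
```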

-- Sanjoy