About testing on different sized images.


Ahmed Ghorbel

Mar 8, 2022, 10:45:18 AM
to tensorflow-compression
Hello,

I trained my own model (on the TFDS CLIC training set) based on the ms2020 architecture. When I tested it on some images from the CLIC22 challenge validation set, an 'assertion failed' error occurred (the test worked on some images but not on others).

-----------------------------------------------
assertion failed: [Sanity check failed.] [Condition x == y did not hold element-wise:] [x (location_scale_indexed_entropy_model/EntropyDecodeFinalize:0) = ] [0] [y (location_scale_indexed_entropy_model/assert_equal_1/y:0) = ] [1]
         [[{{node location_scale_indexed_entropy_model/assert_equal_1/Assert/AssertGuard/Assert}}]] [Op:__inference_restored_function_body_23384]
-----------------------------------------------

So my question is: how did you deal with differently sized images in the testing phase of the ms2020 model?

Looking forward to your help.

Best regards,
Ahmed.

Johannes Ballé

Mar 8, 2022, 12:18:24 PM
to tensorflow-...@googlegroups.com
Hi Ahmed,

this sounds like an issue with machine determinism. TFC does not currently provide a method for dealing with cross-platform issues caused by floating-point non-determinism (such as the method described in this paper: https://openreview.net/forum?id=S1zz2i0cY7).

One workaround could be to restrict computation to CPU and check if this reduces the issue (by setting the CUDA_VISIBLE_DEVICES environment variable to the empty string, for example).

Hope this helps!
Johannes


Nikolai Körber

Aug 8, 2023, 10:14:53 AM
to tensorflow-compression
Hello Johannes, 

I wanted to follow up on this issue.

I sometimes observe the same problem (ms2020.py-based setup, x_hat = model.decompress(*tensors)):
Detected at node 'location_scale_indexed_entropy_model/assert_equal_3/Assert/AssertGuard/Assert' defined at (most recent call last):
Node: 'location_scale_indexed_entropy_model/assert_equal_3/Assert/AssertGuard/Assert'
assertion failed: [Sanity check failed.] [Condition x == y did not hold element-wise:] [x (location_scale_indexed_entropy_model/EntropyDecodeFinalize_1:0) = ] [0] [y (location_scale_indexed_entropy_model/assert_equal_3/y:0) = ] [1]
 [[{{node location_scale_indexed_entropy_model/assert_equal_3/Assert/AssertGuard/Assert}}]] [Op:__inference_restored_function_body_27038]

I noticed that PyTorch users typically use something similar to:
torch.backends.cudnn.deterministic=True
torch.backends.cudnn.benchmark=False

I was just wondering if there is a TensorFlow equivalent, trick or some best practices that you could recommend. I also stumbled across https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_op_determinism, but I am unsure to what extent this may be helpful.
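
For reference, a minimal sketch of how the API linked above is usually called, assuming a recent TensorFlow 2.x release. The helper name `enable_tf_determinism` is hypothetical. The sketch only requests deterministic op implementations and fixes the random seeds; whether this is enough to avoid the entropy decoding assertion is not established in this thread.

```python
def enable_tf_determinism(seed=0):
    """Seed the RNGs and ask TensorFlow for deterministic op implementations.

    Hypothetical helper, not part of TFC. TensorFlow is imported inside the
    function so that this sketch can be loaded even where TF is not installed.
    """
    import tensorflow as tf

    # Seeds Python's `random`, NumPy, and TensorFlow in one call.
    tf.keras.utils.set_random_seed(seed)
    # Makes ops use deterministic implementations where available; ops without
    # one raise an error instead of silently running non-deterministically.
    tf.config.experimental.enable_op_determinism()
    return tf


# Usage: call once at program start, before building or loading the model.
# tf = enable_tf_determinism(seed=0)
```

Deterministic ops can be slower, and this setting targets run-to-run reproducibility on one machine; it does not promise identical floating-point results across different hardware.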

From my own experience I can definitely confirm that 

import os
os.environ["CUDA_VISIBLE_DEVICES"]="" # no GPU 

works, at least for CPU-based evaluation.
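
To check that the setting actually took effect, one can list the GPUs TensorFlow sees afterwards. This is a small sketch; the helper names `hide_gpus` and `visible_gpus` are hypothetical, and it assumes the environment variable is set before TensorFlow initializes its GPU devices (safest is before the first `import tensorflow`).

```python
import os


def hide_gpus():
    """Hide all CUDA devices from libraries that initialize CUDA afterwards."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


def visible_gpus():
    """Return the list of GPUs TensorFlow can see (hypothetical helper)."""
    # Imported here so that hide_gpus() can run before TensorFlow is loaded.
    import tensorflow as tf

    return tf.config.list_physical_devices("GPU")


hide_gpus()

# Usage: the list should be empty if the variable was set early enough.
# print(visible_gpus())
```

If the list is not empty, the variable was most likely set after TensorFlow had already initialized its devices, and compression and decompression may still run on the GPU.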

I would be quite surprised if you or your colleagues have not encountered similar problems. I typically use Google Colab Pro+ for basic tests (same hardware setup for both training and testing, no cross-platform).

It would be great if you could provide some guidance.

Thanks,
Nikolai

Nikolai Körber

Aug 9, 2023, 6:55:31 AM
to tensorflow-compression
Hello, 

I found my error. While experimenting with different entropy models, I introduced a bug (not related to machine determinism) that produced the same error message, which really confused me.

Now everything works just perfectly, sorry for the confusion.

Kind regards, 
Nikolai
