"Generic conv implementation only supports NHWC tensor format for now" error when using t2t-decode


Seongjin Cho

Jul 20, 2019, 10:15:58 AM
to tensor2tensor
Hi tensor2tensor,

I'm getting the following error when decoding with a Transformer model on TF 1.14.
The same setup worked without problems on TF 1.13.1.

Traceback (most recent call last):
  File ".../t2t/t2t_decode.py", line 17, in <module>
    tf.app.run(main=t2t_decoder.main)
  File ".../t2t/py-pkg/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File ".../t2t/py-pkg/absl/app.py", line 300, in run
    _run_main(main, args)
  File ".../t2t/py-pkg/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File ".../t2t/py-pkg/tensor2tensor/bin/t2t_decoder.py", line 196, in main
    decode(estimator, hp, decode_hp)
  File ".../t2t/py-pkg/tensor2tensor/bin/t2t_decoder.py", line 104, in decode
    checkpoint_path=FLAGS.checkpoint_path)
  File ".../t2t/py-pkg/tensor2tensor/utils/decoding.py", line 224, in decode_from_dataset
    checkpoint_path=checkpoint_path)
  File ".../t2t/py-pkg/tensor2tensor/utils/decoding.py", line 313, in decode_once
    for num_predictions, prediction in enumerate(predictions):
  File ".../t2t/py-pkg/tensorflow_estimator/python/estimator/estimator.py", line 637, in predict
    preds_evaluated = mon_sess.run(predictions)
  File ".../t2t/py-pkg/tensorflow/python/training/monitored_session.py", line 754, in run
    run_metadata=run_metadata)
  File ".../t2t/py-pkg/tensorflow/python/training/monitored_session.py", line 1252, in run
    run_metadata=run_metadata)
  File ".../t2t/py-pkg/tensorflow/python/training/monitored_session.py", line 1353, in run
    raise six.reraise(*original_exc_info)
  File ".../t2t/py-pkg/six.py", line 693, in reraise
    raise value
  File ".../t2t/py-pkg/tensorflow/python/training/monitored_session.py", line 1338, in run
    return self._sess.run(*args, **kwargs)
  File ".../t2t/py-pkg/tensorflow/python/training/monitored_session.py", line 1411, in run
    run_metadata=run_metadata)
  File ".../t2t/py-pkg/tensorflow/python/training/monitored_session.py", line 1169, in run
    return self._sess.run(*args, **kwargs)
  File ".../t2t/py-pkg/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File ".../t2t/py-pkg/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File ".../t2t/py-pkg/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File ".../t2t/py-pkg/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnimplementedError: 2 root error(s) found.
  (0) Unimplemented: Generic conv implementation only supports NHWC tensor format for now.
[[{{node Conv2D}}]]
[[IteratorGetNext]]
[[transformer/Shape/_1455]]
  (1) Unimplemented: Generic conv implementation only supports NHWC tensor format for now.
[[{{node Conv2D}}]]
[[IteratorGetNext]]
0 successful operations.
0 derived errors ignored.

The model was trained on TPU, and I'm using t2t_decoder to decode with it on a GPU:

2019-07-20 13:32:12.173161: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-20 13:32:12.182888: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-07-20 13:32:12.516053: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-20 13:32:12.518987: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55e494a743a0 executing computations on platform CUDA. Devices:
2019-07-20 13:32:12.519058: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Tesla V100-SXM2-16GB, Compute Capability 7.0
2019-07-20 13:32:12.536623: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200000000 Hz
2019-07-20 13:32:12.538726: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55e4910c9430 executing computations on platform Host. Devices:
2019-07-20 13:32:12.538757: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-07-20 13:32:12.539248: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-20 13:32:12.541485: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:05.0

Any possible solutions?

Thanks.

Lukasz Kaiser

Jul 20, 2019, 6:10:08 PM
to Seongjin Cho, tensor2tensor
This looks very much like a TF or CUDA problem. I'm not sure what the
solution is, but please ask on the TF lists as well; they may know
more!

Lukasz
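
Note on the terms in the error message: NHWC and NCHW are the two common axis orders for 4-D image tensors (N = batch, H = height, W = width, C = channels). The message says that the kernel that ran the Conv2D node accepts only NHWC. The thread does not establish where an NCHW Conv2D in this graph comes from, so the following is only a minimal sketch of how the two layouts relate, in plain Python. The function names and the example shape are hypothetical and are not part of tensor2tensor or TensorFlow.

```python
# Axis orders for the two common 4-D image tensor layouts.
# N = batch, H = height, W = width, C = channels.
LAYOUTS = {"NHWC": ("N", "H", "W", "C"), "NCHW": ("N", "C", "H", "W")}


def permutation(src, dst):
    """Return the axis permutation that converts layout `src` to `dst`."""
    src_axes, dst_axes = LAYOUTS[src], LAYOUTS[dst]
    return [src_axes.index(axis) for axis in dst_axes]


def convert_shape(shape, src, dst):
    """Reorder a 4-D shape tuple from layout `src` to layout `dst`."""
    return tuple(shape[i] for i in permutation(src, dst))


if __name__ == "__main__":
    nchw_shape = (8, 3, 224, 224)  # hypothetical batch of 8 RGB images
    print(permutation("NCHW", "NHWC"))                # [0, 2, 3, 1]
    print(convert_shape(nchw_shape, "NCHW", "NHWC"))  # (8, 224, 224, 3)
```

The list returned by permutation("NCHW", "NHWC") is the kind of value one would pass as perm to a transpose op (for example tf.transpose) to move a tensor from one layout to the other. Whether such a conversion is the right fix for the error above is not confirmed anywhere in this thread.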