As far as I know, there is no way to allocate an individual machine type for each TFX component (e.g. ExampleGen on CPU only, Trainer on GPU, and so on).
My environment is TFX + Kubeflow on Cloud AI Platform (CAIP). Without "CAIP Training", I would have to configure every Kubernetes node with a GPU, I guess. And if that is true, it is going to cost a lot of money!
So I have switched to "CAIP Training".
It works OK, but there is one problem:
the TFX Docker image does not come with NVIDIA/CUDA support, so I have to create a custom Docker image myself.
I have tried composing a Dockerfile along the lines of ...
```
FROM tensorflow/tensorflow:latest-gpu
FROM tensorflow/tfx:latest
ENTRYPOINT ["python3.7", "/tfx-src/tfx/scripts/run_executor.py"]
```
but it didn't work.
Whenever I try to build the Docker image, the logs show me the error message below:
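(For reference: stacking two `FROM` lines like that creates a multi-stage build, and only the final stage ends up in the resulting image, so the CUDA libraries from `tensorflow/tensorflow:latest-gpu` are discarded. A sketch of an alternative, starting from the GPU image and installing TFX on top of it; the entrypoint module path is an assumption based on the pip-installed TFX package layout, not verified:)

```Dockerfile
# Sketch only: the GPU-enabled TensorFlow image already ships the
# CUDA runtime libraries (libcudart etc.).
FROM tensorflow/tensorflow:latest-gpu

# Install TFX on top of it (pinning a specific version is advisable).
RUN pip install tfx

# Assumed entrypoint: run the TFX executor module from the pip-installed
# package instead of the /tfx-src path used in the stock tensorflow/tfx image.
ENTRYPOINT ["python", "-m", "tfx.scripts.run_executor"]
```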
> Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0:
Could anyone point me to materials on building a custom Docker image for AI Platform Training with TFX?
PS:
Just setting `scaleTier` to `BASIC_GPU` does not work either. I guess this is because the TFX entrypoint is not specified in the designated container for BASIC_GPU.
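(For what it's worth, when submitting a CAIP Training job with a custom container, the image is specified via `masterConfig.imageUri` in the training input; a minimal sketch, where the project and image names are placeholders:)

```yaml
# Hypothetical training-input config for `gcloud ai-platform jobs submit training --config ...`
trainingInput:
  scaleTier: BASIC_GPU
  region: us-central1          # placeholder region
  masterConfig:
    # Custom container holding TFX + CUDA; placeholder image URI
    imageUri: gcr.io/my-project/my-tfx-gpu-image:latest
```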