Improving the Training Pipeline/Speed for Image Compression


Muhammad Salman Ali

Sep 6, 2022, 1:24:43 PM
to tensorflow-...@googlegroups.com
Hello

Greetings to all the fellow group members. I want to inquire about the feasibility of quickly testing new ideas in image compression. In image classification, for example, an idea can be tested on the CIFAR-10 dataset, where a model trains fully in about 2 hours. For image compression, however, the model has to be trained at the full 256x256 resolution; otherwise, the results do not reflect the true capability of the model or of any new modification. I have tried checking a model's performance by training at a lower resolution such as 64x64, but the model failed to converge. Currently, fully training a model from scratch takes approximately 3 to 5 days on a single GPU using the CompressAI library for PyTorch.

I want to ask for suggestions on improving the training pipeline for testing new ideas. As a Ph.D. student, I am currently exploring the field of image compression; however, this training-time bottleneck is slowing my progress.


Thanks

Best Regards
Salman Ali

Fabian Mentzer

Sep 8, 2022, 2:38:03 AM
to tensorflow-compression
Random things that come to mind:
- 3-5 days sounds very long. Are you sure you are not constrained by the input pipeline (see the profiling sketch after this list)? And if you look at the R-D loss, how much does it go down from where it is after 1 day? I.e., maybe the 2 additional days only buy you 3%.
- If you are looking to improve the entropy modelling part, a speed trick is to first learn an autoencoder and then freeze the encoder. You can then train just an entropy model on top of it, without having to run the decoder (we did this for VCT).
- If you can afford it, multi-GPU training might pay off.
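
For the first point, a quick way to check whether you are input-bound, assuming a standard PyTorch training loop (`train_loader`, `model`, `criterion`, and `optimizer` below are placeholders for your own objects), is to split each step's wall time into data waiting and compute:

```python
import time
import torch

def profile_steps(train_loader, model, criterion, optimizer, num_steps=50):
    # Rough split of wall time into "waiting for the next batch" vs. GPU
    # compute. If data_s dominates, the input pipeline is the bottleneck.
    device = next(model.parameters()).device
    data_s = compute_s = 0.0
    end = time.perf_counter()
    for step, x in enumerate(train_loader):
        start = time.perf_counter()
        data_s += start - end                 # time spent waiting on the loader
        x = x.to(device, non_blocking=True)
        loss = criterion(model(x), x)         # stand-in for your real R-D loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if device.type == "cuda":
            torch.cuda.synchronize()          # flush queued GPU work before timing
        end = time.perf_counter()
        compute_s += end - start
        if step + 1 >= num_steps:
            break
    print(f"data wait: {data_s:.1f}s, compute: {compute_s:.1f}s over {num_steps} steps")
```

If the data-wait share is large, the usual first knobs are `num_workers` and `pin_memory` on the `DataLoader`, plus pre-decoding and pre-cropping images on disk.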

Muhammad Salman Ali

Dec 4, 2022, 11:35:59 PM
to tensorflow-compression

1) The additional gains after three days of training, compared to one day, amount to 1.5% to 3%. However, these additional gains are essential for determining the effectiveness or failure of any new method; without the full training, I am uncertain how to evaluate the efficacy of a newly proposed approach. Also, could you elaborate on what you mean by the input pipeline?
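
For reference, the R-D loss in question is typically L = R + λ·D, i.e., bits per pixel plus a weighted distortion. A minimal sketch of how it can be computed, assuming a CompressAI-style forward pass that returns a dict with `x_hat` and `likelihoods` (the λ value below is illustrative):

```python
import math
import torch
import torch.nn.functional as F

def rd_loss(output, x, lmbda=0.01):
    # output: dict with "x_hat" (reconstruction) and "likelihoods"
    # (per-latent likelihood tensors), as returned by CompressAI models.
    n, _, h, w = x.shape
    num_pixels = n * h * w
    # Rate term: total negative log2-likelihood of all latents, per pixel.
    bpp = sum(
        torch.log(lik).sum() / (-math.log(2) * num_pixels)
        for lik in output["likelihoods"].values()
    )
    mse = F.mse_loss(output["x_hat"], x)  # distortion term
    return bpp + lmbda * mse, bpp, mse
```

Logging bpp and MSE separately, rather than only the combined loss, also makes it easier to see which term is still improving after day one.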


2) In nearly every model I've trained, I've observed that the BPP converges at the same rate as, or faster than, the PSNR.
Could the autoencoder be trained without the entropy component first, with the entropy coding introduced afterwards? Would that be effective in reducing the training time?
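
On that point, a minimal sketch of the frozen-encoder setup Fabian describes, assuming CompressAI-style attribute names (`g_a` for the analysis transform, `entropy_bottleneck` for the entropy model; adapt to your architecture):

```python
import math
import torch

# After pre-training the full autoencoder, freeze the encoder and optimize
# only the entropy model on a rate loss. The decoder is never run, so each
# step is much cheaper. Attribute names follow CompressAI conventions.
for p in model.g_a.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(model.entropy_bottleneck.parameters(), lr=1e-4)

for x in train_loader:  # `train_loader` and `device` are placeholders
    x = x.to(device)
    with torch.no_grad():
        y = model.g_a(x)  # frozen encoder; no activations kept for backward
    y_hat, y_likelihoods = model.entropy_bottleneck(y)
    num_pixels = x.size(0) * x.size(2) * x.size(3)
    bpp = torch.log(y_likelihoods).sum() / (-math.log(2) * num_pixels)
    optimizer.zero_grad()
    bpp.backward()
    optimizer.step()
```

Since the encoder is frozen, its outputs for the whole training set could even be precomputed once and cached, removing the encoder pass from the loop entirely.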

3) I have also attempted training on multiple GPUs, but the wall-clock training time did not improve. I have also experimented with various GPU machines.
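
One possible explanation ties back to Fabian's first point: if training is input-bound, adding GPUs cannot help, since every GPU waits on the same loader. It can also matter how the GPUs are used; `torch.nn.parallel.DistributedDataParallel` with one process per GPU usually scales better than `nn.DataParallel`. A minimal sketch (assuming a hypothetical `train.py` launched with `torchrun --nproc_per_node=<num_gpus> train.py`; `build_model` and `dataset` are placeholders):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
torch.cuda.set_device(local_rank)

model = DDP(build_model().cuda(local_rank), device_ids=[local_rank])

sampler = DistributedSampler(dataset)  # each rank trains on a disjoint shard
loader = DataLoader(dataset, batch_size=16, sampler=sampler,
                    num_workers=8, pin_memory=True)

for epoch in range(100):  # epoch count is illustrative
    sampler.set_epoch(epoch)  # reshuffle the shards every epoch
    for x in loader:
        x = x.cuda(local_rank, non_blocking=True)
        ...  # forward pass, R-D loss, backward, optimizer step as usual
```

If multi-GPU runs show no speedup even with DDP, that is another strong hint that the input pipeline, not compute, is the limiting factor.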