Something else to keep in mind: when you train for a given "bpp", that's not a universal bpp. It really is an average across a set of training patches. Once you take that model and apply it to other images, it won't produce that specific bpp, but rather something "in that range". What the model is really trained for is a specific RD (rate-distortion) tradeoff, which will stay more or less the same.
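To make that concrete, here is a rough toy sketch (not the actual training code). The patch numbers are made up, and `lmbda` and the helper names are just placeholders I picked for illustration. It assumes the usual form of the objective, rate plus a weighted distortion term:

```python
# Toy illustration only: the per-patch numbers are made up, and `lmbda`
# is a hypothetical name for the rate-distortion trade-off weight.

def bits_per_pixel(total_bits, height, width):
    """Bits spent on one image or patch, divided by its pixel count."""
    return total_bits / (height * width)

def rd_loss(bpp, distortion, lmbda):
    """One common form of the RD objective: rate plus weighted distortion."""
    return bpp + lmbda * distortion

# (total_bits, height, width) for three hypothetical 256x256 training patches.
patches = [(16384, 256, 256), (32768, 256, 256), (49152, 256, 256)]
per_patch_bpp = [bits_per_pixel(b, h, w) for b, h, w in patches]

# The "bpp" you see during training is just the mean over the patches.
mean_bpp = sum(per_patch_bpp) / len(per_patch_bpp)

print(per_patch_bpp)  # [0.25, 0.5, 0.75]
print(mean_bpp)       # 0.5
```

So a model that reports 0.5 bpp in training can land anywhere around that value on an individual image. The thing that is actually fixed is the weight in `rd_loss`, not the bpp.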
COCO is also TERRIBLE for training on. Sorry about that. It's my fault for making it the default; I just didn't know what other dataset to use. I would suggest switching to something less compressed. The images in COCO are of very low quality, so your compression model will be misinformed about the real distribution of images. What you need are high-quality images (think JPEG quality level 90+, whereas COCO is way below that).
George