If you look carefully, the AMD Radeon R9 290 performs about the same as the NVIDIA GTX 970 in OpenCL, and the two are also very close in real-world 3D graphics (gaming).
Both cards are from two generations ago. They are similarly priced cards of a similar generation, and both do around 1200 nps via OpenCL.
The big question is: why is OpenCL so much slower than CUDA? That I don't know.