Questions about accuracy verification of XLA compiled subgraph


alex.wu

Mar 1, 2022, 4:52:48 AM
to XLA development
Hi XLA devs,
I successfully replayed a clustered subgraph from ResNet50 using XLA on both a hardware accelerator and an NV GPU with the same input tensors.

To bring the accelerator into XLA, I just hacked the LLVM IR emission process so that the emitted LLVM IR is suitable for the accelerator. This means all the HLO passes are the same as on the NV GPU path.

I used numpy.allclose() with a threshold of 1e-5 to compare the output tensors (_Retval nodes) from these two devices, and 184 of the 270 output tensors failed.
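For reference, this is roughly the kind of per-tensor comparison I am running (a minimal sketch; the function name and the way the outputs are collected into lists are my own, not part of XLA or TensorFlow):

```python
import numpy as np

def first_mismatch(ref_outputs, test_outputs, rtol=1e-5, atol=1e-8):
    """Walk paired output tensors in order and return (index, max abs diff)
    of the first pair that fails np.allclose; None if every pair matches.
    Hypothetical helper for illustration only."""
    for i, (ref, test) in enumerate(zip(ref_outputs, test_outputs)):
        if not np.allclose(ref, test, rtol=rtol, atol=atol):
            return i, float(np.max(np.abs(ref - test)))
    return None

# Toy example with synthetic tensors: the second one drifts by 1e-3.
ref = [np.zeros(4), np.ones(4)]
test = [np.zeros(4), np.ones(4) + 1e-3]
print(first_mismatch(ref, test))  # reports index 1 as the first failure
```

This only tells me which _Retval diverged, not which thunk inside the cluster first produced the divergence, which is what I am asking about below.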

My question is: how can I nail down the first thunk (kernel/custom-call) that introduces the difference? More generally, is there a way to dump the compute result of a thunk?

PS: I used tensorflow-r1.15

Regards,
Alex