PTX analysis

吴昊

Aug 24, 2015, 11:56:48 AM
to gpuocelot
Hi all,

I want to analyze PTX code to gather the instruction mix. Can Ocelot help me with this?
Or should I just write a program to do it myself? I could simply count the instructions.

For example, if the loop below is executed nk times, how many add, mul, and ld/st instructions are executed, respectively?
Does add.s64 take twice as long to execute as add.s32?
BB0_3:
add.s32 %r21, %r23, %r3;
mul.wide.s32 %rd6, %r21, 4;
add.s64 %rd7, %rd2, %rd6;
ld.global.f32 %f7, [%rd7];
mul.f32 %f8, %f7, %f4;
mad.lo.s32 %r22, %r23, %r6, %r1;
mul.wide.s32 %rd8, %r22, 4;
add.s64 %rd9, %rd3, %rd8;
ld.global.f32 %f9, [%rd9];
fma.rn.f32 %f10, %f8, %f9, %f10;
st.global.f32 [%rd1], %f10;
add.s32 %r23, %r23, 1;
setp.lt.s32 %p5, %r23, %r7;
@%p5 bra BB0_3;
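
A minimal sketch of the "just count the instructions" approach, in case it helps frame the question. It is plain Python and does not use Ocelot; the helper names `instruction_mix` and `by_base_opcode` are hypothetical, and the parsing assumes one instruction per line as in the loop above. It gives a static per-iteration count of the PTX text only, so multiplying by nk is just an estimate for one thread, and the machine code actually generated from the PTX may have a different mix.

```python
import re
from collections import Counter

# The loop body from the post, one PTX instruction per line.
LOOP_BODY = """
BB0_3:
add.s32 %r21, %r23, %r3;
mul.wide.s32 %rd6, %r21, 4;
add.s64 %rd7, %rd2, %rd6;
ld.global.f32 %f7, [%rd7];
mul.f32 %f8, %f7, %f4;
mad.lo.s32 %r22, %r23, %r6, %r1;
mul.wide.s32 %rd8, %r22, 4;
add.s64 %rd9, %rd3, %rd8;
ld.global.f32 %f9, [%rd9];
fma.rn.f32 %f10, %f8, %f9, %f10;
st.global.f32 [%rd1], %f10;
add.s32 %r23, %r23, 1;
setp.lt.s32 %p5, %r23, %r7;
@%p5 bra BB0_3;
"""


def instruction_mix(ptx_text):
    """Count full opcodes (e.g. 'add.s32') in PTX text. Hypothetical helper."""
    counts = Counter()
    for raw in ptx_text.splitlines():
        line = raw.split("//")[0].strip()  # drop trailing comments
        # Skip blanks, labels ("BB0_3:"), directives (".reg") and braces.
        if not line or line.endswith(":") or line.startswith((".", "{", "}")):
            continue
        # Drop an optional guard predicate such as "@%p5" or "@!%p5".
        line = re.sub(r"^@!?%\w+\s+", "", line)
        opcode = line.split()[0].rstrip(";")
        counts[opcode] += 1
    return counts


def by_base_opcode(counts):
    """Merge typed variants: 'add.s32' and 'add.s64' both count as 'add'."""
    base = Counter()
    for opcode, n in counts.items():
        base[opcode.split(".")[0]] += n
    return base


mix = instruction_mix(LOOP_BODY)
base = by_base_opcode(mix)

nk = 100  # assumed trip count, for illustration only
for opcode, n in sorted(base.items()):
    print(f"{opcode:5s} per iteration: {n:2d}   over nk={nk}: {n * nk}")
```

For the loop above this counts 4 add (2 add.s32 and 2 add.s64), 3 mul, 2 ld, and 1 st per iteration. Grouping by base opcode hides the s32/s64 split, so `mix` is the one to look at when the operand width matters.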

Si Li

Aug 24, 2015, 10:56:37 PM
to gpuocelot
I think WarpSynchronous will do what you want. Take a look here:

https://code.google.com/p/gpuocelot/wiki/OcelotConfigFile