Understanding BRK instruction

anup holey

Jul 1, 2013, 8:57:39 PM
to asf...@googlegroups.com
Hi,

I am working on a CUDA kernel whose loop contains 'break' statements, but the kernel never exits. When I interrupted it in cuda-gdb (with <Ctrl+C>), it always stopped two instructions after the 'BRK' instruction (my guess is that BRK reaches the execute stage of the pipeline in the third cycle). I don't know what this instruction does. According to the PTX 3.1 documentation, the 'brkpt' instruction suspends execution. So here is my question: is BRK == brkpt? If so, and if it does suspend execution, why would the compiler emit it?

I rewrote the code by removing the 'break' statements and folding their conditions into the loop check, which works fine.
I might be missing something very basic here. The original code also runs properly when compiled with the -G flag.
BTW, I am running the kernel on a GTX 480, compiled with CUDA 5.0.
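
The original kernel is not shown in the thread, so the following is only a sketch of the kind of rewrite described above, written as plain host C++ so it can be run without a GPU. The function names, the sentinel convention, and the search logic are all hypothetical; the point is just that a loop with several 'break' exits and a loop with the exits folded into the loop condition should return the same result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical loop with two early exits written as 'break' statements.
// Returns the index of the first element equal to `key`, gives up at the
// first negative sentinel, and returns -1 when nothing is found.
int find_with_break(const std::vector<int>& data, int key) {
    int found = -1;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] < 0) break;            // exit 1: sentinel reached
        if (data[i] == key) {              // exit 2: match found
            found = static_cast<int>(i);
            break;
        }
    }
    return found;
}

// Same logic with both exits folded into the loop condition via a flag,
// so the loop body contains no 'break' statement.
int find_without_break(const std::vector<int>& data, int key) {
    int found = -1;
    bool done = false;
    for (std::size_t i = 0; i < data.size() && !done; ++i) {
        if (data[i] < 0) {
            done = true;                   // replaces exit 1
        } else if (data[i] == key) {
            found = static_cast<int>(i);
            done = true;                   // replaces exit 2
        }
    }
    return found;
}
```

Calling both functions on the same input, for example `find_with_break({3, 5, 7}, 5)` and `find_without_break({3, 5, 7}, 5)`, should give identical answers. On a GPU the two forms may still compile to different control-flow instructions, which is what this thread is about.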

Thanks in advance.
Anup

HuanHuan

Jul 2, 2013, 12:53:44 AM
to asf...@googlegroups.com
Hi,

A for loop won't generate brk instructions. You saw it because you were
debugging; the debuggers (cuda-gdb or nsight) use brk to hot-patch
your code in order to implement a breakpoint.

The compiler won't generate this instruction unless you use the
equivalent inline PTX statement asm volatile ("brkpt;");

Kernels containing this PTX statement will trigger a manual
breakpoint under nsight/cuda-gdb,
and will abort execution when run directly.

So you can ignore this instruction; it's used by debuggers, not by you.

Sylvain Collange

Jul 2, 2013, 5:04:40 AM
to asf...@googlegroups.com
Hi,

BRK is the break instruction in SASS, according to the cuobjdump documentation (breakpoint would have been BPT).

When all the threads that were active at the beginning of the loop have reached a BRK instruction, control jumps to the instruction address that was previously set by the PBK instruction.
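
As a rough illustration of the behaviour just described, here is a toy host-side C++ model of one warp. It is not how the hardware is implemented; the function name, the per-lane "iteration at which BRK is reached" input, and the 32-lane cap are assumptions made for this sketch. It only captures the rule that the warp leaves the loop once every lane that entered it has reached BRK.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a warp running a loop guarded by PBK/BRK.
// brk_iteration[t] is the (hypothetical) 1-based loop iteration in which
// lane t reaches BRK. Returns how many iterations the warp executes before
// control transfers to the address recorded by PBK.
// The model assumes every lane eventually reaches BRK.
int iterations_until_reconverge(const std::vector<int>& brk_iteration) {
    // Model at most 32 lanes, the warp size on current NVIDIA GPUs.
    const std::size_t lanes =
        brk_iteration.size() < 32 ? brk_iteration.size() : 32;

    // Lanes that were active when the loop was entered (the PBK point).
    std::uint32_t entry_mask = 0;
    for (std::size_t t = 0; t < lanes; ++t) {
        entry_mask |= (std::uint32_t{1} << t);
    }

    std::uint32_t broke_mask = 0;  // lanes that have already executed BRK
    int iteration = 0;
    while (broke_mask != entry_mask) {
        ++iteration;
        for (std::size_t t = 0; t < lanes; ++t) {
            if (brk_iteration[t] <= iteration) {
                broke_mask |= (std::uint32_t{1} << t);  // lane t waits at BRK
            }
        }
    }
    return iteration;  // the warp now resumes at the PBK target
}
```

For example, `iterations_until_reconverge({1, 3, 2})` models three lanes that break in iterations 1, 3, and 2; the warp as a whole keeps looping until the slowest lane has reached BRK.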

Do you have complex nested loops with multiple PBK and SSY instructions? Can the control flow 'escape' the loop by taking a path other than through BRK?

Sylvain

HuanHuan

Jul 2, 2013, 5:36:21 AM
to asf...@googlegroups.com
Yes, I was wrong. Sorry for confusing BPT with BRK.
Thanks, Sylvain.

anup holey

Jul 3, 2013, 5:49:35 PM
to asf...@googlegroups.com
Thanks for the quick response.

I have multiple loops, but none of them are nested. There are loops inside the 'if' and 'else' branches and outside them as well. Each loop can have more than one exit: a thread can leave a loop through any of several break statements or at the end of the loop. However, when I rewrote the exit conditions differently, the problem went away, although functionally the changes shouldn't make any difference. This is hard to anticipate, because one implementation may work while an equivalent one does not, and it could waste a lot of debugging time.

Hou Yunqing

Jul 4, 2013, 2:04:54 AM
to asfermi Google Group
Hi Anup, if your problem goes away with the -G flag, it surely is an nvcc bug. You can file a bug report with NVIDIA.