sm_35 uses a completely different set of opcodes

Dmitry N. Mikushin

Nov 20, 2012, 9:16:48 PM
to asf...@googlegroups.com
Dear colleagues,

Here comes another challenge in our favourite sport: it looks like
NVIDIA completely changed the opcodes in sm_35 (Kepler K20):

sm_30:
- /*0010*/ /*0x9400dc042c000000*/ S2R R3, SR_CTAid_X;
- /*0018*/ /*0x84001c042c000000*/ S2R R0, SR_Tid_X;
- /*0020*/ /*0x1030dc036000c000*/ SHL R3, R3, 0x4;
- /*0028*/ /*0x00311c0348000000*/ IADD R4, R3, R0;
- /*0030*/ /*0x3c41dc231a0ec000*/ ISETP.GT.AND P0, pt, R4, 0xf, pt;
- /*0038*/ /*0x000001e780000000*/ @P0 EXIT;
- /*0048*/ /*0x00011de428004005*/ MOV R4, c [0x0] [0x140];
- /*0050*/ /*0xa0411c034801c000*/ IADD R4.CC, R4, 0x28;

sm_35:
+ /*0008*/ /*0x089c000664c03c00*/ MOV R1, c [0x0] [0x44];
+ /*0010*/ /*0x129c000e86400000*/ S2R R3, SR37;
+ /*0018*/ /*0x109c000286400000*/ S2R R0, SR33;
+ /*0020*/ /*0x021ffc0db7c00c00*/ SHF.L R3, RZ, 0x4, R3;
+ /*0028*/ /*0x001c0c12e0800000*/ IADD R4, R3, R0;
+ /*0030*/ /*0x079c101db3481c00*/ ISETP.GT.AND P0, PT, R4, 0xf, PT;
+ /*0038*/ /*0x0000003c18000000*/ @P0 EXIT;
+ /*0048*/ /*0x281c001264c03c00*/ MOV R4, c [0x0] [0x140];
+ /*0050*/ /*0x141c1011c0840000*/ IADD R4.CC, R4, 0x28;

So, unless this is a bug in cuobjdump, we will need to re-run some tools...
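
To put a rough number on how far apart the two encodings are, here is a minimal sketch that XORs the 64-bit words from the listings above for instructions that disassemble to the same text. The words are taken exactly as cuobjdump printed them, and no field layout is assumed. The names `PAIRS`, `differing_bits` and `report` are made up for this illustration.

```python
# 64-bit words copied verbatim from the two cuobjdump listings,
# paired by the disassembly text printed next to them.
# No assumption is made about where opcode or operand fields sit.
PAIRS = {
    "IADD R4, R3, R0": (0x00311c0348000000, 0x001c0c12e0800000),
    "@P0 EXIT": (0x000001e780000000, 0x0000003c18000000),
    "MOV R4, c[0x0][0x140]": (0x00011de428004005, 0x281c001264c03c00),
    "IADD R4.CC, R4, 0x28": (0xa0411c034801c000, 0x141c1011c0840000),
}


def differing_bits(a: int, b: int) -> int:
    """Count the bit positions in which two 64-bit words differ."""
    return bin((a ^ b) & 0xFFFFFFFFFFFFFFFF).count("1")


def report(pairs=PAIRS):
    """Return one summary line per (sm_30, sm_35) instruction pair."""
    lines = []
    for text, (sm30, sm35) in pairs.items():
        lines.append(
            f"{text:24s} xor={sm30 ^ sm35:016x} "
            f"bits={differing_bits(sm30, sm35)}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

If only a few register fields had been widened, one would expect the XOR masks to cluster in the same positions across instructions. Masks that look unrelated from one instruction to the next would point to a full re-encoding.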

- D.

Sun HuanHuan

Nov 20, 2012, 11:15:05 PM
to asf...@googlegroups.com
It seems the instruction set changed little, but the encodings differ a lot?

Some change in the encodings is expected, as sm_35 can address up to
255 registers. At the least, it would need new encodings for the
instructions that address 255 registers, while the others could stay
the same. But from what you wrote, they seem to have changed completely.

Dmitry N. Mikushin

Nov 20, 2012, 11:26:11 PM
to asf...@googlegroups.com
Hi Huan,

Right, the ISA is likely the same as on sm_30: I have not seen any new
instructions, except those that sm_30 already added relative to Fermi.
Then why would they need to change the encoding?..

Anyway, for KernelGen this means a lot more work for us, as we use
AsFermi for some portions and plan to develop a direct ISA backend
for LLVM (without PTX). Is anybody else interested in building up new
instruction rules for AsFermi and KernelGen?

Thanks,
- Dima.

Sun HuanHuan

Nov 20, 2012, 11:54:33 PM
to asf...@googlegroups.com
I think it's because sm_35 now supports instructions with 8-bit
register indices.

For example, if the 4-operand IMAD R3, R2, R1, R0 now supports 8-bit
register indices, then its encoding has to be some form of
INST(4B) R(1B) R(1B) R(1B) R(1B), making a total of 8 bytes.

The old encoding for IMAD, by contrast, supports 3x 6-bit register
indices and 1x 20-bit composite operand. So it has to change, or it
won't be able to address up to 255 registers.
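
The bit budget behind this argument can be worked out in a few lines. This is only a sketch under the field widths quoted in this message (6-bit versus 8-bit register indices, a 20-bit composite operand, a 64-bit instruction word); it does not describe a documented layout, and the helper names are made up for illustration.

```python
WORD_BITS = 64  # each instruction in the listings is one 64-bit word


def max_register_index(index_bits: int) -> int:
    """Largest register index an index field of this width can hold."""
    return (1 << index_bits) - 1


def operand_bits(num_regs: int, index_bits: int, other_bits: int = 0) -> int:
    """Bits taken by the register fields plus any other operand field."""
    return num_regs * index_bits + other_bits


def remaining_bits(used: int) -> int:
    """Bits left in the word for opcode, predicate and modifiers."""
    return WORD_BITS - used


if __name__ == "__main__":
    old = operand_bits(3, 6, 20)      # old IMAD form: 3 regs + 20-bit operand
    widened = operand_bits(3, 8, 20)  # same form with 8-bit indices
    four_regs = operand_bits(4, 8)    # hypothesised INST(4B) + 4 x R(1B) form

    print("6-bit index reaches", max_register_index(6))  # 63
    print("8-bit index reaches", max_register_index(8))  # 255
    for name, used in [("old", old), ("widened", widened), ("4 regs", four_regs)]:
        print(f"{name:8s} operands={used} bits, left={remaining_bits(used)} bits")
```

Widening the three register fields alone costs 6 extra bits (38 to 44), which leaves 20 bits instead of 26 for everything else in the word. That is consistent with the fields having to be rearranged rather than extended in place.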

Dmitry N. Mikushin

Nov 21, 2012, 12:07:20 AM
to asf...@googlegroups.com
Ah, got it: to index the increased number of registers! Perfectly
explained reasoning, Huan, thanks.

Now, what shall we do about it?

- D.

Sun HuanHuan

Nov 21, 2012, 12:10:08 AM
to asf...@googlegroups.com
I would like to do it, but I don't know C++, so I can't.

I can only watch.

Sorry about that.