Rod, in terms of allocating pointers on the AMD devices, will this be
difficult to support? Is there a chance that a pointer on an AMD device
can alias a host pointer?
Also, with every new version of the runtime, there will be some changes
to the Runtime API that we need to spend some time supporting. I don't
foresee any major challenges here, but it will take some effort.
Regards,
Greg
typedef struct __cudaFatCudaBinary2HeaderRec {
char unknown[20];
unsigned int offset;
} __cudaFatCudaBinary2Header;
typedef struct __cudaFatCudaBinaryRec2 {
int magic;
int version;
const unsigned long long* fatbinData;
char* f;
} __cudaFatCudaBinary2;
The offset in the header gives the start of an ELF file. Here's a dump
from a sample file:
normal@atom:~/temp/fatbin$ objdump -x fatbinstripped2.elf
fatbinstripped2.elf: file format elf64-little
fatbinstripped2.elf
architecture: UNKNOWN!, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000000000000000
Program Header:
PHDR off 0x0000000000000684 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**2
filesz 0x0000000000000070 memsz 0x0000000000000070 flags r-x
0x60000000 off 0x0000000000000466 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**2
filesz 0x00000000000001a4 memsz 0x00000000000001a4 flags r-x a00
Sections:
Idx Name Size VMA LMA File off Algn
0 .text.kernelEntry 00000130 0000000000000000 0000000000000000 00000466 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .nv.constant0.kernelEntry 0000002c 0000000000000000 0000000000000000 00000596 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .nv.info.kernelEntry 00000048 0000000000000000 0000000000000000 000005c2 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .nv.info 00000078 0000000000000000 0000000000000000 0000060a 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l d *ABS* 0000000000000000 .shstrtab
0000000000000000 l d *ABS* 0000000000000000 .strtab
0000000000000000 l d *ABS* 0000000000000000 .symtab
0000000000000000 l d *UND* 0000000000000000
0000000000000000 l d *UND* 0000000000000000
0000000000000000 l d .text.kernelEntry 0000000000000130 .text.kernelEntry
0000000000000000 l d .nv.info.kernelEntry 0000000000000000 .nv.info.kernelEntry
0000000000000000 l d .nv.info 0000000000000000 .nv.info
0000000000000000 l d .nv.constant0.kernelEntry 0000000000000000 .nv.constant0.kernelEntry
0000000000000000 g F .text.kernelEntry 00000000000000f0 0x10 kernelEntry
00000000000000f0 g F .text.kernelEntry 0000000000000010 funcTriple
0000000000000100 g F .text.kernelEntry 0000000000000010 funcPentuple
0000000000000110 g F .text.kernelEntry 0000000000000010 funcQuadruple
0000000000000120 g F .text.kernelEntry 0000000000000010 funcDouble
So we definitely need an ELF reader to be able to load this. The good
news is, though, that this actually gives us a usable symbol table for
PTX modules. So we can be very aggressive in how we lazily load PTX if
we choose to take advantage of this.
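As a first step toward such a reader, here is a rough sketch of how the embedded ELF image might be located from the header offset. It assumes the offset is counted from the start of the header itself, which the thread does not state explicitly, and it takes a buffer size that the structs above do not carry. The names FatBinary2Header and find_embedded_elf are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors __cudaFatCudaBinary2Header from the message above. */
typedef struct {
    char unknown[20];
    unsigned int offset;   /* assumed: byte offset from the header start */
} FatBinary2Header;

/* Hypothetical helper: returns a pointer to the embedded ELF image,
 * or NULL if the buffer is too small or the ELF magic is missing.
 * 'size' is an assumption; the caller must know the buffer length. */
static const unsigned char *find_embedded_elf(const void *fatbin_data,
                                              size_t size)
{
    static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };
    FatBinary2Header header;

    assert(fatbin_data != NULL);
    if (size < sizeof header)
        return NULL;

    /* Copy the header out instead of casting, to avoid alignment issues. */
    memcpy(&header, fatbin_data, sizeof header);
    if (header.offset > size || size - header.offset < sizeof elf_magic)
        return NULL;

    const unsigned char *elf =
        (const unsigned char *)fatbin_data + header.offset;

    /* Every ELF file starts with the bytes 0x7f 'E' 'L' 'F'. */
    return memcmp(elf, elf_magic, sizeof elf_magic) == 0 ? elf : NULL;
}
```

The returned pointer would then be handed to whatever ELF reader we end up using to walk the section and symbol tables.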
We also have a new version of PTX (2.3), which is more or less identical
to 2.2.
I'm currently working on the SCons branch so the fix went in there
first. It will make its way into the trunk after some more testing.
Any pointers to good BSD-licensed source code for an ELF reader
would be appreciated.
Regards,
Greg
Unfortunately that is exactly what I mean. It seems like there are now
multiple fat binary formats that are distinguished by a magic word. The
old format is unchanged, but nvcc 4.0 generates the new format by
default.
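A loader could tell the two formats apart with something like the sketch below. It assumes both layouts begin with the magic word and that reading the first int is enough to distinguish them. The two constants are the values I believe NVIDIA's fat binary headers use for __cudaFatMAGIC and __cudaFatMAGIC2; treat them as assumptions and check them against the toolkit headers. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Assumed magic words; verify against the CUDA toolkit headers. */
enum {
    FATBIN_MAGIC_OLD = 0x1ee55a01,  /* believed to be __cudaFatMAGIC  */
    FATBIN_MAGIC_NEW = 0x466243b1   /* believed to be __cudaFatMAGIC2 */
};

typedef enum {
    FATBIN_UNKNOWN,  /* unrecognized magic word */
    FATBIN_LEGACY,   /* old format, PTX carried as text entries */
    FATBIN_ELF       /* new nvcc 4.0 format with an embedded ELF image */
} FatBinaryKind;

/* Hypothetical helper: classify a registered fat binary by the magic
 * word assumed to sit in its first sizeof(int) bytes. */
static FatBinaryKind classify_fatbin(const void *handle)
{
    int magic;

    assert(handle != NULL);
    memcpy(&magic, handle, sizeof magic);

    switch (magic) {
    case FATBIN_MAGIC_OLD: return FATBIN_LEGACY;
    case FATBIN_MAGIC_NEW: return FATBIN_ELF;
    default:               return FATBIN_UNKNOWN;
    }
}
```

The registration entry point could then branch on the result and cast the handle to the old struct or to __cudaFatCudaBinary2 accordingly.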
Regards,
Greg