Hello everyone.

This is a proposal to add a special "sharable" flag for sections to the
ELF spec, to support distributed shared memory that may be kept
consistent between separate processes on different systems. This
special support in the Linux ABI will make it possible to more
efficiently implement sharable memory, since the sections can be mapped
to sharable space at load time, rather than at run time.

I sent this proposal to the old Linux ABI mailing list, but was asked
to post it here as well, since this may reach people more easily. I
got feedback from Rod Evans previously (thanks, Rod!) and made some
minor mods based on that. No one else has commented. I am hoping to
have these changes to the ELF spec ratified so that they can be counted
on in future software.

Sharable memory is memory that is shared between cooperating processes,
possibly across a network. This is my third attempt at such a
proposal. During the first try, several people on the mailing list
gave helpful comments and suggested that we model the proposal after
the TLS capability. I made that modification. Then for my second
try, after some discussion, people felt that it looked reasonable, but
we should make a working prototype and come back to the mailing list
when we have one. Now, we have a working prototype and HJ Lu has made
a linker that handles the new sections.

Here is a comparison between TLS and sharable:

                        TLS                      sharable
  section flag          SHF_TLS                  SHF_SHARABLE
  symbol type           STT_TLS                  (section index:
                                                 SHN_SHARABLE_COMMON)
  new sections          .tbss, .tdata, .tdata1   .sharable_bss,
                                                 .sharable_data
  program header        PT_TLS, with ptr to      PT_SHR, with ptr to
  segment type          TLS init image           sharable mem info area
  DT_FLAGS values       DF_STATIC_TLS            DF_STATIC_SHARABLE

Sharable memory accesses don't need the relocation semantics that TLS
needs, so there is no need for an STT_SHARABLE analog to STT_TLS.

This is intended as a formal proposal to make additions to the ELF
spec. The text of the proposal is listed below (between "START ELF
proposal" and "END ELF proposal"). I also include the additions
to the Linux ABI (which were implemented by HJ Lu), and that text is
also listed below (between "START LINUX ABI proposal" and "END
LINUX ABI proposal"), so that people can see the full extent of the
changes we are proposing.

The basic new things we want to accomplish are:

1. to be able to define a new type of memory (sharable memory) that
   is distinct from other types of memory, and may be used to share data
   between processes in a distributed shared memory system,
2. to be able to mark an arbitrary section to use sharable memory,
   and
3. to be able to cause uninitialized data to be dropped into a
   sharable version of .bss.

The proposed modification to the ELF spec allows these three things.
In addition, the changes to the Linux ABI specify page-alignment and
padding for the sharable-memory sections, plus the addition of special
externally-visible symbols to allow the runtime to map the memory
appropriately.

The reasoning behind this division in the proposal is that we consider
the alignment / padding / new symbols to be implementation details for
Linux. It is conceivable that other OSes may not need those things.

I ask for your feedback and that this proposal be included in the ELF
spec.

Thanks for your consideration.

Jay Hoeflinger
Senior Staff Software Engineer
Intel

START ELF proposal
1. To the section attribute flags, add SHF_SHARABLE
#define SHF_SHARABLE (1 << 11)
described as follows:
SHF_SHARABLE
The section contains data that will be placed in sharable memory
accessible from more than one processor in a non-uniform memory access
(NUMA) multiprocessor system. Implementations need not support
sharable memory.
2. To the special section indexes, add SHN_SHARABLE_COMMON
#define SHN_SHARABLE_COMMON 0xFFF3
described as follows:
SHN_SHARABLE_COMMON
Symbols defined relative to this section are common symbols, such as
FORTRAN COMMON or unallocated C external variables that are to be
placed in sharable memory. Implementations need not support sharable
memory.
3. To the discussion of special section index value semantics, add:
SHN_SHARABLE_COMMON
The symbol labels a common block in sharable memory that has not yet
been allocated. The link editor treats these exactly as it does
symbols with section index SHN_COMMON, except that it allocates the
symbol at an address in sharable memory. Implementations need not
support sharable memory.
4. To Special Sections, add the following:
.sharable_bss, type SHT_NOBITS, attributes
SHF_ALLOC+SHF_WRITE+SHF_SHARABLE
.sharable_data, type SHT_PROGBITS, attributes
SHF_ALLOC+SHF_WRITE+SHF_SHARABLE
.sharable_bss
This section holds uninitialized data that contribute to the program's
memory image. By definition, the system initializes the data with
zeros when the program begins to run. The section resides in sharable
memory. Implementations need not support sharable memory.
.sharable_data
This section holds initialized data that contribute to the program's
memory image. The section resides in sharable memory. Implementations
need not support sharable memory.
5. To the Program Header section, add a segment type PT_SHR
#define PT_SHR 8
The array element specifies the location and size of a sharable memory
information area. The interpretation of the sharable memory
information area is implementation-dependent. Implementations need not
support sharable memory.
END ELF proposal
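
As an illustration of how a tool might consume the three new values, here is
a minimal sketch. It assumes the constants exactly as written in the proposal
text above and the 64-bit structures from the system <elf.h>; the helper
function names are hypothetical and are not part of the proposal.

```c
#include <elf.h>      /* Elf64_Shdr, Elf64_Sym, Elf64_Phdr, SHN_COMMON */
#include <stdbool.h>
#include <stddef.h>

/* Values copied from the proposal text; they are assumptions here and
   are not provided by <elf.h>. */
#define SHF_SHARABLE        (1 << 11)
#define SHN_SHARABLE_COMMON 0xFFF3
#define PT_SHR              8

/* True if the section is marked for placement in sharable memory. */
static bool section_is_sharable(const Elf64_Shdr *sh)
{
    return (sh->sh_flags & SHF_SHARABLE) != 0;
}

/* True if the symbol is an unallocated common block of either kind. */
static bool symbol_is_common(const Elf64_Sym *sym)
{
    return sym->st_shndx == SHN_COMMON ||
           sym->st_shndx == SHN_SHARABLE_COMMON;
}

/* True if the common block must be allocated in sharable memory. */
static bool symbol_is_sharable_common(const Elf64_Sym *sym)
{
    return sym->st_shndx == SHN_SHARABLE_COMMON;
}

/* Return the first PT_SHR program header, or NULL if there is none.
   How the area it describes is interpreted is implementation-dependent. */
static const Elf64_Phdr *find_sharable_segment(const Elf64_Phdr *phdr,
                                               size_t phnum)
{
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_SHR)
            return &phdr[i];
    }
    return NULL;
}
```

An implementation that does not support sharable memory would simply never
see these values set, which matches the "need not support" wording above.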
START LINUX ABI proposal
A new assembly directive:
.sharable_common SYMBOL, LENGTH, ALIGNMENT
It will generate a SHN_SHARABLE_COMMON symbol of
size LENGTH aligned at ALIGNMENT.
· Sections:
Sections which include the attribute SHF_SHARABLE should be
page-aligned and padded to an integral number of full pages. In
addition, two extra symbols should be added to the executable file:
a) an external symbol
whose name is <underscore><underscore>start_<original section name*>
(where <underscore> is "_", and <original section name*> is the
original name of the section with any "special characters", e.g. ".",
deleted), whose value is the starting address of the section.
b) an external symbol
whose name is <underscore><underscore>end_<original section name*>
(where <underscore> is "_", and <original section name*> is the
original name of the section with any "special characters", e.g. ".",
deleted), whose value is the address immediately following the section.
The creation of the external symbols is similar to the action that
currently happens for user-defined sections. The "special characters"
referred to above are any characters not allowed in identifiers in the
C language.
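
The naming rule for the two external symbols can be sketched as follows.
This is only an illustration: the function name and its buffer-based
interface are hypothetical, and "special characters" is taken to mean
anything other than letters, digits, and underscore, per the paragraph
above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build "<prefix><section name with non-identifier characters deleted>".
   Returns 0 on success, or -1 if the result does not fit in out[outsz]. */
static int sharable_symbol_name(const char *prefix, const char *section,
                                char *out, size_t outsz)
{
    size_t n = strlen(prefix);

    if (n >= outsz)
        return -1;
    memcpy(out, prefix, n);

    for (const char *p = section; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c) || c == '_') {
            if (n + 1 >= outsz)     /* keep room for the terminator */
                return -1;
            out[n++] = (char)c;
        }
    }
    out[n] = '\0';
    return 0;
}
```

With the prefix "__start_" and the section name ".sharable_data", this
produces "__start_sharable_data"; the "__end_" prefix is handled the same
way.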
· Special sections:
1. .sharable_bss.*. It has the same section type and attributes as
.sharable_bss.
2. .sharable_data.*. It has the same section type and attributes as
.sharable_data.
3. .gnu.linkonce.shrb.*. It has the same section type and attributes as
.sharable_bss. If the linker sees more than one section with the same
name, only one section will be kept.
4. .gnu.linkonce.shrd.*. It has the same section type and attributes as
.sharable_data. If the linker sees more than one section with the same
name, only one section will be kept.
5. The assembler will set the proper type and attributes for the special
sections listed above, regardless of what the assembly directive
specifies. No other sections with the SHF_SHARABLE attribute are
allowed.
6. For shared library and executable outputs, the linker will group
together .sharable_bss, .sharable_bss.* and .gnu.linkonce.shrb.*
sections to generate a single .sharable_bss section, and group together
.sharable_data, .sharable_data.* and .gnu.linkonce.shrd.* sections to
generate a single .sharable_data section. The final .sharable_bss and
.sharable_data sections should be page-aligned and padded to an
integral number of full pages. The linker will provide two sets of hidden
symbols, __sharable_bss_start/__sharable_bss_end,
__sharable_data_start/__sharable_data_end, to mark the start and the
end addresses of .sharable_bss and .sharable_data sections. The value
of the start symbol is the starting address of the section and the
value of the end symbol is the address immediately following the
section.
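
The page alignment and padding described above amount to rounding the
section's start address and size up to a page boundary. A minimal sketch,
assuming the page size is a power of two (4096 is used only as an example
value) and using hypothetical helper names:

```c
#include <stdint.h>

/* Round value up to the next multiple of page_size.
   page_size must be a power of two. */
static uint64_t page_align_up(uint64_t value, uint64_t page_size)
{
    return (value + page_size - 1) & ~(page_size - 1);
}

/* Number of whole pages between the start and end symbol values of a
   final sharable section. Because the section is page-aligned and
   padded, (end - start) is expected to be a multiple of page_size. */
static uint64_t sharable_page_count(uint64_t start, uint64_t end,
                                    uint64_t page_size)
{
    return (end - start) / page_size;
}
```

A runtime could pass the values of __sharable_data_start and
__sharable_data_end to a helper like sharable_page_count() to learn how many
pages it has to map as sharable.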
· Multiple symbols of differing types
When two symbols with the same name are input to the linker, one being
sharable and the other not sharable, the linker should decide what
to do based on the following:
1. One of them is undefined. Final symbol: SHARABLE.
2. Both are defined, common or definition.
  2.1. One is from a relocatable file and the other is from a DSO:
    2.1.1. Sharable definition comes from the relocatable file. Final
           symbol: SHARABLE.
    2.1.2. Sharable common comes from the relocatable file. ERROR.
    2.1.3. Sharable definition comes from the DSO.
      2.1.3.1. Common in the relocatable file is non-sharable. Final
               symbol: SHARABLE.
      2.1.3.2. Definition in the relocatable file is non-sharable.
               ERROR.
  2.2. Both are from relocatable files.
    2.2.1. Both are definitions. ERROR.
    2.2.2. Both are common. Final symbol: SHARABLE.
    2.2.3. One is definition and the other is common.
      2.2.3.1. Common is non-sharable. Final symbol: SHARABLE.
      2.2.3.2. Definition is non-sharable. ERROR.
  2.3. Both are from DSOs. ERROR.
END LINUX ABI proposal
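
The symbol-resolution rules in the Linux ABI proposal can be written as a
small decision function. The sketch below is illustrative only: the enum and
function names are hypothetical, and the one combination the rules do not
mention (a sharable common symbol coming from a DSO) is reported as
unspecified rather than guessed.

```c
enum sym_kind     { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINITION };
enum sym_origin   { FROM_RELOCATABLE, FROM_DSO };
enum merge_result { MERGE_SHARABLE, MERGE_ERROR, MERGE_UNSPECIFIED };

/* shr_* describes the sharable symbol, non_* the non-sharable one. */
static enum merge_result
merge_sharable(enum sym_kind shr_kind, enum sym_origin shr_from,
               enum sym_kind non_kind, enum sym_origin non_from)
{
    /* Rule 1: one of them is undefined. */
    if (shr_kind == SYM_UNDEFINED || non_kind == SYM_UNDEFINED)
        return MERGE_SHARABLE;

    /* Rule 2.3: both come from DSOs. */
    if (shr_from == FROM_DSO && non_from == FROM_DSO)
        return MERGE_ERROR;

    /* Rule 2.2: both come from relocatable files. */
    if (shr_from == FROM_RELOCATABLE && non_from == FROM_RELOCATABLE) {
        if (shr_kind == SYM_DEFINITION && non_kind == SYM_DEFINITION)
            return MERGE_ERROR;                         /* 2.2.1 */
        if (shr_kind == SYM_COMMON && non_kind == SYM_COMMON)
            return MERGE_SHARABLE;                      /* 2.2.2 */
        /* 2.2.3: fine only when the common one is the non-sharable one. */
        return non_kind == SYM_COMMON ? MERGE_SHARABLE : MERGE_ERROR;
    }

    /* Rule 2.1: one relocatable file, one DSO. */
    if (shr_from == FROM_RELOCATABLE)                   /* 2.1.1, 2.1.2 */
        return shr_kind == SYM_DEFINITION ? MERGE_SHARABLE : MERGE_ERROR;

    if (shr_kind == SYM_DEFINITION)                     /* 2.1.3 */
        return non_kind == SYM_COMMON ? MERGE_SHARABLE : MERGE_ERROR;

    /* Sharable common from a DSO: not covered by the rules. */
    return MERGE_UNSPECIFIED;
}
```

Keeping the sharable and non-sharable symbols as separate arguments means
each branch corresponds to one numbered rule, which should make the table
easy to check against a linker implementation.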