[LLVMdev] Idea for Google Summer of Code: C Compiler for EFI Byte Code implemented in LLVM


琬菁楊

Mar 10, 2010, 11:57:11 AM
to llv...@cs.uiuc.edu
Hello all,

I am highly interested in implementing a C compiler for EFI Byte Code in LLVM and participating in Google Summer of Code.

EFI is a much larger, more complex, OS-like replacement for the older BIOS
firmware interface present in all IBM PC-compatible personal computers.
The EFI specification also provides a processor-independent device driver environment (like a virtual machine), called EFI Byte Code or EBC.

The Intel(R) C Compiler for EFI Byte Code, the only C compiler for EFI Byte Code,
is not open source and is commercial software.

I think the main issue is that the EFI C dialect is not ANSI C compliant: the size of a pointer is determined at run time, and therefore the layout of a structure is not static. Does LLVM support this model?

I am also wondering whether this kind of idea is valuable to the LLVM
community, or whether there are other related ideas that are more valuable.


thanks

ching

Renato Golin

Mar 15, 2010, 11:50:01 AM
to 琬菁楊, llv...@cs.uiuc.edu
On 10 March 2010 16:57, 琬菁楊 <ching11...@g2.nctu.edu.tw> wrote:
> I think the main issue is that EFI C dialect is not ANSI-C compliant: the
> size of pointer is determined at the run-time and therefore the layout of
> the structure is not static. Does LLVM support this model?

Hi Ching,

The LLVM IR doesn't care about the size of your pointers, and this is
why you have the 'datalayout' explicit on the object file. I don't
know, however, if you can omit the layout definition and leave it for
run time.

If LLVM doesn't allow omitting the layout, it should, as we now have
a use case that needs it. If it does, it should be just a matter of
converting the current IR into EFI bytecode and creating intrinsics to
deal with the run-time variables. You could even benefit from compiling
the different languages LLVM supports into EFI bytecode...


> And I am wondering whether this kind of idea is valuable to the LLVM
> community? or are there any other related ideas is more valuable?

I think that an open source compiler to EFI byte code is not only
desirable, but necessary.


cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

琬菁楊

Mar 19, 2010, 6:47:49 AM
to Renato Golin, llv...@cs.uiuc.edu
Hello Renato and all,

I have a few questions about your mail.

2010/3/15 Renato Golin <reng...@systemcall.org>

> On 10 March 2010 16:57, 琬菁楊 <ching11...@g2.nctu.edu.tw> wrote:
>> I think the main issue is that EFI C dialect is not ANSI-C compliant: the
>> size of pointer is determined at the run-time and therefore the layout of
>> the structure is not static. Does LLVM support this model?
>
> Hi Ching,
>
> The LLVM IR doesn't care about the size of your pointers, and this is
> why you have the 'datalayout' explicit on the object file. I don't
> know, however, if you can omit the layout definition and leave it for
> run time.

Since you say LLVM IR doesn't care about the size of pointers, does that mean
that all I need to implement is the part that converts LLVM IR to EFI Byte Code
(just as we would for a native assembly target)?

> If LLVM doesn't allow omitting the layout, it should, as we now have
> an use case that needs it. If it does, it should be just a matter of
> converting the current IR into EFI bytecode and creating intrinsics to
> deal with the run-time variables. You could even benefit from having
> different languages (LLVM supports) into EFI bytecode...

I am sorry that I am not very familiar with the structure of LLVM, so I am
somewhat confused. I would truly appreciate it if you could explain in more detail.
Thanks.

ching

Andrew Lenharth

Mar 19, 2010, 9:02:19 AM
to Renato Golin, llv...@cs.uiuc.edu
On Mon, Mar 15, 2010 at 10:50 AM, Renato Golin <reng...@systemcall.org> wrote:
> On 10 March 2010 16:57, 琬菁楊 <ching11...@g2.nctu.edu.tw> wrote:
>> I think the main issue is that EFI C dialect is not ANSI-C compliant: the
>> size of pointer is determined at the run-time and therefore the layout of
>> the structure is not static. Does LLVM support this model?
>
> Hi Ching,
>
> The LLVM IR doesn't care about the size of your pointers, and this is
> why you have the 'datalayout' explicit on the object file. I don't
> know, however, if you can omit the layout definition and leave it for
> run time.

The layout for non-packed structures is, in this sense, left for
runtime. The definition, if lacking any alignment attributes, will be
laid out in the backend according to the alignment rules then. The
IR layout definition can be happily cross-platform if you always
access the structures in a type-safe way (using GEP, never casts,
etc).
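
To illustrate the type-safe access point in C terms, here is a minimal sketch. It assumes a hypothetical `struct node`; the names are illustrative and not from the thread. Accessing a field by name, or through `offsetof`, lets the compiler derive the offset from the target's layout rules, whereas a hard-coded byte offset such as 4 would only be right when pointers are 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record mixing a pointer with an integer field. */
struct node {
    void *next;
    int   value;
};

/* Type-safe access: the compiler computes the offset of 'value' for the
   target (a C front end for LLVM would typically emit a getelementptr). */
int read_value_typed(const struct node *n) {
    assert(n != NULL);
    return n->value;
}

/* Still layout-independent: the offset comes from offsetof, not from a
   literal that silently assumes a particular pointer size. */
int read_value_by_offset(const struct node *n) {
    int v;
    assert(n != NULL);
    memcpy(&v, (const unsigned char *)n + offsetof(struct node, value), sizeof v);
    return v;
}
```

Both functions return the same value on any target; replacing `offsetof(struct node, value)` with a literal would tie the code to one pointer size.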

> If LLVM doesn't allow omitting the layout, it should, as we now have
> an use case that needs it. If it does, it should be just a matter of
> converting the current IR into EFI bytecode and creating intrinsics to
> deal with the run-time variables. You could even benefit from having
> different languages (LLVM supports) into EFI bytecode...

How does EFI describe structures if the pointer size can change? This
shouldn't be a harder problem than C struct -> llvm struct. I assume
the EFI bytecode has some way to describe them. What is it?
(Question is for Ching)

Andrew

Tristan Gingold

Mar 19, 2010, 9:11:21 AM
to Andrew Lenharth, llv...@cs.uiuc.edu
> How does EFI describe structures if the pointer size can change? This
> shouldn't be a harder problem than C struct -> llvm struct. I assume
> the EFI bytecode has some way to describe them. What is it?

EFI describes structures almost like C.

There are EBC instructions that have two immediates: one for 32-bit pointers and one for 64-bit pointers.

琬菁楊

Mar 19, 2010, 2:08:23 PM
to llv...@cs.uiuc.edu

Hello Tristan and all,

I already know that if I wanted to implement this feature (C -> EFI Byte Code) in GCC,
I would have to modify the GCC front end (parser) to solve the problem (the size
of a pointer is determined at run time).

I have read a presentation about LLVM (http://llvm.org/pubs/2008-10-04-ACAT-LLVM-Intro.pdf).
Here is the LLVM-GCC design diagram (http://www.im.ntu.edu.tw/~b95030/llvm_gcc.png).
According to the discussion above, LLVM IR doesn't care about the size of pointers.
I am wondering how LLVM could support a dynamic pointer size model without modifying
the GCC front end?

thanks

ching

Chris Lattner

Mar 19, 2010, 3:47:19 PM
to 琬菁楊, llv...@cs.uiuc.edu
What do you mean by "variable sized pointers"?  What does:

struct  S {void *X; };

return for sizeof(struct S); ?

-Chris

Louis Gerbarg

Mar 19, 2010, 5:40:29 PM
to Chris Lattner, llv...@cs.uiuc.edu
It doesn't, at least not for Intel's EBC compiler. They error out on any sizeof that includes a pointer.  A piece of EBC code can run in either a 32-bit or a 64-bit environment, and everything in the compiler either needs to cope with that (by conditionally choosing the size of offsets into structs, for instance) or give up on it and abort. That also means that you cannot compile code that depends on knowing pointer sizes in the preprocessor, etc.
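
As a rough illustration of the preprocessor problem, the sketch below contrasts a compile-time pointer-size decision with one made from a run-time value. The macro and function names are hypothetical, and the run-time size is simply passed in as a parameter rather than obtained from a real EBC VM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time choice: resolved by the preprocessor, so the answer is frozen
   into the binary. Code like this cannot be pointer-size neutral. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#define POINTER_BYTES_AT_COMPILE_TIME 8
#else
#define POINTER_BYTES_AT_COMPILE_TIME 4
#endif

/* Run-time choice: the pointer size is an ordinary value, so the same code
   can serve a 32-bit and a 64-bit environment. */
size_t bytes_for_pointer_table(size_t entries, size_t runtime_pointer_bytes) {
    assert(runtime_pointer_bytes == 4 || runtime_pointer_bytes == 8);
    return entries * runtime_pointer_bytes;
}
```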

I suspect getting something like this to work would require substantial changes to any existing C frontend, since C as a language assumes knowledge of the pointer size. On the other hand, it would allow for some neat tricks, since it would allow one to compile a significant subset of C code to a pointer-neutral intermediate form. Off the top of my head I can think of several potential uses for that, such as PNaCl <http://blog.chromium.org/2010/03/native-client-and-web-portability.html>.

Louis

Chris Lattner

Mar 19, 2010, 5:55:21 PM
to Louis Gerbarg, llv...@cs.uiuc.edu
On Mar 19, 2010, at 2:40 PM, Louis Gerbarg wrote:


>> What do you mean by "variable sized pointers"?  What does:
>>
>> struct  S {void *X; };
>>
>> return for sizeof(struct S); ?
>
> It doesn't, at least not for Intel's EBC compiler. They error out on any sizeof that include a pointer.  A piece of EBC code can run in either a 32 bit or 64 bit environment, and everything in the compiler either needs to cope with it (by conditionally choosing the size of offsets into structs, for instance) or give up on it and abort. That also means that you cannot compile code that depends on knowing pointer sizes in the preprocessor, etc.

Ok, that makes sense.  It could be done by generalizing the notion of variably modified types (which are VLAs in C99) to include pointers.

> I suspect getting something like this to work would require substantial changes to any existing C frontend, since as a language assumes knowledge of pointer size. On the other hand, it would allow for some neat tricks since it would allow one to compile a significant subset of C code to a pointer neutral intermediary form. Off the top of my head I can think of several potential uses for that, such as PNaCl <http://blog.chromium.org/2010/03/native-client-and-web-portability.html>.

PNaCl is already (planned to be) built with LLVM/Clang.  They just fix the pointer size at 32 bits, which also simplifies their SFI approach on 64-bit hosts.

-Chris

琬菁楊

Mar 30, 2010, 1:31:32 PM
to Louis Gerbarg, llv...@cs.uiuc.edu
2010/3/20 Louis Gerbarg <lger...@gmail.com>
On Fri, Mar 19, 2010 at 3:47 PM, Chris Lattner <clat...@apple.com> wrote:

On Mar 19, 2010, at 11:08 AM, 琬菁楊 wrote:


Hello Tristan and all,

I have already know that if I want to do this feature(c -> EFI Byte code) for GCC
I should further modify the GCC front end(parser) to solve the problem (the size
of pointer is determined at run time).

I have read a powerpoint about LLVM (http://llvm.org/pubs/2008-10-04-ACAT-LLVM-Intro.pdf)
It is the LLVM-GCC design graph (http://www.im.ntu.edu.tw/~b95030/llvm_gcc.png).
According to the above discussion , LLVM IR doesn't care about the size of pointers.
I am wondering how could LLVM support dynamic pointer size model without modifying
GCC front end??

What do you mean by "variable sized pointers"?  What does:

struct  S {void *X; };

return for sizeof(struct S); ?

I have surveyed UEFI specification 2.3.
In my opinion, if the EBC VM is running on a 32-bit processor, the return value is 4;
if the EBC VM is running on a 64-bit processor, the return value is 8.

If the compiler errors out on any sizeof that includes a pointer, does that mean there is no issue with the pointer size being determined at run time?

 
It doesn't, at least not for Intel's EBC compiler. They error out on any sizeof that include a pointer.  A piece of EBC code can run in either a 32 bit or 64 bit environment, and everything in the compiler either needs to cope with it (by conditionally choosing the size of offsets into structs, for instance) or give up on it and abort. That also means that you cannot compile code that depends on knowing pointer sizes in the preprocessor, etc.

I suspect getting something like this to work would require substantial changes to any existing C frontend, since as a language assumes knowledge of pointer size. On the other hand, it would allow for some neat tricks since it would allow one to compile a significant subset of C code to a pointer neutral intermediary form. Off the top of my head I can think of several potential uses for that, such as PNaCl <http://blog.chromium.org/2010/03/native-client-and-web-portability.html>.

Louis

thanks

ching

Chris Lattner

Mar 30, 2010, 2:23:51 PM
to 琬菁楊, llv...@cs.uiuc.edu

On Mar 30, 2010, at 10:31 AM, 琬菁楊 wrote:

>> What do you mean by "variable sized pointers"?  What does:
>>
>> struct  S {void *X; };
>>
>> return for sizeof(struct S); ?
>
> I have surveyed the UEFI spec2.3.
> In my opinion, if the EBC VM is running on 32-bit processor, return value is 4
> if the EBC VM is running on 64-bit processor, return value is 8
>
> If error out on any sizeof that include a pointer, does it means that no issue about pointer size determined at runtime??

Yes, if it is an error, it makes it much more feasible to implement.

-Chris

琬菁楊

Mar 30, 2010, 10:41:38 PM
to Tristan Gingold, llv...@cs.uiuc.edu
Hello Tristan,

> How does EFI describe structures if the pointer size can change?  This
> shouldn't be a harder problem than C struct -> llvm struct.  I assume
> the EFI bytecode has some way to describe them.  What is it?

> EFI describe structures almost like C.
>
> There are EBC instructions that have two immediates: one for 32bits pointers and one for 64bits pointers.

I have read UEFI specification 2.3 and surveyed the code of the EBC VM.
It seems that there is no relationship between the EBC byte code and the processor.
For example, with CMPI[32|64][w|d]eq {@}R1 {Index16}, Immed16|Immed32,
which form is chosen depends not on the processor but on the compiler (meaning that a 32-bit processor could even support 64-bit operands).
Does that make it more feasible to implement?

thanks

ching

Tristan Gingold

Mar 31, 2010, 3:59:39 AM
to Chris Lattner, llv...@cs.uiuc.edu

On Mar 30, 2010, at 8:23 PM, Chris Lattner wrote:

But it is not an error; otherwise it would be hard to use malloc()-like functions.


Renato Golin

Mar 31, 2010, 5:18:39 AM
to Chris Lattner, llv...@cs.uiuc.edu
2010/3/30 Chris Lattner <clat...@apple.com>:

> If error out on any sizeof that include a pointer, does it means that no
> issue about pointer size determined at runtime??
>
> Yes, if it is an error, it makes it much more feasible to implement.

I don't see why it should be that difficult.

If sizeof becomes an intrinsic that is called at runtime to determine
the pointer size (probably stored in some global or read from a
configuration register), then the problem is solved. If the types'
sizes change too, this intrinsic could accept a parameter (enum?) with
the kind of the type.
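
A minimal sketch of that idea in plain C, assuming a hypothetical enum of type kinds and a global holding the pointer size. None of these names exist in LLVM or in the UEFI specification; a real implementation would read the size from the VM rather than from a setter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags passed to the run-time sizeof. */
enum ebc_type_kind {
    EBC_KIND_INT32,
    EBC_KIND_INT64,
    EBC_KIND_POINTER,
    EBC_KIND_NATURAL_INT   /* INTN/UINTN: same width as a pointer */
};

/* Stand-in for the value a VM would expose; set once at start-up. */
static size_t g_pointer_size = sizeof(void *);

void ebc_set_pointer_size(size_t bytes) {
    assert(bytes == 4 || bytes == 8);
    g_pointer_size = bytes;
}

/* Run-time replacement for sizeof on the scalar kinds above. */
size_t ebc_runtime_sizeof(enum ebc_type_kind kind) {
    switch (kind) {
    case EBC_KIND_INT32:       return 4;
    case EBC_KIND_INT64:       return 8;
    case EBC_KIND_POINTER:
    case EBC_KIND_NATURAL_INT: return g_pointer_size;
    }
    return 0; /* unknown kind */
}
```

This only covers scalar kinds; struct sizes and field offsets would still have to be derived from these values at run time, which is the harder part raised elsewhere in the thread.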


cheers,
--renato


琬菁楊

Mar 31, 2010, 2:48:22 PM
to Tristan Gingold, llv...@cs.uiuc.edu
Hi Tristan,


> But it is not an error, otherwise it would be hard to use malloc() like functions.

I do not understand why it would be hard to use malloc()-like functions.
The parameter passed to malloc is evaluated at run time,
so what is the issue with malloc when sizeof is determined at run time?

thanks

ching

Tristan Gingold

Apr 1, 2010, 3:55:03 AM
to 琬菁楊, llv...@cs.uiuc.edu

It would be hard to use malloc *iff* sizeof of a structure that includes a pointer is flagged as an error
by the compiler.

EBC C compilers are clearly not ANSI C conformant.

Tristan.

琬菁楊

Apr 1, 2010, 4:27:44 PM
to llv...@cs.uiuc.edu
Hello, Chris

 
2010/3/20 Chris Lattner <clat...@apple.com>


On Mar 19, 2010, at 2:40 PM, Louis Gerbarg wrote:


What do you mean by "variable sized pointers"?  What does:

struct  S {void *X; };

return for sizeof(struct S); ?

It doesn't, at least not for Intel's EBC compiler. They error out on any sizeof that include a pointer.  A piece of EBC code can run in either a 32 bit or 64 bit environment, and everything in the compiler either needs to cope with it (by conditionally choosing the size of offsets into structs, for instance) or give up on it and abort. That also means that you cannot compile code that depends on knowing pointer sizes in the preprocessor, etc.

Ok, that makes sense.  It could be done by generalizing the notions of variably modified types (which are VLAs in C99) to include pointers.

I have read about sizeof and VLAs in C99
and found an example:
EXAMPLE 3 In this example, the size of a variable-length array is computed and returned from a function:
#include <stddef.h>
size_t fsize3(int n)
{
  char b[n+3]; // variable length array
  return sizeof b; // execution time sizeof
}
int main()
{
  size_t size;
  size = fsize3(10); // fsize3 returns 13
  return 0;
}
I also found some information about VLAs in clang
(http://clang.llvm.org/cxx_compatibility.html#vla).
Does llvm/clang not support sizeof being evaluated at run time?

thanks

ching

Chris Lattner

Apr 1, 2010, 6:00:04 PM
to 琬菁楊, llv...@cs.uiuc.edu
On Apr 1, 2010, at 1:27 PM, 琬菁楊 wrote:
int main()
{
  size_t size;
  size = fsize3(10); // fsize3 returns 13
  return 0;
}
And I found some information with clang about VLA
(http://clang.llvm.org/cxx_compatibility.html#vla)
Does llvm/clang doesn't support sizeof is evaluated at run time??

Yes, clang supports VLAs as defined in C99, and sizeof can return a dynamic value.   C99 VLAs cannot occur in structs, though, and clang does not support them in structs.

-Chris

琬菁楊

Apr 2, 2010, 12:28:36 PM
to Chris Lattner, llv...@cs.uiuc.edu
Hello Chris,

I have surveyed the EFI specification and asked an EFI engineer some questions.
The differences between EFI C and ANSI C are as follows:
1. void*
    In EFI C, void* is 4 bytes on a 32-bit processor and 8 bytes on a 64-bit processor,
    and it can appear anywhere, as in ANSI C.
    So the main problem is that the layout of a struct like
    struct S{
        void* X;
    };
    is not static.
2. no floating-point support in EFI C
3. no C++ support in EFI C
4. no assembly support in EFI C; all assembly must be converted to C

I am wondering whether LLVM supports a model in which structure layout is determined at run time.
If not, do I need to modify the parser in clang to support this feature?

thanks

ching

Chris Lattner

Apr 2, 2010, 1:17:12 PM
to 琬菁楊, llv...@cs.uiuc.edu

On Apr 2, 2010, at 9:28 AM, 琬菁楊 wrote:

> Hello Chris,
>
> I have survey the efi specification and ask some question to efi engineer.
> Difference between EFI C and ANSI C is as following:
> 1. void*
> In EFI C, the void* is 4-byte for 32-bit processor and 8-byte for 64-bit processor.
> And it can appears in any where like ANSI C.
> So the main problem is that struct layout like
> struct S{
> void* X;
> };
> is not static.
> 2. no floating support in EFI C
> 3. no C++ support in EFI C
> 4. no assembly support in EFI C, all assembly must convert to C

Ok, all of this is easy except #1.

> I am wondering that does LLVM support model which structure layout is determined at run time??

No.

> If not, do I need to modify the parser in clang to support this feature??

No, please don't. This is something we specifically do not want to support. The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.
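
To make the non-constant offsets concrete, here is a small sketch that lays out a hypothetical `struct { void *p; int x; void *q; }` for a given pointer size. It assumes a pointer is aligned to its own size and that `int` is 4 bytes with 4-byte alignment; the helper names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Round 'offset' up to the next multiple of 'align' (a power of two). */
static size_t align_up(size_t offset, size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

/* Byte offsets and total size of struct { void *p; int x; void *q; }. */
struct layout {
    size_t off_p, off_x, off_q, size;
};

struct layout layout_for(size_t ptr_size) {
    struct layout l;
    size_t off = 0;
    assert(ptr_size == 4 || ptr_size == 8);
    l.off_p = off;                       off += ptr_size;
    l.off_x = align_up(off, 4);          off = l.off_x + 4;
    l.off_q = align_up(off, ptr_size);   off = l.off_q + ptr_size;
    l.size  = align_up(off, ptr_size);   /* tail padding to struct alignment */
    return l;
}
```

Under these assumptions `x` sits at offset 4 with 4-byte pointers and at offset 8 with 8-byte pointers, and `q` moves from 8 to 16, so every field access would need an offset computed at run time instead of a constant.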

-Chris

琬菁楊

Apr 2, 2010, 1:47:49 PM
to Chris Lattner, llv...@cs.uiuc.edu
Hello Chris

> If not, do I need to modify the parser in clang to support this feature??

> No, please don't.  This is something we specifically do not want to support.  The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.

Would struct field offsets no longer being constant break the subsequent optimizations in LLVM?

thanks

ching

Dale Johannesen

Apr 2, 2010, 1:56:32 PM
to Chris Lattner, llv...@cs.uiuc.edu

On Apr 2, 2010, at 10:17 AMPDT, Chris Lattner wrote:
>> Hello Chris,
>>
>> I have survey the efi specification and ask some question to efi engineer.
>> Difference between EFI C and ANSI C is as following:
>> 1. void*
>> In EFI C, the void* is 4-byte for 32-bit processor and 8-byte for 64-bit processor.
>> And it can appears in any where like ANSI C.
>> So the main problem is that struct layout like
>> struct S{
>> void* X;
>> };
>> is not static.
>
>> If not, do I need to modify the parser in clang to support this feature??
>
> No, please don't. This is something we specifically do not want to support. The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.

LLVM may not be a good match for this project, but there's prior art elsewhere; have a look at ANDF.

Russell Wallace

Apr 3, 2010, 8:36:14 AM
to Chris Lattner, llv...@cs.uiuc.edu
On Fri, Apr 2, 2010 at 6:17 PM, Chris Lattner <clat...@apple.com> wrote:
> No, please don't.  This is something we specifically do not want to support.  The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.

What about declaring that pointers are always 64 bits, for all
purposes other than final code generation of actual pointer
instructions? Would that solve the problem?

Chris Lattner

Apr 3, 2010, 1:27:54 PM
to Russell Wallace, llv...@cs.uiuc.edu

On Apr 3, 2010, at 5:36 AM, Russell Wallace wrote:

> On Fri, Apr 2, 2010 at 6:17 PM, Chris Lattner <clat...@apple.com> wrote:
>> No, please don't. This is something we specifically do not want to support. The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.
>
> What about declaring that pointers are always 64 bits, for all
> purposes other than final code generation of actual pointer
> instructions? Would that solve the problem?

Yep. That would be a fine approach, and probably conformant to the spec.
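
One way to picture the "always 64 bits" approach in C is a fixed 8-byte slot that holds a pointer whatever the native width is. This is only a sketch under that assumption; `ptr_slot` and the helper functions are hypothetical names, not anything proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte slot holding a pointer on both 32-bit and 64-bit hosts. */
typedef union {
    void    *ptr;
    uint64_t bits;
} ptr_slot;

/* With the slot in place, the offset of 'tag' no longer depends on the
   native pointer width. */
struct S {
    ptr_slot X;
    int32_t  tag;
};

void slot_store(ptr_slot *s, void *p) {
    assert(s != NULL);
    s->bits = 0;   /* clear the bytes a 32-bit pointer would leave unused */
    s->ptr  = p;
}

void *slot_load(const ptr_slot *s) {
    assert(s != NULL);
    return s->ptr;
}
```

The cost is that `sizeof` and field offsets no longer match what a native 32-bit compiler would produce for the same declaration.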

-Chris

琬菁楊

Apr 3, 2010, 2:48:27 PM
to Chris Lattner, llv...@cs.uiuc.edu
Hello Chris, Russell

> What about declaring that pointers are always 64 bits, for all
> purposes other than final code generation of actual pointer
> instructions? Would that solve the problem?

> Yep.  That would be a fine approach, and probably conformant to the spec.

I have some questions about your discussion.
The main differences are as follows:

>1. void*
>     In EFI C, the void* is 4-byte for 32-bit processor and 8-byte for 64-bit processor.
>     And it can appears in any where like ANSI C.
>     So the main problem is that struct layout like
>     struct S{
>         void* X;
>     };
>     is not static.
> 2. no floating support in EFI C
> 3. no C++ support in EFI C
> 4. no assembly support in EFI C, all assembly must convert to C
In my opinion, the main compilation process is C --a--> LLVM IR --b--> EBC byte code.
So, as you say, should I handle differences 2, 3 and 4 in step a, assume void* is 64-bit in step a,
and then handle difference 1 in step b?

thanks

ching

Tristan Gingold

Apr 6, 2010, 4:10:44 AM
to Russell Wallace, llv...@cs.uiuc.edu

On Apr 3, 2010, at 2:36 PM, Russell Wallace wrote:

> On Fri, Apr 2, 2010 at 6:17 PM, Chris Lattner <clat...@apple.com> wrote:
>> No, please don't. This is something we specifically do not want to support. The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.
>
> What about declaring that pointers are always 64 bits, for all
> purposes other than final code generation of actual pointer
> instructions? Would that solve the problem?

No, sizeof will report wrong values.

Tristan.

Chris Lattner

Apr 6, 2010, 12:27:58 PM
to Tristan Gingold, llv...@cs.uiuc.edu

On Apr 6, 2010, at 1:10 AM, Tristan Gingold wrote:

>
> On Apr 3, 2010, at 2:36 PM, Russell Wallace wrote:
>
>> On Fri, Apr 2, 2010 at 6:17 PM, Chris Lattner <clat...@apple.com> wrote:
>>> No, please don't. This is something we specifically do not want to support. The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.
>>
>> What about declaring that pointers are always 64 bits, for all
>> purposes other than final code generation of actual pointer
>> instructions? Would that solve the problem?
>
> No, sizeof will report wrong values.

Which won't matter (as long as sizeof is consistent), because EFI is a closed system.

-Chris

琬菁楊

Apr 6, 2010, 1:02:42 PM
to Chris Lattner, llv...@cs.uiuc.edu
Hello Chris,

> Which won't matter (as long as sizeof is consistent), because EFI is a closed system.

What is the meaning of "closed system"?

Does it mean
1. not open source, or
2. that an EBC binary only runs on a single EFI EBC interpreter and never interfaces with anything outside the VM?

thanks

ching

Chris Lattner

Apr 6, 2010, 1:12:13 PM
to 琬菁楊, llv...@cs.uiuc.edu
#2.

琬菁楊

Apr 6, 2010, 3:02:00 PM
to Chris Lattner, llv...@cs.uiuc.edu
Hello Chris

> I have survey the efi specification and ask some question to efi engineer.
> Difference between EFI C and ANSI C is as following:
> 1. void*
>     In EFI C, the void* is 4-byte for 32-bit processor and 8-byte for 64-bit processor.
>     And it can appears in any where like ANSI C.
>     So the main problem is that struct layout like
>     struct S{
>         void* X;
>     };
>     is not static.
> 2. no floating support in EFI C
> 3. no C++ support in EFI C
> 4. no assembly support in EFI C, all assembly must convert to C

Ok, all of this is easy except #1.

> I am wondering that does LLVM support model which structure layout is determined at run time??

No.

> If not, do I need to modify the parser in clang to support this feature??

No, please don't.  This is something we specifically do not want to support.  The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.

I have read the EFI specification v1.10 and found that natural indexing addresses the dynamic structure layout problem:

19.4.4 Natural Units
Natural units are used when a structure has fields that can vary with the architecture of the
processor. Fields that precipitate the use of natural units include pointers and EFI INTN and
UINTN data types. The size of one pointer or INTN/UINTN equals one natural unit. The natural
units field in an index encoding is a count of the number of natural fields whose sizes (in bytes)
must be added to determine a field offset.
As an example, assume that a given EBC instruction specifies a 16-bit index of 0xA048. This
breaks down into:
• Sign bit (bit 15) = 1 (negative offset)
• Bits assigned to natural units (w, bits 14-12) = 2. Multiply by index size in bytes = 2 x 2 = 4 (A)
• c = bits 11-4 = 4
• n = bits 3-0 = 8
On a 32-bit machine, the offset is then calculated to be:
• Offset = (4 + 8 * 4) * -1 = -36
On a 64-bit machine, the offset is calculated to be:
• Offset = (4 + 8 * 8) * -1 = -68
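
The worked example above can be checked with a short decoder. This is a sketch for 16-bit indexes only, written from the field description quoted here; `decode_index16` is a hypothetical helper name, and treating a natural-units width above 12 bits as invalid is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a 16-bit EBC natural index into a byte offset for the given
   pointer size (4 or 8 bytes). */
int32_t decode_index16(uint16_t index, unsigned pointer_size) {
    unsigned negative = (index >> 15) & 1u;       /* bit 15: sign */
    unsigned w        = (index >> 12) & 0x7u;     /* bits 14..12 */
    unsigned a        = w * 2u;                   /* width of n: w * index size in bytes */
    unsigned payload  = index & 0x0FFFu;          /* bits 11..0 hold c and n */
    unsigned n, c;
    int32_t  offset;

    assert(pointer_size == 4 || pointer_size == 8);
    assert(a <= 12);                              /* assumption: wider is invalid */

    n = payload & ((1u << a) - 1u);               /* natural units: bits a-1..0 */
    c = payload >> a;                             /* constant units: bits 11..a */
    offset = (int32_t)(c + n * pointer_size);
    return negative ? -offset : offset;
}
```

For the index 0xA048 this gives w = 2, A = 4, n = 8 and c = 4, so the offset is -36 with 4-byte pointers and -68 with 8-byte pointers, matching the specification text.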

With this indexing model, the dynamic structure layout problem seems to be solved by the underlying EBC VM.
Although a data field is at a different address on 32-bit and 64-bit processors,
both can use the same encoding.

Does that mean that issue 1 can be solved?

thanks

ching

Tristan Gingold

Apr 7, 2010, 3:03:35 AM
to Chris Lattner, llv...@cs.uiuc.edu

On Apr 6, 2010, at 6:27 PM, Chris Lattner wrote:

>
> On Apr 6, 2010, at 1:10 AM, Tristan Gingold wrote:
>
>>
>> On Apr 3, 2010, at 2:36 PM, Russell Wallace wrote:
>>
>>> On Fri, Apr 2, 2010 at 6:17 PM, Chris Lattner <clat...@apple.com> wrote:
>>>> No, please don't. This is something we specifically do not want to support. The issue is not the parser, the issue is that struct field offsets are no longer constant in this model.
>>>
>>> What about declaring that pointers are always 64 bits, for all
>>> purposes other than final code generation of actual pointer
>>> instructions? Would that solve the problem?
>>
>> No, sizeof will report wrong values.
>
> Which won't matter (as long as sizeof is consistent), because EFI is a closed system.

No, EFI is not that closed: boot loaders interface with EFI.

Don't forget that EBC code can call native functions. If sizeof or field offsets mismatch, I fear that the
program won't work.
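
A small sketch of that concern, assuming a hypothetical `struct { void *p; int x; }` shared between EBC code compiled with pointers fixed at 8 bytes and a native function compiled for the real pointer width. The function names are illustrative, and the layout rule (4-byte `int`, pointer aligned to its own size) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of 'x' in struct { void *p; int x; } when a pointer occupies
   'ptr_size' bytes. */
size_t offset_of_x(size_t ptr_size) {
    assert(ptr_size == 4 || ptr_size == 8);
    return ptr_size;   /* x follows p directly; ptr_size is a multiple of 4 */
}

/* Returns 1 if a layout computed with 8-byte pointers agrees with the
   layout a native function would use, 0 otherwise. */
int fixed64_matches_native(size_t native_ptr_size) {
    return offset_of_x(8) == offset_of_x(native_ptr_size);
}
```

On a 64-bit host the two layouts agree; on a 32-bit host the native code expects `x` at offset 4 while the fixed-width layout puts it at offset 8.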

Tristan.

琬菁楊

Apr 7, 2010, 4:10:45 AM
to llv...@cs.uiuc.edu
Hello Tristan

> What about declaring that pointers are always 64 bits, for all
> purposes other than final code generation of actual pointer
> instructions? Would that solve the problem?

> No, sizeof will report wrong values.

Could I modify the parser to make sizeof a function,
and use natural indexing (i.e. the natural units):

19.4 Natural Indexing
The natural indexing mechanism is the critical functionality that enables EBC to be executed
unchanged on 32- or 64-bit systems. Natural indexing is used to specify the offset of data relative
to a base address. However, rather than specifying the offset as a fixed number of bytes, the offset
is encoded in a form that specifies the actual offset in two parts: a constant offset, and an offset
specified as a number of natural units (where one natural unit = sizeof (VOID *)). These two
values are used to compute the actual offset to data at runtime. When the VM decodes an index
during execution, the resultant offset is computed based on the natural processor size. The encoded
indexes themselves may be 16, 32, or 64 bits in size. Table 19-4 describes the fields in a natural
index encoding.

Table 19-4. Index Encoding
Bit #       Description
N           Sign bit (sign), most significant bit
N-3..N-1    Bits assigned to natural units (w)
A..N-4      Constant units (c)
0..A-1      Natural units (n)

As shown in Table 19-4, for a given encoded index, the most significant bit (bit N) specifies the
sign of the resultant offset after it has been calculated. The sign bit is followed by three bits
(N-3..N-1) that are used to compute the width of the natural units field (n). The value (w) from
this field is multiplied by the index size in bytes to determine the actual width (A) of the natural
units field (n). Once the width of the natural units field has been determined, then the natural units
(n) and constant units (c) can be extracted. The offset is then calculated at runtime according to the
following equation:
Offset = (c + n * (sizeof (VOID *))) * sign
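
Going the other way, a compiler back end would have to produce such an index from a constant part and a count of natural units. The sketch below does this for 16-bit indexes, based on Table 19-4 as quoted; `encode_index16` is a hypothetical helper, and choosing the smallest w that fits is a choice made for this sketch, not something the quoted text requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a 16-bit natural index from constant units c, natural units n and
   a sign. Returns 1 on success, or 0 (leaving *out untouched) if the
   values do not fit in 16 bits. */
int encode_index16(int negative, unsigned c, unsigned n, uint16_t *out) {
    unsigned w;
    assert(out != NULL);
    for (w = 0; w <= 6; ++w) {
        unsigned a      = w * 2u;    /* bits given to n: w * index size in bytes */
        unsigned c_bits = 12u - a;   /* bits left for c */
        if ((n >> a) != 0) continue;       /* n does not fit in a bits */
        if ((c >> c_bits) != 0) continue;  /* c does not fit in the rest */
        *out = (uint16_t)(((negative ? 1u : 0u) << 15) | (w << 12) | (c << a) | n);
        return 1;
    }
    return 0;
}
```

For a struct field preceded by 4 bytes of fixed-size data and 8 pointer-sized fields, `encode_index16(1, 4, 8, &e)` yields 0xA048, the same index used in the specification's worked example.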

to evaluate sizeof(void*) at run time?

Maybe the following code could obtain sizeof(void*) on both 32-bit and 64-bit processors:

MOVI         R1, 0
MOVI         R2, Label
ADD32         R1, @R2 (1, 0)
Label:

And could the natural indexing mechanism be used to solve the dynamic structure layout problem?

thanks

ching