I'm working on refining the definition of getelementptr (GEP) to
clarify its semantics and to allow front-ends to provide additional
information to optimization passes.
To help support this, I'd like to propose the following rule:
Any memory access must be done through a pointer value associated
with the address range of the memory access; otherwise the behavior
is undefined.
"associated with" is defined as follows:
- A pointer value formed from a getelementptr instruction is
associated with the addresses associated with the first operand of
the getelementptr.
- The address of a global variable is associated with the address
range of the variable's storage.
- The result value of an allocation instruction is associated with
the address range of the allocated storage.
- A null pointer is associated with no addresses.
- A pointer value formed by a ptrtoint is associated with all address
ranges of all pointer values that contribute (directly or
indirectly) to the computation of the pointer's value.
- An integer value other than zero may be associated with address
ranges allocated through mechanisms other than those provided by
LLVM; such ranges shall not overlap with any ranges of addresses
allocated by mechanisms provided by LLVM.
For example, in an instruction like this:
%p = getelementptr [4 x i8]* @A, i32 0, i32 %huge
if %huge is beyond the size of @A, %p could potentially point into
some object other than @A. This rule says that it's not valid to use
%p to access that other object. %p would only be usable for memory
accesses when it points within @A.
C and C-like front-ends already obey this rule, since in C pointer
arithmetic results must remain within an allocated object (or one
past the end), which is more strict.
Front-ends that do crazy things to pointers may need to use
ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
target-independent properties of getelementptr are needed, the
"getelementptr null" trick may be useful.
With this rule, BasicAliasAnalysis and similar analyses that depend
on being able to track a pointer value up to its base value will be
able to safely climb through getelementptr definitions without
needing any further guarantees.
That means that this rule will eliminate the most severe need for
having getelementptr overflow produce undefined results. This will
make it practical to have an undefined-on-overflow flag for
getelementptr that can be meaningfully set to either true or false.
A non-overflowing option for getelementptr would be useful to allow
SCEVExpander and other passes to convert code like this:
%a = mul %x, 4
%b = add %y, %a
%c = getelementptr [4 x i8]* @Object, 0, %b
into this:
%c = getelementptr [4 x i8]* @Object, %x, %y
which wouldn't otherwise be safe because (getelementptr @Object, %x)
by itself might overflow. (From a low-level perspective this isn't
much of an optimization, but from a high-level perspective it's
easier for non-SCEV-based passes to analyze.)
Comments and suggestions are welcome,
Dan
> - A pointer value formed by a ptrtoint is associated with all address
> ranges of all pointer values that contribute (directly or
> indirectly) to the computation of the pointer's value.
Do you mean inttoptr here? And that "all pointer values that contribute
(directly or indirectly) to the computation of the pointer's value" could
equally mean "all pointer values that contribute (directly or indirectly) to
the computation of the integer's value?"
> - An integer value other than zero may be associated with address
> ranges allocated through mechanisms other than those provided by
> LLVM; such ranges shall not overlap with any ranges of addresses
> allocated by mechanisms provided by LLVM.
What's the intent here? Are you basically saying that memory regions accessed
through random integer arithmetic can't overlap memory regions allocated by
alloca, malloc, etc.? If so, my sense is that's too demanding. In general
one does not know the aliasing properties of addresses computed entirely by
integer arithmetic. It could be some misguided user trying to "help" the
compiler by explicitly linearizing addressing or it could be an embedded
developer computing a memory-mapped I/O address. There's just no way to
know.
> Front-ends that do crazy things to pointers may need to use
> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
> target-independent properties of getelementptr are needed, the
> "getelementptr null" trick may be useful.
To clarify, the ptrtoint+arithmetic+inttoptr would still be legal?
-Dave
Might want to clarify this to include calloc/realloc.
> - A null pointer is associated with no addresses.
A null pointer in address space 0.
> - A pointer value formed by a ptrtoint is associated with all address
> ranges of all pointer values that contribute (directly or
> indirectly) to the computation of the pointer's value.
I assume you mean inttoptr?
> A non-overflowing option for getelementptr would be useful to allow
> SCEVExpander and other passes to convert code like this:
> %a = mul %x, 4
> %b = add %y, %a
> %c = getelementptr [4 x i8]* @Object, 0, %b
>
> into this:
> %c = getelementptr [4 x i8]* @Object, %x, %y
So a "defined-overflow" gep would act like wrapping integer arithmetic
in the width of the pointer? Then what exactly is the definition of
an "undefined-overflow" gep? Here's a first shot at a definition:
"Considering a GEP as a series of calculations, one for each index, if
the result that would be obtained by performing exact arithmetic
(treating the index as a signed integer) is outside the bounds of the
address range of the base pointer, the result is undefined."
-Eli
Yes, I think that's the intent.
> If so, my sense is that's too demanding. In general
> one does not know the aliasing properties of addresses computed entirely by
> integer arithmetic. It could be some misguided user trying to "help" the
> compiler by explicitly linearizing addressing or it could be an embedded
> developer computing a memory-mapped I/O address. There's just no way to
> know.
If a memory-mapped I/O address aliases the stack, there's something
seriously messed up going on... of course, we wouldn't assume anything
about whether 1000 and 2000 alias.
>> Front-ends that do crazy things to pointers may need to use
>> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
>> target-independent properties of getelementptr are needed, the
>> "getelementptr null" trick may be useful.
>
> To clarify, the ptrtoint+arithmetic+inttoptr would still be legal?
Yes, I think that is the intent.
-Eli
> On Wednesday 22 July 2009 13:30, Dan Gohman wrote:
>
>> - A pointer value formed by a ptrtoint is associated with all address
>> ranges of all pointer values that contribute (directly or
>> indirectly) to the computation of the pointer's value.
>
> Do you mean inttoptr here? And that "all pointer values that contribute
> (directly or indirectly) to the computation of the pointer's value" could
> equally mean "all pointer values that contribute (directly or indirectly) to
> the computation of the integer's value?"
Yes, I meant inttoptr instead of ptrtoint here.
>
>> - An integer value other than zero may be associated with address
>> ranges allocated through mechanisms other than those provided by
>> LLVM; such ranges shall not overlap with any ranges of addresses
>> allocated by mechanisms provided by LLVM.
>
> What's the intent here? Are you basically saying that memory regions
> accessed through random integer arithmetic can't overlap memory regions
> allocated by alloca, malloc, etc.? If so, my sense is that's too
> demanding. In general one does not know the aliasing properties of
> addresses computed entirely by integer arithmetic. It could be some
> misguided user trying to "help" the compiler by explicitly linearizing
> addressing or it could be an embedded developer computing a
> memory-mapped I/O address. There's just no way to know.
The intent is to cover code like this:
%p = alloca i32
%q = inttoptr i64 0123456789 to i32*
Here, %p and %q are assumed to be non-aliasing. This kind of thing
is used by some JITs; they allocate objects via custom mechanisms
and then tell LLVM IR about the address of the objects via magic
integer constants like this. The optimizer is already making this
assumption today, though implicitly.
If the user hand-linearizes addressing using casts to integers,
it will still be supported -- that's what the broad language for
inttoptr above is intended to cover.
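To make that concrete, here is a rough sketch (with an invented global
@A and indices %i and %j) of hand-linearized addressing that would
remain well defined under the rule above, because the inttoptr result
is associated with @A through the contributing ptrtoint:

@A = global [10 x [20 x i32]] zeroinitializer

; &A[%i][%j], computed by hand in the integer domain
%base = ptrtoint [10 x [20 x i32]]* @A to i64
%row = mul i64 %i, 80        ; 20 columns * 4 bytes per i32
%col = mul i64 %j, 4
%off = add i64 %row, %col
%addr = add i64 %base, %off
%p = inttoptr i64 %addr to i32*
%v = load i32* %p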
>
>> Front-ends that do crazy things to pointers may need to use
>> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
>> target-independent properties of getelementptr are needed, the
>> "getelementptr null" trick may be useful.
>
> To clarify, the ptrtoint+arithmetic+inttoptr would still be legal?
Yes, subject to the constraints in the bullet point quoted at the
top of this email, which effectively just spells out rules that
are already assumed today.
Dan
> The intent is to cover code like this:
>
> %p = alloca i32
> %q = inttoptr i64 0123456789 to i32*
>
> Here, %p and %q are assumed to be non-aliasing. This kind of thing
> is used by some JITs; they allocate objects via custom mechanisms
> and then tell LLVM IR about the address of the objects via magic
> integer constants like this. The optimizer is already making this
> assumption today, though implicitly.
But how do you know 0123456789 doesn't point to the same thing %p does?
I understand it's highly unlikely and that we'd like to define that
there's no aliasing. I just get a little squeamish, especially when
considering embedded devices.
> If the user hand-linearizes addressing using casts to integers,
> it will still be supported -- that's what the broad language for
> inttoptr above is intended to cover.
Right. I was thinking more along the lines of the user "knowing" where
some global is allocated and computing that address directly, without
using pointers at all.
> > To clarify, the ptrtoint+arithmetic+inttoptr would still be legal?
>
> Yes, subject to the constraints in the bullet point quoted at the
> top of this email, which effectively just spells out rules that
> are already assumed today.
Those rules are ok, I think. It's the computed-integer problem that
is gnawing at me.
I *think* the C standard says there can't be an alias and I assume the
Fortran standard does as well. But even so, we all know that users
often ignore standards when convenient.
-Dave
If I compute an out-of-bounds GEP and use it in an ICMP, is that
undefined behaviour, or wrapping behaviour?
Instcombine currently does this transform ((gep Ptr, OFFSET) cmp Ptr)
---> (OFFSET cmp 0), which is perfectly fine when GEP is assumed to
have undefined behavior on overflow, but is wrong if GEP is allowed
to overflow with well defined behavior (it wraps).
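Roughly, the transform in question looks like this (sketched with
hypothetical operands %Ptr and %OFFSET, on a 64-bit target):

; before
%p = getelementptr i32* %Ptr, i64 %OFFSET
%cmp = icmp eq i32* %p, %Ptr

; after
%cmp = icmp eq i64 %OFFSET, 0

; With wrapping semantics these aren't equivalent: for example with
; %OFFSET = 2^62 the byte offset 4 * 2^62 wraps to 0, so %p == %Ptr
; even though %OFFSET != 0.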
Also, can the behavior of GEP be changed depending on llvm-gcc's
-fno-strict-aliasing/-fwrapv/-fno-strict-pointer-overflow flags?
> Front-ends that do crazy things to pointers may need to use
> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
> target-independent properties of getelementptr are needed, the
> "getelementptr null" trick may be useful.
Will various optimizations make sure they don't introduce undefined
operations in programs with otherwise well defined semantics?
I think instcombine may turn such ptrtoint+arithmetic+inttoptr
sequences into GEPs, thus transforming a program with well defined
semantics according to the above rules into undefined semantics. Of
course, if the optimizer can see that the access is always well
defined it should do this transform.
Best regards,
--Edwin
> On Wed, Jul 22, 2009 at 11:30 AM, Dan Gohman<goh...@apple.com> wrote:
>
>> - The result value of an allocation instruction is associated with
>> the address range of the allocated storage.
>
> Might want to clarify this to include calloc/realloc.
These sort of fall under the category of allocations done
outside of LLVM IR, but it'd be better to address this
case explicitly. I'll make a note of that.
>
>> - A null pointer is associated with no addresses.
>
> A null pointer in address space 0.
I'm not fond of weird address-space semantics, but for consistency
with what the optimizer is currently doing, you're right.
>
>> - A pointer value formed by a ptrtoint is associated with all address
>> ranges of all pointer values that contribute (directly or
>> indirectly) to the computation of the pointer's value.
>
> I assume you mean inttoptr?
Yes.
>
>> A non-overflowing option for getelementptr would be useful to allow
>> SCEVExpander and other passes to convert code like this:
>> %a = mul %x, 4
>> %b = add %y, %a
>> %c = getelementptr [4 x i8]* @Object, 0, %b
>>
>> into this:
>> %c = getelementptr [4 x i8]* @Object, %x, %y
>
> So a "defined-overflow" gep would act like wrapping integer arithmetic
> in the width of the pointer? Then what exactly is the definition of
> an "undefined-overflow" gep? Here's a first shot at a definition:
> "Considering a GEP as a series of calculations, one for each index, if
> the result that would be obtained by performing exact arithmetic
> (treating the index as a signed integer) is outside the bounds of the
> address range of the base pointer, the result is undefined."
Yes, something like this. I had been thinking of defining gep overflow
in terms of integer overflow, and then just saying that since LLVM IR
doesn't usually know what the address is, the only way to be safe is to
stay within the object, but your suggestion might be simpler.
Also, for completeness, the conversion of indices to pointer-sized
integers shouldn't change their value, the multiplication of the
indices by array element sizes shouldn't overflow, and computing an
address one-past-the-end is ok.
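For example (a sketch, not from the original message, with an assumed
in-bounds pointer %p into @A), the usual one-past-the-end loop idiom
stays defined as long as that pointer is only compared, never
dereferenced:

@A = global [4 x i32] zeroinitializer

%end = getelementptr [4 x i32]* @A, i64 0, i64 4   ; one past the end: ok to compute
%cmp = icmp eq i32* %p, %end                       ; ok to compare against
; loading through %end itself would be the undefined case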
Dan
> On 2009-07-22 21:30, Dan Gohman wrote:
>
>> Hello,
>>
>> I'm working on refining the definition of getelementptr (GEP) to
>> clarify its semantics and to allow front-ends to provide additional
>> information to optimization passes.
>>
>> To help support this, I'd like to propose the following rule:
>>
>> Any memory access must be done through a pointer value associated
>> with the address range of the memory access; otherwise the behavior
>> is undefined.
>
> If I compute an out-of-bounds GEP and use it in an ICMP, is that
> undefined behaviour, or wrapping behaviour?
This will be addressed when the wrapping option for GEP is available.
A wrapping GEP will provide wrapping behavior, and a non-wrapping
GEP will have an undefined result on overflow. So it's up to the
front-end to decide which one it wants.
>
> Instcombine currently does this transform ((gep Ptr, OFFSET) cmp Ptr)
> ---> (OFFSET cmp 0), which is perfectly fine when GEP is assumed to
> have undefined behavior on overflow, but is wrong if GEP is allowed
> to overflow with well defined behavior (it wraps).
>
> Also, can the behavior of GEP be changed depending on llvm-gcc's
> -fno-strict-aliasing/-fwrapv/-fno-strict-pointer-overflow flags?
Yes. To implement -fwrapv, the front-end can omit any of the no-overflow
flags, and choose the wrapping form of GEP, as appropriate.
I'm not familiar with -fno-strict-pointer-overflow. Do you mean
-fno-strict-overflow? I guess for LLVM that would be equivalent
to -fwrapv.
-fno-strict-aliasing is not impacted.
>
>> Front-ends that do crazy things to pointers may need to use
>> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
>> target-independent properties of getelementptr are needed, the
>> "getelementptr null" trick may be useful.
>
> Will various optimizations make sure they don't introduce undefined
> operations in programs with otherwise well defined semantics?
> I think instcombine may turn such ptrtoint+arithmetic+inttoptr
> sequences into GEPs, thus transforming a program with well defined
> semantics according to the above rules into undefined semantics. Of
> course, if the optimizer can see that the access is always well
> defined it should do this transform.
Yes. One of the goals of this effort is to legitimize some of what the
optimizer is already doing, and another is to make some of the things
it's doing dependent on explicit permission in the form of no-overflow
flags.
Dan
Yes, I mean -fno-strict-overflow.
Since LLVM has defined semantics for signed overflow, -fwrapv doesn't
add anything wrt. -fno-strict-overflow, yes.
> -fno-strict-aliasing is not impacted.
>
Will LLVM's default behavior be that of -fstrict-aliasing, or
-fno-strict-aliasing?
We don't have TBAA, but the "computed pointer doesn't alias existing
pointers" sounds like something for -fno-strict-aliasing, no?
I'm not sure what gcc does here; does -fno-strict-aliasing have any
effect on pointers computed from ints?
>
>>> Front-ends that do crazy things to pointers may need to use
>>> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
>>> target-independent properties of getelementptr are needed, the
>>> "getelementptr null" trick may be useful.
>>
>> Will various optimizations make sure they don't introduce undefined
>> operations in programs with otherwise well defined semantics?
>> I think instcombine may turn such ptrtoint+arithmetic+inttoptr
>> sequences into GEPs, thus transforming a program with well defined
>> semantics according to the above rules into undefined semantics. Of
>> course, if the optimizer can see that the access is always well
>> defined it should do this transform.
>
> Yes. One of the goals of this effort is to legitimize some of what the
> optimizer is already doing, and another is to make some of the things
> it's doing dependent on explicit permission in the form of no-overflow
> flags.
>
I think this is great. Front-ends that always want well-defined
behavior can just set the flags appropriately.
Best regards,
--Edwin
What happens when you allocate more room for a variable than its type
says it needs,
and then access the variable beyond the limit of what its type would
allow (but is otherwise valid, because you did allocate enough bytes)?
I assume that the intention is to treat these as well defined accesses
(pointer is within address range), is that right?
I'm thinking of flexible array members in structs, and things like:
struct foo
{
  ...
  char var[1];
};

struct foo *x = malloc(sizeof(*x) + len);
x->var[len-1];
Best regards,
--Edwin
> On 2009-07-23 00:02, Dan Gohman wrote:
>
>> -fno-strict-aliasing is not impacted.
>
> Will LLVM's default behavior be that of -fstrict-aliasing, or
> -fno-strict-aliasing?
> We don't have TBAA, but the "computed pointer doesn't alias existing
> pointers" sounds like something for -fno-strict-aliasing, no?
> I'm not sure what gcc does here; does -fno-strict-aliasing have any
> effect on pointers computed from ints?
It does sound like it from the name, but no, it's unrelated.
Dan
Yep, this is fine. The types in a getelementptr are just used to
guide offset computation; they don't say anything about how the
underlying memory was allocated.
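As an illustrative sketch of Edwin's example in IR terms (the struct
layout and the values %mem and %len are assumptions, not from the
original message):

%struct.foo = type { i32, [1 x i8] }

; %mem is assumed to point at sizeof(struct foo) + %len bytes
%x = bitcast i8* %mem to %struct.foo*
%idx = sub i64 %len, 1
; the [1 x i8] field type is over-indexed, but the address stays
; within the allocation, so the access is fine under the proposed rule
%p = getelementptr %struct.foo* %x, i64 0, i32 1, i64 %idx
%c = load i8* %p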
Dan
> On Wednesday 22 July 2009 15:05, Dan Gohman wrote:
>
>> The intent is to cover code like this:
>>
>> %p = alloca i32
>> %q = inttoptr i64 0123456789 to i32*
>>
>> Here, %p and %q are assumed to be non-aliasing. This kind of thing
>> is used by some JITs; they allocate objects via custom mechanisms
>> and then tell LLVM IR about the address of the objects via magic
>> integer constants like this. The optimizer is already making this
>> assumption today, though implicitly.
>
> But how do you know 0123456789 doesn't point to the same thing %p does?
> I understand it's highly unlikely and that we'd like to define that
> there's no aliasing. I just get a little squeamish, especially when
> considering embedded devices.
Without this assumption, much of LLVM would be useless. It's
more practical to make LLVM entirely useless to obscure
hypothetical users than to make it mostly useless to most
current real users.
> Those rules are ok, I think. It's the computed-integer problem that
> is gnawing at me.
>
> I *think* the C standard says there can't be an alias and I assume the
> Fortran standard does as well. But even so, we all know that users
> often ignore standards when convenient.
If anyone comes forward with a real problem that this is causing,
we'll consider the problem in context, where there may be alternative
solutions.
Dan
>>> - A null pointer is associated with no addresses.
>>
>> A null pointer in address space 0.
>
> I'm not fond of weird address-space semantics, but for consistency
> with what the optimizer is currently doing, you're right.
FWIW, this will be addressed when we finally get around to adding
explicit support for trappability to load/store.
-Chris
>
> On Jul 22, 2009, at 1:32 PM, Dan Gohman wrote:
>
>>>> - A null pointer is associated with no addresses.
>>>
>>> A null pointer in address space 0.
>>
>> I'm not fond of weird address-space semantics, but for consistency
>> with what the optimizer is currently doing, you're right.
>
> FWIW, this will be addressed when we finally get around to adding
> explicit support for trappability to load/store.
This won't free the optimizer to treat null as an invalid pointer
though.
Some LLVM hacker thought it would be a good idea to use address
spaces 256 and 257 on x86 in a way that can require dereferences
of address 0 to be valid.
Dan
Would it be possible to make the following slight change to ExecutionEngine::create()?
I would like to be able to disable the automatic enumeration of loaded modules, and the subsequent searching for undefined symbols (using SearchForAddressOfSymbol).
I propose adding an extra parameter to the function definition, defaulting to true.
static ExecutionEngine *create(ModuleProvider *MP,
                               bool ForceInterpreter = false,
                               std::string *ErrorStr = 0,
                               bool Fast = false,
                               bool EnumModules = true);
And the corresponding change to the function:
ExecutionEngine *ExecutionEngine::create(ModuleProvider *MP,
                                         bool ForceInterpreter,
                                         std::string *ErrorStr,
                                         bool Fast,
                                         bool EnumModules) {
  ExecutionEngine *EE = 0;

  if (EnumModules) {
    // Make sure we can resolve symbols in the program as well. The zero arg
    // to the function tells DynamicLibrary to load the program, not a library.
    if (sys::DynamicLibrary::LoadLibraryPermanently(0, ErrorStr))
      return 0;
  }
  ...
Your help would be appreciated.
Rob.
> Hi,
>
> Would it be possible to make the following slight change to
> ExecutionEngine::create()?
>
> I would like to be able to disable the automatic enumeration of
> loaded modules, and the subsequent searching for undefined symbols
> (using SearchForAddressOfSymbol).
>
> I propose adding an extra parameter to the function definition,
> defaulting to true.
I'd rather not. The current API is messy enough already. Is there not
another way to solve the problem?
Evan
Can you comment on exactly what the problem is you want to solve? Is
it a performance issue with LoadLibraryPermanently, or do you simply
not want the external symbols to be resolved from within the JIT?
- Daniel
What you are proposing is a major change in the semantics of llvm.
You are taking certain uses of an instruction that have well defined
behaviour and undefining them.
Have you made any estimate of how many people's code this may or may not
break?
I think this is a *very* bad idea.
Let me make some more detailed comments:
I notice no mention of bitcasts on pointers here.
Intermediate representations are often full of weird-looking, but
correct, pointer arithmetic.
>
> For example, in an instruction like this:
>
> %p = getelementptr [4 x i8]* @A, i32 0, i32 %huge
>
> if %huge is beyond the size of @A, %p could potentially point into
> some object other than @A. This rule says that it's not valid to use
> %p to access that other object. %p would only be usable for memory
> accesses when it points within @A.
>
> C and C-like front-ends already obey this rule, since in C pointer
> arithmetic results must remain within an allocated object (or one
> past the end), which is more strict.
Lots of C code passes pointers around which might point to the middle of
an array, say for searching. That means that negative indices could
still remain within an allocated object.
>
> Front-ends that do crazy things to pointers may need to use
> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
> target-independent properties of getelementptr are needed, the
> "getelementptr null" trick may be useful.
Who says pointer arithmetic is crazy?
I create llvm code from an IR that has 2 types of pointers,
heap-pointers and non-heap pointers, and is safe for GC, but prevents
alias analysis.
So, I don't use any alias analysis stuff when doing optimisation, no big
deal.
Suddenly, my correct and working code will become "undefined" :(
>
> With this rule, BasicAliasAnalysis and similar analyses that depend
> on being able to track a pointer value up to its base value will be
> able to safely climb through getelementptr definitions without
> needing any further guarantees.
So all this breakage is for a few optimisation passes, that I don't even
use?
>
> That means that this rule will eliminate the most severe need for
> having getelementptr overflow produce undefined results. This will
> make it practical to have an undefined-on-overflow flag for
> getelementptr that can be meaningfully set to either true or false.
>
> A non-overflowing option for getelementptr would be useful to allow
> SCEVExpander and other passes to convert code like this:
> %a = mul %x, 4
> %b = add %y, %a
> %c = getelementptr [4 x i8]* @Object, 0, %b
>
> into this:
> %c = getelementptr [4 x i8]* @Object, %x, %y
>
> which wouldn't otherwise be safe because (getelementptr @Object, %x)
> by itself might overflow. (From a low-level perspective this isn't
> much of an optimization, but from a high-level perspective it's
> easier for non-SCEV-based passes to analyze.)
>
> Comments and suggestions are welcome,
>
Please don't do this!
My suggestion is this:
Add a "strict-GEP", which does what you suggest. This would allow
front-ends to tell the back-end about what things cannot be aliased,
and not cause any breakages for code that uses the "old" GEP.
Cheers,
Mark.
The thing is, it isn't really a big change; it's essentially what we
already implement.
> I notice no mention of bitcasts on pointers here.
>
> Intermediate representations are often full of weird-looking, but
> correct, pointer arithmetic.
A bitcast of a pointer is still the same pointer for the purposes of
this section.
> Lots of C code passes pointers around which might point to the middle of
> an array, say for searching. That means that negative indices could
> still remain within an allocated object.
Umm, right... GEP should be explicitly documented to take signed
indices. Anyway, this isn't actually saying anything about the given
example, because that's still the same object (the array).
>> With this rule, BasicAliasAnalysis and similar analyses that depend
>> on being able to track a pointer value up to its base value will be
>> able to safely climb through getelementptr definitions without
>> needing any further guarantees.
> So all this breakage is for a few optimisation passes, that I don't even
> use?
If your code is such that BasicAA isn't correct, it's already well
into undefined territory; you're basically redefining your own IR
semantics.
> Add a "strict-GEP", which does what you suggest. This would allow
> front-ends to tell the back-end about what things cannot be aliased,
> and not cause any breakages for code that uses the "old" GEP.
This isn't actually proposing any immediate code changes except adding
a new flag to GEP for strict overflow semantics.
-Eli
Hmm, yeah, I was considering that too. We quickly run into the issue
of "what sort of integer overflow", though.
-Eli
> Hi Dan,
>
> What you are proposing is a major change in the semantics of llvm.
>
> You are taking certain uses of an instruction that have well defined
> behaviour and undefining them.
>
> Have you made any estimate of how many people's code this may or
> may not break?
>
> I think this is a *very* bad idea.
Hi Mark,
I estimate this will break very little. I apologize for being unclear:
in practical terms, this proposal isn't a big change. It's a conceptual
framework for describing assumptions that the optimizer is already
making in many cases. It will help us resolve some cases where the
optimizer's assumptions are contradictory. And, it sets the stage for a
new flavor of getelementptr which will be more permissive than the
current getelementptr.
This was an oversight. Bitcasts are simple:
- The result value of a bitcast is associated with all addresses
associated with the operand of the bitcast.
>> For example, in an instruction like this:
>>
>> %p = getelementptr [4 x i8]* @A, i32 0, i32 %huge
>>
>> if %huge is beyond the size of @A, %p could potentially point into
>> some object other than @A. This rule says that it's not valid to use
>> %p to access that other object. %p would only be usable for memory
>> accesses when it points within @A.
>>
>> C and C-like front-ends already obey this rule, since in C pointer
>> arithmetic results must remain within an allocated object (or one
>> past the end), which is more strict.
> Lots of C code passes pointers around which might point to the
> middle of
> an array, say for searching. That means that negative indices could
> still remain within an allocated object.
This proposal supports negative GEP indices. In fact, negative GEP
indices are one of the corner cases that the optimizer today handles
incorrectly, though only in obscure ways that usually don't break real
code. This proposal will actually let LLVM support them in a more
robust manner.
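For instance (an illustrative sketch, with an invented global @A), a
pointer into the middle of an object can be indexed backwards and stay
associated with that object:

@A = global [16 x i32] zeroinitializer

%mid = getelementptr [16 x i32]* @A, i64 0, i64 8
%prev = getelementptr i32* %mid, i64 -1    ; still within @A, so defined
%v = load i32* %prev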
>
>>
>> Front-ends that do crazy things to pointers may need to use
>> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
>> target-independent properties of getelementptr are needed, the
>> "getelementptr null" trick may be useful.
>
> Who says pointer arithmetic is crazy?
> I create llvm code from an IR that has 2 types of pointers,
> heap-pointers and non-heap pointers, and is safe for GC, but prevents
> alias analysis.
> So, I don't use any alias analysis stuff when doing optimisation, no
> big deal.
> Suddenly, my correct and working code will become "undefined" :(
I don't see anything problematic in your description here. Alias
analysis may not be especially profitable in these circumstances, but
I wouldn't guess that it would be unsafe.
Dan
As far as I can tell, this only affects uses of GEP that were
undefined under the "Note that it is undefined to access an array out
of bounds: array and pointer indexes must always be within the defined
bounds of the array type." rule. (Removed by Dan in
http://llvm.org/viewvc/llvm-project?view=rev&revision=76495) What are
the cases that the new behavior makes undefined that the old behavior
didn't? I think this is just a loosening of the rules.
Dan Gohman wrote:
> On Jul 23, 2009, at 1:59 AM, Mark Shannon wrote:
>
>> Hi Dan,
>>
>> What you are proposing is a major change in the semantics of llvm.
>>
>> You are taking certain uses of an instruction that have well defined
>> behaviour and undefining them.
>>
>> Have you made any estimate of how many people's code this may or
>> may not break?
>>
>> I think this is a *very* bad idea.
>
> Hi Mark,
>
> I estimate this will break very little. I apologize for being unclear;
No apology needed
> In practical terms, this proposal isn't a big change. It's a conceptual
> framework for describing assumptions that the optimizer is already
> making in many cases. It will help us resolve some cases where the
> optimizer's assumptions are contradictory. And, it sets the stage for a
I'm curious what these assumptions are, but it's not important.
This makes a big difference :)
>
>>> For example, in an instruction like this:
>>>
>>> %p = getelementptr [4 x i8]* @A, i32 0, i32 %huge
>>>
>>> if %huge is beyond the size of @A, %p could potentially point into
>>> some object other than @A. This rule says that it's not valid to use
>>> %p to access that other object. %p would only be usable for memory
>>> accesses when it points within @A.
>>>
>>> C and C-like front-ends already obey this rule, since in C pointer
>>> arithmetic results must remain within an allocated object (or one
>>> past the end), which is more strict.
>> Lots of C code passes pointers around which might point to the
>> middle of
>> an array, say for searching. That means that negative indices could
>> still remain within an allocated object.
>
> This proposal supports negative GEP indices. In fact, negative GEP
> indices are one of the corner cases that the optimizer today handles
> incorrectly, though only in obscure ways that usually don't break real
> code. This proposal will actually let LLVM support them in a more
> robust manner.
It doesn't appear to have broken mine. My compiler could potentially use
negative indices in GEPs.
>
>>> Front-ends that do crazy things to pointers may need to use
>>> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
>>> target-independent properties of getelementptr are needed, the
>>> "getelementptr null" trick may be useful.
>> Who says pointer arithmetic is crazy?
>> I create llvm code from an IR that has 2 types of pointers,
>> heap-pointers and non-heap pointers, and is safe for GC, but prevents
>> alias analysis.
>> So, I don't use any alias analysis stuff when doing optimisation, no
>> big deal.
>> Suddenly, my correct and working code will become "undefined" :(
>
> I don't see anything problematic in your description here. Alias
> analysis may not be especially profitable in these circumstances, but
> I wouldn't guess that it would be unsafe.
Any tests I can do to find out if it's safe?
One last question.
If I were to turn off all optimisations, would that ensure that the
final machine code semantics with the new GEP would be unchanged from
the present GEP?
In other words, will any of this affect the code-generator?
Cheers,
Mark.
My first email aimed to make them explicit. Here's another example:
@g = global i32
%a = alloca i32
%b = malloc i32
%c = inttoptr 777770 to i32*
%aa = getelementptr i32* %a, i64 %i
%bb = getelementptr i32* %b, i64 %j
%cc = getelementptr i32* %c, i64 %k
%gg = getelementptr i32* @g, i64 %l
store i32 0, i32* %aa
store i32 1, i32* %bb
store i32 2, i32* %cc
store i32 3, i32* %gg
Here, %i, %j, %k, and %l are totally arbitrary values that the optimizer
knows nothing about. The optimizer will assume that all four stores are
to different memory locations. In theory, some truly crazy code could
figure out what values to use for %i, %j, %k, and/or %l to cause the
getelementptrs to overindex recklessly through the address space and
break this assumption. This is the kind of thing that the optimizer
assumes doesn't happen.
>> I don't see anything problematic in your description here. Alias
>> analysis may not be especially profitable in these circumstances,
>> but I wouldn't guess that it would be unsafe.
>
> Any tests I can do to find out if it's safe?
I don't know of any. You can run GVN and LICM and so on to see if
they break anything, but of course that wouldn't prove that there
aren't any problems.
> One last question.
> If I were to turn off all optimisations, would that ensure that the
> final machine code semantics with the new GEP would be unchanged from
> the present GEP?
> In other words, will any of this affect the code-generator?
It probably will at some point, to use alias-analysis info for
scheduling and other things. After all, CodeGen is just a highly
specialized optimizer ;-).
Dan
In my case ExecutionEngine::create() loads 40 modules, then each time I try to resolve a symbol that I know is in a DLL that I supply, it looks through all 40 modules first. This is on Windows, so I get the following modules loaded:
ntdll.dll, kernel32.dll, USER32.dll, GDI32.dll, SHELL32.dll, ADVAPI32.dll, RPCRT4.dll,
Secur32.dll, msvcrt.dll, SHLWAPI.dll, ole32.dll, OLEAUT32.dll, WS2_32.dll, WS2HELP.dll,
PSAPI.DLL, VERSION.dll, MSACM32.dll, WINMM.dll, WININET.dll, Normaliz.dll, urlmon.dll,
iertutil.dll, IPHLPAPI.DLL, IMM32.dll, HID.DLL, SETUPAPI.dll, OPENGL32.dll, GLU32.dll,
DDRAW.dll, DCIMAN32.dll, dbghelp.dll, LPK.DLL, USP10.dll, comctl32.dll, comctl32.dll,
rsaenh.dll, uxtheme.dll, MSCTF.dll, USERENV.dll, as well as the exe.
It wouldn't be so bad if it looked through the modules in reverse
order, so mine were searched first.
So, as an alternative, can the order be changed, or can it be set up
so that the user must explicitly enumerate and add modules?
Rob.
If the current behavior is to enumerate all those modules, however,
adding a flag to the constructor doesn't seem so bad -- especially now
that we have the builder. I'll defer to Evan though.
- Daniel
> Hello,
>
> I'm working on refining the definition of getelementptr (GEP) to
> clarify its semantics and to allow front-ends to provide additional
> information to optimization passes.
It would be quite valuable to have cleaner semantics for GEPs, both
for LLVM in general and for the memory safety work we have been doing
in the SAFECode project. For example, Andrew's optimization to
convert integer address arithmetic to equivalent GEPs was recently
disabled because GEPs don't have well-defined overflow semantics, but
Andrew's optimization is very important for SAFECode.
At the same time, I don't understand how you plan to handle a couple
of cases (plus some comments):
>
> To help support this, I'd like to propose the following rule:
>
> Any memory access must be done through a pointer value associated
> with the address range of the memory access; otherwise the behavior
> is undefined.
>
> "associated with" is defined as follows:
>
> - A pointer value formed from a getelementptr instruction is
> associated with the addresses associated with the first operand of
> the getelementptr.
[I think this correctly handles an illegal, though commonly used,
special case: when a pointer walks off the end of an object but walks
back before being dereferenced. This should work.]
> - The address of a global variable is associated with the address
> range of the variable's storage.
> - The result value of an allocation instruction is associated with
> the address range of the allocated storage.
> - A null pointer is associated with no addresses.
Null in C is always implemented as 0, and address 0 is a perfectly
valid address in kernel code. What happens there?
> - A pointer value formed by a ptrtoint is associated with all address
> ranges of all pointer values that contribute (directly or
> indirectly) to the computation of the pointer's value.
This seems technically well defined (after rephrasing in terms of
inttoptr, as you said), but can be very expensive for a pointer
analysis to track. I'm not sure any of our current analyses actually
track this in any sense, except DSA which only does this as an option
that hasn't been used for a while and has likely bit-rotted. This
means the "ptrtoint+arithmetic+inttoptr" case, though legal, should
lead to very poor aliasing results. I guess you could argue that
current LLVM passes don't do any better?
> - An integer value other than zero may be associated with address
> ranges allocated through mechanisms other than those provided by
> LLVM; such ranges shall not overlap with any ranges of addresses
> allocated by mechanisms provided by LLVM.
>
> For example, in an instruction like this:
>
> %p = getelementptr [4 x i8]* @A, i32 0, i32 %huge
>
> if %huge is beyond the size of @A, %p could potentially point into
> some object other than @A. This rule says that it's not valid to use
> %p to access that other object. %p would only be usable for memory
> accesses when it points within @A.
>
> C and C-like front-ends already obey this rule, since in C pointer
> arithmetic results must remain within an allocated object (or one
> past the end), which is more strict.
>
> Front-ends that do crazy things to pointers may need to use
> ptrtoint+arithmetic+inttoptr instead of getelementptr. If the
> target-independent properties of getelementptr are needed, the
> "getelementptr null" trick may be useful.
>
> With this rule, BasicAliasAnalysis and similar analyses that depend
> on being able to track a pointer value up to its base value will be
> able to safely climb through getelementptr definitions without
> needing any further guarantees.
>
> That means that this rule will eliminate the most severe need for
> having getelementptr overflow produce undefined results. This will
> make it practical to have an undefined-on-overflow flag for
> getelementptr that can be meaningfully set to either true or false.
>
> A non-overflowing option for getelementptr would be useful to allow
> SCEVExpander and other passes to convert code like this:
> %a = mul %x, 4
> %b = add %y, %a
> %c = getelementptr [4 x i8]* @Object, 0, %b
>
> into this:
> %c = getelementptr [4 x i8]* @Object, %x, %y
>
> which wouldn't otherwise be safe because (getelementptr @Object, %x)
> by itself might overflow. (From a low-level perspective this isn't
> much of an optimization, but from a high-level perspective it's
> easier for non-SCEV-based passes to analyze.)
>
> Comments and suggestions are welcome,
>
> Dan
>
--Vikram
Associate Professor, Computer Science
University of Illinois at Urbana-Champaign
http://llvm.org/~vadve
It doesn't work; this documentation change is just clarifying that.
There are various ugly ways to deal with this at the moment, like
inline asm, but I'm not sure what the right thing to do here is...
>> - A pointer value formed by a ptrtoint is associated with all address
>> ranges of all pointer values that contribute (directly or
>> indirectly) to the computation of the pointer's value.
>
> This seems technically well defined (after rephrasing in terms of
> inttoptr, as you said), but can be very expensive for a pointer
> analysis to track. I'm not sure any of our current analyses actually
> track this in any sense, except DSA which only does this as an option
> that hasn't been used for a while and has likely bit-rotted. This
> means the "ptrtoint+arithmetic+inttoptr" case, though legal, should
> lead to very poor aliasing results. I guess you could argue that
> current LLVM passes don't do any better?
On one end, this is only slightly more general than what BasicAA
already assumes, which is roughly "if a pointer doesn't escape, it
doesn't alias any other pointer." And on the other end, we don't want
to break legal C code like
http://llvm.org/bugs/show_bug.cgi?id=2831#c11 . I don't think there's
really much choice involved in this definition.
-Eli
> On Jul 22, 2009, at 1:30 PM, Dan Gohman wrote:
>
>> Hello,
>>
>> I'm working on refining the definition of getelementptr (GEP) to
>> clarify its semantics and to allow front-ends to provide additional
>> information to optimization passes.
>
> It would be quite valuable to have cleaner semantics for GEPs, both
> for LLVM in general and for the memory safety work we have been doing
> in the SAFECode project. For example, Andrew's optimization to
> convert integer address arithmetic to equivalent GEPs was recently
> disabled because GEPs don't have well-defined overflow semantics but
> Andrew's optimization is very important for SAFECode.
Ok. With this new rule, getelementptr semantics will hopefully be
quite well specified. It won't be possible to translate arbitrary
adds into getelementptrs, but you can do this kind of thing if you
can prove that one operand of an add is an address and that the
other is a well-behaved offset.
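A hedged sketch of the sort of rewrite that remains possible when the
analysis can prove one add operand is an object's address and the
other is an in-bounds offset (the global @Buf and the bound on %i are
assumptions for illustration):

; integer form
%base = ptrtoint [64 x i8]* @Buf to i64
%addr = add i64 %base, %i          ; provably 0 <= %i < 64
%p = inttoptr i64 %addr to i8*

; equivalent getelementptr form, now visible to alias analysis
%p2 = getelementptr [64 x i8]* @Buf, i64 0, i64 %i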
>
> At the same time, I don't understand how you plan to handle a couple
> of cases (plus some comments):
>
>> To help support this, I'd like to propose the following rule:
>>
>> Any memory access must be done through a pointer value associated
>> with the address range of the memory access; otherwise the behavior
>> is undefined.
>>
>> "associated with" is defined as follows:
>>
>> - A pointer value formed from a getelementptr instruction is
>> associated with the addresses associated with the first operand of
>> the getelementptr.
>
> [I think this correctly handles an illegal, though commonly used,
> special case: when a pointer walks off the end of an object but walks
> back before being dereferenced. This should work.]
Yes, while it's not permitted in C, it's useful in LLVM IR.
This special case is specifically intended to be supported.
Some LLVM optimizations rely on it.
>
>> - The address of a global variable is associated with the address
>> range of the variable's storage.
>> - The result value of an allocation instruction is associated with
>> the address range of the allocated storage.
>> - A null pointer is associated with no addresses.
>
> Null in C is always implemented as 0, and address 0 is a perfectly
> valid address in kernel code. What happens there?
Handling address 0 as a special case here is just documenting
existing practice within LLVM. For example, instcombine
followed by simplifycfg turns "store i32 0, i32* null" into
"unreachable". I'm going to leave this rule in place for now
to cover this, however there's certainly room for improvement
in this area.
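Sketched as a before/after (the surrounding function is omitted):

; before
store i32 0, i32* null

; after instcombine + simplifycfg
unreachable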
>
>> - A pointer value formed by a ptrtoint is associated with all address
>> ranges of all pointer values that contribute (directly or
>> indirectly) to the computation of the pointer's value.
>
> This seems technically well defined (after rephrasing in terms of
> inttoptr, as you said), but can be very expensive for a pointer
> analysis to track. I'm not sure any of our current analyses actually
> track this in any sense, except DSA which only does this as an option
> that hasn't been used for a while and has likely bit-rotted. This
> means the "ptrtoint+arithmetic+inttoptr" case, though legal, should
> lead to very poor aliasing results. I guess you could argue that
> current LLVM passes don't do any better?
Yes; this is existing practice in LLVM. inttoptr provides
essential functionality, but optimizers don't attempt
expensive heroics to understand what it's doing.
I don't know how to constrain inttoptr in a way that would
meaningfully aid pointer analysis without either making it
over-complicated or breaking some extant idiom. I'm open
to suggestions though.