Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Placement of memory loads

0 views
Skip to first unread message

YuGr

unread,
Nov 5, 2009, 1:42:32 AM11/5/09
to
Hi,

I want to figure out the best way to place initial memory loads in generated
code.

My current approach is very simple and inefficient. I do a liveness analysis
and then insert memory loads for all undefined variables in the beginning of
initial basic block. Here is an example (both x and y are global):
int main() {
x = x + 1;
//A lot of code
y = y + 1;
}
I see that both x and y belong to initial block's Live_in and insert loads:
int main() {
load x;
load y;
x = x + 1;
//A lot of code
y = y + 1;
}

The problem is that this simple approach leads to artificial conflicts (e.g. x
and y are in conflict) and high register pressure. I think I would better do
with something like
int main() {
load x;
x = x + 1;
//A lot of code
load y;
y = y + 1;
}
i.e. put load immediatelly before first use (note that x and y are not
conflicting any more). But how should I approach non-linear situations e.g.
int main() {
if( ... ) {
//A lot of code
x = x + 1;
} else {
//A lot of code
x = x + 2;
}
}
Should I put load in every branch (this will lead to code bloat) or before
switch (this will lead to increased register pressure)?

Another problem is loops: surely I do not want to insert load inside loop body
because it will lead to poor performance.

So
1) where and how do you usually place loads in generated code?
2) how do you approach conditional operators, loops, etc.?
3) can you recommend some paper which covers this problem in detail? I haven't
found any mentions of it in my textbooks (Aho, Cooper).

--
Best regards,
Yuri

BGB / cr88192

unread,
Nov 5, 2009, 5:13:48 PM11/5/09
to
"YuGr" <tetr...@googlemail.com> wrote in message

> I want to figure out the best way to place initial memory loads in
> generated code.

> My current approach is very simple and inefficient. I do a liveness
> analysis and then insert memory loads for all undefined variables in
> the beginning of initial basic block.

<snip>


> The problem is that this simple approach leads to artificial
> conflicts (e.g. x and y are in conflict) and high register pressure.

<snip>


> 1) where and how do you usually place loads in generated code?
> 2) how do you approach conditional operators, loops, etc.?
> 3) can you recommend some paper which covers this problem in detail? I
> haven't found any mentions of it in my textbooks (Aho, Cooper).

my strategy is simple enough and seems to be fairly effective:
I start in some initial state (locals are on stack, args are maybe on stack
or in regs);
if I need to synchronize, I may force everything back into memory;
if a need to load is encountered, a variable is loaded, and subsequently
assumed to reside in a given register;
any writes to a variable in a register mark it as dirty;
if I really need a register, something may be picked and "flushed", which
will either discard it (if not dirty), or write it back to memory (if
dirty);
I often synchronize things at labels and prior to function calls.

typically, most aspects of control flow are ignored, and internally the
compiler pretends as if flow proceeds linearly through the function.

in general, the compiler concerns itself almost entirely with the "present
state", and generally ignores upcomming instructions (its behavior is then,
almost purely responsive to particular "code events"). however, I use a
multi-pass strategy, and in a few places "magic bits" are used to flag
places where a bad decision had been previously made, allowing some small
amount of fine tuning to avoid places where "something bad happened".

(this reminding me of the ST: TNG episode where they were stuck in a time
loop and messages were being sent to a "past" Data to fine-tune certain
decisions and avert destruction).

sadly, all this relies on my compiler being deterministic, since
non-determinism would throw the whole process into chaos. multiple passes
are used, and help resolve certain issues (such as register allocation), but
each has to remain essentially lock-step with the others (and simply
dummy-out parts which are not relevant in a given pass, such as generating
output, ...).


so, alas, it is all fairly naive, but seems to work well enough most
of the time. granted, my aim is not usually for max possible
performance, mostly just to reduce generating lame code.


it is worth noting that my current compiler does not use SSA, mostly
because thus far I have not been able to successfully bridge the gaps
which exist between my compiler and SSA form (nor can I personally all
that easily imagine how SSA works, which I guess does not much help
matters, ...).

personally, I find a "behavioral" model of all of the parts of the compiler
process easier to imagine, and in general it seems to work fairly well
(different parts have certain "behaviors", and act in response to a stream
of "events" representing the input code, which in my case is an RPN-based
IL).

it is a whole lot easier to figure out "well, that screwed me over, don't do
that next time" than it is to figure out "how will this decision effect
things in the future?...".

Kaz Kylheku

unread,
Nov 5, 2009, 7:55:36 PM11/5/09
to
On 2009-11-05, YuGr <tetr...@googlemail.com> wrote:
> Hi,
>
> I want to figure out the best way to place initial memory loads in generated
> code.
>
> My current approach is very simple and inefficient. I do a liveness analysis
> and then insert memory loads for all undefined variables in the beginning of
> initial basic block. Here is an example (both x and y are global):
> int main() {
> x = x + 1;
> //A lot of code
> y = y + 1;
> }
> I see that both x and y belong to initial block's Live_in and insert loads:
> int main() {
> load x;
> load y;
> x = x + 1;
> //A lot of code
> y = y + 1;
> }

Strange, old-schoolish approach to compiling. Loading from memory into
registers is a low level concept, variables high level.

Why not single-store-assignment or something?

How about generating a new temporary for each intermediate expression,
including loads. Then assign registers to temporaries.

x = x + 1;

y = y + x;

becomes something like:

t1 := x
t2 := t1 + 1 /* x understood to live in t2 now */
x := t2
t3 := y
t4 := t3 + t2
y := t4

Now you have a lot of freedom in moving around the loads and stores.
Of course, you can't move an instruction in front of one which produces
an operand which that instruction needs. But this rearrangement is valid:

t3 := y /* moved way up */
t1 := x
t2 := t1 + 1
t4 := t3 + t2
y := t4
x := t2 /* moved way down */

Of course, if accessing x and y is considered a visible side effect because
they are some kind of specially qualified object in the source language, like
``volatile'', the above moves would not be correct; the accesses must remain in
the order: load x, store x, load y, store y. For that you can have some fence
instruction.

fence
t1 := x
t2 := t1 + 1 /* x understood to live in t2 now */
x := t2
fence /* separates full expressions */
t3 := y
t4 := t3 + t2
y := t4
fence

No load or store can be moved across a fence if the variable is qualified
as volatile.

0 new messages