PAPER 10/30: Bypass Aware instruction scheduling for register file power reduction


Guofeng

Oct 26, 2007, 1:25:32 PM
to ASU:CSE520 FALL 07 Advanced Computer Architecture
@article{1134675,
author = {Sanghyun Park and Aviral Shrivastava and Nikil Dutt and
Alex Nicolau and Yunheung Paek and Eugene Earlie},
title = {Bypass aware instruction scheduling for register file power
reduction},
journal = {SIGPLAN Not.},
volume = {41},
number = {7},
year = {2006},
issn = {0362-1340},
pages = {173--181},
doi = {http://doi.acm.org/10.1145/1159974.1134675},
publisher = {ACM},
address = {New York, NY, USA},
}

Sai Raghunath T

Nov 1, 2007, 3:54:44 AM
to asucse520-fall-07-advanc...@googlegroups.com
Hi,
 
Please find attached my critique of the paper, "Bypass Aware instruction scheduling for register file power reduction".
 
Regards
Sai Raghunath T

 
Critic_comp arch-II.doc

Ayan Banerjee

Nov 4, 2007, 3:14:21 PM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Sai,

By "technique of scheduling the dependent instructions nearer to each
other", do you mean nearer in the pipeline? And can you explain what
you mean by "the power consumption is minimal when compared to
complete bypassing technique"?
Also, does this paper really suggest reducing the number of register files?
I think it just states that as an option but does not use it. It
just fetches the operand values from the bypass if register values are
not going to be used. So it should not be considered a weakness.
Waiting for your reply.
Regards
Ayan

Sai Raghunath T

Nov 4, 2007, 4:02:25 PM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Ayan,
 
The data from the bypass register will be used in subsequent nearby instructions (maybe 3-5 instructions). Else, subsequent instructions will have to access the register file for the data. This is the case for partial bypassing. With full bypassing, the paper says that most of the data will be in the bypass registers without being written to the register file immediately. In that case, the bypass registers should be designed robustly. That is, each bypass register should be sized so that it can send its data to as many instructions as possible. Also, the data in the bypass registers should remain available for a long time. This issue is not considered in the paper.
 
Reducing the number of registers is dependent on the number of bypass registers and also their size and driving strength. The paper mentions reducing the number of register files as a method of reducing RF power consumption. As I said in my critique, if an application computes a lot of temporary data, then that data should be stored in the registers (or bypass registers, if available). But when there are fewer registers, temporary data is stored in the register file and then modified again very frequently. So, every time the RF is accessed, the dynamic power consumption of the RF increases.
 
Now there are two cases:
1. Reduce the number of RFs and access them frequently, as and when required.
2. Increase the number of bypass registers and access the RF less frequently.
The tradeoff is between the number of bypass registers and their power consumption on one side, and the number of RFs and the power consumed in accessing them on the other.
 
The paper says that reducing the registers is acceptable (which I criticised in my critique), and they are also exploiting the second option.
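
To make the second option concrete, here is a rough Python sketch (not the paper's algorithm) that counts register file reads for a given instruction order. It assumes a hypothetical bypass model in which a source operand comes from the bypass whenever its producer was issued at most `window` instructions earlier; the `Instr` tuple layout, the function name `count_rf_reads`, the register names, and the window value are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction format: (destination register, source registers).
Instr = Tuple[str, Tuple[str, ...]]

def count_rf_reads(schedule: List[Instr], window: int) -> int:
    """Count source operands that must be read from the register file.

    Assumption: an operand is served by the bypass if its producer was
    issued at most `window` instructions before the consumer.
    """
    last_def: Dict[str, int] = {}  # register -> index of its latest producer
    rf_reads = 0
    for i, (dest, srcs) in enumerate(schedule):
        for src in srcs:
            producer = last_def.get(src)
            if producer is None or i - producer > window:
                rf_reads += 1  # value is only available in the RF
        last_def[dest] = i
    return rf_reads

# Same four instructions; only r4 depends on an earlier result (r1).
far = [("r1", ("r8", "r9")), ("r2", ("r10", "r11")),
       ("r3", ("r12", "r13")), ("r4", ("r1", "r14"))]
near = [("r1", ("r8", "r9")), ("r4", ("r1", "r14")),
        ("r2", ("r10", "r11")), ("r3", ("r12", "r13"))]

print(count_rf_reads(far, window=1))   # 8
print(count_rf_reads(near, window=1))  # 7
```

Moving the dependent instruction next to its producer saves one RF read in this toy case, which is the kind of saving a bypass-aware schedule is after.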
 
Regards
Sai Raghunath T

 

Sushma Myneni

Nov 4, 2007, 5:39:59 PM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Sai,

I disagree with the statement below:
 ---> "The data from the bypass register will be used in subsequent nearby instructions (maybe 3-5 instructions). Else, subsequent instructions will have to access the register file for the data. This is the case for partial bypassing."
The paper mentions that in a completely bypassed architecture, the result is available to every source operand in cycle 'l' if (l1 << l << l2), where l1 = the number of cycles after issuing the instruction at which the result is computed, and l2 = the number of cycles after issuing the instruction at which the result is written into the register.
The reference paper [19] mentions that for partial bypassing, which register values are bypassed is determined by simulation results and the designer's intuition. What you described as partial bypassing is actually the definition of complete bypassing.
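
As a small illustration of that window, here is a Python sketch of where a consumer would get its operand under complete bypassing. The function name `operand_source` and the parameter `read_cycle` are hypothetical, and treating both ends of the window as inclusive is an assumption, since the condition above is written as (l1 << l << l2).

```python
def operand_source(read_cycle: int, l1: int, l2: int) -> str:
    """Classify an operand read made `read_cycle` cycles after the producer issued.

    l1: cycles after issue at which the producer's result is computed.
    l2: cycles after issue at which the result is written to the register file.
    Assumption: the bypass window is inclusive at both ends.
    """
    if read_cycle < l1:
        return "stall"          # result has not been computed yet
    if read_cycle <= l2:
        return "bypass"         # result is still in the pipeline registers
    return "register_file"      # result has already been written back

# Example with l1 = 2 and l2 = 4 (made-up latencies).
for cycle in range(1, 7):
    print(cycle, operand_source(cycle, l1=2, l2=4))
```

Only reads that fall after the window reach the register file, so a schedule that keeps consumers inside the window avoids those accesses.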

I also disagree with:
--> "Reducing the number of registers is dependent on the number of bypass registers and also their size and driving strength".
Temporary data cannot be stored using bypass techniques. The bypass technique passes the register values to all the source operands in the next stage of the pipeline.


Thank you,
Sushma

Sai Raghunath T

Nov 4, 2007, 9:14:25 PM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Sushma,
 
Thanks a lot for the feedback.
 
I am actually seeing the issue from the hardware design point of view, so I think my statements may have confused you.
 
In both comments, when I mentioned "size and driving strength", I did not mean the number of data values that can be stored, but the size of each transistor (or latch) in the register holding the data.
 
As you said, the result is available to every source operand in cycle 'l' if (l1 << l << l2), where l1 = the number of cycles after issuing the instruction at which the result is computed, and l2 = the number of cycles after issuing the instruction at which the result is written into the register.
 
So, what I meant was that the pipeline register actually used for bypassing in the complete bypassing technique should be adequate in size and driving capability. (Here, driving capability means how many instructions the pipeline register can send the data to correctly, without loss in logic.)
 
Hope this supports what I mentioned in my critique and subsequent replies.
 
Regards
Sai Raghunath T

 

Saleel Kudchadker

Nov 5, 2007, 2:42:09 AM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Sai
 
The problem of driving strength and size doesn't come into the picture here. You are bringing VLSI into this. :) They just discuss a way of accessing the register file less.
 
regds
saleel

 
TA, Del E. Webb School of Construction
Arizona State University
Tempe, AZ, 85281

Ayan Banerjee

Nov 6, 2007, 2:11:07 PM
to asucse520-fall-07-advanc...@googlegroups.com
Here is the summary.
Summary.pdf