PAPER 10/30: Reducing Register File Power by Exploiting Value Lifetime Characteristics


Guofeng

Oct 26, 2007, 1:28:10 PM
to ASU:CSE520 FALL 07 Advanced Computer Architecture
@misc{ hu-reducing,
author = "Z. Hu and M. Martonosi",
title = "Reducing register file power consumption by exploiting value lifetime",
text = "Z. Hu and M. Martonosi, Reducing register file power consumption by exploiting value lifetime, in",
url = "citeseer.ist.psu.edu/580963.html" }

Aarul Jain

Nov 1, 2007, 3:13:31 PM
to asucse520-fall-07-advanc...@googlegroups.com
Please find my critique of the paper attached.
 
Thanks
-Aarul

 
Arizona State University, Tempe, US
Ph: 480-278-9230
critic-Aarul Jain.pdf

Mike

Nov 1, 2007, 9:49:56 PM
to ASU:CSE520 FALL 07 Advanced Computer Architecture
This paper proposes adding a filtering structure (a VAB) in front of the
register file to reduce the power that increasingly large register
files consume in multiple-issue processors.


Strengths
The authors propose, test, and discuss the limitations of their VAB in a
reasonably concise manner.

The authors use standard benchmarks (SPECINT95), which adds to the
credibility of their results.

They define the model processor that they used in these tests very
well.

They used useful metrics in their evaluation.

Their approach seems feasible to implement in existing
architectures.


Limitations
The paper is written to address the increasing power usage of register
files in wider-issue processors. The move to multi-core
architectures largely makes this work obsolete until superscalar
multi-core architectures become common.

The introduction of this paper could use more supporting data
to better motivate the idea.

The authors do a poor job of giving concrete numbers for the power usage of a
typical register file and for the effect that increasing register file
size has on overall system power.

sairag...@gmail.com

Nov 2, 2007, 12:25:24 AM
to ASU:CSE520 FALL 07 Advanced Computer Architecture
Hi all,

I have a few doubts about this paper.

1. How is the VAB different from pipeline registers? Does a group of
pipeline registers with additional features, such as a valid bit, make a
VAB?
2. How is power saved when the VAB is accessed for data by a few
consecutive instructions? Why is the energy per access to the RF greater than
the energy per access to the VAB? Essentially, both are accessed in a similar
manner. The only thing I understand is that the access time for
the register file is longer than for the VAB.

Can someone please help clear up my doubts?

Regards
Sai Raghunath T

Pradnyesh Gudadhe

Nov 2, 2007, 12:41:11 AM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Raghu,
    By pipeline registers, do you mean the registers in the register file?
The paper says:
1) The VAB is smaller than the register file, so it has better power characteristics.
2) By using the VAB we reduce the number of read/write requests to the register file.
This in turn allows us to reduce the number of ports provided on the register file,
     which reduces the power consumption of the register file.
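
To make the per-access argument concrete, here is a minimal sketch of the energy accounting, not taken from the paper. The per-access energies (`e_vab`, `e_rf`) and the hit rate are made-up illustrative numbers, and the model is simplified: it only counts reads and ignores writes, evictions, and any extra cost of a VAB miss.

```python
def register_read_energy(reads, vab_hit_rate, e_vab, e_rf):
    """Total read energy when a fraction of reads is served by the VAB.

    All parameters are hypothetical; energies are in arbitrary units.
    """
    vab_reads = reads * vab_hit_rate   # reads satisfied by the small buffer
    rf_reads = reads - vab_reads       # remaining reads go to the register file
    return vab_reads * e_vab + rf_reads * e_rf


# Assumed numbers: a VAB access costs a quarter of a register file access,
# and half of all reads hit in the VAB.
baseline = register_read_energy(1000, 0.0, e_vab=0.25, e_rf=1.0)
with_vab = register_read_energy(1000, 0.5, e_vab=0.25, e_rf=1.0)
saving = 1 - with_vab / baseline
print(baseline, with_vab, saving)
```

Under these assumptions the saving grows with the hit rate and with the gap between the two per-access energies; whether a real design comes out ahead depends on the measured values.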

During my presentation, a question was raised about how overall power consumption
can be reduced even after adding an additional unit (the VAB) to the datapath. I had the same
question when I read the paper for the first time, so I have emailed both authors
about it. I am waiting for their replies and will forward their answers
to everyone in this group.

Thanks.

Regards,
-Pradnyesh


Sai Raghunath T

Nov 2, 2007, 12:49:32 AM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Pradnyesh,
 
Consider a 5-stage pipeline with pipeline registers after every stage. My question is whether the VAB is a cluster of ALU pipeline registers. That is, during the execution of every instruction, is the data from the ALU written into the VAB directly, without using a pipeline register? Does the paper mean that the pipeline register after the EX stage is not required if a VAB is used?
 
Also, please let us know after you get reply from the authors.
 
Regards
Sai Raghunath T


Pradnyesh Gudadhe

Nov 2, 2007, 1:07:44 AM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Raghu,
    The authors simulated a machine with 64 physical registers (the register file).
The optimal VAB size they found for that system is a 16-entry value aging buffer.
The idea behind the paper is to keep the most recently calculated values in a buffer that is cheap to access,
so that following instructions (which in most cases use values calculated by the preceding instructions)
can read those values directly from the VAB instead of from the power-hungry register file.
Since the VAB has only 16 entries and the register file has 64 physical registers, we cannot drop the register file from the
pipeline. Please refer to the attached diagram; it should give you some idea.
     Yes, the data from the ALU is written directly to the VAB. When the VAB is full and a newly calculated value from the ALU
has to be stored in it, the oldest entry in the VAB is written to the register file. This is called
eviction. It happens only when the VAB is full and a new entry is to be added.
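
The write and eviction behaviour described above can be sketched as a small simulation. This is only my reading of the mechanism, not the authors' implementation: the class name `ValueAgingBuffer`, its counters, and the choice to treat a rewrite of a buffered register as the youngest entry are all assumptions made for illustration.

```python
from collections import OrderedDict


class ValueAgingBuffer:
    """Toy model: a small FIFO buffer in front of a register file."""

    def __init__(self, capacity=16):
        self.capacity = capacity
        self.entries = OrderedDict()   # register -> value, oldest entry first
        self.register_file = {}        # backing store for evicted values
        self.vab_reads = 0
        self.rf_reads = 0
        self.evictions = 0

    def write(self, reg, value):
        """Store an ALU result in the buffer, evicting the oldest entry if full."""
        if reg in self.entries:
            del self.entries[reg]      # assumption: a rewrite becomes the youngest entry
        elif len(self.entries) >= self.capacity:
            old_reg, old_value = self.entries.popitem(last=False)
            self.register_file[old_reg] = old_value
            self.evictions += 1
        self.entries[reg] = value

    def read(self, reg):
        """Read from the buffer when possible, otherwise from the register file."""
        if reg in self.entries:
            self.vab_reads += 1
            return self.entries[reg]
        self.rf_reads += 1
        return self.register_file[reg]


# Tiny capacity so that the third write forces an eviction.
vab = ValueAgingBuffer(capacity=2)
vab.write("r1", 10)
vab.write("r2", 20)
vab.write("r3", 30)        # buffer is full, so r1 is evicted to the register file
recent = vab.read("r3")    # served by the buffer
older = vab.read("r1")     # served by the register file
print(recent, older, vab.vab_reads, vab.rf_reads, vab.evictions)
```

With a 16-entry buffer and short-lived values, most reads would land in the buffer, which is the effect the paper relies on.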
     I don't have an electronics background, but I think the basic circuit structure of a register in the register file and of an
entry in the VAB has to be the same. Please correct me if I'm wrong about this. I guess a buffer and a register are both
made up of flip-flops.
     I will certainly let you know once I get a reply from the authors of the paper.
Thanks.

Regards,
Pradnyesh


--
Please visit my UPDATED website here:
http://www.geocities.com/paddyinpilani
VAB in datapath.JPG

Sai Raghunath T

Nov 2, 2007, 1:24:40 AM
to asucse520-fall-07-advanc...@googlegroups.com
Hi Pradnyesh,
 
A buffer lets a process temporarily store input in memory until the process can deal with it. Buffers can also be used to restore a signal to a proper logic 1 or logic 0. So a buffer can be a pair of cross-coupled inverters (a latch or FF) or two appropriately sized inverters in series.
 
Registers hold data (which can also be modified along the way) until the process is over. Registers are generally a series of buffers that shift the data out of the register according to the clock signal.
 
I was wondering about the strength of the VAB, since it needs to supply data to about 2-3 instructions (or more, depending on the pipeline depth) at the same time if consecutive instructions require the same data from the VAB.
 
Regards
Sai Raghunath T

peyman...@gmail.com

Nov 2, 2007, 12:28:49 PM
to ASU:CSE520 FALL 07 Advanced Computer Architecture
I apologize for the delay in posting this critique:

The paper suggests placing a new element (the VAB), which resembles a cache
in many ways, in front of the register file. The authors claim that doing
so will reduce overall power consumption with a slightly
negative effect on performance. The values that need to be kept in
this holding tank, as well as the mechanism used to store
or remove values from it, are well defined, as are the lifetime
characteristics of values in the register file. The impact this new
element has on read, write, and free
operations is also described sufficiently.

One thing missing from the results is the share of power
consumed by the VAB itself. The authors only claim that overall power consumption
was reduced; however, if a high percentage of this power is consumed
by the VAB, which should be very small in size, this could result in a
major hot spot problem.
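
A back-of-the-envelope sketch of the hot spot concern: what matters thermally is power per unit area, not total power. The wattages and areas below are invented purely for illustration and do not come from the paper.

```python
def power_density(power_watts, area_mm2):
    """Power per unit area in W/mm^2; a rough proxy for hot spot risk."""
    return power_watts / area_mm2


# Hypothetical numbers: the VAB draws half the power of the register file
# but occupies only a quarter of its area.
rf_density = power_density(power_watts=2.0, area_mm2=4.0)
vab_density = power_density(power_watts=1.0, area_mm2=1.0)
print(rf_density, vab_density)
```

In this made-up case the VAB dissipates less total power yet has twice the power density, which is the situation the concern describes. Without the paper's breakdown of VAB versus register file power, it is not possible to say whether this actually happens.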

In the last section they suggest adding a location bit to the VAB to
prevent performance loss. If we are prepared to go as far as adding a
bit to decide whether to access the VAB or the register file, why stop there?
Why not use multiple bits to indicate which one of many, possibly
smaller, register files a register belongs to?

As a final note, the paper gives no
indication of the cost of implementing this VAB architecture.


rduc...@gmail.com

Nov 6, 2007, 2:47:02 AM
to ASU:CSE520 FALL 07 Advanced Computer Architecture
Summary of online discussion

* Paper overview:

Register files are major contributors to power consumption because of their
large size and many read/write ports. However, the short-lived nature
of many register values can be exploited by retrieving some
values from a VAB instead; this shifts heat away from the register file
and reduces the need for more read/write ports, which in turn
decreases overall power usage.

* Praise during online discussion:

Several critics complimented the paper on its use of concrete
benchmarks such as SPECINT95. The authors go into great detail to
describe their testing methodology, and there is a lot of confidence
that the results could be duplicated independently.

In addition, the paper provides power-saving statistics, which allow
an implementor to predict benefits and trade-offs. The paper
acknowledges performance decreases (extra cycles and extra bits are used),
and additional difficulties (such as misprediction) are openly
admitted.

* Criticism during online discussion:

The relevance of the article is questioned, citing the paper's
dependence on superscalar architectures. Simple architectures may not
benefit as much from a VAB.

Moreover, there were multiple questions (both in class and out) about
the usefulness of the paper's technique. For example, how is the
presence of a VAB different from having registers elsewhere on the
chip (e.g., pipeline registers)? If the register file were broken up
into smaller chunks and distributed throughout the processor, that
would reduce the size of any single chunk of the register file,
potentially providing a heat benefit similar to the VAB's. (The
paper's authors were contacted for further explanation, but no
response has been received.)

Another critic questioned whether the technique could be
self-defeating. If the VAB itself became a hot spot, the power benefit
could be offset. This would be especially unfortunate
considering the performance loss and implementation challenges
associated with a VAB.

Last, the paper lacks some important information. For example, the
implementation of the micro-architecture is not described. Application
debugging in a VAB chip is not discussed at all.

Sugan Vinayagam

Nov 6, 2007, 2:55:04 AM
to asucse520-fall-07-advanc...@googlegroups.com
My comment on the following point in the summary:

" If the register file was broken up
into smaller chunks and distributed throughout the processor, that
would reduce the size of any single chunk of the register file-
potentially providing a similar tangible heat benefit as the VAB. (The
paper's authors were contacted for further explanation, but no
response has been received.)"

The idea above of splitting the register file and distributing it throughout the processor may sound reasonable from the heat perspective, but it does not consider the performance loss (i.e., delay) that this technique would incur. The delay in accessing different registers would vary a lot, since they would be spread all over the chip. Moreover, the register file is usually kept close to the execution units to minimize the delay in accessing data from the registers.

thanks,
Sugan Vinayagam.

