Hi all, The CBP-1 source code framework seemed to provide much more information about architectural state than CBP-2. It gave details of all instructions, not just branches, and even allowed the branch predictor to peek at register values etc. Unless I am mistaken, these features are not present in CBP-2. Is this a clear-cut design decision? Please can you give some justification? Thanks very much, Jeremy
Yes, I decided to provide less information in the traces. The CBP-2 traces provide the following information:
- Branch address
- Branch target
- Conditional branch opcode
- Conditional branch outcome, i.e., taken or not taken
- Branch type, i.e., conditional, indirect, call, return
Most research in branch prediction focuses on these features or a subset of these features. For this contest, I am interested in seeing how far we can push this "standard model" of branch prediction. I don't think that any of the finalists of CBP-1 used anything beyond these features, and my understanding is that the data values provided in the larger traces went mostly unused by all of the contestants. -- Daniel Jiménez
Jeremy Singer wrote:
> Hi all,
> The CBP-1 source code framework seemed to provide much more information
> about architectural state than CBP-2. It gave details of all
> instructions, not just branches, and even allowed the branch predictor
> to peek at register values etc. Unless I am mistaken, these features
> are not present in CBP-2. Is this a clear-cut design decision? Please
> can you give some justification?
> Thanks very much,
> Jeremy
I would also find it useful to have other information in the traces, specifically for all instructions: opcodes, register names for sources and destinations, and the addresses accessed by memory instructions (I guess Jeremy wanted values as well).
BTW, I understand the point of limits based only on branch history information, and the time and effort needed to produce new traces, but maybe two requests will be more convincing :) In particular, I think that for the limits part of the contest it may be useful to allow other sources of information to be investigated. For example, previous work on dataflow prediction appeared promising (ISCA 2003 and HPCA 2003).
It might be possible to convince me that more information is a good idea, but at this point I'm unable to devote the time and resources to a change of this magnitude in the infrastructure.
However, let me be so bold as to disagree with you with respect to the limits part of the contest. I think it will be an important achievement to establish a credible lower limit on misprediction rates using just the information available to traditional branch predictors. If we allow other information to be considered, then the lower bound is not meaningful for those traditional predictors that form the bulk of branch prediction research. By the way, I'm a huge fan of both the papers you cite (Renju Thomas and Steve Dropsho are two of the sharpest guys I know who have worked on branch prediction), but they require a little more buy-in from the rest of the microarchitecture than I think the industry folks would be willing to provide at this point. Thanks. -- Daniel Jiménez
I agree that access to all data values is not everybody's concern and would leave you with a great deal of implementation work. But then, why not use the same framework as in the previous contest? Personally I would like to have access to the branch target. Can you provide this? I believe it is useful to many people. Hongliang also used this in his loop predictor. BTW, I believe it would be interesting to measure progress in branch prediction by comparing the winner of this edition to the winner of the previous edition, which is not possible now!
The branch target is provided as the third argument to the 'update' method in the 'branch_predictor' class, so your predictor will have access to this value when it is updated.
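As a rough illustration of how a predictor might pick up the target at update time, here is a minimal self-contained sketch. The `branch_info` and `branch_update` structs are simplified stand-ins written for this example, not the actual CBP-2 headers, and `last_target_predictor` is a hypothetical name. The only thing taken from the description above is that `update` receives the branch target as its third argument.

```cpp
#include <cstdint>
#include <unordered_map>

// Simplified stand-ins for the framework types (assumptions, not the real headers).
struct branch_info {
    std::uint32_t address = 0;  // address of the branch being predicted
};

struct branch_update {
    bool direction_prediction = false;
    std::uint32_t target_prediction = 0;
};

class branch_predictor {
public:
    virtual branch_update *predict(branch_info &b) = 0;
    // The third argument carries the actual branch target.
    virtual void update(branch_update *u, bool taken, std::uint32_t target) = 0;
    virtual ~branch_predictor() = default;
};

// Hypothetical predictor: remembers the last taken target seen per branch address.
// The unbounded map ignores any hardware budget; it is only for illustration.
class last_target_predictor : public branch_predictor {
    branch_update u_;
    std::uint32_t last_address_ = 0;
    std::unordered_map<std::uint32_t, std::uint32_t> targets_;

public:
    branch_update *predict(branch_info &b) override {
        last_address_ = b.address;  // remembered so update() knows which branch this was
        auto it = targets_.find(b.address);
        u_.direction_prediction = true;  // placeholder: always predict taken
        u_.target_prediction = (it != targets_.end()) ? it->second : 0;
        return &u_;
    }

    void update(branch_update *, bool taken, std::uint32_t target) override {
        // Record the target only when the branch was actually taken.
        if (taken) targets_[last_address_] = target;
    }
};
```

The sketch assumes each call to `predict` is followed by the matching `update` before the next prediction, which is why a single remembered address is enough.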
It's not clear that comparing this year's entries to the previous finalists would be valuable, since the hardware budgets are different. One could adapt some or all of the previous finalists to work with this infrastructure. -- Daniel Jiménez
I just wanted to chime in with my two cents. First, it is true that none of the finalists used the data values, and I recall Jared and Chris saying that only a single non-finalist entry actually used the values.
Regardless of whether the values are useful, how they might be used, etc., from the perspective of running the contest, it is my opinion (not speaking for Daniel or anyone else on the organizing committee) that it is quite difficult to get everyone to agree on a fair model for using the values. CBP-1 provides values after some delay, but even that didn't really feel "correct". At the end of the day, the ideal/limit branch predictor should have as much info available to it as possible, but *when* it is appropriate for each piece of information to be revealed to the predictor is wide open to debate (e.g., after a fixed delay, based on the timing of execution of the producing instructions in a real processor, or based on the underlying dataflow graph?). Given that the values were practically unused for CBP-1, my 1-bit predictor says it's not worth the effort to wrangle and argue over the exact rules for data values (and then modify the framework and collect all of the traces).
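To make the "after a fixed delay" option concrete, here is a small sketch of what such a rule could look like. It is not the CBP-1 mechanism. The class name `delayed_value_feed`, the use of instruction sequence numbers as the clock, and the choice of delay are all assumptions made for this example. It requires C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <deque>
#include <optional>

// Hypothetical model: a value produced by instruction number `seq` becomes
// visible to the predictor only `delay` instructions later.
// Assumes produce() is called with increasing seq and latest_visible()
// with non-decreasing now.
class delayed_value_feed {
    struct entry {
        std::uint64_t seq;
        std::uint32_t value;
    };
    std::deque<entry> pending_;             // values not yet revealed
    std::uint64_t delay_;                   // fixed reveal delay, in instructions
    std::optional<std::uint32_t> visible_;  // most recent revealed value, if any

public:
    explicit delayed_value_feed(std::uint64_t delay) : delay_(delay) {}

    // Record a value produced by the instruction with sequence number seq.
    void produce(std::uint64_t seq, std::uint32_t value) {
        pending_.push_back({seq, value});
    }

    // Return the most recent value the predictor is allowed to see at time `now`.
    std::optional<std::uint32_t> latest_visible(std::uint64_t now) {
        while (!pending_.empty() && pending_.front().seq + delay_ <= now) {
            visible_ = pending_.front().value;
            pending_.pop_front();
        }
        return visible_;
    }
};
```

Even in this toy form, the contentious part is plain: the result depends entirely on the choice of `delay`, and the other options mentioned above (execution timing in a real processor, or the dataflow graph) would each need a different and more elaborate model.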