[BoomV3] Stalling the commit stage

zhe jiang

Jul 27, 2022, 10:23:33 AM
to riscv-boom
Hello dudes,

I am currently working on a security project that
(i) monitors the status of the BOOM core at run time, and
(ii) stalls the BOOM core when a "security risk" is detected.

In step (ii), I need to stall the big core (BOOM) so that no instruction is committed.

To this end, I modified rob.scala. Specifically, I changed:

can_commit(w) := rob_val(rob_head) && !(rob_bsy(rob_head)) && !io.csr_stall

to:

can_commit(w) := rob_val(rob_head) && !(rob_bsy(rob_head)) && !io.csr_stall ***&& !security_stall***
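
For reference, here is a minimal behavioural sketch in Python of what the modified condition is expected to do. It is not Chisel and not BOOM's actual ROB: `RobEntry`, `run`, and `stall_trace` are hypothetical names, and the model assumes at most one retirement per cycle with `csr_stall` held low.

```python
from dataclasses import dataclass


@dataclass
class RobEntry:
    valid: bool  # stands in for rob_val(rob_head)
    busy: bool   # stands in for rob_bsy(rob_head)


def can_commit(head: RobEntry, csr_stall: bool, security_stall: bool) -> bool:
    """Behavioural model of the modified condition quoted in the post."""
    return head.valid and not head.busy and not csr_stall and not security_stall


def run(entries, stall_trace):
    """Retire at most one entry per cycle; return the cycles on which a commit happened."""
    head = 0
    commit_cycles = []
    for cycle, security_stall in enumerate(stall_trace):
        if head < len(entries) and can_commit(entries[head], False, security_stall):
            commit_cycles.append(cycle)
            head += 1  # the head pointer only advances on a commit
    return commit_cycles


# Four ready entries; the stall is asserted during cycles 1 and 2 (made-up trace).
entries = [RobEntry(valid=True, busy=False) for _ in range(4)]
stall_trace = [False, True, True, False, False, False]
commit_cycles = run(entries, stall_trace)
print(commit_cycles)
```

In this model no commit can land on a cycle where the stall input is high, so any commit seen "during the stall" in a waveform would point at when the stall signal is actually asserted rather than at the gating expression itself.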

However, in the experiments, I can see that the BOOM core still commits the instructions, even though it commits fewer instructions than the original design (where !security_stall is not inserted).

Could anyone point out what might be wrong with my design?

Best regards,
Hugo

vi

Jul 27, 2022, 10:58:11 AM
to riscv-boom
Hi,

From what I understand of the ROB logic, it seems to me that your modification of can_commit(w) should be enough to block commit.

How is the security_stall signal assigned? Is it a Reg? Or does it depend on a Reg?
Did you make sure there isn't any delay between the cycle you need to have commit stalled, and the actual cycle where security_stall in the ROB gets asserted?

You say the core still commits the instructions, although it commits fewer than in the original design.
Could you tell us in more detail which instructions are committed when security_stall is asserted? What's the exact behavior you observe?
Does it commit a few consecutive instructions once the signal gets asserted, before blocking commit of all the subsequent instructions?
Or does it keep committing instructions (although not all) again and again for many cycles while security_stall is asserted?
Or something else?

Also, did you dump VCD traces?
If yes, would you mind sharing the VCD file? It would be a great help for troubleshooting.

Best regards

zhe jiang

Jul 27, 2022, 12:35:56 PM
to riscv-boom
Hi vi,

It is very nice to hear from you! I would like to give you more information and test cases regarding our project to answer your questions better.

Background
Our security sub-system works in the following steps:
(Step i) it monitors the instructions committed by the BOOM core at run time
(Step ii) it filters the "interesting instructions" among the committed instructions
(Step iii) it sends the filtered "interesting instructions" to our security analysers
(Step iv) the security analysers check whether an attack has occurred

Example and problem descriptions:
In our test cases, we configured the "interesting instructions" to be ld and st instructions, so every committed ld and st should be filtered and sent to our security analyser.
Two scenarios should therefore stall the BOOM core:
(Scenario i) an attack is detected
(Scenario ii) the message queue (hardware FIFO) of the security analyser is nearly full <- the problem occurs here

We ran three test cases, each executing 1077 ld instructions on the BOOM core:
(Test case i) setting: depth_message_queue := 64
              observation: all 1077 ld instructions are filtered and sent to our analyser (no overflow)
(Test case ii) setting: depth_message_queue := 16
              observation: only 1052 ld instructions are filtered and sent to our analyser (overflow)
(Test case iii) setting: depth_message_queue := 16, plus security_stall := true.B when message_queue_nearlyfull === true.B
              observation: only 1055 ld instructions are filtered and sent to our analyser (overflow)

Based on these observations, I concluded that in Test case iii the core still commits ld instructions while it should be stalled, although fewer than in the original design (Test case ii).

Question Answering:
Q: How is the security_stall signal assigned? Is it a Reg? Or does it depend on a Reg?
A: security_stall := message_queue_nearlyfull, and the message_queue_nearlyfull is stored in a Reg.

Q: Did you make sure there isn't any delay between the cycle you need to have commit stalled, and the actual cycle where security_stall in the ROB gets asserted?
A: Yes, there is no Reg inserted between security_stall and the ROB.
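
The timing question above can be illustrated with a small cycle-level model in Python. This is only a sketch under invented assumptions (one push per cycle, a consumer that drains one message every four cycles, and a hypothetical `simulate` helper); it is not the design from this thread and not a diagnosis of it. It compares a nearly-full flag that is seen combinationally with one that passes through a register stage before it reaches the stall input.

```python
from collections import deque


def simulate(depth, threshold, flag_delay, n_msgs, pushes_per_cycle=1, drain_period=4):
    """Model a message FIFO whose nearly-full flag stalls the producer.

    flag_delay = 0 models a combinational flag; flag_delay = k models k register
    stages between the occupancy count and the producer's stall input.
    Returns (accepted, dropped).
    """
    count = 0
    sent = accepted = dropped = 0
    # The producer sees the flag value computed `flag_delay` cycles earlier.
    flag_pipe = deque([False] * flag_delay)
    cycle = 0
    while sent < n_msgs:
        flag_pipe.append(count >= threshold)  # flag from the occupancy at cycle start
        stall = flag_pipe.popleft()
        # Consumer side: drain one message every `drain_period` cycles.
        if cycle % drain_period == 0 and count > 0:
            count -= 1
        # Producer side: push unless stalled; a push into a full FIFO is lost.
        if not stall:
            for _ in range(pushes_per_cycle):
                if sent == n_msgs:
                    break
                sent += 1
                if count < depth:
                    count += 1
                    accepted += 1
                else:
                    dropped += 1
        cycle += 1
    return accepted, dropped


comb = simulate(depth=16, threshold=16, flag_delay=0, n_msgs=1077)
reg_tight = simulate(depth=16, threshold=16, flag_delay=1, n_msgs=1077)
reg_headroom = simulate(depth=16, threshold=15, flag_delay=1, n_msgs=1077)
print(comb, reg_tight, reg_headroom)
```

In this model the combinational flag never loses a message, the registered flag with no headroom does, and lowering the threshold by one entry per cycle of delay (times the number of pushes per cycle) removes the loss again. Whether the real design behaves like this depends on how the FIFO occupancy and the flag are actually computed.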

Q: You say the core still commits the instructions, although it commits fewer than in the original design.
Could you tell us in more detail which instructions are committed when security_stall is asserted? What's the exact behavior you observe?
A: Hopefully the above example answers the question. I am happy to provide more information if it is required.

Q: Does it commit a few consecutive instructions once the signal gets asserted, before blocking commit of all the subsequent instructions?
Or does it keep committing instructions (although not all) again and again for many cycles while security_stall is asserted?
Or something else?
A: The BOOM core only executes ld instructions.
In Test case iii, I can see that rob_head[4:0] is not updated while security_stall is asserted.
I now doubt my original reasoning: maybe no instructions are committed during the stall after all? But then why are 22 instructions still missing (1077 executed, 1055 received)?

Q: Also, did you dump VCD traces?
If yes, would you mind sharing the VCD file? It would be a great help for troubleshooting.
A: I am more than happy to do so. What is the best way to share the VCD file? Shall we use e-mails?

Again, many thanks for your kind help!

Best regards,
Hugo

vi

Jul 27, 2022, 1:51:17 PM
to riscv-boom
Thank you for taking the time to write such a detailed reply! This is very helpful.

This behavior is intriguing indeed.

Are there any other differences in the core parameters or source code between test cases ii and iii?

I assume the VCD file size must be tens to hundreds of MB, so posting it directly here is probably impossible.
I guess the simplest way would be for you to upload it on your Google Drive, and then either post the link publicly here, or send the link privately to me if sharing the VCD traces publicly is a concern to you or your team.

Best regards

zhe jiang

Jul 27, 2022, 4:04:13 PM
to riscv-boom
Hi vi,

It is nice to hear from you, and I really appreciate your help.

Regarding your question: there is no other difference between test cases ii and iii. Only security_stall is added.

Regarding the VCD file: yes, you are right; it is huge! Following your suggestion, I have uploaded it to Google Drive (it took 2 hours!).
Please access the VCD file using the below (please let me know if you have any issues accessing it):

Some small points that might be useful for your debugging:
(i) security_stall is actually called gh_stall; you can find it in the rob instance
(ii) the BOOM core has Hart ID 0. There are also Rocket cores with Hart IDs 1 and 2, but we do not care about them in this case.
(iii) during the problematic execution, the BOOM core only executes ld instructions -- "lw   t1,   (t2);"
(iv) the code below illustrates how we monitor the commits of the BOOM core
  io.pc         := rob.io.commit.uops(0).debug_pc(31,0)
  io.inst       := rob.io.commit.uops(0).debug_inst(31,0)
  io.new_commit := rob.io.commit.arch_valids(0)
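
As an illustration of the filtering step applied to the monitored instruction word, here is a small Python sketch that classifies a 32-bit encoding as a load or store by its major opcode (bits 6:0). The names `is_interesting`, `encode_load`, and `filter_commits` are hypothetical, the PC values are made up, and the sketch ignores compressed (RVC), floating-point, and atomic encodings. It takes one (valid, pc, inst) tuple per commit lane, whereas the snippet above taps lane 0 only.

```python
OPCODE_LOAD = 0b0000011   # integer LOAD major opcode (lb, lh, lw, ld, ...)
OPCODE_STORE = 0b0100011  # integer STORE major opcode (sb, sh, sw, sd)


def is_interesting(inst: int) -> bool:
    """True if the 32-bit instruction word is an integer load or store."""
    return (inst & 0x7F) in (OPCODE_LOAD, OPCODE_STORE)


def encode_load(rd: int, rs1: int, imm: int, funct3: int) -> int:
    """Assemble an I-type load: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | OPCODE_LOAD


def filter_commits(lanes):
    """Keep (pc, inst) for every valid commit lane that holds a load or store.

    `lanes` is a list of (arch_valid, pc, inst) tuples, one per commit lane,
    for a single cycle.
    """
    return [(pc, inst) for valid, pc, inst in lanes if valid and is_interesting(inst)]


LW_T1_T2 = encode_load(rd=6, rs1=7, imm=0, funct3=0b010)  # lw t1, 0(t2); t1 = x6, t2 = x7
ADDI_NOP = 0x00000013                                     # addi x0, x0, 0

cycle_lanes = [
    (True, 0x80000000, LW_T1_T2),   # valid load: kept
    (False, 0x80000004, LW_T1_T2),  # lane not committing this cycle: ignored
    (True, 0x80000008, ADDI_NOP),   # valid but not a load/store: ignored
]
kept = filter_commits(cycle_lanes)
print([(hex(pc), hex(inst)) for pc, inst in kept])
```

If the core is configured to retire more than one instruction per cycle, counting over every lane rather than lane 0 alone is one way to check whether all committed loads reach the filter.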


If you wish, we are also happy to share the source code with you; please feel free to contact me at: zj266 AT cam.ac.uk

Again, many thanks for your kind help! It is really helpful for us.

Best regards,
Hugo

zhe jiang

Jul 27, 2022, 10:46:53 PM
to riscv-boom
Hi vi,

Please do not spend more time on my question: I think I have found the cause of the problem. It is caused by the FIFO queue (rather than by can_commit), but I still need to investigate further.

I really appreciate your help and the discussion; you helped me speed up the debugging!

Lots of thanks!

Best regards,
Hugo

vi

Jul 28, 2022, 2:36:02 AM
to riscv-boom
Hi Hugo,

Glad you found where the problem comes from!

Don't hesitate to post again if you encounter any other problem with the BOOM core.
Although I should clarify that I'm not a member/contributor to the BOOM core.
I'm just working on it as a personal side project and I don't know everything about the core's operation, so I may not be able to help much.
But if I can be of any help on some issue I'll be glad to take a look.

Best regards

zhe jiang

Jul 28, 2022, 9:31:54 AM
to riscv-boom
Hi Vi,

Many thanks! The discussion with you was really helpful: it cleared my mind for debugging and gave me some key information (e.g., how can_commit is used)!

I really appreciate that! Definitely, more questions will come :-)

Have a great weekend!

Best regards,
Hugo