The implementation of fence instruction in Boom

赵夏

Mar 23, 2020, 2:36:44 AM

to riscv-boom

Hi guys,

I think the current implementation of the fence instruction in BOOM is a bit naive: the load/store instructions behind the fence have to wait until the fence is removed from the store queue. Please correct me if my understanding is wrong :).

A more efficient implementation I have in mind is to let the loads behind the fence execute out of order, and roll back if such a load is observed by another core. But I am not sure that the benefits clearly outweigh the implementation complexity. Any comments on this? Also, is there any information about how commercial Arm processors implement this, since they also use a weak memory ordering model and rely on fences to enforce memory ordering?

Many thanks

Xia

Jerry Zhao

Mar 23, 2020, 3:30:00 AM

to 赵夏, riscv-boom

Hi,

You are right, the FENCE implementation in hardware is naive. In practice, a more efficient implementation would look at the granularity of the FENCE (pr/pw/pi/po, sr/sw/si/so) to determine which instructions are permitted to execute out of order, and when to roll back. It's simpler for us to reduce all FENCE types to a global FENCE.
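To make the granularity point concrete, here is a minimal Python sketch of how the predecessor/successor sets can be read out of a FENCE instruction word. The field positions follow the RISC-V base ISA encoding; the function names (`decode_fence`, `fence_orders`) are made up for illustration and are not BOOM code. The sketch ignores the `fm` field, so a `fence.tso` is conservatively treated like a plain `fence rw,rw`.

```python
from typing import Tuple

# Bit positions of the FENCE predecessor/successor sets in the 32-bit
# instruction word (RISC-V base ISA, MISC-MEM opcode).
PRED_SHIFT = 24        # bits 27:24 = PI, PO, PR, PW
SUCC_SHIFT = 20        # bits 23:20 = SI, SO, SR, SW
SET_LETTERS = "iorw"   # most significant bit of each 4-bit field first


def set_to_str(bits: int) -> str:
    """Turn a 4-bit I/O/R/W mask into assembler spelling, e.g. 0b0011 -> 'rw'."""
    return "".join(ch for i, ch in enumerate(SET_LETTERS) if bits & (0b1000 >> i))


def decode_fence(insn: int) -> Tuple[str, str]:
    """Return (pred, succ) for a FENCE instruction word (fm is ignored)."""
    is_misc_mem = (insn & 0x7F) == 0x0F
    is_fence = ((insn >> 12) & 0x7) == 0
    if not (is_misc_mem and is_fence):
        raise ValueError("not a FENCE instruction")
    pred = (insn >> PRED_SHIFT) & 0xF
    succ = (insn >> SUCC_SHIFT) & 0xF
    return set_to_str(pred), set_to_str(succ)


def fence_orders(insn: int, earlier: str, later: str) -> bool:
    """True if this FENCE orders an earlier access of kind `earlier`
    ('r' or 'w') before a later access of kind `later`."""
    pred, succ = decode_fence(insn)
    return earlier in pred and later in succ
```

For example, `decode_fence(0x0230000F)` corresponds to `fence r,rw`, so `fence_orders(0x0230000F, "w", "r")` is false: an earlier store is not in the predecessor set, and a finer-grained implementation would not need to hold a later load behind it. Reducing everything to a global FENCE amounts to treating every encoding as `iorw,iorw`.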

I have not looked into this in that much detail, since FENCE instructions are infrequent in the workloads I have been running.

However, I suspect that the implementation complexity is not too great. BOOM already provides mechanisms for rolling back on memory-ordering failures, and provides a mechanism for detecting when a load is observed by another hart. A better implementation would relax FENCE so that it is no longer an is_unique micro-op, and use the observed signals in the LSU to determine when a FENCE'd load should throw an ordering failure.
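As a rough illustration of that idea, the following Python toy model lets loads run past a pending fence and flags an ordering failure only if another hart observes the cache line before the fence commits. This is a sketch under simplifying assumptions (a single pending fence, line-granular matching), and every name in it (`ToyLSU`, `LoadEntry`, `remote_observe`, and so on) is hypothetical; it is not how BOOM's LSU is actually written.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LoadEntry:
    age: int                  # program order; a larger value means younger
    line_addr: int            # cache-line address the load read from
    executed: bool = False    # load has already obtained its data
    order_fail: bool = False  # load must be squashed and replayed


@dataclass
class ToyLSU:
    loads: List[LoadEntry] = field(default_factory=list)
    pending_fence_age: Optional[int] = None  # assumption: at most one fence in flight

    def dispatch_fence(self, age: int) -> None:
        self.pending_fence_age = age

    def commit_fence(self) -> None:
        # Once the fence commits, loads that ran past it are no longer speculative.
        self.pending_fence_age = None

    def execute_load(self, age: int, line_addr: int) -> LoadEntry:
        # Speculation: the load executes even if an older fence is still pending.
        entry = LoadEntry(age, line_addr, executed=True)
        self.loads.append(entry)
        return entry

    def remote_observe(self, line_addr: int) -> List[int]:
        """Another hart observed/took the line. Flag every executed load that
        ran ahead of a still-pending fence and read that line; return their ages."""
        flagged = []
        for ld in self.loads:
            crossed_fence = (
                self.pending_fence_age is not None
                and ld.age > self.pending_fence_age
            )
            if ld.executed and crossed_fence and ld.line_addr == line_addr:
                ld.order_fail = True
                flagged.append(ld.age)
        return flagged
```

In this model a load older than the fence, or a load to an unrelated line, is never flagged, so the rollback cost is only paid when the speculation was actually visible to another hart.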

-Jerry

--
You received this message because you are subscribed to the Google Groups "riscv-boom" group.
To unsubscribe from this group and stop receiving emails from it, send an email to riscv-boom+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/riscv-boom/4164c7eb-01fe-4460-9c41-1162c34b35be%40googlegroups.com.

赵夏

Mar 29, 2020, 11:50:48 PM

to riscv-boom

Hi Jerry,

Thanks for the reply. Following up on the last email, I am a bit confused about how BOOM provides a mechanism for detecting when a load is observed by another hart. I tried to figure this out by myself last week but failed.

I know this is related to "can_fire_release", but I could not find how BOOM handles this in detail. For example, I guess that in this case the LSU is communicating with the dcache.scala module, right? I know the checked address is stored in "lcam_addr", but I could not find where this address comes from in this mechanism.

Thanks for your time and looking forward to your reply.

Cheers,

Xia

Jerry Zhao

Mar 30, 2020, 2:53:43 PM

to 赵夏, riscv-boom

Sorry, it appears I did not finish pushing out support for this feature into master.
I've backported the fix from my dev branch here (https://github.com/riscv-boom/riscv-boom/pull/448). You should be able to pull the most recent commit of master to get the change.

赵夏

Mar 30, 2020, 9:12:54 PM

to riscv-boom

Hi Jerry,

Thanks! After pulling the updated code, it makes a lot more sense now.

Cheers,

Xia
