This is an archive of the discontinued LLVM Phabricator instance.

[MCA] Support in-order CPUs with MicroOpBufferSize=1
ClosedPublic

Authored by foad on Mar 10 2021, 8:14 AM.

Diff Detail

Event Timeline

foad created this revision.Mar 10 2021, 8:14 AM
foad requested review of this revision.Mar 10 2021, 8:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2021, 8:14 AM
andreadb accepted this revision.Mar 10 2021, 12:09 PM

Hi Jay,

is this patch needed because otherwise you see the following crash/assertion failure?

llvm-mca: /home/andrea/llvm-project/llvm/lib/MCA/HardwareUnits/RetireControlUnit.cpp:50: unsigned int llvm::mca::RetireControlUnit::dispatch(const llvm::mca::InstRef&): Assertion `(AvailableEntries >= Entries) && "Reorder Buffer unavailable!"' failed.

I had a better look at the implementation of our new InOrderIssueStage, and I am not convinced that we do the right checks for the reorder buffer.

TL;DR:
Your patch has the effect of disabling the reorder buffer logic entirely. RCU queries would simply be ignored, and an invalid "retire token ID" would be associated to all instructions.
This all makes sense to me, since the end result is that we want to bypass the RCU. So the patch LGTM.

That being said, this patch forced me to have another look at our implementation of the in-order pipeline, and I think I found a few problems in how we simulate the RCU for in-order processors. I have also reported a more general issue in PR41796.

The retire stage was originally added by Andrew to deal with processors like Cortex-A55 which - under specific conditions - allow OoO execution of some instructions. Quoting a document which I found on the web "While the Cortex-A55 core only issues instructions in-order, due to the number of cycles required to complete more complex floating-point and NEON instructions, out-of-order retire is allowed on ..." some instructions.

I think that the logic in the new InOrderIssue stage is missing some important checks. In particular, we are definitely missing a check on the availability of the reorder buffer. If we have run out of entries, then the issue should be stalled. While this check is not important for your target, it is important for cortex-a55 (because it declares a ROB with multiple entries in the "extra processor info" struct). That check should be ignored for other targets, for which we don't care about simulating a ROB.

This revision is now accepted and ready to land.Mar 10 2021, 12:09 PM
foad added a comment.Mar 10 2021, 12:43 PM

is this patch needed because otherwise you see the following crash/assertion failure?

Yes, that was the failure, and I just did the minimum work I could to make MicroOpBufferSize=1 work like MicroOpBufferSize=0. I don't really know what reorder buffers and retire control units are -- they are not part of my mental model of how the AMDGPU processors work.

is this patch needed because otherwise you see the following crash/assertion failure?

Yes, that was the failure, and I just did the minimum work I could to make MicroOpBufferSize=1 work like MicroOpBufferSize=0. I don't really know what reorder buffers and retire control units are -- they are not part of my mental model of how the AMDGPU processors work.

Fair enough :).
Your patch looks good. Thanks!

This revision was automatically updated to reflect the committed changes.