This is an archive of the discontinued LLVM Phabricator instance.

[MCA] Moving six instruction flags out of InstrDesc and into InstructionBase.
ClosedPublic

Authored by holland11 on Mar 11 2022, 7:58 PM.

Details

Summary

TLDR: This patch proposes moving a few instruction flags out of the InstrDesc and into the InstructionBase. This move allows for those flags to be modified by InstrPostProcess.

This is patch is an audible from https://reviews.llvm.org/D117451 . The primary goal of that/this patch was to give developers the ability (through InstrPostProcess and/or CustomBehaviour) to modify instruction flags such as RetireOOO. The reason why flags like RetireOOO (and the others affected by this patch) can't be modified yet is because they are stored within the InstrDesc object and InstrDesc objects are made const after creation. In my original attempt to solve this problem, I figured that I'd take the opportunity to allow for the other attributes within the InstrDesc to be modified by InstrPostProcess at the same time. This would allow for mca users to be able to do things like modify the instruction's Reads / Writes, MaxLatency, NumMicroOps, and which resources are used. This is something that I wanted to accomplish eventually because currently, CB and IPP have the ability to add *new* hazards to instructions, but they don't have the ability to remove *existing* hazards.

The first problem with my initial proposal was that it was way too "open-ended". The way it let IPP modify the InstrDesc objects gave the user way too much control and not enough guidance. The second problem (and one that contributes to why the first problem is a problem) is that doing things like modifying the MaxLatency attribute, or removing a Use / Def from the Writes / Reads vectors isn't nearly as straight forward as just modifying / removing them. These attributes are originally set with each other in mind and so to modify them correctly, you may need to modify a group of them in a logical way. For example, if you lower an instruction's MaxLatency, you will likely get an error when running MCA due to MCA thinking the instruction is finished executing, but it still has Reads / Writes active for more cycles.

To solve both of those problems, making the API much more rigid and self explanatory would be desired, if not required. On top of that, I would also need to make sure that I have a full understanding of how all of these attributes are connected together so that I could either design the API in a way that makes it difficult for users to make mistakes or just provide very detailed comments that help users make their modifications in the proper ways.

At this time, I do not feel confident in my ability to do the second part of the above paragraph, but I do still want to be able to modify the flags from this patch that do not have dependencies like the ones described above (MayLoad, MayStore, HasSideEffects, BeginGroup, EndGroup, and RetireOOO).

There are two main ways that I can achieve this. The first is shown in this diff. What I have done is moved those flags out of the InstrDesc struct and into the InstructionBase class.

The alternative is to leave the flags as part of the InstrDesc struct and inject IPP into the InstrDesc creation function so that it can have a chance to modify the flags. (If done this way, I would have the IPP injection done in a way that would restrict IPP to only modifying these flags.)

The main advantages of doing it in the proposed way are:

  • The flags can be modified easily by IPP without actually having to change anything about the way IPP works or is injected.
  • The flags can be modified per individual instruction rather than being forced to modify them for every single instruction which matches a particular opcode (only one InstrDesc object is created per opcode so you could have several instructions in the same input that share the same InstrDesc, but they do not share the same InstructionBase).

The main disadvantage is:

  • Those flags are now duplicated for every single instruction with the same opcode. So if your input file has multiple instructions all with the same opcode (and you don't want to modify them independently), you'll end up using unnecessary memory from those duplicated / redundant flags.

Since what's being duplicated is only 6 bools, I don't see this as a major disadvantage, but I am absolutely open to changing this patch to conform to the second option (leave the flags in InstrDesc and then inject IPP in a restrictive way to let it modify those flags) if you feel it would be best. I am also open to other ideas, suggestions, questions, and feedback.

This patch also includes a few unrelated and minor changes. I added the opcode to the debug printing for an instruction (so that it's not as much of a hassle to get the opcode number of an in input instruction) and I added the IPP::resetState() function. It's theoretically possible that a target's IPP may maintain some state during its lifetime to help it make decisions based on what it has previously seen in the input. But since IPP is created outside of the CodeRegion loop, IPP should have an opportunity to reset its state whenever a new CodeRegion is being evaluated (since the CodeRegions are supposed to be evaluated independently of each other). Neither of these are particularly disruptive and I don't imagine they'll be controversial so I figured they'd be fine to tack on here.

As always, thank you very much for your time. I would appreciate any form of input that anybody might have.

Diff Detail

Event Timeline

holland11 created this revision.Mar 11 2022, 7:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2022, 7:58 PM
holland11 requested review of this revision.Mar 11 2022, 7:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2022, 7:58 PM
holland11 edited the summary of this revision. (Show Details)Mar 11 2022, 7:59 PM
andreadb accepted this revision.Mar 12 2022, 3:38 AM

Just a couple of minor comments. Otherwise, patch looks good to me.

Thanks!

llvm/include/llvm/MCA/CustomBehaviour.h
53–60

The last sentence alone is already descriptive enough.

I suggest that we keep that last sentence only, and get rid of the reference to llvm-mca.cpp. While it is true that the llvm-mca driver is currently the only user of this logic (at least, this is true for upstream), there may be other downstream users of this library.

llvm/include/llvm/MCA/Instruction.h
513–525

Could you please convert all these eight boolean flags (starting from IsALoadBarrier, ending with your new flags) to bitfields?

Example:

bool IsALoadBarrier : 1;
bool IsAStoreBarrier : 1;
...
bool RetireOOO : 1;
llvm/tools/llvm-mca/llvm-mca.cpp
516–517

This comment is a bit repetitive and it can be removed. A similar comment is already added by this patch to the declaration of resetState. So, I am not convinced that it is adding extra information to the reader.

This revision is now accepted and ready to land.Mar 12 2022, 3:38 AM

Just a couple of minor comments. Otherwise, patch looks good to me.

Thanks!

I'd never learned about that bitfield declaration pattern before. Just googled it and it makes a lot of sense. I've seen that pattern around mca's codebase, but I always assumed it was just an alternative way to give default values. Thanks for the suggestions!

holland11 updated this revision to Diff 414959.Mar 13 2022, 2:21 PM

Made the changes that were suggested by Andrea.

This revision was landed with ongoing or failed builds.Mar 13 2022, 9:22 PM
This revision was automatically updated to reflect the committed changes.