This commit purges direct accesses to MD_prof metadata and replaces them
with the accessors provided from the utility file wherever possible.
This commit can be seen as the first step towards switching the branch weights to 64 bits.
See post here: https://discourse.llvm.org/t/extend-md-prof-branch-weights-metadata-from-32-to-64-bits/67492
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Thanks for picking this up. When I implemented ProfDataUtils there were still several things I wanted to address but havent' had sufficient bandwidth to pick up again. Looks like you've got most of the ones that I recall so far.
I left some feedback inline. Overall I think this is a good change. I have a few nits, and concerns that we may be missing some important checks, but most are pretty minor.
The one thing I'd say that definitely needs to change IMO is the addition of the SwitchInst to the True/False extractBranchWeights API. It seems error prone and wrong to use in almost all situations. But I'm not opposed to the change if there is a compelling reason, but I still don't see much benefit to it w/in this patch, except for one place in Local.cpp.
llvm/include/llvm/IR/ProfDataUtils.h | ||
---|---|---|
76 | This is incorrect. This overload is specifically for branch/select instructions where the number of successors is 2. There is a more general version that uses weights vector for all instruction types, including switches. If this API isn't needed anywhere/in the code base anymore, then it can be removed IMO, but this comment shouldn't change. | |
llvm/lib/IR/Instructions.cpp | ||
4606–4609 | I don't think we do this check in ProfDatUtils, so we probably shouldn't remove it, should we? I know I've tripped this assert before when modifying some passes, so I think it's good one to keep. | |
llvm/lib/IR/ProfDataUtils.cpp | ||
46 | This constant is used to check the validity of branch weight metadata. Changing this to 2 implies that a MD_prof node w/ only the "branch_weight" moniker and a single weight are valid. I don't think that's true, so I don't think this should change unless we change how MD_prof is laid out as a whole. | |
144 | This is wrong. you can't include a switch here. The overload above takes a SmallVectorImpl is the only appropriate API when dealing with a switch, since it can have an arbitrary number of arms. The assertion here is to prevent API misuse. If used on a switch, this will always return false, so I don't think thats the right choice here. You get some utility by being able to use this in more places, but I think this will be easy to introduce subtle bugs. For instance, if a switch happens to only have 2 arms this will work when it arguably shouldn't. My preference would be that everything settle on the SmallVectorImpl API, since that can correctly handle all cases. It isn't' as ergonomic in some of the code uses though, IIRC. | |
llvm/lib/Transforms/IPO/PartialInlining.cpp | ||
719 | ||
llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp | ||
578–579 | Oh, this is a good change. The API use is much better at call sites. Since the validation is basically done in ProfDataUtils, maybe rename this API to something more self descriptive? | |
588 | nit: if we're changing the variable names, let's be consistent. I'd leave them as they were to limit the change, but if we're changing some, we should change them uniformly. | |
llvm/lib/Transforms/Scalar/JumpThreading.cpp | ||
2524–2525 | I think we discourage auto in cases like this, don't we? See: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable | |
llvm/lib/Transforms/Scalar/LoopPredication.cpp | ||
977–978 | Should we also provide this API in ProfDataUtils? It's a validity check for branch weight nodes, and we already encapsulate most of it ... | |
llvm/lib/Transforms/Utils/Local.cpp | ||
221–222 | ||
315–319 | The overload will just get the MD node again anyway before calling the other API, so I'd suggest just using the MD node directly. Would you mind changing these and any others in this patch that use getBranchWeightMDNode prior to extractBranchWeights |
llvm/lib/IR/ProfDataUtils.cpp | ||
---|---|---|
104–124 | I'd consider inverting the logic between getValidBranchWeightMDNode() and hasValidBranchWeightMD() | |
llvm/lib/Transforms/Scalar/LoopPredication.cpp | ||
992 | I always forget the enumerate API exists ... what a massive improvement to readability. | |
llvm/lib/Transforms/Utils/Local.cpp | ||
225 | Should we track this improvement w/ an issue? If you don't want to that's fine, but I always lose track of these TODOs unless I file something in a bug tracker and ref it from my code. Totally up to you though. | |
315 | uint64_t, maybe? I can never remember what is guaranteed to run before we scale down to 32-bit weights, but the size should at least be the same between the weights vector and DefWeight/CaseWeight, right? |
llvm/lib/IR/Instructions.cpp | ||
---|---|---|
4606–4609 | I changed this back to have the llvm_unreachable. The newly added getValidBranchWeightMDNode could be used here but that would not cause an assertion violation. | |
llvm/lib/IR/ProfDataUtils.cpp | ||
46 | I changed this because calls can also have branch weights: https://llvm.org/docs/BranchWeightMetadata.html#callinst | |
llvm/lib/Transforms/Utils/Local.cpp | ||
225 | Good idea, I tracked this and another problem with the subsequent swap in here: https://github.com/llvm/llvm-project/issues/59956 | |
315 | The extractBranchWeights function expects an SmallVectorImp<uint32_t>, so I now changed the Weight variables to uint32_t as well. |
This is LGTM from my perspective, as long as the 32-bit usage in Local.cpp is safe. Thanks for taking this up. I think it's a big improvement.
llvm/lib/Transforms/Utils/Local.cpp | ||
---|---|---|
315 | I think this is fine but we may want to double check that this only gets called after scaling down to 32-bit. Otherwise, maybe we should have a uint64_t version too, or make this a templated function in the .h, so the you can choose at the callsite. That can be a separate change though, as long as 32-bit is safe. @davidxl do you know if this gets called after scaling down to 32-bit? |
lgtm
llvm/lib/Transforms/Utils/Local.cpp | ||
---|---|---|
315 | IRPGO scales the count before setting the edge weights. Frontend based instrumentation does the same. Static branch weights have small values so they are fine too. |
Thank you very much for the reviews. Could one of you commit this revision for me as I do not yet have commit access?
@Dinistro I can do that for you, but looking at your history, I'd say you should just request access. This is especially true, since I think you plan some follow up work w/ 64-bit weights, correct? https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access. Usually people get access w/in an hour or so, but w/in a day is the expectation.
But let me know, and I can land this for you. But just FYI, if I do so I'll be the Author unfortunately, but I will attribute you in the commit summary. For some reason the committer doesn't get emails from bots when breakages happen, and only the author does. So to avoid that I've just started doing it that way. It's very imperfect, and not my preference, but it's not easy to catch breakages otherwise.
This is incorrect. This overload is specifically for branch/select instructions where the number of successors is 2. There is a more general version that uses weights vector for all instruction types, including switches.
If this API isn't needed anywhere/in the code base anymore, then it can be removed IMO, but this comment shouldn't change.