I have had this patch lying around from a while back, from when I was looking at the MVETailPredication pass. It strips out a lot of code that should no longer be needed, leaving the important part of the pass: finding active lane mask instructions and converting them to VCTP operations.
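To illustrate the conversion, here is a small executable model of the two mask semantics, assuming a 4-lane (128-bit, i32) MVE vector; the function names are placeholders, not LLVM APIs. `llvm.get.active.lane.mask(%base, %n)` sets lane `i` active iff `base + i < n`, while VCTP activates the first `elements-remaining` lanes, so the mask can be replaced by a VCTP on `n - base`:

```python
VL = 4  # lanes in a 128-bit MVE vector of i32

def active_lane_mask(base, n):
    # Model of llvm.get.active.lane.mask: lane i is active iff base + i < n.
    return [base + i < n for i in range(VL)]

def vctp(elts_remaining):
    # Model of MVE VCTP32: the first min(max(remaining, 0), VL) lanes are active.
    return [i < elts_remaining for i in range(VL)]

# The pass rewrites the mask intrinsic as a VCTP on the remaining element count:
assert active_lane_mask(8, 10) == vctp(10 - 8)   # partial tail iteration
assert active_lane_mask(0, 10) == vctp(10 - 0)   # full vector iteration
```

This equivalence holds lane-by-lane (`base + i < n` is the same as `i < n - base`), which is why the conversion is valid independently of whether the loop later becomes a low-overhead tail-predicated loop.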
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/ARM/MVETailPredication.cpp:351–352
> nit: looks like more than 80 columns...
llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-intrinsic-round.ll:244
> Initially I was expecting this to be an NFC, but it looks like it is also doing something good for codegen and we get more tail-predication. Why is that? Is it because the dead code removal is no longer in the way?
llvm/lib/Target/ARM/MVETailPredication.cpp:351–352
> Ah yeah. Will do.
llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-intrinsic-round.ll:244
> IsPredicatedVectorLoop checked for certain instructions and only converted the active lane mask if it found them. This included intrinsics that it did not recognize as vector instructions. It will always be better to convert to a VCTP than to expand the active lane mask, though, even if the loop does not get tail-predicated successfully. Also, as can be seen in this case, it sometimes gets intrinsics wrong. The intrinsic here is serialized, but that does not block tail predication.
llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-intrinsic-round.ll:244
> Ah yes, got it, cheers. Like Sam said, LGTM.