This adds a new BFI intrinsic which can be used to emit the v_bfi instruction
directly with a mask.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
We’ve specifically avoided adding intrinsics for easy-to-match instructions. This needs a semantic justification over just emitting the expanded bit sequence. It’s a huge amount of work to teach every part of the compiler the equivalent bit optimizations.
llvm/test/CodeGen/AMDGPU/bfi_nested.ll | ||
---|---|---|
3 | Why lose a test? |
Unfortunately, BFI instructions are not so easy to match in LLVM. Some issues prevent us from generating nested v_bfi instructions (e.g. one BFI as the base of another), which in turn means we generate far fewer v_bfi instructions than we could. I have been working on that for a while in https://reviews.llvm.org/D136432, and it is not easy to get working properly. From my tests, there seems to be no real codegen advantage in trying to match the and/or patterns, so I thought that emitting the intrinsic directly would be sufficient and an improvement over the current state. This should not replace the few existing BFI ISel patterns but rather serve as a way to teach the middle-end to generate BFI instructions.
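For reference, here is a minimal C++ sketch of the bit patterns under discussion. It assumes the usual v_bfi_b32 semantics, `(mask & insert) | (~mask & base)`, as I understand them from the AMD ISA documentation. The helper names (`bfi`, `bfi_xor`, `bfi_nested`) are hypothetical, exist only for illustration, and are not the intrinsic proposed in this patch.

```cpp
#include <cstdint>

// Bitfield insert under the assumed v_bfi_b32 semantics: take bits from
// `insert` where `mask` is 1 and from `base` where `mask` is 0.
constexpr std::uint32_t bfi(std::uint32_t mask, std::uint32_t insert,
                            std::uint32_t base) {
  return (mask & insert) | (~mask & base);
}

// Equivalent xor form of the same select. A matcher that only recognizes
// the and/or form above would miss this shape.
constexpr std::uint32_t bfi_xor(std::uint32_t mask, std::uint32_t insert,
                                std::uint32_t base) {
  return base ^ ((insert ^ base) & mask);
}

// Nested case: the result of one BFI is used as the base of another.
constexpr std::uint32_t bfi_nested(std::uint32_t m1, std::uint32_t a,
                                   std::uint32_t m2, std::uint32_t b,
                                   std::uint32_t c) {
  return bfi(m1, a, bfi(m2, b, c));
}

// Compile-time checks on a few concrete values.
static_assert(bfi(0x0000FF00u, 0x12345678u, 0xAABBCCDDu) == 0xAABB56DDu,
              "and/or form");
static_assert(bfi_xor(0x0000FF00u, 0x12345678u, 0xAABBCCDDu) == 0xAABB56DDu,
              "xor form agrees with and/or form");
static_assert(bfi_nested(0x000000FFu, 0x11111111u, 0x0000FF00u, 0x22222222u,
                         0x33333333u) == 0x33332211u,
              "nested form");
```

The nested form ideally becomes two v_bfi instructions. Once the expression is expanded into and/or/xor and simplified, the pattern matcher has to recognize the inner select again before it can match the outer one, which is where matching tends to break down.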
llvm/test/CodeGen/AMDGPU/bfi_nested.ll | ||
---|---|---|
3 | Because that was implemented as the basis for the aforementioned, abandoned BFI patch, which was never merged. |