As noticed on PR39174, if we're extracting a single bit at a non-constant (variable) index, try to use BT+SETCC instead. This avoids having to move the shift amount into the ECX register and avoids the slow x86 shift ops.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Why does using the ECX register make the shift slow?
Just that we have to move the shift amount to ECX for the shift ops, which can have side effects on register pressure/allocation.
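For illustration, the kind of source pattern under discussion can be sketched in C as below. This is only a sketch: the function names are hypothetical and not taken from the patch, and whether a given compiler version actually emits BT+SETCC for it is not claimed here. The underlying point is that the x86 variable shift instructions take their count in CL, whereas BT accepts the bit index in any general-purpose register and writes the selected bit to the carry flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns bit n of x as 0 or 1.
 * Precondition: n < 32 (shifting a 32-bit value by >= 32 is undefined in C). */
static inline uint32_t extract_bit(uint32_t x, uint32_t n) {
    /* Variable shift followed by a mask with 1: a single-bit extract. */
    return (x >> n) & 1u;
}

/* Small self-check for the helper above. */
static inline void extract_bit_self_check(void) {
    assert(extract_bit(0x5u, 0) == 1u);          /* 0b101, bit 0 is set   */
    assert(extract_bit(0x5u, 1) == 0u);          /* 0b101, bit 1 is clear */
    assert(extract_bit(0x80000000u, 31) == 1u);  /* top bit               */
}
```

With a shift-based lowering, `n` has to be placed in ECX before the shift, which constrains register allocation; with BT the index can stay wherever it already lives.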
LGTM. I thought so, but the word "slow" made me wonder whether there was some other issue I didn't know about :)
What about the case where we want the inverse of the bit? https://godbolt.org/z/sEjq9n9Kn
I would presume we can consume any such NOT by inverting the predicate (b<->nb).
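A minimal C sketch of the inverted case, assuming it looks roughly like the following (the exact code behind the godbolt link is not reproduced here, and the function name is hypothetical). Since BT leaves the selected bit in the carry flag, SETB yields the bit itself and SETNB (SETAE) yields its inverse, so the NOT could in principle be folded into the condition code rather than emitted as a separate instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the inverse of bit n of x as 0 or 1.
 * Precondition: n < 32. */
static inline uint32_t extract_bit_inverted(uint32_t x, uint32_t n) {
    /* Extract the bit, then flip it; equivalent to testing that the bit is clear. */
    return ((x >> n) & 1u) ^ 1u;
}

/* Small self-check for the helper above. */
static inline void extract_bit_inverted_self_check(void) {
    assert(extract_bit_inverted(0x5u, 0) == 0u);  /* bit 0 set   -> 0 */
    assert(extract_bit_inverted(0x5u, 1) == 1u);  /* bit 1 clear -> 1 */
    assert(extract_bit_inverted(0u, 31) == 1u);   /* bit 31 clear -> 1 */
}
```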
np - we already do something like that in some of the other X86ISD::BT lowering cases - I'll take a look next week