When translating multiple bitfieldInserts where one
bitfieldInsert is the base for another, the current
mechanism of creating XOR, AND, XOR sequences in the
frontend and lowering them in ISel is not sufficient,
because the information about the bitmasks is lost during InstCombine.
As a result, only one v_bfi instruction is generated.
When the canonical bitfieldInsert pattern is created directly in the
frontend, SimplifyDemandedBits still partially merges the constants,
which yields a simplified pattern. However, the
original sequence can be reconstructed from it, so the nested bitfieldInserts
can still be generated.
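
For reference, the two forms of a single bitfieldInsert mentioned above
can be sketched as follows. This is only an illustration on plain 32-bit
integers; the function names bfiCanonical and bfiXorAndXor are
hypothetical and do not come from the patch.

```cpp
#include <cassert> // only needed if the usage checks below are added
#include <cstdint>

// Canonical bitfieldInsert: take the bits of Insert where Mask is set
// and the bits of Base everywhere else.
// (Hypothetical helper name, for illustration only.)
constexpr uint32_t bfiCanonical(uint32_t Mask, uint32_t Insert,
                                uint32_t Base) {
  return (Insert & Mask) | (Base & ~Mask);
}

// The same operation written as an XOR, AND, XOR sequence.
// Where Mask is 1: (Insert ^ Base) ^ Base == Insert.
// Where Mask is 0: 0 ^ Base == Base.
constexpr uint32_t bfiXorAndXor(uint32_t Mask, uint32_t Insert,
                                uint32_t Base) {
  return ((Insert ^ Base) & Mask) ^ Base;
}

// Compile-time check that both forms agree on one sample input.
static_assert(bfiCanonical(0x0000FF00u, 0x12345678u, 0xAABBCCDDu) ==
                  bfiXorAndXor(0x0000FF00u, 0x12345678u, 0xAABBCCDDu),
              "both forms describe the same bitfieldInsert");
```

Both forms compute the same value; the difference is only in how much
of the mask structure survives later constant folding.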
In general, this approach tries to match sequences such as
(X1 & C1) | ((X2 & C2) | (X3 & (~C1 & ~C3)))
and checks whether the pattern is derived from
(X1 & C1) | (~C1 & ((X2 & C2) | (X3 & ~C3)))
by checking that the constants are pairwise disjoint and together
partition -1 (all bits set).
In such cases, it tries to generate the appropriate v_bfi instructions.
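
The disjoint-and-partitioning condition can be illustrated with a small
sketch on plain 32-bit integers. It assumes that the inner bitfieldInsert
uses a single mask C2 (so C3 == C2) and that C1 and C2 do not overlap.
The names nestedBfi, flattenedBfi and isDisjointPartition are
hypothetical and are not taken from the patch.

```cpp
#include <cassert> // only needed if the usage checks below are added
#include <cstdint>

// Nested form: the inner insert (X2 into X3 under C2) is the base of
// the outer insert (X1 under C1). Hypothetical helper for illustration.
constexpr uint32_t nestedBfi(uint32_t X1, uint32_t C1, uint32_t X2,
                             uint32_t C2, uint32_t X3) {
  uint32_t Inner = (X2 & C2) | (X3 & ~C2);
  return (X1 & C1) | (~C1 & Inner);
}

// Flattened form with three already-folded masks, one per source value.
constexpr uint32_t flattenedBfi(uint32_t X1, uint32_t M1, uint32_t X2,
                                uint32_t M2, uint32_t X3, uint32_t M3) {
  return (X1 & M1) | (X2 & M2) | (X3 & M3);
}

// True if the three masks are pairwise disjoint and together cover all
// 32 bits, i.e. they partition -1.
constexpr bool isDisjointPartition(uint32_t A, uint32_t B, uint32_t C) {
  return (A & B) == 0 && (A & C) == 0 && (B & C) == 0 &&
         (A | B | C) == 0xFFFFFFFFu;
}

// With C1 = 0xFF and C2 = 0xFF00, the folded masks are
// C1, C2 & ~C1 = 0xFF00 and ~C1 & ~C2 = 0xFFFF0000.
static_assert(isDisjointPartition(0x000000FFu, 0x0000FF00u, 0xFFFF0000u),
              "folded masks partition all 32 bits");
```

When the check holds, each bit of the result comes from exactly one of
X1, X2 and X3, which is what makes it safe to rebuild the nested form.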
"SelectBFI" (no underscore) or "SelectV_BFI" for consistency.