If we are just modifying a single bit at a variable bit position, we can use the BT* instructions (BTS/BTR/BTC) to make the change instead of shifting a 1 (or rotating a -2, i.e. a mask with only bit 0 clear) and doing a binop. These instructions also ignore the upper bits of their index input, so we can also remove an AND if one is present on the index.
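For reference, a rough sketch of the source-level shapes this is aimed at, written for the 32-bit case. The function names are made up for illustration and are not taken from the patch or its tests. Whether a given compiler version actually emits BTS/BTR/BTC for these depends on the target and optimization settings, so treat the comments as the intent rather than a guarantee.

```c
#include <stdint.h>

/* Set bit n: shift a 1 and OR it in (candidate for BTS).
   The "& 31" is the kind of index mask that becomes redundant, because the
   register forms of BT* use the index modulo the operand size. */
uint32_t set_bit(uint32_t x, uint32_t n) {
    return x | (UINT32_C(1) << (n & 31));
}

/* Clear bit n: AND with the inverted shifted 1 (candidate for BTR). */
uint32_t clear_bit(uint32_t x, uint32_t n) {
    return x & ~(UINT32_C(1) << (n & 31));
}

/* Clear bit n, written as a rotate of ~1 (all ones except bit 0).
   Same result as clear_bit, just the rotate-based spelling of the mask. */
uint32_t clear_bit_rot(uint32_t x, uint32_t n) {
    uint32_t m = ~UINT32_C(1);
    n &= 31;
    /* Rotate left by n; the second shift count is masked to avoid a
       shift by 32 when n is 0. */
    uint32_t mask = (m << n) | (m >> ((32 - n) & 31));
    return x & mask;
}

/* Toggle bit n: XOR with the shifted 1 (candidate for BTC). */
uint32_t toggle_bit(uint32_t x, uint32_t n) {
    return x ^ (UINT32_C(1) << (n & 31));
}
```

Dropping the explicit `& 31` in these helpers would make out-of-range shifts undefined in C, which is why the mask tends to show up in real code and why folding it into the instruction is worthwhile.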
I'll see if I can spread some multiclass goodness on the td file to reduce the repetition.
Fixes PR37938
It doesn't look like the 16-bit cases matched?