For narrow stores (e.g., strb, strh), the upper bits of the source register are never written to memory, so their value does not matter. In some cases we can use this information to eliminate unnecessary instructions.
For example, without this patch we generate (from the 2nd test case):
  ldr w8, [x0]
  and w8, w8, #0xfff0
  bfxil w8, w2, #16, #4
  strh w8, [x1]
and after the patch the 'and' is removed:
  ldr w8, [x0]
  bfxil w8, w2, #16, #4
  strh w8, [x1]
  ret
During the lowering of the bitfield insert instruction, the 'and' is eliminated. The mask #0xfff0 clears the upper 16 bits and the lower 4 bits of w8: the upper 16 bits are unused because strh stores only the low halfword, and the lower 4 bits are overwritten by the insert itself. Therefore, the 'and' is unnecessary.
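As a sanity check on the reasoning above, here is a minimal C++ model of the two sequences. It is only a sketch and is not part of the patch: the helper names (bfxil, strh, storedWithAnd, storedWithoutAnd) are made up for illustration, and the model assumes the instruction semantics described in this message.

```cpp
#include <cstdint>

// Illustrative model of `bfxil wd, wn, #lsb, #width`: copy `width` bits of
// `src`, starting at bit `lsb`, into the low `width` bits of `dst`; all other
// bits of `dst` are left unchanged. Assumes 0 < width < 32.
inline uint32_t bfxil(uint32_t dst, uint32_t src, unsigned lsb, unsigned width) {
  const uint32_t mask = (1u << width) - 1u;
  return (dst & ~mask) | ((src >> lsb) & mask);
}

// Illustrative model of `strh`: only the low 16 bits of the register are stored.
inline uint16_t strh(uint32_t reg) { return static_cast<uint16_t>(reg); }

// Sequence generated before the patch: and, bfxil, strh.
inline uint16_t storedWithAnd(uint32_t loaded, uint32_t w2) {
  return strh(bfxil(loaded & 0xfff0u, w2, 16, 4));
}

// Sequence generated after the patch: bfxil, strh.
inline uint16_t storedWithoutAnd(uint32_t loaded, uint32_t w2) {
  return strh(bfxil(loaded, w2, 16, 4));
}
```

In this model the two functions return the same halfword for any inputs: the 'and' only changes bits 16-31 (dropped by strh) and bits 0-3 (replaced by bfxil).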
Please take a look,
Chad