In {test_bfi_not_orr, test_bfxil_not_orr}, bfi/bfxil are better since they simplify away two instructions (the bits are extracted into the destination directly).
In {test_orr_not_bfi, test_orr_not_bfxil}, orr is better since:
- both orr and bfm (the instruction that bfi/bfxil are aliases of) would simplify away one instruction (the shl node);
- orr has higher throughput and shorter latency than bfm.
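
For readers less familiar with the bitfield instructions, the following is a minimal Python sketch of what bfi and bfxil compute on 64-bit registers. It only models the architectural semantics; the function names, the 64-bit width, and the sample values are assumptions for illustration and are not taken from the tests named above.

```python
MASK64 = (1 << 64) - 1  # assumption: model 64-bit (X) registers


def bfi(rd, rn, lsb, width):
    """Model of BFI: insert the low `width` bits of rn into rd at bit `lsb`.

    All other bits of rd are left unchanged.
    """
    field = (1 << width) - 1
    cleared = rd & ~(field << lsb)        # clear the destination field
    inserted = (rn & field) << lsb        # take low bits of rn, move into place
    return (cleared | inserted) & MASK64


def bfxil(rd, rn, lsb, width):
    """Model of BFXIL: extract `width` bits of rn starting at `lsb`
    and insert them into the low bits of rd.

    All other bits of rd are left unchanged.
    """
    field = (1 << width) - 1
    cleared = rd & ~field                 # clear the low destination field
    extracted = (rn >> lsb) & field       # pull the field out of rn
    return (cleared | extracted) & MASK64


if __name__ == "__main__":
    print(hex(bfi(0xFFFF_FFFF_FFFF_FFFF, 0xAB, 8, 8)))
    print(hex(bfxil(0xFFFF_FFFF_FFFF_FF00, 0x1234, 4, 8)))
```

Each function body is the and/shift/or expression that a single bfm-family instruction replaces. When only the shift can be folded, a shifted-operand orr covers the same amount of work, which is the situation the second pair of tests describes.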