ADDI may achieve better code size than XORI because it can be compressed to C.ADDI, whereas XORI has no compressed (C extension) form.
This patch transforms two patterns, seteq and setne against an immediate, and gets almost equivalent results.
Repository: rG LLVM Github Monorepo
Event Timeline
The case comes from Dhrystone (dhry_2.c):
    Boolean Func_3 (Enum_Par_Val)
    /***************************/
    /* executed once */
    /* Enum_Par_Val == Ident_3 */
    Enumeration Enum_Par_Val;
    {
      Enumeration Enum_Loc;

      Enum_Loc = Enum_Par_Val;
      if (Enum_Loc == Ident_3)
        /* then, executed */
        return (true);
      else /* not executed */
        return (false);
    } /* Func_3 */
I found that GCC produces smaller code (c.addi + seqz) than LLVM (xori + seqz), so this patch tries to implement the same optimization.
This seems like a fun issue:
- addi is compressible
- xor is almost certainly easier to analyse (from the view of KnownBits and the like).
Have you seen any regressions in code generation from this change?
No, there are no regressions. When a register is compared with an immediate for equality or inequality, the seteq/setne pattern only needs a value that is zero exactly when the two are equal. For that purpose, xori with the immediate and addi with the negated immediate are equivalent.
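A minimal sketch of that equivalence argument, for illustration only. The helper names (eq_via_xor, eq_via_add, forms_agree) are hypothetical and not part of the patch, a 32-bit register is modelled with uint32_t, and the immediate range [-2047, 2048] is an assumption about which immediates have a negation that fits a 12-bit signed field.

```c
#include <stdint.h>

/* Hypothetical model of:  xori t, x, imm ; seqz r, t */
int eq_via_xor(uint32_t x, int32_t imm) {
    uint32_t t = x ^ (uint32_t)imm;
    return t == 0;
}

/* Hypothetical model of:  addi t, x, -imm ; seqz r, t */
int eq_via_add(uint32_t x, int32_t imm) {
    uint32_t t = x + (uint32_t)(-imm);  /* unsigned add wraps modulo 2^32 */
    return t == 0;
}

/* Returns 1 if both forms agree with a plain comparison for every
 * immediate in the assumed range and for x values around each immediate. */
int forms_agree(void) {
    for (int32_t imm = -2047; imm <= 2048; ++imm) {
        for (int32_t d = -3; d <= 3; ++d) {
            uint32_t x = (uint32_t)imm + (uint32_t)d;
            int expected = (x == (uint32_t)imm);
            if (eq_via_xor(x, imm) != expected) return 0;
            if (eq_via_add(x, imm) != expected) return 0;
        }
    }
    return 1;
}
```

Both intermediate values are zero exactly when x equals imm. They differ for unequal inputs, which is why the two forms are interchangeable only when the result feeds a zero test.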
Well spotted, this seems like a good change to me. I wonder if there are other optimization opportunities to use this NegImm/simm12_plus1 pattern in further patches?
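For readers unfamiliar with the operand name, here is a small sketch of the range it appears to describe. It assumes, from the name, that simm12_plus1 accepts exactly those immediates whose negation fits a signed 12-bit field; the function names are hypothetical, and the TableGen definition in the patch is the authority.

```c
#include <stdint.h>

/* Signed 12-bit immediate as used by ADDI/XORI: [-2048, 2047]. */
int is_simm12(int64_t imm) {
    return imm >= -2048 && imm <= 2047;
}

/* Assumed meaning of simm12_plus1: [-2047, 2048], i.e. the set of
 * immediates imm for which -imm is a valid simm12. */
int is_simm12_plus1(int64_t imm) {
    return imm >= -2047 && imm <= 2048;
}
```

Under this reading, 2048 is accepted because addi can encode -2048, while -2048 is rejected because +2048 does not fit.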
That's a fair comment, though I'd hope the situation where the result of the xor in this pattern is used for anything else later on would be extremely rare/impossible?
Ah, yes, that does make sense, given the addi is generated with one use.
I'm happy for this to land now!