This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Optimize seteq/setne pattern expansions for better code size
ClosedPublic

Authored by wwei on Dec 20 2019, 9:10 AM.

Details

Summary

ADDI (C.ADDI) may achieve better code size than XORI, since XORI has no compressed (C extension) encoding.
This patch transforms the two patterns (seteq and setne) to use ADDI and gets almost equivalent results.

Event Timeline

wwei created this revision. Dec 20 2019, 9:10 AM
wwei added a comment. Dec 20 2019, 9:19 AM

The case was from dhrystone(dhry_2.c):

Boolean Func_3 (Enum_Par_Val)
/***************************/
    /* executed once        */
    /* Enum_Par_Val == Ident_3 */

Enumeration Enum_Par_Val;
{
  Enumeration Enum_Loc;
  Enum_Loc = Enum_Par_Val;
  if (Enum_Loc == Ident_3)
    /* then, executed */
    return (true);
  else /* not executed */
    return (false);
} /* Func_3 */

I found that gcc produces smaller code (c.addi + seqz) than llvm (xori + seqz), so this patch tries to implement the same optimization.
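To make the size difference concrete, here is a rough C sketch that counts the encoded bytes of the two sequences. It assumes the usual RVC constraints for c.addi (destination equal to source, not x0, non-zero immediate in [-32, 31]) and that seqz and xori always take 4 bytes. The helper names `fits_c_addi`, `seteq_size_addi` and `seteq_size_xori` are hypothetical and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can `addi rd, rs1, imm` be emitted as 16-bit c.addi?
 * Assumes the RVC rules: rd == rs1, rd != x0, imm non-zero and in [-32, 31]. */
static bool fits_c_addi(unsigned rd, unsigned rs1, int32_t imm) {
    return rd == rs1 && rd != 0 && imm != 0 && imm >= -32 && imm <= 31;
}

/* Bytes for `x == imm` lowered as `addi rd, rs1, -imm` followed by `seqz`.
 * seqz (sltiu rd, rs, 1) is assumed to have no compressed form: 4 bytes. */
static unsigned seteq_size_addi(unsigned rd, unsigned rs1, int32_t imm) {
    unsigned addi_bytes = fits_c_addi(rd, rs1, -imm) ? 2u : 4u;
    return addi_bytes + 4u;
}

/* Bytes for `x == imm` lowered as `xori` followed by `seqz`.
 * xori is assumed to have no compressed form, so this is always 4 + 4. */
static unsigned seteq_size_xori(void) {
    return 4u + 4u;
}
```

In Dhrystone, Ident_3 is the third enumerator, so its value is 2. Under the assumptions above, `seteq_size_addi(10, 10, 2)` gives 6 bytes for the Func_3 comparison when it is computed in place in a0 (x10), against 8 bytes for the xori form.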

Jim added a subscriber: Jim. Dec 23 2019, 10:08 PM

This seems like a fun issue:

  • addi is compressible
  • xor is almost certainly easier to analyse (from the view of KnownBits and the like).

Have you seen any regressions in code generation from this change?

wwei added a comment. Jan 30 2020, 1:14 AM

> This seems like a fun issue:
>
>   • addi is compressible
>   • xor is almost certainly easier to analyse (from the view of KnownBits and the like).
>
> Have you seen any regressions in code generation from this change?

No, there are no regressions. When a register is compared with an immediate for equality or inequality, using xori with the immediate or addi with the negated immediate in the seteq/setne pattern is equivalent: in both cases the result is 0 exactly when the register equals the immediate, and non-zero otherwise.
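The equivalence can be checked with a small C model of the two lowerings on RV32. This is only a sketch: the function names are hypothetical, and the range used for simm12_plus1 (immediates whose negation fits in a signed 12-bit field, i.e. [-2047, 2048]) is an assumption about what that operand class means, not something stated in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models `xori t, x, imm; seqz r, t`: r is 1 iff x ^ imm == 0. */
static bool seteq_via_xori(uint32_t x, int32_t imm) {
    uint32_t t = x ^ (uint32_t)imm;
    return t == 0;
}

/* Models `addi t, x, -imm; seqz r, t`: the add wraps modulo 2^32,
 * so t is 0 iff x equals imm (sign-extended to 32 bits). */
static bool seteq_via_addi(uint32_t x, int32_t imm) {
    uint32_t t = x + (uint32_t)(-imm);
    return t == 0;
}

/* Assumed range: -imm must be encodable as a signed 12-bit immediate. */
static bool is_simm12_plus1(int32_t imm) {
    return imm >= -2047 && imm <= 2048;
}

/* Compares both models for every immediate in the assumed range,
 * on a few boundary inputs plus the one input that matches imm. */
static bool lowerings_agree(void) {
    static const uint32_t samples[] = {
        0u, 1u, 2u, 2047u, 2048u, 0x7fffffffu, 0x80000000u, 0xffffffffu
    };
    for (int32_t imm = -2047; imm <= 2048; ++imm) {
        uint32_t match = (uint32_t)imm;
        if (seteq_via_xori(match, imm) != seteq_via_addi(match, imm))
            return false;
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
            if (seteq_via_xori(samples[i], imm) != seteq_via_addi(samples[i], imm))
                return false;
        }
    }
    return true;
}
```

The intermediate values of the two sequences differ in general (x ^ imm is not x - imm), which is why the rewrite is only safe when the intermediate feeds nothing but the zero test.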

Well spotted, this seems like a good change to me. I wonder if there are other optimization opportunities to use this NegImm/simm12_plus1 pattern in further patches?

> This seems like a fun issue:
>
>   • addi is compressible
>   • xor is almost certainly easier to analyse (from the view of KnownBits and the like).
>
> Have you seen any regressions in code generation from this change?

That's a fair comment, though I'd hope the situation where the result of the xor in this pattern is used for anything else later on would be extremely rare/impossible?

lenary accepted this revision. Feb 4 2020, 5:51 AM

> > Have you seen any regressions in code generation from this change?
>
> That's a fair comment, though I'd hope the situation where the result of the xor in this pattern is used for anything else later on would be extremely rare/impossible?

Ah, yes, that does make sense, given the addi is generated with one use.

I'm happy for this to land now!

This revision is now accepted and ready to land. Feb 4 2020, 5:51 AM
This revision was automatically updated to reflect the committed changes.