This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Lower GREVI and GORCI as custom nodes
ClosedPublic

Authored by frasercrmck on Nov 11 2020, 6:57 AM.

Details

Summary

This moves the recognition of GREVI and GORCI from TableGen patterns
into a DAGCombine. This is done primarily to match "deeper" patterns in
the future, like (grevi (grevi x, 1) 2) -> (grevi x, 3).

TableGen is not best suited to matching patterns such as these as the compile
time of the DAG matchers quickly gets out of hand due to the expansion of
commutative permutations.

Diff Detail

Event Timeline

frasercrmck created this revision.Nov 11 2020, 6:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2020, 6:57 AM
frasercrmck requested review of this revision.Nov 11 2020, 6:57 AM
craig.topper added inline comments.Nov 12 2020, 8:57 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1138

Could we match i32 pre-type legalization on RV64 and type legalize i32 RISCVISD::GREVI to GREVIW? I think that would remove the need to match from SIGN_EXTEND_INREG.

1145

Since this is expected to be a constant, can we use getTargetConstant here and use timm instead of imm in the isel patterns?

1282

Can this just be "return GREV;"?

2814

General question for the RISCV backend, would it make sense to use a macro to generate these cases like X86 and AMDGPU do

#define NODE_NAME_CASE(NODE) case RISCVISD::NODE: return "RISCVISD::" #NODE;

This would eliminate having to mention the node name twice. On X86 I found a couple typos in the strings so having them automatically created can remove human error there.

frasercrmck added inline comments.Nov 12 2020, 9:29 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1138

Good idea; I'll give it a go.

1145

Sounds good to me.

1282

Yes, I think so. Thanks.

2814

I'm glad you brought it up as I'm all in favour of that approach. I can do that as a separate patch to gather feedback.

Address Craig's feedback except:

  • using macros; that can be a separate patch
  • matching 'W before' legalization; will experiment with that
frasercrmck marked 2 inline comments as done.Nov 12 2020, 9:31 AM
craig.topper added inline comments.Nov 12 2020, 10:50 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1153

Is it possible that the input is (or (or (BITMANIP_SHL x), (BITMANIP_SHR x)), x)?

frasercrmck added inline comments.Nov 13 2020, 3:56 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1138

This idea mostly works but is failing for two RV64 test cases: i32 fshl and fshr intrinsics with constant 16.

fshl is lowered to sign-extended i32 rotl instructions and dag-combined to a (sext_inreg (i64 (or (shl x, 16), (srl (and x, 0xffff0000), 16)))) pattern, skipping the 32-bit or combine. That's something the sext combine was getting but not something the or combine is. Similar for fshr. I think we need the sext in there to be able to match this as a greviw.

Combining certain rotl (and rotr, bswap and bitreverse) to grevi would catch this and would unify everything under one node (something I was mulling about anyway as it'd provide more opportunity for grev combining) but it might also be heavy-handed.

1153

Yeah. It's currently generating this on RV32 master:

  %shl = shl i32 %a, 16
  %shr = lshr i32 %a, 16
  %or = or i32 %shr, %shl
  %or2 = or i32 %or, %a
  ret i32 %or2

-->
  rev16   a1, a0
  or      a0, a1, a0

In that case it's essentially or (grevi x), x) so could in theory be picked up like that rather than fully exploring all potential combinations of (or (or x, y), (or z, w)) inputs.

Sadly though LLVM is picking up that shl/srl as rotl, and still being TableGen pattern matched to grev.

That relates to my other comment, that we might need to dive further into rotl, rotr, bswap and bitreverse to find more opportunities.

craig.topper added inline comments.Nov 13 2020, 8:16 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1153

Can you post your patch to a separate review? I'd like to experiment with it.

craig.topper added inline comments.Nov 13 2020, 9:54 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1138

This idea mostly works but is failing for two RV64 test cases: i32 fshl and fshr intrinsics with constant 16.

fshl is lowered to sign-extended i32 rotl instructions and dag-combined to a (sext_inreg (i64 (or (shl x, 16), (srl (and x, 0xffff0000), 16)))) pattern, skipping the 32-bit or combine. That's something the sext combine was getting but not something the or combine is. Similar for fshr. I think we need the sext in there to be able to match this as a greviw.

Combining certain rotl (and rotr, bswap and bitreverse) to grevi would catch this and would unify everything under one node (something I was mulling about anyway as it'd provide more opportunity for grev combining) but it might also be heavy-handed.

I'm not sure that we should be using GREVI for cases that can be handled with ROLW/RORW. I could see a CPU having a better performing implementation of rotate than grevi.

frasercrmck added inline comments.Nov 13 2020, 10:11 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1153

In the absence of time I just sent you an email with the wip patch

frasercrmck added inline comments.Nov 16 2020, 7:45 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1138

Fair enough. Do you think D91449 and D91479 address this problem by catching fshl, fshr, rotl and rotr earlier?

1153

My understanding of what you've done in D91449 would be that it would help us more easily find (or (rorw x), x) and generate a gorc. I think what you've done in removing the reliance on SIGN_EXTEND_INREG is taking our support in a positive direction.

craig.topper added inline comments.Nov 16 2020, 10:52 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1153

I think that would still require x to provably have 33 sign bits so that the or propagates the 33 sign bits from rorw. So it might be better to match it while it is still i32 rotr before type legalization.

We'll probably need to match (or (bswap x), x) and (or (bitreverse x), x) before type legalization too.

  • Rebase
  • Catch W before legalization; legalize to W
  • Catch more GORCs
  • Adjust fshl/fshr tests expectations
frasercrmck marked an inline comment as done.Nov 18 2020, 8:26 AM
This revision is now accepted and ready to land.Nov 18 2020, 10:41 AM

rebase & fix formatting

This revision was landed with ongoing or failed builds.Nov 19 2020, 10:17 AM
This revision was automatically updated to reflect the committed changes.