This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Teach isel to select ADDW/SUBW/MULW/SLLIW when only the lower 32-bits are used.
ClosedPublic

Authored by craig.topper on Aug 6 2021, 10:08 AM.

Details

Summary

We normally select these when the root node is a sext_inreg, but
SimplifyDemandedBits can sometimes bypass the sext_inreg for some
users. This can create situation where sext_inreg+add/sub/mul/shl
is selected to a W instruction, and then the add/sub/mul/shl is
separately selected to a non-W instruction with the same inputs.

This patch tries to detect when it would still be ok to use a W
instruction without the sext_inreg by checking the direct users.
This can allow the W instruction to CSE with any created for a
sext_inreg+add/sub/mul/shl. To minimize complexity and cost of
checking, we make no attempt to determine if the CSE will happen
and just always use a W instruction when we can.

Diff Detail

Event Timeline

craig.topper created this revision.Aug 6 2021, 10:08 AM
craig.topper requested review of this revision.Aug 6 2021, 10:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 6 2021, 10:08 AM
Herald added a subscriber: MaskRay. · View Herald Transcript

Fix subw pattern. Add mulw pattern.

Update one test I missed.

Add SLLI with shift amount >= 32 to the list of W user instructions.

Make hasAllWUsers more generic so it can be used for HUsers or BUsers as well.

What's the reasoning behind the current set of opcodes? E.g. are there cases where div[u]w/rem[u]w or sra[i]w/srl[i]w are worth using?

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
61

NBit with N being the SDNode is slightly confusing at first glance, though obvious once you stop and think. Unfortunate collision.

llvm/lib/Target/RISCV/RISCVInstrInfo.td
1258

You could have a class for this so you just def addw : PatFragFoo<add>; etc and avoid the repetition

1294

Nit: 32-bit but 32 bits

1296

PatGprGpr/PatGprSimm12?

What's the reasoning behind the current set of opcodes? E.g. are there cases where div[u]w/rem[u]w or sra[i]w/srl[i]w are worth using?

These are the only opcodes that are type legalized by any_extend because the result in the lower 32 bits doesn't depend on the upper 32 bits after promotion. The other binary operators use RISCVISD::*W ISD opcodes or require a sign_extend_inreg/zero_extend_inreg because the upper 32 bits can effect the lower 32 bits. These are also the only opcodes used in an isel pattern that starts with sext_inreg.

I'm probably going to explain the rest of this poorly, but I'll try.

div[u]w/remuw are converted to their own special ISD opcode during type legalization. I believe there was a subtle issue with pattern matching ISD::DIV/UDIV/UREM and zext_inreg/sext_inreg to the W instructions. We can only pattern match ISD::SREM+sext_inreg to REMW.

sraiw/srliw are type legalized by inserting a sext_inreg or zext_inreg. We select SRAIW/SRLIW based on the inputs being a sext_inreg/zext_inreg(or the equivalent computeKnownBits/computeNumSignBits) and the immediate fitting in 5 bits. We can't select SRAIW/SRLIW based on how the result is used. We need to know that sign bits or 0 are supposed to be shifted into bit 31.

sraw/srlw are type legalized using their own RISCVISD::SRAW/SRLW nodes. Since we can't see the shift amount, we need these ISD opcodes in order to know that it was UB for the shift amount to be more than 5 bits. Default type legalization would insert a sext_inreg/zext_inreg, but we can't pattern match that if we don't know sure that it came from a type legalized i32 shift and an i64 shift that happened to have a sext_inreg/zext_inreg input. If it came from an i64 shift we might incorrectly ignore bit 5 of the shift amount by turning it into a W instruction.

Address review comments

craig.topper marked 4 inline comments as done.Aug 17 2021, 7:07 PM

What's the reasoning behind the current set of opcodes? E.g. are there cases where div[u]w/rem[u]w or sra[i]w/srl[i]w are worth using?

These are the only opcodes that are type legalized by any_extend because the result in the lower 32 bits doesn't depend on the upper 32 bits after promotion. The other binary operators use RISCVISD::*W ISD opcodes or require a sign_extend_inreg/zero_extend_inreg because the upper 32 bits can effect the lower 32 bits. These are also the only opcodes used in an isel pattern that starts with sext_inreg.

I'm probably going to explain the rest of this poorly, but I'll try.

div[u]w/remuw are converted to their own special ISD opcode during type legalization. I believe there was a subtle issue with pattern matching ISD::DIV/UDIV/UREM and zext_inreg/sext_inreg to the W instructions. We can only pattern match ISD::SREM+sext_inreg to REMW.

sraiw/srliw are type legalized by inserting a sext_inreg or zext_inreg. We select SRAIW/SRLIW based on the inputs being a sext_inreg/zext_inreg(or the equivalent computeKnownBits/computeNumSignBits) and the immediate fitting in 5 bits. We can't select SRAIW/SRLIW based on how the result is used. We need to know that sign bits or 0 are supposed to be shifted into bit 31.

sraw/srlw are type legalized using their own RISCVISD::SRAW/SRLW nodes. Since we can't see the shift amount, we need these ISD opcodes in order to know that it was UB for the shift amount to be more than 5 bits. Default type legalization would insert a sext_inreg/zext_inreg, but we can't pattern match that if we don't know sure that it came from a type legalized i32 shift and an i64 shift that happened to have a sext_inreg/zext_inreg input. If it came from an i64 shift we might incorrectly ignore bit 5 of the shift amount by turning it into a W instruction.

Thanks, that makes sense.

llvm/lib/Target/RISCV/RISCVInstrInfoM.td
74

You can use your new class here too

luismarques added inline comments.Aug 18 2021, 5:12 AM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
1566–1574

Are these Bits inequality comparisons correct?

jrtc27 added inline comments.Aug 18 2021, 5:27 AM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
1552

This != 32 is a bit weird, would expect it to be at least < 32. But I think you can instead do something more general (and I think also clearer) like:

if (Bits < XLen - cast<ConstantSDNode>(User->getOperand(1))->getZExtValue())
  return false;

(don't know if we need to care about SLLI with an immediate >= XLen; it's an illegal instruction so presumably shouldn't come out of *CodeGen* no matter what the input is?)

1566–1574

Hm, indeed, I believe this one is wrong, and the others are correct? (Bits being how many bits you want to truncate the value to).

This just happens to work because x > 32 and x < 32 behave identically when you only use x=32.

Address review comments

This revision is now accepted and ready to land.Aug 18 2021, 10:13 AM
This revision was landed with ongoing or failed builds.Aug 18 2021, 10:23 AM
This revision was automatically updated to reflect the committed changes.
llvm/lib/Target/RISCV/RISCVInstrInfo.td
1259

I am wondering could it be applied to more complicated condition with 3 operands or 1 operand? For example, mulaw, or unary operator.

craig.topper added inline comments.Sep 21 2021, 11:48 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.td
1259

We would need a different PatFrag for different number of operates, but the underlying helper function hasAllWUsers can be shared.