Details

Reviewers

reames
luismarques
asb
frasercrmck

Commits

rG4186a49d793e: [RISCV] Custom type legalize i32 loads by sign extending.

Summary

The default is to use extload which can become a zextload or
sextload if it is followed by an 'and' or sext_inreg.

Sometimes type legalization will introduce an 'and' from promoting
something like 'srl X, C' and a sext_inreg from a setcc. The
'and' could be freely folded with the promoted 'srl' by using srliw,
but the sext_inreg can't be folded into a compare. DAG combiner
will see both of these choices and may decide to fold the 'and'
instead of the 'sext_inreg'. This forces the sext_inreg to become
a sext.w.

By picking sextload in the type legalizer we take this choice away.
Looking at spec2006 compiled with Zba and Zbb this appeared to be
a net reduction in lines of code in the objdump disassembly output.
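To illustrate why the sext_inreg feeding a compare matters, here is a minimal C++ sketch (not part of the patch). RV64 compares and branches operate on full 64-bit registers, so an i32 compare is only correct if both operands are sign-extended. The helper names `sextload32`, `zextload32`, `slt64` and `slt32` are hypothetical models introduced for this illustration, not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical model of RV64 `lw`: load 32 bits and sign-extend to 64 bits.
constexpr int64_t sextload32(uint32_t mem) {
  return static_cast<int64_t>(static_cast<int32_t>(mem));
}

// Hypothetical model of RV64 `lwu`: load 32 bits and zero-extend to 64 bits.
constexpr int64_t zextload32(uint32_t mem) {
  return static_cast<int64_t>(static_cast<uint64_t>(mem));
}

// 64-bit signed compare, as a full-register `slt` would perform it.
constexpr bool slt64(int64_t a, int64_t b) { return a < b; }

// Reference result: the i32 signed compare the source program asked for.
constexpr bool slt32(uint32_t a, uint32_t b) {
  return static_cast<int32_t>(a) < static_cast<int32_t>(b);
}

// With sign-extended loads the 64-bit compare matches the i32 compare.
static_assert(slt64(sextload32(0xFFFFFFFFu), sextload32(1u)) ==
                  slt32(0xFFFFFFFFu, 1u),
              "sign-extended operands compare correctly");

// With zero-extended loads it does not: -1 is seen as 4294967295.
static_assert(slt64(zextload32(0xFFFFFFFFu), zextload32(1u)) !=
                  slt32(0xFFFFFFFFu, 1u),
              "zero-extended operands need an extra sext.w");
```

In this model, a load that is already sign-extended can feed the compare directly, which is the situation the sextload choice is meant to favor.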

This is similar to what we do with i32 add/sub/mul/shl in
type legalization where we always emit a sext_inreg.

There's some followup improvements we could do. For example, folding
(and (sextload X), 0xffffffff) to (zextload X) if the 'and' is the
only user.
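The follow-up fold mentioned above rests on a simple identity, sketched here in standalone C++ under the assumption that a sextload behaves like `lw` and a zextload like `lwu`. The function names are hypothetical models for illustration only; the actual combine would live in the DAG combiner and is not shown.

```cpp
#include <cstdint>

// Hypothetical model of (sextload X): 32-bit value sign-extended to 64 bits.
constexpr uint64_t sextload32(uint32_t mem) {
  return static_cast<uint64_t>(
      static_cast<int64_t>(static_cast<int32_t>(mem)));
}

// Hypothetical model of (zextload X): 32-bit value zero-extended to 64 bits.
constexpr uint64_t zextload32(uint32_t mem) { return mem; }

// (and (sextload X), 0xffffffff) == (zextload X) for a given memory value.
constexpr bool foldHolds(uint32_t mem) {
  return (sextload32(mem) & 0xffffffffULL) == zextload32(mem);
}

static_assert(foldHolds(0u), "zero");
static_assert(foldHolds(0x7fffffffu), "largest positive i32");
static_assert(foldHolds(0x80000000u), "most negative i32");
static_assert(foldHolds(0xffffffffu), "all ones");
```

The identity only justifies the fold when the 'and' is the sole user of the load, as the summary notes; any other user would still expect the sign-extended value.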

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

craig.topper created this revision. · Jul 22 2022, 1:56 PM

Herald added a project: Restricted Project. · Jul 22 2022, 1:56 PM

Herald added subscribers: sunshaoce, VincentWu, luke957 and 27 others.

craig.topper requested review of this revision. · Jul 22 2022, 1:56 PM

Herald added subscribers: pcwang-thead, eopXD, MaskRay.

Harbormaster completed remote builds in B177109: Diff 446974. · Jul 22 2022, 3:05 PM

I wonder if this should be a target hook? I accept that adding a hook for something that only RISC-V uses isn't necessarily an obvious win. But it also doesn't seem ideal for us to accrete too many of these heuristic workarounds within the backend. (not sure I have a super strong opinion either way - just raising for discussion).

In D130397#3690597, @asb wrote:

I wonder if this should be a target hook? I accept that adding a hook for something that only RISC-V uses isn't necessarily an obvious win. But it also doesn't seem ideal for us to accrete too many of these heuristic workarounds within the backend. (not sure I have a super strong opinion either way - just raising for discussion).

This doesn't feel very different than what we do for ADD/SUB/MUL/SHL so I'm not sure it's worth a target hook right now.

@asb have you checked gcc torture suite with this?

Yes, from a quick check the overall impact is positive (550 files changed, 9764 insertions(+), 9810 deletions(-)); that's across rv64imafdc {lp64,lp64d} {O0,O1,O2,O3,Os,Oz}. But there are some cases, e.g. va-arg-24.s, where some LWU get converted to LW + SLLI + SRLI.

Some cases where LWU + SRLI => LW + SRLIW which I suppose might have a tiny code-size impact. LWU is never compressible, SRLI may be. LW may be compressible, SRLIW never is. Doesn't feel like a big deal either way.
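As a rough check of why LWU + SRLI and LW + SRLIW are interchangeable, the following C++ sketch models the four instructions on 64-bit registers. The function names mirror the mnemonics but are hypothetical models written for this illustration. It assumes a nonzero shift amount below 32, which matches the `srli a0, a0, 16` to `srliw a0, a0, 16` change seen in the tests.

```cpp
#include <cstdint>

// Sign-extend the low 32 bits of a value to 64 bits.
constexpr uint64_t sext32(uint32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(v)));
}

// lw: sign-extending 32-bit load. lwu: zero-extending 32-bit load.
constexpr uint64_t lw(uint32_t mem) { return sext32(mem); }
constexpr uint64_t lwu(uint32_t mem) { return mem; }

// srli: logical right shift of the full 64-bit register.
constexpr uint64_t srli(uint64_t rs1, unsigned shamt) { return rs1 >> shamt; }

// srliw: logical right shift of the low 32 bits, result sign-extended.
constexpr uint64_t srliw(uint64_t rs1, unsigned shamt) {
  return sext32(static_cast<uint32_t>(rs1) >> shamt);
}

// For shift amounts 1..31 the two sequences agree, even for negative inputs.
static_assert(srliw(lw(0x80000000u), 16) == srli(lwu(0x80000000u), 16),
              "LW + SRLIW matches LWU + SRLI");
static_assert(srliw(lw(0xffffffffu), 1) == srli(lwu(0xffffffffu), 1),
              "LW + SRLIW matches LWU + SRLI");

// A shift amount of 0 is the exception: srliw then acts like sext.w.
static_assert(srliw(lw(0x80000000u), 0) != srli(lwu(0x80000000u), 0),
              "shift by zero differs");
```

The sketch says nothing about encoding size; the compressibility trade-off described above is separate from this functional equivalence.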

In D130397#3694200, @asb wrote:

Yes, from a quick check the overall impact is positive (550 files changed, 9764 insertions(+), 9810 deletions(-)); that's across rv64imafdc {lp64,lp64d} {O0,O1,O2,O3,Os,Oz}. But there are some cases, e.g. va-arg-24.s, where some LWU get converted to LW + SLLI + SRLI.

Some cases where LWU + SRLI => LW + SRLIW which I suppose might have a tiny code-size impact. LWU is never compressible, SRLI may be. LW may be compressible, SRLIW never is. Doesn't feel like a big deal either way.

Thanks Alex! I was going to look at adding a DAG combine to turn sextload+and into zextload, which might fix the LW + SLLI + SRLI case and maybe the LW + SRLIW cases.

Add a small DAGCombiner change that recovers va-arg-24.c. I'll write a test
before I commit this. Wanted to get this out for further evaluation on other
tests Alex saw.

We might consider supporting isTruncateFree true for i64->i32 with RV64.
W instructions make it free in a lot of cases.
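A small C++ sketch of why W instructions can make an i64->i32 truncate free: a W-form operation reads only the low 32 bits of its inputs, so whatever sits in the upper half of the register does not affect the result. `addw` below is a hypothetical model of the instruction's arithmetic written for this illustration, not LLVM code, and it does not model the `isTruncateFree` hook itself.

```cpp
#include <cstdint>

// Sign-extend the low 32 bits of a value to 64 bits.
constexpr uint64_t sext32(uint32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(v)));
}

// addw: add the low 32 bits of both registers, sign-extend the 32-bit sum.
constexpr uint64_t addw(uint64_t rs1, uint64_t rs2) {
  return sext32(static_cast<uint32_t>(rs1) + static_cast<uint32_t>(rs2));
}

// The upper 32 bits of an input are ignored, so no explicit truncate is
// needed before the operation.
static_assert(addw(0x1234567800000001ULL, 2) == addw(1, 2),
              "upper bits of rs1 do not matter");
static_assert(addw(1, 0xffffffff00000002ULL) == 3,
              "upper bits of rs2 do not matter");

// The result is always a sign-extended 32-bit value.
static_assert(addw(0xaaaaaaaa7fffffffULL, 1) == 0xffffffff80000000ULL,
              "32-bit overflow wraps and sign-extends");
```

Operations without a W form, such as a plain 64-bit compare, still observe the upper bits, which is why the truncate is free only "in a lot of cases".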

Herald added a subscriber: ecnelises. · Aug 2 2022, 5:29 PM

Harbormaster completed remote builds in B178909: Diff 449494. · Aug 2 2022, 6:39 PM

In D130397#3695322, @craig.topper wrote:

Add a small DAGCombiner change that recovers va-arg-24.c. I'll write a test
before I commit this. Wanted to get this out for further evaluation on other
tests Alex saw.

Comparing old patch vs new patch, the inter-diff shows all positive changes (i.e. no cases where codegen is worse vs the first patch version).

There are still cases where the static instruction count rises; I've pasted all such cases in this gist.

More fixes. I'll separate them back out if this gets approved. Just trying to keep them together for testing.

Harbormaster completed remote builds in B179189: Diff 449863. · Aug 3 2022, 9:48 PM

With this version of the patch these are the cases that ended up adding more lines:

output_rv64imafdc_lp64_O0/mode-dependent-address.s: 1 lines added (net)
output_rv64imafdc_lp64_O0/pr53645.s: 16 lines added (net)
output_rv64imafdc_lp64_O1/pr23135.s: 15 lines added (net)
output_rv64imafdc_lp64_O1/pr53645.s: 32 lines added (net)
output_rv64imafdc_lp64_O2/loop-5.s: 1 lines added (net)
output_rv64imafdc_lp64_O2/pr53645.s: 30 lines added (net)
output_rv64imafdc_lp64_O3/loop-5.s: 1 lines added (net)
output_rv64imafdc_lp64_O3/memset-2.s: 46 lines added (net)
output_rv64imafdc_lp64_O3/pr53645.s: 30 lines added (net)
output_rv64imafdc_lp64_Os/pr53645.s: 30 lines added (net)
output_rv64imafdc_lp64d_O0/mode-dependent-address.s: 1 lines added (net)
output_rv64imafdc_lp64d_O0/pr53645.s: 16 lines added (net)
output_rv64imafdc_lp64d_O1/pr23135.s: 15 lines added (net)
output_rv64imafdc_lp64d_O1/pr53645.s: 32 lines added (net)
output_rv64imafdc_lp64d_O2/loop-5.s: 1 lines added (net)
output_rv64imafdc_lp64d_O2/pr53645.s: 30 lines added (net)
output_rv64imafdc_lp64d_O3/loop-5.s: 1 lines added (net)
output_rv64imafdc_lp64d_O3/memset-2.s: 46 lines added (net)
output_rv64imafdc_lp64d_O3/pr53645.s: 30 lines added (net)
output_rv64imafdc_lp64d_Os/pr53645.s: 30 lines added (net)

Diff vs current HEAD.

This gives a better idea of the impact:
output_rv64imafdc_lp64_O0/mode-dependent-address.s: 3 lines added, 2 removed (+1 net)
output_rv64imafdc_lp64_O0/pr53645.s: 112 lines added, 96 removed (+16 net)
output_rv64imafdc_lp64_O1/pr23135.s: 147 lines added, 132 removed (+15 net)
output_rv64imafdc_lp64_O1/pr53645.s: 282 lines added, 250 removed (+32 net)
output_rv64imafdc_lp64_O2/loop-5.s: 6 lines added, 5 removed (+1 net)
output_rv64imafdc_lp64_O2/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64_O3/loop-5.s: 6 lines added, 5 removed (+1 net)
output_rv64imafdc_lp64_O3/memset-2.s: 1129 lines added, 1083 removed (+46 net)
output_rv64imafdc_lp64_O3/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64_Os/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64d_O0/mode-dependent-address.s: 3 lines added, 2 removed (+1 net)
output_rv64imafdc_lp64d_O0/pr53645.s: 112 lines added, 96 removed (+16 net)
output_rv64imafdc_lp64d_O1/pr23135.s: 147 lines added, 132 removed (+15 net)
output_rv64imafdc_lp64d_O1/pr53645.s: 282 lines added, 250 removed (+32 net)
output_rv64imafdc_lp64d_O2/loop-5.s: 6 lines added, 5 removed (+1 net)
output_rv64imafdc_lp64d_O2/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64d_O3/loop-5.s: 6 lines added, 5 removed (+1 net)
output_rv64imafdc_lp64d_O3/memset-2.s: 1129 lines added, 1083 removed (+46 net)
output_rv64imafdc_lp64d_O3/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64d_Os/pr53645.s: 278 lines added, 248 removed (+30 net)

It's probably worth a quick check if there are obvious reasons for the additions, but the overall impact seems positive so if there's not an obvious deficiency I don't have an objection to declaring these cases are just noise due to taking a different codegen path.

asb added inline comments. · Aug 4 2022, 10:33 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
182	It's not immediately obvious why `Promote` is needed, so it's probably worth a comment.

In D130397#3700198, @asb wrote:

This gives a better idea of the impact:
output_rv64imafdc_lp64_O0/mode-dependent-address.s: 3 lines added, 2 removed (+1 net)
output_rv64imafdc_lp64_O0/pr53645.s: 112 lines added, 96 removed (+16 net)
output_rv64imafdc_lp64_O1/pr23135.s: 147 lines added, 132 removed (+15 net)
output_rv64imafdc_lp64_O1/pr53645.s: 282 lines added, 250 removed (+32 net)
output_rv64imafdc_lp64_O2/loop-5.s: 6 lines added, 5 removed (+1 net)
output_rv64imafdc_lp64_O2/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64_O3/loop-5.s: 6 lines added, 5 removed (+1 net)
output_rv64imafdc_lp64_O3/memset-2.s: 1129 lines added, 1083 removed (+46 net)
output_rv64imafdc_lp64_O3/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64_Os/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64d_O0/mode-dependent-address.s: 3 lines added, 2 removed (+1 net)
output_rv64imafdc_lp64d_O0/pr53645.s: 112 lines added, 96 removed (+16 net)
output_rv64imafdc_lp64d_O1/pr23135.s: 147 lines added, 132 removed (+15 net)
output_rv64imafdc_lp64d_O1/pr53645.s: 282 lines added, 250 removed (+32 net)
output_rv64imafdc_lp64d_O2/loop-5.s: 6 lines added, 5 removed (+1 net)
output_rv64imafdc_lp64d_O2/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64d_O3/loop-5.s: 6 lines added, 5 removed (+1 net)
output_rv64imafdc_lp64d_O3/memset-2.s: 1129 lines added, 1083 removed (+46 net)
output_rv64imafdc_lp64d_O3/pr53645.s: 278 lines added, 248 removed (+30 net)
output_rv64imafdc_lp64d_Os/pr53645.s: 278 lines added, 248 removed (+30 net)

It's probably worth a quick check if there are obvious reasons for the additions, but the overall impact seems positive so if there's not an obvious deficiency I don't have an objection to declaring these cases are just noise due to taking a different codegen path.

I explored these differences. Some notes.

memset-2.s - Directly caused by the isTruncateFree change. We are now sharing a 64-bit constant between i64 and i32 stores by truncating. Somehow this caused repeated rematerialization of LUI instructions, even though on the surface the change reduces register pressure. The basic block is quite large, with many calls to memset.

pr53645.s - The test includes a vector value from a load that is passed across basic blocks and gets scalarized. This increases the use count of the broken-down scalars, but the other basic block only wants 1 element. There's also a visitation order issue with urem-by-constant expansion interacting with SimplifyDemandedBits.

mode-dependent-address.s - We have an i32 load used by a sext_inreg and by an 'and' with 255. The 'and' was used to form a zextload, which didn't remove the 'and' but did prevent the sext_inreg from forming a sextload. Seems related to isTruncateFree.

pr23135.s - DAGCombiner's ForwardStoreValueToDirectLoad needs to support sextload by creating sext_inreg.

loop-5.s - We need to both sext and zext a load. We used to use lwu and sext.w, now we use lw and slli+srli.
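The loop-5.s case can be modeled as two instruction sequences that compute the same pair of values at different cost. This C++ sketch is illustrative only: the function names are hypothetical models of the mnemonics, and `Extended`, `oldSeq` and `newSeq` are names invented for this example.

```cpp
#include <cstdint>

// Sign-extend the low 32 bits of a value to 64 bits.
constexpr uint64_t sext32(uint32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(v)));
}

constexpr uint64_t lw(uint32_t mem) { return sext32(mem); }   // sign-extending load
constexpr uint64_t lwu(uint32_t mem) { return mem; }          // zero-extending load
constexpr uint64_t sext_w(uint64_t rs1) {                     // sext.w (addiw rd, rs1, 0)
  return sext32(static_cast<uint32_t>(rs1));
}
constexpr uint64_t slli(uint64_t rs1, unsigned sh) { return rs1 << sh; }
constexpr uint64_t srli(uint64_t rs1, unsigned sh) { return rs1 >> sh; }

// Both extensions of one loaded i32 value.
struct Extended {
  uint64_t sext;
  uint64_t zext;
};

// Before: lwu, then sext.w (one extra instruction after the load).
constexpr Extended oldSeq(uint32_t mem) {
  return Extended{sext_w(lwu(mem)), lwu(mem)};
}

// After: lw, then slli + srli by 32 (two extra instructions after the load).
constexpr Extended newSeq(uint32_t mem) {
  return Extended{lw(mem), srli(slli(lw(mem), 32), 32)};
}

static_assert(oldSeq(0x80000000u).sext == newSeq(0x80000000u).sext,
              "same sign-extended value");
static_assert(oldSeq(0x80000000u).zext == newSeq(0x80000000u).zext,
              "same zero-extended value");
```

Both sequences are functionally equivalent in this model, so the regression is purely the one additional instruction, consistent with the +1 net line reported for loop-5.s.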

I only have a good answer on how to fix pr23135.s

Going to start splitting this up.

craig.topper mentioned this in D131819: [RISCV] Enable isTruncateFree in SDAG for i64->i32 on rv64. · Aug 12 2022, 4:54 PM

craig.topper mentioned this in rG7a73ab5818a1: [RISCV] Enable isTruncateFree in SDAG for i64->i32 on rv64. · Aug 15 2022, 8:36 AM

Add comment to extending load change

craig.topper edited the summary of this revision. · Aug 27 2022, 1:28 PM

Harbormaster completed remote builds in B183774: Diff 456147. · Aug 27 2022, 3:16 PM

Ping

[Sorry Craig - it looks like I'd re-reviewed and written my approval but I either didn't submit or it didn't go through].

This looks good to me, thanks!

This revision is now accepted and ready to land. · Sep 12 2022, 8:39 AM

Closed by commit rG4186a49d793e: [RISCV] Custom type legalize i32 loads by sign extending. (authored by craig.topper). · Sep 12 2022, 9:13 AM

This revision was automatically updated to reflect the committed changes.

craig.topper added a commit: rG4186a49d793e: [RISCV] Custom type legalize i32 loads by sign extending..

Diff 459483

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

@@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   // Compute derived properties from the register classes.
   computeRegisterProperties(STI.getRegisterInfo());

   setStackPointerRegisterToSaveRestore(RISCV::X2);

   setLoadExtAction({ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD}, XLenVT,
                    MVT::i1, Promote);
+  // DAGCombiner can call isLoadExtLegal for types that aren't legal.
+  setLoadExtAction({ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD}, MVT::i32,
+                   MVT::i1, Promote);
   [Inline comment, asb: It's not immediately obvious why `Promote` is needed, so it's probably worth a comment.]

   // TODO: add all necessary setOperationAction calls.
   setOperationAction(ISD::DYNAMIC_STACKALLOC, XLenVT, Expand);

   setOperationAction(ISD::BR_JT, MVT::Other, Expand);
   setOperationAction(ISD::BR_CC, XLenVT, Expand);
   setOperationAction(ISD::BRCOND, MVT::Other, Custom);
   setOperationAction(ISD::SELECT_CC, XLenVT, Expand);
@@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   setOperationAction(ISD::EH_DWARF_CFA, MVT::i32, Custom);

   if (!Subtarget.hasStdExtZbb())
     setOperationAction(ISD::SIGN_EXTEND_INREG, {MVT::i8, MVT::i16}, Expand);

   if (Subtarget.is64Bit()) {
     setOperationAction(ISD::EH_DWARF_CFA, MVT::i64, Custom);

+    setOperationAction(ISD::LOAD, MVT::i32, Custom);
+
     setOperationAction({ISD::ADD, ISD::SUB, ISD::SHL, ISD::SRA, ISD::SRL},
                        MVT::i32, Custom);

     setOperationAction({ISD::UADDO, ISD::USUBO, ISD::UADDSAT, ISD::USUBSAT},
                        MVT::i32, Custom);
   } else {
     setLibcallName(
         {RTLIB::SHL_I128, RTLIB::SRL_I128, RTLIB::SRA_I128, RTLIB::MUL_I128},
@@ case ISD::READCYCLECOUNTER: {
     SDValue RCW =
         DAG.getNode(RISCVISD::READ_CYCLE_WIDE, DL, VTs, N->getOperand(0));

     Results.push_back(
         DAG.getNode(ISD::BUILD_PAIR, DL, MVT::i64, RCW, RCW.getValue(1)));
     Results.push_back(RCW.getValue(2));
     break;
   }
+  case ISD::LOAD: {
+    if (!ISD::isNON_EXTLoad(N))
+      return;
+
+    // Use a SEXTLOAD instead of the default EXTLOAD. Similar to the
+    // sext_inreg we emit for ADD/SUB/MUL/SLLI.
+    LoadSDNode *Ld = cast<LoadSDNode>(N);
+
+    SDLoc dl(N);
+    SDValue Res = DAG.getExtLoad(ISD::SEXTLOAD, dl, MVT::i64, Ld->getChain(),
+                                 Ld->getBasePtr(), Ld->getMemoryVT(),
+                                 Ld->getMemOperand());
+    Results.push_back(DAG.getNode(ISD::TRUNCATE, dl, MVT::i32, Res));
+    Results.push_back(Res.getValue(1));
+    return;
+  }
   case ISD::MUL: {
     unsigned Size = N->getSimpleValueType(0).getSizeInBits();
     unsigned XLen = Subtarget.getXLen();
     // This multiply needs to be expanded, try to use MULHSU+MUL if possible.
     if (Size > XLen) {
       assert(Size == (XLen * 2) && "Unexpected custom legalisation");
       SDValue LHS = N->getOperand(0);
       SDValue RHS = N->getOperand(1);

llvm/test/CodeGen/RISCV/sextw-removal.ll

@@ bb2: ; preds = %bb2, %bb
   br i1 %i6, label %bb7, label %bb2

 bb7: ; preds = %bb2
   ret void
 }

 declare signext i32 @bar(i32 signext)

-; The load here will be an anyext load in isel and sext.w will be emitted for
-; the ret. Make sure we can look through logic ops to prove the sext.w is
-; unnecessary.
+; The load here was previously an aext load, but this has since been changed
+; to a signext load allowing us to remove a sext.w before isel. Thus we get
+; the same result with or without the sext.w removal pass.
+; Test has been left for coverage purposes.
 define signext i32 @test2(i32* %p, i32 signext %b) nounwind {
 ; RV64I-LABEL: test2:
 ; RV64I: # %bb.0:
 ; RV64I-NEXT: lw a0, 0(a0)
 ; RV64I-NEXT: li a2, 1
 ; RV64I-NEXT: sllw a1, a2, a1
 ; RV64I-NEXT: not a1, a1
 ; RV64I-NEXT: and a0, a1, a0
 ; RV64I-NEXT: ret
 ;
 ; RV64ZBB-LABEL: test2:
 ; RV64ZBB: # %bb.0:
 ; RV64ZBB-NEXT: lw a0, 0(a0)
 ; RV64ZBB-NEXT: li a2, -2
 ; RV64ZBB-NEXT: rolw a1, a2, a1
 ; RV64ZBB-NEXT: and a0, a1, a0
 ; RV64ZBB-NEXT: ret
 ;
 ; NOREMOVAL-LABEL: test2:
 ; NOREMOVAL: # %bb.0:
 ; NOREMOVAL-NEXT: lw a0, 0(a0)
 ; NOREMOVAL-NEXT: li a2, -2
 ; NOREMOVAL-NEXT: rolw a1, a2, a1
 ; NOREMOVAL-NEXT: and a0, a1, a0
-; NOREMOVAL-NEXT: sext.w a0, a0
 ; NOREMOVAL-NEXT: ret
   %a = load i32, i32* %p
   %shl = shl i32 1, %b
   %neg = xor i32 %shl, -1
   %and1 = and i32 %neg, %a
   ret i32 %and1
 }

@@
 ; RV64ZBB-NEXT: ret
 ;
 ; NOREMOVAL-LABEL: test3:
 ; NOREMOVAL: # %bb.0:
 ; NOREMOVAL-NEXT: lw a0, 0(a0)
 ; NOREMOVAL-NEXT: li a2, -2
 ; NOREMOVAL-NEXT: rolw a1, a2, a1
 ; NOREMOVAL-NEXT: or a0, a1, a0
-; NOREMOVAL-NEXT: sext.w a0, a0
 ; NOREMOVAL-NEXT: ret
   %a = load i32, i32* %p
   %shl = shl i32 1, %b
   %neg = xor i32 %shl, -1
   %and1 = or i32 %neg, %a
   ret i32 %and1
 }

@@
 ; RV64ZBB-NEXT: ret
 ;
 ; NOREMOVAL-LABEL: test4:
 ; NOREMOVAL: # %bb.0:
 ; NOREMOVAL-NEXT: lw a0, 0(a0)
 ; NOREMOVAL-NEXT: li a2, 1
 ; NOREMOVAL-NEXT: sllw a1, a2, a1
 ; NOREMOVAL-NEXT: xnor a0, a1, a0
-; NOREMOVAL-NEXT: sext.w a0, a0
 ; NOREMOVAL-NEXT: ret
   %a = load i32, i32* %p
   %shl = shl i32 1, %b
   %neg = xor i32 %shl, -1
   %and1 = xor i32 %neg, %a
   ret i32 %and1
 }

llvm/test/CodeGen/RISCV/vec3-setcc-crash.ll

@@
 ; RV32-NEXT: srli a0, a0, 16
 ; RV32-NEXT: .LBB0_8:
 ; RV32-NEXT: sb a0, 2(a1)
 ; RV32-NEXT: sh a2, 0(a1)
 ; RV32-NEXT: ret
 ;
 ; RV64-LABEL: vec3_setcc_crash:
 ; RV64: # %bb.0:
-; RV64-NEXT: lwu a0, 0(a0)
+; RV64-NEXT: lw a0, 0(a0)
 ; RV64-NEXT: slli a2, a0, 40
 ; RV64-NEXT: slli a3, a0, 56
 ; RV64-NEXT: slli a4, a0, 48
 ; RV64-NEXT: srai a5, a4, 56
 ; RV64-NEXT: srai a3, a3, 56
 ; RV64-NEXT: bgtz a5, .LBB0_2
 ; RV64-NEXT: # %bb.1:
 ; RV64-NEXT: li a5, 0
@@
 ; RV64-NEXT: .LBB0_5:
 ; RV64-NEXT: andi a3, a5, 255
 ; RV64-NEXT: or a2, a3, a2
 ; RV64-NEXT: bgtz a4, .LBB0_7
 ; RV64-NEXT: # %bb.6:
 ; RV64-NEXT: li a0, 0
 ; RV64-NEXT: j .LBB0_8
 ; RV64-NEXT: .LBB0_7:
-; RV64-NEXT: srli a0, a0, 16
+; RV64-NEXT: srliw a0, a0, 16
 ; RV64-NEXT: .LBB0_8:
 ; RV64-NEXT: sb a0, 2(a1)
 ; RV64-NEXT: sh a2, 0(a1)
 ; RV64-NEXT: ret
   %a = load <3 x i8>, <3 x i8>* %in
   %cmp = icmp sgt <3 x i8> %a, zeroinitializer
   %c = select <3 x i1> %cmp, <3 x i8> %a, <3 x i8> zeroinitializer
   store <3 x i8> %c, <3 x i8>* %out
   ret void
 }

This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Custom type legalize i32 loads by sign extending.
ClosedPublic
