This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Teach SExtWRemoval to recognize sign extended values that come from arguments.
ClosedPublic

Authored by craig.topper on Sep 25 2022, 8:54 PM.


Details

Reviewers

reames
asb
luismarques
frasercrmck

Commits

rGece4bb5ab894: [RISCV] Teach SExtWRemoval to recognize sign extended values that come from…

Summary

This information is not preserved in MIR today. So this patch adds
information to RISCVMachineFunctionInfo when the vreg is created for
the argument.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

craig.topper created this revision. Sep 25 2022, 8:54 PM

Herald added a project: Restricted Project. Sep 25 2022, 8:54 PM

Herald added subscribers: sunshaoce, VincentWu, StephenFan and 26 others.

craig.topper requested review of this revision. Sep 25 2022, 8:54 PM

Herald added a project: Restricted Project. Sep 25 2022, 8:54 PM

Herald added subscribers: pcwang-thead, eopXD, MaskRay.

clang-format

Harbormaster completed remote builds in B188627: Diff 462800. Sep 25 2022, 9:43 PM

Playing around with this on the GCC torture suite, I'm seeing a number of small codegen improvements, no miscompiles, and no instruction count regressions worth worrying about (pr67037 gets a few more instructions added than are removed, seems to just be due to slightly different BB structure due to different regalloc choices).

This looks reasonable to me, but I'm not knowledgeable enough about the argument lowering to know if this scheme is sound. I'll defer to others on the review.

llvm/lib/Target/RISCV/RISCVSExtWRemoval.cpp
321	These declarations can be pulled out of the loop.

reames added inline comments. Sep 28 2022, 8:05 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
10711	One thought here.

If this logic is sound, then we could use a similar approach to prove known bits for the RegisterSDNode representing the argument location. This would allow SDAG to eliminate some extends, but more importantly, might allow us to do this in a place more likely to expose any unsoundness in a way it gets caught. Particularly, if we can do that in target independent code.

Looking at the calling code, I think this basically translates to putting an AssertSext after the CopyFromReg? Actually, it looks like the caller already does this for us based on the isZExt and isSExt flags.

I suspect you can rewrite this code as if not split and is sext from the flags. Not sure mind you, just suspect.

arsenm added a subscriber: arsenm. Sep 28 2022, 8:19 AM

arsenm added inline comments.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
10713–10717	Don't see why you need to look at the underlying IR here instead of just relying on the argument flags
llvm/lib/Target/RISCV/RISCVSExtWRemoval.cpp
9	It feels wrong to me that you would need to optimize these after selection, but I guess I don't know why you are seeing these.

craig.topper added inline comments. Sep 28 2022, 9:24 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
10711	> One thought here.
>
> If this logic is sound, then we could use a similar approach to prove known bits for the RegisterSDNode representing the argument location. This would allow SDAG to eliminate some extends, but more importantly, might allow us to do this in a place more likely to expose any unsoundness in a way it gets caught. Particularly, if we can do that in target independent code.
>
> Looking at the calling code, I think this basically translates to putting an AssertSext after the CopyFromReg? Actually, it looks like the caller already does this for us based on the isZExt and isSExt flags.

That is correct. We already do exactly that. The changed test in this patch hit the depth limit on computeKnownBits in SelectionDAG or it would have been optimized.

> I suspect you can rewrite this code as if not split and is sext from the flags. Not sure mind you, just suspect.

That wouldn't cover the i8 and i16 zext case. And I don't know how the sext in the flags behaves if someone puts it on an i64 that doesn't need to be promoted. clang wouldn't do that, but that's not the only source of IR.

10713–10717	If I don't look at the original IR, then I would do the wrong thing if someone put a signext attribute on an i33 argument. I also wouldn't be able to handle zeroext for i16 and i8.
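
To make the condition discussed above concrete, here is a minimal standalone sketch. It is not code from the patch: `ArgInfo`, `isSExt32Arg`, `isSignExtendedFrom32` and `runExamples` are hypothetical names that only model the attribute flags and the original IR bit width. It shows why a zeroext i8/i16 argument qualifies while a zeroext i32 or a signext i33 argument does not.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of what the lowering code inspects; not an LLVM type.
struct ArgInfo {
  unsigned BitWidth; // integer width of the argument in the original IR
  bool IsSExt;       // argument carries the signext attribute
  bool IsZExt;       // argument carries the zeroext attribute
};

// Same shape as the condition in the patch: may the incoming 64-bit register
// be treated as sign extended from 32 bits?
constexpr bool isSExt32Arg(ArgInfo A) {
  return (A.BitWidth <= 32 && A.IsSExt) || (A.BitWidth < 32 && A.IsZExt);
}

// True if the 64-bit value equals its own low 32 bits sign extended.
constexpr bool isSignExtendedFrom32(uint64_t V) {
  return static_cast<int64_t>(V) ==
         static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(V)));
}

void runExamples() {
  // zeroext i16: the largest value is 0xFFFF, so bits 63..31 are all zero.
  assert(isSExt32Arg({16, false, true}));
  assert(isSignExtendedFrom32(0xFFFFull));

  // zeroext i32: 0xFFFFFFFF has bit 31 set but bits 63..32 clear.
  assert(!isSExt32Arg({32, false, true}));
  assert(!isSignExtendedFrom32(0xFFFFFFFFull));

  // signext i33: the positive i33 value 2^31 extends to 0x0000000080000000.
  assert(!isSExt32Arg({33, true, false}));
  assert(!isSignExtendedFrom32(0x80000000ull));
}
```

The flags alone cannot separate the zeroext i16 case from the zeroext i32 case, which is the reason given above for consulting the original IR type.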

craig.topper added inline comments. Sep 28 2022, 9:27 AM

llvm/lib/Target/RISCV/RISCVSExtWRemoval.cpp
9	Some of them are from SelectionDAG's depth limit in simplifyDemandedBits/computeKnownBits/computeNumSignBits. Some of them are because this handles phi loops and SelectionDAG doesn't. Not sure if there are other reasons.
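
As a rough illustration of the phi-loop point, the following toy sketch walks a value graph with a worklist and a visited set, so a phi that is reached again around a loop is not re-examined. Everything here (`DefKind`, `Def`, `isSignExtended`, `runExamples`) is hypothetical and far simpler than MIR; it is an assumption about the general technique, not the pass's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy value graph; hypothetical and far simpler than MIR.
enum class DefKind { SignExtending, Phi, Unknown };

struct Def {
  DefKind Kind;
  std::vector<std::size_t> Inputs; // incoming values, used for phis only
};

// Returns true if every value reachable from Root through phis is known to
// be sign extended. Already-visited values are skipped, so phi cycles end.
bool isSignExtended(const std::vector<Def> &Defs, std::size_t Root) {
  std::set<std::size_t> Visited;
  std::vector<std::size_t> Worklist{Root};
  while (!Worklist.empty()) {
    std::size_t Idx = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(Idx).second)
      continue; // seen before, e.g. reached again around a loop
    const Def &D = Defs[Idx];
    if (D.Kind == DefKind::Unknown)
      return false;
    if (D.Kind == DefKind::Phi)
      for (std::size_t In : D.Inputs)
        Worklist.push_back(In);
  }
  return true;
}

void runExamples() {
  // Value 1 is a phi of values 0 and 2; value 2 is a phi of value 1 (a cycle).
  std::vector<Def> Good{{DefKind::SignExtending, {}},
                        {DefKind::Phi, {0, 2}},
                        {DefKind::Phi, {1}}};
  assert(isSignExtended(Good, 1));

  // Same shape, but the value entering the loop is not known to be extended.
  std::vector<Def> Bad{{DefKind::Unknown, {}},
                       {DefKind::Phi, {0, 2}},
                       {DefKind::Phi, {1}}};
  assert(!isSignExtended(Bad, 1));
}
```

A depth-limited recursive query would have to give up or recurse without end on the cycle between values 1 and 2; the visited set is what lets the walk finish.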

craig.topper added inline comments. Sep 28 2022, 9:28 AM

llvm/lib/Target/RISCV/RISCVSExtWRemoval.cpp
9	This pass also converts some instructions to their W forms if it would remove a sext.w later. SelectionDAG doesn't do that.
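
The arithmetic behind the W-form conversion can be sketched as follows. This is only a model of the RV64 `add`, `addw` and `sext.w` semantics on 64-bit values (the function names and `runExamples` are made up for illustration), and it does not model the checks the pass performs before rewriting an instruction.

```cpp
#include <cassert>
#include <cstdint>

// sext.w rd, rs: sign extend the low 32 bits of rs to 64 bits.
constexpr uint64_t sextw(uint64_t X) {
  return static_cast<uint64_t>(
      static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(X))));
}

// add rd, rs1, rs2: full 64-bit addition.
constexpr uint64_t add(uint64_t A, uint64_t B) { return A + B; }

// addw rd, rs1, rs2: add, then sign extend the low 32 bits of the sum.
constexpr uint64_t addw(uint64_t A, uint64_t B) { return sextw(A + B); }

void runExamples() {
  const uint64_t A = 0x7FFFFFFF, B = 1;
  // The plain add leaves a result that is not sign extended from 32 bits,
  // so a later sext.w would be needed.
  assert(add(A, B) != sextw(add(A, B)));
  // addw already produces what add followed by sext.w would produce.
  assert(addw(A, B) == sextw(add(A, B)));
  // Hence a sext.w applied to an addw result is a no-op.
  assert(sextw(addw(A, B)) == addw(A, B));
}
```

Under the assumption that every user of the sum only needs the sign-extended 32-bit result, replacing `add` with `addw` makes the following `sext.w` redundant.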

reames added inline comments. Sep 28 2022, 10:10 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
10711	On your last point, I don't think you're correct. Here is some pseudo code:

if (!In.isSplit() && LocVT == i64) {
  auto ValVT = VA.getValVT()
  if (In.isSExt() && ValVT in {i32, i16, i8}) {
    // set sext32 flag
  } else if (In.isZExt() && ValVT in {i16, i8}) {
    // set sext32 flag
  }
}

This doesn't handle odd sized integers as neatly, but do we care?

To be clear, arguing to be pedantic and to understand. I am not objecting to the approach taken.

craig.topper added inline comments. Sep 28 2022, 10:18 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
10711	ValVT is always the same as LocVT. The promotion to i64 happens in SelectionDAGBuilder before the setting of ValVT and LocVT happens.

craig.topper added inline comments. Sep 28 2022, 10:28 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

10711

AArch64 also has code to recover the original VT from the IR

for (unsigned i = 0; i != NumArgs; ++i) {
  MVT ValVT = Ins[i].VT;
  if (Ins[i].isOrigArg()) {
    std::advance(CurOrigArg, Ins[i].getOrigArgIndex() - CurArgIdx);
    CurArgIdx = Ins[i].getOrigArgIndex();

    // Get type of the original argument.
    EVT ActualVT = getValueType(DAG.getDataLayout(), CurOrigArg->getType(),
                                /*AllowUnknown*/ true);
    MVT ActualMVT = ActualVT.isSimple() ? ActualVT.getSimpleVT() : MVT::Other;
    // If ActualMVT is i1/i8/i16, we should set LocVT to i8/i8/i16.
    if (ActualMVT == MVT::i1 || ActualMVT == MVT::i8)
      ValVT = MVT::i8;
    else if (ActualMVT == MVT::i16)
      ValVT = MVT::i16;
  }

reames added inline comments. Sep 28 2022, 10:35 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
10711	Ah, agreed. I'd missed the order of operations happening here.

Add test cases

craig.topper retitled this revision from "[RISCV][WIP] Teach SExtWRemoval to recognize sign extended values that come from arguments." to "[RISCV] Teach SExtWRemoval to recognize sign extended values that come from arguments." Oct 4 2022, 1:38 PM

craig.topper edited the summary of this revision.

LGTM

This revision is now accepted and ready to land. Oct 4 2022, 2:50 PM

Harbormaster completed remote builds in B190295: Diff 465142. Oct 4 2022, 3:14 PM

Closed by commit rGece4bb5ab894: [RISCV] Teach SExtWRemoval to recognize sign extended values that come from… (authored by craig.topper). Oct 4 2022, 3:42 PM

This revision was automatically updated to reflect the committed changes.

craig.topper added a commit: rGece4bb5ab894: [RISCV] Teach SExtWRemoval to recognize sign extended values that come from….

Revision Contents

Path	Size
llvm/lib/Target/RISCV/RISCVISelLowering.cpp	17 lines
llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h	6 lines
llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.cpp	8 lines
llvm/lib/Target/RISCV/RISCVSExtWRemoval.cpp	15 lines
llvm/test/CodeGen/RISCV/select-cc.ll	13 lines
llvm/test/CodeGen/RISCV/sextw-removal.ll	265 lines

Diff 465142

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

//===-- RISCVISelLowering.cpp - RISCV DAG Lowering Implementation --------===//		//===-- RISCVISelLowering.cpp - RISCV DAG Lowering Implementation --------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
▲ Show 20 Lines • Show All 10,683 Lines • ▼ Show 20 Lines	static SDValue convertLocVTToValVT(SelectionDAG &DAG, SDValue Val,
}		}
return Val;		return Val;
}		}

// The caller is responsible for loading the full value if the argument is		// The caller is responsible for loading the full value if the argument is
// passed with CCValAssign::Indirect.		// passed with CCValAssign::Indirect.
static SDValue unpackFromRegLoc(SelectionDAG &DAG, SDValue Chain,		static SDValue unpackFromRegLoc(SelectionDAG &DAG, SDValue Chain,
const CCValAssign &VA, const SDLoc &DL,		const CCValAssign &VA, const SDLoc &DL,
		const ISD::InputArg &In,
const RISCVTargetLowering &TLI) {		const RISCVTargetLowering &TLI) {
MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();
MachineRegisterInfo &RegInfo = MF.getRegInfo();		MachineRegisterInfo &RegInfo = MF.getRegInfo();
EVT LocVT = VA.getLocVT();		EVT LocVT = VA.getLocVT();
SDValue Val;		SDValue Val;
const TargetRegisterClass *RC = TLI.getRegClassFor(LocVT.getSimpleVT());		const TargetRegisterClass *RC = TLI.getRegClassFor(LocVT.getSimpleVT());
Register VReg = RegInfo.createVirtualRegister(RC);		Register VReg = RegInfo.createVirtualRegister(RC);
RegInfo.addLiveIn(VA.getLocReg(), VReg);		RegInfo.addLiveIn(VA.getLocReg(), VReg);
Val = DAG.getCopyFromReg(Chain, DL, VReg, LocVT);		Val = DAG.getCopyFromReg(Chain, DL, VReg, LocVT);

		// If input is sign extended from 32 bits, note it for the SExtWRemoval pass.
		if (In.isOrigArg()) {
		Argument *OrigArg = MF.getFunction().getArg(In.getOrigArgIndex());
		if (OrigArg->getType()->isIntegerTy()) {
		unsigned BitWidth = OrigArg->getType()->getIntegerBitWidth();
		// An input zero extended from i31 can also be considered sign extended.
		if ((BitWidth <= 32 && In.Flags.isSExt()) ||
		(BitWidth < 32 && In.Flags.isZExt())) {
		RISCVMachineFunctionInfo *RVFI = MF.getInfo<RISCVMachineFunctionInfo>();
		RVFI->addSExt32Register(VReg);
		}
		}
		}

if (VA.getLocInfo() == CCValAssign::Indirect)		if (VA.getLocInfo() == CCValAssign::Indirect)
return Val;		return Val;

return convertLocVTToValVT(DAG, Val, VA, DL, TLI.getSubtarget());		return convertLocVTToValVT(DAG, Val, VA, DL, TLI.getSubtarget());
}		}

static SDValue convertValVTToLocVT(SelectionDAG &DAG, SDValue Val,		static SDValue convertValVTToLocVT(SelectionDAG &DAG, SDValue Val,
const CCValAssign &VA, const SDLoc &DL,		const CCValAssign &VA, const SDLoc &DL,
▲ Show 20 Lines • Show All 296 Lines • ▼ Show 20 Lines	SDValue RISCVTargetLowering::LowerFormalArguments(
for (unsigned i = 0, e = ArgLocs.size(); i != e; ++i) {		for (unsigned i = 0, e = ArgLocs.size(); i != e; ++i) {
CCValAssign &VA = ArgLocs[i];		CCValAssign &VA = ArgLocs[i];
SDValue ArgValue;		SDValue ArgValue;
// Passing f64 on RV32D with a soft float ABI must be handled as a special		// Passing f64 on RV32D with a soft float ABI must be handled as a special
// case.		// case.
if (VA.getLocVT() == MVT::i32 && VA.getValVT() == MVT::f64)		if (VA.getLocVT() == MVT::i32 && VA.getValVT() == MVT::f64)
ArgValue = unpackF64OnRV32DSoftABI(DAG, Chain, VA, DL);		ArgValue = unpackF64OnRV32DSoftABI(DAG, Chain, VA, DL);
else if (VA.isRegLoc())		else if (VA.isRegLoc())
ArgValue = unpackFromRegLoc(DAG, Chain, VA, DL, *this);		ArgValue = unpackFromRegLoc(DAG, Chain, VA, DL, Ins[i], *this);
else		else
ArgValue = unpackFromMemLoc(DAG, Chain, VA, DL);		ArgValue = unpackFromMemLoc(DAG, Chain, VA, DL);

if (VA.getLocInfo() == CCValAssign::Indirect) {		if (VA.getLocInfo() == CCValAssign::Indirect) {
// If the original argument was split and passed by reference (e.g. i128		// If the original argument was split and passed by reference (e.g. i128
// on RV32), we need to load all parts of it here (using the same		// on RV32), we need to load all parts of it here (using the same
// address). Vectors may be partly split to registers and partly to the		// address). Vectors may be partly split to registers and partly to the
// stack, in which case the base address is partly offset and subsequent		// stack, in which case the base address is partly offset and subsequent

llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h

//=- RISCVMachineFunctionInfo.h - RISCV machine function info ------ C++ --=//		//=- RISCVMachineFunctionInfo.h - RISCV machine function info ------ C++ --=//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
▲ Show 20 Lines • Show All 52 Lines • ▼ Show 20 Lines	private:
uint64_t RVVStackSize = 0;		uint64_t RVVStackSize = 0;
/// Alignment of RVV stack.		/// Alignment of RVV stack.
Align RVVStackAlign;		Align RVVStackAlign;
/// Padding required to keep RVV stack aligned within the main stack.		/// Padding required to keep RVV stack aligned within the main stack.
uint64_t RVVPadding = 0;		uint64_t RVVPadding = 0;
/// Size of stack frame to save callee saved registers		/// Size of stack frame to save callee saved registers
unsigned CalleeSavedStackSize = 0;		unsigned CalleeSavedStackSize = 0;

		/// Registers that have been sign extended from i32.
		SmallVector<Register, 8> SExt32Registers;

public:		public:
RISCVMachineFunctionInfo(const MachineFunction &MF) {}		RISCVMachineFunctionInfo(const MachineFunction &MF) {}

MachineFunctionInfo *		MachineFunctionInfo *
clone(BumpPtrAllocator &Allocator, MachineFunction &DestMF,		clone(BumpPtrAllocator &Allocator, MachineFunction &DestMF,
const DenseMap<MachineBasicBlock *, MachineBasicBlock *> &Src2DstMBB)		const DenseMap<MachineBasicBlock *, MachineBasicBlock *> &Src2DstMBB)
const override;		const override;

Show All 36 Lines	public:

uint64_t getRVVPadding() const { return RVVPadding; }		uint64_t getRVVPadding() const { return RVVPadding; }
void setRVVPadding(uint64_t Padding) { RVVPadding = Padding; }		void setRVVPadding(uint64_t Padding) { RVVPadding = Padding; }

unsigned getCalleeSavedStackSize() const { return CalleeSavedStackSize; }		unsigned getCalleeSavedStackSize() const { return CalleeSavedStackSize; }
void setCalleeSavedStackSize(unsigned Size) { CalleeSavedStackSize = Size; }		void setCalleeSavedStackSize(unsigned Size) { CalleeSavedStackSize = Size; }

void initializeBaseYamlFields(const yaml::RISCVMachineFunctionInfo &YamlMFI);		void initializeBaseYamlFields(const yaml::RISCVMachineFunctionInfo &YamlMFI);

		void addSExt32Register(Register Reg);
		bool isSExt32Register(Register Reg) const;
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_LIB_TARGET_RISCV_RISCVMACHINEFUNCTIONINFO_H		#endif // LLVM_LIB_TARGET_RISCV_RISCVMACHINEFUNCTIONINFO_H

llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.cpp

//=- RISCVMachineFunctionInfo.cpp - RISCV machine function info ---- C++ --=//		//=- RISCVMachineFunctionInfo.cpp - RISCV machine function info ---- C++ --=//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
Show All 21 Lines	void yaml::RISCVMachineFunctionInfo::mappingImpl(yaml::IO &YamlIO) {
MappingTraits<RISCVMachineFunctionInfo>::mapping(YamlIO, *this);		MappingTraits<RISCVMachineFunctionInfo>::mapping(YamlIO, *this);
}		}

void RISCVMachineFunctionInfo::initializeBaseYamlFields(		void RISCVMachineFunctionInfo::initializeBaseYamlFields(
const yaml::RISCVMachineFunctionInfo &YamlMFI) {		const yaml::RISCVMachineFunctionInfo &YamlMFI) {
VarArgsFrameIndex = YamlMFI.VarArgsFrameIndex;		VarArgsFrameIndex = YamlMFI.VarArgsFrameIndex;
VarArgsSaveSize = YamlMFI.VarArgsSaveSize;		VarArgsSaveSize = YamlMFI.VarArgsSaveSize;
}		}

		void RISCVMachineFunctionInfo::addSExt32Register(Register Reg) {
		SExt32Registers.push_back(Reg);
		}

		bool RISCVMachineFunctionInfo::isSExt32Register(Register Reg) const {
		return is_contained(SExt32Registers, Reg);
		}

llvm/lib/Target/RISCV/RISCVSExtWRemoval.cpp

//===-------------- RISCVSExtWRemoval.cpp - MI sext.w Removal -------------===//		//===-------------- RISCVSExtWRemoval.cpp - MI sext.w Removal -------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===---------------------------------------------------------------------===//		//===---------------------------------------------------------------------===//
//		//
// This pass removes unneeded sext.w instructions at the MI level.		// This pass removes unneeded sext.w instructions at the MI level.
//		//
//===---------------------------------------------------------------------===//		//===---------------------------------------------------------------------===//

#include "RISCV.h"		#include "RISCV.h"
		#include "RISCVMachineFunctionInfo.h"
#include "RISCVSubtarget.h"		#include "RISCVSubtarget.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/TargetInstrInfo.h"		#include "llvm/CodeGen/TargetInstrInfo.h"

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "riscv-sextw-removal"		#define DEBUG_TYPE "riscv-sextw-removal"
▲ Show 20 Lines • Show All 288 Lines • ▼ Show 20 Lines	if (isSignExtendingOpW(*MI, MRI, FixableDef))
continue;		continue;

// Is this an instruction that propagates sign extend.		// Is this an instruction that propagates sign extend.
switch (MI->getOpcode()) {		switch (MI->getOpcode()) {
default:		default:
// Unknown opcode, give up.		// Unknown opcode, give up.
return false;		return false;
case RISCV::COPY: {		case RISCV::COPY: {
Register SrcReg = MI->getOperand(1).getReg();		const MachineFunction *MF = MI->getMF();
		const RISCVMachineFunctionInfo *RVFI =
		MF->getInfo<RISCVMachineFunctionInfo>();
		if (MI->getParent()->getBasicBlock() ==
		&MF->getFunction().getEntryBlock()) {
		Register VReg = MI->getOperand(0).getReg();
		if (MF->getRegInfo().isLiveIn(VReg))
		return RVFI->isSExt32Register(VReg);
		}

// TODO: Handle arguments and returns from calls?		// TODO: Handle returns from calls?

		Register SrcReg = MI->getOperand(1).getReg();

// If this is a copy from another register, check its source instruction.		// If this is a copy from another register, check its source instruction.
if (!SrcReg.isVirtual())		if (!SrcReg.isVirtual())
return false;		return false;
MachineInstr *SrcMI = MRI.getVRegDef(SrcReg);		MachineInstr *SrcMI = MRI.getVRegDef(SrcReg);
if (!SrcMI)		if (!SrcMI)
return false;		return false;


llvm/test/CodeGen/RISCV/select-cc.ll

	; RV64I-NEXT: mv a0, a2			; RV64I-NEXT: mv a0, a2
	; RV64I-NEXT: .LBB0_10:			; RV64I-NEXT: .LBB0_10:
	; RV64I-NEXT: lw a2, 0(a1)			; RV64I-NEXT: lw a2, 0(a1)
	; RV64I-NEXT: bgeu a2, a0, .LBB0_12			; RV64I-NEXT: bgeu a2, a0, .LBB0_12
	; RV64I-NEXT: # %bb.11:			; RV64I-NEXT: # %bb.11:
	; RV64I-NEXT: mv a0, a2			; RV64I-NEXT: mv a0, a2
	; RV64I-NEXT: .LBB0_12:			; RV64I-NEXT: .LBB0_12:
	; RV64I-NEXT: lw a2, 0(a1)			; RV64I-NEXT: lw a2, 0(a1)
	; RV64I-NEXT: sext.w a3, a0			; RV64I-NEXT: blt a2, a0, .LBB0_14
	; RV64I-NEXT: blt a2, a3, .LBB0_14
	; RV64I-NEXT: # %bb.13:			; RV64I-NEXT: # %bb.13:
	; RV64I-NEXT: mv a0, a2			; RV64I-NEXT: mv a0, a2
	; RV64I-NEXT: .LBB0_14:			; RV64I-NEXT: .LBB0_14:
	; RV64I-NEXT: lw a2, 0(a1)			; RV64I-NEXT: lw a2, 0(a1)
	; RV64I-NEXT: sext.w a3, a0			; RV64I-NEXT: bge a0, a2, .LBB0_16
	; RV64I-NEXT: bge a3, a2, .LBB0_16
	; RV64I-NEXT: # %bb.15:			; RV64I-NEXT: # %bb.15:
	; RV64I-NEXT: mv a0, a2			; RV64I-NEXT: mv a0, a2
	; RV64I-NEXT: .LBB0_16:			; RV64I-NEXT: .LBB0_16:
	; RV64I-NEXT: lw a2, 0(a1)			; RV64I-NEXT: lw a2, 0(a1)
	; RV64I-NEXT: sext.w a3, a0			; RV64I-NEXT: blt a0, a2, .LBB0_18
	; RV64I-NEXT: blt a3, a2, .LBB0_18
	; RV64I-NEXT: # %bb.17:			; RV64I-NEXT: # %bb.17:
	; RV64I-NEXT: mv a0, a2			; RV64I-NEXT: mv a0, a2
	; RV64I-NEXT: .LBB0_18:			; RV64I-NEXT: .LBB0_18:
	; RV64I-NEXT: lw a2, 0(a1)			; RV64I-NEXT: lw a2, 0(a1)
	; RV64I-NEXT: sext.w a3, a0			; RV64I-NEXT: bge a2, a0, .LBB0_20
	; RV64I-NEXT: bge a2, a3, .LBB0_20
	; RV64I-NEXT: # %bb.19:			; RV64I-NEXT: # %bb.19:
	; RV64I-NEXT: mv a0, a2			; RV64I-NEXT: mv a0, a2
	; RV64I-NEXT: .LBB0_20:			; RV64I-NEXT: .LBB0_20:
	; RV64I-NEXT: lw a2, 0(a1)			; RV64I-NEXT: lw a2, 0(a1)
	; RV64I-NEXT: blez a2, .LBB0_22			; RV64I-NEXT: blez a2, .LBB0_22
	; RV64I-NEXT: # %bb.21:			; RV64I-NEXT: # %bb.21:
	; RV64I-NEXT: mv a0, a2			; RV64I-NEXT: mv a0, a2
	; RV64I-NEXT: .LBB0_22:			; RV64I-NEXT: .LBB0_22:
	Show All 9 Lines
	; RV64I-NEXT: mv a0, a3			; RV64I-NEXT: mv a0, a3
	; RV64I-NEXT: .LBB0_26:			; RV64I-NEXT: .LBB0_26:
	; RV64I-NEXT: lw a1, 0(a1)			; RV64I-NEXT: lw a1, 0(a1)
	; RV64I-NEXT: li a3, 2046			; RV64I-NEXT: li a3, 2046
	; RV64I-NEXT: bltu a3, a2, .LBB0_28			; RV64I-NEXT: bltu a3, a2, .LBB0_28
	; RV64I-NEXT: # %bb.27:			; RV64I-NEXT: # %bb.27:
	; RV64I-NEXT: mv a0, a1			; RV64I-NEXT: mv a0, a1
	; RV64I-NEXT: .LBB0_28:			; RV64I-NEXT: .LBB0_28:
	; RV64I-NEXT: sext.w a0, a0
	; RV64I-NEXT: ret			; RV64I-NEXT: ret
	%val1 = load volatile i32, i32* %b			%val1 = load volatile i32, i32* %b
	%tst1 = icmp eq i32 %a, %val1			%tst1 = icmp eq i32 %a, %val1
	%val2 = select i1 %tst1, i32 %a, i32 %val1			%val2 = select i1 %tst1, i32 %a, i32 %val1

	%val3 = load volatile i32, i32* %b			%val3 = load volatile i32, i32* %b
	%tst2 = icmp ne i32 %val2, %val3			%tst2 = icmp ne i32 %val2, %val3
	%val4 = select i1 %tst2, i32 %val2, i32 %val3			%val4 = select i1 %tst2, i32 %val2, i32 %val3

llvm/test/CodeGen/RISCV/sextw-removal.ll

Show First 20 Lines • Show All 700 Lines • ▼ Show 20 Lines	bb2: ; preds = %bb2, %entry
%i5 = add i64 %i4, %arg2		%i5 = add i64 %i4, %arg2
%i6 = icmp ugt i64 %i2, 255		%i6 = icmp ugt i64 %i2, 255
br i1 %i6, label %bb7, label %bb2		br i1 %i6, label %bb7, label %bb2

bb7: ; preds = %bb2		bb7: ; preds = %bb2
%i8 = trunc i64 %i5 to i32		%i8 = trunc i64 %i5 to i32
ret i32 %i8		ret i32 %i8
}		}


		; int test14(int a, int n) {
		; for (int i = 1; i < n; ++i) {
		; if (a > 1000)
		; return -1;
		; a += i;
		; }
		;
		; return a;
		; }
		;
		; There should be no sext.w in the loop.
		define signext i32 @test14(i32 signext %0, i32 signext %1) {
		; CHECK-LABEL: test14:
		; CHECK: # %bb.0:
		; CHECK-NEXT: li a2, 2
		; CHECK-NEXT: blt a1, a2, .LBB13_4
		; CHECK-NEXT: # %bb.1: # %.preheader
		; CHECK-NEXT: li a2, 1
		; CHECK-NEXT: li a3, 1000
		; CHECK-NEXT: .LBB13_2: # =>This Inner Loop Header: Depth=1
		; CHECK-NEXT: blt a3, a0, .LBB13_5
		; CHECK-NEXT: # %bb.3: # in Loop: Header=BB13_2 Depth=1
		; CHECK-NEXT: addw a0, a2, a0
		; CHECK-NEXT: addiw a2, a2, 1
		; CHECK-NEXT: blt a2, a1, .LBB13_2
		; CHECK-NEXT: .LBB13_4:
		; CHECK-NEXT: ret
		; CHECK-NEXT: .LBB13_5:
		; CHECK-NEXT: li a0, -1
		; CHECK-NEXT: ret
		;
		; NOREMOVAL-LABEL: test14:
		; NOREMOVAL: # %bb.0:
		; NOREMOVAL-NEXT: li a2, 2
		; NOREMOVAL-NEXT: blt a1, a2, .LBB13_4
		; NOREMOVAL-NEXT: # %bb.1: # %.preheader
		; NOREMOVAL-NEXT: li a2, 1
		; NOREMOVAL-NEXT: li a3, 1000
		; NOREMOVAL-NEXT: .LBB13_2: # =>This Inner Loop Header: Depth=1
		; NOREMOVAL-NEXT: sext.w a4, a0
		; NOREMOVAL-NEXT: blt a3, a4, .LBB13_5
		; NOREMOVAL-NEXT: # %bb.3: # in Loop: Header=BB13_2 Depth=1
		; NOREMOVAL-NEXT: addw a0, a2, a0
		; NOREMOVAL-NEXT: addiw a2, a2, 1
		; NOREMOVAL-NEXT: blt a2, a1, .LBB13_2
		; NOREMOVAL-NEXT: .LBB13_4:
		; NOREMOVAL-NEXT: ret
		; NOREMOVAL-NEXT: .LBB13_5:
		; NOREMOVAL-NEXT: li a0, -1
		; NOREMOVAL-NEXT: ret
		%3 = icmp sgt i32 %1, 1
		br i1 %3, label %4, label %12

		4: ; preds = %2, %8
		%5 = phi i32 [ %10, %8 ], [ 1, %2 ]
		%6 = phi i32 [ %9, %8 ], [ %0, %2 ]
		%7 = icmp sgt i32 %6, 1000
		br i1 %7, label %12, label %8

		8: ; preds = %4
		%9 = add nsw i32 %5, %6
		%10 = add nuw nsw i32 %5, 1
		%11 = icmp slt i32 %10, %1
		br i1 %11, label %4, label %12

		12: ; preds = %8, %4, %2
		%13 = phi i32 [ %0, %2 ], [ -1, %4 ], [ %9, %8 ]
		ret i32 %13
		}

		; Same as test14 but the signext attribute is missing from the argument so we
		; can't optimize out the sext.w.
		define signext i32 @test14b(i32 %0, i32 signext %1) {
		; CHECK-LABEL: test14b:
		; CHECK: # %bb.0:
		; CHECK-NEXT: li a2, 2
		; CHECK-NEXT: blt a1, a2, .LBB14_4
		; CHECK-NEXT: # %bb.1: # %.preheader
		; CHECK-NEXT: li a2, 1
		; CHECK-NEXT: li a3, 1000
		; CHECK-NEXT: .LBB14_2: # =>This Inner Loop Header: Depth=1
		; CHECK-NEXT: sext.w a4, a0
		; CHECK-NEXT: blt a3, a4, .LBB14_5
		; CHECK-NEXT: # %bb.3: # in Loop: Header=BB14_2 Depth=1
		; CHECK-NEXT: addw a0, a2, a0
		; CHECK-NEXT: addiw a2, a2, 1
		; CHECK-NEXT: blt a2, a1, .LBB14_2
		; CHECK-NEXT: .LBB14_4:
		; CHECK-NEXT: sext.w a0, a0
		; CHECK-NEXT: ret
		; CHECK-NEXT: .LBB14_5:
		; CHECK-NEXT: li a0, -1
		; CHECK-NEXT: sext.w a0, a0
		; CHECK-NEXT: ret
		;
		; NOREMOVAL-LABEL: test14b:
		; NOREMOVAL: # %bb.0:
		; NOREMOVAL-NEXT: li a2, 2
		; NOREMOVAL-NEXT: blt a1, a2, .LBB14_4
		; NOREMOVAL-NEXT: # %bb.1: # %.preheader
		; NOREMOVAL-NEXT: li a2, 1
		; NOREMOVAL-NEXT: li a3, 1000
		; NOREMOVAL-NEXT: .LBB14_2: # =>This Inner Loop Header: Depth=1
		; NOREMOVAL-NEXT: sext.w a4, a0
		; NOREMOVAL-NEXT: blt a3, a4, .LBB14_5
		; NOREMOVAL-NEXT: # %bb.3: # in Loop: Header=BB14_2 Depth=1
		; NOREMOVAL-NEXT: addw a0, a2, a0
		; NOREMOVAL-NEXT: addiw a2, a2, 1
		; NOREMOVAL-NEXT: blt a2, a1, .LBB14_2
		; NOREMOVAL-NEXT: .LBB14_4:
		; NOREMOVAL-NEXT: sext.w a0, a0
		; NOREMOVAL-NEXT: ret
		; NOREMOVAL-NEXT: .LBB14_5:
		; NOREMOVAL-NEXT: li a0, -1
		; NOREMOVAL-NEXT: sext.w a0, a0
		; NOREMOVAL-NEXT: ret
		%3 = icmp sgt i32 %1, 1
		br i1 %3, label %4, label %12

		4: ; preds = %2, %8
		%5 = phi i32 [ %10, %8 ], [ 1, %2 ]
		%6 = phi i32 [ %9, %8 ], [ %0, %2 ]
		%7 = icmp sgt i32 %6, 1000
		br i1 %7, label %12, label %8

		8: ; preds = %4
		%9 = add nsw i32 %5, %6
		%10 = add nuw nsw i32 %5, 1
		%11 = icmp slt i32 %10, %1
		br i1 %11, label %4, label %12

		12: ; preds = %8, %4, %2
		%13 = phi i32 [ %0, %2 ], [ -1, %4 ], [ %9, %8 ]
		ret i32 %13
		}
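
Why test14b keeps its sext.w can be illustrated with a small model of the two instructions involved. This is only a sketch and not code from the patch: `sextW`, `blt`, and the register value `A0` are hypothetical names, and the model assumes that without `signext` nothing is guaranteed about bits 63:32 of the argument register.

```cpp
#include <cassert>
#include <cstdint>

// Models RISC-V `sext.w`: keep the low 32 bits and sign-extend them to 64 bits.
constexpr int64_t sextW(int64_t Reg) {
  uint32_t Lo = static_cast<uint32_t>(Reg); // truncation is well defined
  return (Lo & 0x80000000u) ? static_cast<int64_t>(Lo) - (int64_t{1} << 32)
                            : static_cast<int64_t>(Lo);
}

// Models `blt rs1, rs2`: a signed comparison of the full 64-bit registers.
constexpr bool blt(int64_t Rs1, int64_t Rs2) { return Rs1 < Rs2; }

// Hypothetical register contents: the low 32 bits hold the i32 value 5, but
// bit 32 is also set because the caller made no promise about the upper bits.
constexpr int64_t A0 = (int64_t{1} << 32) | 5;

// Comparing the raw register against 1000 wrongly reports "a > 1000".
static_assert(blt(1000, A0), "raw 64-bit compare is fooled by bit 32");
// Extending first gives the i32 answer: 5 > 1000 is false.
static_assert(!blt(1000, sextW(A0)), "sext.w restores the i32 value");
```

If the sketch compiles, both checks hold. In test14, by contrast, the `signext` attribute is what makes the raw comparison `blt a3, a0` safe.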

		; Same as test14, but the argument is zero extended instead of sign extended so
		; we can't optimize it.
		define signext i32 @test14c(i32 zeroext %0, i32 signext %1) {
		; CHECK-LABEL: test14c:
		; CHECK: # %bb.0:
		; CHECK-NEXT: li a2, 2
		; CHECK-NEXT: blt a1, a2, .LBB15_4
		; CHECK-NEXT: # %bb.1: # %.preheader
		; CHECK-NEXT: li a2, 1
		; CHECK-NEXT: li a3, 1000
		; CHECK-NEXT: .LBB15_2: # =>This Inner Loop Header: Depth=1
		; CHECK-NEXT: sext.w a4, a0
		; CHECK-NEXT: blt a3, a4, .LBB15_5
		; CHECK-NEXT: # %bb.3: # in Loop: Header=BB15_2 Depth=1
		; CHECK-NEXT: addw a0, a2, a0
		; CHECK-NEXT: addiw a2, a2, 1
		; CHECK-NEXT: blt a2, a1, .LBB15_2
		; CHECK-NEXT: .LBB15_4:
		; CHECK-NEXT: sext.w a0, a0
		; CHECK-NEXT: ret
		; CHECK-NEXT: .LBB15_5:
		; CHECK-NEXT: li a0, -1
		; CHECK-NEXT: sext.w a0, a0
		; CHECK-NEXT: ret
		;
		; NOREMOVAL-LABEL: test14c:
		; NOREMOVAL: # %bb.0:
		; NOREMOVAL-NEXT: li a2, 2
		; NOREMOVAL-NEXT: blt a1, a2, .LBB15_4
		; NOREMOVAL-NEXT: # %bb.1: # %.preheader
		; NOREMOVAL-NEXT: li a2, 1
		; NOREMOVAL-NEXT: li a3, 1000
		; NOREMOVAL-NEXT: .LBB15_2: # =>This Inner Loop Header: Depth=1
		; NOREMOVAL-NEXT: sext.w a4, a0
		; NOREMOVAL-NEXT: blt a3, a4, .LBB15_5
		; NOREMOVAL-NEXT: # %bb.3: # in Loop: Header=BB15_2 Depth=1
		; NOREMOVAL-NEXT: addw a0, a2, a0
		; NOREMOVAL-NEXT: addiw a2, a2, 1
		; NOREMOVAL-NEXT: blt a2, a1, .LBB15_2
		; NOREMOVAL-NEXT: .LBB15_4:
		; NOREMOVAL-NEXT: sext.w a0, a0
		; NOREMOVAL-NEXT: ret
		; NOREMOVAL-NEXT: .LBB15_5:
		; NOREMOVAL-NEXT: li a0, -1
		; NOREMOVAL-NEXT: sext.w a0, a0
		; NOREMOVAL-NEXT: ret
		%3 = icmp sgt i32 %1, 1
		br i1 %3, label %4, label %12

		4: ; preds = %2, %8
		%5 = phi i32 [ %10, %8 ], [ 1, %2 ]
		%6 = phi i32 [ %9, %8 ], [ %0, %2 ]
		%7 = icmp sgt i32 %6, 1000
		br i1 %7, label %12, label %8

		8: ; preds = %4
		%9 = add nsw i32 %5, %6
		%10 = add nuw nsw i32 %5, 1
		%11 = icmp slt i32 %10, %1
		br i1 %11, label %4, label %12

		12: ; preds = %8, %4, %2
		%13 = phi i32 [ %0, %2 ], [ -1, %4 ], [ %9, %8 ]
		ret i32 %13
		}
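
The zeroext case in test14c differs from the signext case only when bit 31 of the i32 is set. The following sketch compares the two register images; `zextI32`, `sextI32`, `Small`, and `Big` are hypothetical names used for illustration, not anything defined by the patch.

```cpp
#include <cassert>
#include <cstdint>

// Register image of an i32 passed with `zeroext`: bits 63:32 are zero.
constexpr int64_t zextI32(uint32_t V) { return static_cast<int64_t>(V); }

// Register image of an i32 passed with `signext`: bits 63:32 copy bit 31.
constexpr int64_t sextI32(uint32_t V) {
  return (V & 0x80000000u) ? static_cast<int64_t>(V) - (int64_t{1} << 32)
                           : static_cast<int64_t>(V);
}

constexpr uint32_t Small = 7;         // bit 31 clear
constexpr uint32_t Big = 0x80000000u; // bit 31 set: INT32_MIN as a signed i32

// With bit 31 clear the two extensions produce the same register image.
static_assert(zextI32(Small) == sextI32(Small), "images agree");
// With bit 31 set they disagree, so a 64-bit signed compare would see
// +2147483648 instead of -2147483648.
static_assert(zextI32(Big) == 2147483648LL, "zero-extended image");
static_assert(sextI32(Big) == -2147483648LL, "sign-extended image");
```

Because the pass cannot rule out the `Big` case for a `zeroext i32` argument, the sext.w in the loop has to stay.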

		; Same as test14 but the argument is zero extended from i31. Since bits 63:31
		; are zero, this counts as an i32 sign extend so we can optimize it.
		define signext i32 @test14d(i31 zeroext %0, i32 signext %1) {
		; CHECK-LABEL: test14d:
		; CHECK: # %bb.0:
		; CHECK-NEXT: li a2, 2
		; CHECK-NEXT: blt a1, a2, .LBB16_4
		; CHECK-NEXT: # %bb.1: # %.preheader
		; CHECK-NEXT: li a2, 1
		; CHECK-NEXT: li a3, 1000
		; CHECK-NEXT: .LBB16_2: # =>This Inner Loop Header: Depth=1
		; CHECK-NEXT: blt a3, a0, .LBB16_5
		; CHECK-NEXT: # %bb.3: # in Loop: Header=BB16_2 Depth=1
		; CHECK-NEXT: addw a0, a2, a0
		; CHECK-NEXT: addiw a2, a2, 1
		; CHECK-NEXT: blt a2, a1, .LBB16_2
		; CHECK-NEXT: .LBB16_4:
		; CHECK-NEXT: ret
		; CHECK-NEXT: .LBB16_5:
		; CHECK-NEXT: li a0, -1
		; CHECK-NEXT: ret
		;
		; NOREMOVAL-LABEL: test14d:
		; NOREMOVAL: # %bb.0:
		; NOREMOVAL-NEXT: li a2, 2
		; NOREMOVAL-NEXT: blt a1, a2, .LBB16_4
		; NOREMOVAL-NEXT: # %bb.1: # %.preheader
		; NOREMOVAL-NEXT: li a2, 1
		; NOREMOVAL-NEXT: li a3, 1000
		; NOREMOVAL-NEXT: .LBB16_2: # =>This Inner Loop Header: Depth=1
		; NOREMOVAL-NEXT: sext.w a4, a0
		; NOREMOVAL-NEXT: blt a3, a4, .LBB16_5
		; NOREMOVAL-NEXT: # %bb.3: # in Loop: Header=BB16_2 Depth=1
		; NOREMOVAL-NEXT: addw a0, a2, a0
		; NOREMOVAL-NEXT: addiw a2, a2, 1
		; NOREMOVAL-NEXT: blt a2, a1, .LBB16_2
		; NOREMOVAL-NEXT: .LBB16_4:
		; NOREMOVAL-NEXT: ret
		; NOREMOVAL-NEXT: .LBB16_5:
		; NOREMOVAL-NEXT: li a0, -1
		; NOREMOVAL-NEXT: ret
		%zext = zext i31 %0 to i32
		%3 = icmp sgt i32 %1, 1
		br i1 %3, label %4, label %12

		4: ; preds = %2, %8
		%5 = phi i32 [ %10, %8 ], [ 1, %2 ]
		%6 = phi i32 [ %9, %8 ], [ %zext, %2 ]
		%7 = icmp sgt i32 %6, 1000
		br i1 %7, label %12, label %8

		8: ; preds = %4
		%9 = add nsw i32 %5, %6
		%10 = add nuw nsw i32 %5, 1
		%11 = icmp slt i32 %10, %1
		br i1 %11, label %4, label %12

		12: ; preds = %8, %4, %2
		%13 = phi i32 [ %zext, %2 ], [ -1, %4 ], [ %9, %8 ]
		ret i32 %13
		}
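
The i31 argument in test14d can be checked against the property the test comment relies on: a register counts as a sign-extended i32 when bits 63:31 are all equal. The sketch below states that property directly; `isSignExtended32` and `zextI31` are hypothetical helpers written for this illustration and are not functions from the pass.

```cpp
#include <cassert>
#include <cstdint>

// True when a 64-bit register equals the sign extension of its low 32 bits,
// i.e. bits 63:31 are all zero or all one.
constexpr bool isSignExtended32(uint64_t Reg) {
  uint64_t Upper = Reg >> 31; // bits 63:31, a 33-bit field
  return Upper == 0 || Upper == 0x1FFFFFFFFull;
}

// Register image of an i31 passed with `zeroext`: only bits 30:0 may be set.
constexpr uint64_t zextI31(uint32_t V) { return V & 0x7FFFFFFFu; }

// Every zero-extended i31, including the largest one, satisfies the property.
static_assert(isSignExtended32(zextI31(0)), "smallest i31");
static_assert(isSignExtended32(zextI31(0x7FFFFFFFu)), "largest i31");
// A properly sign-extended negative i32 satisfies it too.
static_assert(isSignExtended32(0xFFFFFFFF80000000ull), "INT32_MIN");
// A register with only bit 32 set does not.
static_assert(!isSignExtended32(0x100000000ull), "stray upper bit");
```

Since bit 31 of a zero-extended i31 is always zero, the argument already looks like a sign-extended i32 and the sext.w in the loop is redundant.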

Revision Contents

Diff 465142

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h

llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.cpp

llvm/lib/Target/RISCV/RISCVSExtWRemoval.cpp

llvm/test/CodeGen/RISCV/select-cc.ll

llvm/test/CodeGen/RISCV/sextw-removal.ll
