This is an archive of the discontinued LLVM Phabricator instance.

Differential D62857

[RISCV] Prevent re-ordering some adds after shifts
Closed, Public

Authored by lenary on Jun 4 2019, 7:05 AM.

Details

Reviewers

asb
luismarques
efriedma

Commits

rG9f155bc6e592: [RISCV] Prevent re-ordering some adds after shifts
rL363736: [RISCV] Prevent re-ordering some adds after shifts

Summary

DAGCombine will normally turn a (shl (add x, c1), c2) into (add (shl x, c2), c1 << c2), where c1 and c2 are constants. This can be prevented by a callback in TargetLowering.
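
As a quick check of the algebra behind this combine, the sketch below evaluates both forms on 32-bit unsigned values, where overflow wraps modulo 2^32. The function names `addThenShift` and `shiftThenAdd` are made up for illustration and are not part of LLVM.

```cpp
#include <cstdint>

// Original form: (shl (add x, c1), c2), evaluated modulo 2^32.
// Assumes C2 < 32.
constexpr uint32_t addThenShift(uint32_t X, uint32_t C1, unsigned C2) {
  return (X + C1) << C2;
}

// Combined form: (add (shl x, c2), c1 << c2), evaluated modulo 2^32.
// Assumes C2 < 32.
constexpr uint32_t shiftThenAdd(uint32_t X, uint32_t C1, unsigned C2) {
  return (X << C2) + (C1 << C2);
}

// Both forms compute the same value, including when the add wraps around.
static_assert(addThenShift(5u, 4095u, 16) == shiftThenAdd(5u, 4095u, 16),
              "forms agree");
static_assert(addThenShift(0xFFFFFFFFu, 1u, 24) ==
                  shiftThenAdd(0xFFFFFFFFu, 1u, 24),
              "forms agree on wrap-around");
```

The two forms always agree, so the combine is purely a question of which constant is cheaper to produce.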

On RISC-V, materialising the constant c1 << c2 can be more expensive than materialising c1, because materialising the former may take more instructions, and may use a register, where materialising the latter would not.
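
To make the cost difference concrete, here is a minimal sketch of instruction counting for 32-bit constants. It assumes the LUI/ADDI split that `generateInstSeq` uses for values that fit in 32 bits; `fitsAddImmediate` and `matCost32` are hypothetical helper names, not LLVM APIs.

```cpp
#include <cstdint>

// True if V fits the 12-bit signed immediate of ADDI, i.e. [-2048, 2047].
constexpr bool fitsAddImmediate(int64_t V) { return V >= -2048 && V <= 2047; }

// Number of instructions needed to materialise a 32-bit constant into a
// register: an optional LUI for the upper 20 bits, plus an ADDI when the low
// 12 bits are non-zero (or when the whole value is zero).
constexpr int matCost32(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  // Adding 0x800 compensates for ADDI sign-extending its 12-bit immediate.
  uint32_t Hi20 = ((U + 0x800u) >> 12) & 0xFFFFFu;
  uint32_t Lo12 = U & 0xFFFu;
  int Cost = 0;
  if (Hi20 != 0)
    ++Cost; // LUI
  if (Lo12 != 0 || Hi20 == 0)
    ++Cost; // ADDI
  return Cost;
}

// c1 = 257 folds into an addi immediate; c1 << 4 = 4112 does not fit and
// needs two instructions plus a scratch register.
static_assert(fitsAddImmediate(257) && matCost32(257) == 1, "c1 is cheap");
static_assert(!fitsAddImmediate(257 << 4) && matCost32(257 << 4) == 2,
              "c1 << c2 is more expensive");
```

In this example, keeping the add before the shift avoids both the extra instruction and the extra register.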

This patch implements the hook in RISCVTargetLowering to prevent this transform, in the cases where:

c1 fits into the immediate field in an addi instruction.
c1 takes fewer instructions to materialise than c1 << c2.

In future, DAGCombine could do the check to see whether c1 fits into an add immediate, which might simplify more targets' hooks than just RISC-V.

Diff Detail

Repository: rL LLVM

Event Timeline

lenary created this revision. Jun 4 2019, 7:05 AM

Herald added a project: Restricted Project. Jun 4 2019, 7:05 AM

Herald added subscribers: llvm-commits, benna, psnobl and 19 others.

Harbormaster completed remote builds in B32876: Diff 202931. Jun 4 2019, 7:07 AM

Add comments about larger constants. These can be improved at a later date.

Harbormaster completed remote builds in B32879: Diff 202939. Jun 4 2019, 7:41 AM

Jim added a subscriber: Jim. Jun 4 2019, 7:32 PM

Generalise optimisation to check materialisation cost

Harbormaster completed remote builds in B32936: Diff 203171. Jun 5 2019, 8:58 AM

asb added inline comments. Jun 5 2019, 9:06 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
860 ↗	(On Diff #203171)	I was actually thinking it might be better to define getIntImmCost for RISC-V (which at least initially uses generateInstSeq, even if that might result in a little wasted work), then call that from here (that change might affect codegen in other areas, but should be an improvement). Arguably the introduction of getIntImmCost could make sense as a separate patch (that this one depends on), if a sensible standalone test case is straightforward.

Abstract away calculation of Materialisation Cost

Harbormaster completed remote builds in B32979: Diff 203312. Jun 6 2019, 2:35 AM

lewis-revill added a subscriber: lewis-revill. Jun 6 2019, 2:37 AM

lewis-revill added inline comments.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
869 ↗	(On Diff #203171)	Isn't there an inaccuracy in this method of checking the materialization cost, since `RISCVMatInt::generateInstSeq` always calculates the cost of materializing into a register? In this case we have instructions which might use immediates, but this will calculate the cost as being the same as a single-instruction materialization into a register followed by an instruction using a register. I.e. `addi rd, rs1, C` would appear to have the same cost as `lui rs2, (C >> 12)` followed by `add rd, rs1, rs2`.

Remove out-of-date comments

Harbormaster completed remote builds in B32980: Diff 203314. Jun 6 2019, 2:43 AM

asb added inline comments. Jun 6 2019, 3:00 AM

llvm/lib/Target/RISCV/Utils/RISCVMatInt.cpp
78 ↗	(On Diff #203314)	Should add a comment to document what this does, and to document that it really does calculate the cost of materialising an integer (i.e. doesn't take into account whether there might be an opportunity for merging it into an addi). You should also document that it is invalid to call this for a Val which can't be represented with 32 bits when Is64Bit is false (that triggers an assert in generateInstSeq - I think it's probably still ok to treat that as an API misuse rather than adding more explicit error handling).

lenary edited the summary of this revision. Jun 6 2019, 3:35 AM

lenary retitled this revision from [RISCV] Prevent hoisting some adds after shifts to [RISCV] Prevent re-ordering some adds after shifts. Jun 6 2019, 3:48 AM

lenary marked 3 inline comments as done. Jun 6 2019, 4:06 AM

lenary added inline comments.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
869 ↗	(On Diff #203171)	On line 860, we check that `C1` (which you're calling `C`) will fit into an add immediate. If it will, that counts as "free", and we definitely want to prevent the re-ordering, and we don't even check the materialisation cost. In fact, I'm going to update the patch to always allow the re-ordering if `C1 << C2` will fit into an add immediate, because then we also know that materialisation of that constant is "free", and so we should allow the re-ordering because it might help later dagcombines.
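
The decision order described in this comment can be sketched as a small standalone function. This is only an illustration under stated assumptions: `isDesirableToCommute` and `fitsAddImmediate` are hypothetical names, and the materialisation costs are passed in as plain integers rather than computed by `RISCVMatInt`.

```cpp
#include <cstdint>

// True if V fits the 12-bit signed immediate of ADDI, i.e. [-2048, 2047].
constexpr bool fitsAddImmediate(int64_t V) { return V >= -2048 && V <= 2047; }

// Returns true if the add should be moved after the shift.
//   C1, ShiftedC1         : the constants c1 and c1 << c2
//   C1Cost, ShiftedC1Cost : instruction counts to materialise each constant
constexpr bool isDesirableToCommute(int64_t C1, int64_t ShiftedC1, int C1Cost,
                                    int ShiftedC1Cost) {
  // c1 << c2 folds into an add immediate, so re-ordering is "free".
  if (fitsAddImmediate(ShiftedC1))
    return true;
  // c1 folds into an add immediate but c1 << c2 does not: keep the add first.
  if (fitsAddImmediate(C1))
    return false;
  // Neither folds: only block the re-ordering if c1 is strictly cheaper.
  return C1Cost >= ShiftedC1Cost;
}

// c1 = 1, c2 = 24: c1 is free, c1 << c2 is not, so the combine is blocked.
static_assert(!isDesirableToCommute(1, int64_t{1} << 24, 1, 1), "blocked");
// c1 = 4095, c2 = 16 on a 32-bit type: c1 costs 2 (lui+addi), c1 << c2
// costs 1 (lui), so the combine is allowed.
static_assert(isDesirableToCommute(4095, int64_t{4095} << 16, 2, 1),
              "allowed");
```

These two cases appear to line up with the RV32 checks for `add_small_const` and `add_large_const` in add-before-shl.ll.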

Allow Combine if C1 << C2 will fit into an immediate
Explain restrictions on RISCVMatInt::getIntMatCost

Harbormaster completed remote builds in B33052: Diff 203525. Jun 7 2019, 3:21 AM

lenary marked an inline comment as done. Jun 7 2019, 3:21 AM

I added some minor comments, along with a bigger suggestion. What do you think about adding a getIntMatCost taking APInt and IsRV64, which will split the immediate into XLEN-sized chunks (see comment for reference to similar code in AArch64)?

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
848 ↗	(On Diff #203525)	Actually, with the logic there now I guess we're checking that (add val, c1) is fewer instructions than (add val', c1 << c2), which is a similar condition but not quite the same. I.e. both c1 and c1 << c2 might be materialisable with a single instruction, but if c1 << c2 is materialised using a single lui it's still unprofitable as we can't merge it into the add.
860 ↗	(On Diff #203525)	shift immediate -> add immediate
861 ↗	(On Diff #203525)	End comment with a full stop
870 ↗	(On Diff #203525)	I'm not totally liking calling getIntMatCost with something that isn't logically equivalent to IsRV64 (you could of course have MVT::i64 on RV32 pre-legalisation). The cost of materialising a 64-bit constant on RV64 also isn't necessarily the same as materialising an i64 split into two i32 on RV32. In an ideal world, we'd have a getIntMatCost taking an APInt+IsRV64, with similar logic to `int AArch64TTIImpl::getIntImmCost(const APInt &Imm, Type *Ty)` - i.e. splitting into 32-bit/64-bit chunks. Hard to imagine this being a big deal for this particular case, but it's good infrastructure to have.

asb mentioned this in D63007: [RISCV] Add RISCV-specific TargetTransformInfo. Jun 7 2019, 11:43 PM

Address review feedback

Introduce new getIntMatCost(const APInt &Val, bool IsRV64) API, which can materialise much wider constants than the previous method.
Clarify and check grammar in comments.

Harbormaster completed remote builds in B33479: Diff 205064. Jun 17 2019, 6:49 AM

lenary marked 4 inline comments as done. Jun 17 2019, 6:51 AM

In future, DAGCombine could do the check to see whether c1 fits into an add immediate, which might simplify more targets' hooks than just RISC-V.

Why not do this the right way already?

@lebedev.ri @craig.topper

lenary mentioned this in D63433: [RISCV] Add RISCV-specific TargetTransformInfo. Jun 17 2019, 8:29 AM

Thanks for the update Sam, I think there might be a minor correctness issue with getIntMatCost - let me know what you think.

It would be great to add a simple sanity check for the chunking logic. Perhaps there's a test case involving -1 that produces a different answer with the current version of the patch versus a version updated to use the type size for the bit size.

llvm/lib/Target/RISCV/Utils/RISCVMatInt.cpp
78 ↗	(On Diff #205064)	Given you have a description in the header, I think this repeated description for the implementation is redundant (see https://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments "Don’t duplicate the documentation comment in the header file and in the implementation file"). Though maybe I missed a de facto standard elsewhere in the codebase for duplicating a reduced version. I don't feel strongly about this, so do feel free to keep if you prefer.
81 ↗	(On Diff #205064)	I think this isn't going to give the desired result if Val is e.g. -1. In that case, getMinSignedBits is going to return 1, meaning the loop is only executed once regardless of the original type width. If the original type was e.g. an i64 on RV32, then two instructions are required to materialise the constant, yet the logic in this function will only return 1. I think the solution is to more closely mirror the AArch64TTIImpl::getIntImmCost function I mentioned before, and add a Type parameter (and maybe also adopt similar logic for sign-extending constants to be a multiple of the PlatRegSize).
llvm/lib/Target/RISCV/Utils/RISCVMatInt.h
41 ↗	(On Diff #205064)	Nit: Shouldn't this be Is64Bit to match the naming in the implementation?
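
The -1 concern raised above can be illustrated with a small sketch that counts how many register-sized chunks a cost loop would visit. It assumes 32-bit registers (RV32) and models the constant as an `int64_t`; `minSignedBits` mimics what `APInt::getMinSignedBits` reports, and both helper names are hypothetical.

```cpp
#include <cstdint>

// Minimum number of bits needed to represent V as a signed integer.
// Both 0 and -1 need just one bit.
constexpr unsigned minSignedBits(int64_t V) {
  uint64_t U = static_cast<uint64_t>(V);
  if (V < 0)
    U = ~U; // Count the bits that differ from the sign bit.
  unsigned Bits = 1; // The sign bit itself.
  while (U != 0) {
    ++Bits;
    U >>= 1;
  }
  return Bits;
}

// Number of iterations of a loop of the form
//   for (Shift = 0; Shift < Bits; Shift += RegSize)
constexpr unsigned numChunks(unsigned Bits, unsigned RegSize) {
  return (Bits + RegSize - 1) / RegSize;
}

// Bounding the loop by the minimum signed bits of -1 visits one chunk...
static_assert(numChunks(minSignedBits(-1), 32) == 1, "under-counts");
// ...while bounding it by the type width (i64 on RV32) visits both halves.
static_assert(numChunks(64, 32) == 2, "covers both registers");
```

Bounding the loop by the type size rather than by the value's minimum width is what makes the cost account for every register the constant occupies.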

In D62857#1546071, @xbolva00 wrote:

In future, DAGCombine could do the check to see whether c1 fits into an add immediate, which might simplify more targets' hooks than just RISC-V.

Why not do this the right way already?

@lebedev.ri @craig.topper

Now this patch has been extended to cover more cases (i.e. comparing materialisation cost rather than just identifying the isLegalAddImmediate case), it wouldn't have much effect on this backend. A patch that uses isLegalAddImmediate in the relevant DAGCombine might be a small benefit for targets that don't implement the isDesirableToCommuteWithShift hook, but I don't think it would affect the implementation here (it's not obvious to me the hook implementation should assume isLegalAddImmediate had already been checked). So if it does make sense to add, I think it's definitely a separate patch to this one.

asb added inline comments. Jun 18 2019, 1:10 AM

llvm/test/CodeGen/RISCV/add-before-shl.ll
13 ↗	(On Diff #205064)	The patch has since been updated to do a direct cost comparison, rather than just looking at the case where the constant fits into an immediate. These two paragraphs should be updated to reflect that.

lenary added a child revision: D63433: [RISCV] Add RISCV-specific TargetTransformInfo. Jun 18 2019, 1:48 AM

Address review feedback

Update getIntMatCost to take an integer size (in bits). This ensures we chunk the constant correctly for the legal types on the target, and account for the costs of all required chunks. I was unable to devise a simple test case for when this behaviour would not match the behaviour using getMinSignedBits, due to legalisation always splitting the wider type before the isDesirableToCommuteWithShift callback is called.

The chunking will automatically expand each chunk to be the platform register width, so we don't need to sign extend the constant to be a multiple of that width before we start chunking.

Update and de-duplicate comments on tests and implementations
Update naming of IsRV64 parameter in RISCVMatInt.{h,cpp}
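
The chunking described in this update can be sketched for the RV32 case as follows. This is a simplified model, not the committed code: it represents the wide constant as an `int64_t` instead of an `APInt`, fixes the register size at 32 bits, and uses a hypothetical helper name `chunk32`.

```cpp
#include <cstdint>

// Extract the 32-bit chunk of Val that starts at bit Shift (0 or 32) and
// return it sign-extended to 64 bits, ready to be costed on its own.
constexpr int64_t chunk32(int64_t Val, unsigned Shift) {
  // Shift the raw bit pattern down and keep only the low 32 bits.
  uint32_t Low = static_cast<uint32_t>(static_cast<uint64_t>(Val) >> Shift);
  // Sign-extend the chunk by hand so the result is a valid 32-bit immediate.
  return Low >= 0x80000000u ? static_cast<int64_t>(Low) - (int64_t{1} << 32)
                            : static_cast<int64_t>(Low);
}

// An i64 -1 on RV32 splits into two chunks that are each -1.
static_assert(chunk32(-1, 0) == -1 && chunk32(-1, 32) == -1, "two chunks");
// 0x180000000 splits into a low chunk of INT32_MIN and a high chunk of 1.
static_assert(chunk32(0x180000000LL, 0) == -2147483648LL, "low half");
static_assert(chunk32(0x180000000LL, 32) == 1, "high half");
```

Each chunk would then be costed separately and the per-chunk costs summed.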

Harbormaster completed remote builds in B33558: Diff 205350. Jun 18 2019, 7:55 AM

lenary marked 4 inline comments as done. Jun 18 2019, 7:58 AM

lenary added inline comments.

llvm/lib/Target/RISCV/Utils/RISCVMatInt.cpp
81 ↗	(On Diff #205064)	I think I now cover chunking correctly. As in my message above, I don't need to sign extend Val before chunking, because each chunk is extended to be PlatRegSize within the loop.

LGTM, thanks!

This revision is now accepted and ready to land. Jun 18 2019, 8:19 AM

Closed by commit rL363736: [RISCV] Prevent re-ordering some adds after shifts (authored by lenary). Jun 18 2019, 1:35 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                  Size
llvm/trunk/lib/Target/RISCV/RISCVISelLowering.h       2 lines
llvm/trunk/lib/Target/RISCV/RISCVISelLowering.cpp     45 lines
llvm/trunk/lib/Target/RISCV/Utils/RISCVMatInt.h       9 lines
llvm/trunk/lib/Target/RISCV/Utils/RISCVMatInt.cpp     23 lines
llvm/trunk/test/CodeGen/RISCV/add-before-shl.ll       74 lines

Diff 205425

llvm/trunk/lib/Target/RISCV/RISCVISelLowering.h

(119 lines not shown)
   ISD::NodeType getExtendForAtomicOps() const override {
     return ISD::SIGN_EXTEND;
   }

   bool shouldExpandShift(SelectionDAG &DAG, SDNode *N) const override {
     if (DAG.getMachineFunction().getFunction().hasMinSize())
       return false;
     return true;
   }
+  bool isDesirableToCommuteWithShift(const SDNode *N,
+                                     CombineLevel Level) const override;

 private:
   void analyzeInputArgs(MachineFunction &MF, CCState &CCInfo,
                         const SmallVectorImpl<ISD::InputArg> &Ins,
                         bool IsRet) const;
   void analyzeOutputArgs(MachineFunction &MF, CCState &CCInfo,
                          const SmallVectorImpl<ISD::OutputArg> &Outs,
                          bool IsRet, CallLoweringInfo *CLI) const;
(55 lines not shown)

llvm/trunk/lib/Target/RISCV/RISCVISelLowering.cpp

(11 lines not shown)
 //===----------------------------------------------------------------------===//

 #include "RISCVISelLowering.h"
 #include "RISCV.h"
 #include "RISCVMachineFunctionInfo.h"
 #include "RISCVRegisterInfo.h"
 #include "RISCVSubtarget.h"
 #include "RISCVTargetMachine.h"
+#include "Utils/RISCVMatInt.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/CodeGen/CallingConvLower.h"
 #include "llvm/CodeGen/MachineFrameInfo.h"
 #include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/MachineInstrBuilder.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
 #include "llvm/CodeGen/SelectionDAGISel.h"
(821 lines not shown)
     return DCI.CombineTo(N,
                          DAG.getNode(ISD::AND, DL, MVT::i64, NewFMV,
                                      DAG.getConstant(~SignBit, DL, MVT::i64)));
   }
   }

   return SDValue();
 }

+bool RISCVTargetLowering::isDesirableToCommuteWithShift(
+    const SDNode *N, CombineLevel Level) const {
+  // The following folds are only desirable if `(OP _, c1 << c2)` can be
+  // materialised in fewer instructions than `(OP _, c1)`:
+  //
+  // (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
+  // (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)
+  SDValue N0 = N->getOperand(0);
+  MVT Ty = N0.getSimpleValueType();
+  if (Ty.isScalarInteger() &&
+      (N0.getOpcode() == ISD::ADD || N0.getOpcode() == ISD::OR)) {
+    auto *C1 = dyn_cast<ConstantSDNode>(N0->getOperand(1));
+    auto *C2 = dyn_cast<ConstantSDNode>(N->getOperand(1));
+    if (C1 && C2) {
+      APInt C1Int = C1->getAPIntValue();
+      APInt ShiftedC1Int = C1Int << C2->getAPIntValue();
+
+      // We can materialise `c1 << c2` into an add immediate, so it's "free",
+      // and the combine should happen, to potentially allow further combines
+      // later.
+      if (isLegalAddImmediate(ShiftedC1Int.getSExtValue()))
+        return true;
+
+      // We can materialise `c1` in an add immediate, so it's "free", and the
+      // combine should be prevented.
+      if (isLegalAddImmediate(C1Int.getSExtValue()))
+        return false;
+
+      // Neither constant will fit into an immediate, so find materialisation
+      // costs.
+      int C1Cost = RISCVMatInt::getIntMatCost(C1Int, Ty.getSizeInBits(),
+                                              Subtarget.is64Bit());
+      int ShiftedC1Cost = RISCVMatInt::getIntMatCost(
+          ShiftedC1Int, Ty.getSizeInBits(), Subtarget.is64Bit());
+
+      // Materialising `c1` is cheaper than materialising `c1 << c2`, so the
+      // combine should be prevented.
+      if (C1Cost < ShiftedC1Cost)
+        return false;
+    }
+  }
+  return true;
+}
+
 unsigned RISCVTargetLowering::ComputeNumSignBitsForTargetNode(
     SDValue Op, const APInt &DemandedElts, const SelectionDAG &DAG,
     unsigned Depth) const {
   switch (Op.getOpcode()) {
   default:
     break;
   case RISCVISD::SLLW:
   case RISCVISD::SRAW:
(1,501 lines not shown)

llvm/trunk/lib/Target/RISCV/Utils/RISCVMatInt.h

 //===- RISCVMatInt.h - Immediate materialisation ---------------*- C++ -*--===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//

 #ifndef LLVM_LIB_TARGET_RISCV_MATINT_H
 #define LLVM_LIB_TARGET_RISCV_MATINT_H

+#include "llvm/ADT/APInt.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Support/MachineValueType.h"
 #include <cstdint>

 namespace llvm {

 namespace RISCVMatInt {
 struct Inst {
   unsigned Opc;
   int64_t Imm;

   Inst(unsigned Opc, int64_t Imm) : Opc(Opc), Imm(Imm) {}
 };
 using InstSeq = SmallVector<Inst, 8>;

 // Helper to generate an instruction sequence that will materialise the given
 // immediate value into a register. A sequence of instructions represented by
 // a simple struct produced rather than directly emitting the instructions in
 // order to allow this helper to be used from both the MC layer and during
 // instruction selection.
 void generateInstSeq(int64_t Val, bool IsRV64, InstSeq &Res);
+
+// Helper to estimate the number of instructions required to materialise the
+// given immediate value into a register. This estimate does not account for
+// `Val` possibly fitting into an immediate, and so may over-estimate.
+//
+// This will attempt to produce instructions to materialise `Val` as an
+// `Size`-bit immediate. `IsRV64` should match the target architecture.
+int getIntMatCost(const APInt &Val, unsigned Size, bool IsRV64);
 } // namespace RISCVMatInt
 } // namespace llvm
 #endif

llvm/trunk/lib/Target/RISCV/Utils/RISCVMatInt.cpp

(10 lines not shown)
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Support/MachineValueType.h"
 #include "llvm/Support/MathExtras.h"
 #include <cstdint>

 namespace llvm {

 namespace RISCVMatInt {
-void generateInstSeq(int64_t Val, bool Is64Bit, InstSeq &Res) {
+void generateInstSeq(int64_t Val, bool IsRV64, InstSeq &Res) {
   if (isInt<32>(Val)) {
     // Depending on the active bits in the immediate Value v, the following
     // instruction sequences are emitted:
     //
     // v == 0 : ADDI
     // v[0,12) != 0 && v[12,32) == 0 : ADDI
     // v[0,12) == 0 && v[12,32) != 0 : LUI
     // v[0,32) != 0 : LUI+ADDI(W)
     int64_t Hi20 = ((Val + 0x800) >> 12) & 0xFFFFF;
     int64_t Lo12 = SignExtend64<12>(Val);

     if (Hi20)
       Res.push_back(Inst(RISCV::LUI, Hi20));

     if (Lo12 || Hi20 == 0) {
-      unsigned AddiOpc = (Is64Bit && Hi20) ? RISCV::ADDIW : RISCV::ADDI;
+      unsigned AddiOpc = (IsRV64 && Hi20) ? RISCV::ADDIW : RISCV::ADDI;
       Res.push_back(Inst(AddiOpc, Lo12));
     }
     return;
   }

-  assert(Is64Bit && "Can't emit >32-bit imm for non-RV64 target");
+  assert(IsRV64 && "Can't emit >32-bit imm for non-RV64 target");

   // In the worst case, for a full 64-bit constant, a sequence of 8 instructions
   // (i.e., LUI+ADDIW+SLLI+ADDI+SLLI+ADDI+SLLI+ADDI) has to be emmitted. Note
   // that the first two instructions (LUI+ADDIW) can contribute up to 32 bits
   // while the following ADDI instructions contribute up to 12 bits each.
   //
   // On the first glance, implementing this seems to be possible by simply
   // emitting the most significant 32 bits (LUI+ADDIW) followed by as many left
(13 lines not shown, within generateInstSeq)
   // fits into 32 bits. The emission of the shifts and additions is subsequently
   // performed when the recursion returns.

   int64_t Lo12 = SignExtend64<12>(Val);
   int64_t Hi52 = (Val + 0x800) >> 12;
   int ShiftAmount = 12 + findFirstSet((uint64_t)Hi52);
   Hi52 = SignExtend64(Hi52 >> (ShiftAmount - 12), 64 - ShiftAmount);

-  generateInstSeq(Hi52, Is64Bit, Res);
+  generateInstSeq(Hi52, IsRV64, Res);

   Res.push_back(Inst(RISCV::SLLI, ShiftAmount));
   if (Lo12)
     Res.push_back(Inst(RISCV::ADDI, Lo12));
 }
+
+int getIntMatCost(const APInt &Val, unsigned Size, bool IsRV64) {
+  int PlatRegSize = IsRV64 ? 64 : 32;
+
+  // Split the constant into platform register sized chunks, and calculate cost
+  // of each chunk.
+  int Cost = 0;
+  for (unsigned ShiftVal = 0; ShiftVal < Size; ShiftVal += PlatRegSize) {
+    APInt Chunk = Val.ashr(ShiftVal).sextOrTrunc(PlatRegSize);
+    InstSeq MatSeq;
+    generateInstSeq(Chunk.getSExtValue(), IsRV64, MatSeq);
+    Cost += MatSeq.size();
+  }
+  return std::max(1, Cost);
+}
 } // namespace RISCVMatInt
 } // namespace llvm

llvm/trunk/test/CodeGen/RISCV/add-before-shl.ll

+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV32I %s
+; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV64I %s
+
+; These test that constant adds are not moved after shifts by DAGCombine,
+; if the constant is cheaper to materialise before it has been shifted.
+
+define signext i32 @add_small_const(i32 signext %a) nounwind {
+; RV32I-LABEL: add_small_const:
+; RV32I: # %bb.0:
+; RV32I-NEXT: addi a0, a0, 1
+; RV32I-NEXT: slli a0, a0, 24
+; RV32I-NEXT: srai a0, a0, 24
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: add_small_const:
+; RV64I: # %bb.0:
+; RV64I-NEXT: addi a0, a0, 1
+; RV64I-NEXT: slli a0, a0, 56
+; RV64I-NEXT: srai a0, a0, 56
+; RV64I-NEXT: ret
+  %1 = add i32 %a, 1
+  %2 = shl i32 %1, 24
+  %3 = ashr i32 %2, 24
+  ret i32 %3
+}
+
+define signext i32 @add_large_const(i32 signext %a) nounwind {
+; RV32I-LABEL: add_large_const:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a0, a0, 16
+; RV32I-NEXT: lui a1, 65520
+; RV32I-NEXT: add a0, a0, a1
+; RV32I-NEXT: srai a0, a0, 16
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: add_large_const:
+; RV64I: # %bb.0:
+; RV64I-NEXT: lui a1, 1
+; RV64I-NEXT: addiw a1, a1, -1
+; RV64I-NEXT: add a0, a0, a1
+; RV64I-NEXT: slli a0, a0, 48
+; RV64I-NEXT: srai a0, a0, 48
+; RV64I-NEXT: ret
+  %1 = add i32 %a, 4095
+  %2 = shl i32 %1, 16
+  %3 = ashr i32 %2, 16
+  ret i32 %3
+}
+
+define signext i32 @add_huge_const(i32 signext %a) nounwind {
+; RV32I-LABEL: add_huge_const:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a0, a0, 16
+; RV32I-NEXT: lui a1, 524272
+; RV32I-NEXT: add a0, a0, a1
+; RV32I-NEXT: srai a0, a0, 16
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: add_huge_const:
+; RV64I: # %bb.0:
+; RV64I-NEXT: lui a1, 8
+; RV64I-NEXT: addiw a1, a1, -1
+; RV64I-NEXT: add a0, a0, a1
+; RV64I-NEXT: slli a0, a0, 48
+; RV64I-NEXT: srai a0, a0, 48
+; RV64I-NEXT: ret
+  %1 = add i32 %a, 32767
+  %2 = shl i32 %1, 16
+  %3 = ashr i32 %2, 16
+  ret i32 %3
+}