
Details

Reviewers

asb
lenary
luismarques
shiva0217
kito-cheng
MaskRay

Commits

rGcb82de296017: [RISCV] Optimize multiplication by constant

Summary

... to shift/add or shift/sub.
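
The rewrite named in the summary rests on four arithmetic identities for constants of the form ±(2^k ± 1). A minimal standalone sketch of those identities follows. The helper names are illustrative only and do not appear in the patch, and unsigned 32-bit arithmetic is used so that wrap-around is well defined.

```cpp
#include <cstdint>

// x * (2^k + 1): one shift and one add.
constexpr uint32_t mulPow2Plus1(uint32_t x, unsigned k) { return (x << k) + x; }

// x * (2^k - 1): one shift and one subtract.
constexpr uint32_t mulPow2Minus1(uint32_t x, unsigned k) { return (x << k) - x; }

// x * (1 - 2^k), i.e. x * -(2^k - 1): subtract the shifted value from x.
constexpr uint32_t mulOneMinusPow2(uint32_t x, unsigned k) { return x - (x << k); }

// x * -(2^k + 1): shift, add, then negate.
constexpr uint32_t mulNegPow2Plus1(uint32_t x, unsigned k) {
  return 0u - ((x << k) + x);
}

// Compile-time spot checks for k = 6 (constants 65, 63, -63, -65).
static_assert(mulPow2Plus1(3, 6) == 195u, "3 * 65");
static_assert(mulPow2Minus1(3, 6) == 189u, "3 * 63");
static_assert(mulOneMinusPow2(3, 6) == static_cast<uint32_t>(-189), "3 * -63");
static_assert(mulNegPow2Plus1(3, 6) == static_cast<uint32_t>(-195), "3 * -65");
```

The shift amount k must be smaller than the bit width (32 here) for the shifts to be defined.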

Diff Detail

Event Timeline

benshi001 created this revision. · Jun 26 2020, 7:46 AM

Herald added a project: Restricted Project. · Jun 26 2020, 7:46 AM

Herald added subscribers: llvm-commits, evandro, apazos and 23 others.

This patch cannot cover all cases (especially "call __mulsi3" on rv32 without the M extension), but it works well for most cases.

A better solution may be to mark ISD::MUL as Custom, which I will try later. In the meantime, I would appreciate a review and acceptance of this partial optimization.

benshi001 edited the summary of this revision. · Jun 26 2020, 8:11 AM

benshi001 edited the summary of this revision. · Jun 26 2020, 8:14 AM

benshi001 edited the summary of this revision. · Jun 26 2020, 8:25 AM

Thanks for the patch!

This optimisation is done by DAGCombine if you instead implement decomposeMulByConstant in RISCVTargetLowering. Read the comment on the TargetLoweringBase class to understand how to use it. This is preferable, because we don't want to maintain a target-specific copy of this optimisation if we can avoid it.

It would be sensible to base your implementation on the one in the x86 backend: X86TargetLowering::decomposeMulByConstant, which deals with some phase ordering issues around legalisation.
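
As a rough illustration of the yes/no decision such a hook makes, here is a standalone sketch that does not depend on LLVM. `isPowerOf2` is a local stand-in for `llvm::isPowerOf2_64`, and `isShiftAddSubCandidate` is a hypothetical name for the check on the constant. The real hook also receives the context and value type, which this model ignores.

```cpp
#include <cstdint>

// Local stand-in for llvm::isPowerOf2_64: true for 1, 2, 4, 8, ...
constexpr bool isPowerOf2(uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Works on the unsigned bit pattern so that Imm + 1, 1 - Imm, etc. wrap
// instead of overflowing a signed integer.
constexpr bool isCandidateBits(uint64_t u) {
  return isPowerOf2(u + 1) ||               // Imm ==  2^k - 1
         isPowerOf2(u - 1) ||               // Imm ==  2^k + 1
         isPowerOf2(1 - u) ||               // Imm ==  1 - 2^k
         isPowerOf2(uint64_t(0) - 1 - u);   // Imm == -1 - 2^k
}

// Hypothetical helper: can a multiply by Imm become one shift plus one
// add/sub (possibly followed by a negate)?
constexpr bool isShiftAddSubCandidate(int64_t imm) {
  return isCandidateBits(static_cast<uint64_t>(imm));
}

static_assert(isShiftAddSubCandidate(5), "5 == 4 + 1");
static_assert(!isShiftAddSubCandidate(10), "10 is not within 1 of a power of 2");
```

Constants that are themselves powers of two, such as 4, are rejected by this check; a plain shift already covers them.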

benshi001 edited the summary of this revision. · Jun 26 2020, 8:27 AM

benshi001 updated this revision to Diff 273889. · Jun 27 2020, 1:49 AM

benshi001 edited the summary of this revision.

Thanks. I have uploaded a new version according to your suggestion!

In D82660#2117161, @lenary wrote:

Thanks for the patch!

This optimisation is done by DAGCombine if you instead implement decomposeMulByConstant in RISCVTargetLowering. Read the comment on the TargetLoweringBase class to understand how to use it. This is preferable, because we don't want to maintain a target-specific copy of this optimisation if we can avoid it.

It would be sensible to base your implementation on the one in the x86 backend: X86TargetLowering::decomposeMulByConstant, which deals with some phase ordering issues around legalisation.

This is looking good.

I'm going to pre-commit the test additions today - if you could rebase your changes on top, that will allow us to see how this change affects the new testcases you added. I'll keep you as the author and let you know the sha so you can rebase on top of the commit.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2989	This TODO should not apply to RISC-V, yet.

In D82660#2126537, @lenary wrote:

This is looking good.

I'm going to pre-commit the test additions today - if you could rebase your changes on top, that will allow us to see how this change affects the new testcases you added. I'll keep you as the author and let you know the sha so you can rebase on top of the commit.

Done in rG003a086ffc0.

lenary mentioned this in rG003a086ffc0d: [RISCV][NFC] Pre-commit tests for D82660. · Jul 1 2020, 3:09 PM

benshi001 updated this revision to Diff 274997. · Jul 1 2020, 8:24 PM

benshi001 edited the summary of this revision.

benshi001 marked 2 inline comments as done. · Jul 1 2020, 8:31 PM

benshi001 added inline comments.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2989	Thanks. I have rebased and fixed according to what you suggested.

I'm happy with this optimisation where this patch removes multiply libcalls.

Where the target has the M extension, and especially for 64-bit multiplies on rv32im, I'm not sure this is an optimisation.

I think that, for the moment, we should add a guard to the hook to avoid this transformation where we do have mul instructions:

if (Subtarget.hasStdExtM())
  return false;

What do you think?

llvm/test/CodeGen/RISCV/mul.ll
296–454	I think this is a pessimisation, though I realise that depends on how slow the 32-bit multiplier is compared to add/shift.

benshi001 updated this revision to Diff 275136. · Jul 2 2020, 8:36 AM

benshi001 marked an inline comment as done.

benshi001 updated this revision to Diff 275139. · Jul 2 2020, 8:44 AM

In D82660#2127626, @lenary wrote:
I'm happy with this optimisation where this patch removes multiply libcalls.

Where the target has the M extension, and especially for 64-bit multiplies on rv32im, I'm not sure this is an optimisation.

I think that, for the moment, we should add a guard to the hook to avoid this transformation where we do have mul instructions:
if (Subtarget.hasStdExtM())
  return false;
What do you think?

Shall we loosen the guard condition to the following?

if (!Subtarget.is64Bit() && Subtarget.hasStdExtM())
  return false;

This will prevent the optimization for RV32IM, but it will still apply to RV64IM.

I think a mul instruction's latency is sure to be >= 2, so none of the existing test cases should regress.
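
To make the difference between the two proposed guards concrete, here is a small standalone model. `SubtargetModel` is a hypothetical stand-in for the two subtarget queries discussed above (it is not the real RISCVSubtarget class), and each function returns true when the guard would make the hook return false.

```cpp
// Hypothetical stand-in for the two subtarget properties used by the guards.
struct SubtargetModel {
  bool Is64Bit;     // models Subtarget.is64Bit()
  bool HasStdExtM;  // models Subtarget.hasStdExtM()
};

// Guard first proposed in review: never decompose when mul instructions exist.
constexpr bool blockedByStrictGuard(SubtargetModel S) { return S.HasStdExtM; }

// Loosened guard: only skip the transformation on RV32 with the M extension.
constexpr bool blockedByLoosenedGuard(SubtargetModel S) {
  return !S.Is64Bit && S.HasStdExtM;
}

// The two guards differ only for RV64IM.
static_assert(blockedByStrictGuard(SubtargetModel{true, true}), "RV64IM, strict");
static_assert(!blockedByLoosenedGuard(SubtargetModel{true, true}), "RV64IM, loosened");
```

Under this model, RV32I and RV64I are never blocked, RV32IM is blocked by both guards, and RV64IM is blocked only by the stricter one.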

LGTM.
I'm not overly concerned about the occasional code size increases from doing the optimization for RV32IM, so the loosening of the condition is OK IMO.
Everything else seems to be in order now.
Maybe wait a couple of days more for @lenary's OK.

This revision is now accepted and ready to land. · Jul 6 2020, 10:10 AM

One issue, then I'm happy.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2984	getSExtValue will assert if the value does not fit into 64 bits - you need to do a check before you get there. I think this hook can be called before legalisation, so you may not get only legal types in this call.
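
One conservative way to address this kind of concern is to check the constant's declared width before reading it as a 64-bit value. The sketch below is a standalone model of that ordering only. `ConstantModel` and `tryGetImm` are hypothetical names, not LLVM APIs, and the sketch does not claim to match the check used in the final patch.

```cpp
#include <cstdint>

// Hypothetical model of a constant operand: its declared width in bits and,
// when that width is at most 64, its sign-extended value.
struct ConstantModel {
  unsigned BitWidth;
  int64_t SExtValue;  // only meaningful when BitWidth <= 64
};

// Writes the value to Out and returns true only when the constant is narrow
// enough to be read as an int64_t; otherwise leaves Out untouched.
bool tryGetImm(const ConstantModel &C, int64_t &Out) {
  if (C.BitWidth > 8 * sizeof(int64_t))
    return false;  // e.g. an i128 constant seen before type legalisation
  Out = C.SExtValue;
  return true;
}
```

A caller would return false from the hook whenever `tryGetImm` fails, so the power-of-two checks only ever see values that fit in 64 bits.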

benshi001 updated this revision to Diff 276297. · Jul 7 2020, 6:27 PM

benshi001 marked an inline comment as done.

MaskRay accepted this revision. · Jul 7 2020, 6:38 PM

MaskRay retitled this revision from [RISCV] Optimize multiplication by specific immediates to [RISCV] Optimize multiplication by constant.

MaskRay edited the summary of this revision.

Closed by commit rGcb82de296017: [RISCV] Optimize multiplication by constant (authored by benshi001, committed by MaskRay). · Jul 7 2020, 6:50 PM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: jrtc27. · Jul 7 2020, 6:50 PM

Diff 273889

llvm/lib/Target/RISCV/RISCVISelLowering.h

[179 lines not shown; enclosing context: public:]

bool shouldConvertConstantLoadToIntImm(const APInt &Imm,		bool shouldConvertConstantLoadToIntImm(const APInt &Imm,
Type *Ty) const override {		Type *Ty) const override {
return true;		return true;
}		}
bool mayBeEmittedAsTailCall(const CallInst *CI) const override;		bool mayBeEmittedAsTailCall(const CallInst *CI) const override;
bool shouldConsiderGEPOffsetSplit() const override { return true; }		bool shouldConsiderGEPOffsetSplit() const override { return true; }

		bool decomposeMulByConstant(LLVMContext &Context, EVT VT,
		SDValue C) const override;

TargetLowering::AtomicExpansionKind		TargetLowering::AtomicExpansionKind
shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const override;		shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const override;
Value *emitMaskedAtomicRMWIntrinsic(IRBuilder<> &Builder, AtomicRMWInst *AI,		Value *emitMaskedAtomicRMWIntrinsic(IRBuilder<> &Builder, AtomicRMWInst *AI,
Value *AlignedAddr, Value *Incr,		Value *AlignedAddr, Value *Incr,
Value *Mask, Value *ShiftAmt,		Value *Mask, Value *ShiftAmt,
AtomicOrdering Ord) const override;		AtomicOrdering Ord) const override;
TargetLowering::AtomicExpansionKind		TargetLowering::AtomicExpansionKind
shouldExpandAtomicCmpXchgInIR(AtomicCmpXchgInst *CI) const override;		shouldExpandAtomicCmpXchgInIR(AtomicCmpXchgInst *CI) const override;
[46 lines not shown]

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

[28 lines not shown]
#include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"		#include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"
#include "llvm/CodeGen/ValueTypes.h"		#include "llvm/CodeGen/ValueTypes.h"
#include "llvm/IR/DiagnosticInfo.h"		#include "llvm/IR/DiagnosticInfo.h"
#include "llvm/IR/DiagnosticPrinter.h"		#include "llvm/IR/DiagnosticPrinter.h"
#include "llvm/IR/IntrinsicsRISCV.h"		#include "llvm/IR/IntrinsicsRISCV.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
		#include "llvm/Support/MathExtras.h"

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "riscv-lower"		#define DEBUG_TYPE "riscv-lower"

STATISTIC(NumTailCalls, "Number of tail calls");		STATISTIC(NumTailCalls, "Number of tail calls");

RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,		RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
[2,922 lines not shown; enclosing context: bool RISCVTargetLowering::shouldExtendTypeInLibCall(EVT Type) const {]
// arguments or return value is f32 type for LP64 ABI.		// arguments or return value is f32 type for LP64 ABI.
RISCVABI::ABI ABI = Subtarget.getTargetABI();		RISCVABI::ABI ABI = Subtarget.getTargetABI();
if (ABI == RISCVABI::ABI_LP64 && (Type == MVT::f32))		if (ABI == RISCVABI::ABI_LP64 && (Type == MVT::f32))
return false;		return false;

return true;		return true;
}		}

		bool RISCVTargetLowering::decomposeMulByConstant(LLVMContext &Context, EVT VT,
		SDValue C) const {
		// Check integral scalar types.
		if (VT.isScalarInteger())
		if (auto *ConstNode = dyn_cast<ConstantSDNode>(C.getNode())) {
		int64_t Imm = ConstNode->getSExtValue();
		// This also benefits i64 mul on RISCV32, since MUL/MULHS/MULHU
		// are all eliminated.
		if (isPowerOf2_64(Imm + 1) || isPowerOf2_64(Imm - 1) ||
		isPowerOf2_64(1 - Imm) || isPowerOf2_64(-1 - Imm))
		return true;
		}

		// TODO: Check vector types.
		return false;
		}

#define GET_REGISTER_MATCHER		#define GET_REGISTER_MATCHER
#include "RISCVGenAsmMatcher.inc"		#include "RISCVGenAsmMatcher.inc"

Register		Register
RISCVTargetLowering::getRegisterByName(const char *RegName, LLT VT,		RISCVTargetLowering::getRegisterByName(const char *RegName, LLT VT,
const MachineFunction &MF) const {		const MachineFunction &MF) const {
Register Reg = MatchRegisterAltName(RegName);		Register Reg = MatchRegisterAltName(RegName);
if (Reg == RISCV::NoRegister)		if (Reg == RISCV::NoRegister)
[10 lines not shown]

llvm/test/CodeGen/RISCV/mul.ll

	[72 lines not shown]
	; RV64IM-NEXT: mulw a0, a0, a1			; RV64IM-NEXT: mulw a0, a0, a1
	; RV64IM-NEXT: ret			; RV64IM-NEXT: ret
	%1 = mul i32 %a, %b			%1 = mul i32 %a, %b
	ret i32 %1			ret i32 %1
	}			}

	define signext i32 @mul_constant(i32 %a) nounwind {			define signext i32 @mul_constant(i32 %a) nounwind {
	; RV32I-LABEL: mul_constant:			; RV32I-LABEL: mul_constant:
	; RV32I: # %bb.0:			; RV32I-NEXT: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: slli a1, a0, 2
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: add a0, a1, a0
	; RV32I-NEXT: addi a1, zero, 5
	; RV32I-NEXT: call __mulsi3
	; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	;			;
	; RV32IM-LABEL: mul_constant:			; RV32IM-LABEL: mul_constant:
	; RV32IM: # %bb.0:			; RV32IM: # %bb.0:
	; RV32IM-NEXT: addi a1, zero, 5			; RV32IM-NEXT: slli a1, a0, 2
	; RV32IM-NEXT: mul a0, a0, a1			; RV32IM-NEXT: add a0, a1, a0
	; RV32IM-NEXT: ret			; RV32IM-NEXT: ret
	;			;
	; RV64I-LABEL: mul_constant:			; RV64I-LABEL: mul_constant:
	; RV64I: # %bb.0:			; RV64I: # %bb.0:
	; RV64I-NEXT: addi sp, sp, -16			; RV64I-NEXT: slli a1, a0, 2
	; RV64I-NEXT: sd ra, 8(sp)			; RV64I-NEXT: addw a0, a1, a0
	; RV64I-NEXT: addi a1, zero, 5
	; RV64I-NEXT: call __muldi3
	; RV64I-NEXT: sext.w a0, a0
	; RV64I-NEXT: ld ra, 8(sp)
	; RV64I-NEXT: addi sp, sp, 16
	; RV64I-NEXT: ret			; RV64I-NEXT: ret
	;			;
	; RV64IM-LABEL: mul_constant:			; RV64IM-LABEL: mul_constant:
	; RV64IM: # %bb.0:			; RV64IM: # %bb.0:
	; RV64IM-NEXT: addi a1, zero, 5			; RV64IM-NEXT: slli a1, a0, 2
	; RV64IM-NEXT: mulw a0, a0, a1			; RV64IM-NEXT: addw a0, a1, a0
	; RV64IM-NEXT: ret			; RV64IM-NEXT: ret
	%1 = mul i32 %a, 5			%1 = mul i32 %a, 5
	ret i32 %1			ret i32 %1
	}			}

	define i32 @mul_pow2(i32 %a) nounwind {			define i32 @mul_pow2(i32 %a) nounwind {
	; RV32I-LABEL: mul_pow2:			; RV32I-LABEL: mul_pow2:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	[53 lines not shown]
	; RV64IM-NEXT: ret			; RV64IM-NEXT: ret
	%1 = mul i64 %a, %b			%1 = mul i64 %a, %b
	ret i64 %1			ret i64 %1
	}			}

	define i64 @mul64_constant(i64 %a) nounwind {			define i64 @mul64_constant(i64 %a) nounwind {
	; RV32I-LABEL: mul64_constant:			; RV32I-LABEL: mul64_constant:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: slli a3, a0, 2
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: add a2, a3, a0
	; RV32I-NEXT: addi a2, zero, 5			; RV32I-NEXT: sltu a3, a2, a3
	; RV32I-NEXT: mv a3, zero			; RV32I-NEXT: srli a0, a0, 30
	; RV32I-NEXT: call __muldi3			; RV32I-NEXT: slli a4, a1, 2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: or a0, a4, a0
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: add a0, a0, a1
				; RV32I-NEXT: add a1, a0, a3
				; RV32I-NEXT: mv a0, a2
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	;			;
	; RV32IM-LABEL: mul64_constant:			; RV32IM-LABEL: mul64_constant:
	; RV32IM: # %bb.0:			; RV32IM: # %bb.0:
	; RV32IM-NEXT: addi a2, zero, 5			; RV32IM-NEXT: slli a3, a0, 2
	; RV32IM-NEXT: mul a1, a1, a2			; RV32IM-NEXT: add a2, a3, a0
	; RV32IM-NEXT: mulhu a3, a0, a2			; RV32IM-NEXT: sltu a3, a2, a3
	; RV32IM-NEXT: add a1, a3, a1			; RV32IM-NEXT: srli a0, a0, 30
	; RV32IM-NEXT: mul a0, a0, a2			; RV32IM-NEXT: slli a4, a1, 2
				; RV32IM-NEXT: or a0, a4, a0
				; RV32IM-NEXT: add a0, a0, a1
				; RV32IM-NEXT: add a1, a0, a3
				; RV32IM-NEXT: mv a0, a2
	; RV32IM-NEXT: ret			; RV32IM-NEXT: ret
	;			;
	; RV64I-LABEL: mul64_constant:			; RV64I-LABEL: mul64_constant:
	; RV64I: # %bb.0:			; RV64I: # %bb.0:
	; RV64I-NEXT: addi sp, sp, -16			; RV64I-NEXT: slli a1, a0, 2
	; RV64I-NEXT: sd ra, 8(sp)			; RV64I-NEXT: add a0, a1, a0
	; RV64I-NEXT: addi a1, zero, 5
	; RV64I-NEXT: call __muldi3
	; RV64I-NEXT: ld ra, 8(sp)
	; RV64I-NEXT: addi sp, sp, 16
	; RV64I-NEXT: ret			; RV64I-NEXT: ret
	;			;
	; RV64IM-LABEL: mul64_constant:			; RV64IM-LABEL: mul64_constant:
	; RV64IM: # %bb.0:			; RV64IM: # %bb.0:
	; RV64IM-NEXT: addi a1, zero, 5			; RV64IM-NEXT: slli a1, a0, 2
	; RV64IM-NEXT: mul a0, a0, a1			; RV64IM-NEXT: add a0, a1, a0
	; RV64IM-NEXT: ret			; RV64IM-NEXT: ret
	%1 = mul i64 %a, 5			%1 = mul i64 %a, 5
	ret i64 %1			ret i64 %1
	}			}

	define i32 @mulhs(i32 %a, i32 %b) nounwind {			define i32 @mulhs(i32 %a, i32 %b) nounwind {
	; RV32I-LABEL: mulhs:			; RV32I-LABEL: mulhs:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	[75 lines not shown]
	; RV64IM-NEXT: srli a0, a0, 32			; RV64IM-NEXT: srli a0, a0, 32
	; RV64IM-NEXT: ret			; RV64IM-NEXT: ret
	%1 = zext i32 %a to i64			%1 = zext i32 %a to i64
	%2 = zext i32 %b to i64			%2 = zext i32 %b to i64
	%3 = mul i64 %1, %2			%3 = mul i64 %1, %2
	%4 = lshr i64 %3, 32			%4 = lshr i64 %3, 32
	%5 = trunc i64 %4 to i32			%5 = trunc i64 %4 to i32
	ret i32 %5			ret i32 %5
	}			}

				define i32 @muli32_p65(i32 %a) nounwind {
				; RV32I-LABEL: muli32_p65:
				; RV32I: # %bb.0:
				; RV32I-NEXT: slli a1, a0, 6
				; RV32I-NEXT: add a0, a1, a0
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: muli32_p65:
				; RV64I: # %bb.0:
				; RV64I-NEXT: slli a1, a0, 6
				; RV64I-NEXT: addw a0, a1, a0
				; RV64I-NEXT: ret
				%1 = mul i32 %a, 65
				ret i32 %1
				}

				define i32 @muli32_p63(i32 %a) nounwind {
				; RV32I-LABEL: muli32_p63:
				; RV32I: # %bb.0:
				; RV32I-NEXT: slli a1, a0, 6
				; RV32I-NEXT: sub a0, a1, a0
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: muli32_p63:
				; RV64I: # %bb.0:
				; RV64I-NEXT: slli a1, a0, 6
				; RV64I-NEXT: subw a0, a1, a0
				; RV64I-NEXT: ret
				%1 = mul i32 %a, 63
				ret i32 %1
				}

				define i64 @muli64_p65(i64 %a) nounwind {
				; RV32I-LABEL: muli64_p65:
				; RV32I: # %bb.0:
				; RV32I-NEXT: slli a3, a0, 6
				; RV32I-NEXT: add a2, a3, a0
				; RV32I-NEXT: sltu a3, a2, a3
				; RV32I-NEXT: srli a0, a0, 26
				; RV32I-NEXT: slli a4, a1, 6
				; RV32I-NEXT: or a0, a4, a0
				; RV32I-NEXT: add a0, a0, a1
				; RV32I-NEXT: add a1, a0, a3
				; RV32I-NEXT: mv a0, a2
				;
				; RV64I-LABEL: muli64_p65:
				; RV64I: # %bb.0:
				; RV64I-NEXT: slli a1, a0, 6
				; RV64I-NEXT: add a0, a1, a0
				; RV64I-NEXT: ret
				%1 = mul i64 %a, 65
				ret i64 %1
				}

				define i64 @muli64_p63(i64 %a) nounwind {
				; RV32I-LABEL: muli64_p63:
				; RV32I: # %bb.0:
				; RV32I-NEXT: slli a2, a0, 6
				; RV32I-NEXT: sltu a3, a2, a0
				; RV32I-NEXT: srli a4, a0, 26
				; RV32I-NEXT: slli a5, a1, 6
				; RV32I-NEXT: or a4, a5, a4
				; RV32I-NEXT: sub a1, a4, a1
				; RV32I-NEXT: sub a1, a1, a3
				; RV32I-NEXT: sub a0, a2, a0
				;
				; RV64I-LABEL: muli64_p63:
				; RV64I: # %bb.0:
				; RV64I-NEXT: slli a1, a0, 6
				; RV64I-NEXT: sub a0, a1, a0
				; RV64I-NEXT: ret
				%1 = mul i64 %a, 63
				ret i64 %1
				}

				define i32 @muli32_m63(i32 %a) nounwind {
				; RV32I-LABEL: muli32_m63:
				; RV32I: # %bb.0:
				; RV32I-NEXT: slli a1, a0, 6
				; RV32I-NEXT: sub a0, a0, a1
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: muli32_m63:
				; RV64I: # %bb.0:
				; RV64I-NEXT: slli a1, a0, 6
				; RV64I-NEXT: subw a0, a0, a1
				; RV64I-NEXT: ret
				%1 = mul i32 %a, -63
				ret i32 %1
				}

				define i32 @muli32_m65(i32 %a) nounwind {
				; RV32I-LABEL: muli32_m65:
				; RV32I: # %bb.0:
				; RV32I-NEXT: slli a1, a0, 6
				; RV32I-NEXT: add a0, a1, a0
				; RV32I-NEXT: neg a0, a0
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: muli32_m65:
				; RV64I: # %bb.0:
				; RV64I-NEXT: slli a1, a0, 6
				; RV64I-NEXT: add a0, a1, a0
				; RV64I-NEXT: negw a0, a0
				; RV64I-NEXT: ret
				%1 = mul i32 %a, -65
				ret i32 %1
				}

				define i64 @muli64_m63(i64 %a) nounwind {
				; RV32I-LABEL: muli64_m63:
				; RV32I: # %bb.0:
				; RV32I-NEXT: slli a2, a0, 6
				; RV32I-NEXT: sltu a3, a0, a2
				; RV32I-NEXT: srli a4, a0, 26
				; RV32I-NEXT: slli a5, a1, 6
				; RV32I-NEXT: or a4, a5, a4
				; RV32I-NEXT: sub a1, a1, a4
				; RV32I-NEXT: sub a1, a1, a3
				; RV32I-NEXT: sub a0, a0, a2
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: muli64_m63:
				; RV64I: # %bb.0:
				; RV64I-NEXT: slli a1, a0, 6
				; RV64I-NEXT: sub a0, a0, a1
				; RV64I-NEXT: ret
				%1 = mul i64 %a, -63
				ret i64 %1
				}

				define i64 @muli64_m65(i64 %a) nounwind {
				; RV32I-LABEL: muli64_m65:
				; RV32I: # %bb.0:
				; RV32I-NEXT: slli a2, a0, 6
				; RV32I-NEXT: add a3, a2, a0
				; RV32I-NEXT: sltu a2, a3, a2
				; RV32I-NEXT: srli a0, a0, 26
				; RV32I-NEXT: slli a4, a1, 6
				; RV32I-NEXT: or a0, a4, a0
				; RV32I-NEXT: add a0, a0, a1
				; RV32I-NEXT: add a0, a0, a2
				; RV32I-NEXT: snez a1, a3
				; RV32I-NEXT: add a0, a0, a1
				; RV32I-NEXT: neg a1, a0
				; RV32I-NEXT: neg a0, a3
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: muli64_m65:
				; RV64I: # %bb.0:
				; RV64I-NEXT: slli a1, a0, 6
				; RV64I-NEXT: add a0, a1, a0
				; RV64I-NEXT: neg a0, a0
				; RV64I-NEXT: ret
				%1 = mul i64 %a, -65
				ret i64 %1
				}

This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Optimize multiplication by constant
Closed · Public
