This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Constant materialisation for RV64I
ClosedPublic

Authored by asb on Oct 6 2018, 3:29 AM.

Details

Reviewers

apazos
mgrang
sabuasal

Commits

rG2146e8fb1e5f: [RISCV] Constant materialisation for RV64I
rL347042: [RISCV] Constant materialisation for RV64I

Summary

This commit introduces support for materialising 64-bit constants for RV64I,
making use of the RISCVMatInt::generateInstSeq helper in order to share logic
for immediate materialisation with the MC layer (where it's used for the li
pseudoinstruction).
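
For illustration, here is a minimal standalone sketch of how a helper of this kind can build an instruction sequence. It is not the code from D52961: the `Opc` enum, the `Inst` struct and the `signExtend` and `countTrailingZeros` helpers are hypothetical stand-ins, and the real RISCVMatInt::generateInstSeq may differ in detail. The idea shown is to emit LUI and/or ADDI(W) for values that fit in 32 bits, and otherwise to peel off the low 12 bits, recursively materialise the remaining upper part, then shift left and add the low bits back.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the opcode constants and instruction record.
enum class Opc { LUI, ADDI, ADDIW, SLLI };

struct Inst {
  Opc Op;
  int64_t Imm;
};

// Sign-extend the low `Bits` bits of V (1 <= Bits <= 64). Relies on an
// arithmetic right shift of signed values, which mainstream compilers provide.
static int64_t signExtend(uint64_t V, unsigned Bits) {
  return static_cast<int64_t>(V << (64 - Bits)) >> (64 - Bits);
}

// Number of trailing zero bits in V. V must be non-zero.
static unsigned countTrailingZeros(uint64_t V) {
  unsigned N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Sketch of a recursive materialisation strategy (an assumption, not the
// actual LLVM implementation).
static void generateInstSeq(int64_t Val, bool Is64Bit, std::vector<Inst> &Res) {
  if (Val >= INT32_MIN && Val <= INT32_MAX) {
    // 32-bit case: LUI supplies bits 31:12, ADDI(W) supplies the low 12 bits.
    // Adding 0x800 compensates for the sign extension of the low 12 bits.
    int64_t Hi20 = ((Val + 0x800) >> 12) & 0xFFFFF;
    int64_t Lo12 = signExtend(static_cast<uint64_t>(Val), 12);
    if (Hi20)
      Res.push_back({Opc::LUI, Hi20});
    if (Lo12 || Hi20 == 0)
      Res.push_back({(Is64Bit && Hi20) ? Opc::ADDIW : Opc::ADDI, Lo12});
    return;
  }

  assert(Is64Bit && "cannot materialise a >32-bit immediate on RV32");

  // 64-bit case: split off the low 12 bits, drop trailing zeros from the
  // upper part, and materialise what is left recursively.
  int64_t Lo12 = signExtend(static_cast<uint64_t>(Val), 12);
  int64_t Hi52 =
      static_cast<int64_t>(static_cast<uint64_t>(Val) + 0x800) >> 12;
  unsigned Shift = 12 + countTrailingZeros(static_cast<uint64_t>(Hi52));
  Hi52 = signExtend(static_cast<uint64_t>(Hi52 >> (Shift - 12)), 64 - Shift);

  generateInstSeq(Hi52, Is64Bit, Res);
  Res.push_back({Opc::SLLI, static_cast<int64_t>(Shift)});
  if (Lo12)
    Res.push_back({Opc::ADDI, Lo12});
}
```

Under these assumptions, calling `generateInstSeq(8070450532432478223, true, Seq)` yields addi 7, slli 36, addi 11, slli 24, addi 15, which is the same sequence as the RV64I check lines for imm64_7 in imm.ll.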

test/CodeGen/RISCV/imm.ll is updated to test RV64, and gains new 64-bit
constant tests. It would be preferable if anyext constant returns were sign
rather than zero extended (see PR39092). This patch simply adds an explicit
signext to the returns in imm.ll.

Further optimisations for constant materialisation are possible, most notably
for mask-like values which can be generated by loading -1 and shifting right.
A future patch will standardise on the C++ codepath for immediate selection on
RV32 as well as RV64, and then add further such optimisations to
RISCVMatInt::generateInstSeq in order to benefit both RV32 and RV64 for
codegen and li expansion.
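
The mask-like case mentioned above can be sketched as follows. This is only an illustration of the idea, not part of this patch: `isLowMask`, `countLeadingZeros` and `matchMinusOneSrli` are hypothetical names. A value consisting of zeros in the upper bits and ones in the lower bits equals -1 logically shifted right by the number of leading zeros, so it could be produced with `addi rd, zero, -1` followed by a single `srli`.

```cpp
#include <cstdint>

// True if V is a non-empty run of ones starting at bit 0 (e.g. 0xFFFFFFFF).
static bool isLowMask(uint64_t V) {
  return V != 0 && ((V + 1) & V) == 0;
}

// Number of leading zero bits in V. V must be non-zero.
static unsigned countLeadingZeros(uint64_t V) {
  unsigned N = 0;
  while ((V >> 63) == 0) {
    V <<= 1;
    ++N;
  }
  return N;
}

// Hypothetical matcher: if V can be formed as (-1 >> ShAmt) with a logical
// shift, store the shift amount in ShAmt and return true.
static bool matchMinusOneSrli(uint64_t V, unsigned &ShAmt) {
  if (!isLowMask(V))
    return false;
  ShAmt = countLeadingZeros(V);
  return true;
}
```

For 4294967295 (the imm64_2 test) the matcher reports a shift amount of 32, giving a two-instruction sequence instead of the three instructions currently generated for RV64I.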

Diff Detail

Repository: rL LLVM

Event Timeline

asb created this revision. Oct 6 2018, 3:29 AM

Herald added subscribers: jocewei, PkmX, rkruppe and 13 others. Oct 6 2018, 3:29 AM

asb added a parent revision: D52961: [RISCV] Introduce the RISCVMatInt::generateInstSeq helper. Oct 6 2018, 3:30 AM

psnobl added a subscriber: psnobl. Oct 9 2018, 12:58 AM

Wouldn't it be better to put more complicated constants (requiring four or more instructions to materialize) in the constant pool and load them from there?

In D52962#1258544, @psnobl wrote:

Wouldn't it be better to put more complicated constants (requiring four or more instructions to materialize) in the constant pool and load them from there?

gcc does that, at some point. It depends. If you can guarantee a cache hit for the load then it can be better. If you can't then you could be looking at a big slowdown. Typical processors go to a lot of trouble to prefetch the instruction stream. Eight instructions and 20 to 24 bytes to load an eight byte literal isn't too horrible. The load from a pool approach is going to generally take two 4 byte instructions (auipc/ld or lui/ld) plus the 8 byte literal itself, for a total of 16 bytes. Plus, it might cost anything from half a dozen clock cycles to several hundred clock cycles to get the cache line. Well, let's say on average it's going to be in L2 cache. It's not a clear win. And the worst case is awful. All to save 4 to 8 bytes of code and maybe four clock cycles in the best case.

Oh, and by the way, if the user really wants to use a load, they can easily force that by writing a global const (or single element initialized array to be really sure).

Well, you can save more in total size if the constant is needed in more than one place in the application (and the linker can merge the constant pool entries), but I see your point about a possible cache miss. It's not a clear win, as you say.

Updated patch to add a TODO to indicate that it may sometimes be preferable to load from the constant pool. As Bruce points out, this isn't always a clear win.

If people are happy with this approach, my intent with this patchset is to land something that generates "reasonable" code and leave tuning such as constpool vs materialisation for future patches.

asb added a child revision: D52977: [RISCV] Introduce codegen patterns for instructions introduced in RV64I. Oct 12 2018, 4:37 PM

lewis-revill mentioned this in D52961: [RISCV] Introduce the RISCVMatInt::generateInstSeq helper. Oct 18 2018, 9:09 AM

sabuasal edited reviewers, added: sabuasal; removed: sameer.abuasal. Oct 22 2018, 4:43 PM

xxuejie added a subscriber: xxuejie. Oct 29 2018, 4:41 PM

Confirming this patch still applies cleanly against current LLVM HEAD. Ping?

LGTM!

This revision is now accepted and ready to land. Nov 15 2018, 2:48 PM

Closed by commit rL347042: [RISCV] Constant materialisation for RV64I (authored by asb). Nov 16 2018, 2:16 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: jrtc27. Nov 16 2018, 2:17 AM

TheWaWaR added a subscriber: TheWaWaR. Dec 14 2018, 6:54 PM

Revision Contents

Path (Size)

llvm/trunk/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (29 lines)
llvm/trunk/test/CodeGen/RISCV/imm.ll (211 lines)

Diff 174342

llvm/trunk/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

//===-- RISCVISelDAGToDAG.cpp - A dag to dag inst selector for RISCV ------===//		//===-- RISCVISelDAGToDAG.cpp - A dag to dag inst selector for RISCV ------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file defines an instruction selector for the RISCV target.		// This file defines an instruction selector for the RISCV target.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "RISCV.h"
#include "MCTargetDesc/RISCVMCTargetDesc.h"		#include "MCTargetDesc/RISCVMCTargetDesc.h"
		#include "RISCV.h"
#include "RISCVTargetMachine.h"		#include "RISCVTargetMachine.h"
		#include "Utils/RISCVMatInt.h"
#include "llvm/CodeGen/MachineFrameInfo.h"		#include "llvm/CodeGen/MachineFrameInfo.h"
#include "llvm/CodeGen/SelectionDAGISel.h"		#include "llvm/CodeGen/SelectionDAGISel.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "riscv-isel"		#define DEBUG_TYPE "riscv-isel"
[33 unchanged lines not shown]		private:
void doPeepholeLoadStoreADDI();		void doPeepholeLoadStoreADDI();
};		};
}		}

void RISCVDAGToDAGISel::PostprocessISelDAG() {		void RISCVDAGToDAGISel::PostprocessISelDAG() {
doPeepholeLoadStoreADDI();		doPeepholeLoadStoreADDI();
}		}

		static SDNode *selectImm(SelectionDAG *CurDAG, const SDLoc &DL, int64_t Imm,
		MVT XLenVT) {
		RISCVMatInt::InstSeq Seq;
		RISCVMatInt::generateInstSeq(Imm, XLenVT == MVT::i64, Seq);

		SDNode *Result;
		SDValue SrcReg = CurDAG->getRegister(RISCV::X0, XLenVT);
		for (RISCVMatInt::Inst &Inst : Seq) {
		SDValue SDImm = CurDAG->getTargetConstant(Inst.Imm, DL, XLenVT);
		if (Inst.Opc == RISCV::LUI)
		Result = CurDAG->getMachineNode(RISCV::LUI, DL, XLenVT, SDImm);
		else
		Result = CurDAG->getMachineNode(Inst.Opc, DL, XLenVT, SrcReg, SDImm);

		// Only the first instruction has X0 as its source.
		SrcReg = SDValue(Result, 0);
		}

		return Result;
		}

void RISCVDAGToDAGISel::Select(SDNode *Node) {		void RISCVDAGToDAGISel::Select(SDNode *Node) {
// If we have a custom node, we have already selected.		// If we have a custom node, we have already selected.
if (Node->isMachineOpcode()) {		if (Node->isMachineOpcode()) {
LLVM_DEBUG(dbgs() << "== "; Node->dump(CurDAG); dbgs() << "\n");		LLVM_DEBUG(dbgs() << "== "; Node->dump(CurDAG); dbgs() << "\n");
Node->setNodeId(-1);		Node->setNodeId(-1);
return;		return;
}		}

// Instruction Selection not handled by the auto-generated tablegen selection		// Instruction Selection not handled by the auto-generated tablegen selection
// should be handled here.		// should be handled here.
unsigned Opcode = Node->getOpcode();		unsigned Opcode = Node->getOpcode();
MVT XLenVT = Subtarget->getXLenVT();		MVT XLenVT = Subtarget->getXLenVT();
SDLoc DL(Node);		SDLoc DL(Node);
EVT VT = Node->getValueType(0);		EVT VT = Node->getValueType(0);

switch (Opcode) {		switch (Opcode) {
case ISD::Constant: {		case ISD::Constant: {
auto ConstNode = cast<ConstantSDNode>(Node);		auto ConstNode = cast<ConstantSDNode>(Node);
if (VT == XLenVT && ConstNode->isNullValue()) {		if (VT == XLenVT && ConstNode->isNullValue()) {
SDValue New = CurDAG->getCopyFromReg(CurDAG->getEntryNode(), SDLoc(Node),		SDValue New = CurDAG->getCopyFromReg(CurDAG->getEntryNode(), SDLoc(Node),
RISCV::X0, XLenVT);		RISCV::X0, XLenVT);
ReplaceNode(Node, New.getNode());		ReplaceNode(Node, New.getNode());
return;		return;
}		}
		int64_t Imm = ConstNode->getSExtValue();
		if (XLenVT == MVT::i64) {
		ReplaceNode(Node, selectImm(CurDAG, SDLoc(Node), Imm, XLenVT));
		return;
		}
break;		break;
}		}
case ISD::FrameIndex: {		case ISD::FrameIndex: {
SDValue Imm = CurDAG->getTargetConstant(0, DL, XLenVT);		SDValue Imm = CurDAG->getTargetConstant(0, DL, XLenVT);
int FI = cast<FrameIndexSDNode>(Node)->getIndex();		int FI = cast<FrameIndexSDNode>(Node)->getIndex();
SDValue TFI = CurDAG->getTargetFrameIndex(FI, VT);		SDValue TFI = CurDAG->getTargetFrameIndex(FI, VT);
ReplaceNode(Node, CurDAG->getMachineNode(RISCV::ADDI, DL, VT, TFI, Imm));		ReplaceNode(Node, CurDAG->getMachineNode(RISCV::ADDI, DL, VT, TFI, Imm));
return;		return;
[124 unchanged lines not shown]
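
The chaining that selectImm performs (the first instruction reads X0, and every later instruction reads the result of the previous one) can be modelled outside of SelectionDAG with a small interpreter. The sketch below is purely illustrative: `Opc`, `Inst`, `sext32` and `runSeq` are hypothetical names, and only the four opcodes that appear in the RV64I check lines of imm.ll are handled.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the opcode constants and instruction record.
enum class Opc { LUI, ADDI, ADDIW, SLLI };

struct Inst {
  Opc Op;
  int64_t Imm;
};

// Sign-extend the low 32 bits of V to 64 bits. The narrowing conversion is
// modular on mainstream compilers (and guaranteed to be from C++20).
static int64_t sext32(uint64_t V) {
  return static_cast<int32_t>(static_cast<uint32_t>(V));
}

// Execute a materialisation sequence on a single register, RV64 semantics.
static int64_t runSeq(const std::vector<Inst> &Seq) {
  uint64_t Reg = 0; // The first instruction's source is X0, which reads as 0.
  for (const Inst &I : Seq) {
    uint64_t Imm = static_cast<uint64_t>(I.Imm);
    switch (I.Op) {
    case Opc::LUI: // No source register: imm goes to bits 31:12, sign-extended.
      Reg = static_cast<uint64_t>(sext32(Imm << 12));
      break;
    case Opc::ADDI: // 64-bit add of a sign-extended 12-bit immediate.
      Reg = Reg + Imm;
      break;
    case Opc::ADDIW: // 32-bit add, result sign-extended to 64 bits.
      Reg = static_cast<uint64_t>(sext32(Reg + Imm));
      break;
    case Opc::SLLI: // Logical left shift by the immediate.
      Reg = Reg << Imm;
      break;
    }
    // Reg now plays the role of SrcReg for the next instruction.
  }
  return static_cast<int64_t>(Reg);
}
```

Feeding it the eight-instruction RV64I sequence checked for imm64_8 (lui 583, addiw -1875, slli 14, addi -947, slli 12, addi 1511, slli 13, addi -272) reproduces the constant 1311768467463790320 returned by that test.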

llvm/trunk/test/CodeGen/RISCV/imm.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \			; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
	; RUN: | FileCheck %s -check-prefix=RV32I			; RUN: | FileCheck %s -check-prefix=RV32I
				; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
				; RUN: | FileCheck %s -check-prefix=RV64I

	; Materializing constants			; Materializing constants

	define i32 @zero() nounwind {			; TODO: It would be preferable if anyext constant returns were sign rather
				; than zero extended. See PR39092. For now, mark returns as explicitly signext
				; (this matches what Clang would generate for equivalent C/C++ anyway).

				define signext i32 @zero() nounwind {
	; RV32I-LABEL: zero:			; RV32I-LABEL: zero:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: mv a0, zero			; RV32I-NEXT: mv a0, zero
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV64I-LABEL: zero:
				; RV64I: # %bb.0:
				; RV64I-NEXT: mv a0, zero
				; RV64I-NEXT: ret
	ret i32 0			ret i32 0
	}			}

	define i32 @pos_small() nounwind {			define signext i32 @pos_small() nounwind {
	; RV32I-LABEL: pos_small:			; RV32I-LABEL: pos_small:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi a0, zero, 2047			; RV32I-NEXT: addi a0, zero, 2047
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV64I-LABEL: pos_small:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, 2047
				; RV64I-NEXT: ret
	ret i32 2047			ret i32 2047
	}			}

	define i32 @neg_small() nounwind {			define signext i32 @neg_small() nounwind {
	; RV32I-LABEL: neg_small:			; RV32I-LABEL: neg_small:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi a0, zero, -2048			; RV32I-NEXT: addi a0, zero, -2048
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV64I-LABEL: neg_small:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, -2048
				; RV64I-NEXT: ret
	ret i32 -2048			ret i32 -2048
	}			}

	define i32 @pos_i32() nounwind {			define signext i32 @pos_i32() nounwind {
	; RV32I-LABEL: pos_i32:			; RV32I-LABEL: pos_i32:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: lui a0, 423811			; RV32I-NEXT: lui a0, 423811
	; RV32I-NEXT: addi a0, a0, -1297			; RV32I-NEXT: addi a0, a0, -1297
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV64I-LABEL: pos_i32:
				; RV64I: # %bb.0:
				; RV64I-NEXT: lui a0, 423811
				; RV64I-NEXT: addiw a0, a0, -1297
				; RV64I-NEXT: ret
	ret i32 1735928559			ret i32 1735928559
	}			}

	define i32 @neg_i32() nounwind {			define signext i32 @neg_i32() nounwind {
	; RV32I-LABEL: neg_i32:			; RV32I-LABEL: neg_i32:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: lui a0, 912092			; RV32I-NEXT: lui a0, 912092
	; RV32I-NEXT: addi a0, a0, -273			; RV32I-NEXT: addi a0, a0, -273
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV64I-LABEL: neg_i32:
				; RV64I: # %bb.0:
				; RV64I-NEXT: lui a0, 912092
				; RV64I-NEXT: addiw a0, a0, -273
				; RV64I-NEXT: ret
	ret i32 -559038737			ret i32 -559038737
	}			}

	define i32 @pos_i32_hi20_only() nounwind {			define signext i32 @pos_i32_hi20_only() nounwind {
	; RV32I-LABEL: pos_i32_hi20_only:			; RV32I-LABEL: pos_i32_hi20_only:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: lui a0, 16			; RV32I-NEXT: lui a0, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV64I-LABEL: pos_i32_hi20_only:
				; RV64I: # %bb.0:
				; RV64I-NEXT: lui a0, 16
				; RV64I-NEXT: ret
	ret i32 65536			ret i32 65536
	}			}

	define i32 @neg_i32_hi20_only() nounwind {			define signext i32 @neg_i32_hi20_only() nounwind {
	; RV32I-LABEL: neg_i32_hi20_only:			; RV32I-LABEL: neg_i32_hi20_only:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: lui a0, 1048560			; RV32I-NEXT: lui a0, 1048560
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV64I-LABEL: neg_i32_hi20_only:
				; RV64I: # %bb.0:
				; RV64I-NEXT: lui a0, 1048560
				; RV64I-NEXT: ret
	ret i32 -65536			ret i32 -65536
	}			}

				define i64 @imm64_1() nounwind {
				; RV32I-LABEL: imm64_1:
				; RV32I: # %bb.0:
				; RV32I-NEXT: lui a0, 524288
				; RV32I-NEXT: mv a1, zero
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_1:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, 1
				; RV64I-NEXT: slli a0, a0, 31
				; RV64I-NEXT: ret
				ret i64 2147483648
				}

				; TODO: This and similar constants with all 0s in the upper bits and all 1s in
				; the lower bits could be lowered to addi a0, zero, -1 followed by a logical
				; right shift.
				define i64 @imm64_2() nounwind {
				; RV32I-LABEL: imm64_2:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi a0, zero, -1
				; RV32I-NEXT: mv a1, zero
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_2:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, 1
				; RV64I-NEXT: slli a0, a0, 32
				; RV64I-NEXT: addi a0, a0, -1
				; RV64I-NEXT: ret
				ret i64 4294967295
				}

				define i64 @imm64_3() nounwind {
				; RV32I-LABEL: imm64_3:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi a1, zero, 1
				; RV32I-NEXT: mv a0, zero
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_3:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, 1
				; RV64I-NEXT: slli a0, a0, 32
				; RV64I-NEXT: ret
				ret i64 4294967296
				}

				define i64 @imm64_4() nounwind {
				; RV32I-LABEL: imm64_4:
				; RV32I: # %bb.0:
				; RV32I-NEXT: lui a1, 524288
				; RV32I-NEXT: mv a0, zero
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_4:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, -1
				; RV64I-NEXT: slli a0, a0, 63
				; RV64I-NEXT: ret
				ret i64 9223372036854775808
				}

				define i64 @imm64_5() nounwind {
				; RV32I-LABEL: imm64_5:
				; RV32I: # %bb.0:
				; RV32I-NEXT: lui a1, 524288
				; RV32I-NEXT: mv a0, zero
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_5:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, -1
				; RV64I-NEXT: slli a0, a0, 63
				; RV64I-NEXT: ret
				ret i64 -9223372036854775808
				}

				define i64 @imm64_6() nounwind {
				; RV32I-LABEL: imm64_6:
				; RV32I: # %bb.0:
				; RV32I-NEXT: lui a0, 74565
				; RV32I-NEXT: addi a1, a0, 1656
				; RV32I-NEXT: mv a0, zero
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_6:
				; RV64I: # %bb.0:
				; RV64I-NEXT: lui a0, 9321
				; RV64I-NEXT: addiw a0, a0, -1329
				; RV64I-NEXT: slli a0, a0, 35
				; RV64I-NEXT: ret
				ret i64 1311768464867721216
				}

				define i64 @imm64_7() nounwind {
				; RV32I-LABEL: imm64_7:
				; RV32I: # %bb.0:
				; RV32I-NEXT: lui a0, 45056
				; RV32I-NEXT: addi a0, a0, 15
				; RV32I-NEXT: lui a1, 458752
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_7:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, 7
				; RV64I-NEXT: slli a0, a0, 36
				; RV64I-NEXT: addi a0, a0, 11
				; RV64I-NEXT: slli a0, a0, 24
				; RV64I-NEXT: addi a0, a0, 15
				; RV64I-NEXT: ret
				ret i64 8070450532432478223
				}

				; TODO: it can be preferable to put constants that are expensive to materialise
				; into the constant pool, especially for -Os.
				define i64 @imm64_8() nounwind {
				; RV32I-LABEL: imm64_8:
				; RV32I: # %bb.0:
				; RV32I-NEXT: lui a0, 633806
				; RV32I-NEXT: addi a0, a0, -272
				; RV32I-NEXT: lui a1, 74565
				; RV32I-NEXT: addi a1, a1, 1656
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_8:
				; RV64I: # %bb.0:
				; RV64I-NEXT: lui a0, 583
				; RV64I-NEXT: addiw a0, a0, -1875
				; RV64I-NEXT: slli a0, a0, 14
				; RV64I-NEXT: addi a0, a0, -947
				; RV64I-NEXT: slli a0, a0, 12
				; RV64I-NEXT: addi a0, a0, 1511
				; RV64I-NEXT: slli a0, a0, 13
				; RV64I-NEXT: addi a0, a0, -272
				; RV64I-NEXT: ret
				ret i64 1311768467463790320
				}

				define i64 @imm64_9() nounwind {
				; RV32I-LABEL: imm64_9:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi a0, zero, -1
				; RV32I-NEXT: mv a1, a0
				; RV32I-NEXT: ret
				;
				; RV64I-LABEL: imm64_9:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi a0, zero, -1
				; RV64I-NEXT: ret
				ret i64 -1
				}