This is an archive of the discontinued LLVM Phabricator instance.

Differential D67497

[aarch64] move custom isel of extract_vector_elt to td file - NFC
ClosedPublic

Authored by sebpop on Sep 12 2019, 7:13 AM.

Details

Reviewers

SjoerdMeijer
jgreenhalgh
evandro

Commits

rGd93e136be14c: [aarch64] move custom isel of extract_vector_elt to td file - NFC
rL371887: [aarch64] move custom isel of extract_vector_elt to td file - NFC

Summary

In preparation for def-pat selection of dot product instructions,
this patch moves the custom instruction selection of extract_vector_elt
to the td file. Without this change it is impossible to catch a pattern that
starts with an extract_vector_elt: the custom cpp code is executed first
ahead of the patterns in the td files that are only executed at the end of
the switch statement in SelectCode(Node).

With this patch applied, it becomes possible to select a different pattern
that starts with extract_vector_elt by selecting a higher complexity than
this pattern.
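The ordering problem described above can be sketched with a toy model. This is not LLVM code: `ToyPattern`, `ToySelector`, and `selectExtract` are hypothetical names, the dot-product pattern and its score of 20 are invented for illustration, and complexity is reduced to a single integer (TableGen derives it from the pattern shape plus `AddedComplexity`). The model also simplifies the custom case, which in the real code only claimed lane-zero extracts whose element type matched the result type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a TableGen pattern: a name, the root opcode it matches,
// and a single complexity score. None of these types exist in LLVM.
struct ToyPattern {
  std::string Name;
  std::string RootOpcode;
  int Complexity;
};

// Hypothetical selector: an optional hand-written case runs first, then the
// table-driven patterns compete and the highest complexity wins.
struct ToySelector {
  bool HasCustomExtractCase;
  std::vector<ToyPattern> Patterns;

  std::string select(const std::string &Opcode) const {
    assert(!Opcode.empty() && "opcode must be named");

    // Before the patch: the custom case claims the node and returns early,
    // so the pattern table is never consulted.
    if (HasCustomExtractCase && Opcode == "extract_vector_elt")
      return "custom C++ EXTRACT_SUBREG";

    // After the patch: every candidate lives in the pattern table.
    const ToyPattern *Best = nullptr;
    for (const ToyPattern &P : Patterns)
      if (P.RootOpcode == Opcode && (!Best || P.Complexity > Best->Complexity))
        Best = &P;
    return Best ? Best->Name : "no match";
  }
};

// Builds the two configurations discussed in the summary and selects an
// extract_vector_elt node with each.
inline std::string selectExtract(bool HasCustomCase) {
  ToySelector S{HasCustomCase,
                {{"lane-zero EXTRACT_SUBREG", "extract_vector_elt", 10},
                 {"hypothetical dot-product pattern", "extract_vector_elt", 20}}};
  return S.select("extract_vector_elt");
}
```

Calling `selectExtract(true)` models the old behavior, where the higher-complexity pattern can never be reached; `selectExtract(false)` models the behavior with the patch applied, where it can.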

The patch has been tested on aarch64-linux with make check-all.

Diff Detail

Repository: rL LLVM

Event Timeline

sebpop created this revision. Sep 12 2019, 7:13 AM

Herald added a project: Restricted Project. Sep 12 2019, 7:13 AM

Herald added subscribers: hiraditya, kristof.beyls.

SjoerdMeijer added inline comments. Sep 12 2019, 7:36 AM

llvm/lib/Target/AArch64/AArch64InstrInfo.td
6978 ↗	(On Diff #219906)	We were curious why you need a new AArch64 specific DAG node? Can you not use `extractelt` in the match patterns?

sebpop marked an inline comment as done. Sep 12 2019, 7:49 AM

sebpop added inline comments.

llvm/lib/Target/AArch64/AArch64InstrInfo.td

6978 ↗	(On Diff #219906)

I did that because two other targets do that: git grep says

SystemZ/SystemZOperators.td:def z_vector_extract    : SDNode<"ISD::EXTRACT_VECTOR_ELT",
X86/X86InstrAVX512.td:def X86kextract : SDNode<"ISD::EXTRACT_VECTOR_ELT",

I tried replacing kextract with extractelt and I got this error:

Type set is empty for each HW mode:
possible type contradiction in the pattern below (use -print-records with llvm-tblgen to see all expanded records).
vtInt: 	(vt:{ *:[Other] })

I still need to see how to solve this error.

evandro added a comment.

Is there a test case that checks that this change does not break what the code in AArch64DAGToDAGISel::Select() was meant to handle?

SjoerdMeijer added inline comments. Sep 12 2019, 8:03 AM

llvm/lib/Target/AArch64/AArch64InstrInfo.td
6978 ↗	(On Diff #219906)	Okay, now I see where the name `kextract` comes from :-) Before wondering why we need another node, I was wondering what it could mean, and wanted to add that as a nitpick. That useless tablegen error doesn't ring a bell.... I mean, I have seen it before and then the type contradiction was obvious, but in this case I am a bit clueless.

In D67497#1667851, @evandro wrote:

Is there a test case that checks that this change does not break what the code in AArch64DAGToDAGISel::Select() was meant to handle?

Yes, there are several tests relying on this behavior.
If I just comment out the patterns in the TD file, all of these tests fail:

Failing Tests (28):
    LLVM :: CodeGen/AArch64/arm64-2012-05-07-DAGCombineVectorExtract.ll
    LLVM :: CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll
    LLVM :: CodeGen/AArch64/arm64-crypto.ll
    LLVM :: CodeGen/AArch64/arm64-neon-copy.ll
    LLVM :: CodeGen/AArch64/arm64-popcnt.ll
    LLVM :: CodeGen/AArch64/arm64-smaxv.ll
    LLVM :: CodeGen/AArch64/arm64-sminv.ll
    LLVM :: CodeGen/AArch64/arm64-vaddv.ll
    LLVM :: CodeGen/AArch64/arm64-vmul.ll
    LLVM :: CodeGen/AArch64/bitcast-v2i8.ll
    LLVM :: CodeGen/AArch64/bitreverse.ll
    LLVM :: CodeGen/AArch64/build-vector-extract.ll
    LLVM :: CodeGen/AArch64/div-rem-pair-recomposition-signed.ll
    LLVM :: CodeGen/AArch64/div-rem-pair-recomposition-unsigned.ll
    LLVM :: CodeGen/AArch64/expand-select.ll
    LLVM :: CodeGen/AArch64/extract-insert.ll
    LLVM :: CodeGen/AArch64/sadd_sat_vec.ll
    LLVM :: CodeGen/AArch64/ssub_sat_vec.ll
    LLVM :: CodeGen/AArch64/trunc-v1i64.ll
    LLVM :: CodeGen/AArch64/uadd_sat_vec.ll
    LLVM :: CodeGen/AArch64/usub_sat_vec.ll
    LLVM :: CodeGen/AArch64/vec_uaddo.ll
    LLVM :: CodeGen/AArch64/vec_umulo.ll
    LLVM :: CodeGen/AArch64/vecreduce-add-legalization.ll
    LLVM :: CodeGen/AArch64/vecreduce-and-legalization.ll
    LLVM :: CodeGen/AArch64/vecreduce-bool.ll
    LLVM :: CodeGen/AArch64/vecreduce-umax-legalization.ll

The tablegen error only happens on the i16 patterns:

def : Pat<(i16 (extractelt (v8i16 V128:$V), (i64 0))), (EXTRACT_SUBREG V128:$V, hsub)>;
def : Pat<(i16 (extractelt (v4i16 V64:$V), (i64 0))), (EXTRACT_SUBREG V64:$V, hsub)>;

If I remove these two patterns, make check-all passes.

I guess the reason is that i16 is not a legal type, so rules producing it would be strange.

Updated the patch by removing the patterns that generate i16. The patch passes make check-all on aarch64-linux.

The patch looks okay to me, but I am still curious what happens with i16. The lowering to umov w8, v0.h[1] in build-vector-extract.ll is probably the interesting one. This is probably covered by rule:

def : Pat<(sext_inreg (vector_extract (v8i16 V128:$Rn), VectorIndexH:$idx),i16),
        (i32 (SMOVvi16to32 V128:$Rn, VectorIndexH:$idx))>;

which is why you don't need the i16 case. But I am not really at my computer, perhaps you can confirm this.

In D67497#1668236, @SjoerdMeijer wrote:

The patch looks okay to me, but I am still curious what happens with i16. The lowering to umov w8, v0.h[1] in build-vector-extract.ll is probably the interesting one. This is probably covered by rule:

The patch would not impact that case as the patch only handles the first lane extraction: "Extracting lane zero is a special case"
The example of umov that you gave is moving the content of the second lane.

def : Pat<(sext_inreg (vector_extract (v8i16 V128:$Rn), VectorIndexH:$idx),i16),
        (i32 (SMOVvi16to32 V128:$Rn, VectorIndexH:$idx))>;

This pattern applies to all lanes.

Ah yes, copy-paste mistake, there is umov extracting the zero element, and you're right about that rule that applies to all lanes.

Anyway, if we are not missing a test case for this, then LGTM.

This revision is now accepted and ready to land. Sep 13 2019, 12:21 AM

Closed by commit rL371887: [aarch64] move custom isel of extract_vector_elt to td file - NFC (authored by spop). Sep 13 2019, 12:32 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp	43 lines
llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td	10 lines

Diff 220153

llvm/trunk/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp

     if (tryBitfieldExtractOpFromSExt(Node))
       return;
     break;

   case ISD::OR:
     if (tryBitfieldInsertOp(Node))
       return;
     break;

-  case ISD::EXTRACT_VECTOR_ELT: {
-    // Extracting lane zero is a special case where we can just use a plain
-    // EXTRACT_SUBREG instruction, which will become FMOV. This is easier for
-    // the rest of the compiler, especially the register allocator and copy
-    // propagation, to reason about, so is preferred when it's possible to
-    // use it.
-    ConstantSDNode *LaneNode = cast<ConstantSDNode>(Node->getOperand(1));
-    // Bail and use the default Select() for non-zero lanes.
-    if (LaneNode->getZExtValue() != 0)
-      break;
-    // If the element type is not the same as the result type, likewise
-    // bail and use the default Select(), as there's more to do than just
-    // a cross-class COPY. This catches extracts of i8 and i16 elements
-    // since they will need an explicit zext.
-    if (VT != Node->getOperand(0).getValueType().getVectorElementType())
-      break;
-    unsigned SubReg;
-    switch (Node->getOperand(0)
-                .getValueType()
-                .getVectorElementType()
-                .getSizeInBits()) {
-    default:
-      llvm_unreachable("Unexpected vector element type!");
-    case 64:
-      SubReg = AArch64::dsub;
-      break;
-    case 32:
-      SubReg = AArch64::ssub;
-      break;
-    case 16:
-      SubReg = AArch64::hsub;
-      break;
-    case 8:
-      llvm_unreachable("unexpected zext-requiring extract element!");
-    }
-    SDValue Extract = CurDAG->getTargetExtractSubreg(SubReg, SDLoc(Node), VT,
-                                                     Node->getOperand(0));
-    LLVM_DEBUG(dbgs() << "ISEL: Custom selection!\n=> ");
-    LLVM_DEBUG(Extract->dumpr(CurDAG));
-    LLVM_DEBUG(dbgs() << "\n");
-    ReplaceNode(Node, Extract.getNode());
-    return;
-  }
   case ISD::Constant: {
     // Materialize zero constants as copies from WZR/XZR. This allows
     // the coalescer to propagate these into other instructions.
     ConstantSDNode *ConstNode = cast<ConstantSDNode>(Node);
     if (ConstNode->isNullValue()) {
       if (VT == MVT::i32) {
         SDValue New = CurDAG->getCopyFromReg(
             CurDAG->getEntryNode(), SDLoc(Node), AArch64::WZR, MVT::i32);
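
The `ISD::EXTRACT_VECTOR_ELT` case removed from AArch64ISelDAGToDAG.cpp comes down to three checks: the lane must be zero, the result type must equal the vector element type, and the element width picks the sub-register index. The sketch below restates that decision outside of LLVM. `ToySubReg` and `laneZeroSubReg` are hypothetical names, and types are reduced to bit widths, whereas the real code compares value types. Where the original calls `llvm_unreachable`, this sketch simply reports that the custom path does not apply.

```cpp
#include <cassert>

// Hypothetical stand-in for the AArch64 sub-register indices used by the
// removed case. None means "bail out and let the default selection run".
enum class ToySubReg { None, Hsub, Ssub, Dsub };

// Simplified model of the removed custom selection. Lane is the constant
// lane index, ResultBits the width of the extract's result type, and
// ElementBits the width of the vector element type.
inline ToySubReg laneZeroSubReg(unsigned Lane, unsigned ResultBits,
                                unsigned ElementBits) {
  // Non-zero lanes are left to the default selection.
  if (Lane != 0)
    return ToySubReg::None;

  // A result wider than the element needs an explicit extension, which is
  // more than a plain sub-register copy. This is the case for i8 and i16
  // elements, according to the comment in the original code.
  if (ResultBits != ElementBits)
    return ToySubReg::None;

  switch (ElementBits) {
  case 64:
    return ToySubReg::Dsub;
  case 32:
    return ToySubReg::Ssub;
  case 16:
    return ToySubReg::Hsub;
  default:
    // The original treats any other width as unreachable.
    return ToySubReg::None;
  }
}
```

For example, `laneZeroSubReg(0, 64, 64)` yields `ToySubReg::Dsub`, matching the v2i64 pattern in the td file, while `laneZeroSubReg(0, 32, 16)` yields `ToySubReg::None`, the situation the thread discusses for 16-bit integer elements.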

llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td


 def : Pat<(AArch64tcret tglobaladdr:$dst, (i32 timm:$FPDiff)),
           (TCRETURNdi texternalsym:$dst, imm:$FPDiff)>;
 def : Pat<(AArch64tcret texternalsym:$dst, (i32 timm:$FPDiff)),
           (TCRETURNdi texternalsym:$dst, imm:$FPDiff)>;

 def MOVMCSym : Pseudo<(outs GPR64:$dst), (ins i64imm:$sym), []>, Sched<[]>;
 def : Pat<(i64 (AArch64LocalRecover mcsym:$sym)), (MOVMCSym mcsym:$sym)>;

+// Extracting lane zero is a special case where we can just use a plain
+// EXTRACT_SUBREG instruction, which will become FMOV. This is easier for the
+// rest of the compiler, especially the register allocator and copy propagation,
+// to reason about, so is preferred when it's possible to use it.
+let AddedComplexity = 10 in {
+  def : Pat<(i64 (extractelt (v2i64 V128:$V), (i64 0))), (EXTRACT_SUBREG V128:$V, dsub)>;
+  def : Pat<(i32 (extractelt (v4i32 V128:$V), (i64 0))), (EXTRACT_SUBREG V128:$V, ssub)>;
+  def : Pat<(i32 (extractelt (v2i32 V64:$V), (i64 0))), (EXTRACT_SUBREG V64:$V, ssub)>;
+}
+
 include "AArch64InstrAtomics.td"
 include "AArch64SVEInstrInfo.td"