This is an archive of the discontinued LLVM Phabricator instance.

Differential D100812

[DAGCombiner] Allow operand of step_vector to be negative.
ClosedPublic

Authored by junparser on Apr 19 2021, 9:05 PM.

Details

Reviewers

david-arm
sdesmalen
paulwalker-arm
ctetreau
efriedma
kmclaughlin
frasercrmck

Commits

rG978eb3f168be: [DAGCombiner] Allow operand of step_vector to be negative.

Summary

It is proper to relax non-negative limitation of step_vector. Also this patch adds more combines for step_vector:
(sub X, step_vector(C)) -> (add X, step_vector(-C))

TestPlan: check-llvm

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

junparser created this revision. Apr 19 2021, 9:05 PM

Herald added subscribers: ecnelises, hiraditya. Apr 19 2021, 9:05 PM

junparser requested review of this revision. Apr 19 2021, 9:05 PM

Herald added a project: Restricted Project. Apr 19 2021, 9:05 PM

Herald added a subscriber: llvm-commits.

junparser edited the summary of this revision. Apr 19 2021, 9:05 PM

junparser edited the summary of this revision. Apr 19 2021, 9:09 PM

Harbormaster completed remote builds in B99609: Diff 338702. Apr 19 2021, 9:39 PM

junparser added a child revision: D100816: [AArch64][SVE] Lower index_vector to step_vector. Apr 19 2021, 10:40 PM

rebase.

I've nothing against the change but it is more involved than updating the comment and assert. There are places where STEP_VECTOR's operand is treated as unsigned, based on the original requirement, that will need to be updated. For example DAGTypeLegalizer::PromoteIntRes_STEP_VECTOR. Ideally we'd want to add/update tests to show the signedness is properly protected during type legalisation.

Harbormaster completed remote builds in B99641: Diff 338754. Apr 20 2021, 2:09 AM

In D100812#2700861, @paulwalker-arm wrote:

I've nothing against the change but it is more involved than updating the comment and assert. There are places where STEP_VECTOR's operand is treated as unsigned, based on the original requirement, that will need to be updated. For example DAGTypeLegalizer::PromoteIntRes_STEP_VECTOR. Ideally we'd want to add/update tests to show the signedness is properly protected during type legalisation.

Yep, I'll update later.

junparser added a reviewer: frasercrmck. Apr 20 2021, 4:59 AM

Address comments.

@frasercrmck, Do you have any idea about the change in riscv?

Herald added subscribers: luismarques, apazos, sameer.abuasal and 18 others. Apr 20 2021, 5:37 AM

I also have nothing against the change in principle, but in addition to @paulwalker-arm's comments, RISC-V won't support this: it expects IMM to be 1, as it always was before this. We shouldn't introduce something that regresses this target, so the lowering of STEP_VECTOR will need to be extended to legalize/lower the operation.

Hmm I just saw that D100088 went in without anyone involved with RISC-V being on the reviewer list or notified. I think that's technically a regression since the RISC-V backend crashes on all those test cases that were added.

llvm/test/CodeGen/RISCV/rvv/stepvector.ll
277	This is definitely a regression so I think we need to see what's going on here. The split-vector legalization will be causing this. Perhaps because it's still zero-extending there?

Harbormaster completed remote builds in B99698: Diff 338838. Apr 20 2021, 7:20 AM

In D100812#2701403, @frasercrmck wrote:

I also have nothing against the change in principle, but in addition to @paulwalker-arm's comments, RISC-V won't support this: it expects IMM to be 1, as it always was before this. We shouldn't introduce something that regresses this target, so the lowering of STEP_VECTOR will need to be extended to legalize/lower the operation.

Yes, I just noticed that after D100088 committed. I'm fixing this.

In D100812#2701563, @frasercrmck wrote:

Hmm I just saw that D100088 went in without anyone involved with RISC-V being on the reviewer list or notified. I think that's technically a regression since the RISC-V backend crashes on all those test cases that were added.

OK, I saw D100856.

junparser added inline comments. Apr 20 2021, 9:23 PM

llvm/test/CodeGen/RISCV/rvv/stepvector.ll
277	This is caused by SplitVecRes_STEP_VECTOR, where getZExtOrTrunc was changed to getSExtOrTrunc. We may need to check whether the step value is non-negative.

Fix regression in riscv32.

Harbormaster completed remote builds in B99918: Diff 339146. Apr 21 2021, 2:51 AM

sdesmalen added inline comments. Apr 21 2021, 2:56 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3547	nit: double space.
3548	If there are multiple uses of step_vector(C), then it may be more beneficial to have a single step_vector(C) and use separate add/sub. Can you add a check that N1 has only a single use?
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1665	This can also use getSExtOrTrunc for the non-negative case?

paulwalker-arm added inline comments. Apr 21 2021, 2:58 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4794–4796	Is calling sextOrTrunc necessary? I would have thought you could just replace `getZExtValue()` with `getSExtValue()`. That said with @frasercrmck previous patch to remove the size restriction of `STEP_VECTOR`'s operands I don't believe there's need for this function to extend the operand at all and so `N->getOperand(0)` can be passed directly to `getStepVector`.
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	This doesn't look correct to me. The operand is either signed or unsigned and should be treated accordingly. Given this patch wants to allow negative value then we're essentially converting the STEP_VECTOR to take a signed operand and should always use SExt. That said I'm wondering if the issue here is using DAG to do the extension even though the operand is defined to be a constant. Perhaps the following works for you?

    SDValue StartOfHi = DAG.getVScale(HiVT.getVectorElementType(), StepVal * LoVT.getVectorMinNumElements());

frasercrmck added inline comments. Apr 21 2021, 3:05 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4794–4796	Yeah I think if the incoming step is legal (which it possibly isn't in theory -- in practice we always create a step with TypeToTransformTo?) then it should be possible to pass it straight through.
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	I agree, I think it should be consistently SExt. I think @junparser tried to mitigate a RISC-V regression but in line with what you're saying, I bet some of the poor RISC-V code comes from the DAG extending VSCALE and not being able to optimize it due to not being able to infer the known bits.

junparser added inline comments. Apr 21 2021, 3:47 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4794–4796	OK, let's use it directly and add an assertion here.
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	@paulwalker-arm, the tradeoff here is an attempt to mitigate a regression on riscv-32 with element types like i64. Meanwhile, an extension is necessary for vscale(i64) on riscv-32. I checked PromoteIntRes_VSCALE, which does not do truncation. @frasercrmck, would it be OK to relax vscale in the same way as step_vector?

junparser added inline comments. Apr 21 2021, 4:31 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3548	Will update later.
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	@paulwalker-arm, the tradeoff here is an attempt to mitigate a regression on riscv-32 with element types like i64. Meanwhile, an extension is necessary for vscale(i64) on riscv-32, since there is no ExpandIntegerResult handling for vscale. @frasercrmck, would it be OK to relax vscale in the same way as step_vector?

paulwalker-arm added inline comments. Apr 21 2021, 5:17 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	I believe `VSCALE` and `STEP_VECTOR` have different problems. `STEP_VECTOR` has a vector result but scalar operand so it seemed reasonable to allow the flexibility with regards to its operand's type. I don't believe the same flexibility should be given to `VSCALE` because that is a scalar only operation. That's to say that if a target doesn't support i64 for the operand type, then it will also not support i64 as its result. I mean if we decided `STEP_VECTOR`'s operand was not worth the effort and removed it then this code will still need to plant an explicit `MUL` that'll have the same problem. So whilst I agree there's a problem I'm not sure this is the function to solve it. Perhaps SPLAT_VECTOR_PARTS need to be involved somewhere? Perhaps what we're really asking for here is that `SPLAT_VECTOR`'s operand be relaxed to allow an operand that is smaller than its result element type?

junparser added inline comments. Apr 21 2021, 5:57 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	Yes, relaxing vscale does not work here. I did some local testing in ExpandIntegerResult:

    case ISD::VSCALE: {
      EVT VT = N->getValueType(0);
      EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
      SDLoc dl(N);
      SDValue NewNode = DAG.getNode(
          ISD::MUL, dl, VT,
          DAG.getZExtOrTrunc(
              DAG.getVScale(dl, NVT, APInt(NVT.getSizeInBits(), 1)), dl, VT),
          N->getOperand(0));
      ReplaceValueWith(SDValue(N, 0), NewNode);
      break;
    }

It turns out, as @frasercrmck said, that we get much worse code here, so I prefer to keep SEXT here.

Matt added a subscriber: Matt. Apr 21 2021, 6:24 AM

Address comments.

junparser added inline comments. Apr 21 2021, 7:27 AM

llvm/test/CodeGen/RISCV/rvv/stepvector.ll
277	@frasercrmck still keep this issue open.

Harbormaster completed remote builds in B99978: Diff 339230. Apr 21 2021, 7:59 AM

paulwalker-arm added inline comments. Apr 21 2021, 8:48 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	I'm not happy with this approach. It doesn't make sense to change the definition of `ISD::STEP_VECTOR` and then have to tiptoe around this definition because a target might not be in a position to generate the most optimal code for it. I see two options:
1. We change the definition of `ISD::STEP_VECTOR` and work within that definition, or
2. We maintain the current definition.
Personally I prefer option 1 because it's in the spirit of this node's original intent, but option 2 also works until we're in a position to revisit. That said, I really think the poor code generation problem is more to do with the definition of `ISD::SPLAT_VECTOR`, which doesn't work well for targets whose max legal scalar type is smaller than their max legal vector element type.

frasercrmck added inline comments. Apr 21 2021, 8:57 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	I haven't had time to dig in but I was wondering if the issue of poor code generation could be mitigated by some "known bits" analysis for `ISD::VSCALE`. I don't know if there's precedent for target-specific known-bits analysis of a generic node, but I feel like that's what it'd have to be. RISC-V has known limits on `vscale`. We do have `ISD::SPLAT_VECTOR_PARTS` to deal with `ISD::SPLAT_VECTOR` for targets whose max scalar type is smaller than the max vector element type. But I feel like the sign extension is still going to be the blocker?
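
The concern about sign extension on a target with 32-bit scalars can be illustrated with a small stand-alone sketch. It is not LLVM code: `SplatParts`, `splitSplat64`, `signExtendToParts` and `zeroExtendToParts` are hypothetical names, and the model only assumes what the `SPLAT_VECTOR_PARTS` documentation states, namely that a 64-bit splat is given as 32-bit scalars with the least significant bits first.

```cpp
#include <cassert>
#include <cstdint>

// Two 32-bit scalars that together describe one 64-bit splat value
// (hypothetical model of the SPLAT_VECTOR_PARTS operands).
struct SplatParts {
  uint32_t lo; // least significant 32 bits
  uint32_t hi; // most significant 32 bits
};

// Split a full 64-bit value into its two 32-bit parts.
SplatParts splitSplat64(int64_t value) {
  const uint64_t bits = static_cast<uint64_t>(value);
  return {static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32)};
}

// Sign-extend a 32-bit value to 64 bits, expressed as parts: the high part
// depends on the sign of the low part.
SplatParts signExtendToParts(int32_t value) {
  return {static_cast<uint32_t>(value), value < 0 ? 0xFFFFFFFFu : 0u};
}

// Zero-extend a 32-bit value to 64 bits, expressed as parts: the high part
// is always zero.
SplatParts zeroExtendToParts(int32_t value) {
  return {static_cast<uint32_t>(value), 0u};
}
```

With zero extension the high part is a known constant, whereas with sign extension it has to be derived from the low part. When that low part is a runtime quantity such as a multiple of `vscale`, this may be one reason the generated code gets worse unless the sign can be inferred.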

junparser added inline comments. Apr 21 2021, 8:02 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1658–1665	It is also OK with me to maintain the current definition of step_vector, although we would then have to guarantee that the operand of step_vector remains a positive constant after the combines in D100088. For D100816, we would also need some extra patterns to handle the negative case with sub. I prefer option 1 as well. @frasercrmck, do you have any suggestions about where to start? I may have time to dig in.

craig.topper added a subscriber: craig.topper. Apr 21 2021, 8:25 PM

craig.topper added inline comments.

llvm/test/CodeGen/RISCV/rvv/stepvector.ll
277	Does f6d8cf7798440f303d5a273999e6647cbe795ac6 make this code better?

rebased.
@craig.topper, rv32 now gets the same code as rv64, thanks!

Harbormaster completed remote builds in B100161: Diff 339482. Apr 21 2021, 11:58 PM

A couple of minor requests plus it's worth adding a "step-vector has more than one use" test but otherwise looks good to me. Thanks for your efforts @junparser, also thanks @frasercrmck and @craig.topper for the assists.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3552	DL
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4712	to fit

This revision is now accepted and ready to land. Apr 22 2021, 3:14 AM

Address comments.

The patch looks good to me, thanks @junparser.

Harbormaster completed remote builds in B100228: Diff 339568. Apr 22 2021, 5:47 AM

This revision was landed with ongoing or failed builds. Apr 22 2021, 5:58 AM

Closed by commit rG978eb3f168be: [DAGCombiner] Allow operand of step_vector to be negative. (authored by junparser).

This revision was automatically updated to reflect the committed changes.

junparser added a commit: rG978eb3f168be: [DAGCombiner] Allow operand of step_vector to be negative.

junparser mentioned this in D100856: [RISCV] Support STEP_VECTOR with a step greater than one. Apr 22 2021, 6:03 AM

Revision Contents

Path                                                      Size
llvm/include/llvm/CodeGen/ISDOpcodes.h                    6 lines
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp             8 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp    3 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp     2 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp            3 lines
llvm/test/CodeGen/AArch64/sve-stepvector.ll               45 lines
llvm/test/CodeGen/RISCV/rvv/stepvector.ll                 14 lines

Diff 338838

llvm/include/llvm/CodeGen/ISDOpcodes.h

@@ enum NodeType {
   /// allows representing a 64-bit splat on a target with 32-bit integers. The
   /// total width of the scalars must cover the element width. SCALAR1 contains
   /// the least significant bits of the value regardless of endianness and all
   /// scalars should have the same type.
   SPLAT_VECTOR_PARTS,

   /// STEP_VECTOR(IMM) - Returns a scalable vector whose lanes are comprised
   /// of a linear sequence of unsigned values starting from 0 with a step of
-  /// IMM, where IMM must be a vector index constant positive integer value
-  /// which must fit in the vector element type.
+  /// IMM, where IMM must be a vector index constant integer value which must
+  /// fit in the vector element type.
   /// Note that IMM may be a smaller type than the vector element type, in
-  /// which case the step is implicitly zero-extended to the vector element
+  /// which case the step is implicitly sign-extended to the vector element
   /// type. IMM may also be a larger type than the vector element type, in
   /// which case the step is implicitly truncated to the vector element type.
   /// The operation does not support returning fixed-width vectors or
   /// non-constant operands. If the sequence value exceeds the limit allowed
   /// for the element type then the values for those lanes are undefined.
   STEP_VECTOR,

   /// MULHU/MULHS - Multiply high - Multiply two integers of type iN,

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

@@ SDValue DAGCombiner::visitSUB(SDNode *N) {
   }

   // canonicalize (sub X, (vscale * C)) to (add X, (vscale * -C))
   if (N1.getOpcode() == ISD::VSCALE) {
     const APInt &IntVal = N1.getConstantOperandAPInt(0);
     return DAG.getNode(ISD::ADD, DL, VT, N0, DAG.getVScale(DL, VT, -IntVal));
   }

+  // canonicalize (sub X, step_vector(C)) to (add X, step_vector(-C))
+  if (N1.getOpcode() == ISD::STEP_VECTOR) {
+    SDValue NewStep = DAG.getConstant(-N1.getConstantOperandAPInt(0), DL,
+                                      N1.getOperand(0).getValueType());
+    return DAG.getNode(ISD::ADD, DL, VT, N0,
+                       DAG.getStepVector(SDLoc(N), VT, NewStep));
+  }
+
   // Prefer an add for more folding potential and possibly better codegen:
   // sub N0, (lshr N10, width-1) --> add N0, (ashr N10, width-1)
   if (!LegalOperations && N1.getOpcode() == ISD::SRL && N1.hasOneUse()) {
     SDValue ShAmt = N1.getOperand(1);
     ConstantSDNode *ShAmtC = isConstOrConstSplat(ShAmt);
     if (ShAmtC &&
         ShAmtC->getAPIntValue() == (N1.getScalarValueSizeInBits() - 1)) {
       SDValue SRA = DAG.getNode(ISD::SRA, DL, VT, N1.getOperand(0), ShAmt);

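The canonicalization added to `visitSUB` relies on a lane-wise identity that can be checked outside LLVM. The following is only a sketch under stated assumptions: the lane count is fixed (a real scalable vector has `vscale` times a minimum number of lanes), elements are 16 bits wide with wrapping arithmetic, and `stepVector`, `subStep` and `addNegStep` are hypothetical helpers rather than SelectionDAG API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of step_vector(step) with 16-bit lanes: lane i holds i * step,
// reduced modulo 2^16.
std::vector<uint16_t> stepVector(std::size_t lanes, int32_t step) {
  std::vector<uint16_t> v(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    v[i] = static_cast<uint16_t>(i * static_cast<uint16_t>(step));
  return v;
}

// (sub X, step_vector(C)), evaluated lane by lane.
std::vector<uint16_t> subStep(const std::vector<uint16_t> &x, int16_t c) {
  const std::vector<uint16_t> s = stepVector(x.size(), c);
  std::vector<uint16_t> r(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    r[i] = static_cast<uint16_t>(x[i] - s[i]);
  return r;
}

// (add X, step_vector(-C)), evaluated lane by lane.
std::vector<uint16_t> addNegStep(const std::vector<uint16_t> &x, int16_t c) {
  const std::vector<uint16_t> s = stepVector(x.size(), -static_cast<int32_t>(c));
  std::vector<uint16_t> r(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    r[i] = static_cast<uint16_t>(x[i] + s[i]);
  return r;
}
```

For a splat of 2 and C = 1, both forms give lanes 2, 1, 0, -1, ... (modulo 2^16), which corresponds to the `index z0.h, #2, #-1` expected by the AArch64 test `sub_stepvector_nxv8i16` in this revision.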
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

 SDValue DAGTypeLegalizer::PromoteIntRes_STEP_VECTOR(SDNode *N) {
   SDLoc dl(N);
   EVT OutVT = N->getValueType(0);
   EVT NOutVT = TLI.getTypeToTransformTo(*DAG.getContext(), OutVT);
   assert(NOutVT.isVector() && "Type must be promoted to a vector type");
   EVT NOutElemVT = TLI.getTypeToTransformTo(*DAG.getContext(),
                                             NOutVT.getVectorElementType());
   APInt StepVal = cast<ConstantSDNode>(N->getOperand(0))->getAPIntValue();
-  SDValue Step = DAG.getConstant(StepVal.getZExtValue(), dl, NOutElemVT);
+  SDValue Step = DAG.getConstant(
+      StepVal.sextOrTrunc(NOutElemVT.getSizeInBits()), dl, NOutElemVT);
   return DAG.getStepVector(dl, NOutVT, Step);
 }

 SDValue DAGTypeLegalizer::PromoteIntRes_CONCAT_VECTORS(SDNode *N) {
   SDLoc dl(N);

   EVT OutVT = N->getValueType(0);
   EVT NOutVT = TLI.getTypeToTransformTo(*DAG.getContext(), OutVT);
   assert(NOutVT.isVector() && "This type must be promoted to a vector type");

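Why the promotion in `PromoteIntRes_STEP_VECTOR` has to preserve the sign can be seen with plain fixed-width integers. This is a minimal model, not the APInt API: it assumes an 8-bit step promoted to a 16-bit element type, and `widenStepZExt`, `widenStepSExt` and `laneValue` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Widen an 8-bit step to 16 bits by zero-extension (the old behaviour,
// which was fine while steps were required to be non-negative).
int16_t widenStepZExt(int8_t step) {
  return static_cast<int16_t>(static_cast<uint8_t>(step));
}

// Widen an 8-bit step to 16 bits by sign-extension (what a signed step needs).
int16_t widenStepSExt(int8_t step) { return static_cast<int16_t>(step); }

// Value of lane `lane` of a step vector with 16-bit elements.
// Intended for small lane indices so the product stays in range.
int16_t laneValue(int16_t step, int lane) {
  return static_cast<int16_t>(step * lane);
}
```

A step of -1 zero-extends to 255, so lane 3 would become 765 instead of -3. For non-negative steps both extensions agree, which is why the difference only shows up once negative steps are allowed.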
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

   assert(N->getValueType(0).isScalableVector() &&
          "Only scalable vectors are supported for STEP_VECTOR");
   std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));
   SDValue Step = N->getOperand(0);

   Lo = DAG.getNode(ISD::STEP_VECTOR, dl, LoVT, Step);

   // Hi = Lo + (EltCnt * Step)
   EVT EltVT = Step.getValueType();
   SDValue StartOfHi =
       DAG.getVScale(dl, EltVT,
                     cast<ConstantSDNode>(Step)->getAPIntValue() *
                         LoVT.getVectorMinNumElements());
-  StartOfHi = DAG.getZExtOrTrunc(StartOfHi, dl, HiVT.getVectorElementType());
+  StartOfHi = DAG.getSExtOrTrunc(StartOfHi, dl, HiVT.getVectorElementType());
   StartOfHi = DAG.getNode(ISD::SPLAT_VECTOR, dl, HiVT, StartOfHi);

   Hi = DAG.getNode(ISD::STEP_VECTOR, dl, HiVT, Step);
   Hi = DAG.getNode(ISD::ADD, dl, HiVT, Hi, StartOfHi);
 }

 void DAGTypeLegalizer::SplitVecRes_ScalarOp(SDNode *N, SDValue &Lo,
                                             SDValue &Hi) {
   EVT LoVT, HiVT;
   SDLoc dl(N);
   std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));

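The split rule `Hi = Lo + (EltCnt * Step)` and the effect of the extension choice can be sketched with ordinary vectors. The assumptions are loud ones: the number of lanes per half is a fixed number standing in for `vscale * LoVT.getVectorMinNumElements()`, the step type is 32 bits and the element type 64 bits (as in the riscv-32 case discussed in the review), and `stepVector`, `Extend` and `splitHi` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of a step vector with 64-bit lanes: lane i holds i * step.
std::vector<int64_t> stepVector(std::size_t lanes, int64_t step) {
  std::vector<int64_t> v(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    v[i] = static_cast<int64_t>(i) * step;
  return v;
}

enum class Extend { Zero, Sign };

// High half of a split step vector: a fresh step vector plus a splat of
// StartOfHi = lanesPerHalf * step. StartOfHi is computed in the 32-bit step
// type and then widened to the 64-bit element type.
std::vector<int64_t> splitHi(std::size_t lanesPerHalf, int32_t step,
                             Extend ext) {
  const int32_t startOfHi32 = static_cast<int32_t>(lanesPerHalf) * step;
  const int64_t startOfHi =
      ext == Extend::Sign
          ? static_cast<int64_t>(startOfHi32)
          : static_cast<int64_t>(static_cast<uint32_t>(startOfHi32));
  std::vector<int64_t> hi = stepVector(lanesPerHalf, step);
  for (int64_t &lane : hi)
    lane += startOfHi;
  return hi;
}
```

With four lanes per half and a step of -1, the sign-extended variant reproduces lanes 4 to 7 of the unsplit vector (-4, -5, -6, -7), while the zero-extended variant starts the high half at 4294967292. For non-negative steps the two variants coincide.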
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

@@ SDValue SelectionDAG::getNode(unsigned Opcode, const SDLoc &DL, EVT VT,
   switch (Opcode) {
   case ISD::STEP_VECTOR:
     assert(VT.isScalableVector() &&
            "STEP_VECTOR can only be used with scalable types");
     assert(VT.getScalarSizeInBits() >= 8 &&
            "STEP_VECTOR can only be used with vectors of integers that are at "
            "least 8 bits wide");
     assert(isa<ConstantSDNode>(Operand) &&
-           cast<ConstantSDNode>(Operand)->getAPIntValue().isNonNegative() &&
            cast<ConstantSDNode>(Operand)->getAPIntValue().isSignedIntN(
                VT.getScalarSizeInBits()) &&
-           "Expected STEP_VECTOR integer constant to be positive and fit in "
+           "Expected STEP_VECTOR integer constant to be fit in "
            "the vector element type");
     break;
   case ISD::FREEZE:
     assert(VT == Operand.getValueType() && "Unexpected VT!");
     break;
   case ISD::TokenFactor:
   case ISD::MERGE_VALUES:
   case ISD::CONCAT_VECTORS:

llvm/test/CodeGen/AArch64/sve-stepvector.ll

 entry:
   %0 = insertelement <vscale x 8 x i8> poison, i8 2, i32 0
   %1 = shufflevector <vscale x 8 x i8> %0, <vscale x 8 x i8> poison, <vscale x 8 x i32> zeroinitializer
   %2 = call <vscale x 8 x i8> @llvm.experimental.stepvector.nxv8i8()
   %3 = shl <vscale x 8 x i8> %2, %1
   ret <vscale x 8 x i8> %3
 }

+define <vscale x 8 x i16> @sub_stepvector_nxv8i16() {
+; CHECK-LABEL: sub_stepvector_nxv8i16:
+; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: index z0.h, #2, #-1
+; CHECK-NEXT: ret
+entry:
+  %0 = insertelement <vscale x 8 x i16> poison, i16 2, i32 0
+  %1 = shufflevector <vscale x 8 x i16> %0, <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer
+  %2 = call <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
+  %3 = sub <vscale x 8 x i16> %1, %2
+  ret <vscale x 8 x i16> %3
+}
+
+define <vscale x 8 x i8> @promote_sub_stepvector_nxv8i8() {
+; CHECK-LABEL: promote_sub_stepvector_nxv8i8:
+; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: index z0.h, #2, #-1
+; CHECK-NEXT: ret
+entry:
+  %0 = insertelement <vscale x 8 x i8> poison, i8 2, i32 0
+  %1 = shufflevector <vscale x 8 x i8> %0, <vscale x 8 x i8> poison, <vscale x 8 x i32> zeroinitializer
+  %2 = call <vscale x 8 x i8> @llvm.experimental.stepvector.nxv8i8()
+  %3 = sub <vscale x 8 x i8> %1, %2
+  ret <vscale x 8 x i8> %3
+}
+
+define <vscale x 16 x i32> @split_sub_stepvector_nxv16i32() {
+; CHECK-LABEL: split_sub_stepvector_nxv16i32:
+; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: cntw x9
+; CHECK-NEXT: cnth x8
+; CHECK-NEXT: neg x9, x9
+; CHECK-NEXT: index z0.s, #0, #-1
+; CHECK-NEXT: neg x8, x8
+; CHECK-NEXT: mov z1.s, w9
+; CHECK-NEXT: mov z3.s, w8
+; CHECK-NEXT: add z1.s, z0.s, z1.s
+; CHECK-NEXT: add z2.s, z0.s, z3.s
+; CHECK-NEXT: add z3.s, z1.s, z3.s
+; CHECK-NEXT: ret
+entry:
+  %0 = call <vscale x 16 x i32> @llvm.experimental.stepvector.nxv16i32()
+  %1 = sub <vscale x 16 x i32> zeroinitializer, %0
+  ret <vscale x 16 x i32> %1
+}
+
 declare <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
	declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()			declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
	declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()			declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
	declare <vscale x 16 x i8> @llvm.experimental.stepvector.nxv16i8()			declare <vscale x 16 x i8> @llvm.experimental.stepvector.nxv16i8()

	declare <vscale x 4 x i64> @llvm.experimental.stepvector.nxv4i64()			declare <vscale x 4 x i64> @llvm.experimental.stepvector.nxv4i64()
	declare <vscale x 16 x i32> @llvm.experimental.stepvector.nxv16i32()			declare <vscale x 16 x i32> @llvm.experimental.stepvector.nxv16i32()
	declare <vscale x 2 x i32> @llvm.experimental.stepvector.nxv2i32()			declare <vscale x 2 x i32> @llvm.experimental.stepvector.nxv2i32()
	declare <vscale x 8 x i8> @llvm.experimental.stepvector.nxv8i8()			declare <vscale x 8 x i8> @llvm.experimental.stepvector.nxv8i8()
	declare <vscale x 4 x i16> @llvm.experimental.stepvector.nxv4i16()			declare <vscale x 4 x i16> @llvm.experimental.stepvector.nxv4i16()
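
The new sub_stepvector tests above exercise the combine (sub X, step_vector(C)) -> (add X, step_vector(-C)). As a sanity check of why that rewrite is value-preserving, here is a small per-lane model on 16-bit lanes with wraparound arithmetic. It is an illustrative sketch only: stepLane, subStepLane and addNegStepLane are hypothetical helpers, not functions from DAGCombiner.cpp.

```cpp
#include <cstdint>

// Lane i of step_vector(C) on 16-bit lanes: C * i, modulo 2^16.
constexpr uint16_t stepLane(int16_t c, uint32_t i) {
  return static_cast<uint16_t>(
      static_cast<uint32_t>(static_cast<uint16_t>(c)) * i);
}

// Lane i of (sub splat(X), step_vector(C)).
constexpr uint16_t subStepLane(uint16_t x, int16_t c, uint32_t i) {
  return static_cast<uint16_t>(static_cast<uint32_t>(x) - stepLane(c, i));
}

// Lane i of (add splat(X), step_vector(-C)), the rewritten form.
constexpr uint16_t addNegStepLane(uint16_t x, int16_t c, uint32_t i) {
  return static_cast<uint16_t>(
      static_cast<uint32_t>(x) + stepLane(static_cast<int16_t>(-c), i));
}
```

For X = 2 and C = 1 both forms give lanes 2, 1, 0, 0xFFFF, ..., that is, a sequence with base 2 and step -1, which is what the expected `index z0.h, #2, #-1` in sub_stepvector_nxv8i16 encodes.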

llvm/test/CodeGen/RISCV/rvv/stepvector.ll

	}			}

	declare <vscale x 16 x i64> @llvm.experimental.stepvector.nxv16i64()			declare <vscale x 16 x i64> @llvm.experimental.stepvector.nxv16i64()

	define <vscale x 16 x i64> @stepvector_nxv16i64() {			define <vscale x 16 x i64> @stepvector_nxv16i64() {
	; RV32-LABEL: stepvector_nxv16i64:			; RV32-LABEL: stepvector_nxv16i64:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: csrr a0, vlenb			; RV32-NEXT: csrr a0, vlenb
	; RV32-NEXT: vsetvli a1, zero, e64,m8,ta,mu			; RV32-NEXT: srai a1, a0, 31
				frasercrmck (inline, Not Done): This is definitely a regression so I think we need to see what's going on here. The split-vector legalization will be causing this. Perhaps because it's still zero-extending there?
				junparser (author; inline, Done): This is caused by SplitVecRes_STEP_VECTOR, which changes getZExtOrTrunc to getSExtOrTrunc. We may need to check whether the step value is non-negative (isNonNegative).
				junparser (author; inline, Done): @frasercrmck still keep this issue open.
				craig.topper (inline, Not Done): Does f6d8cf7798440f303d5a273999e6647cbe795ac6 make this code better?
	; RV32-NEXT: vmv.v.x v8, a0			; RV32-NEXT: vsetvli a2, zero, e64,m8,ta,mu
	; RV32-NEXT: addi a0, zero, 32			; RV32-NEXT: vmv.v.x v8, a1
	; RV32-NEXT: vsll.vx v8, v8, a0			; RV32-NEXT: addi a1, zero, 32
	; RV32-NEXT: vsrl.vx v16, v8, a0			; RV32-NEXT: vsll.vx v8, v8, a1
				; RV32-NEXT: vmv.v.x v16, a0
				; RV32-NEXT: vsll.vx v16, v16, a1
				; RV32-NEXT: vsrl.vx v16, v16, a1
				; RV32-NEXT: vor.vv v16, v16, v8
	; RV32-NEXT: vid.v v8			; RV32-NEXT: vid.v v8
	; RV32-NEXT: vadd.vv v16, v8, v16			; RV32-NEXT: vadd.vv v16, v8, v16
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: stepvector_nxv16i64:			; RV64-LABEL: stepvector_nxv16i64:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: csrr a0, vlenb			; RV64-NEXT: csrr a0, vlenb
	; RV64-NEXT: vsetvli a1, zero, e64,m8,ta,mu			; RV64-NEXT: vsetvli a1, zero, e64,m8,ta,mu
	; RV64-NEXT: vid.v v8			; RV64-NEXT: vid.v v8
	; RV64-NEXT: vadd.vx v16, v8, a0			; RV64-NEXT: vadd.vx v16, v8, a0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%v = call <vscale x 16 x i64> @llvm.experimental.stepvector.nxv16i64()			%v = call <vscale x 16 x i64> @llvm.experimental.stepvector.nxv16i64()
	ret <vscale x 16 x i64> %v			ret <vscale x 16 x i64> %v
	}			}
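
The inline thread on the RV32 checks above turns on the difference between zero-extending and sign-extending a constant when it is widened. The following standalone sketch shows that difference for a 32-bit value widened to 64 bits. The widths and the helper names (zeroExtend32, signExtend32, extensionsAgree) are chosen for illustration and are assumptions; this is not the code in SplitVecRes_STEP_VECTOR.

```cpp
#include <cstdint>

// Widen a 32-bit value to 64 bits as if it were unsigned (zero-extension).
constexpr uint64_t zeroExtend32(int32_t v) {
  return static_cast<uint64_t>(static_cast<uint32_t>(v));
}

// Widen a 32-bit value to 64 bits preserving its sign (sign-extension).
constexpr uint64_t signExtend32(int32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(v));
}

// The two widenings produce the same bits exactly when v is non-negative.
constexpr bool extensionsAgree(int32_t v) {
  return zeroExtend32(v) == signExtend32(v);
}
```

A negative value such as -1 only keeps its meaning under sign-extension, which is why a signed step needs it. For non-negative values the two widenings agree, which is consistent with the suggestion in the thread to check isNonNegative before deciding which extension to emit.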

Revision Contents

Diff 338838
