This is an archive of the discontinued LLVM Phabricator instance.

lib/Target/X86/X86ISelDAGToDAG.cpp
712	Why are we matching the pattern in reverse? Normally we would look for a sub followed by a shift. Why are we starting with a shift and looking backwards?

lebedev.ri added a subscriber: lebedev.ri. Jul 29 2018, 10:57 AM

lebedev.ri added inline comments.

lib/Target/X86/X86ISelDAGToDAG.cpp
703–707	I suspect this comment should be at the call site. The comment here should ideally explain what the function does, more specifically.
712	Early return?
720	Early return?
729	It would be great if there were a comment before each of these two cases showing which pattern it transforms into what.

jbhateja added inline comments. Jul 29 2018, 11:10 AM

lib/Target/X86/X86ISelDAGToDAG.cpp
712	The following DAG transformation is being done in this revision:
	%1 = SHL X, 2
	%2 = SUB Y, %1
	      |
	      V
	%1 = SUB 0, X
	%2 = SHL %1, 2 -> A
	%3 = ADD Y, %2 -> B
	We are trying to remap the SHL + SUB pattern so that it can be consumed easily. The transformation currently does not kick in for arbitrary SUB patterns, only for the case where the SUB is a user of the SHL.

craig.topper added inline comments. Jul 29 2018, 11:17 AM

lib/Target/X86/X86ISelDAGToDAG.cpp
712	But why can't you look for subtracts that have shl as an operand? That's how nearly every DAG pattern match is written. I'm trying to understand why this one is different. A call to use_begin is pretty unusual in the DAG.
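
The operand-first matching style craig.topper describes (start at the SUB and inspect its operands, rather than start at the SHL and walk its users) can be sketched on a toy graph. This is only an illustration: `Node`, `Op`, `ShlOperandMatch` and `matchSubOfShl` are hypothetical names and are not part of the SelectionDAG API.

```cpp
#include <cassert>
#include <vector>

// Toy expression node; a stand-in for illustration, not llvm::SDNode.
enum class Op { Value, Const, Shl, Sub };

struct Node {
  Op op;
  long imm;                            // used only by Op::Const
  std::vector<const Node *> operands;  // inputs, in operand order
};

struct ShlOperandMatch {
  const Node *shiftedValue;  // X in (shl X, C)
  long shiftAmount;          // C
  bool shlIsLHS;             // true for (sub (shl X, C), Y)
};

// Root-first match: begin at the SUB and look at its operands for a
// shift-left by a constant in [0, 3]. No use-list walking is needed.
inline bool matchSubOfShl(const Node &root, ShlOperandMatch &out) {
  if (root.op != Op::Sub || root.operands.size() != 2)
    return false;
  for (int i = 0; i < 2; ++i) {
    const Node *opnd = root.operands[i];
    if (opnd->op != Op::Shl || opnd->operands.size() != 2)
      continue;
    const Node *amount = opnd->operands[1];
    if (amount->op != Op::Const || amount->imm < 0 || amount->imm > 3)
      continue;
    out = ShlOperandMatch{opnd->operands[0], amount->imm, i == 0};
    return true;
  }
  return false;
}
```

Because the match is rooted at the SUB, the function is handed exactly the node that would be replaced, which is the shape the later revision of `pruneForLEAExtraction` in this review takes.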

[X86] Review comments incorporation for patch D49966

jbhateja marked 7 inline comments as done. Jul 29 2018, 12:46 PM

jbhateja added reviewers: craig.topper, lebedev.ri.

[X86] Correcting a comment in patch D49966

Harbormaster completed remote builds in B20829: Diff 157896. Jul 29 2018, 12:56 PM

lebedev.ri added inline comments. Jul 29 2018, 1:53 PM

lib/Target/X86/X86ISelDAGToDAG.cpp
705–707	These can be moved to after the lambda.
708	But this doesn't check that its argument is ValidConstantShift. It checks that the `getOperand(1).getNode()` of the argument is `ValidConstantShift`. How about you sink the `Opr0.getOpcode() == ISD::SHL` check into the lambda, and rename the lambda somehow?
710–711	Is this sufficient to limit this to scalars only? Also, should this be limited to only some specific `EVT`s?
716	Are there other patterns with different root nodes possible? If not, I'd early-return.
717–718	General naming comment: it seems these would normally be named `N0` and `N1`, because they are the 0th and 1st operands of `N`.
744	Any thoughts on whether the `shl` should be one-use?

[X86] Review comment resolutions for patch D49966

jbhateja marked 5 inline comments as done. Jul 29 2018, 8:43 PM

craig.topper added inline comments. Jul 29 2018, 10:04 PM

lib/Target/X86/X86ISelDAGToDAG.cpp
747	You can't call ReplaceAllUses with the caller holding an iterator pointing to the node after N. The replacement can trigger a CSE update that can invalidate the iterator. You eventually need to delete N and the subtract once you've replaced it. I'm not sure anything will remove it before it goes through instruction selection. But the iterator in the caller makes that hard. You can't delete N until the iterator is moved past it. But you can't delete the subtract instruction because the iterator might be pointing at it. So I think if this optimization makes any changes, you need to call RemoveDeadNodes after the loop. Since that would be the only time it's safe to remove both.
765	Your new code needs to do the --I, Replace, ++I, Delete that's done here. Otherwise I might get invalidated by the Delete.
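
The `--I`, replace, `++I`, delete ordering can be illustrated outside LLVM with a plain `std::list`. The sketch below is an analogy only: `replaceUses` is a hypothetical stand-in that, like a replacement triggering a CSE update, may erase the node immediately after the current one but never the current node itself.

```cpp
#include <cassert>
#include <iterator>
#include <list>

using NodeList = std::list<int>;

// Hypothetical stand-in for "replace all uses": inserts the replacement before
// `cur` and erases any later node equal to it (a crude model of CSE merging).
// It never erases `cur`, but it may erase cur's immediate successor.
inline void replaceUses(NodeList &nodes, NodeList::iterator cur, int replacement) {
  for (auto it = std::next(cur); it != nodes.end();) {
    if (*it == replacement)
      it = nodes.erase(it);
    else
      ++it;
  }
  nodes.insert(cur, replacement);
}

// Replaces every negative node with its absolute value and deletes the original.
inline void replaceNegativeNodes(NodeList &nodes) {
  for (auto it = nodes.begin(); it != nodes.end();) {
    auto cur = it++;   // post-increment, as in the loop under review
    if (*cur >= 0)
      continue;
    --it;              // park the loop iterator on cur, which survives the replacement
    replaceUses(nodes, cur, -*cur);
    ++it;              // pick the successor only after the list has settled
    nodes.erase(cur);  // safe: `it` no longer refers to cur
  }
}
```

With the input `{1, -2, 2, 3}` the replacement for `-2` erases the following `2`; had the loop iterator still pointed at that successor, it would have been left dangling.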

xbolva00 added a subscriber: xbolva00. Jul 30 2018, 7:21 AM

This comment was removed by xbolva00.

RKSimon added inline comments. Jul 30 2018, 9:10 AM

test/CodeGen/X86/lea-opt.ll
312	Are these attributes necessary: dso_local, local_unnamed_addr + #0? I think you might just need nounwind ?

[X86] Review comments incorporation for patch D49966

Harbormaster completed remote builds in B20853: Diff 158025. Jul 30 2018, 11:30 AM

RKSimon added inline comments. Jul 31 2018, 2:53 AM

test/CodeGen/X86/lea-opt.ll
311	Please can you give the functions more useful (short, descriptive) names, commit the tests to trunk with current codegen and then update this patch to show the codegen diff

[X86] Review comments resolution for patch D49966
Patch rebase.

lebedev.ri edited the summary of this revision. Jul 31 2018, 11:38 PM

lebedev.ri added inline comments. Jul 31 2018, 11:42 PM

lib/Target/X86/X86ISelDAGToDAG.cpp
725–728	This turned out hard to read. Can you replace it with
	%o0 = shl i8 %x, C      ->   %n0 = sub i8 0, %y
	%r  = sub i8 %o0, %y         %n1 = shl i8 %x, C
	                             %r  = add i8 %n1, %n0
734–737
	%o0 = shl i8 %x, C      ->   %n0 = sub i8 0, %x
	%r  = sub i8 %y, %o0         %n1 = shl i8 %n0, C
	                             %r  = add i8 %y, %n1
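
Both rewrites rely on two's-complement wraparound identities: `(x << C) - y == (x << C) + (0 - y)` and `y - (x << C) == y + ((0 - x) << C)`. A minimal sketch that checks them on 32-bit unsigned values (the function names are made up for illustration; the "alive link" requested later in the thread would be the actual proof):

```cpp
#include <cassert>
#include <cstdint>

// Case 1, original form: (x << c) - y.
constexpr std::uint32_t shlMinusY(std::uint32_t x, std::uint32_t y, unsigned c) {
  return (x << c) - y;
}

// Case 1, rewritten form: (x << c) + (0 - y).
constexpr std::uint32_t shlPlusNegY(std::uint32_t x, std::uint32_t y, unsigned c) {
  return (x << c) + (0u - y);
}

// Case 2, original form: y - (x << c).
constexpr std::uint32_t yMinusShl(std::uint32_t x, std::uint32_t y, unsigned c) {
  return y - (x << c);
}

// Case 2, rewritten form: y + ((0 - x) << c).
constexpr std::uint32_t yPlusShlOfNegX(std::uint32_t x, std::uint32_t y, unsigned c) {
  return y + ((0u - x) << c);
}
```

Unsigned arithmetic is used so that overflow is well defined in C++; the shift amount is assumed to stay in the 0 to 3 range the patch accepts.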

jbhateja edited the summary of this revision. Aug 1 2018, 3:25 AM

We do actually want to see that alive link though.

[X86] Limiting optimization to i32 and i64 types

jbhateja added inline comments. Aug 2 2018, 10:15 AM

lib/Target/X86/X86ISelDAGToDAG.cpp

725–728

Optimization has been limited to i32 and i16. The i8 optimization is correct, but LEA extraction is being done only for 32- and 64-bit types. One way is to promote i8 types to i32, which will trigger LEA selection, and then truncate the result to i8; this is what is being done for i16.

/// Return true if the target has native support for the specified value type
/// and it is 'desirable' to use the type for the given node type. e.g. On x86
/// i16 is legal, but undesirable since i16 instruction encodings are longer and
/// some i16 instructions are slow.
bool X86TargetLowering::isTypeDesirableForOp(unsigned Opc, EVT VT) const {
  if (!isTypeLegal(VT))
    return false;

  // There are no vXi8 shifts.
  if (Opc == ISD::SHL && VT.isVector() && VT.getVectorElementType() == MVT::i8)
    return false;

  if (VT != MVT::i16)   // emphasized by the commenter
    return true;
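
The promote-then-truncate route mentioned in this comment works because the low 16 bits of a 32-bit result depend only on the low 16 bits of the inputs. A small sketch, with made-up function names, comparing a step-by-step 16-bit evaluation of `y - (x << c)` with the LEA-friendly form computed in 32 bits and truncated afterwards:

```cpp
#include <cassert>
#include <cstdint>

// y - (x << c), wrapping to 16 bits after every step.
inline std::uint16_t subShlNarrow(std::uint16_t x, std::uint16_t y, unsigned c) {
  const std::uint16_t shl =
      static_cast<std::uint16_t>(static_cast<std::uint32_t>(x) << c);
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(y) - shl);
}

// Same value via promotion: y + ((0 - x) << c) in 32 bits, then truncate.
inline std::uint16_t subShlPromoted(std::uint16_t x, std::uint16_t y, unsigned c) {
  const std::uint32_t wideX = x;
  const std::uint32_t wideY = y;
  const std::uint32_t wide = wideY + ((0u - wideX) << c);
  return static_cast<std::uint16_t>(wide);
}
```

This mirrors the shape of the `test15` case in the patch, where the i16 inputs are zero-extended to i32, combined, and truncated back.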

Ping reviewers, can put this under Size oriented optimization.

jbhateja edited the summary of this revision. Aug 8 2018, 9:54 AM

In D49966#1189555, @jbhateja wrote:

Ping reviewers, can put this under Size oriented optimization.

Please check Roman's comment too
https://bugs.llvm.org/show_bug.cgi?id=37939#c4

lebedev.ri added inline comments. Aug 16 2018, 3:00 AM

lib/Target/X86/X86ISelDAGToDAG.cpp
706–707	Can you add a comment here in the code as to why this restriction exists?
712	I'm not sure why you need `.getNode()`? Elsewhere `isa<ConstantSDNode>(SDValue)` works fine.
713–716	Can you add a comment before the `if` as to why the limit is `3`?
720–721	This is the only use of `Opcode` variable. Inline it into the usage? Why not early return? It took me a moment to understand that we indeed don't have an `else if` for this `if`.
774	Hm, why do we want to only do this in `-Os`/`-Oz`?
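
On the question of why the limit is `3`: an x86 addressing mode can scale its index register only by 1, 2, 4 or 8, so only shift amounts 0 through 3 can be folded into an LEA. That is presumably the reason for the check in the patch; the helper below is a hypothetical illustration, not code from the revision.

```cpp
#include <cassert>

// Returns the addressing-mode scale (1, 2, 4 or 8) equivalent to a left shift
// by `shiftAmount`, or 0 when the shift is too large to be encoded as a scale.
constexpr unsigned leaScaleForShift(unsigned long long shiftAmount) {
  return shiftAmount <= 3 ? (1u << shiftAmount) : 0u;
}
```

For example, the `shll $2` in `test7` becomes the `,4` scale in `leal (%rdi,%rsi,4), %eax`.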

xbolva00 added inline comments. Aug 16 2018, 3:23 AM

lib/Target/X86/X86ISelDAGToDAG.cpp
774	I agree with Roman, we should do it always.

lebedev.ri requested changes to this revision. Aug 30 2018, 10:54 AM

This revision now requires changes to proceed. Aug 30 2018, 10:54 AM

Also we should avoid cases like
mov eax, dword ptr [rsi]
lea eax, [rax + rdi]

and use add instruction like gcc:
add edi, DWORD PTR [rsi]

https://godbolt.org/z/KQAOV-

cc @craig.topper

In D49966#1268534, @xbolva00 wrote:

Also we should avoid cases like
mov eax, dword ptr [rsi]
lea eax, [rax + rdi]

and use add instruction like gcc:
add edi, DWORD PTR [rsi]

https://godbolt.org/z/KQAOV-

cc @craig.topper

Please file a bug. That's a completely different issue.

Does this handle also e.g.

int foo(int i, int c) {
  c *= 2*i+1;
  return c;
}

? We miss lea here..

In D49966#1343135, @xbolva00 wrote:
Does this handle also e.g.

int foo(int i, int c) {
  c *= 2*i+1;
  return c;
}

? We miss lea here..

Do you have a Godbolt link? I’m seeing an LEA for that, except on Intel CPUs that have a slow 3-source LEA instruction. On those CPUs we issue an LEA, ADD, IMUL.

https://godbolt.org/z/0ZOx52

In D49966#1343225, @xbolva00 wrote:

https://godbolt.org/z/0ZOx52

That behavior is intentional. The X86FixupLEAs.cpp pass changed from the gcc/icc code to LEA+ADD because "lea eax, [rdi+1+rdi]" is a 3 cycle instruction with 1 cycle reciprocal throughput on all Intel Core CPUs from Sandy Bridge to present. "lea eax, [rdi+rdi]" is 1 cycle latency with 0.5 reciprocal throughput. And "add eax, 1" is 1 cycle with a reciprocal throughput of 0.33 or 0.25. So the 2 instructions should be better performing than the single LEA. Though it is bad that -Os doesn't disable this optimization.
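
The latency argument can be written out as arithmetic. The constants below are the cycle counts quoted in the comment above (they are not measured here), and the names are made up for illustration.

```cpp
#include <cassert>

// Latencies in cycles, as quoted in the review comment.
constexpr int kThreeSourceLeaLatency = 3;  // lea eax, [rdi+1+rdi]
constexpr int kTwoSourceLeaLatency = 1;    // lea eax, [rdi+rdi]
constexpr int kAddLatency = 1;             // add eax, 1

// The ADD depends on the LEA result, so the two latencies add up.
constexpr int splitSequenceLatency() {
  return kTwoSourceLeaLatency + kAddLatency;
}
```

The split sequence costs 2 cycles on the dependent chain versus 3 for the single LEA, at the price of one extra instruction, which is why the comment calls it a problem that `-Os` does not disable the split.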

This review seems to be stuck/dead, consider abandoning if no longer relevant.

This revision now requires review to proceed. Jan 12 2023, 4:41 PM

Herald added projects: Restricted Project, Restricted Project. Jan 12 2023, 4:41 PM

Herald added subscribers: StephenFan, pengfei.

Revision Contents

Path	Size
lib/Target/X86/X86ISelDAGToDAG.cpp	64 lines
test/CodeGen/X86/lea-opt.ll	57 lines

Diff 158791

lib/Target/X86/X86ISelDAGToDAG.cpp

[184 lines not shown; context: public:]
     void PostprocessISelDAG() override;

 // Include the pieces autogenerated from the target description.
 #include "X86GenDAGISel.inc"

   private:
     void Select(SDNode *N) override;

+    SDValue pruneForLEAExtraction(SDNode *N);
     bool foldOffsetIntoAddress(uint64_t Offset, X86ISelAddressMode &AM);
     bool matchLoadInAddress(LoadSDNode *N, X86ISelAddressMode &AM);
     bool matchWrapper(SDValue N, X86ISelAddressMode &AM);
     bool matchAddress(SDValue N, X86ISelAddressMode &AM);
     bool matchVectorAddress(SDValue N, X86ISelAddressMode &AM);
     bool matchAdd(SDValue N, X86ISelAddressMode &AM, unsigned Depth);
     bool matchAddressRecursively(SDValue N, X86ISelAddressMode &AM,
                                  unsigned Depth);
[493 lines not shown; context: if (Chain.getOperand(0).getNode() == Callee.getNode())]
     return true;
   if (Chain.getOperand(0).getOpcode() == ISD::TokenFactor &&
       Callee.getValue(1).isOperandOf(Chain.getOperand(0).getNode()) &&
       Callee.getValue(1).hasOneUse())
     return true;
   return false;
 }

+// Perform DAG pruning to facilitate LEA selection.
+SDValue X86DAGToDAGISel::pruneForLEAExtraction(SDNode *N) {
+  EVT VT = N->getValueType(0);
+  if (VT != MVT::i32 && VT != MVT::i64)
+    return SDValue();
    [inline comment, lebedev.ri, Done] I suspect this comment should be at the call site. The comment here should ideally explain what the function does, more specifically.
    [inline comment, lebedev.ri, Done] These can be moved to after the lambda.
    [inline comment, lebedev.ri, Not Done] Can you add a comment here in the code as to why this restriction exists?
+
    [inline comment, lebedev.ri, Done] But this doesn't check that its argument is ValidConstantShift. It checks that the `getOperand(1).getNode()` of the argument is `ValidConstantShift`. How about you sink the `Opr0.getOpcode() == ISD::SHL` check into the lambda, and rename the lambda somehow?
+  auto IsValidConstantShift = [=](SDValue Shift) {
+    if (Shift.getOpcode() != ISD::SHL)
+      return false;
    [inline comment, lebedev.ri, Done] Is this sufficient to limit this to scalars only? Also, should this be limited to only some specific `EVT`s?
+    SDNode *ShiftVal = Shift.getOperand(1).getNode();
    [inline comment, craig.topper, Done] Why are we matching the pattern in reverse? Normally we would look for a sub followed by a shift. Why are we starting with a shift and looking backwards?
    [inline comment, jbhateja (author), Done] The following DAG transformation is being done in this revision: (%1 = SHL X, 2; %2 = SUB Y, %1) becomes (%1 = SUB 0, X; %2 = SHL %1, 2 -> A; %3 = ADD Y, %2 -> B). We are trying to remap the SHL + SUB pattern so that it can be consumed easily. The transformation currently does not kick in for arbitrary SUB patterns, only for the case where the SUB is a user of the SHL.
    [inline comment, craig.topper, Done] But why can't you look for subtracts that have shl as an operand? That's how nearly every DAG pattern match is written. I'm trying to understand why this one is different. A call to use_begin is pretty unusual in the DAG.
    [inline comment, lebedev.ri, Done] Early return?
    [inline comment, lebedev.ri, Not Done] I'm not sure why you need `.getNode()`? Elsewhere `isa<ConstantSDNode>(SDValue)` works fine.
+    if (isa<ConstantSDNode>(ShiftVal) &&
+        (cast<ConstantSDNode>(ShiftVal)->getZExtValue() <= 3))
+      return true;
+    return false;
    [inline comment, lebedev.ri, Not Done] Are there other patterns with different root nodes possible? If not, I'd early-return.
    [inline comment, lebedev.ri, Not Done] Can you add a comment before the `if` as to why the limit is `3`?
+  };
+
    [inline comment, lebedev.ri, Done] General naming comment: it seems these would normally be named `N0` and `N1`, because they are the 0th and 1st operands of `N`.
+  SDLoc DL(N);
+  unsigned Opcode = N->getOpcode();
    [inline comment, lebedev.ri, Done] Early return?
+  if (Opcode == ISD::SUB) {
    [inline comment, lebedev.ri, Not Done] 1. This is the only use of `Opcode` variable. Inline it into the usage? 2. Why not early return? It took me a moment to understand that we indeed don't have an `else if` for this `if`.
+    SDValue N0 = N->getOperand(0);
+    SDValue N1 = N->getOperand(1);
+    if (IsValidConstantShift(N0)) {
+      // Following transformation is being done
+      // %1 = SHL X , 2     ->   %1 = SUB 0 , Y
+      // %2 = SUB %1 , Y         %2 = SHL X , 2
+      //                         %3 = ADD %2 , %1
    [inline comment, lebedev.ri, Not Done] This turned out hard to read. Can you replace it with: (%o0 = shl i8 %x, C; %r = sub i8 %o0, %y) -> (%n0 = sub i8 0, %y; %n1 = shl i8 %x, C; %r = add i8 %n1, %n0)
    [inline comment, jbhateja (author), Not Done] Optimization has been limited to i32 and i16. The i8 optimization is correct, but LEA extraction is being done only for 32- and 64-bit types. One way is to promote i8 types to i32, which will trigger LEA selection, and then truncate the result to i8; this is what is being done for i16. (The comment then quotes X86TargetLowering::isTypeDesirableForOp up to its `if (VT != MVT::i16) return true;` check.)
+      SDValue OpsSUB[] = {CurDAG->getConstant(0, DL, VT), N1};
    [inline comment, lebedev.ri, Done] It would be great if there were a comment before each of these two cases showing which pattern it transforms into what.
+      SDValue NewSub = CurDAG->getNode(ISD::SUB, DL, VT, OpsSUB);
+      SDValue OpsAdd[] = {N0, NewSub};
+      return CurDAG->getNode(ISD::ADD, DL, VT, OpsAdd);
+    } else if (IsValidConstantShift(N1)) {
+      // Following transformation is being done
+      // %1 = SHL X , CONST   ->   %1 = SUB 0 , X
+      // %2 = SUB Y , %1           %2 = SHL %1 , CONST
+      //                           %3 = ADD Y , %2
    [inline comment, lebedev.ri, Not Done] (%o0 = shl i8 %x, C; %r = sub i8 %y, %o0) -> (%n0 = sub i8 0, %x; %n1 = shl i8 %n0, C; %r = add i8 %y, %n1)
+      SDValue OpsSUB[] = {CurDAG->getConstant(0, DL, VT), N1.getOperand(0)};
+      SDValue NewSub = CurDAG->getNode(ISD::SUB, DL, VT, OpsSUB);
+      SDValue OpsShift[] = {NewSub, N1.getOperand(1)};
+      SDValue NewShift = CurDAG->getNode(ISD::SHL, DL, VT, OpsShift);
+      SDValue OpsAdd[] = {N0, NewShift};
+      return CurDAG->getNode(ISD::ADD, DL, VT, OpsAdd);
+    }
    [inline comment, lebedev.ri, Done] Any thoughts on whether the `shl` should be one-use?
+  }
+  return SDValue();
+}
    [inline comment, craig.topper, Not Done] You can't call ReplaceAllUses with the caller holding an iterator pointing to the node after N. The replacement can trigger a CSE update that can invalidate the iterator. You eventually need to delete N and the subtract once you've replaced it. I'm not sure anything will remove it before it goes through instruction selection. But the iterator in the caller makes that hard. You can't delete N until the iterator is moved past it. But you can't delete the subtract instruction because the iterator might be pointing at it. So I think if this optimization makes any changes, you need to call RemoveDeadNodes after the loop. Since that would be the only time it's safe to remove both.

 void X86DAGToDAGISel::PreprocessISelDAG() {
   // OptFor[Min]Size are used in pattern predicates that isel is matching.
   OptForSize = MF->getFunction().optForSize();
   OptForMinSize = MF->getFunction().optForMinSize();
   assert((!OptForMinSize || OptForSize) && "OptForMinSize implies OptForSize");

+  bool TriggerDeadNodesRemoval = false;
   for (SelectionDAG::allnodes_iterator I = CurDAG->allnodes_begin(),
        E = CurDAG->allnodes_end(); I != E; ) {
     SDNode *N = &*I++; // Preincrement iterator to avoid invalidation issues.

     // If this is a target specific AND node with no flag usages, turn it back
     // into ISD::AND to enable test instruction matching.
     if (N->getOpcode() == X86ISD::AND && !N->hasAnyUseOfValue(1)) {
       SDValue Res = CurDAG->getNode(ISD::AND, SDLoc(N), N->getValueType(0),
                                     N->getOperand(0), N->getOperand(1));
       --I;
    [inline comment, craig.topper, Not Done] Your new code needs to do the --I, Replace, ++I, Delete that's done here. Otherwise I might get invalidated by the Delete.
       CurDAG->ReplaceAllUsesOfValueWith(SDValue(N, 0), Res);
       ++I;
       CurDAG->DeleteNode(N);
       continue;
     }

+    // Look for pattern which could be transformed to facilitate
+    // addressing mode based LEA selection.
+    if (OptForSize) {
    [inline comment, lebedev.ri, Not Done] Hm, why do we want to only do this in `-Os`/`-Oz`?
    [inline comment, xbolva00, Not Done] I agree with Roman, we should do it always.
+      SDValue NewN = pruneForLEAExtraction(N);
+      if (NewN.getNode()) {
+        --I;
+        CurDAG->ReplaceAllUsesOfValueWith(SDValue(N, 0), NewN);
+        ++I;
+        CurDAG->DeleteNode(N);
+        TriggerDeadNodesRemoval = true;
+        continue;
+      }
+    }
+
     if (OptLevel != CodeGenOpt::None &&
         // Only do this when the target can fold the load into the call or
         // jmp.
         !Subtarget->useRetpoline() &&
         ((N->getOpcode() == X86ISD::CALL && !Subtarget->slowTwoMemOps()) ||
          (N->getOpcode() == X86ISD::TC_RETURN &&
           (Subtarget->is64Bit() ||
            !getTargetMachine().isPositionIndependent())))) {
[88 lines not shown; context: for (SelectionDAG::allnodes_iterator I = CurDAG->allnodes_begin(),]
     --I;
     CurDAG->ReplaceAllUsesOfValueWith(SDValue(N, 0), Result);

     // Now that we did that, the node is dead. Increment the iterator to the
     // next node to process, then delete N.
     ++I;
     CurDAG->DeleteNode(N);
   }
+  if (TriggerDeadNodesRemoval)
+    CurDAG->RemoveDeadNodes();
 }


 void X86DAGToDAGISel::PostprocessISelDAG() {
   // Skip peepholes at -O0.
   if (TM.getOptLevel() == CodeGenOpt::None)
     return;

[2,588 lines not shown]

test/CodeGen/X86/lea-opt.ll

[302 lines not shown; context: sw.bb.2: ; preds = %entry]
   store i32 333, i32* %b, align 4
   store i32 444, i32* %c, align 4
   br label %sw.epilog

 sw.epilog: ; preds = %sw.bb.2, %sw.bb.1, %entry
   ret void
 }

 define i32 @test5(i32 %x, i32 %y) #0 {
    [inline comment, RKSimon, Not Done] Please can you give the functions more useful (short, descriptive) names, commit the tests to trunk with current codegen and then update this patch to show the codegen diff
 ; CHECK-LABEL: test5:
    [inline comment, RKSimon, Not Done] Are these attributes necessary: dso_local, local_unnamed_addr + #0? I think you might just need nounwind ?
 ; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: addl %esi, %esi
-; CHECK-NEXT: subl %esi, %edi
-; CHECK-NEXT: movl %edi, %eax
+; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: negl %esi
+; CHECK-NEXT: leal (%rdi,%rsi,2), %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i32 %y, -2
   %add = add nsw i32 %mul, %x
   ret i32 %add
 }

 define i32 @test6(i32 %x, i32 %y) #0 {
 ; CHECK-LABEL: test6:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
 ; CHECK-NEXT: leal (%rsi,%rsi,2), %eax
 ; CHECK-NEXT: subl %eax, %edi
 ; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i32 %y, -3
   %add = add nsw i32 %mul, %x
   ret i32 %add
 }

 define i32 @test7(i32 %x, i32 %y) #0 {
 ; CHECK-LABEL: test7:
 ; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: shll $2, %esi
-; CHECK-NEXT: subl %esi, %edi
-; CHECK-NEXT: movl %edi, %eax
+; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: negl %esi
+; CHECK-NEXT: leal (%rdi,%rsi,4), %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i32 %y, -4
   %add = add nsw i32 %mul, %x
   ret i32 %add
 }

 define i32 @test8(i32 %x, i32 %y) #0 {
 ; CHECK-LABEL: test8:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
-; CHECK-NEXT: leal (,%rsi,4), %eax
-; CHECK-NEXT: subl %edi, %eax
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: negl %edi
+; CHECK-NEXT: leal (%rdi,%rsi,4), %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = shl nsw i32 %y, 2
   %sub = sub nsw i32 %mul, %x
   ret i32 %sub
 }


 define i32 @test9(i32 %x, i32 %y) #0 {
 ; CHECK-LABEL: test9:
 ; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: addl %esi, %esi
-; CHECK-NEXT: subl %esi, %edi
-; CHECK-NEXT: movl %edi, %eax
+; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: negl %esi
+; CHECK-NEXT: leal (%rdi,%rsi,2), %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i32 -2, %y
   %add = add nsw i32 %x, %mul
   ret i32 %add
 }

 define i32 @test10(i32 %x, i32 %y) #0 {
 ; CHECK-LABEL: test10:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
 ; CHECK-NEXT: leal (%rsi,%rsi,2), %eax
 ; CHECK-NEXT: subl %eax, %edi
 ; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i32 -3, %y
   %add = add nsw i32 %x, %mul
   ret i32 %add
 }

 define i32 @test11(i32 %x, i32 %y) #0 {
 ; CHECK-LABEL: test11:
 ; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: shll $2, %esi
-; CHECK-NEXT: subl %esi, %edi
-; CHECK-NEXT: movl %edi, %eax
+; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: negl %esi
+; CHECK-NEXT: leal (%rdi,%rsi,4), %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i32 -4, %y
   %add = add nsw i32 %x, %mul
   ret i32 %add
 }

 define i32 @test12(i32 %x, i32 %y) #0 {
 ; CHECK-LABEL: test12:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
-; CHECK-NEXT: leal (,%rsi,4), %eax
-; CHECK-NEXT: subl %edi, %eax
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: negl %edi
+; CHECK-NEXT: leal (%rdi,%rsi,4), %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i32 4, %y
   %sub = sub nsw i32 %mul, %x
   ret i32 %sub
 }

 define i64 @test13(i64 %x, i64 %y) #0 {
 ; CHECK-LABEL: test13:
 ; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: shlq $2, %rsi
-; CHECK-NEXT: subq %rsi, %rdi
-; CHECK-NEXT: movq %rdi, %rax
+; CHECK-NEXT: negq %rsi
+; CHECK-NEXT: leaq (%rdi,%rsi,4), %rax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i64 -4, %y
   %add = add nsw i64 %x, %mul
   ret i64 %add
 }

 define i32 @test14(i32 %x, i32 %y) #0 {
 ; CHECK-LABEL: test14:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
-; CHECK-NEXT: leal (,%rsi,4), %eax
-; CHECK-NEXT: subl %edi, %eax
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: negl %edi
+; CHECK-NEXT: leal (%rdi,%rsi,4), %eax
 ; CHECK-NEXT: retq
 entry:
   %mul = mul nsw i32 4, %y
   %sub = sub nsw i32 %mul, %x
   ret i32 %sub
 }

 define zeroext i16 @test15(i16 zeroext %x, i16 zeroext %y) #0 {
 ; CHECK-LABEL: test15:
 ; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: shll $3, %esi
-; CHECK-NEXT: subl %esi, %edi
-; CHECK-NEXT: movl %edi, %eax
+; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: negl %esi
+; CHECK-NEXT: leal (%rdi,%rsi,8), %eax
+; CHECK-NEXT: # kill: def $ax killed $ax killed $eax
 ; CHECK-NEXT: retq
 entry:
   %conv = zext i16 %x to i32
   %conv1 = zext i16 %y to i32
   %mul = mul nsw i32 -8, %conv1
   %add = add nsw i32 %conv, %mul
   %conv2 = trunc i32 %add to i16
   ret i16 %conv2
 }

 attributes #0 = { norecurse nounwind optsize readnone uwtable}


[X86] Performing DAG pruning before selection of LEA instructions. Needs Review. Public.
