This is an archive of the discontinued LLVM Phabricator instance.

Differential D17973

[DAG]: Combine multiple AssertZexts separated by trunc
Needs Review, Public

Authored by hans on Mar 8 2016, 3:59 PM.

This revision needs review, but there are no reviewers specified.

Details

Reviewers: None

Summary

(Uploading as a work-in-progress in case anyone wants to comment.)

The two AssertZexts situation arises for boolean arguments on X86_64:

void f(bool x) { /* ... */ }

LLVM will copy x from a 32-bit register, insert an AssertZext to capture that bits 8-31 are zero (as byte-sized arguments get extended on x86_64), and then another AssertZext to capture that bits 1-7 are also zero, because that's true for bools.
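To make the reasoning concrete, here is a minimal standalone C++ sketch on plain integers. It is not LLVM code, and the helper names (upperBitsZero, bothAssertionsHold, combinedAssertionHolds, combineIsSound) are hypothetical. It models an AssertZext as "all bits at or above position N are zero" and checks that the i8 assertion on the 32-bit value plus the i1 assertion on its 8-bit truncation say the same thing as a single i1 assertion on the 32-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of AssertZext (not an LLVM API): true if every bit of
// `v` at position `fromBits` or higher is zero.
static bool upperBitsZero(uint32_t v, unsigned fromBits) {
  return fromBits >= 32 || (v >> fromBits) == 0;
}

// Models (assert_zext (trunc (assert_zext x, i8) to i8), i1) for a 32-bit x:
// bits 8-31 of x are zero, and bits 1-7 of the truncated byte are zero.
static bool bothAssertionsHold(uint32_t x) {
  uint8_t truncated = static_cast<uint8_t>(x);
  return upperBitsZero(x, 8) && upperBitsZero(truncated, 1);
}

// Models the combined form (trunc (assert_zext x, i1) to i8):
// bits 1-31 of x are zero.
static bool combinedAssertionHolds(uint32_t x) {
  return upperBitsZero(x, 1);
}

// Compares the two forms over every value that fits in 17 bits, which
// covers both the byte boundary and some bits above it.
static bool combineIsSound() {
  for (uint32_t x = 0; x < (1u << 17); ++x)
    if (bothAssertionsHold(x) != combinedAssertionHolds(x))
      return false;
  return true;
}
```

The sketch only covers the i32/i8/i1 case from this summary; it says nothing about whether the first assertion is justified by the ABI in the first place.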

Combining the two allows us to optimize this code:

void g(bool);
void f(bool x) {
  g(x);
}

Previously:

movzbl  %dil, %edi
jmp     _Z1gb

After my patch, no zero-extension is generated, as the value in %edi was already extended by the caller.

This raises a number of questions, though:

Are 8- and 16-bit arguments really extended on x86_64? LLVM assumes so since http://reviews.llvm.org/rL34623, but the psABI doesn't seem to actually spell that out?

Are bools really extended to 32 bits on x86_64? http://reviews.llvm.org/rL127766 says they are not and "[...] We still insert an i32 ZextAssert when reading a function's arguments, but it is followed by a truncate and another i8 ZextAssert so it is not optimized." The psABI seems to have gone back and forth here(!).

My patch would break Win64, because there it's clear that a bool argument is extended only to 8 bits, not 32 bits. But from what I understand, this isn't represented well: Clang's WinX86_64ABIInfo::classify will slap a "zext" attribute on the x parameter, and we end up with both AssertZexts.
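The Win64 concern can be illustrated with another small standalone C++ sketch on plain integers (again not LLVM code, with hypothetical function names). It assumes, as described above, that the caller guarantees only the low 8 bits of the register, so bits 8-31 may hold anything. Under that assumption the callee still needs its own zero-extension, which is the movzbl the combine would remove.

```cpp
#include <cassert>
#include <cstdint>

// What the callee does today: zero-extend the low byte before forwarding
// (models movzbl %dil, %edi).
static uint32_t forwardWithZext(uint32_t reg) { return reg & 0xFFu; }

// What the callee would do if it trusted bits 8-31 to be zero already.
static uint32_t forwardWithoutZext(uint32_t reg) { return reg; }

// True if skipping the zero-extension yields the same forwarded value
// for this particular register content.
static bool skippingZextIsSafe(uint32_t reg) {
  return forwardWithZext(reg) == forwardWithoutZext(reg);
}
```

With a register that was extended to 32 bits (for example 0x00000001) the two paths agree. With stale upper bits (for example 0xABCD0001) they differ, so the fold is only valid when the 32-bit extension is actually guaranteed.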

Diff Detail

Event Timeline

hans updated this revision to Diff 50078. (Mar 8 2016, 3:59 PM)

hans retitled this revision to [DAG]: Combine multiple AssertZexts separated by trunc.

hans updated this object.

hans added subscribers: rnk, llvm-commits.

Revision Contents

Path: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Size: 27 lines

Diff 50078

lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[222 unchanged lines not shown] private:
     // Visitation implementation - Implement dag node combining for different
     // node types. The semantics are as follows:
     // Return Value:
     // SDValue.getNode() == 0 - No change was made
     // SDValue.getNode() == N - N was replaced, is dead and has been handled.
     // otherwise - N should be replaced by the returned Operand.
     //
     SDValue visitTokenFactor(SDNode *N);
+    SDValue visitAssertZext(SDNode *N);
     SDValue visitMERGE_VALUES(SDNode *N);
     SDValue visitADD(SDNode *N);
     SDValue visitSUB(SDNode *N);
     SDValue visitADDC(SDNode *N);
     SDValue visitSUBC(SDNode *N);
     SDValue visitADDE(SDNode *N);
     SDValue visitSUBE(SDNode *N);
     SDValue visitMUL(SDNode *N);
[1,106 unchanged lines not shown] void DAGCombiner::Run(CombineLevel AtLevel) {
   DAG.setRoot(Dummy.getValue());
   DAG.RemoveDeadNodes();
 }

 SDValue DAGCombiner::visit(SDNode *N) {
   switch (N->getOpcode()) {
   default: break;
   case ISD::TokenFactor: return visitTokenFactor(N);
+  case ISD::AssertZext: return visitAssertZext(N);
   case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
   case ISD::ADD: return visitADD(N);
   case ISD::SUB: return visitSUB(N);
   case ISD::ADDC: return visitADDC(N);
   case ISD::SUBC: return visitSUBC(N);
   case ISD::ADDE: return visitADDE(N);
   case ISD::SUBE: return visitSUBE(N);
   case ISD::MUL: return visitMUL(N);
[238 unchanged lines not shown] if (Changed) {
     bool UseAA = CombinerAA.getNumOccurrences() > 0 ? CombinerAA
                                                      : DAG.getSubtarget().useAA();
     return CombineTo(N, Result, UseAA /*add to worklist*/);
   }

   return Result;
 }

+SDValue DAGCombiner::visitAssertZext(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  EVT VT = N->getValueType(0);
+
+  // Try to fold (assert_zext (trunc (assert_zext x t1)) t2).
+  if (N0.getOpcode() == ISD::TRUNCATE &&
+      N0.getOperand(0)->getOpcode() == ISD::AssertZext) {
+    SDValue Trunc = N0;
+    SDValue AZext1 = N0.getOperand(0);
+    EVT T1 = cast<VTSDNode>(AZext1->getOperand(1))->getVT();
+    EVT T2 = cast<VTSDNode>(N->getOperand(1))->getVT();
+
+    if (Trunc->getValueType(0) == T1 && T2.getSizeInBits() < T1.getSizeInBits()) {
+      // Trunc removes the known zeros of AZext1.
+      SDValue CombinedAssertZext =
+          DAG.getNode(ISD::AssertZext, SDLoc(N), AZext1->getValueType(0),
+                      AZext1->getOperand(0), N->getOperand(1));
+      return DAG.getNode(ISD::TRUNCATE, SDLoc(N), VT, CombinedAssertZext);
+
+    }
+  }
+
+  return SDValue();
+}
+
 /// MERGE_VALUES can always be eliminated.
 SDValue DAGCombiner::visitMERGE_VALUES(SDNode *N) {
   WorklistRemover DeadNodes(*this);
   // Replacing results may cause a different MERGE_VALUES to suddenly
   // be CSE'd with N, and carry its uses with it. Iterate until no
   // uses remain, to ensure that the node can be safely deleted.
   // First add the users of this node to the work list so that they
   // can be tried again once they have new operands.
[13,216 unchanged lines not shown]