This is an archive of the discontinued LLVM Phabricator instance.

[X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handling
ClosedPublic

Authored by craig.topper on Sep 26 2017, 12:39 AM.

Download Raw Diff

Details

Reviewers

RKSimon
spatel
zvi

Commits

rGfd6b8a67fb9e: [X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handling
rL314430: [X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handling

Summary

Lowering never creates X86ISD::UMUL for 8-bit types. X86ISD::UMUL8 is used instead. If X86ISD::UMUL 8-bit were ever used it would crash.

DAGCombiner replaces UMUL_LOHI/SMUL_LOHI with a wider MUL and a shift if the type twice as wide is legal. So we should never see i8 UMUL_LOHI/SMUL_LOHI. In fact I think there was a bug in part of the i8 code. Similar is true for i16 though without the bug.

Diff Detail

Repository: rL LLVM

Event Timeline

craig.topper created this revision.Sep 26 2017, 12:39 AM

craig.topper added inline comments.

lib/Target/X86/X86ISelDAGToDAG.cpp
2733 ↗	(On Diff #116647)	Shouldn't this ReplaceUses call have used SDValue(Node, 0)?

What is the test coverage like? Searching for "mul i8" in x86 doesn't find much.....

lib/Target/X86/X86ISelDAGToDAG.cpp
2733 ↗	(On Diff #116647)	Yes, I think so. It looks like there is a lot of a MUL_LOHI ambiguity.

Rough outline of mul handling as I understand it.

-A plain mul i8 in IR will go through pretty much untouched and be pattern matched to MUL8r/MUL8m during isel.
-X86ISD::UMUL8/SMUL8 are created from i8 mul with overflow intrinsics. Requires special isel handling due to fixed register constraints on the MUL8r instruction which is made worse by the isel pattern for plain mul that caused the implicit defs list to be in a weird order.
-X86ISD::UMUL is created by i16/i32/i64 multiply with overflow intrinsics. Requires special isel handling due to fixed register constraints on the MUL16r/MUL32r/MUL64r instructions. I don't believe we ever use the high half result of this SDNode. Just the low data and the overflow flag. But there is no unsigned mul instruction that doesn't produce high data.
-X86ISD::SMUL is created by i16/i32/i64 multiply with overflow intrinsics. But we pattern match those to IMUL16rr/IMUL32rr/IMUL64rr. These instruction don't produce high data.
-A i32 umul_lohi/smul_lohi will be created for an i64 multiply on 32-bit targets. A i64 umul_lohi/smul_lohi will be created for i128 multiply on 64-bit targets.
-ISD::UMUL_LOHI/SMUL_LOHI can be produced directly by sdiv/udiv by constants for i8/i16/i32/i64. We're only using the high half so it briefly becomes ISDI::MULHU/MULHS pre-legalization, but is converted back to umul_lohi/smul_lohi during legalization.
-DAGCombiner turns ISD::UMUL_LOHI/SMUL_LOHI into regular multiplies and shifts when possible which should remove the ones created for divides.

I think I should be explicitly setting UMUL_LOHI/SMUL_LOHI to promote for i8/i16 rather than relying on dag combine to do it.

Breaking this up into smaller pieces. Starting with just the X86ISD::UMUL piece since that one is easy.

In D38276#881247, @craig.topper wrote:

Rough outline of mul handling as I understand it.

I always find these kind of overview helpful. Would be great if we could put these in a document or wiki,

LGTM

This revision is now accepted and ready to land.Sep 28 2017, 5:49 AM

Closed by commit rL314430: [X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handling (authored by ctopper). · Explain WhySep 28 2017, 9:59 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path

Size

llvm/

trunk/

lib/

Target/

X86/

X86ISelDAGToDAG.cpp

2 lines

Diff 117013

llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp

Show First 20 Lines • Show All 2,588 Lines • ▼ Show 20 Lines	void X86DAGToDAGISel::Select(SDNode *Node) {

case X86ISD::UMUL: {		case X86ISD::UMUL: {
SDValue N0 = Node->getOperand(0);		SDValue N0 = Node->getOperand(0);
SDValue N1 = Node->getOperand(1);		SDValue N1 = Node->getOperand(1);

unsigned LoReg;		unsigned LoReg;
switch (NVT.SimpleTy) {		switch (NVT.SimpleTy) {
default: llvm_unreachable("Unsupported VT!");		default: llvm_unreachable("Unsupported VT!");
case MVT::i8: LoReg = X86::AL; Opc = X86::MUL8r; break;		// MVT::i8 is handled by X86ISD::UMUL8.
case MVT::i16: LoReg = X86::AX; Opc = X86::MUL16r; break;		case MVT::i16: LoReg = X86::AX; Opc = X86::MUL16r; break;
case MVT::i32: LoReg = X86::EAX; Opc = X86::MUL32r; break;		case MVT::i32: LoReg = X86::EAX; Opc = X86::MUL32r; break;
case MVT::i64: LoReg = X86::RAX; Opc = X86::MUL64r; break;		case MVT::i64: LoReg = X86::RAX; Opc = X86::MUL64r; break;
}		}

SDValue InFlag = CurDAG->getCopyToReg(CurDAG->getEntryNode(), dl, LoReg,		SDValue InFlag = CurDAG->getCopyToReg(CurDAG->getEntryNode(), dl, LoReg,
N0, SDValue()).getValue(1);		N0, SDValue()).getValue(1);

▲ Show 20 Lines • Show All 508 Lines • Show Last 20 Lines