This is an archive of the discontinued LLVM Phabricator instance.

[X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handling
ClosedPublic

Authored by craig.topper on Sep 26 2017, 12:39 AM.

Details

Summary

Lowering never creates X86ISD::UMUL for 8-bit types. X86ISD::UMUL8 is used instead. If X86ISD::UMUL 8-bit were ever used it would crash.

DAGCombiner replaces UMUL_LOHI/SMUL_LOHI with a wider MUL and a shift if the type twice as wide is legal. So we should never see i8 UMUL_LOHI/SMUL_LOHI. In fact I think there was a bug in part of the i8 code. Similar is true for i16 though without the bug.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Sep 26 2017, 12:39 AM
craig.topper added inline comments.
lib/Target/X86/X86ISelDAGToDAG.cpp
2733 ↗(On Diff #116647)

Shouldn't this ReplaceUses call have used SDValue(Node, 0)?

RKSimon edited edge metadata.Sep 26 2017, 8:18 AM

What is the test coverage like? Searching for "mul i8" in x86 doesn't find much.....

lib/Target/X86/X86ISelDAGToDAG.cpp
2733 ↗(On Diff #116647)

Yes, I think so. It looks like there is a lot of a MUL_LOHI ambiguity.

Rough outline of mul handling as I understand it.

-A plain mul i8 in IR will go through pretty much untouched and be pattern matched to MUL8r/MUL8m during isel.
-X86ISD::UMUL8/SMUL8 are created from i8 mul with overflow intrinsics. Requires special isel handling due to fixed register constraints on the MUL8r instruction which is made worse by the isel pattern for plain mul that caused the implicit defs list to be in a weird order.
-X86ISD::UMUL is created by i16/i32/i64 multiply with overflow intrinsics. Requires special isel handling due to fixed register constraints on the MUL16r/MUL32r/MUL64r instructions. I don't believe we ever use the high half result of this SDNode. Just the low data and the overflow flag. But there is no unsigned mul instruction that doesn't produce high data.
-X86ISD::SMUL is created by i16/i32/i64 multiply with overflow intrinsics. But we pattern match those to IMUL16rr/IMUL32rr/IMUL64rr. These instruction don't produce high data.
-A i32 umul_lohi/smul_lohi will be created for an i64 multiply on 32-bit targets. A i64 umul_lohi/smul_lohi will be created for i128 multiply on 64-bit targets.
-ISD::UMUL_LOHI/SMUL_LOHI can be produced directly by sdiv/udiv by constants for i8/i16/i32/i64. We're only using the high half so it briefly becomes ISDI::MULHU/MULHS pre-legalization, but is converted back to umul_lohi/smul_lohi during legalization.
-DAGCombiner turns ISD::UMUL_LOHI/SMUL_LOHI into regular multiplies and shifts when possible which should remove the ones created for divides.

craig.topper planned changes to this revision.Sep 27 2017, 12:27 AM

I think I should be explicitly setting UMUL_LOHI/SMUL_LOHI to promote for i8/i16 rather than relying on dag combine to do it.

Breaking this up into smaller pieces. Starting with just the X86ISD::UMUL piece since that one is easy.

zvi edited edge metadata.Sep 28 2017, 5:49 AM

Rough outline of mul handling as I understand it.

I always find these kind of overview helpful. Would be great if we could put these in a document or wiki,

zvi accepted this revision.Sep 28 2017, 5:49 AM

LGTM

This revision is now accepted and ready to land.Sep 28 2017, 5:49 AM
This revision was automatically updated to reflect the committed changes.