This is an archive of the discontinued LLVM Phabricator instance.

Fix bug 30945- [AVX512] Failure to flip vector comparison to remove not mask instruction
ClosedPublic

Authored by m_zuckerman on Nov 28 2016, 3:42 AM.

Details

Summary

Adding new optimization opportunity by adding new X86ISelLowering pattern. The test case was shown in https://llvm.org/bugs/show_bug.cgi?id=30945.

Test explanation:
Select gets three arguments mask, op and op2. In this case, the Mask is a result of ICMP. The ICMP instruction compares (with equal operand) the zero initializer vector and the result of the first ICMP.

In general, The result of "cmp eq, op1, zero initializers" is "not(op1)" where op1 is a mask. By rearranging of the two arguments inside the Select instruction, we can get the same result. Without the necessary of the middle phase ("cmp eq, op1, zero initializers").

Missed optimization opportunity:
vpcmpled %zmm0, %zmm1, %k0
knotw %k0, %k1

can be combine to
vpcmpgtd %zmm0, %zmm2, %k1

Diff Detail

Repository
rL LLVM

Event Timeline

m_zuckerman retitled this revision from to Fix bug 30945- [AVX512] Failure to flip vector comparison to remove not mask instruction.
m_zuckerman updated this object.
m_zuckerman updated this object.Nov 28 2016, 3:52 AM
m_zuckerman added reviewers: igorb, delena.
igorb edited edge metadata.Nov 28 2016, 4:05 AM

please update summery, add more details.

lib/Target/X86/X86ISelLowering.cpp
27622 ↗(On Diff #79387)

please add comments.
make this transformation only for AVX512 ( you may benefit only if instruction with zero mask exist)

m_zuckerman updated this object.Nov 28 2016, 4:58 AM
m_zuckerman edited edge metadata.
m_zuckerman updated this object.Nov 28 2016, 5:02 AM
m_zuckerman updated this object.Nov 28 2016, 1:21 PM
m_zuckerman updated this object.Nov 28 2016, 1:53 PM
m_zuckerman updated this object.
m_zuckerman updated this object.
m_zuckerman updated this object.Nov 28 2016, 1:57 PM
m_zuckerman added a reviewer: RKSimon.
delena added inline comments.Nov 28 2016, 11:26 PM
lib/Target/X86/X86ISelLowering.cpp
27575 ↗(On Diff #79441)

You check N->getOpcode() == ISD::VSELECT 2 lines above.

test/CodeGen/X86/avx512-vbroadcast.ll
223 ↗(On Diff #79441)

I think that the logic here is broken. %k1 should be a result of unordered cmp.

igorb added inline comments.Nov 29 2016, 1:49 AM
test/CodeGen/X86/avx512-vbroadcast.ll
223 ↗(On Diff #79441)

I belive the code is correct.
In this case Vselect condition has more than one use, could you please check why/when the second cmp was created instead KNOT.

m_zuckerman updated this object.
m_zuckerman added inline comments.Nov 30 2016, 9:57 AM
lib/Target/X86/X86ISelLowering.cpp
27575 ↗(On Diff #79441)

Yes I know, but if someone will delete this line I have the assert to check it also.

delena added inline comments.Dec 6 2016, 1:46 AM
lib/Target/X86/X86ISelLowering.cpp
27574 ↗(On Diff #80272)

I'd put Cond.hasOneUse() here, not in the common code. I assume that we'll need all patterns back in this case.

27575 ↗(On Diff #79441)

I don't think that we need duplication here.

m_zuckerman updated this revision to Diff 80408.Dec 6 2016, 5:38 AM
igorb accepted this revision.Dec 11 2016, 12:13 AM
igorb edited edge metadata.

Thanks,
one minor comment, please see in code.

LGTM

lib/Target/X86/X86ISelLowering.cpp
27572 ↗(On Diff #80408)

please add space after //
the same bellow.

This revision is now accepted and ready to land.Dec 11 2016, 12:13 AM
This revision was automatically updated to reflect the committed changes.