This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] SimplifyDemandedBitsForTargetNode - Add PACKSS support
ClosedPublic

Authored by RKSimon on Apr 5 2019, 12:13 PM.

Details

Summary

In the case where we only want the sign bit (e.g. when using PACKSS truncation of comparison results for MOVMSK) then we can just demand the sign bit of the source operands.

This makes use of the fact that PACKSS saturates out of range values to the +/- max int values - so the sign bit is always preserved.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision.Apr 5 2019, 12:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2019, 12:13 PM
nikic added a comment.EditedApr 5 2019, 2:08 PM

This looks right to me, but I think we should be able to do better. The sign bit of the result is determined only by the sign bit of the input. Any other bit is determined by the same bit in the input, as well as the high BitWidth+1 bits of the input. I'm thinking something along these lines:

APInt DemandedBits;
if (OriginalDemandedBits.isSignMask()) {
    DemandedBits = APInt::getSignMask(BitWidth * 2);
} else {
    DemandedBits = OriginalDemandedBits.zext(BitWidth * 2);
    DemandedBits.setHighBits(BitWidth + 1);
}

This looks right to me, but I think we should be able to do better. The sign bit of the result is determined only by the sign bit of the input. Any other bit is determined by the same bit in the input, as well as the high BitWidth+1 bits of the input. I'm thinking something along these lines:

APInt DemandedBits;
if (OriginalDemandedBits.isSignMask()) {
    DemandedBits = APInt::getSignMask(BitWidth * 2);
} else {
    DemandedBits = OriginalDemandedBits.zext(BitWidth * 2);
    DemandedBits.setHighBits(BitWidth + 1);
}

I agree, I hope to do that in the future (and something similar for PACKUS) once I have suitable test coverage (its related to better support for TRUNCATE vector SimplifiedDemandedBits support as well) - but for now I'd prefer just to get this MOVMSK fix in.

Looks good to me.

nikic added a comment.Apr 6 2019, 8:27 AM

I agree, I hope to do that in the future (and something similar for PACKUS) once I have suitable test coverage (its related to better support for TRUNCATE vector SimplifiedDemandedBits support as well) - but for now I'd prefer just to get this MOVMSK fix in.

Sounds good. The sign bit case is rather different from the rest.

lib/Target/X86/X86ISelLowering.cpp
33513

What does -ve/+ve mean?

33526

Could use the known bits to determine the sign bit:

if (KnownLHS.isNegative() && KnownRHS.isNegative())
  Known.makeNegative();
else if (KnownLHS.isNonNegative() && KnownRHS.isNonNegative())
  Known.makeNonNegative();
RKSimon marked an inline comment as done.Apr 6 2019, 12:26 PM
RKSimon added inline comments.
lib/Target/X86/X86ISelLowering.cpp
33513

shorthand for negative and positive - I'll change the comment to INT_MIN/INT_MAX

RKSimon marked an inline comment as done.Apr 6 2019, 1:51 PM
RKSimon added inline comments.
lib/Target/X86/X86ISelLowering.cpp
33526

This needs test coverage which is proving tricky to create useful tests for - I'll add a TODO.

RKSimon updated this revision to Diff 194033.Apr 6 2019, 1:53 PM

rebased and added TODO comments

nikic accepted this revision.Apr 6 2019, 11:16 PM

LGTM

This revision is now accepted and ready to land.Apr 6 2019, 11:16 PM
This revision was automatically updated to reflect the committed changes.