This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Improve lowering of vXi64 multiplies
ClosedPublic

Authored by RKSimon on Dec 14 2016, 8:11 AM.

Details

Summary

As mentioned on PR30845, we were performing our vXi64 multiplication as:

AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi, 32) + psllqi(AhiBlo, 32);

when we could avoid one of the upper shifts with:

AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi + AhiBlo, 32);

This matches the lowering on gcc/icc.
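
As a sanity check on the rewrite above, the following is a minimal scalar sketch in C that models a single 64-bit lane. It is not the code from the patch: the helper names (pmuludq_lane, mul64_old, mul64_new, lowerings_agree) are made up for illustration, and pmuludq is modelled as "multiply the low 32 bits of each operand into a 64-bit result". The two forms agree because a left shift distributes over addition modulo 2^64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 64-bit lane of pmuludq:
   multiply the low 32 bits of each operand, giving a 64-bit product. */
static uint64_t pmuludq_lane(uint64_t a, uint64_t b) {
    return (uint64_t)(uint32_t)a * (uint64_t)(uint32_t)b;
}

/* Previous lowering: each cross product is shifted up separately. */
static uint64_t mul64_old(uint64_t a, uint64_t b) {
    uint64_t AloBlo = pmuludq_lane(a, b);
    uint64_t AloBhi = pmuludq_lane(a, b >> 32);
    uint64_t AhiBlo = pmuludq_lane(a >> 32, b);
    return AloBlo + (AloBhi << 32) + (AhiBlo << 32);
}

/* Proposed lowering: add the cross products first, then shift once. */
static uint64_t mul64_new(uint64_t a, uint64_t b) {
    uint64_t AloBlo = pmuludq_lane(a, b);
    uint64_t AloBhi = pmuludq_lane(a, b >> 32);
    uint64_t AhiBlo = pmuludq_lane(a >> 32, b);
    return AloBlo + ((AloBhi + AhiBlo) << 32);
}

/* Returns 1 when both lowerings match each other and the plain
   wrapping 64-bit product for the given inputs, 0 otherwise. */
static int lowerings_agree(uint64_t a, uint64_t b) {
    uint64_t expected = a * b; /* unsigned multiply wraps modulo 2^64 */
    return mul64_old(a, b) == expected && mul64_new(a, b) == expected;
}
```

Calling lowerings_agree on a handful of values, including ones whose cross products overflow 64 bits when summed, is a quick way to confirm that dropping the second shift does not change the result. The Ahi * Bhi term never appears in either form because it only contributes to bits 64 and above.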

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 81382.Dec 14 2016, 8:11 AM
RKSimon retitled this revision to [X86][SSE] Improve lowering of vXi64 multiplies.
RKSimon updated this object.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
sroland edited edge metadata.Dec 20 2016, 12:46 PM

This looks correct to me, always nice to shave off some instructions...

Thanks Roland. OK to commit guys?

craig.topper accepted this revision.Dec 21 2016, 9:30 AM
craig.topper added a reviewer: craig.topper.
craig.topper added a subscriber: craig.topper.

LGTM with one comment.

lib/Target/X86/X86TargetTransformInfo.cpp
556 ↗(On Diff #81382)

This comment still says the cost is 18.

This revision is now accepted and ready to land.Dec 21 2016, 9:30 AM
RKSimon added inline comments.Dec 21 2016, 11:38 AM
lib/Target/X86/X86TargetTransformInfo.cpp
556 ↗(On Diff #81382)

Nice catch! Thanks.

This revision was automatically updated to reflect the committed changes.