This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Missing SSE/AVX1 memory folding integer instructions
ClosedPublic

Authored by RKSimon on Jan 21 2015, 7:28 AM.

Details

Summary

Added most of the missing integer vector folding patterns for SSE (up to SSE4.2) and AVX1.

The most useful of these are probably the i32/i64 extractions, i8/i16/i32/i64 insertions, zero/sign extensions, unsigned saturating subtractions, i64 subtractions and the variable mask blends (pblendvb). Others include CLMUL, SSE4.2 string comparisons and bit tests.
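
For readers unfamiliar with the "unsigned saturation subtractions" mentioned above, the following is a minimal scalar sketch of what psubusw computes per 16-bit lane. It is not taken from the patch; the names `subus16` and `psubusw_model` are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of one psubusw lane: an unsigned 16-bit subtract that clamps
// at 0 instead of wrapping around. (Hypothetical helper name.)
constexpr std::uint16_t subus16(std::uint16_t a, std::uint16_t b) {
    return a > b ? static_cast<std::uint16_t>(a - b) : std::uint16_t{0};
}

// Scalar model of the 128-bit operation: eight independent 16-bit lanes.
// (Hypothetical helper name.)
std::array<std::uint16_t, 8> psubusw_model(
    const std::array<std::uint16_t, 8>& a,
    const std::array<std::uint16_t, 8>& b) {
    std::array<std::uint16_t, 8> r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] = subus16(a[i], b[i]);
    }
    return r;
}

static_assert(subus16(10, 3) == 7, "ordinary subtraction when a > b");
static_assert(subus16(3, 10) == 0, "saturates to zero instead of wrapping");
```

Memory folding means that the second operand can be read directly from memory by the instruction itself rather than through a separate load into a register first; the arithmetic result is the same either way.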

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 18514. Jan 21 2015, 7:28 AM
RKSimon retitled this revision to [X86][SSE] Missing SSE/AVX1 memory folding integer instructions.
RKSimon added reviewers: qcolombet, andreadb, spatel.
RKSimon set the repository for this revision to rL LLVM.
qcolombet edited edge metadata. Jan 21 2015, 10:58 AM

Hi Simon,

Do you know why some of the psubusw instructions are not folded with this patch?

Thanks,
-Quentin

test/CodeGen/X86/psubus.ll
29 ↗(On Diff #18514)

Why is this not folded anymore?

Sorry, some of those old tests aren't at all clear about what is going on.

test/CodeGen/X86/psubus.ll
29 ↗(On Diff #18514)

Oddly, the addition of the folding patterns has allowed the load of the constant (to %xmm0) to be pulled out of the loop.
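
The shape of loop being discussed can be sketched in C++ as follows. This is an illustration only, not the contents of psubus.ll: the function name `subus_const` is hypothetical, and whether a given compiler keeps the constant in a register or folds its load into each psubusw depends on the compiler version and flags. A scalar fallback is included so the sketch also builds on targets without SSE2.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: subtract a loop-invariant constant vector k from data,
// eight 16-bit lanes at a time, with unsigned saturation.
void subus_const(std::uint16_t* data, std::size_t n, const std::uint16_t (&k)[8]) {
    std::size_t i = 0;
#if defined(__SSE2__)
    // Loop-invariant operand. The compiler may load it once before the loop
    // (hoisted into a register such as %xmm0) or fold the load into every
    // psubusw as a memory operand.
    const __m128i kv = _mm_loadu_si128(reinterpret_cast<const __m128i*>(k));
    for (; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
        v = _mm_subs_epu16(v, kv);  // maps to psubusw
        _mm_storeu_si128(reinterpret_cast<__m128i*>(data + i), v);
    }
#endif
    // Scalar tail (and the whole loop when SSE2 is unavailable).
    for (; i < n; ++i) {
        const std::uint16_t c = k[i % 8];
        data[i] = data[i] > c ? static_cast<std::uint16_t>(data[i] - c)
                              : std::uint16_t{0};
    }
}
```

When the constant is hoisted, the psubusw inside the loop takes a register operand, so there is no load left in the loop body for a folding pattern to absorb.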

qcolombet added inline comments. Jan 21 2015, 1:46 PM
test/CodeGen/X86/psubus.ll
29 ↗(On Diff #18514)

Thanks for checking.
Could we get rid of the loop to have a test on the folding?

Sure I can simplify psubus.ll tests - do you want it as part of this patch? We do already have the proper stack folding tests for (v)psubusb as well of course.

qcolombet accepted this revision. Jan 21 2015, 2:43 PM
qcolombet edited edge metadata.

Sure I can simplify psubus.ll tests - do you want it as part of this patch?

No, a follow-up patch is fine.

We do already have the proper stack folding tests for (v)psubusb as well of course.

LGTM then :).

Thanks,
-Quentin

This revision is now accepted and ready to land. Jan 21 2015, 2:43 PM
This revision was automatically updated to reflect the committed changes.