This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Update the cost table for integer-integer conversions on SSE2.
ClosedPublic

Authored by congh on Dec 1 2015, 4:02 PM.

Details

Summary

Previously, the conversion cost table had no entries for integer-integer conversions on SSE2, which resulted in imprecise costs for certain vectorized operations. This patch adds those entries for SSE2. The cost numbers were obtained by counting the instructions that llc generates for the new test case in this patch.
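For readers unfamiliar with the idea, the following is a minimal sketch of what a conversion cost table and its lookup could look like. It is not the code from the patch: the `Op` and `VT` enums, the `ConvCostEntry` struct, the `lookupConvCost` helper, and every cost number are hypothetical placeholders chosen for illustration only.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical stand-ins for the opcode and vector value-type enums.
enum class Op { SignExtend, ZeroExtend, Truncate };
enum class VT { v4i16, v4i32 };

// One row: converting Src to Dst with Opcode is assumed to cost Cost.
struct ConvCostEntry {
  Op Opcode;
  VT Dst;
  VT Src;
  unsigned Cost;
};

// Placeholder costs, NOT the numbers measured in the patch.
constexpr ConvCostEntry SSE2ConvTbl[] = {
    {Op::ZeroExtend, VT::v4i32, VT::v4i16, 2},
    {Op::SignExtend, VT::v4i32, VT::v4i16, 3},
    {Op::Truncate, VT::v4i16, VT::v4i32, 2},
};

// Linear search for an exact (opcode, destination, source) match.
// Returns std::nullopt when the table has no entry, so the caller
// can fall back to a generic estimate.
template <std::size_t N>
std::optional<unsigned> lookupConvCost(const ConvCostEntry (&Tbl)[N],
                                       Op Opcode, VT Dst, VT Src) {
  for (const ConvCostEntry &E : Tbl)
    if (E.Opcode == Opcode && E.Dst == Dst && E.Src == Src)
      return E.Cost;
  return std::nullopt;
}
```

A missing entry is the situation the summary describes: the lookup finds nothing and the caller has to rely on a less precise generic cost.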

Diff Detail

Repository
rL LLVM

Event Timeline

congh updated this revision to Diff 41572. Dec 1 2015, 4:02 PM
congh retitled this revision to [X86][SSE] Update the cost table for integer-integer conversions on SSE2.
congh updated this object.
congh added reviewers: hfinkel, RKSimon, davidxl.
congh added a subscriber: llvm-commits.
RKSimon added inline comments. Dec 8 2015, 7:04 AM
lib/Target/X86/X86TargetTransformInfo.cpp
703 ↗(On Diff #41572)

These values don't appear to be correct for SSE41, which has PMOVSX/PMOVZX ops. Maybe split off the 128-bit extension ops from AVXConversionTbl into SSE41ConversionTbl?

congh added inline comments. Dec 8 2015, 10:30 AM
lib/Target/X86/X86TargetTransformInfo.cpp
703 ↗(On Diff #41572)

SSSE3 also provides pshufb, from which several operations here can benefit. So are you suggesting adding more tables for SSSE3/SSE4.1?

RKSimon added inline comments. Dec 9 2015, 6:48 AM
lib/Target/X86/X86TargetTransformInfo.cpp
703 ↗(On Diff #41572)

You may not need SSSE3 (often PSHUFB is as costly as fixed shuffles on older hardware - it just reduces register use) but splitting the extensions from AVX into SSE41 needs to be done.

724 ↗(On Diff #41572)

I haven't checked this very thoroughly but you might need to improve non-simple type handling here, especially for extensions?

congh added inline comments. Dec 9 2015, 5:19 PM
lib/Target/X86/X86TargetTransformInfo.cpp
703 ↗(On Diff #41572)

I agree. SSE41 really provides several instructions that can greatly reduce those costs. I have added a table for SSE41 and also updated the test case. PTAL.

724 ↗(On Diff #41572)

This patch is actually adding cost entries for non-simple types. In addition to this query, I added another query on the table below for non-simple types. The entries for float/int conversions also need to be updated for non-simple types, and I will do that later. To prevent a combinatorial explosion of vector sizes and types, I think we probably need to redesign the cost table.

congh updated this revision to Diff 42362. Dec 9 2015, 5:19 PM

Update the patch by adding a cost table for SSE4.1. The test case is also updated accordingly.
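The split discussed above, with a separate SSE4.1 table consulted before the SSE2 one, can be sketched as follows. This is an illustration under stated assumptions, not the committed code: `SSELevel`, `findCost`, `getConvCost`, the table names, and all cost values are hypothetical. The only fact taken from the review is that SSE4.1 offers PMOVSX/PMOVZX, so its extension entries are assumed to be cheaper than the SSE2 ones.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-ins for opcode, value-type, and feature-level enums.
enum class Op { SignExtend, ZeroExtend, Truncate };
enum class VT { v4i16, v4i32 };
enum class SSELevel { SSE2 = 2, SSSE3 = 3, SSE41 = 4 };

struct ConvCostEntry {
  Op Opcode;
  VT Dst;
  VT Src;
  unsigned Cost;
};

// Placeholder costs only. Extensions are assumed cheap on SSE4.1
// because PMOVSX/PMOVZX can do them directly.
const std::vector<ConvCostEntry> SSE41Tbl = {
    {Op::SignExtend, VT::v4i32, VT::v4i16, 1},
    {Op::ZeroExtend, VT::v4i32, VT::v4i16, 1},
};

// Baseline table; also holds entries that SSE4.1 does not override.
const std::vector<ConvCostEntry> SSE2Tbl = {
    {Op::SignExtend, VT::v4i32, VT::v4i16, 3},
    {Op::ZeroExtend, VT::v4i32, VT::v4i16, 2},
    {Op::Truncate, VT::v4i16, VT::v4i32, 2},
};

// Exact-match search in a single table.
std::optional<unsigned> findCost(const std::vector<ConvCostEntry> &Tbl,
                                 Op Opcode, VT Dst, VT Src) {
  for (const ConvCostEntry &E : Tbl)
    if (E.Opcode == Opcode && E.Dst == Dst && E.Src == Src)
      return E.Cost;
  return std::nullopt;
}

// Consult the most specific table the target supports first, then
// fall through to older feature levels, then to a generic fallback.
unsigned getConvCost(SSELevel Level, Op Opcode, VT Dst, VT Src,
                     unsigned Fallback) {
  if (Level >= SSELevel::SSE41) {
    if (auto C = findCost(SSE41Tbl, Opcode, Dst, Src))
      return *C;
  }
  if (Level >= SSELevel::SSE2) {
    if (auto C = findCost(SSE2Tbl, Opcode, Dst, Src))
      return *C;
  }
  return Fallback;
}
```

With this ordering, an SSE4.1 target picks up the cheaper extension costs, while conversions that only appear in the SSE2 table still resolve to the SSE2 entry instead of the generic fallback.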

RKSimon accepted this revision. Dec 10 2015, 6:50 AM
RKSimon edited edge metadata.

LGTM - if possible please add a FIXME comment describing the future work necessary to improve the handling of simple + non-simple types.

This revision is now accepted and ready to land. Dec 10 2015, 6:50 AM
This revision was automatically updated to reflect the committed changes.