This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Vector double -> float conversion memory folding (cvtpd2ps)
ClosedPublic

Authored by RKSimon on Dec 15 2014, 9:35 AM.

Details

Summary

Added a missing memory folding relationship for the (V)CVTPD2PS instruction (and its AVX variants); these can safely be folded for stack reloads.

Follow up to http://reviews.llvm.org/D5981

I'd like to add the (V)CVTPS2PD and (V)CVTDQ2PD instructions as well, but I'm hitting issues with irrelevant register/memory size differences in the ymm implementations: it reloads the whole ymm and then references the lower xmm as the src for the conversion. Any suggestions on how I should deal with this? The xmm versions seem to fold fine, but I'd prefer to add them all at the same time in my next patch.
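
For readers unfamiliar with the idea, the following is a minimal C++ sketch of what a memory folding relationship expresses: a mapping from the register-source form of an instruction to the form that reads the same operand directly from a stack slot. It is a toy model, not LLVM's actual table or API. The names `FoldEntry`, `foldTable`, and `foldReload` are hypothetical, and the strict size check is an assumption used only to illustrate the kind of register/memory size mismatch described above. The operand widths (m128 for an xmm source, m256 for a ymm source) follow the cvtpd2ps instruction encodings.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// One toy fold-table entry. All names here are hypothetical, not LLVM's.
struct FoldEntry {
  std::string memForm;  // same operation, with the source read from memory
  unsigned loadBytes;   // width in bytes of that memory source operand
};

// Register-source form -> memory-source form, for cvtpd2ps only.
// An xmm source becomes an m128 operand, a ymm source an m256 operand.
inline const std::unordered_map<std::string, FoldEntry> &foldTable() {
  static const std::unordered_map<std::string, FoldEntry> table = {
      {"cvtpd2ps xmm, xmm", {"cvtpd2ps xmm, m128", 16}},
      {"vcvtpd2ps xmm, xmm", {"vcvtpd2ps xmm, m128", 16}},
      {"vcvtpd2ps xmm, ymm", {"vcvtpd2ps xmm, m256", 32}},
  };
  return table;
}

// Returns the memory form if a reload of `reloadBytes` can be folded into
// `regForm`, or std::nullopt if there is no table entry or the sizes differ.
// Rejecting on a size mismatch is an assumption made for this sketch.
std::optional<std::string> foldReload(const std::string &regForm,
                                      unsigned reloadBytes) {
  const auto &table = foldTable();
  const auto it = table.find(regForm);
  if (it == table.end())
    return std::nullopt;  // no folding relationship registered
  if (it->second.loadBytes != reloadBytes)
    return std::nullopt;  // reloaded width does not match the memory operand
  return it->second.memForm;
}
```

In this toy model a 16-byte reload feeding `cvtpd2ps xmm, xmm` folds, whereas a 32-byte reload feeding an instruction whose source is only 16 bytes wide is rejected, which is loosely analogous to the ymm case above.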

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 17292. Dec 15 2014, 9:35 AM
RKSimon retitled this revision to [X86][SSE] Vector double -> float conversion memory folding (cvtpd2ps).
RKSimon updated this object.
RKSimon edited the test plan for this revision.
RKSimon added reviewers: qcolombet, spatel, andreadb.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: Unknown Object (MLST).
qcolombet accepted this revision. Dec 15 2014, 3:08 PM
qcolombet edited edge metadata.

Hi Simon,

LGTM.

Regarding this problem:

> it reloads the whole ymm and then references the lower xmm as the src for the conversion

Could you elaborate on the variants that are causing problems? IIRC, all inputs are xmm, not ymm.

Thanks,
-Quentin

This revision is now accepted and ready to land. Dec 15 2014, 3:08 PM
This revision was automatically updated to reflect the committed changes.