This is an archive of the discontinued LLVM Phabricator instance.

[X86] Loosen memory folding requirements for cvtdq2pd and cvtps2pd instructions
ClosedPublic

Authored by aturetsk on Aug 26 2016, 5:22 AM.

Details

Summary

According to the spec, the cvtdq2pd and cvtps2pd instructions don't require their memory operand to be aligned to 16 bytes. This patch removes that requirement from the memory folding table.
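
For readers unfamiliar with memory folding, here is a minimal sketch of the decision the table entry influences. It is a toy model, not LLVM's actual code: `FoldEntry`, `min_align` and `can_fold_load` are hypothetical names invented for illustration, and the "before" and "after" rows only assume what the summary describes. The underlying fact is that the SSE forms of both instructions take a 64-bit memory operand (xmm/m64 in the Intel manual), so a 16-byte alignment check is stricter than necessary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoldEntry:
    """Hypothetical folding-table row; not LLVM's actual data structure."""
    reg_form: str    # register-register form of the instruction
    mem_form: str    # register-memory form it may be folded into
    min_align: int   # required alignment of the load in bytes, 0 = none


def can_fold_load(entry: FoldEntry, load_align: int) -> bool:
    """Return True if a load with the given alignment may be folded."""
    return load_align >= entry.min_align


# Assumed state before the patch: folding demands a 16-byte-aligned load.
before = FoldEntry("cvtdq2pd xmm, xmm", "cvtdq2pd xmm, m64", min_align=16)
# Assumed state after the patch: no alignment requirement.
after = FoldEntry("cvtdq2pd xmm, xmm", "cvtdq2pd xmm, m64", min_align=0)

print(can_fold_load(before, 8))  # 8-byte-aligned load, old entry
print(can_fold_load(after, 8))   # same load, relaxed entry
```

In this model an 8-byte-aligned load is rejected by the old entry and accepted by the relaxed one, which is the kind of case the test in this revision is meant to cover.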

Diff Detail

Repository
rL LLVM

Event Timeline

aturetsk updated this revision to Diff 69346. Aug 26 2016, 5:22 AM
aturetsk retitled this revision from to [X86] Loosen memory folding requirements for cvtdq2pd and cvtps2pd instructions.
aturetsk updated this object.
aturetsk added reviewers: nadav, echristo, bruno.
aturetsk added subscribers: llvm-commits, anadolskiy.
RKSimon added inline comments.
test/CodeGen/X86/peephole-cvt-sse.ll
1 ↗(On Diff #69346)

Please can you test on a 32-bit target as well and use utils/update_llc_test_checks.py if you can.

aturetsk updated this revision to Diff 69561. Aug 29 2016, 5:43 AM

Improve the test.

Hi Simon,

Thanks for the review.

test/CodeGen/X86/peephole-cvt-sse.ll
2 ↗(On Diff #69561)

Both are done.

RKSimon added inline comments.
test/CodeGen/X86/peephole-cvt-sse.ll
2 ↗(On Diff #69561)

Thanks. It's also better to use -mattr=+sse4.2 (or similar) instead of a CPU target unless you are specifically testing for that CPU.

aturetsk updated this revision to Diff 69688. Aug 30 2016, 8:17 AM

Replace -mcpu=slm with -mattr=+sse4.2.

RKSimon accepted this revision. Aug 31 2016, 6:15 AM
RKSimon edited edge metadata.

LGTM

This revision is now accepted and ready to land. Aug 31 2016, 6:15 AM
This revision was automatically updated to reflect the committed changes.