This is an archive of the discontinued LLVM Phabricator instance.

Enable FeatureFastUAMem for btver2
ClosedPublic

Authored by spatel on Nov 21 2014, 12:12 PM.

Details

Summary

Allow unaligned 16-byte memop codegen for btver2. No functional changes for any other subtargets.

The larger part of this change replaces the existing test, which was supposedly a small-memcpy test, with one that actually exercises a small memcpy. The previous test did not use FileCheck either.
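For illustration only, here is a minimal C sketch of the kind of small, unaligned 16-byte copy that this change affects. It is not the test from the patch; the names copy16 and check_copy16 are hypothetical. The assumption is that building it with something like clang -O2 -march=btver2 -S lets the fixed-size memcpy be lowered to unaligned 16-byte vector memory ops. The exact instructions emitted depend on the compiler version.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy exactly 16 bytes with no alignment guarantee.
 * A fixed, small size like this is what a backend may lower to a single
 * unaligned 16-byte load/store pair when the subtarget reports that
 * unaligned accesses are fast (assumption; inspect the assembly to confirm). */
void copy16(void *dst, const void *src) {
    memcpy(dst, src, 16);
}

/* Hypothetical self-check: copy between deliberately misaligned offsets
 * and verify that exactly 16 bytes were written, no more and no less. */
int check_copy16(void) {
    unsigned char src[32];
    unsigned char dst[32];
    int i;

    for (i = 0; i < 32; i++) {
        src[i] = (unsigned char)i;
        dst[i] = 0xFF;              /* sentinel to detect overruns */
    }

    copy16(dst + 3, src + 1);       /* offsets chosen to break 16-byte alignment */

    assert(memcmp(dst + 3, src + 1, 16) == 0);
    assert(dst[2] == 0xFF);         /* byte before the copied region is untouched */
    assert(dst[19] == 0xFF);        /* byte after the copied region is untouched */
    return 1;
}
```

The check only confirms that the copy is functionally correct. Whether the fast unaligned path was actually selected has to be verified by reading the generated assembly.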

This patch should allow us to close PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ).

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 16502. Nov 21 2014, 12:12 PM
spatel retitled this revision to Enable FeatureFastUAMem for btver2.
spatel updated this object.
spatel edited the test plan for this revision.
spatel added reviewers: qcolombet, andreadb, hfinkel.
spatel added a subscriber: Unknown Object (MLST).
andreadb accepted this revision. Nov 24 2014, 9:24 AM
andreadb edited edge metadata.

Hi Sanjay,

The patch LGTM. Thanks!
In the future, we should probably check whether unaligned memory accesses are also fast on AMD CPUs other than Jaguar. For example, I expect that unaligned memory accesses are also fast on amdfam15 CPUs.

-Andrea

This revision is now accepted and ready to land. Nov 24 2014, 9:24 AM
spatel closed this revision. Nov 28 2014, 10:40 AM
spatel updated this revision to Diff 16741.

Closed by commit rL222925 (authored by @spatel).

Thanks, Andrea! I added a TODO comment for the other AMD chips. Checked in at r222925.