This is an archive of the discontinued LLVM Phabricator instance.

[x86] Add a feature flag for slow 32-byte unaligned memory accesses.
ClosedPublic

Authored by spatel on Nov 21 2014, 7:53 AM.

Details

Summary

This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was used as a proxy for this slowdown, but that hindered codegen for AVX-enabled AMD chips such as btver2, which do not have the 32-byte unaligned access slowdown.

More detailed performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ).
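
The difference between the old proxy and the new flag can be illustrated with a minimal, self-contained C++ sketch. This is not the code from the patch: `ToySubtarget`, its fields, and the helper functions are hypothetical names invented for illustration, and the per-chip settings follow only what the summary states (the flag is set for Sandy Bridge and Ivy Bridge and not for btver2). The AVX2-capable entry is an assumption added for contrast.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a subtarget description.
// This is NOT the real X86Subtarget class; all names here are illustrative.
struct ToySubtarget {
  std::string label;      // human-readable chip label
  bool hasAVX;            // chip supports 256-bit AVX operations
  bool hasAVX2;           // chip supports AVX2
  bool slowUAMem32;       // models the new flag (FeatureSlowUAMem32 in the review)
};

// Old behavior as described in the summary: "no AVX2" was used as a proxy
// for "32-byte unaligned accesses are slow".
inline bool avoidUnaligned32Old(const ToySubtarget &st) {
  return st.hasAVX && !st.hasAVX2;
}

// New behavior: the decision is driven by an explicit per-chip flag.
inline bool avoidUnaligned32New(const ToySubtarget &st) {
  return st.hasAVX && st.slowUAMem32;
}

// Chip settings taken from the summary (flag on for Sandy Bridge / Ivy Bridge,
// off for btver2).
inline ToySubtarget sandyBridge() { return {"Sandy Bridge", true, false, true}; }
inline ToySubtarget ivyBridge()   { return {"Ivy Bridge",   true, false, true}; }
inline ToySubtarget btver2()      { return {"btver2",       true, false, false}; }

// Assumption for contrast: a chip with AVX2 and without the slow-access flag.
inline ToySubtarget avx2Chip()    { return {"AVX2 chip",    true, true,  false}; }
```

In this model, both predicates agree for Sandy Bridge and Ivy Bridge (no functionality change), while btver2 is penalized only by the old proxy.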

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 16492. Nov 21 2014, 7:53 AM
spatel retitled this revision to [x86] Add a feature flag for slow 32-byte unaligned memory accesses.
spatel updated this object.
spatel edited the test plan for this revision.
spatel added reviewers: qcolombet, hfinkel, andreadb, nadav.
spatel added a subscriber: Unknown Object (MLST).
andreadb accepted this revision. Nov 21 2014, 8:39 AM
andreadb edited edge metadata.

Hi Sanjay,

The patch looks good to me (I left a minor comment on the test).

Thanks!
Andrea

test/CodeGen/X86/unaligned-32-byte-memops.ll
1–3 (On Diff #16492)

Can you also add a RUN line for testing Ivy Bridge (core-avx-i)?
The feature flag 'FeatureSlowUAMem32' is also added to Ivy Bridge, so I think you should test it (I guess you could reuse the same SANDYB checks).

This revision is now accepted and ready to land. Nov 21 2014, 8:39 AM
spatel closed this revision. Nov 21 2014, 9:40 AM
spatel updated this revision to Diff 16497.

Closed by commit rL222544 (authored by @spatel).

Thanks, Andrea! Yes, I agree that we should explicitly check Ivy Bridge too. Added that run line and committed with r222544.

llvm/trunk/lib/Target/X86/X86Subtarget.cpp