Page MenuHomePhabricator

AMD family 17h (znver1) enablement
ClosedPublic

Authored by GGanesh on Dec 21 2016, 3:03 AM.

Details

Summary

This patch enables the following

  1. AMD family 17h architecture using "znver1" tune flag (-march, -mcpu).
  2. ISAs that are enabled for "znver1" architecture.
  3. Checks ADX isa from cpuid to identify "znver1" flag when -march=native is used.
  4. Enables CLZERO feature and adds the builtin macro __builtin_ia32_clzero for clzero instruction.
  5. ISAs FMA4, XOP are disabled as they are dropped from amdfam17.
  6. For the time being, it uses the btver2 scheduler model.
  7. Test file is updated to check this flag.

This item is linked to clang review item https://reviews.llvm.org/D28018

Diff Detail

Repository
rL LLVM

Event Timeline

GGanesh updated this revision to Diff 82213.Dec 21 2016, 3:03 AM
GGanesh retitled this revision from to AMD family 17h (znver1) enablement.
GGanesh updated this object.
GGanesh added a reviewer: craig.topper.
GGanesh set the repository for this revision to rL LLVM.
GGanesh added a subscriber: llvm-commits.
GGanesh updated this object.Dec 21 2016, 3:08 AM
RKSimon added a subscriber: RKSimon.

This needs tests for the clzero intrinsic (behind a -mattr=+clzero arg for 32 and 64-bit targets as well as -mcpu=znver1), in fact it probably make sense to separate the clzero support out into a separate preliminary patch and then add -mcpu=znver1 test as part of this one?

lib/Target/X86/X86.td
804 ↗(On Diff #82213)

Only one use so far - probably best to just declare it as :

def : ProcessorModel<"znver1", BtVer2Model, [

Also, add a TODO comment for a znver1 scheduler model

craig.topper added inline comments.Dec 21 2016, 8:57 AM
lib/Target/X86/X86InstrInfo.td
2456 ↗(On Diff #82213)

The custom inserter needs to be implemented in X86ISelLowering.cpp. As Simon said, clzero should be split out of this patch.

RKSimon added inline comments.Dec 21 2016, 12:25 PM
lib/Target/X86/X86.td
799 ↗(On Diff #82213)

GCC seems to think znver1 has MOVBE support - who is right?

I am preparing a patch which doesn't include the clzero feature patch.
I will submit a separate patch for clzero feature patch.

GGanesh updated this revision to Diff 83567.Jan 8 2017, 8:42 AM
GGanesh removed rL LLVM as the repository for this revision.

The clzero intrinsic handling and feature addition will be handled as a separate patch.
Added movbe and sse4a into ISA list of znver1.

RKSimon edited edge metadata.Jan 8 2017, 9:21 AM

A few minors but I don't have any major concerns.

Add znver1 tests to llvm\test\CodeGen\X86\slow-unaligned-mem.ll

lib/Target/X86/X86.td
764 ↗(On Diff #83567)

Are we happy with alphabetical ordering of the feature bits? We don't seem to be consistent for this for many targets at all.

767 ↗(On Diff #83567)

Remove FeatureAVX - it will be implicitly included as FeatureAVX2 is set.

777 ↗(On Diff #83567)

Add znver1 to llvm\test\CodeGen\X86\lzcnt-zext-cmp.ll

791 ↗(On Diff #83567)

Remove FeatureSSSE3 - it will be implicitly included as FeatureAVX2 is set.

792 ↗(On Diff #83567)

Add znver1 to llvm\test\CodeGen\X86\x86-64-double-shifts-var.ll as you're testing for slow SHLD

GGanesh updated this revision to Diff 83632.Jan 9 2017, 8:46 AM
GGanesh edited edge metadata.

Adding znver1 to following tests.
a. LZCNT
b. Slow SHLD
c. slow unaligned memory

Added a TODO noting down znver1 scheduler model is due.

LGTM - @craig.topper any additional comments?

craig.topper accepted this revision.Jan 9 2017, 7:24 PM
craig.topper edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jan 9 2017, 7:24 PM
This revision was automatically updated to reflect the committed changes.