This is an archive of the discontinued LLVM Phabricator instance.

AMD family 17h (znver1) enablement
ClosedPublic

Authored by GGanesh on Dec 21 2016, 3:06 AM.

Details

Summary

This patch enables the following

  1. AMD family 17h architecture using "znver1" tune flag (-march, -mcpu).
  2. ISAs that are enabled for "znver1" architecture.
  3. Checks ADX isa from cpuid to identify "znver1" flag when -march=native is used.
  4. Enables CLZERO feature and adds the builtin macro __builtin_ia32_clzero for clzero instruction.
  5. ISAs FMA4, XOP are disabled as they are dropped from amdfam17.
  6. For the time being, it uses the btver2 scheduler model.
  7. Test file is updated to check this flag.

This is linked to llvm review item https://reviews.llvm.org/D28017

Diff Detail

Repository
rL LLVM

Event Timeline

GGanesh updated this revision to Diff 82217.Dec 21 2016, 3:06 AM
GGanesh retitled this revision from to AMD family 17h (znver1) enablement.
GGanesh updated this object.
GGanesh added a reviewer: craig.topper.
GGanesh set the repository for this revision to rL LLVM.
GGanesh updated this object.
GGanesh added a subscriber: llvm-commits.
RKSimon added inline comments.
lib/Basic/Targets.cpp
3157 ↗(On Diff #82217)

Don't need to specify avx as you have avx2

3177 ↗(On Diff #82217)

SSE4A?

Also, I don't think there is any clzero support on clang yet so adding that feature probably isn't safe. You may need to put up a separate clzero patch for review, preferably including _mm_clzero support to the headers as well.

GGanesh updated this revision to Diff 83566.Jan 8 2017, 8:39 AM
GGanesh removed rL LLVM as the repository for this revision.

The clzero builtins and feature addition will be handled separately in another patch.
SSE4a and movbe are added to the ISA list.

RKSimon added inline comments.Jan 8 2017, 9:23 AM
lib/Basic/Targets.cpp
3189 ↗(On Diff #83566)

Same as what I asked on D28017 - is there an accepted order that we should be using here?

GGanesh added inline comments.Jan 8 2017, 12:33 PM
lib/Basic/Targets.cpp
3189 ↗(On Diff #83566)

Some of them seems to be chronological.
Some of them are alphabetical.

I personally don't have any preference as such.
Alphabetical order suits a long list.
I would like to know your suggestion.

RKSimon added inline comments.Jan 9 2017, 4:07 AM
lib/Basic/Targets.cpp
3189 ↗(On Diff #83566)

@craig.topper Any preferences?

No strong preference and nothing that should slow the acceptance of this patch - alphabetical can be easier to maintain but it's unlikely this code changes often.

Sorting by feature groups/age can be more understandable, and can help account for the fall-through behaviour used in many of the cases here - speaking of which would it be useful to fall-through from CK_ZNVER1 to CK_BTVER2 to CK_BTVER1 since they seem to have a common set of features?

GGanesh updated this revision to Diff 83626.Jan 9 2017, 8:19 AM

Fallback to CK_BTVER1 is ok but not to CK_BTVER2. This is not possible because of the partial YMM writes. They have different behavior for znver1 with AVX and their legacy SIMD counterparts. So, as of now leaving them to alphabetical order.

RKSimon edited edge metadata.Jan 9 2017, 8:51 AM

Fallback to CK_BTVER1 is ok but not to CK_BTVER2. This is not possible because of the partial YMM writes. They have different behavior for znver1 with AVX and their legacy SIMD counterparts. So, as of now leaving them to alphabetical order.

That makes sense for the llvm patch, but for this clang patch in X86TargetInfo::initFeatureMap we don't need to handle those behaviour features just the instruction set features which we can fall though.

Yes. True I mentioned that for the grouping or the order of the features enabled. These initFeatureMap are done based on the intrinsics and the CodeGen part.

If Okay, can you please commit these on my behalf. I don't have write access.

LGTM - @craig.topper any additional comments?

craig.topper accepted this revision.Jan 9 2017, 7:17 PM
craig.topper edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jan 9 2017, 7:17 PM
This revision was automatically updated to reflect the committed changes.