This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn
ClosedPublic

Authored by kzhuravl on Feb 16 2018, 2:46 PM.

Details

Summary
  • Expand GK_*s (i.e. GFX6 -> GFX600, GFX601, etc.)
    • This allows us to choose features correctly in some cases (for example, fast fmaf is available on gfx600, but not gfx601)
  • Move HasFMAF, HasFP64, HasLDEXPF to GPUInfo tables
  • Add HasFastFMA, HasFastFMAF to GPUInfo tables
  • Add missing tests

Diff Detail

Event Timeline

kzhuravl created this revision.Feb 16 2018, 2:46 PM
t-tye added a subscriber: b-sumner.Feb 16 2018, 2:53 PM
t-tye added inline comments.
lib/Basic/Targets/AMDGPU.cpp
345–348

Do all amdgcn targets have fast FMA? @b-sumner can you clarify?

b-sumner added inline comments.Feb 16 2018, 3:18 PM
lib/Basic/Targets/AMDGPU.cpp
345–348

No. All targets that support double precision should report FAST_FMA. Only targets with full rate v_fma_f32 should report FAST_FMAF

t-tye added inline comments.Feb 16 2018, 3:24 PM
lib/Basic/Targets/AMDGPU.cpp
345–348

It is unfortunate that clang does not have access to the processor features defined in the td files which gives the settings for each target.

t-tye requested changes to this revision.Feb 16 2018, 4:31 PM
t-tye added inline comments.
lib/Basic/Targets/AMDGPU.cpp
345–348

Now that the compiler knows the target it seems the clang options that specify fast_fma et al should be removed and the runtimes changed to not set them.

The implementation of when fast fma[f] is present should match the amdgcn td files which have all gfx9 and some pre-gfx9 targets supporting fast fmaf (the ones that have full rate double precision).

This revision now requires changes to proceed.Feb 16 2018, 4:31 PM
kzhuravl updated this revision to Diff 135437.Feb 22 2018, 9:11 AM
kzhuravl marked 4 inline comments as done.
kzhuravl edited the summary of this revision. (Show Details)

Address review feedback

t-tye requested changes to this revision.Feb 23 2018, 3:03 PM
t-tye added inline comments.
lib/Basic/Targets/AMDGPU.cpp
159

What does this mean? Has it now been addressed by this patch?

256

To be consistent should this be:

if (isAMDGCN(getTriple()))

Similar comment elsewhere.

282

This was incorrect in the old code. Only full rate FP64 gcn targets have fast FMAF.

lib/Basic/Targets/AMDGPU.h
88–91 ↗(On Diff #135437)

Would it be better to position these at the beginning/end of the respective enumerations so it is more obvious that they must be updated when adding a new target?

98–102 ↗(On Diff #135437)

Suggest reordering to be in a logical groups of FP32 and FP64:

bool HasFMAF;
bool HasFastFMAF;
bool HasLDEXPF;
bool HasFP64;
bool HasFastFMA;
108 ↗(On Diff #135437)

Suggest adding a comment here which is a header for the columns to make it easier to check if the settings are right. For example:

// Name  Canonical  Kind  HasFMAF   HasFP64   HasLDEXPF   HasFastFMA   HasFastFMAF
136 ↗(On Diff #135437)

Same comment as above.

Also, I think the fast_fma and fast_fmaf columns are reversed. All gcn has fast_fma, it is fast_fmaf that varies.

147 ↗(On Diff #135437)

gfx702 should have true for fast fmaf.

Does the TD file need correcting too?

325 ↗(On Diff #135437)

I found the original operand order easier to read:-)

This revision now requires changes to proceed.Feb 23 2018, 3:03 PM
b-sumner added inline comments.Feb 23 2018, 3:10 PM
lib/Basic/Targets/AMDGPU.cpp
339–341

I'm not sure why this is here. No languages we support have this AFAIK. We should probably add a comment that this is deprecated and remove it in a year or so.

kzhuravl added inline comments.Feb 23 2018, 3:39 PM
lib/Basic/Targets/AMDGPU.cpp
159

This class has a member called *GPU*. We are using the *CPU* that is passed as an argument.

Has it now been addressed by this patch?

No.

lib/Basic/Targets/AMDGPU.h
136 ↗(On Diff #135437)

fast_fmaf is the last column. I think those are in the correct order.

147 ↗(On Diff #135437)
325 ↗(On Diff #135437)

I was trying to match the style above. But I can change it back.

t-tye added inline comments.Feb 23 2018, 4:58 PM
lib/Basic/Targets/AMDGPU.h
136 ↗(On Diff #135437)

Agreed.

147 ↗(On Diff #135437)

I believe the TD file is incorrect. gfx701 and 702 should both have fast fmaf.

325 ↗(On Diff #135437)

Leave it in your new order which keeps it consistent as you point out. (For me they are all backwards:-) )

nhaehnle removed a subscriber: nhaehnle.Feb 25 2018, 7:36 AM
kzhuravl updated this revision to Diff 135997.Feb 26 2018, 3:09 PM
kzhuravl marked 17 inline comments as done.

Address review feedback.

b-sumner added inline comments.Feb 26 2018, 3:57 PM
lib/Basic/Targets/AMDGPU.h
103 ↗(On Diff #135997)

I guess the HasFastFMA is for simplicity? It is a synonym for HasFP64.

t-tye accepted this revision.Feb 26 2018, 9:58 PM

LGTM except for comment on deprecated macros.

lib/Basic/Targets/AMDGPU.cpp
338–339

@b-sumner which ones were to recommending deprecating? I would think FP64 and FMAF would be ones that needs to be kept? I thought it was HAS_FAST_FMA that is not specified by OpenCL.

This revision is now accepted and ready to land.Feb 26 2018, 9:58 PM
b-sumner added inline comments.Feb 27 2018, 5:41 AM
lib/Basic/Targets/AMDGPU.cpp
338–339

Sorry, I was referring to all of the __HAS_* macros.