This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn
ClosedPublic

Authored by kzhuravl on Feb 16 2018, 2:46 PM.

Details

Summary
  • Expand GK_*s (i.e. GFX6 -> GFX600, GFX601, etc.)
    • This allows us to choose features correctly in some cases (for example, fast fmaf is available on gfx600, but not gfx601)
  • Move HasFMAF, HasFP64, HasLDEXPF to GPUInfo tables
  • Add HasFastFMA, HasFastFMAF to GPUInfo tables
  • Add missing tests

Diff Detail

Event Timeline

kzhuravl created this revision.Feb 16 2018, 2:46 PM
t-tye added a subscriber: b-sumner.Feb 16 2018, 2:53 PM
t-tye added inline comments.
lib/Basic/Targets/AMDGPU.cpp
357–360

Do all amdgcn targets have fast FMA? @b-sumner can you clarify?

b-sumner added inline comments.Feb 16 2018, 3:18 PM
lib/Basic/Targets/AMDGPU.cpp
357–360

No. All targets that support double precision should report FAST_FMA. Only targets with full rate v_fma_f32 should report FAST_FMAF

t-tye added inline comments.Feb 16 2018, 3:24 PM
lib/Basic/Targets/AMDGPU.cpp
357–360

It is unfortunate that clang does not have access to the processor features defined in the td files which gives the settings for each target.

t-tye requested changes to this revision.Feb 16 2018, 4:31 PM
t-tye added inline comments.
lib/Basic/Targets/AMDGPU.cpp
357–360

Now that the compiler knows the target it seems the clang options that specify fast_fma et al should be removed and the runtimes changed to not set them.

The implementation of when fast fma[f] is present should match the amdgcn td files which have all gfx9 and some pre-gfx9 targets supporting fast fmaf (the ones that have full rate double precision).

This revision now requires changes to proceed.Feb 16 2018, 4:31 PM
kzhuravl updated this revision to Diff 135437.Feb 22 2018, 9:11 AM
kzhuravl marked 4 inline comments as done.
kzhuravl edited the summary of this revision. (Show Details)

Address review feedback

t-tye requested changes to this revision.Feb 23 2018, 3:03 PM
t-tye added inline comments.
lib/Basic/Targets/AMDGPU.cpp
159

What does this mean? Has it now been addressed by this patch?

275–276

To be consistent should this be:

if (isAMDGCN(getTriple()))

Similar comment elsewhere.

282

This was incorrect in the old code. Only full rate FP64 gcn targets have fast FMAF.

lib/Basic/Targets/AMDGPU.h
88–91

Would it be better to position these at the beginning/end of the respective enumerations so it is more obvious that they must be updated when adding a new target?

99–103

Suggest reordering to be in a logical groups of FP32 and FP64:

bool HasFMAF;
bool HasFastFMAF;
bool HasLDEXPF;
bool HasFP64;
bool HasFastFMA;
109

Suggest adding a comment here which is a header for the columns to make it easier to check if the settings are right. For example:

// Name  Canonical  Kind  HasFMAF   HasFP64   HasLDEXPF   HasFastFMA   HasFastFMAF
140

Same comment as above.

Also, I think the fast_fma and fast_fmaf columns are reversed. All gcn has fast_fma, it is fast_fmaf that varies.

151

gfx702 should have true for fast fmaf.

Does the TD file need correcting too?

332

I found the original operand order easier to read:-)

This revision now requires changes to proceed.Feb 23 2018, 3:03 PM
b-sumner added inline comments.Feb 23 2018, 3:10 PM
lib/Basic/Targets/AMDGPU.cpp
349–353

I'm not sure why this is here. No languages we support have this AFAIK. We should probably add a comment that this is deprecated and remove it in a year or so.

kzhuravl added inline comments.Feb 23 2018, 3:39 PM
lib/Basic/Targets/AMDGPU.cpp
159

This class has a member called *GPU*. We are using the *CPU* that is passed as an argument.

Has it now been addressed by this patch?

No.

lib/Basic/Targets/AMDGPU.h
140

fast_fmaf is the last column. I think those are in the correct order.

151
332

I was trying to match the style above. But I can change it back.

t-tye added inline comments.Feb 23 2018, 4:58 PM
lib/Basic/Targets/AMDGPU.h
140

Agreed.

151

I believe the TD file is incorrect. gfx701 and 702 should both have fast fmaf.

332

Leave it in your new order which keeps it consistent as you point out. (For me they are all backwards:-) )

nhaehnle removed a subscriber: nhaehnle.Feb 25 2018, 7:36 AM
kzhuravl updated this revision to Diff 135997.Feb 26 2018, 3:09 PM
kzhuravl marked 17 inline comments as done.

Address review feedback.

b-sumner added inline comments.Feb 26 2018, 3:57 PM
lib/Basic/Targets/AMDGPU.h
103

I guess the HasFastFMA is for simplicity? It is a synonym for HasFP64.

t-tye accepted this revision.Feb 26 2018, 9:58 PM

LGTM except for comment on deprecated macros.

lib/Basic/Targets/AMDGPU.cpp
346–347

@b-sumner which ones were to recommending deprecating? I would think FP64 and FMAF would be ones that needs to be kept? I thought it was HAS_FAST_FMA that is not specified by OpenCL.

This revision is now accepted and ready to land.Feb 26 2018, 9:58 PM
b-sumner added inline comments.Feb 27 2018, 5:41 AM
lib/Basic/Targets/AMDGPU.cpp
346–347

Sorry, I was referring to all of the __HAS_* macros.