This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Replace target feature for global fadd32
ClosedPublic

Authored by gandhi21299 on Mar 24 2023, 12:22 PM.

Details

Summary

Change target feature of __builtin_amdgcn_global_atomic_fadd_f32
to atomic-fadd-rtn-insts. Enable atomic-fadd-rtn-insts for gfx90a,
gfx940 and gfx1100 as they all support the return variant of
global_atomic_add_f32.

Fixes https://github.com/llvm/llvm-project/issues/61331.

Diff Detail

Event Timeline

gandhi21299 created this revision.Mar 24 2023, 12:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2023, 12:22 PM
gandhi21299 requested review of this revision.Mar 24 2023, 12:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2023, 12:22 PM
gandhi21299 added a project: Restricted Project.Mar 24 2023, 12:25 PM
rampitec requested changes to this revision.Mar 27 2023, 11:54 AM

You cannot just enable it on gfx908 which does not have return version of it.

This revision now requires changes to proceed.Mar 27 2023, 11:54 AM
  • gfx908 does not support return version of the builtin
  • removed builtins-amdgcn-gfx908.cl

Can you please also add gfx90a and gfx940 tests?

Otherwise LGTM *if* @b-sumner has no objections.

gandhi21299 edited the summary of this revision. (Show Details)Mar 28 2023, 1:46 PM
  • Adding tests for gfx90a and gfx940
rampitec accepted this revision.Mar 28 2023, 2:22 PM

LGTM. Please wait for @b-sumner.

This revision is now accepted and ready to land.Mar 28 2023, 2:22 PM
gandhi21299 added a comment.EditedMar 28 2023, 2:30 PM

Sounds good, thanks for the review @rampitec

No objection here.

This revision was landed with ongoing or failed builds.Mar 28 2023, 2:58 PM
This revision was automatically updated to reflect the committed changes.