This is an archive of the discontinued LLVM Phabricator instance.

[Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions
ClosedPublic

Authored by gandhi21299 on Aug 16 2021, 11:37 AM.

Details

Summary
  • produce remarks when atomic instructions are expanded into hardware instructions in SIISelLowering.cpp
  • an OpenCL test containing both IR-level and ISA level checks
  • currently only support for atomic fadd

Diff Detail

Event Timeline

gandhi21299 created this revision.Aug 16 2021, 11:37 AM
gandhi21299 requested review of this revision.Aug 16 2021, 11:37 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 16 2021, 11:37 AM
  • Add [AMDGPU] to the title.
  • Rebase on top of D106891.
  • Add tests to atomics-remarks-gfx90a.ll as well, including LDS with matching and non-matching rounding mode.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
12192

This is safe.

12193–12194

This is safe if fpModeMatchesGlobalFPAtomicMode(RMW) returned true and unsafe if not.

gandhi21299 retitled this revision from [Remarks] Emit optimization remarks for atomics generating hardware instructions to [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions.Aug 16 2021, 12:36 PM
gandhi21299 marked 2 inline comments as done.Aug 16 2021, 12:38 PM
  • added more tests
  • addressed feedback from reviewer
  • corrected an argument in AtomicExpand pass
rampitec added inline comments.Aug 16 2021, 3:40 PM
clang/test/CodeGenOpenCL/atomics-cas-remarks-gfx90a.cl
9

You are compiling 2 functions with 2 different sets of options. Essentially it is unclear what are you checking because either half skips half of the remarks. Either compile a single function differently or make 2 different tests.

13

This line does not have -munsafe-fp-atomics option...

60

... therefor these checks must fail.

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
12194

This is wrong. Condition is inverted and essentially tests should fail. Make sure you can pass testing before posting a diff.

llvm/test/CodeGen/AMDGPU/atomics-cas-remarks-gfx90a.ll
108

Does it print a function name before the diagnostics? Label checks would be useful.

gandhi21299 marked 5 inline comments as done.Aug 16 2021, 4:31 PM
gandhi21299 added inline comments.
clang/test/CodeGenOpenCL/atomics-cas-remarks-gfx90a.cl
9

I will create 2 seperate tests

llvm/test/CodeGen/AMDGPU/atomics-cas-remarks-gfx90a.ll
108

Nope, it does not.

gandhi21299 marked 2 inline comments as done.
  • split the OpenCL test into two for brevity
  • fixed a mistake in SIISelLowering as pointed out by reviewer
  • added the missing -munsafe-fp-atomics flag
rampitec added inline comments.Aug 17 2021, 10:39 AM
clang/test/CodeGenOpenCL/atomics-cas-remarks-gfx90a.cl
9

You do not need to change this file anymore.

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
12122

It does not need to me a method, just a static function or better even lamdba.

12194

Unresolved.

llvm/test/CodeGen/AMDGPU/atomics-cas-remarks-gfx90a.ll
108

That's pity. Then this file need to be split too.

gandhi21299 marked 3 inline comments as done.Aug 17 2021, 11:03 AM
gandhi21299 added inline comments.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
12194

Remarks are produced if fpModeMatchesGlobalFPAtomicMode(RMW) == false

rampitec added inline comments.Aug 17 2021, 11:06 AM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
12194

But you have changed what function was doing. It was returning CmpXChg.

  • requested changes from reviewer
gandhi21299 marked an inline comment as done.
  • corrected AtomicExpansionKind return value in SIISelLowering as pointed out by reviewer
  • eliminated previous function declaration which is not defined/used anymore
rampitec requested changes to this revision.Aug 17 2021, 1:54 PM

Logic is still wrong.

This revision now requires changes to proceed.Aug 17 2021, 1:54 PM

@rampitec Which part of the logic is wrong?

@rampitec Which part of the logic is wrong?

Still the same around LDS.

My understanding is that since we are reporting unsafe expansion into hw instructions, fpModeMatchesGlobalFPAtomicMode(RMW) must be false to match the logic.

My understanding is that since we are reporting unsafe expansion into hw instructions, fpModeMatchesGlobalFPAtomicMode(RMW) must be false to match the logic.

Please run check-llvm before updating the patch.

  • corrected logic for ORE in SIISelLowering.cpp
rampitec added inline comments.Aug 18 2021, 10:50 AM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
12194
if (fpModeMatchesGlobalFPAtomicMode(RMW))
  return AtomicExpansionKind::None;

return RMW->getFunction()
                   ->getFnAttribute("amdgpu-unsafe-fp-atomics")
                   .getValueAsString() == "true"
           ? ReportUnsafeHWInst(AtomicExpansionKind::None)
           : AtomicExpansionKind::CmpXChg;
gandhi21299 marked an inline comment as done.
  • code refactor
rampitec accepted this revision.Aug 18 2021, 11:24 AM
This revision is now accepted and ready to land.Aug 18 2021, 11:24 AM
gandhi21299 marked an inline comment as done.Aug 18 2021, 11:59 AM

Thanks for the review, I will merge this in as soon as the CI passes.