This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Remove attempt at simplifying the format string in printf lowering
ClosedPublic

Authored by arsenm on Jun 28 2023, 10:55 AM.

Details

Reviewers
vikramRH
sameerds
rampitec
Pierre-vh
cdevadas
Group Reviewers
Restricted Project
Summary

This avoids computing the dominator tree by removing the
simplifyInstruction use.

This was applying simplification with some kind of questionable
load-store forwarding and looking for the global. This had to have
been an ancient hack copied from previous backends. In the OpenCL
case, this is always emitted as required the direct global reference
anyway.

Diff Detail

Event Timeline

arsenm created this revision.Jun 28 2023, 10:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2023, 10:55 AM
arsenm requested review of this revision.Jun 28 2023, 10:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2023, 10:55 AM
Herald added a subscriber: wdng. · View Herald Transcript
nikic added a subscriber: nikic.Jun 28 2023, 11:47 AM
nikic added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
461

Cached analyses on the NewPM should only be used for analyses preservation. You should either require DT here or not use it at all.

arsenm added inline comments.Jun 28 2023, 12:25 PM
llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
461

I don't see anything about this in the comments on getCachedResult

arsenm added inline comments.Jun 28 2023, 12:30 PM
llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
461

Is this not identical to how getBestSimplifyQuery uses this?

nikic added inline comments.Jun 28 2023, 12:43 PM
llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
461

It is -- we should fix it. Thankfully the places it is used have all the relevant analyses available anyway so it makes no functional difference.

To clarify, the reason why this is problematic is that while the LegacyPM determined analysis availability statically, in the NewPM this depends on whether some pass happened to make any changes and ended up invalidating analyses or not. This means that we (generally speaking) do not have guarantees about whether a given analysis will be available at a specific pipeline position. You'll get the same pass executing with the analysis or without the analysis, at the same pipeline position, depending on how exactly your IR looks like.

arsenm updated this revision to Diff 536433.Jun 30 2023, 2:37 PM
arsenm retitled this revision from AMDGPU: Avoid computing dominator tree for printf lowering to AMDGPU: Remove attempt at simplifying the format string in printf lowering.
arsenm edited the summary of this revision. (Show Details)

Just remove simplifyInstruction because the usage here makes no sense

Just remove simplifyInstruction because the usage here makes no sense

I vaguely remember it was added because we were unable to compile something without it and with -O0. But that was during HSAIL times and likely irrelevant.

Just remove simplifyInstruction because the usage here makes no sense

I vaguely remember it was added because we were unable to compile something without it and with -O0. But that was during HSAIL times and likely irrelevant.

I figured it was something like that, but clang -O0 -X -disable-llvm-passes still produces the required direct global reference to printf

This revision is now accepted and ready to land.Jun 30 2023, 2:49 PM

Just remove simplifyInstruction because the usage here makes no sense

I vaguely remember it was added because we were unable to compile something without it and with -O0. But that was during HSAIL times and likely irrelevant.

I figured it was something like that, but clang -O0 -X -disable-llvm-passes still produces the required direct global reference to printf

From my recollections we were unable to get to format string. This shall be fine now without.