This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Simplify and improve sincos matching
ClosedPublic

Authored by arsenm on Jul 31 2023, 8:53 AM.

Details

Reviewers
rampitec
vpykhtin
dfukalov
jmmartinez
jhuber6
yaxunl
Pierre-vh
Group Reviewers
Restricted Project
Summary

The first trivial example I tried failed to merge due to the user scan
logic. Remove the complicated scan of users handling with distance
thresholds, with a same block restriction. The actual expansion of
sincos is basically the same size as sin or cos individually. Copy the
technique the generic optimization uses, which is to just use the
input instruction as the insert point or just insert at the start of
the entry block.

Diff Detail

Event Timeline

arsenm created this revision.Jul 31 2023, 8:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 31 2023, 8:53 AM
arsenm requested review of this revision.Jul 31 2023, 8:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 31 2023, 8:53 AM
Herald added a subscriber: wdng. · View Herald Transcript
jmmartinez added inline comments.Aug 1 2023, 1:53 AM
llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp
1094–1098

We should also set the debug location for the Call to be the one of Sin, and the debug locaiton of Cos to be the one of Reload

arsenm added inline comments.Aug 1 2023, 4:48 PM
llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp
1094–1098

I couldn't figure out what to do about the debug loc. There didn't seem to be a update-these-two-for-merge function anywhere. Currently it will get the debug loc for the initial call

jmmartinez added inline comments.Aug 2 2023, 1:01 AM
llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp
1094–1098

There is Instruction::applyMergedLocation but it doesn't seem appropiate in this case.

If I'm not wrong, crrently it's getting the debug location of the IRBuilder insert point (and only if Arg is an instruction).

arsenm added inline comments.Aug 2 2023, 5:53 AM
llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp
1094–1098

You're half wrong. The default is the location for the initially visited instruction. If the incoming argument is an instruction, it breaks it by taking the location from the argument. It's not unset anywhere

arsenm added inline comments.Aug 2 2023, 6:00 AM
llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp
1094–1098

I'm also not sure SetInsertPointPastAllocas is doing the right thing by *not* resetting the debug location

arsenm added inline comments.Aug 2 2023, 6:15 AM
llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp
1094–1098

I think applyMergedLocation is appropriate, the API is just bad and makes you go through the raw DILocation instead of supplying Instruction wrappers

arsenm updated this revision to Diff 546438.Aug 2 2023, 6:26 AM

Improve debug info handling

jmmartinez accepted this revision.Aug 2 2023, 6:44 AM
This revision is now accepted and ready to land.Aug 2 2023, 6:44 AM