This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header
ClosedPublic

Authored by kzhuravl on May 17 2016, 12:59 PM.

Details

Summary

Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.

Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:

  • offset 0: work group ID x
  • offset 4: work group ID y
  • offset 8: work group ID z
  • offset 16: work item ID x
  • offset 20: work item ID y
  • offset 24: work item ID z

Set

  • amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
  • amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
  • amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled

Diff Detail

Event Timeline

kzhuravl updated this revision to Diff 57509.May 17 2016, 12:59 PM
kzhuravl retitled this revision from to [AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header.
kzhuravl updated this object.
kzhuravl added reviewers: arsenm, tstellarAMD.
kzhuravl added a subscriber: llvm-commits.
lib/Target/AMDGPU/SIInstructions.td
1904–1920

Why can't we use the existing spill pseudos for this?

lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
93–103

I think it would be better to have AMDGPUAnnotateKernelFeatures check ST.debuggerEmitPrologue() and add the attribute in that pass.

111

We are creating stack objects during ISel for the debugger prologue, so doesn't that make ST.debuggerEmitPrologue() redundant?

kzhuravl updated this revision to Diff 57678.May 18 2016, 2:14 PM

Update based on Tom's review feedback

kzhuravl marked 3 inline comments as done.May 18 2016, 2:15 PM
arsenm added inline comments.May 23 2016, 9:20 AM
lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
93–103

I don't agree. I don't think it makes sense to add a attribute based on a subtarget feature. If the feature is attached to the subtarget already, this is just adding more clutter to the attribute list. We should just implement the per-function subtarget stuff since a subtarget is basically a collection of subtarget features.

arsenm added inline comments.May 23 2016, 9:24 AM
lib/Target/AMDGPU/AMDGPUAsmPrinter.h
80–86

Why are these using the SGPR number and special invalid register instead of the phys reg ID and NoReg?

lib/Target/AMDGPU/SIISelLowering.cpp
690

Why not create a single object and use offsets into it?

kzhuravl updated this revision to Diff 58434.May 25 2016, 8:57 AM

Rebase to tot

kzhuravl added inline comments.May 25 2016, 9:09 AM
lib/Target/AMDGPU/AMDGPUAsmPrinter.h
80–86

Because we have to be compliant with our existing spec, which says to put SGPR number or (uint16_t)-1.

lib/Target/AMDGPU/SIISelLowering.cpp
690

Is there any advantage to having a single object?

The reason for having multiple objects is so we can use storeRegToStackSlot, which creates spills.

kzhuravl updated this revision to Diff 60950.Jun 16 2016, 12:55 AM

Address review feedback

kzhuravl marked an inline comment as done.Jun 16 2016, 12:56 AM
arsenm added inline comments.Jun 24 2016, 9:48 AM
lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
201–202

This willl print the wrong thing if it isn't set. Should it print something else if not valid?

lib/Target/AMDGPU/AMDGPUAsmPrinter.h
80–86

Oh, OK. I didn't realize this was in the printer

lib/Target/AMDGPU/AMDGPUSubtarget.h
321

&& on previous line

lib/Target/AMDGPU/SIFrameLowering.cpp
42

Since my patch yesterday this should be SISubtarget, and ST can be re-used below

295–296

SISubtarget and then you can remove the casts

lib/Target/AMDGPU/SIISelLowering.cpp
686–687

Can you move the body of the loop into a helper function as well as the comment

test/CodeGen/AMDGPU/debugger-emit-prologue.ll
19–20

Should also check these in a test where these aren't set

kzhuravl marked 11 inline comments as done.Jun 24 2016, 1:51 PM
kzhuravl added inline comments.
lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
201–202

Thanks for catching this one. I have changed it so it only gets printed if amdgpu-debugger-emit-prologue attribute was specified.

kzhuravl updated this revision to Diff 61832.Jun 24 2016, 1:53 PM
kzhuravl marked an inline comment as done.

Address review comments

kzhuravl updated this revision to Diff 61834.Jun 24 2016, 1:54 PM

Upload correct diff

arsenm accepted this revision.Jun 24 2016, 6:59 PM
arsenm edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jun 24 2016, 6:59 PM
This revision was automatically updated to reflect the committed changes.
aaboud reopened this revision.Jun 26 2016, 1:32 AM
aaboud added a subscriber: aaboud.

This commit is causing build to fail on windows configuration.

llvm/trunk/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
57 ↗(On Diff #61883)

These lines caused a build failure on windows BuildBot.

error C2536: 'llvm::SIMachineFunctionInfo::llvm::SIMachineFunctionInfo::DebuggerWorkGroupIDStackObjectIndices' : cannot specify explicit initializer for arrays

One option to fix this is to use "std::array<int, 3>" as type of these two fields instead of "int[3]".
Please, fix the issue as soon as possible.

This revision is now accepted and ready to land.Jun 26 2016, 1:32 AM

Sorry, I was travelling until now. It seems like it is fixed in rL273860

kzhuravl closed this revision.Jul 6 2016, 8:18 AM