Page MenuHomePhabricator

AMDGPU: Write LDS objects out as global symbols in code generation
Needs ReviewPublic

Authored by nhaehnle on May 3 2019, 5:04 AM.



The symbols are of the STT_AMDGPU_LDS type introduced with a previous
change. The linker is then expected to resolve relocations, which are
also emitted.

Initially disabled for HSA and PAL environments until they have caught up
in terms of linker and runtime loader.

Some notes:

  • The llvm.amdgcn.groupstaticsize intrinsics can no longer be lowered to a constant at compile times, which means some tests can no longer be applied.

    The current "solution" is a terrible hack, but the intrinsic isn't used by Mesa, so we can keep it for now.
  • We no longer know the full LDS size per kernel at compile time, which means that we can no longer generate a relevant error message at compile time. It would be possible to add a check for the size of individual variables, but ultimately the linker will have to perform the final check.

Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275

Diff Detail

Event Timeline

nhaehnle created this revision.May 3 2019, 5:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 3 2019, 5:04 AM
arsenm added a comment.May 3 2019, 6:00 AM

Is it still possible to make use constant addresses in some cases? If there are no calls for example

623–628 ↗(On Diff #197962)

Separate patch

716–718 ↗(On Diff #197962)

This should be a separate patch

nhaehnle marked 2 inline comments as done.May 7 2019, 1:18 PM

Is it still possible to make use constant addresses in some cases? If there are no calls for example

I'm changing the patch so that the symbols are only generated for globals with external linkage as discussed.

nhaehnle updated this revision to Diff 198518.May 7 2019, 1:21 PM

Address review comments.

arsenm added inline comments.May 8 2019, 5:46 AM

This uses the deprecated pointer type method. This should use getValueType() on the global instead


Shouldn't introduce machine nodes here. What is the problem with selecting the GlobalAddress directly in a tablegen pattern or manually in AMDGPUISelDAGToDAG?


Maybe we can define the intrinsic to return -1 if the size can't be determined, similar to how llvm.objectsize works?


I don't think there should be a need to handle machine nodes here, particularly moves. The GlobalAddress should be legal, and the select pattern should be inserting the move?

4–6 ↗(On Diff #198518)

I'm not sure these tests have been entirely replaced with MIR yet. Can you just set the triple to preserve them?

4–8 ↗(On Diff #198518)

I'm 99% sure we don't have replacements for any of these tests either. The point of these isn't to use groupstaticsize, but to get some constant that isn't visible during selection. These shouldn't be removed

367–369 ↗(On Diff #198518)


147 ↗(On Diff #198518)


nhaehnle updated this revision to Diff 204976.Sun, Jun 16, 4:37 PM
nhaehnle marked 5 inline comments as done.

Rebase, address some of the comments.

nhaehnle marked an inline comment as done.Sun, Jun 16, 4:42 PM
nhaehnle added inline comments.

Makes sense, though that seems like a task for a separate change.

nhaehnle updated this revision to Diff 204982.Sun, Jun 16, 6:02 PM
nhaehnle marked 3 inline comments as done.

Wrap LDS TargetGlobalAddress nodes in an AMDGPUISD::LDS node that is
selected to S_MOV/V_MOV. This should address the concerns about having
more machine nodes during SelectionDAG, and as a nice side bonus it
actually improves the quality of known-bits information by adding
alignment information as well.

jsji removed a subscriber: jsji.Sun, Jun 16, 7:52 PM
arsenm added inline comments.Mon, Jun 17, 10:16 AM

This breaks the initializer error. This should be moved after

nhaehnle updated this revision to Diff 205157.Mon, Jun 17, 12:54 PM
nhaehnle marked an inline comment as done.

Address review comment