This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0
ClosedPublic

Authored by nhaehnle on May 3 2019, 5:02 AM.

Details

Summary

Instead of encoding a high-word of 0 using a fake TargetGlobalAddress,
just use a literal target constant. This simplifies some subsequent changes.

The generated assembly is now more explicit about the kind of relocation
that is to be used.

Change-Id: I066835202d23b5941fa7a358eb4b89e9b71ab6f8

Diff Detail

Event Timeline

nhaehnle created this revision.May 3 2019, 5:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 3 2019, 5:02 AM
arsenm added a comment.May 3 2019, 5:48 AM

But these are often -1?

But these are often -1?

Not sure what you're saying here? The point is that when the GAFlags are MO_NONE, there was previously magic behavior to force the high dword to 0. This patch removes the magic behavior of MO_NONE, or perhaps it's more accurate to say that it pushes the magic behavior forward into buildPCRelGlobalAddress, so that it's confined locally. The MO_NONE case is only hit in the badly named shouldEmitFixup case, which applies to constants that should be emitted to the .text section, where they will always be after the code.

(The shouldEmitConstantsToTextSection is itself a gross hack that is applied based on the triple. We'll soon be able to remove it for Mesa, once the proper linking lands, and then hopefully later on for PAL as well.)

arsenm accepted this revision.Jun 6 2019, 7:21 PM

LGTM. Are you saying we should remove shouldEmitFixup and just always use buildPCRelGlobalAddress?

This revision is now accepted and ready to land.Jun 6 2019, 7:21 PM
This revision was automatically updated to reflect the committed changes.

Are you saying we should remove shouldEmitFixup and just always use buildPCRelGlobalAddress?

The current trunk always uses buildPCRelGlobalAddress, and that makes sense to me. shouldEmitFixup is part of the logic that determines what kind of PC-relative global address is used. There seem to be three modes right now:

  1. 64-bit PC-relative to the symbol
  2. 32-bit PC-relative to the symbol
  3. 64-bit PC-relative to the GOT, loading the final symbol address from the GOT.

The logic deciding between these cases seems like it could benefit from being streamlined and simplified, but I wouldn't remove it outright as the distinction can be helpful.