Page MenuHomePhabricator

AMDGPU/AMDHSA: Implement new target ID support in AMDGPU backend
Needs ReviewPublic

Authored by kzhuravl on Jun 12 2020, 6:25 PM.


Target ID format is a list of strings delimited by ':', e.g.
amdgcn-amd-amdhsa--gfx908:xnack+:sramecc-. The first string
is an id string which could be triple-cpu but may not be
necessarily so. The other strings are called feature string
which may or may not be target features. Except for the id
string, all feature strings end with '+' or '-'.

  - Add --amdgcn-new-target-id option that enables new
    target ID in the backend (off by default)
  - ELF Header changes
    - Per code object features are now recorded in e_flags
      - xnack, values: default, off, on
        - EF_AMDGPU_FEATURE_XNACK (mask)
      - sramecc, values: default, off, on
    - e_ident[EI_ABIVERSION] is bumped up to 2
  - Metadata
    - Add root metadata with key "" and value
      being new target id
  - Assembler
    - Parse and emit new target id
  - Tools
    - Relevant update to elf tools

Diff Detail

Event Timeline

kzhuravl created this revision.Jun 12 2020, 6:25 PM
Herald added a project: Restricted Project. · View Herald Transcript

Changes to AMDGPUUsage are WIP.

lib/ObjectYAML/ELFYAML.cpp and tools/llvm-readobj/ELFDumper.cpp changes can be made to a separate patch. They are not tightly coupled with the MC/CodeGen changes. A group of binary format focused people will be interested to review that part.

lib/ObjectYAML/ELFYAML.cpp and tools/llvm-readobj/ELFDumper.cpp changes can be made to a separate patch. They are not tightly coupled with the MC/CodeGen changes. A group of binary format focused people will be interested to review that part.

Sure, I will move that part to a different change.

Changes to AMDGPUUsage are WIP.

Since Sam is updating the docs with the overall target id concept, I think these docs are sufficient for this review.

scott.linder added inline comments.Jun 22 2020, 11:42 AM

Why do we enumerate ELFABIVERSION_* if they can have multiple actual values? It seems like the only other place we mention these is to say that they describe the version of their corresponding ELFOSABI_*. Could we just replace those places with the literal values that are possible for each ELFOSABI_*?


Are these shared between all ELFOSABI_*? Other places seem to also mention the EI_OSABI field when specifying where these values are legal.


Nit: I would prefer the value in parenthesis be written in hex to be consistent with the Value column.


Can you update these comments to explicitly state they are applicable to EI_ABIVERSION<2 ?


Same question as above; are these really only under the specified EI_OSABI and if so can that be reflected in AMDGPUUsage?


Is there a reason for this to be non-zero? Maybe it would make sense to have four values:

  • 0b00: Not an applicable feature for the given MACH
  • 0b01: Default
  • 0b10: Off
  • 0b11: On

But from my original reading of the proposal my understanding was we would instead use:

  • 0b00: If applicable, Default. Otherwise, the only legal value for a non-applicable target feature.
  • 0b01: Off
  • 0b10: On

In effect the context of the ARCH tells you if you even need to consider the feature bits for a given feature, so differentiating the case isn't useful. I guess we don't save any bits this way though?

The original proposal also mentions "makes using a mask to check for a match easier as value & test will be 0 if exact match or matching against default", although I'm not actually clear what it means to say.


I think I need some help to understand what case this intends to diagnose. Is the GlobalSTI a property of the module? Where is it derived from if the subtarget is a per-function concept?


It seems like the condition/assertion here should be flipped. We should identify the metadata by the key, then assert it has the required merge behavior. Is it just inverted to avoid repeated string comparisons, and because we can be certain there won't be other uses of the merge behavior for AMDGPU IR?


Do we really want to have this "fallback"? It seems like we should just define the "target-id" metadata entry as required, and fail here instead.

I can't remember exactly, but my understanding is that the Target you get from MMI is essentially incorrect, and is only there for legacy purposes. We have had bugs in the past, some of which led us to the whole TargetID proposal in the first place, when we used the MMI instead of per-function Subtargets. Not AMDGPU, but see for an example.

I am worried that this behavior will just lead to more bugs. Is it here to avoid updating tests or something similar?


Nit: Can this condition be inverted, i.e. the new case first.


Why is TargetID ignored, but passed in by AMDGPUAsmParser::ParseDirectiveAMDGCNTarget for this case, and constructed in the same way?


Same question as above about the bit-pattern for the Default case; is there a need to differentiate the non-supported-feature case from the supported-and-Default case? If not, can we just change the bitpatterns of the three values and delete these bitwise-ORs?

kzhuravl marked 8 inline comments as done.Aug 11 2020, 8:11 AM
kzhuravl added inline comments.

doc update in separate review.


doc update in separate review.


doc update in separate review.




discussed offline. docs have been updated to match this implementation.


yes, avoiding string comparisons. also implementation of MergeTargetID implies there won't be other uses of the merge behavior.


no, according to latest docs.