This is an archive of the discontinued LLVM Phabricator instance.

[NFC][AMDGPU] Refactor AMDGPUDisassembler
ClosedPublic

Authored by scott.linder on Jul 19 2022, 11:32 AM.

Details

Summary

Clean up ahead of a patch to fix bugs in the AMDGPUDisassembler.

Use split-file to simplify and extend existing kernel-descriptor
disassembly tests.

Add a comment to AMDHSAKernelDescriptor.h, as at least one small set
towards keeping all kernel-descriptor sensitive code in sync.

Diff Detail

Event Timeline

scott.linder created this revision.Jul 19 2022, 11:32 AM
Herald added a reviewer: MaskRay. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
scott.linder requested review of this revision.Jul 19 2022, 11:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2022, 11:32 AM
arsenm added inline comments.Jul 19 2022, 11:42 AM
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
1926

What are these comments?

scott.linder added inline comments.Jul 19 2022, 11:43 AM
llvm/test/tools/llvm-objdump/ELF/AMDGPU/lit.local.cfg
5

I'm not certain I struck the best balance here between "hiding mostly irrelevant details" and "obfuscating the RUN: lines of the KD tests", but I found it hard to tell at a glance what each test case was doing without at least some of the factoring I ended up with here (and the RUN lines were getting long, even with continuations across lines)

Suggestions on this are welcome!

scott.linder added inline comments.Jul 19 2022, 11:45 AM
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
1926

They suppress clang-tidy diagnostics, with the idea here being that it is worth using the COMPUTE_PGM_RSRC3 identifier from the KD specification directly in the method name, even though it breaks the normal LLVM style rules

arsenm accepted this revision.Sep 16 2022, 9:41 AM
arsenm added inline comments.
llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-vgpr.s
3

Do you really need to rm first? I don't remember seeing any other test do that

This revision is now accepted and ready to land.Sep 16 2022, 9:41 AM
kzhuravl accepted this revision.Sep 19 2022, 7:56 AM

LGTM, thanks!

scott.linder added inline comments.Sep 20 2022, 1:35 PM
llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-vgpr.s
3

I'm also curious where this originates, as it seems around 40% of the uses of split-file are preceded by rm -rf:

$ rg --count-matches split-file | awk -F: '{ sum+=$2 } END { print sum }'
519
$ rg --count-matches 'rm -rf.*split-file' | awk -F: '{ sum+=$2 } END { print sum }'
202

I will remove the rm from the version I land, but it would be nice to understand why it is sometimes present.

This revision was landed with ongoing or failed builds.Sep 20 2022, 1:38 PM
This revision was automatically updated to reflect the committed changes.
cmtice added a subscriber: cmtice.Sep 20 2022, 9:45 PM

This commit appears to be causing our bootstrap builds to fail, while running these tests. The error we get for kd-vgpr.s is
[ 5] ;--- 1.s
[ 6] ; RUN: %assemble -mcpu=gfx908 <1.s >1.o [FAIL]
%assemble -mcpu=gfx908 <1.s >1.o
bash: line 0: fg: no job control
Command failed: exit status 1

The error we get for kd-sgpr.s is
[ 6] ;; Only set next_free_sgpr.
[ 7] ; RUN: %assemble -mcpu=gfx908 <1.s >1.o [FAIL]
%assemble -mcpu=gfx908 <1.s >1.o
bash: line 0: fg: no job control
Command failed: exit status 1

Could you revert this commit until these issues are fixed?

scott.linder reopened this revision.Oct 17 2022, 3:11 PM
This revision is now accepted and ready to land.Oct 17 2022, 3:11 PM

I am not sure I understand the exact root cause of the failures, but I do know
that bash sometimes interprets % as starting a "job specifier", so I tried
removing it from the substitution names.

I also removed two substitutions that actually belong in D128014.

Does anyone have a better idea of why this might be failing? Is there a way
for me to reproduce the bootstrap build locally to test?

foad added a comment.Oct 18 2022, 3:27 AM

[ 6] ; RUN: %assemble -mcpu=gfx908 <1.s >1.o [FAIL]
%assemble -mcpu=gfx908 <1.s >1.o
bash: line 0: fg: no job control

This seems pretty clear that lit has not applied the substitution for %assemble before passing the command line to the shell. But I don't know how that could have happened. @cmtice do you have a link to your actual build? Is there anything unusual about it?

arsenm accepted this revision.Jun 22 2023, 8:34 AM

Was this ever re-committed?

[ 6] ; RUN: %assemble -mcpu=gfx908 <1.s >1.o [FAIL]
%assemble -mcpu=gfx908 <1.s >1.o
bash: line 0: fg: no job control

This seems pretty clear that lit has not applied the substitution for %assemble before passing the command line to the shell. But I don't know how that could have happened. @cmtice do you have a link to your actual build? Is there anything unusual about it?

I didn't make the connection that the lit substitutions would apply before bash sees the result, somehow. I am also curious why the substitution would not apply correctly,

Was this ever re-committed?

No, I never reproduced the issue and didn't have the time to keep chasing it down

Remove the lit substitutions as I can't work out why they were failing

MaskRay added inline comments.Jun 27 2023, 3:17 PM
llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-sgpr.s
7–49

<1.s => < 1.s. The same applies to >1.o

10

Use cmp for binary blobs.

MaskRay accepted this revision.Jun 27 2023, 3:17 PM
scott.linder edited the summary of this revision. (Show Details)
  • Add spaces around bash redirection operators
  • Use cmp for binary files
  • Update commit message to remove mention of lit substitutions
MaskRay accepted this revision.Jun 27 2023, 10:01 PM
This revision was automatically updated to reflect the committed changes.