This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] refactor ds instruction definitions (proposal)
ClosedPublic

Authored by vpykhtin on Jul 19 2016, 10:51 AM.

Details

Summary

Hi,

this is my first attempt on improve our td instruction definitions.

  1. All DS related definitions are moved to the new DSInstructions.td. This is done to reduce the number of definitions we currently have in a single td file.
  1. Multiclasses that define Pseudo, SI and VI instructions are removed. Instead there is DS_Pseudo instruction that is supposed to handle all CodeGen related things and carry some hint flags for MC layer things. It's counterpart DS_Real copies relevant data from origin DS_Pseudo. So typical definition consist of two stages:

// CodeGen part - DSInstructions.td

DS_INSTRUCTION : DS_PSEUDO<"DS_MNEMONIC", outs, ins, asmString>

// Assembler, disassembler, encoding part

// SIInstructions.td

DS_INSTRUCTION_si : DS_REAL < 0x123 /* SI opcode */, DS_INSTRUCTION /* origin pseudo op*/ >;

// VIInstructions.td

DS_INSTRUCTION_vi : DS_REAL < 0x456 /* VI opcode */, DS_INSTRUCTION /* origin pseudo op*/ >;

Having these things split allows to:

  1. Simplify codegen related definitions and pseudo ops
  2. No dummy "real" instructions. Previously if we had to define new VI instruction it created dummy SI instruction with 0 opcode.
  3. Workaround flags such like DisableSIDecoder/DisableVIDecoder can be removed.
  4. having all real instructions groupped the same way allows easily diff subtarget td files.
  5. we can change real instruction naming to prefixing it with subtarget tag (example: DS_INST_SI to SI_DS_INST) so that all subtarget opcodes are groupped together. This would allow to use direct translation table PseudoOp <-> RealOp. Currently there're log N translation map.
  6. I'm also thinking about the possibility to split subtarget generated tables so that there're no mixed subtarget instructions. This would allow to avoid subtarget predicate checks.

Currently I broke the following tests but I'll fix them tomorrow:

LLVM :: CodeGen/AMDGPU/amdgpu.private-memory.ll
LLVM :: CodeGen/AMDGPU/atomic_load_add.ll
LLVM :: CodeGen/AMDGPU/atomic_load_sub.ll
LLVM :: CodeGen/AMDGPU/extload.ll
LLVM :: CodeGen/AMDGPU/lds-oqap-crash.ll
LLVM :: CodeGen/AMDGPU/lds-output-queue.ll
LLVM :: CodeGen/AMDGPU/load-local-f32.ll
LLVM :: CodeGen/AMDGPU/load-local-f64.ll
LLVM :: CodeGen/AMDGPU/load-local-i1.ll
LLVM :: CodeGen/AMDGPU/load-local-i16.ll
LLVM :: CodeGen/AMDGPU/load-local-i32.ll
LLVM :: CodeGen/AMDGPU/load-local-i64.ll
LLVM :: CodeGen/AMDGPU/load-local-i8.ll
LLVM :: CodeGen/AMDGPU/local-atomics.ll
LLVM :: CodeGen/AMDGPU/local-memory.r600.ll
LLVM :: CodeGen/AMDGPU/private-memory-r600.ll
LLVM :: CodeGen/AMDGPU/store.ll

Diff Detail

Repository
rL LLVM

Event Timeline

vpykhtin updated this revision to Diff 64517.Jul 19 2016, 10:51 AM
vpykhtin retitled this revision from to [AMDGPU] refactor ds instruction definitions (proposal).
vpykhtin updated this object.
vpykhtin set the repository for this revision to rL LLVM.
vpykhtin added a project: Restricted Project.

I think in general this is a nice improvement. The main draw back is the duplicate definitions of the real instructions for SI/VI, but it seems like this is necessary in order to support the assembler/disassembler without more hacks. And I think overall this makes the .td files less complicated.

lib/Target/AMDGPU/DSInstructions.td
734–740 ↗(On Diff #64517)

There's a few place like this with lots of extra whitespace that could be cleaned up.

I think in general this is a nice improvement. The main draw back is the duplicate definitions of the real instructions for SI/VI, but it seems like this is necessary in order to support the assembler/disassembler without more hacks. And I think overall this makes the .td files less complicated.

I agree, I don't like the duplication either. On the other hand real definition part is really straightforward, I hope there will be no need to modify it often.

SamWot edited edge metadata.Jul 20 2016, 5:39 AM

I generally like this changes but there are some drawbacks:

  • Pseudo instruction ideally should not contain AsmStrings, AsmMatcherConverters and other fields that are used only in MC layer. But I can't imagine how to move those fields to real instruction without breaking whole idea of this change and without huge code duplication.
  • There would be a lot of problems later with other types of instructions like VOP (sdwa and dpp including) or MUBUF atomics.

I generally like this changes but there are some drawbacks:

  • Pseudo instruction ideally should not contain AsmStrings, AsmMatcherConverters and other fields that are used only in MC layer. But I can't imagine how to move those fields to real instruction without breaking whole idea of this change and without huge code duplication.

Right, generally its a good idea to split CodeGen and MCLayer data and left Pseudo instruction for CodeGen and Real instruction for MCLayer info, however it would probably create another level of indirection to remove asm duplication.

One of my thoughts was to do asm parsing on Pseudo instructions and translate them into Real upon encoding. This way having asm string in Pseudo would signficantly reduce asm parsing tables.

  • There would be a lot of problems later with other types of instructions like VOP (sdwa and dpp including) or MUBUF atomics.

I'm actually interested in collecting those problems beforehand.

vpykhtin updated this revision to Diff 65774.Jul 27 2016, 11:20 AM
vpykhtin edited edge metadata.
vpykhtin removed rL LLVM as the repository for this revision.
  • fixed test failures
  • removed extra spaces

Guys,

I would like to submit this, if there're no objections, to avoid potential merging. Lit tests are now passing, should we perform any additional testing on this?

SamWot accepted this revision.Jul 28 2016, 8:33 AM
SamWot edited edge metadata.
This revision is now accepted and ready to land.Jul 28 2016, 8:33 AM

Guys,

I would like to submit this, if there're no objections, to avoid potential merging. Lit tests are now passing, should we perform any additional testing on this?

Let me test it out on our graphics stack first.

tstellarAMD accepted this revision.Jul 28 2016, 11:20 AM
tstellarAMD edited edge metadata.

Tests pass. LGTM.

This revision was automatically updated to reflect the committed changes.