This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Disassembler: Added basic disassembler for AMDGPU target.
ClosedPublic

Authored by SamWot on Jan 29 2016, 6:59 AM.

Details

Summary

Changes:

  • Added disassembler project
  • Fixed all decoding conflicts in .td files
  • Added DecoderMethod=“NONE” option to Target.td that allows to disable decoder generation for an instruction
  • Created decoding functions for VS_32 and VReg_32 register classes.
  • Added stubs for decoding all register classes.
  • Added several tests for disassembler

Disassembler only supports:

  • VI subtarget
  • VOP1 instruction encoding
  • 32-bit register operands and inline constants

[Valery]

One of the point that requires to pay attention to is how decoder conflicts were resolved:

  • Groups of target instructions were separated by using different DecoderNamespace (SICI, VI, CI) using similar to AssemblerPredicate approach.
  • There were conflicts in IMAGE_<> instructions caused by two different reasons:
    1. dmask wasn’t specified for the output (fixed)
    2. There’re image instructions that differ only by the number of the address components but have the same encoding by the HW spec. The actual number of address components is determined by the HW at runtime using image resource descriptor starting from the VGPR encoded in an IMAGE instruction. This means that we should choose only one instruction from conflicting group to be the rule for decoder. I didn’t find the way to disable decoder generation for an arbitrary instruction and therefore made a onelinear fix to tablegen generator that would suppress decoder generation when DecoderMethod is set to “NONE”. This is a change that should be reviewed and submitted first. Otherwise I would need to specify different DecoderNamespace for every instruction in the conflicting group. I haven’t checked yet if DecoderMethod=“NONE” is not used in other targets.
    3. IMAGE_GATHER decoder generation is for now disabled and to be done later.

      [/Valery]

Diff Detail

Repository
rL LLVM

Event Timeline

SamWot updated this revision to Diff 46379.Jan 29 2016, 6:59 AM
SamWot retitled this revision from to [AMDGPU] Disassembler: Added basic disassembler for AMDGPU target..
SamWot updated this object.
SamWot added reviewers: tstellarAMD, arsenm.
SamWot added subscribers: vpykhtin, nhaustov, llvm-commits.
lib/Target/AMDGPU/Disassembler/LLVMBuild.txt
1 ↗(On Diff #46379)

Copy and paste error: s/AArch64/AMDGPU/

lib/Target/AMDGPU/Disassembler/Makefile
1 ↗(On Diff #46379)

Makefiles have been deleted, so you don't need to update this file.

lib/Target/AMDGPU/SIInstrInfo.td
3017 ↗(On Diff #46379)

This can be removed.

utils/TableGen/FixedLenDecoderEmitter.cpp
1733 ↗(On Diff #46379)

This change needs to be in a separate patch.

arsenm added inline comments.Jan 29 2016, 5:30 PM
lib/Target/AMDGPU/AMDGPU.td
306 ↗(On Diff #46379)

Do we need this? Ideally we would only use the names

lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
22 ↗(On Diff #46379)

This should be the first included file

40–74 ↗(On Diff #46379)

Can you move the definitions of these functions up to avoid needing declaring them separately?

Can they also be moved down to be below the includes for the generated files?

113 ↗(On Diff #46379)

It looks like this doesn't handle using one of the literal constants for an 64-bit/f64 operand, which does work.

lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
1 ↗(On Diff #46379)

This should have a -*- C++ -*_ added at the end of the first line

15 ↗(On Diff #46379)

Remove this extra comment line

24–26 ↗(On Diff #46379)

These should be alphabetized

vpykhtin added inline comments.Jan 31 2016, 3:48 AM
utils/TableGen/FixedLenDecoderEmitter.cpp
1733 ↗(On Diff #46379)

I found out isAsmParserOnly=1 does the job, so I'll remove this.

SamWot updated this revision to Diff 46525.Feb 1 2016, 6:01 AM
SamWot marked 8 inline comments as done.

Fixed some issues, reverted changes in core TableGen.

SamWot added a comment.Feb 1 2016, 6:09 AM

Last uploaded diff unfortunately contains new changes from LLVM master. Comparing it with previous diff will show changes from master.

lib/Target/AMDGPU/AMDGPU.td
306 ↗(On Diff #46379)

This is needed now. Otherwise LLVM generate invalid decoder methods for most instructions. It tries to match operand names with names of fileds that specify operands in instruction encoding. E.g. for V_MOV_B32 operand names are "dst" and "src0" and fields names are "vdst" and "src0". So LLVM fail to match destination operand ("vdst" vs. "dst") and doesn't generate decoding method for it. Same time it match "src0" operand and tries to decode it successfully.
So until we fix names of operands using this option is only reasonable choice.

lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
113 ↗(On Diff #46379)

Disassembler for now support only 32-bit operands. Support for 64-bit operands is planned to be next step.

tstellarAMD added inline comments.Feb 1 2016, 8:07 AM
lib/Target/AMDGPU/AMDGPU.td
306 ↗(On Diff #46525)

We should fix these name mismatches rather than enabling this bit. It is much safer to use the name based matching.

arsenm added inline comments.Feb 1 2016, 5:40 PM
lib/Target/AMDGPU/AMDGPU.td
306 ↗(On Diff #46525)

I think we should change the names to vdst. The way it is now I think follows the names of the operands as they appear in the ISA documentation, but the document isn't perfectly consistent. This is because a few instructions have a second sdst

SamWot added inline comments.Feb 1 2016, 9:10 PM
lib/Target/AMDGPU/AMDGPU.td
306 ↗(On Diff #46525)

Problem here is that renaming to vdst will affect all patterns that match this type of instructions. Also I'm not sure if this is realy needed because LLVM will still match operands positionalybut with help of operand names.

arsenm added inline comments.Feb 1 2016, 9:57 PM
lib/Target/AMDGPU/AMDGPU.td
306 ↗(On Diff #46525)

This shouldn't be difficult. It would be best to start out always using the names. It took a while to enable noNamedPositionallyEncodedOperands

SamWot updated this revision to Diff 46907.Feb 4 2016, 6:13 AM

Removed decodePositionallyEncodedOperands. Renamed $dst to $vdst operands for VOP instructions.

Removed decodePositionallyEncodedOperands. Renamed $dst to $vdst operands for VOP instructions.

The changes to the operand names should really be in a different patch so they can be tested separately.

SamWot edited edge metadata.Feb 12 2016, 6:29 AM
SamWot added a project: Restricted Project.

Can you rebase this on top of latest LLVM code.

SamWot updated this revision to Diff 48167.Feb 17 2016, 3:06 AM

Updated diff with latest LLVM master code

tstellarAMD requested changes to this revision.Feb 17 2016, 6:56 AM
tstellarAMD edited edge metadata.

The encoding of the dmask field is incorrect with this patch. You may want to split the dmask changes into a separate patch. Here is an example test case:

Good Encoding: image_sample v[0:2], 13, 0, 0, 0, 0, 0, 0, 0, v[2:3], s[12:19], s[0:3] ; F0800D00 00030002
Bad Encoding: image_sample v[0:2], 13, 0, 0, 0, 0, 0, 0, 0, v[2:3], s[12:19], s[0:3] ; F0800700 00030002

target triple = "amdgcn--"

define void @main([9 x <16 x i8>] addrspace(2)* byval, [17 x <16 x i8>] addrspace(2)* byval, [17 x <8 x i32>] addrspace(2)* byval, i32 addrspace(2)* byval, float inreg, i32 inreg, <2 x i32>, <2 x i32>, <2 x i32>, <3 x i32>, <2 x i32>, <2 x i32>, <2 x i32>, float, float, float, float, float, i32, i32, float, float) #0 {
main_body:

%22 = getelementptr [17 x <8 x i32>], [17 x <8 x i32>] addrspace(2)* %2, i64 0, i64 0
%23 = load <8 x i32>, <8 x i32> addrspace(2)* %22, align 32, !tbaa !0
%24 = bitcast [17 x <8 x i32>] addrspace(2)* %2 to [0 x <4 x i32>] addrspace(2)*
%25 = getelementptr [0 x <4 x i32>], [0 x <4 x i32>] addrspace(2)* %24, i64 0, i64 3
%26 = load <4 x i32>, <4 x i32> addrspace(2)* %25, align 16, !tbaa !0
%27 = call float @llvm.SI.fs.interp(i32 0, i32 0, i32 %5, <2 x i32> %7)
%28 = call float @llvm.SI.fs.interp(i32 1, i32 0, i32 %5, <2 x i32> %7)
%29 = bitcast float %27 to i32
%30 = bitcast float %28 to i32
%31 = insertelement <2 x i32> undef, i32 %29, i32 0
%32 = insertelement <2 x i32> %31, i32 %30, i32 1
%33 = call <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32> %32, <8 x i32> %23, <4 x i32> %26, i32 15, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0)
%34 = extractelement <4 x float> %33, i32 0
%35 = extractelement <4 x float> %33, i32 2
%36 = extractelement <4 x float> %33, i32 3
%37 = call i32 @llvm.SI.packf16(float %34, float 0.000000e+00)
%38 = bitcast i32 %37 to float
%39 = call i32 @llvm.SI.packf16(float %35, float %36)
%40 = bitcast i32 %39 to float
call void @llvm.SI.export(i32 15, i32 1, i32 1, i32 0, i32 1, float %38, float %40, float undef, float undef)
ret void

}

; Function Attrs: nounwind readnone
declare float @llvm.SI.fs.interp(i32, i32, i32, <2 x i32>) #1

; Function Attrs: nounwind readnone
declare <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32>, <8 x i32>, <4 x i32>, i32, i32, i32, i32, i32, i32, i32, i32) #1

; Function Attrs: nounwind readnone
declare i32 @llvm.SI.packf16(float, float) #1

declare void @llvm.SI.export(i32, i32, i32, i32, i32, float, float, float, float)

attributes #0 = { "ShaderType"="0" }
attributes #1 = { nounwind readnone }

!0 = !{!"const", null, i32 1}

This revision now requires changes to proceed.Feb 17 2016, 6:56 AM
SamWot updated this revision to Diff 48192.Feb 17 2016, 8:13 AM
SamWot edited edge metadata.

Fix dmask encoding.

This revision was automatically updated to reflect the committed changes.

I committed this with a few warning fixes.