This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Separate R600 and GCN TableGen files
ClosedPublic

Authored by tstellar on May 2 2018, 1:17 PM.

Details

Summary

We now have two sets of generated TableGen files, one for R600 and one
for GCN, so each sub-target now has its own tables of instructions,
registers, ISel patterns, etc. This should help reduce compile time
since each sub-target now only has to consider information that
is specific to itself. This will also help prevent the R600
sub-target from slowing down new features for GCN, like disassembler
support, GlobalISel, etc.

Diff Detail

Event Timeline

tstellar created this revision.May 2 2018, 1:17 PM
kzhuravl added inline comments.May 2 2018, 1:23 PM
lib/Target/AMDGPU/Processors.td
14 ↗(On Diff #144924)

Aren't GCN processors defined in GCNProcessors.td? I do not see it being removed or modified in this change..

tstellar added inline comments.May 2 2018, 1:26 PM
lib/Target/AMDGPU/Processors.td
14 ↗(On Diff #144924)

This file isn't used at all, I think it was a rebase artifact. I can remove it.

arsenm added inline comments.May 3 2018, 4:28 AM
lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
141 ↗(On Diff #144924)

Should this be a separate class as well?

lib/Target/AMDGPU/AMDGPUSubtarget.h
66

Is it possible to avoid making these virtual?

tstellar added inline comments.May 3 2018, 9:19 AM
lib/Target/AMDGPU/AMDGPUSubtarget.h
66

I will look through this again and see if I can eliminate some of these virtual functions, but to get rid of all of them we have a few options:

  1. We could eliminate the AMDGPUCommonSubtarget super class and then in code shared between r600 and amdgcn (which is mostly IR passes and a few remaining classes like AMDGPUTargetLowering, AMDAsmPrinter, etc) do something like:
bool IsAmdHsaOs;
if (Triple.getArch() == Triple::amdgcn)
  IsAmdHsaOS = static_cast<SISubtarget>(Subtarget).isAmdHsaOS()
else
  IsAmdHsaOS = static_cast<R600Subtaget>(Subtarget).isAmdHsaOS();
  1. Remove subtarget checks from shared classes by refactoring code into r600/gcn specific classes.
tstellar updated this revision to Diff 148332.May 23 2018, 8:52 PM
tstellar marked an inline comment as done.

Removed unused Processors.td file and made all AMDGPUCommonSubtarget
functions non-virtual.

arsenm added inline comments.May 23 2018, 11:57 PM
lib/Target/AMDGPU/AMDGPUInstrInfo.cpp
25–26

Commented out code

lib/Target/AMDGPU/AMDGPUInstructions.td
49

Should probably rename this at some point

lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
127–130 ↗(On Diff #148332)

Why is there a difference here?

lib/Target/AMDGPU/AMDGPUSubtarget.h
207–208

Why isn't this SISubtarget/GCNSubtarget?

437–438

Why is this needed outside of GCN code?

lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
589

Can’t this be in a gcn class? I think all of this is for packed anyway

lib/Target/AMDGPU/R600ISelLowering.cpp
1583–1585

Probably should reject these, but that's a separate change

lib/Target/AMDGPU/R600IntrinsicInfo.cpp
23–24 ↗(On Diff #148332)

We can probably drop the whole class. We don't have very many intrinsic definitions left in the backend, and this code never really worked well to begin with

lib/Target/AMDGPU/SIInsertWaitcnts.cpp
928–929

Fix formatting

tstellar marked 8 inline comments as done.May 31 2018, 11:55 AM
tstellar added inline comments.
lib/Target/AMDGPU/AMDGPUSubtarget.h
207–208

I was planning to rename this as a follow on patch to avoid creating even more churn in this patch.

437–438

It's not. I've dropped the R600 implementation of this.

tstellar updated this revision to Diff 149335.May 31 2018, 11:56 AM
tstellar marked an inline comment as done.

Rebase patch after splitting some of the changes requested
into separate patches.

tstellar updated this revision to Diff 151267.Jun 13 2018, 4:05 PM

Rebase patch on latest ToT.

I've tried the updated version of the patch, although it did not apply cleanly. It also causes GPU hangs on my turks in piglit tests.

I've tried the updated version of the patch, although it did not apply cleanly. It also causes GPU hangs on my turks in piglit tests.

Do you have a test case?

I assume that there is no change in generated code intended for r600 (EG/CM).
These are the changes in piglit tests I noticed:

< 	MEM_RAT_CACHELESS STORE_RAW T0.X, T1.X, 1
---
> 	MEM_RAT_CACHELESS STORE_DWORD T0.X, T1.X

There are other changes wrt register allocation and packetizer, but this one looks the most suspicious. My turks is TS2 and STORE_DWORD is not defined in the ISA (STORE_RAW is the only allowed opcode for CACHELESS target). Checking cayman ISA STORE_DWORD is opcode 20 (vs. opc 2 for STORE_RAW), which is reserved on TS2. The instruction also lost the offset.

Now, there are tests for MEMRAT_CACHELESS stoers, and they pass so I guess there is another untested store path that got mixed between TS2 and TS3.
I can paste the .ll file if you're interested.

a quick update. running llc manually on the kernel .ll (dumped using CLOVER_DEBUG=llvm) produces correct assembly. Running it in clover generates incorrect code (dumped using CLOVER_DEBUG=native) and hangs GPU.

lib/Target/AMDGPU/AMDGPUFeatures.td
53

gi complains about blank line at the end of file here

lib/Target/AMDGPU/R600ISelLowering.cpp
237

git complains about whitespace error in this location

I added the below snippet to check whether the caymanISA feature gets initialized correctly:

@@ -415,7 +417,10 @@ R600Subtarget::R600Subtarget(const Triple &TT, StringRef GPU, StringRef FS,
   TLInfo(TM, initializeSubtargetDependencies(TT, GPU, FS)),
   DX10Clamp(false),
   InstrItins(getInstrItineraryForCPU(GPU)),
-  AS (AMDGPU::getAMDGPUAS(TT)) { }
+  AS (AMDGPU::getAMDGPUAS(TT)) {
+  fprintf(stderr, "R600 FEATURE STRING: %s\n", FS.data());
+  fprintf(stderr, "R600 Has Cayman ISA: %s\n", CaymanISA ? "YES" : "NO");
+}

As expected it randomly on occasion printed:

'-fp32-denormals' is not a recognized feature for this target (ignoring feature)
'-fp32-denormals' is not a recognized feature for this target (ignoring feature)
'-fp32-denormals' is not a recognized feature for this target (ignoring feature)
R600 FEATURE STRING: -fp32-denormals
R600 Has Cayman ISA: YES

running llc through valgrind produced flood of 'Conditional jump or move depends on uninitialised value(s)'
269 errors from 24 contexts. Initialzieng just CaymanISA in R600SUbtarget gets rid of most of them.

Now, there are tests for MEMRAT_CACHELESS stoers, and they pass so I guess there is another untested store path that got mixed between TS2 and TS3.
I can paste the .ll file if you're interested.

Yes, that would be helpful.

tstellar updated this revision to Diff 152572.Jun 22 2018, 6:50 PM

Rebase and fix some uninitialized variables in R600Subtarget.

running llc through valgrind produced flood of 'Conditional jump or move depends on uninitialised value(s)'
269 errors from 24 contexts. Initialzieng just CaymanISA in R600SUbtarget gets rid of most of them.

These should be fixed now, can you re-test?

arsenm added inline comments.Jun 26 2018, 12:44 AM
lib/Target/AMDGPU/AMDGPUFeatures.td
2

Missing header comment

lib/Target/AMDGPU/AMDGPUSubtarget.h
480–482

Why are these leftover as virtual?

lib/Target/AMDGPU/R600.td
2

Missing header comment

running llc through valgrind produced flood of 'Conditional jump or move depends on uninitialised value(s)'
269 errors from 24 contexts. Initialzieng just CaymanISA in R600SUbtarget gets rid of most of them.

These should be fixed now, can you re-test?

Fails to build:
llvm-tblgen: Unknown command line argument '-gen-tgt-intrinsic'. Try: '../../../bin/llvm-tblgen -help'
llvm-tblgen: Did you mean '-gen-tgt-intrinsic-impl'?
make[2]: *** [lib/Target/AMDGPU/CMakeFiles/AMDGPUCommonTableGen.dir/build.make:1730: lib/Target/AMDGPU/R600GenIntrinsics.inc.tmp] Error 1

Now, there are tests for MEMRAT_CACHELESS stoers, and they pass so I guess there is another untested store path that got mixed between TS2 and TS3.
I can paste the .ll file if you're interested.

Yes, that would be helpful.

https://people.freedesktop.org/~jvesely/llvm/

test cases 46 and 48 the "n-" and "new-" prefixed versions are the result of the previous iteration of this patch

running llc through valgrind produced flood of 'Conditional jump or move depends on uninitialised value(s)'
269 errors from 24 contexts. Initialzieng just CaymanISA in R600SUbtarget gets rid of most of them.

These should be fixed now, can you re-test?

Fails to build:
llvm-tblgen: Unknown command line argument '-gen-tgt-intrinsic'. Try: '../../../bin/llvm-tblgen -help'
llvm-tblgen: Did you mean '-gen-tgt-intrinsic-impl'?
make[2]: *** [lib/Target/AMDGPU/CMakeFiles/AMDGPUCommonTableGen.dir/build.make:1730: lib/Target/AMDGPU/R600GenIntrinsics.inc.tmp] Error 1

After fixing the build file as tblgen suggested (and few local fixes in my own patches) it builds OK and there are no piglit regressions on my turks.
I think this should land rather soon, with a bunch of cleanup follow ups. Having things (files, classes) that are prefixed R600, AMDGPU, AMDGPUCommon, GCN, AMDGCN, and SI is rather confusing.

tstellar updated this revision to Diff 153250.Jun 27 2018, 8:53 PM

Rebase and stop generating intrinsic info for R600, we don't need this.

Fails to build:
llvm-tblgen: Unknown command line argument '-gen-tgt-intrinsic'. Try: '../../../bin/llvm-tblgen -help'
llvm-tblgen: Did you mean '-gen-tgt-intrinsic-impl'?
make[2]: *** [lib/Target/AMDGPU/CMakeFiles/AMDGPUCommonTableGen.dir/build.make:1730: lib/Target/AMDGPU/R600GenIntrinsics.inc.tmp] Error 1

After fixing the build file as tblgen suggested (and few local fixes in my own patches) it builds OK and there are no piglit regressions on my turks.

IntrinsicInfo isn't needed any more, so I dropped this.

I think this should land rather soon, with a bunch of cleanup follow ups. Having things (files, classes) that are prefixed R600, AMDGPU, AMDGPUCommon, GCN, AMDGCN, and SI is rather confusing.

Ok, I can start working on this once this patch lands.

tstellar marked 3 inline comments as done.Jun 27 2018, 9:07 PM
tstellar updated this revision to Diff 153252.Jun 27 2018, 9:07 PM

Add missing headers to tablegen files and remove virtual functions
from AMDGPUSubtarget.

arsenm accepted this revision.Jun 28 2018, 12:00 AM

LGTM

lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
724–728

I would expect this to be a separate function, but not sure where this would go

This revision is now accepted and ready to land.Jun 28 2018, 12:00 AM
tstellar added inline comments.Jun 28 2018, 9:18 AM
lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
724–728

We can refactor AMDGPUMCInstLower.cpp so that this can be in its own function. I can work on this as one of the follow on clean ups.

This revision was automatically updated to reflect the committed changes.
dstenb added a subscriber: dstenb.Jul 5 2018, 7:59 AM

Hi! We encountered a UBSan runtime error after this was merged. I wrote a bug report about it: https://bugs.llvm.org/show_bug.cgi?id=38071.