We now have two sets of generated TableGen files, one for R600 and one
for GCN, so each sub-target now has its own tables of instructions,
registers, ISel patterns, etc. This should help reduce compile time
since each sub-target now only has to consider information that
is specific to itself. This will also help prevent the R600
sub-target from slowing down new features for GCN, like disassembler
support, GlobalISel, etc.
Details
Diff Detail
- Repository
- rL LLVM
- Build Status
Buildable 17626 Build 17626: arc lint + arc unit
Event Timeline
lib/Target/AMDGPU/Processors.td | ||
---|---|---|
14 | Aren't GCN processors defined in GCNProcessors.td? I do not see it being removed or modified in this change.. |
lib/Target/AMDGPU/Processors.td | ||
---|---|---|
14 | This file isn't used at all, I think it was a rebase artifact. I can remove it. |
lib/Target/AMDGPU/AMDGPUSubtarget.h | ||
---|---|---|
66 | I will look through this again and see if I can eliminate some of these virtual functions, but to get rid of all of them we have a few options:
bool IsAmdHsaOs; if (Triple.getArch() == Triple::amdgcn) IsAmdHsaOS = static_cast<SISubtarget>(Subtarget).isAmdHsaOS() else IsAmdHsaOS = static_cast<R600Subtaget>(Subtarget).isAmdHsaOS();
|
Removed unused Processors.td file and made all AMDGPUCommonSubtarget
functions non-virtual.
lib/Target/AMDGPU/AMDGPUInstrInfo.cpp | ||
---|---|---|
31–32 | Commented out code | |
lib/Target/AMDGPU/AMDGPUInstructions.td | ||
49 | Should probably rename this at some point | |
lib/Target/AMDGPU/AMDGPUMCInstLower.cpp | ||
95–99 | Why is there a difference here? | |
lib/Target/AMDGPU/AMDGPUSubtarget.h | ||
207–208 | Why isn't this SISubtarget/GCNSubtarget? | |
421–423 | Why is this needed outside of GCN code? | |
lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp | ||
580 | Can’t this be in a gcn class? I think all of this is for packed anyway | |
lib/Target/AMDGPU/R600ISelLowering.cpp | ||
1563–1565 | Probably should reject these, but that's a separate change | |
lib/Target/AMDGPU/R600IntrinsicInfo.cpp | ||
24–25 | We can probably drop the whole class. We don't have very many intrinsic definitions left in the backend, and this code never really worked well to begin with | |
lib/Target/AMDGPU/SIInsertWaitcnts.cpp | ||
925–926 | Fix formatting |
I've tried the updated version of the patch, although it did not apply cleanly. It also causes GPU hangs on my turks in piglit tests.
I assume that there is no change in generated code intended for r600 (EG/CM).
These are the changes in piglit tests I noticed:
< MEM_RAT_CACHELESS STORE_RAW T0.X, T1.X, 1 --- > MEM_RAT_CACHELESS STORE_DWORD T0.X, T1.X
There are other changes wrt register allocation and packetizer, but this one looks the most suspicious. My turks is TS2 and STORE_DWORD is not defined in the ISA (STORE_RAW is the only allowed opcode for CACHELESS target). Checking cayman ISA STORE_DWORD is opcode 20 (vs. opc 2 for STORE_RAW), which is reserved on TS2. The instruction also lost the offset.
Now, there are tests for MEMRAT_CACHELESS stoers, and they pass so I guess there is another untested store path that got mixed between TS2 and TS3.
I can paste the .ll file if you're interested.
a quick update. running llc manually on the kernel .ll (dumped using CLOVER_DEBUG=llvm) produces correct assembly. Running it in clover generates incorrect code (dumped using CLOVER_DEBUG=native) and hangs GPU.
lib/Target/AMDGPU/AMDGPUFeatures.td | ||
---|---|---|
53 | gi complains about blank line at the end of file here | |
lib/Target/AMDGPU/R600ISelLowering.cpp | ||
238 | git complains about whitespace error in this location |
I added the below snippet to check whether the caymanISA feature gets initialized correctly:
@@ -415,7 +417,10 @@ R600Subtarget::R600Subtarget(const Triple &TT, StringRef GPU, StringRef FS, TLInfo(TM, initializeSubtargetDependencies(TT, GPU, FS)), DX10Clamp(false), InstrItins(getInstrItineraryForCPU(GPU)), - AS (AMDGPU::getAMDGPUAS(TT)) { } + AS (AMDGPU::getAMDGPUAS(TT)) { + fprintf(stderr, "R600 FEATURE STRING: %s\n", FS.data()); + fprintf(stderr, "R600 Has Cayman ISA: %s\n", CaymanISA ? "YES" : "NO"); +}
As expected it randomly on occasion printed:
'-fp32-denormals' is not a recognized feature for this target (ignoring feature) '-fp32-denormals' is not a recognized feature for this target (ignoring feature) '-fp32-denormals' is not a recognized feature for this target (ignoring feature) R600 FEATURE STRING: -fp32-denormals R600 Has Cayman ISA: YES
running llc through valgrind produced flood of 'Conditional jump or move depends on uninitialised value(s)'
269 errors from 24 contexts. Initialzieng just CaymanISA in R600SUbtarget gets rid of most of them.
Fails to build:
llvm-tblgen: Unknown command line argument '-gen-tgt-intrinsic'. Try: '../../../bin/llvm-tblgen -help'
llvm-tblgen: Did you mean '-gen-tgt-intrinsic-impl'?
make[2]: *** [lib/Target/AMDGPU/CMakeFiles/AMDGPUCommonTableGen.dir/build.make:1730: lib/Target/AMDGPU/R600GenIntrinsics.inc.tmp] Error 1
https://people.freedesktop.org/~jvesely/llvm/
test cases 46 and 48 the "n-" and "new-" prefixed versions are the result of the previous iteration of this patch
After fixing the build file as tblgen suggested (and few local fixes in my own patches) it builds OK and there are no piglit regressions on my turks.
I think this should land rather soon, with a bunch of cleanup follow ups. Having things (files, classes) that are prefixed R600, AMDGPU, AMDGPUCommon, GCN, AMDGCN, and SI is rather confusing.
IntrinsicInfo isn't needed any more, so I dropped this.
I think this should land rather soon, with a bunch of cleanup follow ups. Having things (files, classes) that are prefixed R600, AMDGPU, AMDGPUCommon, GCN, AMDGCN, and SI is rather confusing.
Ok, I can start working on this once this patch lands.
Add missing headers to tablegen files and remove virtual functions
from AMDGPUSubtarget.
LGTM
lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | ||
---|---|---|
716–718 | I would expect this to be a separate function, but not sure where this would go |
lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | ||
---|---|---|
716–718 | We can refactor AMDGPUMCInstLower.cpp so that this can be in its own function. I can work on this as one of the follow on clean ups. |
Hi! We encountered a UBSan runtime error after this was merged. I wrote a bug report about it: https://bugs.llvm.org/show_bug.cgi?id=38071.
Missing header comment