TODO
Diff Detail
- Repository
- rG LLVM Github Monorepo
Time | Test | |
---|---|---|
550 ms | x64 debian > Clang.Misc::backend-resource-limit-diagnostics.cl Script:
--
: 'RUN: at line 2'; not /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/13.0.0/include -nostdsysteminc -emit-codegen-only -triple=amdgcn-- /mnt/disks/ssd0/agent/llvm-project/clang/test/Misc/backend-resource-limit-diagnostics.cl 2>&1 | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/clang/test/Misc/backend-resource-limit-diagnostics.cl
| |
300 ms | x64 debian > LLVM.CodeGen/AMDGPU::32-bit-local-address-space.ll Script:
--
: 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -march=amdgcn -mcpu=bonaire -verify-machineinstrs < /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AMDGPU/32-bit-local-address-space.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck -check-prefixes=SI,FUNC,GFX7 /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AMDGPU/32-bit-local-address-space.ll
| |
290 ms | x64 debian > LLVM.CodeGen/AMDGPU::amdpal-callable.ll Script:
--
: 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -mtriple=amdgcn--amdpal -verify-machineinstrs < /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AMDGPU/amdpal-callable.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck -check-prefixes=GCN,SDAG -enable-var-scope /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AMDGPU/amdpal-callable.ll
| |
290 ms | x64 debian > LLVM.CodeGen/AMDGPU::atomic_optimizations_local_pointer.ll Script:
--
: 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -march=amdgcn -mtriple=amdgcn---amdgiz -amdgpu-atomic-optimizations=true -verify-machineinstrs < /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck -enable-var-scope -check-prefixes=GFX7LESS /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
| |
300 ms | x64 debian > LLVM.CodeGen/AMDGPU::divergence-at-use.ll Script:
--
: 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -march=amdgcn -mcpu=gfx900 -amdgpu-atomic-optimizations=true < /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AMDGPU/divergence-at-use.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AMDGPU/divergence-at-use.ll
| |
View Full Test Results (120 Failed) |
Event Timeline
All tests are now missing
llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp | ||
---|---|---|
32 ↗ | (On Diff #319026) | This function is pointless, just directly use isModuleEntryFunctionCC |
41–44 ↗ | (On Diff #319026) | This function is useless. Assert strings also don't need to end in \n |
56 ↗ | (On Diff #319026) | cast<>, don't dyn_cast and assert |
66 ↗ | (On Diff #319026) | Pointless comment |
68 ↗ | (On Diff #319026) | Pointless comment |
74–75 ↗ | (On Diff #319026) | Pointless comment |
82 ↗ | (On Diff #319026) | Extra private |
92 ↗ | (On Diff #319026) | Extra private |
93 ↗ | (On Diff #319026) | Typo unhanlded |
111 ↗ | (On Diff #319026) | .contains |
142 ↗ | (On Diff #319026) | Extra private |
233 ↗ | (On Diff #319026) | Copy of set unnecessary |
320 ↗ | (On Diff #319026) | Don't need all these newlines in assert strings |
376–379 ↗ | (On Diff #319026) | I think you're overcomplicating the CallGraph usage by ignoring most of what it gives you. You should be able to just iterate directly through the CallGraph to get functions reachable from the parent |
410–411 ↗ | (On Diff #319026) | This concept doesn't quite work for the IR. The same global can appear in multiple functions |
449 ↗ | (On Diff #319026) | isa<>, no \n |
479 ↗ | (On Diff #319026) | Should not be checking the function name. Should just skip all declarations |
492–495 ↗ | (On Diff #319026) | Return !Kernels.empty() |
502 ↗ | (On Diff #319026) | Probably should skip declarations. Also not sure about the linkage check |
506–509 ↗ | (On Diff #319026) | Return !empty() |
555–558 ↗ | (On Diff #319026) | Return !empty() |
561–571 ↗ | (On Diff #319026) | Don't understand the point of this stub function |
llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp | ||
---|---|---|
111 ↗ | (On Diff #319026) | The data structures, ValueMap<>, SmallPtrSet<> do not have member function - .contains(). W.r.t std::set<>, this member function is supported in C++20. |
376–379 ↗ | (On Diff #319026) | As far as I understand it, llvm CallGraph infrastructure does not provide any facility as such. Implementer needs to explicitly iterate the callees of the caller. |
410–411 ↗ | (On Diff #319026) | My understanding is that - scope of the shared variable is function/statement block scope. It is not available to access outside this scope. It is just that we implement it as global, just like how the local static variables are implemented in C/C++? Can you give an example of the use-case that you are claiming? |
502 ↗ | (On Diff #319026) | The linkage test is required to ignore the dynamic shared variables like the one defined as extern __shared__ int dy_sm[]; where size of dy_sm is not known at compile time, but is passed as one of the kernel execution configuration parameters at run time. |
561–571 ↗ | (On Diff #319026) | This is a driver function, it looks like a stub now, since implementation is not complete yet. Once this patch is accepted, next step is to (1) define kernel specific LDS layouts (2) create 2D offset table and (3) add new implicit argument. |
llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp | ||
---|---|---|
125–128 ↗ | (On Diff #319316) | Should just inline this function |
168 ↗ | (On Diff #319316) | Should just inline this function |
182 ↗ | (On Diff #319316) | Should just inline this function |
205 ↗ | (On Diff #319316) | Copy here, just directly use this in the for loop |
111 ↗ | (On Diff #319026) | There is an llvm::is_contained. Also why use std::set? You randomly switch set types around here |
410–411 ↗ | (On Diff #319026) | The IR has absolutely no concept of these scopes. The global variables have global scope and no restriction on where their uses can appear. Whether or not this directly corresponds to a direct language feature is unimportant. Some IPO transforms can push global variable references into other functions. The example is just two functions that refer to the same variable: @lds = ... define void @func0() { store i32 0, i32* @lds ret void } define void @func1() { store i32 0, i32* @lds ret void } |
502 ↗ | (On Diff #319026) | That's a function of it having 0 size, not the linkage |
561–571 ↗ | (On Diff #319026) | If it's going to be split, I'd rather see the full stack for the review |
llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp | ||
---|---|---|
376–379 ↗ | (On Diff #319026) | The CallGraph as a whole gives you the functions reachable from each other. I don't think you need to do a stack walk to find the callees. You don't need to care about which functions specifically call which, just that they are all connected |
Could we get some tests *and* a commit message that explains what this is supposed to do.
llvm/lib/Target/AMDGPU/AMDGPULowerLDSGlobal.cpp | ||
---|---|---|
312 | You should only need to do the use replacement, you aren't changing the types of the instructions so cloning/hacking on them shouldn't be needed | |
447–449 | Don't need this, the IR would have failed the verifier to get here | |
468–469 | Should not const_cast | |
518–519 | Too much auto for me | |
554 | Should use lowercase, period separator naming convention with an llvm.amdgcn prefix | |
566 | Should use lowercase, period separator naming convention with an llvm.amdgcn prefix | |
609 | This still needs to add in alignment padding | |
669 | Braces, Also can use range loop | |
743 | I don't see why you need to build your own stack. The call graph already found the reachable functions for you | |
777 | I think trying to handle callees is left for a later patch. Additionally, I think this should be the CallGraph analysis's responsibility to deal with | |
855–858 | The callgraph should already give this to you. Iterating the call graph should give you all of the functions you care about. You don't actually need to worry about which functions call which, since you need to touch every function in the SCC | |
987–988 | I would swap the order of these checks |
I generally would recommend against that but I guess you can use phabricator this way.
However, once people can figure out what this is actually supposed to do, they might effectively restart the entire review process if the design is questioned.
I say this because I have a hunch, or maybe a hope, about the intent of this patch. If it would be that, I'd very much like this to be a generic, non-AMDGPU pass. I might be wrong about what this does, and that is what I'd like to figure out rather sooner than later.
OK let's wait for the complete patch then. I will only going to push complete patch next time. But, it may take some time since there are some major hurdles to overcome.