Saves some add instructions on a couple Rage 2 shaders and is also a
prerequisite for a coming-soon change matching (register + immediate)
offsets.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
36 | When this patch and D128836 have both landed, can you add GISEL checks here please. | |
38 | Please don't use unnamed values in tests. (You can run the IR through opt -instnamer to name them automatically.) | |
40 | Do I understand correctly: in SDag this gep ends up as (add %2, @0) (not (add @0, %2)) because @0 is a constant so it gets canonicalized to the RHS? |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | Why would we want to check that? Looks like there's nothing specific to 32-bit constants in the change? |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | You have the call to Expand32BitAddress |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | Yes, which replicates the existing and already-tested logic. Will it eliminate the need for the test if I move the couples of SelectSMRDOffset() and Expand32BitAddress() calls into a separate function, something like SelectSMRDBaseOffset()? We would need it for matching (register + immediate) offsets anyway. Feels like that'd be a test case for what is actually not a special case. |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | It's another permutation of a somewhat fragile case (which also works differently in globalisel). There's no reason to skimp on tests |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | The if ((Addr.getValueType() != MVT::i32 || Addr->getFlags().hasNoUnsignedWrap())) bit in AMDGPUDAGToDAGISel::SelectSMRD() means for such a test we would need the nuw flag be set on the ISD::ADD in the offset, which we only do for constant getelementptr inbounds indexes that in turn would always go to the RHS operand of the ISD::ADD, and thus can't help reproducing the case. D15544 gives some reasoning for raising nuw for non-negative constant offsets. If we think it's worth it, I could try to do the same for non-constants known to be small enough, but it again would help if we do not require cultivating things specific to the constant address space be a prerequisite for this generic change. |
Match complex register SMRD offsets.
Could you improve the commit message? I don't know what "complex" means here. I think it is really about matching (constant base + register offset) in addition to (register base + constant offset).
Right, the 'complex' part came from when I had a hypothesis that it is the size of the operand that matters. Thanks for catching.
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | You can just add noinbounds to the getelementptr. It's not a special change and just testing the mechanics for what you already have here |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | Sorry, I'm not sure I follow. What is noinbounds and how it should be used with the getelementptr, exactly? |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | You can try this: %i3 = getelementptr inbounds [4 x <2 x float>], with the addrspace(6) pointer |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | It has already been explained in https://reviews.llvm.org/D129095#inline-1241075 that that wouldn't work. On top of that, we don't seem to support addrspace(6) constants in SDAG ISel currently; I get AMDGPU DAG->DAG Pattern Instruction Selection error whenever I try to use one. Regarding noinbounds, I don't see how your words explain what it is or how it can be useful for triggering the added code. Is that your comment still relevant? And just for the record, I don't understand the argument about the new Expand32BitAddress() call being a sufficient reason for another test case. We usually don't add tests for higher level code just because it employs more of some lower level functions. It's up to tests specifically for these lower level functions to make sure they can handle their intended uses cases, and as hopefully is obvious from https://reviews.llvm.org/D129381 that eliminates multiple calls to Expand32BitAddress(), this patch doesn't add any new use cases for that function. What has been changed is that we now support another permutation of operands in SelectSMRD(), but that is already covered with the added tests. addrspace(6) operands are no special in this regard, so adding more tests for them would mean we try to test combinations that are not special implementation-wise, which is again not how we write regression tests. And yes, unnecessary tests can actually be harmful -- they draw maintenance resources and attention and they hide cases that are really important. I believe some of our MC tests illustrate that. Overall, I feel somewhat confused about your two last comments. If you can reword what you have to suggest in a bit more explicit way, that will likely be a great help. Thanks. |
llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll | ||
---|---|---|
45 | The point isn't to test Expand32BitAddress but to test that it's used on this path. I guess if constants don't work for constant 32-bit that's a separate issue |
Should use explicit global-isel=0 for dag test