Fixes a bug in the DAG combiner that eliminates the stores because it missed to inspect the address space of the pointers. %v = load %ptr_as1 // no chain side effect store %v, %ptr_as2 As well as store %v, %ptr_as1 store %v, %ptr_as2 Adds a test for above in X86.
Details
Diff Detail
Event Timeline
This patch looks legit to me.
I'll suggest you to put people that have been actively working on this part as reviewers. They might have some other comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17920 | just curious, is the code here formatted by clang-format? | |
llvm/test/CodeGen/X86/fs_gs_dag_combine.ll | ||
18 ↗ | (On Diff #343779) | copy_fs_diff and copy_gs_diff seem a little redundant to me: The address spaces are already different, why should we care about address? Can you elaborate a little more on the purpose of these two functions? |
What about DeadStoreElimination ? It can also merge stores.. maybe dse needs similar fix.
llvm/test/CodeGen/X86/fs_gs_dag_combine.ll | ||
---|---|---|
3 ↗ | (On Diff #343779) | very pedantic but maybe call this file dagcombine-dead-store.ll? That matches similar files in the folder. Then improve the comment here to explain that the address spaces are different and the stores shouldn't be eliminated? |
https://www.godbolt.org/z/aqncWfb4c seems to suggest maybe, though haven't debugged whether it's DSE or DAG combine again who does it.
If you show the IR for that example, it looks ok:
https://www.godbolt.org/z/55v7anfa3
I don't see any explicit references to address spaces within DSE itself, so I assume the underlying ValueTracking and/or Alias Analysis is ok here.
Can you add that example to the test file and commit that file as a preliminary step?
Please use "utils/update_llc_test_checks.py" if possible to auto-generate the FileCheck lines.
I've found the same bug in DAG combine, this time for store to store as somebody has already suspected above. I will add a fix + test this to this patch.
Extended the scope of this a little bit. The same bug was present for store to store opts as well.
Adds more tests and elaborates more in the comment.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17935 | You don't need the getPointerInfo, StoreSDNode directly provides getAddressSpace |
Code changes LG.
I'd still prefer for the tests to go in first, so we have a regression test record of the miscompiles. Let me know if the suggestion was missed or isn't clear.
I did miss this comment. Could you clarify, do you want me to commit the failing test, or a passing version with a FIXME?
@spatel meant the passing version (current trunk codegen) with a FIXME - you then rebase so this patch shows the codegen diffs - its easier to handle if you auto generate the checks with the update scripts.
Right. It would be nice to commit the baseline (current) output - it will look something like this:
define i32 @copy_fs_same() { ; CHECK-LABEL: copy_fs_same: ; CHECK: # %bb.0: ; CHECK-NEXT: movl 1, %eax ; CHECK-NEXT: retl %t0 = load i32, i32* inttoptr (i64 1 to i32*), align 4 store i32 %t0, i32 addrspace(257)* inttoptr (i64 1 to i32 addrspace(257)*), align 4 ret i32 %t0 }
One more test question - is it possible to add a test that exercises the indexed address code path?
Something like this might do it?
define void @output_fs_same_indexed(i32 %v, i32* %b) { ; CHECK-LABEL: output_fs_same: ; CHECK: # %bb.0: ; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax ; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ecx ; CHECK-NEXT: movl %eax, 168(%ecx) ; CHECK-NEXT: retl %p = getelementptr i32, i32* %b, i64 42 %pa = addrspacecast i32* %p to i32 addrspace(257)* store i32 %v, i32* %p, align 4 store i32 %v, i32 addrspace(257)* %pa, align 4 ret void }
If you update/rebase this patch after landing D102243, the test file should have just a few different CHECK lines.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17919 | Should this be weakened using isNoopAddrSpaceCast both(?) ways round? |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17919 | I am actually not sure about the _exact_ semantics of isNoopAddrSpaceCast(). Does it imply that if true, it is a complete nop w.r.t. the memory model? |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17919 | memcpy/memmove/memset libcall lowering at least relies on being able to lower calls with any pointers that can be no-op cast to address space 0 (see checkAddrSpaceIsValidForLibcall). CodeGenPrepare also peeks through them though haven't looked exactly how. |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17919 | My take on this would be to stay rather slightly conservative and fix this bug, which can lead to clear miscompiles, unless somebody jumps in and confirms the exact semantics of isNoopAddrSpaceCast(). The comment just says "is a nop". This doesn't directly mean to me that one can eliminate a store like that. Then OTOH, if the two stores are in any way different, then the cast should not be a nop either, so maybe yes? Should I add a TODO comment here maybe? |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17919 | I do not think this should check isNoopAddrSpaceCast. This is mostly a question of whether an addrspacecast changes the bitpattern of the pointer. The memory accesses may still have different properties and isn't related to these combines |
Is this ok pushing? There was an open question about isNoopAddrSpaceCast(). I would suggest somebody would look at that post this patch as a possible minor improvement?
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17919 | Makes sense. |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
17915 | No, do not add this todo. It does not make any sense. There is no addrspacecasting going on in this context |
Add an additional TODO comment about adding a isNoopAddrSpaceCast check