This is to match GCC's optimizations: https://gcc.godbolt.org/z/3odh9e7WE
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
60 ms | x64 debian > LLVM.Bindings/Go::go.test |
Event Timeline
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
30153 | Probably want to make sure I is in the same BB given the SelectionDAG restrictions? | |
30162 | Is this the same as ~C1->getValue() == C2->getValue()? | |
30936 | Why does it matter if the AND has more than one use? Or were you trying to check that the ATOMIC_LOAD has only one use of the Result 0? | |
30972 | Use the getSETCC function from this file. | |
30974 | Don't you need getZExtOrTrunc here? You aren't guaranteed to shift out the extended bits so you need to make sure they are 0 like the and would have done. | |
llvm/test/CodeGen/X86/atomic-bit-test.ll | ||
293 | Was -1 supposed to be -2? | |
333 | Was -2 supposed to be -3? |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
30899 | Verify that operand 1 is a constant too. | |
30900 | My big concern is that we rely on the checks done in IR holding until SelectionDAG. If that fails and a user hits this fatal error, how would they know how to fix their code? Did you give any thought about converting the patterns into a target specific intrinsic in IR? That would probably need to be a new hook in the AtomicExpandPass or a separate X86 pass. |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
30899 | This is not used to check constantency but to check the code is entered once only (C2 hasn't been assigned to a constant value). | |
30900 |
I'm not sure if we can simply use target specific intrinsic here. Atomic instruction has attributes like ordering that need special handling.
We mainly check 3 aspects here:
What do you think? |
This test triggers the fatal error
#include <atomic> std::atomic<int> x; void bar(); void baz(); void foo(int a, int b) { int y = std::atomic_fetch_or(&x, 8); if (y & 8 && a) bar(); if ((y & 8) && b) baz(); }
Break AND + CMP 0 fold for atomic RMW logic instructions. This fixes the problem in Craig's example. Thanks Craig!
This test fails
%"struct.std::atomic" = type { %"struct.std::__atomic_base" } %"struct.std::__atomic_base" = type { i32 } @x = global %"struct.std::atomic" zeroinitializer, align 4 define i32 @_Z3fooii(i32 %0, i32 %1) { %3 = atomicrmw or i32* getelementptr inbounds (%"struct.std::atomic", %"struct.std::atomic"* @x, i64 0, i32 0, i32 0), i32 8 seq_cst, align 4 %4 = tail call i32 @llvm.ctlz.i32(i32 %0, i1 false) %5 = and i32 %3, 8 %6 = icmp ne i32 %5, 0 %7 = icmp ne i32 %1, 0 %8 = select i1 %6, i1 %7, i1 false br i1 %8, label %9, label %10 9: ; preds = %2 tail call void @_Z3bazi(i32 %4) br label %10 10: ; preds = %9, %2 ret i32 %4 } declare i32 @llvm.ctlz.i32(i32, i1 immarg) declare void @_Z3bazi(i32)
CodeGenPrepare splits the basic block to despeculate the ctlz.
Add a new pass to hosit AND instructions that been split by other passes.
This fixes the failed example. Thanks a lot for Craig's helps!
llvm/lib/Target/X86/X86AtomicANDHoist.cpp | ||
---|---|---|
61 ↗ | (On Diff #398418) | It might not be safe to move the And if the other operand isn't a constant. There's no guarantee that the atomic instruction occurs after the source of the other operand. |
Add constant check for AND operand.
llvm/lib/Target/X86/X86AtomicANDHoist.cpp | ||
---|---|---|
61 ↗ | (On Diff #398418) | Makes sense. Thank you! |
pre-commit the tests and rebase so this patch shows the codegen diff?
llvm/lib/Target/X86/X86ISelLowering.h | ||
---|---|---|
1647 | Log? |
llvm/lib/Target/X86/X86ISelLowering.h | ||
---|---|---|
1647 | I assumed it meant Logic |
This breaks -O0. Fast isel consumes the And, then we fall back to SelectionDAG, but the And is gone so it triggers the fatal error.
I'm uncomfortable with the cross pass requirements this approach depends on. It seems very fragile.
Log?