Hi,
This is an attempt to fix the failure of cmpxchg at -O0 on AArch64. The issue is that fast-regalloc puts spills into ldxr/stxr loops, which can continually clear the exclusive monitor and block progress.
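For reference, source along the following lines is the sort of thing that ends up as an expanded ldxr/stxr loop when built at -O0 for AArch64. This is only a sketch: try_claim and self_check are names invented for illustration and are not taken from the patch or its tests, and whether a given build actually livelocks depends on where fast-regalloc happens to place its spills.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reproducer shape: a strong, seq_cst compare-and-exchange.
// On AArch64 without LSE atomics this is lowered to a load-exclusive /
// store-exclusive retry loop.
bool try_claim(std::atomic<int> &slot, int expected, int desired) {
  // 'expected' is taken by value, so the caller's copy is not updated.
  return slot.compare_exchange_strong(expected, desired,
                                      std::memory_order_seq_cst);
}

// Illustrative usage: the first claim succeeds, the second must fail
// because the slot no longer holds 0.
void self_check() {
  std::atomic<int> slot{0};
  assert(try_claim(slot, 0, 1));
  assert(!try_claim(slot, 0, 2));
  assert(slot.load() == 1);
}
```

Any spill or reload that lands between the exclusive load and the exclusive store in the generated loop is what can clear the monitor and force a retry.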
I really don't like having to re-add the pseudo-instructions, since they're error-prone and ugly, but I think I've tried everything else (in approximate order of preference):
- Emit libcalls: they don't exist in libgcc, so even if we could herd all other cats we'd break compatibility with existing versions of GCC.
- Handle the expanded @llvm.aarch64.ldxr/stxr intrinsics to do away with any vregs (so fast-regalloc can't botch them), either in DAG or FastISel. I tried both, but the comparison in the IR always creates more vregs, so nothing changes.
- Change to a different register allocator at -O0. Too costly for compile-time performance and debugging.
- A usesCustomInserter Pseudo-inst. Not really any better, but doesn't work anyway because we still have the vregs immediately after DAG.
About the only silver lining is that we only need to handle a very small subset of possible atomic operations, and not even many cmpxchg operations (I think we can treat them all as seq_cst, strong exchanges without violating correctness).
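To spell out why treating everything as a strong seq_cst exchange should be safe: a strong exchange never fails spuriously, which a weak one is merely permitted to do, and seq_cst is at least as strong as any ordering that could have been requested. A rough source-level sketch of the idea follows; cmpxchg_conservative and example are hypothetical names used only for illustration, and the patch itself works on pseudo-instructions rather than on std::atomic.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: service any cmpxchg request, whatever ordering was
// asked for, with a strong seq_cst exchange. The requested orderings are
// accepted and deliberately ignored.
template <typename T>
bool cmpxchg_conservative(std::atomic<T> &obj, T &expected, T desired,
                          std::memory_order /*requested_success*/,
                          std::memory_order /*requested_failure*/) {
  return obj.compare_exchange_strong(expected, desired,
                                     std::memory_order_seq_cst,
                                     std::memory_order_seq_cst);
}

// Illustrative usage: a relaxed request still behaves correctly.
inline void example() {
  std::atomic<int> v{5};
  int expected = 5;
  bool ok = cmpxchg_conservative(v, expected, 7, std::memory_order_relaxed,
                                 std::memory_order_relaxed);
  assert(ok && v.load() == 7);
}
```

The cost is purely performance (stronger barriers than requested), which seems acceptable at -O0.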
If anyone has any other ideas, please speak up. I'm quite happy to give implementing them a try if there's any hope of avoiding this hack.
Otherwise, OK to commit?
Cheers.
Tim.
MBB.addSuccessor(LoadCmpBB) ?