emitAtomic*PartWord unconditionally sign extends its result causing failures
with __c11_atomic_compare_exchange_strong & co due to an unsigned value
being compared to a sign extended value.
Details
- Reviewers
vkalintiris
Diff Detail
Event Timeline
Hi,
It's my understanding that this patch changes AtomicCmpSwap nodes to produce zero-extended results instead so that it compares properly against an 'i32 = AssertZext ..., ValueType:ch:i16'. What happens when it's an AssertSext instead? I think we're moving the problem from one case to the other.
I think the bug may be in the target-independent legalization passes rather than in our backend. Our initial selection DAG (with the irrelevant bits removed) is:
t6: i32 = AssertZext t2, ValueType:ch:i16 t7: i16 = truncate t6 t11: i16,i1,ch = AtomicCmpSwapWithSuccess<Volatile LDST2[@z]> t0, GlobalAddress:i32<i16* @z> 0, t7, t9 t12: i32 = zero_extend t11:1
After type-legalization it becomes:
t6: i32 = AssertZext t2, ValueType:ch:i16 t17: i32,i32,ch = AtomicCmpSwapWithSuccess<Volatile LDST2[@z]> t0, GlobalAddress:i32<i16* @z> 0, t6, t8 t19: i32 = and t17:1, Constant:i32<1>
which is fine so long as the comparison inside AtomicCmpSwapWithSuccess is performed as if it's an i16 comparison and is not affected by bits 16-31. Then after operation-legalization we have:
t6: i32 = AssertZext t2, ValueType:ch:i16 t20: i32,ch = AtomicCmpSwap<Volatile LDST2[@z]> t0, t27, t6, t8 t22: i32 = setcc t20, t6, seteq:ch
I think this step introduced the bug. Whereas the type-legalized code had an i16 comparison we now have an i32-comparison that tries to compare the zero-extended t6 against the sign-extended t20. I think SelectionDAGLegalize::ExpandNode() needs to inject a sign/zero extension into the operands of the SETCC it emits like so:
t6: i32 = AssertZext t2, ValueType:ch:i16 t20: i32,ch = AtomicCmpSwap<Volatile LDST2[@z]> t0, t27, t6, t8 t40: i32 = zero_extend_inreg t6 t41: i32 = zero_extend_inreg t20 t22: i32 = setcc t41, t40, seteq:ch
Assuming this is correct so far, there's two things I'm not sure of:
- Is AtomicCmpSwap's result defined to be sign/zero/any extended? It's probably any-extended but I don't see any documentation saying this.
- How do other targets avoid this bug? Are they just lucky or is there something they know that we don't?
test/CodeGen/Mips/atomic.ll | ||
---|---|---|
350 | Shouldn't we consume the i1 result? It's my understanding that the comparison used to produce the success result is comparing the sign-extended result of the AtomicCmpSwap and with the zero-extended cmp operand. The inconsistency in the extensions causes our GPR-width comparisons to give incorrect results. |
Abandoned in favor of http://reviews.llvm.org/D18241 . We have to keep the sign extension for consistency with 32bit on 64bit, as ll sign extends its result. Other targets appear to be zero extending their results, so any existing bug there is probably latent.
Shouldn't we consume the i1 result? It's my understanding that the comparison used to produce the success result is comparing the sign-extended result of the AtomicCmpSwap and with the zero-extended cmp operand. The inconsistency in the extensions causes our GPR-width comparisons to give incorrect results.