This is an archive of the discontinued LLVM Phabricator instance.

[AtomicExpandPass][AArch64] Promote xchg with floating-point types to integer ones
ClosedPublic

Authored by LemonBoy on May 27 2021, 2:36 AM.

Details

Summary

Follow the same strategy used for atomic loads/stores by converting the operands to equally-sized integer types.
This change prevents the atomic expansion pass from generating illegal LL/SC pairs when targeting AArch64: expand-atomicrmw-xchg-fp.ll would previously instantiate intrinsics such as llvm.aarch64.ldaxr.p0f32 that cannot be lowered.

Diff Detail

Event Timeline

LemonBoy created this revision.May 27 2021, 2:36 AM
LemonBoy requested review of this revision.May 27 2021, 2:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 27 2021, 2:36 AM

I don't really like unconditionally messing with the type of the atomic operation; not sure what kind of impact that will have.

Maybe it would make sense to add a new AtomicExpansionKind?

I don't really like unconditionally messing with the type of the atomic operation; not sure what kind of impact that will have.

Maybe it would make sense to add a new AtomicExpansionKind?

The impact is minimal and, according to the test diff, it's quite positive: it fixes the LL/SC lowering for ARM and AArch64 targets (Hexagon, the other target to make use of this expansion, is already integer-expanding the operands) and, if you consider the lower-to-cmpxchg case, pulling the bitcast out of the CAS loop makes the code slimmer at -O0.

I'd be against adding a new AtomicExpansionKind, the TODO comment about adding a TLI hook to precisely control the expansion serves the same purpose IMO.

The transform makes sense on targets that don't have atomic operations on floating-point registers, but that isn't all targets. In particular, the GPU targets have floating-point atomic operations, and bitcasting like this might get in the way of the natural lowering there.

If you think it makes sense to use a dedicated lowering hook, rather than extending AtomicExpansionKind, that's fine, I guess.

The transform makes sense on targets that don't have atomic operations on floating-point registers, but that isn't all targets. In particular, the GPU targets have floating-point atomic operations, and bitcasting like this might get in the way of the natural lowering there.

If you think it makes sense to use a dedicated lowering hook, rather than extending AtomicExpansionKind, that's fine, I guess.

I haven't seen any regression in the test suite, hence the unimplemented TODO comment. The same bitcasting is applied to atomic stores/loads and compare-exchange ops, if it ever becomes a problem it can be easily implemented at a later stage IMO.

efriedma accepted this revision.May 28 2021, 12:38 PM

LGTM

I looked, and apparently we do have an AMDGPU test for xchg; if it isn't impacted, that's fine, I guess.

This revision is now accepted and ready to land.May 28 2021, 12:38 PM
This revision was landed with ongoing or failed builds.May 28 2021, 11:57 PM
This revision was automatically updated to reflect the committed changes.