This is an archive of the discontinued LLVM Phabricator instance.

X86: rework expansion of atomic instructions
Abandoned · Public

Authored by t.p.northover on Jun 16 2014, 8:37 AM.

Details

Reviewers
None
Summary

Hi all,

This is a patch very much along the lines of the recent ARM & AArch64 refactoring of atomics: it moves the complex control-flow (& naive instruction selection) logic out of the MachineInstr level and into IR proper. It has many of the same benefits:

+ More efficient code since the ISel phase can see when constants are used (for example).
+ Neater code

That said, the primary purpose is to support 128-bit atomic operations on x86_64 (hence the added test); the benefits above mostly fall out because expanding at the IR level happens to be the easiest way to do that.
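Roughly, the expansion turns each wide atomicrmw into a compare-exchange retry loop. Written as plain C++ for illustration (a sketch of the shape only; the patch emits the equivalent LLVM IR, the names here are illustrative, and it assumes a compiler with __int128 support):

#include <atomic>

// A wide atomicrmw becomes a load plus a compare-exchange retry loop,
// which ISel can then select to cmpxchg16b. Sketch only, not patch code.
__int128 fetch_add_128(std::atomic<__int128> &Addr, __int128 Inc) {
  __int128 Old = Addr.load(std::memory_order_relaxed);
  __int128 New;
  do {
    New = Old + Inc; // the RMW computation is now ordinary, optimizable IR
  } while (!Addr.compare_exchange_weak(Old, New,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed));
  return Old;
}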

It would have been good to completely reuse the existing LoadLinked pass, instead of copy/paste/adapt, but unfortunately the constraints and optimal forms don't *quite* match up between x86 and LL/SC architectures, so I've added it as an x86-specific pass.

The other detail that may need explaining is the removal of NOCMOV checks in atomic-minmax-i6432.ll: it turns out that any CPU with cmpxchg8b will have cmov, and they can't be controlled independently in LLVM. Since this test is fundamentally about cmpxchg8b, I decided to simply remove those tests.

Do the changes look OK, and are people who more frequently fiddle around with x86 happy with the direction?

Cheers.

Tim.


Event Timeline

t.p.northover retitled this revision from to X86: rework expansion of atomic instructions.
t.p.northover updated this object.
t.p.northover edited the test plan for this revision.

Ping?

Cheers.

Tim.

Re-ping?

Cheers.

Tim.

Hi Tim,

I really like this patch and in general it looks very good to me!

However, since I am not the code owner, you might want to ask for more feedback from Nadav or Jim.

Cheers,
Andrea

lib/Target/X86/X86AtomicExpandPass.cpp
108

Please remove the 'else' after return.

147

Please remove the 'else' after return.
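Both nits refer to the same pattern; for anyone unfamiliar with the convention, it looks like this (hypothetical code, not lines from the patch):

// LLVM style: an 'else' directly after a 'return' is redundant, so omit it.
static unsigned widthFor(bool IsWide) {
  if (IsWide)
    return 128;
  return 64; // rather than: else return 64;
}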

One comment on the caching and use of the target machine, but otherwise looks ok here. I haven't reviewed the actual atomics aspect or the testcase changes yet.

lib/Target/X86/X86AtomicExpandPass.cpp
88

Is there any way to get this information from TTI instead? Relying on the subtarget here is going to go badly soon.

Hi Eric,

+bool X86AtomicExpandPass::needsCmpXchgNb(llvm::Type *MemType) {
+ const X86Subtarget &Subtarget = TM->getSubtarget<X86Subtarget>();

+ if (!Subtarget.hasCmpxchg16b())

Is there any way to get this information from TTI instead? Relying on the subtarget here is going to go badly soon.

Should be fairly trivial, as long as TTI can query the subtarget. I'm fairly sure it can at the moment, but is that planned to go away too? I'll get on it anyway.

Cheers.

Tim.
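For reference, a plausible shape for the complete predicate, extrapolated from the quoted hunk (the actual patch may differ; the free-function form and variable names here are illustrative):

#include "X86Subtarget.h" // LLVM-internal header
#include "llvm/IR/Type.h"

// Sketch: 128-bit operations need cmpxchg16b, and 64-bit operations need
// cmpxchg8b when we're not in 64-bit mode.
static bool needsCmpXchgNb(const llvm::X86Subtarget &Subtarget,
                           llvm::Type *MemType) {
  unsigned OpWidth = MemType->getPrimitiveSizeInBits();
  if (OpWidth == 128)
    return Subtarget.hasCmpxchg16b();
  if (OpWidth == 64)
    return !Subtarget.is64Bit(); // i64 in 32-bit mode goes via cmpxchg8b
  return false;
}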

I have some small nits and one big question about ABI. Provided the answer to the question is "no" (i.e., the ABI is exactly the same), then LGTM.

lib/Target/X86/X86AtomicExpandPass.cpp
46–47

I'd like some comment about why only AtomicRMW and Store need to be handled here...

83–86

This has the potential to be an ABI-breaking change. Does your new pass fall back to libcalls in exactly the same places that the old code did? Otherwise, depending on how the libcalls are implemented, this could cause a break with code compiled previously.

196–200

I feel like this big switch would be a touch cleaner hoisted into a static function returning the NewVal. Thoughts?
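Something like this, perhaps (a sketch of the suggested hoisting; names are illustrative, not the patch's code):

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Support/ErrorHandling.h"
using namespace llvm;

// The per-operation switch becomes a static helper that just computes and
// returns NewVal from the loaded value and the atomicrmw operand.
static Value *performAtomicOp(AtomicRMWInst::BinOp Op, IRBuilder<> &Builder,
                              Value *Loaded, Value *Inc) {
  switch (Op) {
  case AtomicRMWInst::Xchg:
    return Inc;
  case AtomicRMWInst::Add:
    return Builder.CreateAdd(Loaded, Inc, "new");
  case AtomicRMWInst::Sub:
    return Builder.CreateSub(Loaded, Inc, "new");
  case AtomicRMWInst::And:
    return Builder.CreateAnd(Loaded, Inc, "new");
  case AtomicRMWInst::Or:
    return Builder.CreateOr(Loaded, Inc, "new");
  case AtomicRMWInst::Xor:
    return Builder.CreateXor(Loaded, Inc, "new");
  case AtomicRMWInst::Max:
    return Builder.CreateSelect(Builder.CreateICmpSGT(Loaded, Inc), Loaded,
                                Inc, "new");
  // ... Min/UMax/UMin follow the same select pattern ...
  default:
    llvm_unreachable("unknown atomicrmw operation");
  }
}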

Hi Chandler,

Thanks for looking at the patch!

This has the potential to be an ABI-breaking change. Does your new pass fall back to libcalls in exactly the same places that the old code did? Otherwise, depending on how the libcalls are implemented, this could cause a break with code compiled previously.

That's a difficult one. It *does* change when we fall back, but because of how it happens I think it's probably OK for a variety of reasons:

  1. The helpers aren't actually implemented on either Linux or OS X. GCC already inlines __int128 atomic ops, so it's arguably *fixing* an ABI bug.

  2. We already inline atomic_compare_exchange (on 128-bit types) to cmpxchg16b when available; this expands the inlining to other atomicrmw operations. Since these can be mixed, I think any ABI implementation we're compatible with already has to implement these calls in terms of cmpxchg16b rather than any global lock. So it probably doesn't break currently unbroken ABIs.

  3. The change is in the backend, and Clang has its own MaxAtomicInlineWidth override (a rough model follows this list). It seems a reasonable position that if you're using LLVM's atomic instructions, you're requesting inlining if possible.
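To make point 3 concrete, here's a tiny self-contained model of that fallback decision (illustrative names only, not Clang's actual code):

#include <cstdio>

// Hypothetical model: the target advertises the widest atomic it will
// inline, and the frontend emits an __atomic_* libcall for anything wider.
struct TargetInfoModel {
  unsigned MaxAtomicInlineWidth; // in bits
};

static bool useAtomicLibcall(const TargetInfoModel &TI, unsigned SizeBits) {
  return SizeBits > TI.MaxAtomicInlineWidth;
}

int main() {
  TargetInfoModel NoCX16{64};  // no cmpxchg16b: inline only up to 64 bits
  TargetInfoModel CX16{128};   // cmpxchg16b available: inline up to 128 bits
  std::printf("%d %d\n", useAtomicLibcall(NoCX16, 128),  // 1: libcall
              useAtomicLibcall(CX16, 128));              // 0: inlined
  return 0;
}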

So I think it's a positive change regardless. And if we do want to go back to helpers, we should probably do it in Clang (I'd be happy to get the patch together to do that if I've not managed to convince you).

Meanwhile, I'll update the patch for your other comments.

Cheers.

Tim.

t.p.northover abandoned this revision. Jul 8 2016, 2:01 PM