This is an archive of the discontinued LLVM Phabricator instance.

[AliasAnalysis] Mark fences as not Mod'ing unescaped local values
Needs Review · Public

Authored by aeubanks on Jun 27 2023, 5:40 PM.

Details

Summary

Addresses a regression reported in D145210.

Unescaped local values cannot be Mod'd by fences: other threads either
cannot know the address (unescaped alloca), or it would be UB for another
thread to access the memory (noalias).
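
For readers without the diff open: judging from the summary and the function
names referenced in the review comments below, the change amounts to roughly
the following early exit in AAResults::getModRefInfo for FenceInst. This is a
sketch, not the exact patch; in particular, reaching the capture information
through AAQI.CI is an assumption based on how BasicAA queries it.

// Sketch only, not the exact diff under review.
// In AAResults::getModRefInfo(const FenceInst *S, const MemoryLocation &Loc,
//                             AAQueryInfo &AAQI):
if (Loc.Ptr &&
    AAQI.CI->isNotCapturedBeforeOrAt(getUnderlyingObject(Loc.Ptr), S))
  // No other thread can hold the address of a still-unescaped local object,
  // so the fence cannot be ordering a modification of it; treat the fence
  // as at most reading the location.
  return ModRefInfo::Ref;
return ModRefInfo::ModRef;  // conservative default for fences, as before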

Diff Detail

Event Timeline

aeubanks created this revision. · Jun 27 2023, 5:40 PM
Herald added a project: Restricted Project. · View Herald Transcript · Jun 27 2023, 5:40 PM
aeubanks requested review of this revision. · Jun 27 2023, 5:40 PM
efriedma added inline comments.
llvm/lib/Analysis/AliasAnalysis.cpp
513–518

The bit about the ModRef mask doesn't make sense in the rewritten comment.

518

"before" in isNotCapturedBeforeOrAt is not happens-before... so if the value is captured after the fence, operations could appear to happen on it before the fence. That might mean reordering read operations across the fence isn't actually legal? Not sure.

nikic added inline comments.
llvm/lib/Analysis/AliasAnalysis.cpp
517

isNotCapturedBeforeOrAt() implies isIdentifiedFunctionLocal().

aeubanks updated this revision to Diff 535442. · Jun 28 2023, 9:30 AM

address comments

perhaps @hboehm can comment on the legality of this?

Unfortunately, I'm not an LLVM expert, so let me try to answer in terms of
C++ semantics.
We can certainly have

atomic<int*> y(nullptr);

Thread 1:
{
  x = 17;
  atomic_thread_fence(memory_order::release);  // Address of x not yet taken.
  y.store(&x, memory_order::relaxed);
}

Thread 2:
int* tmp = y.load(memory_order::acquire);
if (tmp != nullptr) assert(*tmp == 17);

And that assertion should not fail. Without the fence, it may fail, so the
fence orders the store to x. And I think this is not an especially weird
use case.

Does that answer the question? Or did I misinterpret it?

Hans

The relevant scenario would be a bit more complicated, I think, since with this patch we return ModRefInfo::Ref for the fence. Basically, that means non-atomic loads can move across the fence, but stores and atomic ops can't. But consider transforming something like the following:

atomic<int*> y(nullptr);

// Thread 1:
x = 17;
z = x;
atomic_thread_fence(memory_order::release);  // Address of x not yet taken.
y.store(&x, memory_order::relaxed);
assert(z == 17);

// Thread 2:
int* tmp = y.load(memory_order::acquire);
if (tmp != nullptr) *tmp += 100;

To something like:

atomic<int*> y(nullptr);

// Thread 1:
x = 17;
atomic_thread_fence(memory_order::release);  // Address of x not yet taken.
z = x;
y.store(&x, memory_order::relaxed);
assert(z == 17);

// Thread 2:
int* tmp = y.load(memory_order::acquire);
if (tmp != nullptr) *tmp += 100;

The original is legal C++, but the modified version has undefined behavior, I think: once z = x is moved below the release fence, that load is no longer ordered before thread 2's *tmp += 100 by the fence/acquire synchronization, so the two accesses to x form a data race.

And actually, I guess this reasoning also applies to hoisting/sinking across arbitrary calls, since any function can contain a fence. So BasicAAResult::getModRefInfo for CallInst also needs to be fixed.

Thanks for the correction. Subject to my limited understanding of LLVM
conventions, that sounds correct to me.

The behavior of release/acquire is completely symmetric with respect to a
store before the release and a matching load after the acquire, vs. a load
before the release, and a store (that must not be seen by the load) after
the acquire. So the whole notion of letting a load move below the release
fence, but not a store, doesn't make sense to me. There are clearly machine
architectures that have fences specific to one or the other, but the C/C++
memory model, quite intentionally, does not.
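
To make the symmetry concrete, the second case described above (a load before the
release fence, a conflicting store after the acquire) looks like this; it is an
illustration added here, not part of the original reply, reusing the variable
names from the earlier examples:

atomic<int*> y(nullptr);
int x = 0;

// Thread 1:
int r = x;                                   // non-atomic load before the release fence
atomic_thread_fence(memory_order::release);
y.store(&x, memory_order::relaxed);
assert(r == 0);                              // must not observe thread 2's store

// Thread 2:
int* tmp = y.load(memory_order::acquire);
if (tmp != nullptr) *tmp = 42;

The fence/acquire pairing guarantees r == 0 because the load of x happens-before
thread 2's store. Letting that load sink below the fence would remove the
ordering and introduce a race, exactly parallel to the store case, which is the
point of the symmetry argument.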

reverse ping

I've been busy, probably won't get to this for a while