This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/Analysis/
-
Analysis/
1/2
GlobalsModRef.cpp
-
test/Analysis/GlobalsModRef/
-
Analysis/
-
GlobalsModRef/
-
functions_without_nosync.ll
-
intrinsic_addressnottaken1.ll
-
intrinsic_addressnottaken2.ll
-
intrinsic_addresstaken.ll
-
nosync_nocallback.ll

Differential D123531

[GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics
ClosedPublic

Authored by jdoerfert on Apr 11 2022, 11:38 AM.

Download Raw Diff

Details

Reviewers

tra
arsenm
nikic
efriedma
asbirlea
rjmccall

Commits

rG9dc7da3f9cb4: [GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics

Summary

This is a long standing problem that resurfaces once in a while [0,1].

In 2008 we thought intrinsics do not read/write globals passed to them:
https://github.com/llvm/llvm-project/commit/d4133ac31535ce5176f97e9fc81825af8a808760
This is not correct given that intrinsics can synchronize threads and
cause effects to effectively become visible.

Fixes: https://github.com/llvm/llvm-project/issues/54851

[0] https://discourse.llvm.org/t/bug-gvn-memdep-bug-in-the-presence-of-intrinsics/59402
[1] https://reviews.llvm.org/D115302

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jdoerfert created this revision.Apr 11 2022, 11:38 AM

Herald added a project: Restricted Project. · View Herald TranscriptApr 11 2022, 11:38 AM

Herald added subscribers: bollu, hiraditya. · View Herald Transcript

jdoerfert requested review of this revision.Apr 11 2022, 11:38 AM

Herald added a project: Restricted Project. · View Herald TranscriptApr 11 2022, 11:38 AM

Herald added a subscriber: wdng. · View Herald Transcript

Clang format

Actually clang format

My understanding is that the assumption the code is trying to make is that intrinsics are special because they can't call other code in the current module.

In practice, this is wrong for two reasons:

Some intrinsics imply synchronization.
Some intrinsics can call arbitrary code.

So I don't think hasNoSync() is sufficient...

I'd appreciate a better comment about the invariants involved here.

(Looking more closely, I don't think this really overlaps with D115302.)

In D123531#3443483, @efriedma wrote:

My understanding is that the assumption the code is trying to make is that intrinsics are special because they can't call other code in the current module.

In practice, this is wrong for two reasons:

Some intrinsics imply synchronization.

Some intrinsics can call arbitrary code.

Right. I only checked 1. I also need to check nocallback for 2. Will update.

Add nocallback check and improve comment

Did you mean to drop the !F->isIntrinsic() check? I mean, I guess you could replace it with !F->isDeclaration(), but I think you need some protection against optnone function definitions, since we don't analyze their bodies.

In D123531#3443564, @efriedma wrote:

Did you mean to drop the !F->isIntrinsic() check? I mean, I guess you could replace it with !F->isDeclaration(), but I think you need some protection against optnone function definitions, since we don't analyze their bodies.

I dropped it on purpose but we might need more checks.
I don't believe intrinsics are special, or at least should be.

To me it looked like optnone is treated like a declaration already. Is that wrong? If it is not, I don't think we need the declaration check but at the end of the day we can be conservative in the presence of optnone.

Handle optnone conservatively

Harbormaster completed remote builds in B159075: Diff 422006.Apr 11 2022, 1:26 PM

The check looks right now.

Remaining concerns:

There's another use of isIntrinsic() a few lines earlier; do we need to fix that?
Is this going to have the same issue as D115302 where there are a bunch of intrinsics that currently don't have the expected markings?
Missing testcases.

Testcase?

In D123531#3443905, @arsenm wrote:

Testcase?

NOTE: I did not yet modify any tests but only tried out the reproducer of https://github.com/llvm/llvm-project/issues/54851.

In D123531#3443851, @efriedma wrote:

The check looks right now.

Great. I'll get on the remaining issues.

There's another use of isIntrinsic() a few lines earlier; do we need to fix that?

We do. I will make a helper and use it both times.

Is this going to have the same issue as D115302 where there are a bunch of intrinsics that currently don't have the expected markings?

People will need to annotate their intrinsics or use DefaultIntrinsics at some point. The latter have nosync and nocallback.

Missing testcases.

working on it.

Update tests and add new one.

@asbirlea, please take a look at the Intrinsics.td change. We can also apply it
prior to this patch separately.

Harbormaster completed remote builds in B159105: Diff 422048.Apr 11 2022, 4:29 PM

Fix Transforms/ObjCARC/basic.ll

Harbormaster completed remote builds in B159122: Diff 422075.Apr 11 2022, 5:48 PM

jdoerfert edited the summary of this revision. (Show Details)Apr 12 2022, 8:08 AM

Add test from https://reviews.llvm.org/D115302

jdoerfert mentioned this in D115302: GlobalsModRef should treat functions w/o nosync conservatively..Apr 12 2022, 8:15 AM

jdoerfert added reviewers: asbirlea, rjmccall.

Harbormaster completed remote builds in B159247: Diff 422238.Apr 12 2022, 9:01 AM

I'm not 100% sure if the issue underlying https://reviews.llvm.org/D115302 would be solved by this or not.

You've included the functions_without_nosync.ll test from D115302 which verifies that this patch fixes that issue, too. AFAICT, both patches are trying to address the same issue, but due to my lack of expertise, I was (overly?) cautious and trying to change as little as possible, and probably was doing it in a wrong place, too.

We need to document how nosync (or lack of it) interacts with other attributes describing side effects. E.g. interaction between IntrReadMem, IntrArgMemOnly and (lack of) nosync.

One concern raised on D115302 was the impact on back-ends that have not been upgraded to use DefaultIntrinsic. For them this patch would prevent some optimizations. This could potentially be an issue for users who may use such intrinsic in the hot path and they will have no easy way to work around that, as intrinsic properties are not under user control. Should we consider having a temporary hidden escape hatch option to revert to the old behavior?

In D123531#3446219, @tra wrote:

I'm not 100% sure if the issue underlying https://reviews.llvm.org/D115302 would be solved by this or not.

You've included the functions_without_nosync.ll test from D115302 which verifies that this patch fixes that issue, too. AFAICT, both patches are trying to address the same issue, but due to my lack of expertise, I was (overly?) cautious and trying to change as little as possible, and probably was doing it in a wrong place, too.

We need to document how nosync (or lack of it) interacts with other attributes describing side effects. E.g. interaction between IntrReadMem, IntrArgMemOnly and (lack of) nosync.

Yes. We need to properly make sync a first-class side-effect, get away from memory writes, etc.

One concern raised on D115302 was the impact on back-ends that have not been upgraded to use DefaultIntrinsic. For them this patch would prevent some optimizations. This could potentially be an issue for users who may use such intrinsic in the hot path and they will have no easy way to work around that, as intrinsic properties are not under user control. Should we consider having a temporary hidden escape hatch option to revert to the old behavior?

DefaultIntrinsic is available for a long time by now, people will not switch if they don't have to, nor will they revisit their intrinsics. It's still plenty of time till a release. Let's not add a hatch too early, give people a chance to realize and adjust.

jdoerfert edited the summary of this revision. (Show Details)Apr 12 2022, 10:46 AM

DefaultIntrinsic is available for a long time by now, people will not switch if they don't have to, nor will they revisit their intrinsics. It's still plenty of time till a release. Let's not add a hatch too early, give people a chance to realize and adjust.

I guess this is fine. But please post a note on Discourse.

llvm/include/llvm/IR/Intrinsics.td
428 ↗	(On Diff #422238)	In general, popping an autorelease pool deallocates objects, and deallocating objects can run arbitrary user code. So the new markings on int_objc_autoreleasePoolPop are wrong, I think. Marking up int_objc_autoreleasePoolPush should be fine, though. @rjmccall @ahatanak can you confirm?

In D123531#3446328, @efriedma wrote:

DefaultIntrinsic is available for a long time by now, people will not switch if they don't have to, nor will they revisit their intrinsics. It's still plenty of time till a release. Let's not add a hatch too early, give people a chance to realize and adjust.

I guess this is fine. But please post a note on Discourse.

Will do as this is ready.

llvm/include/llvm/IR/Intrinsics.td
428 ↗	(On Diff #422238)	I am waiting for ObjC people to look at this. I added the annotation based on two tests we otherwise fail, not based on the intrinsics (which I don't know). Though, I might have misunderstood: `llvm/test/Analysis/GlobalsModRef/intrinsic_addressnottaken1.ll`

tra added inline comments.Apr 12 2022, 12:25 PM

llvm/include/llvm/IR/Intrinsics.td
428 ↗	(On Diff #422238)	I think those tests do not care specifically about `int_objc_autoreleasePoolPush/Pop` and should be changed to use something less special, e.g. to `@llvm.stacksave/@llvm.stackrestore` as I've done in D115302.

Update tests as D115302 did, avoid objc changes

No objc changes. Took test changes from D115302.

LGTM

This revision is now accepted and ready to land.Apr 12 2022, 2:35 PM

This revision was landed with ongoing or failed builds.Apr 12 2022, 2:44 PM

Closed by commit rG9dc7da3f9cb4: [GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics (authored by jdoerfert). · Explain Why

This revision was automatically updated to reflect the committed changes.

jdoerfert added a commit: rG9dc7da3f9cb4: [GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics.

Harbormaster completed remote builds in B159321: Diff 422329.Apr 12 2022, 2:48 PM

@jdoerfert: A friendly nag to update intrinsic attribute docs. Looks like they didn't make it to the landed change.

andrew.w.kaylor added a subscriber: andrew.w.kaylor.Apr 12 2022, 5:29 PM

andrew.w.kaylor added inline comments.

llvm/lib/Analysis/GlobalsModRef.cpp
540	Should this also check onlyAccessesInaccessibleMemory() and onlyAccessesInaccessibleMemOrArgMem()? Same question for line 546 below.

In D123531#3446219, @tra wrote:

We need to document how nosync (or lack of it) interacts with other attributes describing side effects. E.g. interaction between IntrReadMem, IntrArgMemOnly and (lack of) nosync.

In D123531#3447021, @tra wrote:

@jdoerfert: A friendly nag to update intrinsic attribute docs. Looks like they didn't make it to the landed change.

You mean the above? That ain't something I can actually "just do". My opinion and the opinion of others are known to be different. What we need (in my opinion) is a major overhaul of our idea of side effects. We did a first step with termination but now we need to make "sync" and "write" different things. Even writing that up will take some time.

llvm/lib/Analysis/GlobalsModRef.cpp
540	Probably but unrelated. Feel free to propose a follow up.

In D123531#3447211, @jdoerfert wrote:

In D123531#3446219, @tra wrote:

We need to document how nosync (or lack of it) interacts with other attributes describing side effects. E.g. interaction between IntrReadMem, IntrArgMemOnly and (lack of) nosync.

In D123531#3447021, @tra wrote:

@jdoerfert: A friendly nag to update intrinsic attribute docs. Looks like they didn't make it to the landed change.

You mean the above?

Yes.

That ain't something I can actually "just do". My opinion and the opinion of others are known to be different.

You do have the power to change the way LLVM works. I believe you do have the power to document how things work now and save other folks who just want to create an intrinsic the hassle of finding out about the quirks the hard way, when something breaks.

What we need (in my opinion) is a major overhaul of our idea of side effects. We did a first step with termination but now we need to make "sync" and "write" different things. Even writing that up will take some time.

IMO, that's independent of documenting the current state of affairs. For what it's worth, I do agree with you on the nature of the problem we have describing intrinsics' side effects.

In D123531#3449199, @tra wrote:

In D123531#3447211, @jdoerfert wrote:

In D123531#3446219, @tra wrote:

We need to document how nosync (or lack of it) interacts with other attributes describing side effects. E.g. interaction between IntrReadMem, IntrArgMemOnly and (lack of) nosync.

In D123531#3447021, @tra wrote:

@jdoerfert: A friendly nag to update intrinsic attribute docs. Looks like they didn't make it to the landed change.

You mean the above?

Yes.

That ain't something I can actually "just do". My opinion and the opinion of others are known to be different.

You do have the power to change the way LLVM works. I believe you do have the power to document how things work now and save other folks who just want to create an intrinsic the hassle of finding out about the quirks the hard way, when something breaks.

OK. But I'm still not sure where you want me to document what exactly. What I "changed" here is:

Intrinsics can execute arbitrary code if they are not annotated with nocallback, and
Intrinsics can make effects of other "threads of execution" visible if they are not annotated with nosync.

For me that was already a given because we don't have the opposite written anywhere (as far as I know).
I could put the two points explicitly here: https://llvm.org/docs/LangRef.html#intrinsic-functions, would that be a good place?

In D123531#3454133, @jdoerfert wrote:

OK. But I'm still not sure where you want me to document what exactly. What I "changed" here is:

Intrinsics can execute arbitrary code if they are not annotated with nocallback, and

Does it imply that we can not trust Intr*Mem attributes? If that's the case, perhaps we should implement some sort of sanity check for the attributes to report nonsensical combinations.

If Intr*Mem take precedence, then it should be documented. E.g. if we have IntrArgMemOnly, then we're allowed to assume that the intrinsic will only touch specific arg only.

Also, currently Intrinsics.td says this:

//... 'has no other side effects'
// language of the IntrNoMem, IntrReadMem, IntrWriteMem, and IntrArgMemOnly
// intrinsic properties.`

This appears to contradict your assertion that lack of nocallback implies execution of arbitrary code which in turn implies arbitrary side effects.

I can't tell from reading Intrinsics.td what side effects we expect from an intrinsic with only IntrNoMem. Do we now need to specify IntrNoMem,nosync, nocallback in order to have a truly no-side-effects intrinsic?

Intrinsics can make effects of other "threads of execution" visible if they are not annotated with nosync.

Similarly here. nosync gives us a guarantee that no synchronization takes place, but without it, can we believe IntrReadMem that we do not write anything? Or do we need nosync for IntrReadMem to have an effect?

For me that was already a given because we don't have the opposite written anywhere (as far as I know).

The problem with no* attributes is that a person who looks at the intrinsics that do not have them would not know that lack of an attribute may affect the meaning of the explicitly specified ones.
It's hard to ask questions about something one may not even be aware of. Hence my harping on documenting the interaction between these attributes where developers will see it.

I could put the two points explicitly here: https://llvm.org/docs/LangRef.html#intrinsic-functions, would that be a good place?

LangRef does not seem to say anything specifically about intrinsics' attributes (though they do map to function attributes).
It also does not mention nocallback at all.

I was thinking of augmenting Intrinsics.td as that's where developers who implement intrinsics would look for the reference info.

readnone etc. surely encompass the entire behavior of the call, which of course includes any callbacks it might make. And synchronization needs to be understood as effectively a store, since synchronizing with another thread which has done a store guarantees the visibility of that store on this thread; and thus readonly etc. must imply nosync. So whatever nocallback means, it must be finer-grained, suitable for situations where the stronger attributes cannot be used.

I think it would be a huge shame if we had to go update a million testcases just to compensate for nocallback being added when there's already a stronger attribute on the intrinsic. That does not seem like a good use of anyone's time.

Sorry, that was a bit of a jumble of different ideas that I pressed into the form of a single message. Let me try to do better.

Effects summarization attributes like readnone and so on have to be understood as encompassing the entire behavior of the call. That of course includes any callbacks into visible code the call might make. So the absence of nocallback doesn't weaken other attributes in any way.

I think the right way of thinking about synchronization, effects-wise, is that it is effectively a potential store to any memory that might be accessible by another thread. Synchronization is therefore necessarily excluded by readonly, argmemonly, and so on. We can usually ignore the possibility of stores happening concurrently on other threads, but synchronization forces us to honor them, and that is indistinguishable, in terms of effects, from the store happening directly as part of the call.

I haven't read much about nocallback, so I don't entirely understand what the optimization goal is. If it's not implied by the existing effects attributes, then I guess we need to add it everywhere, and if that implies test churn, so be it. But I want to make sure that we're not modeling something useless in the optimizer just because we can imagine code that it accurately describes; ultimately, the IR is a tool that exists to be of use to us as compiler developers, and perfect accuracy by itself is not a goal.

In D123531#3454415, @rjmccall wrote:

readnone etc. surely encompass the entire behavior of the call, which of course includes any callbacks it might make.

Agreed, that is what it means now. What the code here assumed is that an intrinsic could *not* callback into the module, but it can.

And synchronization needs to be understood as effectively a store, since synchronizing with another thread which has done a store guarantees the visibility of that store on this thread; and thus readonly etc. must imply nosync.

Yes, at least that is the traditional interpretation which is leftover from the time we did not have nosync.
So, when we want to document the status quo, then we'd say nosync and read/write/only/none do *not* compose on the call level but only on the overall effect level.
Thus, read/write/only/none are "stronger" than "may-sync", if present. FWIW, we do not have a verifier check for that, e.g., that read-only/none has to go with nosync.
Instead of creating one we could also change that assumption but that is probably another story.
The current interpretation effectively means our annotations for certain intrinsics (e.g., gpu shuffle) are wrong as they are "must-sync" and readnone.

So whatever nocallback means, it must be finer-grained, suitable for situations where the stronger attributes cannot be used.

I don't think that is the right way to look at it. In contrast to nosync, nocallback is even now orthogonal/composable with memory attributes.
nocallback means it cannot invoke module code. It's not otherwise impacting the memory attributes, nor are they impacting nocallback beyond their particular specification.
Think of a readonly function/intrinsic which is not nocallback. It can totally call into the module and use a function there to read a global internal to the module that does not escape.
The function in the module doesn't even need to be readonly. It just needs to be readonly for the invocations by the readonly function/intrinsic.
Even readnone composes as it allows the function/intrinsic that is not nocallback to call into the module and get the address of a internal global, e.g., in order to return it.

I think it would be a huge shame if we had to go update a million testcases just to compensate for nocallback being added when there's already a stronger attribute on the intrinsic. That does not seem like a good use of anyone's time.

As I said above, nocallback and memory attributes compose already. No need to change anything as far as I can tell. The issue fixed in this patch did also not arise because of their combination. Memory attributes on functions w/ or w/o nocallback work the same way.
nosync is a different story. We should de-tangle it wrt. memory intrinsics and that would require test changes (among other things). However, if we could simply augment exiting read/write/only/none with nosync. That's mechanical. We are going to update literally ever test case for opaque pointers, before we did it for explicit load/store types, etc.
De-tangling the effects would hover open up the door for proper reasoning about non-escaped local variables in multi-threaded environments, something we basically cannot do right now. In the current situation "may-sync" prevents read-only/none. This is bad now because shuffle intrinics, for example, have (or would need to have) arbitrary side-effects even though they don't need that. If we untangle the attributes we get composing attributes. It would also ensure memory attributes are "local", as the rest of our attributes are. sync/nosync models effects from other threads of execution, there is no need to muddle that with the local read/write behavior of the function at hand.

LangRef does not seem to say anything specifically about intrinsics' attributes (though they do map to function attributes). It also does not mention nocallback at all.

I know :(, I lost track of it: https://reviews.llvm.org/D90275#2453426
I can try to write something, if anyone else wants, please feel free.

In D123531#3454455, @rjmccall wrote:

Sorry, that was a bit of a jumble of different ideas that I pressed into the form of a single message. Let me try to do better.

I replied to the first post. Let me also reply here ;)

Effects summarization attributes like readnone and so on have to be understood as encompassing the entire behavior of the call. That of course includes any callbacks into visible code the call might make. So the absence of nocallback doesn't weaken other attributes in any way.

Yes, the entire behavior of the call. To be precise, right now it is the entire "becomes visible as part of the call" behavior and not the "local behavior" of the call.

I think the right way of thinking about synchronization, effects-wise, is that it is effectively a potential store to any memory that might be accessible by another thread.

That is what we basically do now, unfortunately. It puts all possible effects by any possible other thread together, which is basically always "worthless".

Synchronization is therefore necessarily excluded by readonly, argmemonly, and so on.

No. We cannot derive either for "may-sync" functions but the user can for sure combine them in a sensible way. If all my shared state is the one argument I pass, argmemonly and "may-sync" work together fine. If I know there is no write effect by other threads that can become visible at this point, e.g., I only wake them up but they haven't done anything on shared state yet, I can use readonly with "may-sync", etc.

We can usually ignore the possibility of stores happening concurrently on other threads, but synchronization forces us to honor them, and that is indistinguishable, in terms of effects, from the store happening directly as part of the call.

Yes. Though with our modeling we can not express that certain effects are known to be not present. We can also not annotate functions based on their local effects and then use the context in which they are called to determine if there are other threads of execution that it might interact with.

I haven't read much about nocallback, so I don't entirely understand what the optimization goal is.

We use it in-tree for inter-procedural reachability analysis. So far, we mostly ask if a load/store may reach another load/store inter-procedurally. If I have nocallback I can get proper results because otherwise I have to assume that things like intrinsics and also known runtime functions are potentially calling any extern user function in my module and thereby reach the load/store in question.

If it's not implied by the existing effects attributes, then I guess we need to add it everywhere, and if that implies test churn, so be it.

Attributes should, in general, not imply other attributes except if we have good reason for it. They compose and that is as it should be.
As an example, I can annotate my malloc and vprintf calls in the OpenMP GPU runtime as nocallback (via __attribute__((leaf))) and then perform proper reachability analysis even though they are not read/write/only/none.

But I want to make sure that we're not modeling something useless in the optimizer just because we can imagine code that it accurately describes; ultimately, the IR is a tool that exists to be of use to us as compiler developers, and perfect accuracy by itself is not a goal.

I agree. But we need to avoid justifying things with the fact that we didn't used in the past. For the longest time we had no parallelism-aware optimizations at all, which allows us to lump synchronization and memory effects together w/o any drawback. We also did not dinstinguish other "non-memory effects", e.g., termination. The latter causes all sorts of hassle for people until we finally decoupled termination from "may-write". I mean, we literally introduced the llvm.sideffect intrinsic to pretend there was a write in a loop that may not actually have one. That kind of solution is not helpful either.

Revision Contents

Path

Size

llvm/

lib/

Analysis/

GlobalsModRef.cpp

16 lines

test/

Analysis/

GlobalsModRef/

functions_without_nosync.ll

38 lines

intrinsic_addressnottaken1.ll

16 lines

intrinsic_addressnottaken2.ll

16 lines

intrinsic_addresstaken.ll

10 lines

nosync_nocallback.ll

133 lines

Diff 422339

llvm/lib/Analysis/GlobalsModRef.cpp

Show First 20 Lines • Show All 505 Lines • ▼ Show 20 Lines	if (!F \|\| !F->isDefinitionExact()) {
continue;		continue;
}		}

FunctionInfo &FI = FunctionInfos[F];		FunctionInfo &FI = FunctionInfos[F];
Handles.emplace_front(*this, F);		Handles.emplace_front(*this, F);
Handles.front().I = Handles.begin();		Handles.front().I = Handles.begin();
bool KnowNothing = false;		bool KnowNothing = false;

		// Intrinsics, like any other synchronizing function, can make effects
		// of other threads visible. Without nosync we know nothing really.
		// Similarly, if `nocallback` is missing the function, or intrinsic,
		// can call into the module arbitrarily. If both are set the function
		// has an effect but will not interact with accesses of internal
		// globals inside the module. We are conservative here for optnone
		// functions, might not be necessary.
		auto MaySyncOrCallIntoModule = [](const Function &F) {
		return !F.isDeclaration() \|\| !F.hasNoSync() \|\|
		!F.hasFnAttribute(Attribute::NoCallback);
		};

// Collect the mod/ref properties due to called functions. We only compute		// Collect the mod/ref properties due to called functions. We only compute
// one mod-ref set.		// one mod-ref set.
for (unsigned i = 0, e = SCC.size(); i != e && !KnowNothing; ++i) {		for (unsigned i = 0, e = SCC.size(); i != e && !KnowNothing; ++i) {
if (!F) {		if (!F) {
KnowNothing = true;		KnowNothing = true;
break;		break;
}		}

if (F->isDeclaration() \|\| F->hasOptNone()) {		if (F->isDeclaration() \|\| F->hasOptNone()) {
// Try to get mod/ref behaviour from function attributes.		// Try to get mod/ref behaviour from function attributes.
if (F->doesNotAccessMemory()) {		if (F->doesNotAccessMemory()) {
// Can't do better than that!		// Can't do better than that!
} else if (F->onlyReadsMemory()) {		} else if (F->onlyReadsMemory()) {
FI.addModRefInfo(ModRefInfo::Ref);		FI.addModRefInfo(ModRefInfo::Ref);
if (!F->isIntrinsic() && !F->onlyAccessesArgMemory())		if (!F->onlyAccessesArgMemory() && MaySyncOrCallIntoModule(*F))
		andrew.w.kaylorUnsubmitted Not Done Reply Inline Actions Should this also check onlyAccessesInaccessibleMemory() and onlyAccessesInaccessibleMemOrArgMem()? Same question for line 546 below. andrew.w.kaylor: Should this also check onlyAccessesInaccessibleMemory() and onlyAccessesInaccessibleMemOrArgMem…
		jdoerfertAuthorUnsubmitted Done Reply Inline Actions Probably but unrelated. Feel free to propose a follow up. jdoerfert: Probably but unrelated. Feel free to propose a follow up.
// This function might call back into the module and read a global -		// This function might call back into the module and read a global -
// consider every global as possibly being read by this function.		// consider every global as possibly being read by this function.
FI.setMayReadAnyGlobal();		FI.setMayReadAnyGlobal();
} else {		} else {
FI.addModRefInfo(ModRefInfo::ModRef);		FI.addModRefInfo(ModRefInfo::ModRef);
if (!F->onlyAccessesArgMemory())		if (!F->onlyAccessesArgMemory())
FI.setMayReadAnyGlobal();		FI.setMayReadAnyGlobal();
if (!F->isIntrinsic()) {		if (MaySyncOrCallIntoModule(*F)) {
KnowNothing = true;		KnowNothing = true;
break;		break;
}		}
}		}
continue;		continue;
}		}

for (CallGraphNode::iterator CI = SCC[i]->begin(), E = SCC[i]->end();		for (CallGraphNode::iterator CI = SCC[i]->begin(), E = SCC[i]->end();
▲ Show 20 Lines • Show All 521 Lines • Show Last 20 Lines

llvm/test/Analysis/GlobalsModRef/functions_without_nosync.ll

This file was added.

				; RUN: opt -globals-aa -gvn -S < %s \| FileCheck %s
				; RUN: opt -aa-pipeline=basic-aa,globals-aa -passes='require<globals-aa>,gvn' -S < %s \| FileCheck %s
				;
				; Functions w/o `nosync` attribute may communicate via memory and must be
				; treated conservatively. Taken from https://reviews.llvm.org/D115302.

				target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
				target triple = "nvptx64-nvidia-cuda"

				@s = internal local_unnamed_addr addrspace(3) global i32 undef, align 4

				; CHECK-LABEL: @bar_sync
				; CHECK: store
				; CHECK: tail call void @llvm.nvvm.bar.sync(i32 0)
				; CHECK: load
				define dso_local i32 @bar_sync(i32 %0) local_unnamed_addr {
				store i32 %0, i32* addrspacecast (i32 addrspace(3)* @s to i32*), align 4
				tail call void @llvm.nvvm.bar.sync(i32 0)
				%2 = load i32, i32* addrspacecast (i32 addrspace(3)* @s to i32*), align 4
				ret i32 %2
				}

				declare void @llvm.nvvm.bar.sync(i32) #0

				; CHECK-LABEL: @barrier0
				; CHECK: store
				; CHECK: tail call void @llvm.nvvm.barrier0()
				; CHECK: load
				define dso_local i32 @barrier0(i32 %0) local_unnamed_addr {
				store i32 %0, i32* addrspacecast (i32 addrspace(3)* @s to i32*), align 4
				tail call void @llvm.nvvm.barrier0()
				%2 = load i32, i32* addrspacecast (i32 addrspace(3)* @s to i32*), align 4
				ret i32 %2
				}

				declare void @llvm.nvvm.barrier0() #0

				attributes #0 = { convergent nounwind }

llvm/test/Analysis/GlobalsModRef/intrinsic_addressnottaken1.ll

	; RUN: opt -aa-pipeline=basic-aa,globals-aa -passes='require<globals-aa>,gvn' -S < %s \| FileCheck %s			; RUN: opt -globals-aa -gvn -S < %s \| FileCheck %s

	target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	@deallocCalled = internal global i8 0, align 1			@deallocCalled = internal global i8 0, align 1

	; CHECK-LABEL: @main()			; CHECK-LABEL: @main()
	define dso_local i32 @main() {			define dso_local i32 @main() {
	entry:			entry:
	%tmp0 = call i8* @llvm.objc.autoreleasePoolPush() #1			%tmp0 = call i8* @llvm.stacksave() #1
	%tmp6 = load i8, i8* @deallocCalled, align 1			%tmp6 = load i8, i8* @deallocCalled, align 1
	%tobool = icmp ne i8 %tmp6, 0			%tobool = icmp ne i8 %tmp6, 0
	br i1 %tobool, label %if.else, label %if.end			br i1 %tobool, label %if.else, label %if.end

	if.else: ; preds = %entry			if.else: ; preds = %entry
	call void @__assert_fail() #0			call void @__assert_fail() #0
	unreachable			unreachable

	; CHECK-LABEL: if.end:			; CHECK-LABEL: if.end:
	; CHECK-NEXT: call void @llvm.objc.autoreleasePoolPop			; CHECK-NEXT: call void @llvm.stackrestore
	; CHECK-NOT: load i8, i8* @deallocCalled			; CHECK-NOT: load i8, i8* @deallocCalled
	if.end: ; preds = %entry			if.end: ; preds = %entry
	call void @llvm.objc.autoreleasePoolPop(i8* %tmp0)			call void @llvm.stackrestore(i8* %tmp0)
	%tmp7 = load i8, i8* @deallocCalled, align 1			%tmp7 = load i8, i8* @deallocCalled, align 1
	%tobool3 = icmp ne i8 %tmp7, 0			%tobool3 = icmp ne i8 %tmp7, 0
	br i1 %tobool3, label %if.end6, label %if.else5			br i1 %tobool3, label %if.end6, label %if.else5

	if.else5: ; preds = %if.end			if.else5: ; preds = %if.end
	call void @__assert_fail() #0			call void @__assert_fail() #0
	unreachable			unreachable

	if.end6: ; preds = %if.end			if.end6: ; preds = %if.end
	store i8 0, i8* @deallocCalled, align 1			store i8 0, i8* @deallocCalled, align 1
	ret i32 0			ret i32 0
	}			}

	declare i8* @llvm.objc.autoreleasePoolPush() #1			declare i8* @llvm.stacksave() #1
	declare void @llvm.objc.autoreleasePoolPop(i8*) #1			declare void @llvm.stackrestore(i8*) #1
	declare dso_local void @__assert_fail() #0			declare dso_local void @__assert_fail() #0

	attributes #0 = { noreturn nounwind }			attributes #0 = { noreturn nosync nounwind }
	attributes #1 = { nounwind }			attributes #1 = { nosync nounwind }

llvm/test/Analysis/GlobalsModRef/intrinsic_addressnottaken2.ll

	; RUN: opt -aa-pipeline=basic-aa,globals-aa -passes='require<globals-aa>,gvn' -S < %s \| FileCheck %s			; RUN: opt -globals-aa -gvn -S < %s \| FileCheck %s

	target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	@deallocCalled = internal global i8 0, align 1			@deallocCalled = internal global i8 0, align 1

	define internal i8* @_i_Associated__dealloc() {			define internal i8* @_i_Associated__dealloc() {
	entry:			entry:
	store i8 1, i8* @deallocCalled, align 1			store i8 1, i8* @deallocCalled, align 1
	ret i8* null			ret i8* null
	}			}

	; CHECK-LABEL: @main()			; CHECK-LABEL: @main()
	define dso_local i32 @main() {			define dso_local i32 @main() {
	entry:			entry:
	%tmp0 = call i8* @llvm.objc.autoreleasePoolPush() #1			%tmp0 = call i8* @llvm.stacksave() #1
	%tmp6 = load i8, i8* @deallocCalled, align 1			%tmp6 = load i8, i8* @deallocCalled, align 1
	%tobool = icmp ne i8 %tmp6, 0			%tobool = icmp ne i8 %tmp6, 0
	br i1 %tobool, label %if.else, label %if.end			br i1 %tobool, label %if.else, label %if.end

	if.else: ; preds = %entry			if.else: ; preds = %entry
	call void @__assert_fail() #0			call void @__assert_fail() #0
	unreachable			unreachable

	; CHECK-LABEL: if.end:			; CHECK-LABEL: if.end:
	; CHECK-NEXT: call void @llvm.objc.autoreleasePoolPop			; CHECK-NEXT: call void @llvm.stackrestore
	; CHECK-NOT: load i8, i8* @deallocCalled			; CHECK-NOT: load i8, i8* @deallocCalled
	if.end: ; preds = %entry			if.end: ; preds = %entry
	call void @llvm.objc.autoreleasePoolPop(i8* %tmp0)			call void @llvm.stackrestore(i8* %tmp0)
	%tmp7 = load i8, i8* @deallocCalled, align 1			%tmp7 = load i8, i8* @deallocCalled, align 1
	%tobool3 = icmp ne i8 %tmp7, 0			%tobool3 = icmp ne i8 %tmp7, 0
	br i1 %tobool3, label %if.end6, label %if.else5			br i1 %tobool3, label %if.end6, label %if.else5

	if.else5: ; preds = %if.end			if.else5: ; preds = %if.end
	call void @__assert_fail() #0			call void @__assert_fail() #0
	unreachable			unreachable

	if.end6: ; preds = %if.end			if.end6: ; preds = %if.end
	store i8 0, i8* @deallocCalled, align 1			store i8 0, i8* @deallocCalled, align 1
	ret i32 0			ret i32 0
	}			}

	declare i8* @llvm.objc.autoreleasePoolPush() #1			declare i8* @llvm.stacksave() #1
	declare void @llvm.objc.autoreleasePoolPop(i8*) #1			declare void @llvm.stackrestore(i8*) #1
	declare dso_local void @__assert_fail() #0			declare dso_local void @__assert_fail() #0

	attributes #0 = { noreturn nounwind }			attributes #0 = { noreturn nosync nounwind }
	attributes #1 = { nounwind }			attributes #1 = { nosync nounwind }

llvm/test/Analysis/GlobalsModRef/intrinsic_addresstaken.ll

	Show All 12 Lines
	entry:			entry:
	store i8 1, i8* @deallocCalled, align 1			store i8 1, i8* @deallocCalled, align 1
	ret i8* null			ret i8* null
	}			}

	; CHECK-LABEL: @main()			; CHECK-LABEL: @main()
	define dso_local i32 @main() {			define dso_local i32 @main() {
	entry:			entry:
	%tmp0 = call i8* @llvm.objc.autoreleasePoolPush() #1			%tmp0 = call i8* @llvm.stacksave() #1
	%tmp6 = load i8, i8* @deallocCalled, align 1			%tmp6 = load i8, i8* @deallocCalled, align 1
	%tobool = icmp ne i8 %tmp6, 0			%tobool = icmp ne i8 %tmp6, 0
	br i1 %tobool, label %if.else, label %if.end			br i1 %tobool, label %if.else, label %if.end

	if.else: ; preds = %entry			if.else: ; preds = %entry
	call void @__assert_fail() #0			call void @__assert_fail() #0
	unreachable			unreachable

	; CHECK-LABEL: if.end:			; CHECK-LABEL: if.end:
	; CHECK-NEXT: call void @llvm.objc.autoreleasePoolPop			; CHECK-NEXT: call void @llvm.stackrestore
	if.end: ; preds = %entry			if.end: ; preds = %entry
	call void @llvm.objc.autoreleasePoolPop(i8* %tmp0)			call void @llvm.stackrestore(i8* %tmp0)
	%tmp7 = load i8, i8* @deallocCalled, align 1			%tmp7 = load i8, i8* @deallocCalled, align 1
	%tobool3 = icmp ne i8 %tmp7, 0			%tobool3 = icmp ne i8 %tmp7, 0
	br i1 %tobool3, label %if.end6, label %if.else5			br i1 %tobool3, label %if.end6, label %if.else5

	if.else5: ; preds = %if.end			if.else5: ; preds = %if.end
	call void @__assert_fail() #0			call void @__assert_fail() #0
	unreachable			unreachable

	if.end6: ; preds = %if.end			if.end6: ; preds = %if.end
	store i8 0, i8* @deallocCalled, align 1			store i8 0, i8* @deallocCalled, align 1
	ret i32 0			ret i32 0
	}			}

	declare i8* @llvm.objc.autoreleasePoolPush() #1			declare i8* @llvm.stacksave() #1
	declare void @llvm.objc.autoreleasePoolPop(i8*) #1			declare void @llvm.stackrestore(i8*) #1
	declare dso_local void @__assert_fail() #0			declare dso_local void @__assert_fail() #0

	attributes #0 = { noreturn nounwind }			attributes #0 = { noreturn nounwind }
	attributes #1 = { nounwind }			attributes #1 = { nounwind }

llvm/test/Analysis/GlobalsModRef/nosync_nocallback.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes
				; RUN: opt -aa-pipeline=basic-aa,globals-aa -passes='require<globals-aa>,gvn' -S < %s \| FileCheck %s

				; Make sure we do not hoist the load before the intrinsic, unknown function, or
				; optnone function except if we know the unknown function is nosync and nocallback.

				@G1 = internal global i32 undef
				@G2 = internal global i32 undef
				@G3 = internal global i32 undef
				@G4 = internal global i32 undef

				define void @test_barrier(i1 %c) {
				; CHECK-LABEL: define {{[^@]+}}@test_barrier
				; CHECK-SAME: (i1 [[C:%.*]]) {
				; CHECK-NEXT: br i1 [[C]], label [[INIT:%.]], label [[CHECK:%.]]
				; CHECK: init:
				; CHECK-NEXT: store i32 0, ptr @G1, align 4
				; CHECK-NEXT: br label [[CHECK]]
				; CHECK: check:
				; CHECK-NEXT: call void @llvm.amdgcn.s.barrier()
				; CHECK-NEXT: [[V:%.*]] = load i32, ptr @G1, align 4
				; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[V]], 0
				; CHECK-NEXT: call void @llvm.assume(i1 [[CMP]])
				; CHECK-NEXT: ret void
				;
				br i1 %c, label %init, label %check
				init:
				store i32 0, ptr @G1
				br label %check
				check:
				call void @llvm.amdgcn.s.barrier()
				%v = load i32, ptr @G1
				%cmp = icmp eq i32 %v, 0
				call void @llvm.assume(i1 %cmp)
				ret void
				}

				define void @test_unknown(i1 %c) {
				; CHECK-LABEL: define {{[^@]+}}@test_unknown
				; CHECK-SAME: (i1 [[C:%.*]]) {
				; CHECK-NEXT: br i1 [[C]], label [[INIT:%.]], label [[CHECK:%.]]
				; CHECK: init:
				; CHECK-NEXT: store i32 0, ptr @G2, align 4
				; CHECK-NEXT: br label [[CHECK]]
				; CHECK: check:
				; CHECK-NEXT: call void @unknown()
				; CHECK-NEXT: [[V:%.*]] = load i32, ptr @G2, align 4
				; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[V]], 0
				; CHECK-NEXT: call void @llvm.assume(i1 [[CMP]])
				; CHECK-NEXT: ret void
				;
				br i1 %c, label %init, label %check
				init:
				store i32 0, ptr @G2
				br label %check
				check:
				call void @unknown()
				%v = load i32, ptr @G2
				%cmp = icmp eq i32 %v, 0
				call void @llvm.assume(i1 %cmp)
				ret void
				}

				define void @test_optnone(i1 %c) {
				; CHECK-LABEL: define {{[^@]+}}@test_optnone
				; CHECK-SAME: (i1 [[C:%.*]]) {
				; CHECK-NEXT: br i1 [[C]], label [[INIT:%.]], label [[CHECK:%.]]
				; CHECK: init:
				; CHECK-NEXT: store i32 0, ptr @G3, align 4
				; CHECK-NEXT: br label [[CHECK]]
				; CHECK: check:
				; CHECK-NEXT: call void @optnone()
				; CHECK-NEXT: [[V:%.*]] = load i32, ptr @G3, align 4
				; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[V]], 0
				; CHECK-NEXT: call void @llvm.assume(i1 [[CMP]])
				; CHECK-NEXT: ret void
				;
				br i1 %c, label %init, label %check
				init:
				store i32 0, ptr @G3
				br label %check
				check:
				call void @optnone()
				%v = load i32, ptr @G3
				%cmp = icmp eq i32 %v, 0
				call void @llvm.assume(i1 %cmp)
				ret void
				}

				define void @optnone() optnone nosync nocallback noinline {
				; CHECK: Function Attrs: nocallback noinline nosync optnone
				; CHECK-LABEL: define {{[^@]+}}@optnone
				; CHECK-SAME: () #[[ATTR0:[0-9]+]] {
				; CHECK-NEXT: ret void
				;
				ret void
				}

				; Here hoisting is legal and we use it to verify it will happen.
				define void @test_unknown_annotated(i1 %c) {
				; CHECK-LABEL: define {{[^@]+}}@test_unknown_annotated
				; CHECK-SAME: (i1 [[C:%.*]]) {
				; CHECK-NEXT: br i1 [[C]], label [[INIT:%.]], label [[DOTCHECK_CRIT_EDGE:%.]]
				; CHECK: .check_crit_edge:
				; CHECK-NEXT: [[V_PRE:%.*]] = load i32, ptr @G4, align 4
				; CHECK-NEXT: br label [[CHECK:%.*]]
				; CHECK: init:
				; CHECK-NEXT: store i32 0, ptr @G4, align 4
				; CHECK-NEXT: br label [[CHECK]]
				; CHECK: check:
				; CHECK-NEXT: [[V:%.*]] = phi i32 [ [[V_PRE]], [[DOTCHECK_CRIT_EDGE]] ], [ 0, [[INIT]] ]
				; CHECK-NEXT: call void @unknown_nosync_nocallback()
				; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[V]], 0
				; CHECK-NEXT: call void @llvm.assume(i1 [[CMP]])
				; CHECK-NEXT: ret void
				;
				br i1 %c, label %init, label %check
				init:
				store i32 0, ptr @G4
				br label %check
				check:
				call void @unknown_nosync_nocallback()
				%v = load i32, ptr @G4
				%cmp = icmp eq i32 %v, 0
				call void @llvm.assume(i1 %cmp)
				ret void
				}

				declare void @unknown()
				declare void @unknown_nosync_nocallback() nosync nocallback
				declare void @llvm.amdgcn.s.barrier()
				declare void @llvm.assume(i1 noundef)

This is an archive of the discontinued LLVM Phabricator instance.

[GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsicsClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 422339

llvm/lib/Analysis/GlobalsModRef.cpp

llvm/test/Analysis/GlobalsModRef/functions_without_nosync.ll

llvm/test/Analysis/GlobalsModRef/intrinsic_addressnottaken1.ll

llvm/test/Analysis/GlobalsModRef/intrinsic_addressnottaken2.ll

llvm/test/Analysis/GlobalsModRef/intrinsic_addresstaken.ll

llvm/test/Analysis/GlobalsModRef/nosync_nocallback.ll

[GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics
ClosedPublic