Page MenuHomePhabricator

[IR] Adds mustprogress as a LLVM IR attribute
Needs ReviewPublic

Authored by atmnpatel on Aug 5 2020, 8:43 PM.

Details

Summary

This adds the LLVM IR attribute mustprogress as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress.

Diff Detail

Event Timeline

atmnpatel created this revision.Aug 5 2020, 8:43 PM
atmnpatel requested review of this revision.Aug 5 2020, 8:43 PM
atmnpatel updated this revision to Diff 283478.

Removed extra file.

atmnpatel edited the summary of this revision. (Show Details)Aug 5 2020, 8:46 PM
atmnpatel edited the summary of this revision. (Show Details)
jdoerfert added subscribers: RalfJung, regehr, pekka and 5 others.

This was quick, we need to write an RFC now ;). For this we should probably write the lang ref part too.
We should also mention somewhere this is the first step to properly fix PR965 and make the llvm.sideeffect obsolete (for this use case).

Adding some people that should be aware.

Harbormaster completed remote builds in B67245: Diff 283478.
uenoku added a subscriber: uenoku.Aug 5 2020, 11:14 PM

I am going solely off the description here as I am not familiar with LLVM internals. Looking forward to the LangRef update. :)

Great to see some progress (heh ;) is made on this, thanks Johannes! One thing made me curious though: if I understand correctly "noprogress" is a somewhat strange attribute in that *not* having this attribute promises that the function *will* "make progress"? That's the opposite from other attributes like "nounwind". Is it not possible to instead have a "progress" or "nodiverge" attribute which promises that the function *will* make progress / will not diverge (in the sense of running infinitely without any side-effect)? I imagine the attribute could be added by default when loading old bitcode to preserve semantics (but dropping the attribute is safe so this would only be crucial for performance, not for correctness).

Also, in terms of semantics (but this might be more appropriate for the LangRef PR), is it legal for a "progress" and "noprogress" function to mutually recursively call each other infinitely? As in, does not having "noprogress" promise that every call to this function will return or have infinitely many side-effects, or does not having "noprogress" just promise that *this function* and all the no-"noprogress" functions it calls will not diverge? (You can see here why I think the attribute should be inverted in its meaning. ;)

nikic added a subscriber: nikic.Aug 6 2020, 2:13 AM

I am going solely off the description here as I am not familiar with LLVM internals. Looking forward to the LangRef update. :)

Great to see some progress (heh ;) is made on this, thanks Johannes! One thing made me curious though: if I understand correctly "noprogress" is a somewhat strange attribute in that *not* having this attribute promises that the function *will* "make progress"? That's the opposite from other attributes like "nounwind". Is it not possible to instead have a "progress" or "nodiverge" attribute which promises that the function *will* make progress / will not diverge (in the sense of running infinitely without any side-effect)? I imagine the attribute could be added by default when loading old bitcode to preserve semantics (but dropping the attribute is safe so this would only be crucial for performance, not for correctness).

Right, the "direction" of the attribute is what we need to discuss now. This was a first try, renaming it now is simple if we decide to do so. I suggested to start with this version as it shouldn't cause regressions for C/C++, though, I acknowledge the opposite attribute might at first be the "safer"/"saner" choice. Either should do the trick after a while given that all optimizations need to preserve/handle attribute already, you cannot just drop them and so noprogress should be sufficient.

Also, in terms of semantics (but this might be more appropriate for the LangRef PR), is it legal for a "progress" and "noprogress" function to mutually recursively call each other infinitely? As in, does not having "noprogress" promise that every call to this function will return or have infinitely many side-effects, or does not having "noprogress" just promise that *this function* and all the no-"noprogress" functions it calls will not diverge? (You can see here why I think the attribute should be inverted in its meaning. ;)

Hm... I will need to think about this. Given that we want to mix C/C++ with Rust on IR-level, right?

Either should do the trick after a while given that all optimizations need to preserve/handle attribute already, you cannot just drop them and so noprogress should be sufficient.

Ah, I thought so far attributes were always safe to drop. If that is not the case, what remains is the possible confusion on our side when having to talk about "noprogress" and no-"noprogress" functions, and the asymmetry with most (but seemingly not all) existing attributes.

Hm... I will need to think about this. Given that we want to mix C/C++ with Rust on IR-level, right?

That would be an example, yes -- cross-language LTO is used e.g. by Firefox. From the normative perspective, neither the C++ spec nor the Rust spec say what happens when a C++ and Rust function form a mutually infinite recursion. My personal preference is to allow such behavior, i.e., infinite recursion is only UB if *all* involved functions are no-"noprogress". But that is mostly my personal bias against making things UB, I cannot say how this affects analyses/optimizations.

(I am assuming here that stack overflows are already handled e.g. by a guard page, which means that in pure Rust code, infinite recursion is well-defined, and this must be handled appropriately by optimizations.)

fhahn added a subscriber: fhahn.Aug 6 2020, 9:44 AM

Either should do the trick after a while given that all optimizations need to preserve/handle attribute already, you cannot just drop them and so noprogress should be sufficient.

Ah, I thought so far attributes were always safe to drop. If that is not the case, what remains is the possible confusion on our side when having to talk about "noprogress" and no-"noprogress" functions, and the asymmetry with most (but seemingly not all) existing attributes.

Just FYI, convergent is one that cannot be dropped. Also naked, nobuiltin, noduplicate, noimplicitfloat, null_pointer_is_valid, ... and some of them have to survive inlining already as well.

Do we need a well-formedness check at Verifier.cpp stating that noprogress and willreturn cannot exist together?

Do we need a well-formedness check at Verifier.cpp stating that noprogress and willreturn cannot exist together?

Let's first create the Lang Ref patch before we dive into details here. Generally, we will need verifier checks, yes. FWIW, noprogress is unrelated to willreturn and noreturn though.

Let's first create the Lang Ref patch before we dive into details here. Generally, we will need verifier checks, yes. FWIW, noprogress is unrelated to willreturn and noreturn though.

IIUC, a function makes progress if the function has willreturn - wouldn't it make noprogress give no information?
It isn't contradiction, but about whether noprogress brings useful information or not, though.

It isn't contradiction, but about whether noprogress brings useful information or not, though.

The presence of "noprogress" doesn't bring much information, but its absence does.

Even if we keep the polarity, maybe it would be better to call it "maybenoprogress" or "maybeprogress"? After all, "noprogress" sounds like it asserts that the function will not make progress, when really the flag that that the function *is permitted to make no progress* (unlike functions without this attribute, which have to make progress).

ychen added a subscriber: ychen.Aug 11 2020, 11:26 AM

It isn't contradiction, but about whether noprogress brings useful information or not, though.

The presence of "noprogress" doesn't bring much information, but its absence does.

It depends on how you look at it:
From the perspective of optimizations we can do, sure absence is the one that enables more.
From the perspective of source behavior, it just distinguishes between two alternatives, so both option w/ and w/o flag have the same information value.

Even if we keep the polarity, maybe it would be better to call it "maybenoprogress" or "maybeprogress"? After all, "noprogress" sounds like it asserts that the function will not make progress, when really the flag that that the function *is permitted to make no progress* (unlike functions without this attribute, which have to make progress).

That is fair. maybe... or mayprogress or mayloop or mayspin? Naming things is hard ...

Let's first create the Lang Ref patch before we dive into details here. Generally, we will need verifier checks, yes. FWIW, noprogress is unrelated to willreturn and noreturn though.

IIUC, a function makes progress if the function has willreturn - wouldn't it make noprogress give no information?
It isn't contradiction, but about whether noprogress brings useful information or not, though.

So this is a may/must difference and these attributes also do not perfectly match. However:
willreturn implies must-make-progress (as it cannot loop), which then allows you to remove this attribue (maybe-no-progress) as it actually will make progress.
Even when willreturn is not applicable you can have must-make-progress. It basically is the attribute we implicitly assume right now. Though I don't have a reason handy why you would derive the must-make-progress and remove the may-... version. If the user/input gives you both, that is a different story though.

It depends on how you look at it:
From the perspective of optimizations we can do, sure absence is the one that enables more.
From the perspective of source behavior, it just distinguishes between two alternatives, so both option w/ and w/o flag have the same information value.

I don't entirely understand what you mean. From the perspective of source behavior, *removing* the attribute adds UB, i.e., it turns some previously non-UB executions into UB executions. That's exactly the opposite of attributes like "noreturn" or "nosync" or "nounwind" that all add UB when being *added*.

It depends on how you look at it:
From the perspective of optimizations we can do, sure absence is the one that enables more.
From the perspective of source behavior, it just distinguishes between two alternatives, so both option w/ and w/o flag have the same information value.

I don't entirely understand what you mean. From the perspective of source behavior, *removing* the attribute adds UB, i.e., it turns some previously non-UB executions into UB executions. That's exactly the opposite of attributes like "noreturn" or "nosync" or "nounwind" that all add UB when being *added*.

I don't look at it as added or removed, for me it is a boolean flag, with a default that requires progress. It is more like nobuiltin, noduplicate, nomerge, returns_twice, .... The difference is that I don't see this being deduced. It is just a description of the input semantics, not a property we derive from the code (to describe the code). Unclear if the view makes a difference, let me know what u think :)

I don't look at it as added or removed, for me it is a boolean flag, with a default that requires progress. It is more like nobuiltin, noduplicate, nomerge, returns_twice, .... The difference is that I don't see this being deduced. It is just a description of the input semantics, not a property we derive from the code (to describe the code). Unclear if the view makes a difference, let me know what u think :)

I don't think of any of these attributes as being deduced. ;) I don't care where the attributes come from, the important bit is how they affect the language semantics, the language here being LLVM IR of course. (It's almost as if I'm a programming language researcher. ;)

nounwind and friends affect the language semantics by saying things like: if this function unwinds, that's UB. (In the follows that if you can deduce that a function doesn't unwind, it is okay to add the attribute. But as with all optimizations and analysis, they are not the ground truth, they have to be justified against the language semantics.)
The *absence* of noprogress affects the language semantics by saying: if this function runs into a side-effect-free infinite loop, that's UB. (And something about infinite recursion, but that's a bit more tricky because multiple functions can be involved.) Adding noprogress disables that UB. (This makes me prefer "allownoprogress" as attribute name, if you want to keep its polarity.)

That's how this attribute is "the other way around" compared to the normal ones. Unfortunately I don't know what nobuiltin, nomerge etc mean, so I cannot say how they affect language semantics. I don't know what you mean by "input semantics".

noprogress could certainly be deduced like the others, it's just backwards again -- we have to deduce its absence. For a function that has noprogress, if the compiler can prove that there is no infinite loop, it may remove the attribute. (I am not sure if that's a useful thing to do, I am just saying it is possible.)

I'm fine with the other name and I removed my long answer because I don't think we were going anywhere.

atmnpatel updated this revision to Diff 289844.EditedThu, Sep 3, 8:19 PM

Renamed the IR attribute to mayprogress from noprogress because it makes it clearer that the forward progress guarantee is relaxed.

atmnpatel updated this revision to Diff 289845.Thu, Sep 3, 8:21 PM
This comment was removed by atmnpatel.
atmnpatel retitled this revision from [WIP] [IR] Adding noprogress as a LLVM IR attribute to [IR] Adds mayprogress as a LLVM IR attribute.Thu, Sep 3, 9:51 PM
atmnpatel edited the summary of this revision. (Show Details)

Changed name from noprogress to mayprogress for clarity.

Shouldn't it be "maynotprogress" (or "maybenoprogress" or so)? *Every* function *may* progress, but only those with this marker are also allowed to not "make progress".

atmnpatel updated this revision to Diff 289996.Fri, Sep 4, 10:50 AM

Renamed to maynotprogress.

atmnpatel updated this revision to Diff 291345.Fri, Sep 11, 2:55 PM

Change direction.

atmnpatel retitled this revision from [IR] Adds mayprogress as a LLVM IR attribute to [IR] Adds mustprogress as a LLVM IR attribute.Fri, Sep 11, 2:55 PM
atmnpatel edited the summary of this revision. (Show Details)
atmnpatel updated this revision to Diff 291548.Mon, Sep 14, 5:37 AM
atmnpatel edited the summary of this revision. (Show Details)

Rebase.