This is an archive of the discontinued LLVM Phabricator instance.

[LangRef] Define mustprogress attribute
ClosedPublic

Authored by atmnpatel on Aug 19 2020, 10:51 AM.

Details

Summary

LLVM IR currently assumes some form of forward progress. This form is
not explicitly defined anywhere, and is the cause of miscompilations
in most languages that are not C++11 or later. This implicit forward progress
guarantee can not be opted out of on a function level nor on a loop
level. Languages such as C (C11 and later), C++ (pre-C++11), and Rust
have different forward progress requirements and this needs to be
evident in the IR.

Specifically, C11 and onwards (6.8.5, Paragraph 6) states that "An
iteration statement whose controlling expression is not a constant
expression, that performs no input/output operations, does not access
volatile objects, and performs no synchronization or atomic operations
in its body, controlling expression, or (in the case of for statement)
its expression-3, may be assumed by the implementation to terminate."
C++11 and onwards does not have this assumption, and instead assumes
that every thread must make progress as defined in [intro.progress] when
it comes to scheduling.

This was initially brought up in [0] as a bug, a solution was presented
in [1] which is the current workaround, and the predecessor to this
change was [2].

After defining a notion of forward progress for IR, there are two
options to address this:

  1. Set the default to assuming Forward Progress and provide an opt-out for functions and an opt-in for loops.
  2. Set the default to not assuming Forward Progress and provide an opt-in for functions, and an opt-in for loops.

Option 2) has been selected because only C++11 and onwards have a
forward progress requirement and it makes sense for them to opt-into it
via the defined mustprogress function attribute. The mustprogress
function attribute indicates that the function is required to make
forward progress as defined. This is sharply in contrast to the status
quo where this is implicitly assumed. In addition, willreturn implies mustprogress.

The background for why this definition was chosen is in [3] and for why
the option was chosen is in [4] and the corresponding thread(s). The implementation is in D85393, the
clang patch is in D86841, the LoopDeletion patch is in D86844, the
Inliner patches are in D87180 and D87262, and there will be more
incoming.

[0] https://bugs.llvm.org/show_bug.cgi?id=965#c25
[1] https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html
[2] https://reviews.llvm.org/D65718
[3] https://lists.llvm.org/pipermail/llvm-dev/2020-September/144919.html
[4] https://lists.llvm.org/pipermail/llvm-dev/2020-September/145023.html

Diff Detail

Unit TestsFailed

Event Timeline

atmnpatel created this revision.Aug 19 2020, 10:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2020, 10:51 AM
atmnpatel requested review of this revision.Aug 19 2020, 10:51 AM
atmnpatel edited the summary of this revision. (Show Details)Aug 19 2020, 10:53 AM

There is some appeal in not over-specifying this. Arguably, we can make loop deletion and others aware of this on a function level. We can continue to delete readnone + X calls if they are not noprogress which is the default. Languages that want to keep noprogress everywhere need to mark all functions. Though, I haven't thought this all the way through. @RustFolks do you see this work for you assuming reasonable changes to the optimization passes to look for the attribute?

This attribute indicates that the function is guaranteed to not make progress

I don't think this is correct. I think the correct wording is "the function is *permitted* to not make progress, i.e., to run into an infinite loop without side-effects". (Ideally there's a definition of "progress" somewhere that we can reference.)

There is some appeal in not over-specifying this.

As someone who keeps running into problems where LLVM is underspecified, I don't think "overspecification" is a problem that we have here. ;)

fhahn added a subscriber: fhahn.Aug 20 2020, 1:14 AM

Flipped the semantics as per the comments, included a definition of progress from the C++ standard, could not find strict definitions anywhere else.

I like the new wording and reference. Maybe just callees, unclear.

Addressed comments.

atmnpatel updated this revision to Diff 289847.Sep 3 2020, 8:25 PM

Changing name of function IR attribute from noprogress to mayprogress to make the semantics clearer in that if this attribute is applied to a function, it is now permitted to not make forward progress, but may still make forward progress as opposed to never making any forward progress which is what noprogress sounds like.

jdoerfert added inline comments.Sep 3 2020, 9:01 PM
llvm/docs/LangRef.rst
1959

Say that functions without this attribute are implicit `mustprogress` and required to make progress to put this into context.

atmnpatel updated this revision to Diff 289863.Sep 3 2020, 9:44 PM

Address comments.

atmnpatel retitled this revision from [WIP] [RFC] [LangRef] Define noprogress attribute to [LangRef] Define mayprogress attribute.Sep 3 2020, 9:45 PM
atmnpatel edited the summary of this revision. (Show Details)

Shouldn't it be "maynotprogress" (or "maybenoprogress" or so)? *Every* function *may* progress, but only those with this marker are also allowed to not "make progress".

(Sorry I first posted this in the wrong review I think.)

Shouldn't it be "maynotprogress" (or "maybenoprogress" or so)? *Every* function *may* progress, but only those with this marker are also allowed to not "make progress".

The way to read it is:
Every function *mustmakeprogress* unless it has the *mayprogress* attribute.

So "may" is either they do or they don't, both are fine.
And, "must" means they have to or UB.

Shouldn't it be "maynotprogress" (or "maybenoprogress" or so)? *Every* function *may* progress, but only those with this marker are also allowed to not "make progress".

The way to read it is:
Every function *mustmakeprogress* unless it has the *mayprogress* attribute.

So "may" is either they do or they don't, both are fine.
And, "must" means they have to or UB.

I understand. But this is not how the word "may" is usually used. When I say "you may do X", that typically implies that usually, X is not a thing you may do.
This can certainly be explained away, but the first impression people are going to have before reading the docs is likely going to be wrong.

hfinkel added a subscriber: hfinkel.Sep 4 2020, 9:22 AM

Shouldn't it be "maynotprogress" (or "maybenoprogress" or so)? *Every* function *may* progress, but only those with this marker are also allowed to not "make progress".

The way to read it is:
Every function *mustmakeprogress* unless it has the *mayprogress* attribute.

So "may" is either they do or they don't, both are fine.
And, "must" means they have to or UB.

I understand. But this is not how the word "may" is usually used. When I say "you may do X", that typically implies that usually, X is not a thing you may do.
This can certainly be explained away, but the first impression people are going to have before reading the docs is likely going to be wrong.

I agree. maynotprogress is more self explanatory.

I understand. But this is not how the word "may" is usually used. When I say "you may do X", that typically implies that usually, X is not a thing you may do.
This can certainly be explained away, but the first impression people are going to have before reading the docs is likely going to be wrong.

I agree. maynotprogress is more self explanatory.

It may do something, it may not do something, sure we can rename it.

atmnpatel edited the summary of this revision. (Show Details)Sep 4 2020, 9:47 AM
atmnpatel retitled this revision from [LangRef] Define mayprogress attribute to [LangRef] Define maynotprogress attribute.Sep 4 2020, 10:29 AM
atmnpatel updated this revision to Diff 289992.Sep 4 2020, 10:30 AM

Renamed to maynotprogress.

FWIW maynotprogress reads worse to me, because my first interpretation/reading was if it progresses, it's ub, i.e. it may not (as in, shall not) progress instead of might not make progress

FWIW maynotprogress reads worse to me, because my first interpretation/reading was if it progresses, it's ub, i.e. it may not (as in, shall not) progress instead of might not make progress

Funny. I was just thinking about this too. We really want this to read "might not progress". Maybe we should just name it that. mightnotprogress. 'may' is just too ambiguous.

Atmn, sorry for the churn. Sometimes naming is hard ;)

nikic added a subscriber: nikic.Sep 4 2020, 12:09 PM

Or might_not_progress while we're at it :) No requirement to squash the whitespace.

Or might_not_progress while we're at it :) No requirement to squash the whitespace.

True. Although we squash the whitespace for all other attributes. I don't really like it, and would prefer the underscores, but it would be different.

Or might_not_progress while we're at it :) No requirement to squash the whitespace.

True. Although we squash the whitespace for all other attributes. I don't really like it, and would prefer the underscores, but it would be different.

I take that back. We have null_pointer_is_valid, sanitize_address, etc. now. I also vote for the underscores.

atmnpatel updated this revision to Diff 290633.Sep 8 2020, 9:03 PM

Updated the description based on conversations on the mailing lists.

atmnpatel updated this revision to Diff 291339.Sep 11 2020, 2:41 PM

Flipped directions and a rename.

atmnpatel edited the summary of this revision. (Show Details)Sep 11 2020, 2:45 PM
atmnpatel retitled this revision from [LangRef] Define maynotprogress attribute to [LangRef] Define mustprogress attribute.Sep 11 2020, 2:45 PM

Maybe worth noting somewhere that willreturn implies mustprogress.

atmnpatel edited the summary of this revision. (Show Details)Sep 13 2020, 1:02 PM
atmnpatel edited the summary of this revision. (Show Details)

Sorry, it slipped. Anything else?

Ping. Is this good to go?

@nikic @lebedev.ri @efriedma I'm fine with (the slightly modified) wording below. If you are too I suggest we go ahead.

llvm/docs/LangRef.rst
1963
This attribute indicates that the function is required to return, unwind,
or interact with the environment in an observable way e.g. via a
volatile memory access, I/O, or other synchronization. 
The ``mustprogress`` attribute is intended to model the
requirements of the first section of `[intro.progress] of the C++
Standard <http://eel.is/c++draft/intro.progress>`_. This attribute does
not apply transitively to callees, that means a callee might not make progress.
Note that `willreturn` implies `mustprogress`.

No further comments from me, but i strongly like inline comment reword.

atmnpatel updated this revision to Diff 294355.Sep 25 2020, 9:53 AM

Updated wording.

nikic added inline comments.Sep 25 2020, 11:28 AM
llvm/docs/LangRef.rst
1963

Some notes from my side:

  • I'm not sure how to interpret the "does not apply to callees" part. If mustprogress function A calls not-mustprogress function B and B goes into an infinite loop, then A will not return, unwind or interact with the environment, thus violating the mustprogress contract. Now, the callee does not necessarily have to be mustprogress in general, but I believe it needs to progress for any invocation from a mustprogress function -- or not?
  • I would explicitly state that violating the progress requirement results in undefined behavior.
  • It may be worthwhile to explicitly mention that a non-interacting loop inside a mustprogress function can thus be assumed to terminate and may be removed. That would help clarify the practical effect setting this attribute has.
jdoerfert added inline comments.Sep 25 2020, 11:57 AM
llvm/docs/LangRef.rst
1963

I'm not sure how to interpret the "does not apply to callees" part. If mustprogress function A calls not-mustprogress function B and B goes into an infinite loop, then A will not return, unwind or interact with the environment, thus violating the mustprogress contract. Now, the callee does not necessarily have to be mustprogress in general, but I believe it needs to progress for any invocation from a mustprogress function -- or not?

Yes. That is the intention. If we come from the perspective that C++ has the mustprogress requirement, it has to hold for any statement, including a call to a function that may not progress. However, the statement (=call) has to make progress from the C++ side. This is the same as other attributes, let's say readonly (aka. pure). A function can have that attribute and call an arbitrary function. This is all good as long as the arbitrary function is readonly for this call site.

Do you have an idea how to clarify this in the wording? (Even though other attributes implicitly make this assumption as well.)

I would explicitly state that violating the progress requirement results in undefined behavior.

Sure. @atmnpatel

It may be worthwhile to explicitly mention that a non-interacting loop inside a mustprogress function can thus be assumed to terminate and may be removed. That would help clarify the practical effect setting this attribute has.

Sounds good to me. @atmnpatel

Addressed comments, does this sound like an appropriate wording for the behavior wrt callees?

nikic accepted this revision.Sep 25 2020, 12:17 PM

Thanks for the clarification wrt callees, this LGTM now.

llvm/docs/LangRef.rst
1963

Nit: "it is undefined behavior" -> "the behavior is undefined" (seems to be the phrasing used for other attributes).

This revision is now accepted and ready to land.Sep 25 2020, 12:17 PM

Updated wrt nit.

jdoerfert accepted this revision.Sep 25 2020, 12:42 PM

Again minor wording suggestions, LGTM otherwise

llvm/docs/LangRef.rst
1967
...
As a consequence, a loop in a function with the `mustprogress` attribute can be assumed to terminate if it does not interact with the environment in an observable way, and terminating loops without side-effects can be removed. 
...
apply to
...
efriedma added inline comments.Sep 25 2020, 1:18 PM
llvm/docs/LangRef.rst
1962

Minor question: is it really a good idea to link to eel.is? Certainly it's a convenient resource, but it's not official; as far as I know, it's just someone's personal website.

atmnpatel updated this revision to Diff 294453.Sep 25 2020, 4:43 PM

Removed unofficial link, I agree that an unofficial link isn't that much better than no link. Updated wording as well.

uenoku added a subscriber: uenoku.Sep 29 2020, 5:20 AM
atmnpatel updated this revision to Diff 295400.Sep 30 2020, 1:50 PM

Updated to include syntax modifications in the llvm/utils

aykevl added a subscriber: aykevl.Oct 7 2020, 5:40 AM
atmnpatel updated this revision to Diff 297089.Oct 8 2020, 5:48 PM

Rebase to fix buildkite build.

This revision was automatically updated to reflect the committed changes.