This is an archive of the discontinued LLVM Phabricator instance.

[clang] Refactor mustprogress handling, add it to all loops in c++11+.
ClosedPublic

Authored by fhahn on Feb 10 2021, 7:19 AM.

Details

Summary

Currently Clang does not add mustprogress to inifinite loops with a
known constant condition, matching C11 behavior. The forward progress
guarantee in C++11 and later should allow us to add mustprogress to any
loop (http://eel.is/c++draft/intro.progress#1).

This allows us to simplify the code dealing with adding mustprogress a
bit.

Diff Detail

Event Timeline

fhahn requested review of this revision.Feb 10 2021, 7:19 AM
fhahn created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2021, 7:19 AM
jdoerfert added inline comments.Feb 10 2021, 8:31 AM
clang/lib/CodeGen/CodeGenFunction.h
550–553

Can you modify the documentation to talk about what loops must make progress, this is the code below transcribed.

Also, I don't see how this works. Should the const-ness of the condition not be related to the standard/language to make a decision?

fhahn updated this revision to Diff 322791.Feb 10 2021, 12:34 PM

Document reasoning.

fhahn added inline comments.Feb 10 2021, 12:38 PM
clang/lib/CodeGen/CodeGenFunction.h
550–553

Can you modify the documentation to talk about what loops must make progress, this is the code below transcribed.

I added some comments. I hope they make things clearer.

Also, I don't see how this works. Should the const-ness of the condition not be related to the standard/language to make a decision?

The functionMustProgress call applies the C++ rules. So C++11 and above is already handled by the call.

Afterwards, loops with constant conditions never have to make progress (C++ before 11, any C version). Loops with non-constant conditions have to make progress in C11 and later.

Hopefully the comment makes that clearer.

Seems reasonable to me and makes things simpler. @atmnpatel Any thoughts?

clang/lib/CodeGen/CodeGenFunction.h
550–553

Now, even I understand this. Thanks :)

aaron.ballman added inline comments.Feb 11 2021, 4:07 AM
clang/lib/CodeGen/CodeGenFunction.h
550–553

Loops with non-constant conditions have to make progress in C11 and later.

Can you point me to the part of the C11 (or later) standards that state this? I don't see wording like http://eel.is/c++draft/intro.progress#1 in 5.1.2.4 in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf

fhahn added inline comments.Feb 11 2021, 4:43 AM
clang/lib/CodeGen/CodeGenFunction.h
550–553

For C11 and later, it should be 6.8.5.6, which says:

An iteration statement may be assumed by the implementation to terminate if its controlling expression is not a constant expression,171) and none of the following operations are performed in its body, controlling expression or (in the case of a for statement) its expression-3:172)
— input/output operations
— accessing a volatile object
— synchronization or atomic operations.

It would probably be good to refer to C11 6.8.5.6 here, and the equivalent for C++ (6.9.2.3.1)?

aaron.ballman added inline comments.Feb 11 2021, 4:53 AM
clang/lib/CodeGen/CodeGenFunction.h
550–553

Ah, thank you for that -- my brain was hung up on the atomic progress guarantees not the loop ones specifically. Having a standards reference in the comments would be handy, but I don't insist.

fhahn updated this revision to Diff 323003.Feb 11 2021, 7:30 AM

Add reference to standards.

fhahn updated this revision to Diff 323568.Feb 13 2021, 11:52 AM

Rebased on top of 51bf4c0e6d4c, which added -ffinite-loops.

Some thoughts

clang/lib/CodeGen/CGStmt.cpp
808

I think, if we really want to give this a name, perhaps we want something even more specific,
perhaps CondIsConstImm/CondIsConstInt?

813

Now this doesn't make sense. Why must loop progress if it's condition is a constant?
I think this should be renamed?

clang/lib/CodeGen/CodeGenFunction.h
523–536

What happens for C++11+ when -fno-finite-loops is specified?
I.e., shouldn't this be similar to loopMustProgress()?

523–536

I may be unaware, but to me this looks wrong.
getFiniteLoops() should know the right answer always, we shouldn't need to do checks for standard versions here.

clang/test/CodeGen/attr-mustprogress.c
17

Here and for C++, i would suggest to have bullet-proof RUN line coverage:

  • -std=c99/-std=c11/-std=c18/-std=c2x
  • -std=c99/-std=c11/-std=c18/-std=c2x + -ffinite-loops
  • -std=c99/-std=c11/-std=c18/-std=c2x + -fno-finite-loops
lebedev.ri requested changes to this revision.Mar 25 2021, 1:54 PM

@fhahn Are you still at this?

This revision now requires changes to proceed.Mar 25 2021, 1:54 PM
fhahn added inline comments.Apr 7 2021, 5:53 AM
clang/lib/CodeGen/CGStmt.cpp
808

I updated the name to CondIsConstInt

813

I'm not sure I follow here. Whether the condition is constant or not is important for C11 and later. Something like while(true) {} does not have to make progress, but while (Foo) {} must make progress. I am not sure what it should be renamed to?

clang/lib/CodeGen/CodeGenFunction.h
523–536

What happens for C++11+ when -fno-finite-loops is specified?
I.e., shouldn't this be similar to loopMustProgress()?

-fno-finite-loops for effectively disables marking functions as mustprogress in C++. Otherwise mustprogress on the function may be used to infer mustprogress of loops in the function. If we could prove that there are no loops and only mustprogress functions are called, we could still add mustprogress to the function. But at this point, we can't figure that out.

I think some of the CHECK lines in the original patch were incorrect, but they should be fixed in the latest version.

getFiniteLoops() should know the right answer always, we shouldn't need to do checks for standard versions here.

-ffinite-loops as is does not impact whether we add mustprogress to functions in any way, only the C++ standard version does. (e.g. using -ffinite-loops -std=c11 won't add mustprogress to functions).

-fno-finite-loops on the other hand prevents adding mustprogress even for C++.

But this patch should not change the existing behavior, other than ignoring whether the condition is constant in C++ mode or with -ffinite-loops.

clang/test/CodeGen/attr-mustprogress.c
17

I pushed 7ca4dd82175c to add the additional run lines.

fhahn updated this revision to Diff 335786.Apr 7 2021, 5:53 AM

rebased after pre-committing additional RUN lines in 7ca4dd82175c. Also adjusts a variable name

Thank you!
Some more thoughts.

clang/lib/CodeGen/CGStmt.cpp
813

Right.
But now this says that "if the condition is a constant integer 1 or 0, the loop must make progress".
Isn't that the opposite what this should say, as per your comment?
Shouldn't it be "loop must progress unless the condition is a constant integer 1"?

clang/lib/CodeGen/CodeGenFunction.h
535–536

Since every new standard version will have to be added here,
can we invert this and just check for the know fixed set
of the versions where this doesn't apply?

542

Ah, i see.
My problem is that it's impossible to tell whether loopMustProgress is a member variable
that gets assigned CondIsConstInt value, or a "filter" function.

Maybe this (and it's function friend) should be something like checkIfLoopMustProgress or something..

562

Since every new standard version will have to be added here,
can we invert this and just check for the know fixed set
of the versions where this doesn't apply?

fhahn updated this revision to Diff 336058.Apr 8 2021, 4:06 AM

Change names of loopMustProgress to checkIfLoopMustProgress. Same for functionMustProgress.

fhahn added inline comments.Apr 8 2021, 4:12 AM
clang/lib/CodeGen/CGStmt.cpp
813

Ok, so with 'saying' do you mean loopMustProgress is a bad name? I changed the name as suggested below, but if you meant something else, please let me know.

clang/lib/CodeGen/CodeGenFunction.h
535–536

I tried, but it appears as if there's no LangOpt for C++98. (this also seems not directly related to the patch, so I guess we could do that as follow-up as well?)

542

Oh right, the name was confusing. I updated the name

562

I tried but I am not sure how to best do that, given there's no explicit C89 lang opt.

lebedev.ri accepted this revision.Apr 8 2021, 4:22 AM

Seems fine to me.
Might be good for someone else (@aaron.ballman ?) to also take a look, not sure.

Thank you.

clang/lib/CodeGen/CGStmt.cpp
813

I'm saying that i was completely misreading this code because i thought CondIsConstInt was being stored into loopMustProgress, not that loopMustProgress() was being called with CondIsConstInt arg.

This revision is now accepted and ready to land.Apr 8 2021, 4:22 AM
fhahn added a comment.Apr 19 2021, 8:22 AM

Seems fine to me.
Might be good for someone else (@aaron.ballman ?) to also take a look, not sure.

ping. @aaron.ballman any chance you could take another quick look?

aaron.ballman accepted this revision.Apr 19 2021, 9:38 AM

I only found one tiny nit, and otherwise it LGTM, but this is a bit out of my wheelhouse.

clang/lib/CodeGen/CodeGenFunction.h
529

Seems conceptually okay, given that we have an option to control it.

clang/lib/CodeGen/CGStmt.cpp
808

Please use an explicit != nullptr check when converting a pointer to bool outside of an obvious boolean context.

clang/lib/CodeGen/CodeGenFunction.h
535–536

The way these options work is that the later standards imply the early standards; that's why there isn't a CPlusPlus98. You just need CPlusPlus11 here.

562

Same thing as with the C++ language versions.

fhahn updated this revision to Diff 341863.Apr 30 2021, 5:46 AM

Thanks everyone! I rebased the patch and addressed John's comments. I'll commit it shortly.

fhahn marked an inline comment as done.Apr 30 2021, 5:49 AM
fhahn added inline comments.
clang/lib/CodeGen/CGStmt.cpp
808

Updated, thanks!

clang/lib/CodeGen/CodeGenFunction.h
535–536

Great, thanks for clarifying John! I updated the code.

This revision was landed with ongoing or failed builds.Apr 30 2021, 6:24 AM
This revision was automatically updated to reflect the committed changes.

Hi all, it looks like this commit led to some unexpected behavior in our build. When compiling something like:

// clang++ -c -S -o - /tmp/test.cc -std=c++17 -O1
static int do_nothing(void* unused) {
  for (;;) {
  }
  return 0;
}

typedef int (*func_t)(void *);
void call(func_t);

void func() {
  call(&do_nothing);
}

we get the following assembly for do_nothing:

	.p2align	4, 0x90                         # -- Begin function _ZL10do_nothingPv
	.type	_ZL10do_nothingPv,@function
_ZL10do_nothingPv:                      # @_ZL10do_nothingPv
	.cfi_startproc
# %bb.0:
.Lfunc_end1:
	.size	_ZL10do_nothingPv, .Lfunc_end1-_ZL10do_nothingPv
	.cfi_endproc
                                        # -- End function
	.ident	"Fuchsia clang version 13.0.0 (https://llvm.googlesource.com/a/llvm-project 6555e53ab0f2bca3dc30f5ab3a2d6872d50b6ff8)"
	.section	".note.GNU-stack","",@progbits
	.addrsig
	.addrsig_sym _ZL10do_nothingPv

It seems that the function doesn't have a return statement or halting instruction and it would just jump into the next function. While I do follow what this patch is trying to do, this behavior seems pretty unexpected from a C++ user perspective. I could be wrong, but it doesn't seem clear in this case that the infinite loop results in UB which would justify this assembly. Would it be fine to revert this for now to work out the kinks?

Would it be fine to revert this for now to work out the kinks?

I dont think. This is a known problem, not caused by this patch, just exposed. You can search bugzilla for it, simply, if there is an UB, llvm should emit a “ret”.

Hi all, it looks like this commit led to some unexpected behavior in our build. When compiling something like:

// clang++ -c -S -o - /tmp/test.cc -std=c++17 -O1
static int do_nothing(void* unused) {
  for (;;) {
  }
  return 0;
}

typedef int (*func_t)(void *);
void call(func_t);

void func() {
  call(&do_nothing);
}

we get the following assembly for do_nothing:

	.p2align	4, 0x90                         # -- Begin function _ZL10do_nothingPv
	.type	_ZL10do_nothingPv,@function
_ZL10do_nothingPv:                      # @_ZL10do_nothingPv
	.cfi_startproc
# %bb.0:
.Lfunc_end1:
	.size	_ZL10do_nothingPv, .Lfunc_end1-_ZL10do_nothingPv
	.cfi_endproc
                                        # -- End function
	.ident	"Fuchsia clang version 13.0.0 (https://llvm.googlesource.com/a/llvm-project 6555e53ab0f2bca3dc30f5ab3a2d6872d50b6ff8)"
	.section	".note.GNU-stack","",@progbits
	.addrsig
	.addrsig_sym _ZL10do_nothingPv

It seems that the function doesn't have a return statement or halting instruction and it would just jump into the next function. While I do follow what this patch is trying to do, this behavior seems pretty unexpected from a C++ user perspective. I could be wrong, but it doesn't seem clear in this case that the infinite loop results in UB which would justify this assembly.

Well, no, I'm afraid it is actually clear that that code does have UB according to the C++ standard. Perhaps you mean that it *shouldn't* have UB, or that Clang should define its behavior despite the standard.

I might agree with you that I don't see the value in using this stronger rule in C++, but I think it would help to understand the problem a little better. I assume this is causing problems for a less trivial test case? Or do you really have code that's relying on that loop not terminating?

Well, no, I'm afraid it is actually clear that that code does have UB according to the C++ standard. Perhaps you mean that it *shouldn't* have UB, or that Clang should define its behavior despite the standard.

I might agree with you that I don't see the value in using this stronger rule in C++, but I think it would help to understand the problem a little better. I assume this is causing problems for a less trivial test case? Or do you really have code that's relying on that loop not terminating?

I see. I guess it's good (and bad) that this discovered this UB in our code base. The example I posted is almost exactly what's in our test case. For this particular one, we need a non-terminating thread, so a thread_create function is passed do_nothing. Locally, we could get around this by using something like inline assembly to avoid UB.

(Potentially deviating from this patch) While an infinite loop may be UB according to the standard, it's also something fairly common that can be in a lot of code bases. Although it may be an unintended side-effect, it still seems a bit abrupt that we found about this UB from this patch rather than ubsan or a compiler diagnostic. Are there any plans in the future for something along the lines of improving catching this type of UB? Or alternatively, are there plans of maybe defining it in clang (like a language extension) as @rjmccall points out?

jdoerfert added a comment.EditedMay 26 2021, 6:17 PM

Well, no, I'm afraid it is actually clear that that code does have UB according to the C++ standard. Perhaps you mean that it *shouldn't* have UB, or that Clang should define its behavior despite the standard.

I might agree with you that I don't see the value in using this stronger rule in C++, but I think it would help to understand the problem a little better. I assume this is causing problems for a less trivial test case? Or do you really have code that's relying on that loop not terminating?

I see. I guess it's good (and bad) that this discovered this UB in our code base. The example I posted is almost exactly what's in our test case. For this particular one, we need a non-terminating thread, so a thread_create function is passed do_nothing. Locally, we could get around this by using something like inline assembly to avoid UB.

FWIW, you want an atomic access, something that is explicitly avoiding the UB while "busy waiting". Not just some inline assembly that tries to "hide the fact" it's UB.

[UBSAN and such are obviously good things we should have, I guess it's like always a time thing.]

Well, no, I'm afraid it is actually clear that that code does have UB according to the C++ standard. Perhaps you mean that it *shouldn't* have UB, or that Clang should define its behavior despite the standard.

I might agree with you that I don't see the value in using this stronger rule in C++, but I think it would help to understand the problem a little better. I assume this is causing problems for a less trivial test case? Or do you really have code that's relying on that loop not terminating?

I see. I guess it's good (and bad) that this discovered this UB in our code base. The example I posted is almost exactly what's in our test case. For this particular one, we need a non-terminating thread, so a thread_create function is passed do_nothing. Locally, we could get around this by using something like inline assembly to avoid UB.

(Potentially deviating from this patch) While an infinite loop may be UB according to the standard, it's also something fairly common that can be in a lot of code bases. Although it may be an unintended side-effect, it still seems a bit abrupt that we found about this UB from this patch rather than ubsan or a compiler diagnostic. Are there any plans in the future for something along the lines of improving catching this type of UB? Or alternatively, are there plans of maybe defining it in clang (like a language extension) as @rjmccall points out?

I think this new optimization was mentioned in clang release notes, worth to read them. And there is a flag to disable this opt - -fno-finite-loops I think.
Info about loop finiteness is useful for various optimizations so I dont know why clang should promise any behaviour.

Sometime in the future you may compile your code with gcc and boom as well..

And maybe you can use this https://clang.llvm.org/extra/clang-tidy/checks/bugprone-infinite-loop.html to find infinite loops? Possibly we have compiler remark for it as well (?).

There's no reliable way to report this with UBSan because in general we can't ever know that a loop will not terminate. That said, we could report *obviously* trivial loops, either with UBSan or just with a diagnostic. If the body of your loop really is empty, and the conditions are trivial, that ought to be enough to catch it.

The conformant fix is to do some sort of atomic operation or other side-effect in your loop; the non-conformant fix is to disable the optimization with -fno-finite-loops.