This is an archive of the discontinued LLVM Phabricator instance.

[clang] Better UX for Clang’s unwind-affecting attributes
AcceptedPublic

Authored by lebedev.ri on Nov 29 2022, 3:53 PM.

Details

Summary

This is an implementation of the following RFC:
https://discourse.llvm.org/t/rfc-better-ux-for-clangs-unwind-affecting-attributes/66890

In C++, there are 3 possible behaviors when the
exception escapes out an function that can not unwind:

  1. exception propagates into function's caller
  2. defined behavior of immediate program termination
  3. the wild UB case, behavior is undefined.

Let's look at obvious examples:

  1. exception propagates into caller, is caught there, and all is good: https://godbolt.org/z/MbTW9rofn
  2. exception can not exit noexcept function, program is terminated: https://godbolt.org/z/ffeaPz1dK

Now, the third case, the wild UB case, is the most interesting one.
There are 3 clang/gcc attributes that are relevant here, let's look at them:

  1. __attribute__((pure)): https://godbolt.org/z/PY3KrETb7, there the fun begins. In clang, we get UB, in gcc we get "exception propagates into function's caller"
  2. __attribute__((const)): https://godbolt.org/z/ozxoW16e9, same as __attribute__((pure))
  3. __attribute__((nothrow)): https://godbolt.org/z/YMf4sTcfa, the behavior is consistently defined as immediate program termination. I do not understand why it was defined as such, in the sense of how is that different from plain noexcept, but at least we are consistent.

Now, there are 3 problems:

  1. Our modelling of __attribute__((const))/__attribute__((pure)) differs from that of GCC, we add UB.
  2. We can not ask for __attribute__((const))/__attribute__((pure)) behavior, without it acting as exception barrier.
  3. We can not separately ask for the exception propagation to be UB. This would be a handy optimization tool, especially given how brittle our IRGen for the case 2. (program termination) is.

Therefore, this patch does two things:

  1. Match GCC's implementation-defined behavior on __attribute__((pure))/__attribute__((const))
    • they should not cause UB on exception escape, nor should they cause immediate program termination, exceptions should be free to escape into their caller.
  2. Introduce __attribute__((nounwind)), which would be lowered into LLVM IR's nounwind attribute, and if an exception escapes out of an function marked with such an attribute, wild UB happens.

Please note, while currently such an UB is indeed not sanitized,
i have a patch in progress to handle it:
https://reviews.llvm.org/D137381,
so we would not be introducing something that is impossible to deal with.

Diff Detail

Event Timeline

lebedev.ri created this revision.Nov 29 2022, 3:53 PM
Herald added a reviewer: njames93. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
lebedev.ri requested review of this revision.Nov 29 2022, 3:53 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptNov 29 2022, 3:53 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
erichkeane added inline comments.Dec 1 2022, 9:38 AM
clang/docs/ReleaseNotes.rst
955–956

unrelated changes here?

clang/include/clang/Basic/AttrDocs.td
541
544
560

Similar changes to above. This attribute is more of an 'assertion' by the developer that they promise they don't do these things (and we will UB otherwise), so I think they need to be written from that perspective.

I might suggest making an attempt to reword these docs.

ALSO, since these attributes can be spelled clang::pure and clang::const (IIRC?), I'd suggest we make the documents spelling-agnostic other than in direct code examples.

clang/lib/Analysis/CFG.cpp
2725 ↗(On Diff #478746)

I find myself thinking we should probably have a function in FunctionDecl that tests for the various states of the function, rather than keep checking the attribute's presence.

lebedev.ri updated this revision to Diff 479413.Dec 1 2022, 1:35 PM
lebedev.ri marked 5 inline comments as done.

@erichkeane thank you for taking a look!

clang/docs/ReleaseNotes.rst
955–956

Having whitespaces before newline is bad :/
This is editor doing so on file save,
and my git complains otherwise.

clang/include/clang/Basic/AttrDocs.td
560

I might suggest making an attempt to reword these docs.

This is my best attempt :)
If you can propose a better wording,
please feel free to do so.

clang/lib/Analysis/CFG.cpp
2725 ↗(On Diff #478746)

Yes, we have a huge spaghetti code spread through clang
that tries to answer the same two question:

  1. i'm a caller, if i call this function, might it throw?
  2. i'm a callee, what should i do with exceptions that try to unwind out of me?

I don't know how to improve that, and i don't think just moving this into FunctionDecl would help.

I'll take a look at rewording the docs if no one else does. I should hopefully have time next week, the rest of the patch is perhaps more important at the moment.

clang/docs/ReleaseNotes.rst
955–956

Yep, i had to turn that off in my editor at one point because I did this! I just pushed d5fc931ba to remove that WS, so this will disappear in your next rebase.

clang/lib/Analysis/CFG.cpp
2725 ↗(On Diff #478746)

FunctionDecl::WontThrow and FunctionDecl::ShouldUnwind ?

lebedev.ri marked an inline comment as done.Dec 1 2022, 1:41 PM
lebedev.ri added inline comments.
clang/lib/Analysis/CFG.cpp
2725 ↗(On Diff #478746)

As noted in the RFC, there are 3 possible behaviors on unwind.
If we want to improve interface, we should account for all of them.

lebedev.ri marked an inline comment as done.Dec 2 2022, 5:52 PM

I've been thinking, and it seems to me that explicitly opting into
__attribute__((nounwind))-provided semantics should override
the __attribute__((nothrow))/noexcept semantics.
So looks like i should look into exception specification handling.

(ping, maybe)

I've been thinking, and it seems to me that explicitly opting into
__attribute__((nounwind))-provided semantics should override
the __attribute__((nothrow))/noexcept semantics.
So looks like i should look into exception specification handling.

I've looked, and unless told otherwise, it seems like we
shouldn't add new exception specification for this attribute,
but special-case the attribute where it matters,
but i may be wrong.

Last(?) ping of the year?

erichkeane accepted this revision.Jan 9 2023, 12:44 PM

2 quick nits, otherwise LFTM.

clang/include/clang/Basic/AttrDocs.td
558
565

A quick sentence somewhere saying HOW this is a 'stronger' version of pure would be a good addition here.

This revision is now accepted and ready to land.Jan 9 2023, 12:44 PM
lebedev.ri updated this revision to Diff 487548.Jan 9 2023, 1:52 PM
lebedev.ri marked 2 inline comments as done.

2 quick nits, otherwise LFTM.

@erichkeane thank you for the review!
@aaron.ballman would you like to stamp this too?

lebedev.ri added inline comments.Jan 10 2023, 10:55 AM
clang/include/clang/Basic/AttrDocs.td
565

Hopefully this is sufficient?

erichkeane added inline comments.Jan 10 2023, 10:57 AM
clang/include/clang/Basic/AttrDocs.td
565

I think that clarifies it for me.

Precommit CI seems to have found relevant issues that need to be addressed.

clang/include/clang/Basic/Attr.td
4133

Should this be allowed in C (where structured exception handling is still a thing)?

4138

I would have guessed we'd want to help the user out in a case like: [[clang::nounwind]] void func() noexcept(false);, given that this stuff can creep in via macros?

clang/include/clang/Basic/AttrDocs.td
549
564
569
579

Looks like a whole pile of unrelated whitespace-only changes crept in; those can be backed out.

582

Presumably?

clang/lib/CodeGen/CGCall.cpp
2208–2209
clang/lib/Sema/SemaStmtAttr.cpp
246–248 ↗(On Diff #487548)

Test coverage?

252–257 ↗(On Diff #487548)

Oh great! But you're missing test coverage for this.

lebedev.ri marked 10 inline comments as done.

@aaron.ballman thank you for taking a look!
Addressed all review notes other than the SEH question and the simplehandler one.
(i'll deal with whitespace changes when committing, they should just be fixed upstream)

clang/include/clang/Basic/Attr.td
4133

The only two thing about structured exceptions i know
is that it's abbreviation and that it's a MSVC thing.
I don't know anything about how "exceptions" work there,
and therefore i do not want to touch it.

https://llvm.org/docs/LangRef.html notes:

nounwind
This function attribute indicates that the function never raises an exception. <...>.
However, functions marked nounwind may still trap or generate asynchronous exceptions.
Exception handling schemes that are recognized by LLVM to handle asynchronous exceptions,
such as SEH, will still provide their implementation defined semantics.
4138

Could you please clarify, what do you mean by "help the user" in this case?
Given

[[clang::nounwind]] void func() noexcept(false);

void qqq();

void foo() {
  try {
    func();
  } catch (...) {
    qqq();
  }
}

we already end up with

; Function Attrs: mustprogress noinline nounwind optnone uwtable
define dso_local void @_Z3foov() #0 {
entry:
  call void @_Z4funcv() #2
  ret void
}

; Function Attrs: nounwind
declare void @_Z4funcv() #1

attributes #0 = { mustprogress noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #1 = { nounwind "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #2 = { nounwind }

and if i drop [[clang::nounwind]], landingpad is back.

aaron.ballman added inline comments.Jan 10 2023, 12:57 PM
clang/include/clang/Basic/Attr.td
4133

Okay, let's leave it as-is for now and we can relax the restriction later if we find a need.

4138

Could you please clarify, what do you mean by "help the user" in this case?

The user has said "this function throws exceptions" (noexcept(false)) and it's also said "this function never unwinds to its caller" (the attribute) and these statements are in conflict with one another and likely signify user confusion. I would have expected a warning diagnostic here.

lebedev.ri marked 2 inline comments as done.Jan 10 2023, 1:11 PM
lebedev.ri added inline comments.
clang/include/clang/Basic/Attr.td
4138

Ah. This is effectively intentional. This attribute, effectively,
specifies the behavior in case the exception does get thrown, as being UB.
If we diagnose that case, should we also disallow the stmt case?

void i_will_throw() noexcept(false);

void foo() {
  [[clang::nounwind]] i_will_throw(); // Should this be diagnosed?
}

Presumably not, because that immediately limits usefulness of the attribute.

What we *DO* want, is diagnostics (and more importantly, a sanitizer!)
in the case the exception *does* reach this UB boundary.

aaron.ballman added inline comments.Jan 10 2023, 1:42 PM
clang/include/clang/Basic/Attr.td
4138

Sorry if this is a dumb question, but is the goal of this attribute basically to insert UB where there is well-defined behavior per spec? Throwing an exception that escapes from a function marked noexcept(false) is guaranteed to call std::terminate(): http://eel.is/c++draft/except#spec-5

lebedev.ri marked 2 inline comments as done.Jan 10 2023, 1:47 PM
lebedev.ri added inline comments.
clang/include/clang/Basic/Attr.td
4138

Sorry if this is a dumb question, but is the goal of this attribute basically to insert UB where there is well-defined behavior per spec?

Precisely! I was under impression i did call that out in the RFC/description, but guess not explicitly enough?
This is consistent with the existing (accidental) behavior of const/pure attirbutes.

aaron.ballman added inline comments.Jan 11 2023, 12:04 PM
clang/include/clang/Basic/Attr.td
4138

@lebedev.ri and I (and a few others) discussed this attribute on IRC and we're thinking of trying a different approach. Instead of exposing this attribute (which is pretty expert-only and seems likely to cause unintentional UB as a result), it might be worth exploring adding a -f flag to treat non-throwing exception specifications as implying nounwind. This is a language dialect and we usually avoid those, but we already have a significantly greater dialect in this same space with the -fno-exceptions functionality. We think the perf gains for the flag are likely to be worth supporting the dialect.

Next steps here are to rip out the nounwind attribute bits so we can move on the const and pure changes, and to start an RFC about the potential new flag.

lebedev.ri marked an inline comment as done.
lebedev.ri edited the summary of this revision. (Show Details)
lebedev.ri marked an inline comment as done.
efriedma added subscribers: nikic, efriedma.

From an IR semantics standpoint, I'm not sure the memory(none) marking is right if the function throws an exception. LangRef doesn't explicitly exclude the possibility, but take the following:

void f() { throw 1; }

It gets lowered to:

define dso_local void @_Z1fv() local_unnamed_addr #0 {
entry:
  %exception = tail call ptr @__cxa_allocate_exception(i64 4) #1
  store i32 1, ptr %exception, align 16, !tbaa !5
  tail call void @__cxa_throw(ptr nonnull %exception, ptr nonnull @_ZTIi, ptr null) #2
  unreachable
}

So we have a function that essentially makes a heap allocation, store a value to it, then throws it. And throwing the exception then makes other changes to the exception handling state which are visible to the caller. Given these details are exposed in LLVM IR, saying that this is memory(none) seems a bit dubious. (I don't have a testcase that breaks off the top of my head, but I suspect it's possible to write such a testcase.) Given that, if you're going to drop the nounwind attribute, I suspect it also makes sense to also drop the memory(none) attribute.

The AddPotentialArgAccess() helper exists to handle a similar sort of situation: a function is marked pure in C, but the return value is returned indirectly, so it's not actually pure from the perspective of LLVM IR. To fix that, we adjust the IR attributes to something more appropriate.

(Adding @nikic and @jdoerfert for additional perspective on the IR attributes.)


See also https://github.com/llvm/llvm-project/issues/36098 , which is also about a mismatch between the meaning of the C pure attribute and the LLVM IR memory(none).

From an IR semantics standpoint, I'm not sure the memory(none) marking is right if the function throws an exception.

Then memory(read) would be wrong, either.
A bit of a hot take: it's a Schrodinger's exception in quantum state.
What i mean by that, is that e.g. pure attribute is documented as

and will not cause any observable side-effects other than
returning a value. They may read memory, but can not modify memory in a way
that would either affect observable state or their outcome on subsequent runs.

So if it does unwind, that's fine, but if you *observed* that (not by catching an exception),
then you're in UB lands, because *you*, not the function, violated it's contract.

That might be too hot of a take, but it does seem like the EH internal details aren't really *that* observable...

I'm not trying to argue what the pure/const attribute in C/C++ is supposed to mean. I agree with your interpretation of the gcc documentation/implementation. I'm saying that there's a mismatch between the gcc pure/const and the LLVM IR memory(read)/memory(none), and therefore you can't unconditionally lower const to memory(none).

From an IR semantics standpoint, I'm not sure the memory(none) marking is right if the function throws an exception. LangRef doesn't explicitly exclude the possibility, but take the following:

void f() { throw 1; }

It gets lowered to:

define dso_local void @_Z1fv() local_unnamed_addr #0 {
entry:
  %exception = tail call ptr @__cxa_allocate_exception(i64 4) #1
  store i32 1, ptr %exception, align 16, !tbaa !5
  tail call void @__cxa_throw(ptr nonnull %exception, ptr nonnull @_ZTIi, ptr null) #2
  unreachable
}

If you mark this one as readonly/none we might, at some point, see the mismatch in the body and declare it UB.
From the outside, it should almost be fine since it's not-nounwind and the memory that is accessed is freshly allocated.
However, it still would be a problem waiting to happen to have escaping memory being written in a readonly/none function.

That said, I think we anyway want a memory category to express all accesses are to freshly allocated memory that may escape the function. This is a common pattern.
So memory(allocated, inaccessiblemem) would express that the allocation does access some inaccesible memory and the function will also access the newly allocated memory.
Its arguably not as good as readnone but I'm not sure how to get there.
The best idea I have off the top of my head is a dedicated exception category for memory such that it won't interfere with anything but other exceptions, which it already does due to unwind.
Thus, pure/const -> `memory(exception).

From an IR semantics standpoint, I'm not sure the memory(none) marking is right if the function throws an exception. LangRef doesn't explicitly exclude the possibility, but take the following:

void f() { throw 1; }

It gets lowered to:

define dso_local void @_Z1fv() local_unnamed_addr #0 {
entry:
  %exception = tail call ptr @__cxa_allocate_exception(i64 4) #1
  store i32 1, ptr %exception, align 16, !tbaa !5
  tail call void @__cxa_throw(ptr nonnull %exception, ptr nonnull @_ZTIi, ptr null) #2
  unreachable
}

If you mark this one as readonly/none we might, at some point, see the mismatch in the body and declare it UB.
From the outside, it should almost be fine since it's not-nounwind and the memory that is accessed is freshly allocated.
However, it still would be a problem waiting to happen to have escaping memory being written in a readonly/none function.

That said, I think we anyway want a memory category to express all accesses are to freshly allocated memory that may escape the function. This is a common pattern.
So memory(allocated, inaccessiblemem) would express that the allocation does access some inaccesible memory and the function will also access the newly allocated memory.
Its arguably not as good as readnone but I'm not sure how to get there.
The best idea I have off the top of my head is a dedicated exception category for memory such that it won't interfere with anything but other exceptions, which it already does due to unwind.
Thus, pure/const -> `memory(exception).

Thank you for commenting! Is there a feeling that just dropping the attributes,
until they can be reasonably represented, hinders optimizations too much,
and we must retain the ongoing miscompile?

The issue is the function call, not the stores.
Stores are valid as long as they are to local memory (stack or allocated by the function).

Because of the call, you can't tag it as memory(none). But you may be able to tag it with inaccessiblemem if the call is also marked as such. Otherwise, no.

Performance wise it shouldn't matter much. How many functions are tagged with pure/const in source code? So here I would lean towards correctness.

@jdoerfert can you please clarify, do you believe we are fine as-is, or need to change attributes we emit?

lebedev.ri updated this revision to Diff 489975.EditedJan 17 2023, 4:22 PM
lebedev.ri added subscribers: Anastasia, arsenm.

Ok, if we must not unconditionally emit the memory attributes, then let's not.
Please stamp? :)

@Anastasia @svenvh @arsenm This breaks opencl headers in a way i do not understand how to undo.

Ok, if we must not unconditionally emit the memory attributes, then let's not.
Please stamp? :)

FWIW, I don't think we should land any of this until after the Clang 16 branch has happened; I think we need more time for this to bake before we release it to the wild.

Ok, so. I looked really hard, and essentially i'm not sure we can just change those attributes from implying readonly/readnone.
Things kinda just fall apart

clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
20

I've looked, and i really don't understand how D64319 works.
It seems like the AST is then serialized into a header?
Because just adding a new attribute without spelling does not solve the issue.

svenvh added inline comments.Jan 18 2023, 7:50 AM
clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
20

We're not setting the NoUnwind attribute for OpenCL (yet). The following quick-and-dirty patch appears to fix this test for your patch (but will cause other tests to fail). If you think it's time to add NoUnwind now, I can try putting up a review for that.

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 276d91fa2758..1ea3c11fbe03 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1972,7 +1972,7 @@ void CodeGenModule::getDefaultFunctionAttributes(StringRef Name,
   // TODO: NoUnwind attribute should be added for other GPU modes OpenCL, HIP,
   // SYCL, OpenMP offload. AFAIK, none of them support exceptions in device
   // code.
-  if (getLangOpts().CUDA && getLangOpts().CUDAIsDevice) {
+  if ((getLangOpts().CUDA && getLangOpts().CUDAIsDevice) || getLangOpts().OpenCL) {
     // Exceptions aren't supported in CUDA device code.
     FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
   }
lebedev.ri marked an inline comment as done.Jan 18 2023, 7:53 AM
lebedev.ri added inline comments.
clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
20

Yeah, that would be my guess as to how you'd fix this it, yes.
I'd suggest posting such a patch.