This is an archive of the discontinued LLVM Phabricator instance.

FunctionAttrs: do not mark coroutines with odd return path `noreturn`
ClosedPublic

Authored by t.p.northover on Aug 16 2021, 4:59 AM.

Details

Summary

Under one of the ABIs we support for coroutines, a call to @llvm.coro.end may actually (despite not being a BB terminator) return from the function. This is weird, but documented in https://llvm.org/docs/Coroutines.html#llvm-coro-end-intrinsic:

In returned-continuation lowering, llvm.coro.end fully destroys the coroutine frame. If the second argument is false, it also returns from the coroutine with a null continuation pointer, and the next instruction will be unreachable.

In this case, because of the unreachable at the end we'd normally decide the block cannot return (and if it's the only return block in the function, that the function itself is `noreturn), but this logic needs revising.

Diff Detail

Event Timeline

t.p.northover created this revision.Aug 16 2021, 4:59 AM
t.p.northover requested review of this revision.Aug 16 2021, 4:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 16 2021, 4:59 AM
nikic added a subscriber: nikic.

Looks to me like llvm.coro.end semantics are broken and either need to be changed by making the control flow explicit in IR, or we need additional attributes to specify this behavior in some well-defined fashion.

This sounds weird. Trying to map this to "regular semantics" coro_end seems to be able to unwind in a way that can happen also in nounwind contexts. I'd assume that can probably break the next few implicit assumptions somewhere.
Is there a reason we don't branch to a return on from the coro_end depending on the result?

I've looked into this some more, and the return inserted by @llvm.coro.end is not necessarily representable in this function. If it's a @llvm.coro.id.retcon.once coroutine then the return is always a ret void, regardless of this function's prototype.

So, what if we require a ret undef instead as about the closest we can get? The real return (in the non-once case) is usually roughly {i8* null, <lots of undef>} and importantly doesn't contain any values from the function so we won't be lying about dataflow by using undef.

That's a doc change, a bunch of tests, and probably an autoupgrade (as well as Swift adoption). I'll tentatively get started.

Well, I think the basic problem is that unlowered coroutines cannot be modeled as normal functions very well in LLVM IR, and the coro intrinsics are doing their best to muddle through given the unstated requirement to not bother anyone else too much. We have major representational problems with modeling some of the things we need, especially the distinctions between ordinary-ish behavior in the ramp function, logically separate from the coroutine, vs. things that are semantically part of the coroutine body.

coro.end is one place where the representation is especially awkward, in part because of contrasting requirements between different lowerings.

  • In the the retcon lowerings, coro.end is always an immediate exit from the coroutine, akin to ret. In a more abstract representation, this could just be a ret void (because the semantic return type of a retcon coroutine is always void, at least for now). However, actually using ret would be awkward in the current representation because we'd need to give it special treatment *without* giving special treatment to rets in the ramp portion of the function.
  • For the async lowering, coro.end was totally inadequate for what we needed to express, and we had to introduce a different intrinsic. This is because the way we "return" to our async caller is by performing a tail call to its continuation function, and we need to represent both the function pointer and the return values. In a more abstract representation, these things could be done implicitly, like ordinary CC lowering, and this could just be a ret. I think part of the reason we went this way is because we wanted to give ourselves room to change around how things like the continuation function pointer were stored (e.g. with pointer signing), and we weren't sure if we'd ever want to "return" by tail calling something that wasn't the normal continuation function pointer. We also do need to represent an awful lot of details about the async CC, like the allocation of async frames, so it makes some sense to be consistently less abstract.
  • In the switch lowering, coro.end is not necessarily an immediate end of the coroutine, at least when it appears in cleanups. I don't really understand why the representation works the way it does here, to be honest; I think Gor might not have been certain how else to get Clang to emit the IR pattern he wanted, which is not a very good reason.

Unless you really want to put a lot of work into making a more abstract representation, I think not inferring any function attributes for unlowered coroutines is probably the most reasonable thing to do. IIRC, the way it currently works is that the coroutine lowering passes just remove these attributes, but that's always been questionable, not least because it's possible they'll have caused misbehavior before then. Note that you really shouldn't infer *any* function attributes, including things like readonly, because there's a lot of implicit work that can get added during lowering.

Thanks for your thoughts too John, I've implemented your suggestion instead of the ret undef path I was thinking about.

rjmccall accepted this revision.Nov 3 2021, 11:05 AM

LGTM.

This revision is now accepted and ready to land.Nov 3 2021, 11:05 AM
t.p.northover closed this revision.Nov 4 2021, 3:25 AM

Thanks John, committed as 3d39612b3dd3.