This is an archive of the discontinued LLVM Phabricator instance.

[coroutines] Part 2 of N: Adding Coroutine Intrinsics
ClosedPublic

Authored by GorNishanov on Jul 21 2016, 7:11 PM.

Details

Summary

This is the second patch in the coroutine series. It adds coroutine intrinsics and updates intrinsic cost in TargetTransformInfoImpl.h.

Documentation and overview is here: https://reviews.llvm.org/D22603

Upstreaming sequence (rough plan)

  1. Add documentation.
  2. Add coroutine intrinsics. <= we are here
  3. Add empty coroutine passes.
  4. Add coroutine devirtualization + tests.
  5. Add CGSCC restart trigger + tests.
  6. Add coroutine heap elision + tests.
  7. Add the rest of the logic (split into more patches)

Diff Detail

Event Timeline

GorNishanov retitled this revision from to [coroutines] Part 2 of N: Adding Coroutine Intrinsics.
GorNishanov updated this object.
GorNishanov added reviewers: majnemer, pitrou.
GorNishanov set the repository for this revision to rL LLVM.
GorNishanov added subscribers: dberris, llvm-commits.
GorNishanov updated this object.Jul 21 2016, 7:31 PM
GorNishanov updated this object.
majnemer added inline comments.Jul 23 2016, 9:30 PM
include/llvm/IR/Intrinsics.td
606–607

What happens if there are two calls to llvm.coro.begin in a function?

609

I don't know if IntrNoMem is right here but NoCapture<0> is probably fine.

618–620

How is coro.param sensitive to non-parameter stores?

627–629

NoMem seems wrong, what is your rationale?
I think you can nocapture the pointer parameter.

633–634

Is it possible for optimizations to come across this intrinsic. If so, what semantics does it have?

GorNishanov added inline comments.Jul 24 2016, 8:23 AM
include/llvm/IR/Intrinsics.td
606–607

TLDR:

coro.begin(*,*,*, null) - must have exactly one, broken IR otherwise
coro.begin(*,*,*, not-null) - can have as many or as little as inliner wants

Longer explanation:

The last argument of coro.begin is null when coming out of the frontend. Later, CoroSplit pass sets it to a pointer to a constant array containing pointers to outlined parts of the coroutine. Let's call a coro.begin with null "pre-slit coro.begin", and the one with a pointer to const array, "post-split coro.begin".

Coming out of the frontend, a coroutine has exactly one coro.begin. (Verifier can check for that). Later inliner can inline calls to other coroutines and it may bring more coro.begins into a function, but, those, will be always post-split, since CoroSplit pass runs together with the Inliner in the same CGPassManager.

CoroSplit pass only cares about pre-split coro.begin and there is exactly one. Other coro.begins are there for the benefit of CoroElide pass that devirtualizes and elides heap allocations where possible.

All intrinsics that stay in the coroutine after CoroSplit pass (coro.alloc and coro.free) are linked to coro.begin, so CoroElide pass knows which coroutine they belong to.

609

Ack on NoCapture<0>.
With respect to IntrNoMem my thinking was:

coro.free(%hdl) can be lowered in one of three ways:

  1. null - if CoroElide pass elides dynamic allocations
  2. %hdl - if coroutine handle points to the beginning of the memory block
  3. gep on %hdl - that adjusts the pointer to point to the beginning of the memory block.

Wait, you are right, gep may not be a constant gep, if some of the allocas in the coroutine have alignment requirements that exceed the alignment guarantees provided by the allocation function (second parameter to coro.begin), then, indeed, coro.free would have to read some adjustment value from the memory pointed by handle to do the correct adjustment.

Okay, then,

[IntrReadMem, NoCapture<0>]

it is.

618–620

To double check:

You are asking why [IntrReadMem] and not [IntrNoMem]?

I think it is safe to mark it [IntrNoMem].
It is always lowered to a Boolean constant, and the only purpose for the first and second argument is to tie those two allocas together, in case, I want to elide parameter copy and need to RAUW one with another.

627–629

Ack on NoCapture<0>.

It is NoReadMem, since it is always lowered to a constant gep.

633–634

The life of coro.subfn.addr looks like this:

CoroEarly replaces:

call i8* llvm.coro.resume(i8* %Arg)

with

%0 = call i8* llvm.coro.subfn.addr(i8* %Arg, i8 0)
%1 = bitcast i8* %0 to void(i8*)*
call fastcc void %1 (i8* %Arg)

CoroElide may replace coro.subfn.addr with a f.resume function pointer, so it will be constant folded to:

call fastcc void @f.resume (i8* %Arg)

If CoroElide wasn't able to do devirtualize it, it will stay intact until CoroCleanup pass run at EP_OptimizerLast. CoroCleanup will replace it with: gep + load to get the address of the appropriate resume function at runtime.

0 - f.resume
1 - f.destroy

P.S.

Frontend does not emit coro.subfn.addr directly, as we would like to hide the details of the calling convention for the coroutine sub-functions.

majnemer added inline comments.Jul 24 2016, 9:33 AM
include/llvm/IR/Intrinsics.td
606–607

Can't earlier transforms clone a block containing a coro.begin call?

609

SGTM

618–620

Cool, IntrNoMem sounds good to me.

633–634

Ah, ok.

GorNishanov removed rL LLVM as the repository for this revision.
GorNishanov marked 11 inline comments as done.

Relaxed memory access attributes on coro.begin, coro.free, coro.done and coro.subfn.addr intrinsics.

include/llvm/IR/Intrinsics.td
606–607

I just found a very useful attribute. NoDuplicate.
CoroEarly can mark the coro.begin CallSite as NoDuplicate.
CoroSplit will remove that attribute, so PostSplit coro.begins could be duplicated after split.

Alternatively, we can break coro.begin into two distinct intrinsics:

coro.begin (with NoDuplicate always, and only three parameters, no last parameters)
coro.begin.postsplit (without NoDuplicate attribute and without Promise parameter)

Documentation updated to state that coro.alloc is required to have heap allocation elision.
Intrinsics attributes tweaked:
coro.alloc is now [], (when it was IntrNoMem, multiple coro.allocs belonging to different coroutines were coalesced into one, breaking the semantics)
coro.begin is now capturing the promise parameter, otherwise, GVN won't recognize operations that may write to the promise.

majnemer edited edge metadata.Jul 25 2016, 11:21 PM

coro.begin is now capturing the promise parameter, otherwise, GVN won't recognize operations that may write to the promise.

coro.begin semantically stores the promise parameter somewhere? The semantics portion of the coro.begin intrinsic doesn't describe what happens when the promise parameter is provided.

include/llvm/IR/Intrinsics.td
605

Each coro.alloc is neither a distinct alloca or null right? I think we can mark the function return with noalias.

coro.begin is now capturing the promise parameter, otherwise, GVN won't recognize operations that may write to the promise.

coro.begin semantically stores the promise parameter somewhere? The semantics portion of the coro.begin intrinsic doesn't describe what happens when the promise parameter is provided.

coro.begin does not do anything with the promise on lowering, apart from letting the promise alloca "escape" due to taking it as an argument.
This property comes from the semantic of the promise itself. Promise alloca is directly accessible from outside of the coroutine.

The issue I ran into was that in expected.cpp: https://github.com/GorNishanov/clang/blob/coro/test/Coroutines/expected.cpp, await_suspend stores the error value in the promise.

void await_suspend(std::coroutine_handle<promise_type> h) { 
  h.promise().data.error =std::move(data.error); 
  h.destroy(); 
}

To llvm it looks like:

%4 = call noalias nonnull i8* @llvm.coro.begin(i8* nonnull %3, i32 0, i8* %promise, i8* null)
...
; inlined from call await_suspend(%awaiter_this, %4)
%error4.i = getelementptr inbounds i8, i8* %4, i32 16
%5 = bitcast i8* %error4.i to i32*
store i32 42, i32* %5, align 4, !tbaa !15 ; stores into promise, 
                                          ; GVN did not know that promise escaped due to NoCapture in coro.begin
                                          ; and thought in the block below that the error value is undef
...
AfterCoroEnd:                                     ; Copy expected<int,int> back to result of the function
  %val.i = getelementptr inbounds %struct.expected, %struct.expected* %agg.result, i32 0, i32 0, i32 0
  %val4.i = getelementptr inbounds %"struct.expected<int, int>::promise_type", %"struct.expected<int, int>::promise_type"* %promise, i32 0, i32 0, i32 0
  %9 = load i32, i32* %val4.i, align 4, !tbaa !14
  store i32 %9, i32* %val.i, align 4, !tbaa !1
  %error.i = getelementptr inbounds %struct.expected, %struct.expected* %agg.result, i32 0, i32 0, i32 1
  %error6.i = getelementptr inbounds %"struct.expected<int, int>::promise_type", %"struct.expected<int, int>::promise_type"* %promise, i32 0, i32 0, i32 1
  %10 = load i32, i32* %error6.i, align 4, !tbaa !14
  store i32 %10, i32* %error.i, align 4, !tbaa !6
  ret void
}
include/llvm/IR/Intrinsics.td
605

I am not sure I can do this with within an Intrinsics.td . NoAlias is not defined there.
Currently, I am adding NoAlias and NotNull attributes to the coro.begin calls in CoroEarly pass.

GorNishanov edited edge metadata.

diff reuploaded with -U999999

s/code-block:: C++/code-block:: c++

to fix syntax highlighting on Linux

NoCapture<0> is stripped from coro.begin
NoCapture<0> and NoCapture<1> is stripped from coro.param

majnemer accepted this revision.Jul 26 2016, 10:05 PM
majnemer edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jul 26 2016, 10:05 PM
This revision was automatically updated to reflect the committed changes.