This is an archive of the discontinued LLVM Phabricator instance.

[Coroutines] Overalign coroutine frame when frame alignment exceeds the alignment limit
AbandonedPublic

Authored by ChuanqiXu on Jul 18 2021, 7:23 PM.

Details

Summary

This is an alternative to D97915.


Background: The alignment for ::operator new(size_t) would be 16 generally, which would be used to allocate memories for coroutine frame. But if there is variables requiring more alignment in the frame, it may be possible that the variable may not get align correctly. One solution would be searching ::operator new(size_t, align_val_t) in the front end. But it requires the support from the language side. Another solution is to do over align by the compiler. By allocate llvm.coro.size() + frame_align - new_align, we could make sure that the frame must could get aligned correctly in the over allocated storage. To avoid memory leak, we need one extra fields in the frame to store the address of allocated storage.


The main differences for this diff with D97915 are:

  • Add only one extra optional coroutine intrinsics in switch style.
  • Make the main work to over-align coroutine frame in CoroFrame.cpp.

The first point may be more important. Since coroutine language intrinsics is a language itself. Although it looks like that the switch style is used by C++ coroutine only, from the design it could and should be used by other languages easily. So less the intrinsics we add, the more clear the coroutine intrinsics would be. And I guess the extra intrinsic added in this diff may be more easy for user to use. If the user need to over align the frame due to limited allocator's alignment, it could emit one coro.overalign for the allocated memory and set the corresponding limited alignment. And every thing would be the same if the user doesn't need it.

The second point makes this diff more easy to review and edit.

Test Plan: check-llvm, https://gcc.godbolt.org/z/rGzaco, https://gcc.godbolt.org/z/W3ocj778M


Another question: if it is possible that we add the tests into the llvm? It looks like the test and unit test can't check the runtime behavior. I know there are slow tests in CI system, how could I edit that?


The reason why that I don't support ::operator new(size_t, align_t) is that I am confused about the detailed semantics. For example, it the user provides two allocator, which one should be chosen? If we think it could be chosen by the compiler. Another question is that, do we need to consider both of the allocator if the user doesn't provide two of them? For example, if the user provides promise_type::operator new(size_t) only, need we consider ::operator new(size_t, align_t)? Or if the user provides promise_type::new(size_t, align_t), do we still need to consider ::operator new(size_t)? I feel there must be decisions to make. I know it is easy to make a decision. But I think it is hard to get in consensus. So I don't support operator new(size_t, align_t).

But I think it should not be hard to support them based on this diff. For example, let's define the two expression:

Expr *AlignedAllocator;
Expr *AlignedDeallocator;

Then we need to initialize them in the Sema part. And in the CodeGen part, we could do:

Alloc:         ; controlled by `llvm.coro.alloc`
   %aligned = call i1 @llvm.coro.aligned()
   br %aligned, label %align_alloc, label %normal_alloc 
; ...
Dealloc:     ; controlled by the null ness of  `llvm.coro.free`
   br %aligned, label %align_dealloc, label %normal_dealloc

Then in the middle end, we could replace the value for llvm.coro.aligned simply.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Jul 18 2021, 7:23 PM
ChuanqiXu requested review of this revision.Jul 18 2021, 7:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2021, 7:23 PM
ChuanqiXu edited the summary of this revision. (Show Details)Jul 18 2021, 8:02 PM

Thanks for the patch. Does it make more sense to compare this with D100739 (The implementation details are slightly different but the idea is similar: generating code to handle the overaligned frame in the backend. )? Before actually reviewing this patch, I think a consensus is needed about where to implement this: LLVM or Clang. My preference would be Clang however I think implementing this in LLVM is simpler. I'd appreciate other reviewer's comments.

Thanks for the patch. Does it make more sense to compare this with D100739 (The implementation details are slightly different but the idea is similar: generating code to handle the overaligned frame in the backend. )?

Yeah, compared with D100739, the overall idea may be similar. The main overall difference may be the design for introduced intrinsics, llvm.coro.overalign in this diff and llvm.coro.size.aligned in D100739. The intrinsic llvm.coro.overalign emitted by the frontend returns the address of allocated space. It gives more control to users. What I mean is that the user may be available to touch the allocated space once coroutine frame and allocated space don't refer to the same thing anymore. And the design for llvm.coro.size and llvm.coro.size.aligned may be odd to me slightly. Since the LLVM coroutine intrinsics is language itself. I think it matters.

I think a consensus is needed about where to implement this: LLVM or Clang. My preference would be Clang however I think implementing this in LLVM is simpler. I'd appreciate other reviewer's comments.

My preference would be LLVM. First, it is more simpler. Second, it should be the responsibility in the middle end. Since Clang/LLVM coroutine leaves the work for manipulating the coroutine memories to the middle end by design. And I don't know what's the benefit we could get to implement it in clang. After all, we still need some work in the middle end.

ChuanqiXu planned changes to this revision.Nov 15 2021, 2:25 AM
ChuanqiXu abandoned this revision.Nov 15 2021, 7:38 PM