This is an archive of the discontinued LLVM Phabricator instance.


Differential D55966

Ensure coro split pass only spills variables dominated by CoroBegin
Abandoned · Public

Authored by GorNishanov on Dec 20 2018, 3:42 PM.


Details

Reviewers

modocache
tks2103

Summary

Ensure coro split pass only spills variables dominated by CoroBegin

Fixes https://bugs.llvm.org/show_bug.cgi?id=36578.

The old coro split code didn't run safety checks against certain
spillable variables; it would try to spill things that were not
dominated by CoroBegin.

This problem gets worse after the mem2reg
pass is run, as the code which checks that Arguments are safe to
spill is even more lax than the code which checks that
Instructions are safe to spill.

There's a test which checks two things:

First, that a store/load optimized away by mem2reg doesn't fail.

Second, if there is an allocator which takes a potentially
spillable argument, we have to check all of the other arguments
to that allocator to ensure there are no unspillable instructions
as well.
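
The dominance condition described in this summary can be sketched outside of LLVM. The following is a minimal illustration, not the code in this patch: it assumes a toy CFG in which block 0 is the entry, each block is treated as a unit, and the coro.begin call is assumed to start its own block. The names `Preds`, `computeDominators` and `allUsersDominatedBy` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy CFG: block 0 is the entry; preds[b] lists the
// predecessors of block b. This is not LLVM's representation.
using Preds = std::vector<std::vector<int>>;
using DomSets = std::vector<std::vector<bool>>;

// Iterative data-flow computation of dominator sets.
// dom[b][d] is true when block d dominates block b.
DomSets computeDominators(const Preds &preds) {
  const std::size_t n = preds.size();
  assert(n > 0 && "the CFG needs at least an entry block");
  DomSets dom(n, std::vector<bool>(n, true));
  dom[0].assign(n, false);
  dom[0][0] = true;  // the entry block is dominated only by itself
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      // Intersect the dominator sets of all predecessors.
      std::vector<bool> next(n, true);
      for (int p : preds[b])
        for (std::size_t d = 0; d < n; ++d)
          next[d] = next[d] && dom[p][d];
      if (preds[b].empty())
        next.assign(n, false);  // unreachable block: nothing dominates it
      next[b] = true;           // every block dominates itself
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// A value is considered safe to spill only if every block that uses it
// is dominated by the block that starts with coro.begin.
bool allUsersDominatedBy(int beginBlock, const std::vector<int> &userBlocks,
                         const DomSets &dom) {
  for (int u : userBlocks)
    if (!dom[u][beginBlock])
      return false;
  return true;
}
```

With a CFG of the shape 0 -> 1 -> {2, 3} and coro.begin in block 1, a value used only in blocks 2 and 3 passes the check, while a value that is also used in block 0 (for example by the allocator call) does not.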

Diff Detail

Repository

rL LLVM

Build Status

Buildable 26527
Build 26526: arc lint + arc unit

Event Timeline

tks2103 created this revision. Dec 20 2018, 3:42 PM

Herald added a subscriber: llvm-commits. Dec 20 2018, 3:42 PM

Harbormaster completed remote builds in B26208: Diff 179181. Dec 20 2018, 3:42 PM

Excellent progress! Thanks for working on this.

However I don't think this logic is quite right. My understanding is that in the context of coroutines transforms, the term "spill" refers to the definition of a value appearing before a suspend point, and a use of that value appearing after. All coroutines include an implicit suspend point at their start and end: the promise_type.initial_suspend and .final_suspend points. So any argument passed into a coroutine and then used from within its body does in fact "spill". For example:

task<int> foo(int x) {
  // Implicit initial suspend point here.
  int y = x + 1; // The definition of 'x' comes at the start of the function, and its use is here past the implicit initial suspend point, so 'x' "spills" across a suspend point.
  // ...
}

Because x spills, the coroutine transform passes move it onto the coroutine frame. This ensures that the value of x is preserved across suspend points.

The logic in this patch, however, prevents x in the example above from being considered a "spill." As a result, it is not moved onto the coroutine frame, and so its value is not preserved.
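
To make the notion of a "spill" concrete, here is a minimal sketch under strong simplifying assumptions: the function body is modeled as a straight line of numbered positions, and a value spills when a suspend point lies strictly between its definition and one of its uses. `ToyValue`, `crossesSuspend` and `collectSpills` are hypothetical names, and LLVM's real check works on the control-flow graph rather than on positions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model of a value in a straight-line function body.
struct ToyValue {
  std::string name;
  int defPos;               // position where the value is defined
  std::vector<int> usePos;  // positions where the value is used
};

// Returns true if some suspend point lies strictly between the definition
// and one of the uses, so the value has to survive a suspension.
bool crossesSuspend(const ToyValue &v, const std::vector<int> &suspendPos) {
  for (int use : v.usePos)
    for (int s : suspendPos)
      if (v.defPos < s && s < use)
        return true;
  return false;
}

// Collects the names of all values that would need a slot in the frame.
std::vector<std::string> collectSpills(const std::vector<ToyValue> &values,
                                       const std::vector<int> &suspendPos) {
  assert(!values.empty() && "expected at least one value to classify");
  std::vector<std::string> spills;
  for (const ToyValue &v : values)
    if (crossesSuspend(v, suspendPos))
      spills.push_back(v.name);
  return spills;
}
```

Modeling foo above with x defined at position 0, the implicit initial suspend at position 1, and y defined at position 2 from x, only x is reported as a spill; y is defined and used after the suspend point.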

You can see this in practice by looking at the sample program in https://godbolt.org/z/YNX4dS, which I'll also paste below:

#include <cstdlib>  // std::exit, malloc
#include <experimental/coroutine>
#include <iostream>

struct task {
  struct promise_type;
  using handle_type = std::experimental::coroutine_handle<promise_type>;

  struct promise_type {
    task get_return_object() { return task{handle_type::from_promise(*this)}; }
    std::experimental::suspend_always initial_suspend() { return {}; }
    std::experimental::suspend_always final_suspend() { return {}; }
    void return_void() {}
    void unhandled_exception() { std::exit(1); }

    void *operator new(size_t sz, int x) {
      std::cout << __PRETTY_FUNCTION__ << " -> " << x << std::endl;
      return malloc(sz);
    }
  };

  handle_type handle = nullptr;

  explicit task(handle_type handle) : handle(handle) {}
  task(task&& t) : handle(t.handle) { t.handle = nullptr; }
  ~task() { if (handle) handle.destroy(); }
  void resume() { handle.resume(); }
};

task foo(int x) {
  std::cout << __PRETTY_FUNCTION__ << " -> " << x << std::endl;
  co_return;
}

int main() {
  task t = foo(42);
  t.resume();
}

At -O0 the program compilation succeeds, and when run it produces this output:

static void *task::promise_type::operator new(size_t, int) -> 42
task foo(int) -> 42

At -O1 the program produces the error described in https://bugs.llvm.org/show_bug.cgi?id=36578. With your patch applied, it no longer triggers the LLVM coroutines assert, but the value of x is lost:

static void *task::promise_type::operator new(size_t, int) -> 42
task foo(int) -> 0

So while this patch avoids the assert, it doesn't fix the underlying issue: that x spills and so needs to be stored on the coroutine frame, but it cannot be moved onto the coroutine frame because it is *also* used in the allocator of the frame. I'm not sure what the ideal solution here is... perhaps x needs to be duplicated as x.allocarg -- x.allocarg would be passed into the allocator and would not spill, whereas x would continue to be used in the body of the function and would be spilled (and could be moved past and dominated by @llvm.coro.begin). But I am not certain this would work in practice -- maybe @GorNishanov has thought about this problem some more since we last spoke?

lib/Transforms/Coroutines/CoroFrame.cpp
884	I believe `llvm::cast<T>` asserts if the cast to `T` fails, so this would assert for users that are not `Instruction`, such as `Constant`. In the future I'd recommend using `dyn_cast`, since it returns a false-y value if the cast fails:

if (auto *I = dyn_cast<Instruction>(U)) {
  if (!DT.dominates(..., I)) {
    ...
  }
}

Writing it this way would also remove the need for the second `cast`.
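
The difference between a checked and an unchecked downcast can be shown without LLVM. The sketch below uses standard `dynamic_cast` as a stand-in for `llvm::dyn_cast`; LLVM's own casting machinery is implemented differently. `ToyUser`, `ToyInstruction`, `ToyConstant`, `toyDynCast` and `countUndominated` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for llvm::User, llvm::Instruction, llvm::Constant.
struct ToyUser {
  virtual ~ToyUser() = default;
};
struct ToyInstruction : ToyUser {
  bool dominatedByBegin = false;
};
struct ToyConstant : ToyUser {};

// Checked downcast in the spirit of dyn_cast: yields nullptr on a mismatch
// instead of asserting.
template <typename To, typename From> To *toyDynCast(From *v) {
  assert(v != nullptr && "like dyn_cast, this expects a non-null pointer");
  return dynamic_cast<To *>(v);
}

// Counts instruction users that are not dominated, skipping every user
// that is not an instruction.
int countUndominated(const std::vector<ToyUser *> &users) {
  int count = 0;
  for (ToyUser *u : users)
    if (auto *i = toyDynCast<ToyInstruction>(u))
      if (!i->dominatedByBegin)
        ++count;
  return count;
}
```

Because the cast result is tested in the `if` condition, a non-instruction user is simply skipped, and no second cast is needed inside the body.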

This revision now requires changes to proceed. Dec 21 2018, 2:33 PM

Hmmm, duh.

Let me look at what's in the coroutine frame in the test case I have set up without running -mem2reg. If it's doing the right thing in that case, that'll provide a clue.

Creating copies for a scalar that is used in operator new and in the body of the function is a sound strategy.
Notice what happens if we replace the int x parameter with Int x, using the following int-like class:

struct Int {
  int value;
  Int(int v) : value(v) {}
  friend std::ostream& operator<<(std::ostream& o, const Int& me) {
    return o << me.value;
  }
  ~Int() { puts("destructor so that the Int is passed as UDT"); }
};

The problem goes away. This happens because operator new operates on the raw parameters, while the body operates on copies of the parameters.
For scalar types, or for UDTs that can be turned into scalars after optimizations, the copy of the parameter and the raw parameter are collapsed into the same value.

I think creating a copy of the scalar that has to go into the coroutine frame and that is used both before and after coro.begin is sound, since it restores a property that was present in the conceptual coroutine model and was optimized away by the passes that run before CoroSplit.
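
The copy-based approach can be pictured with a hand-written state machine. This is only a sketch of the idea, assuming a made-up frame layout; the real frame is built by CoroSplit, and `ToyFrame`, `toyAllocate`, `toyRamp`, `toyResume` and `toyDestroy` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical hand-written frame; not the layout CoroSplit produces.
struct ToyFrame {
  int xCopy;     // copy of parameter x, used by the body after resumption
  int observed;  // value the body saw, recorded for inspection
};

int lastAllocArg = 0;  // records what the allocator was given

// Stand-in for promise_type::operator new(size_t, int): it receives the
// raw parameter, because the frame does not exist yet.
void *toyAllocate(std::size_t size, int rawX) {
  lastAllocArg = rawX;
  return std::malloc(size);
}

// "Ramp" part: allocate using the raw parameter, then copy it into the
// frame, so that all later uses come after the frame is set up.
ToyFrame *toyRamp(int x) {
  void *mem = toyAllocate(sizeof(ToyFrame), x);
  assert(mem != nullptr);
  return new (mem) ToyFrame{x, 0};  // suspended at the initial suspend point
}

// "Resume" part: only the frame is available here.
void toyResume(ToyFrame *frame) { frame->observed = frame->xCopy; }

void toyDestroy(ToyFrame *frame) { std::free(frame); }
```

The allocator and the body never share a single value: the allocator reads the raw parameter before the frame exists, and the body reads the copy stored in the frame.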

updated to handle case for a scalar spill Argument

Harbormaster completed remote builds in B26527: Diff 180732. Jan 8 2019, 1:49 PM

GorNishanov requested changes to this revision. Jan 9 2019, 4:34 PM

GorNishanov added inline comments.

lib/Transforms/Coroutines/CoroFrame.cpp
932	This does not look right. It states that we never spill arguments that are pointer types, whether we use them across a suspend point or not. Any access to those arguments after a suspend point will be garbage. To observe, add an `i16* %x` argument that you use after a suspend (either on the resume path or the destroy path) and observe that in the .resume or .destroy parts of the coroutine any access to %x is replaced with `undef`.

This revision now requires changes to proceed. Jan 9 2019, 4:34 PM

Any update on this issue? We're trying to upgrade the tool chain for a substantial code base from clang 6.0 to a later version, and have been seeing this error starting with clang 7.0. It also appears in clang 8.0 and 9.0, so we're stuck making any progress to upgrade.

Herald added a project: Restricted Project. Jul 22 2019, 1:11 PM

Noted, I'll try to work on this issue in early August, at the latest. In the meantime feel free to claim that bug report and submit a patch. Unfortunately unless someone develops a patch and immediately requests it be merged into the LLVM 9.0.0 release branch, I don't think it'll be in 9.0.0.

In D55966#1596430, @modocache wrote:

Noted, I'll try to work on this issue in early August, at the latest. In the meantime feel free to claim that bug report and submit a patch. Unfortunately unless someone develops a patch and immediately requests it be merged into the LLVM 9.0.0 release branch, I don't think it'll be in 9.0.0.

Let me see if I can make the fix this week.

In D55966#1599670, @GorNishanov wrote:

In D55966#1596430, @modocache wrote:

Noted, I'll try to work on this issue in early August, at the latest. In the meantime feel free to claim that bug report and submit a patch. Unfortunately unless someone develops a patch and immediately requests it be merged into the LLVM 9.0.0 release branch, I don't think it'll be in 9.0.0.

Let me see if I can make the fix this week.

Hopefully this week

This is no longer needed. https://reviews.llvm.org/D66230 fixes the problem.

GorNishanov abandoned this revision. Aug 16 2019, 10:16 AM

Revision Contents

Path                                              Size
lib/Transforms/Coroutines/CoroFrame.cpp           31 lines
test/Transforms/Coroutines/coro-split-mem2reg.ll  64 lines

Diff 180732

lib/Transforms/Coroutines/CoroFrame.cpp

@@ 791 lines not shown @@ for (auto const &E : Spills) {
     }

     // Replace all uses of CurrentDef in the current instruction with the
     // CurrentMaterialization for the block.
     E.user()->replaceUsesOfWith(CurrentDef, CurrentMaterialization);
   }
 }

+// Any values with users not dominated by CoroBegin can't be spilled
+static bool allUsersDominatedByCoroBegin(Value &V, Instruction &CBInst, Function &F) {
+  DominatorTree DT(F);
+  for (User *U : V.users()) {
+    auto isInstruction = dyn_cast_or_null<Instruction>(U);
+    if (isInstruction && !DT.dominates(&CBInst, isInstruction))
+      return false;
+  }
+  return true;
+}
+
 // Move early uses of spilled variable after CoroBegin.
 // For example, if a parameter had address taken, we may end up with the code
 // like:
 //        define @f(i32 %n) {
 //          %n.addr = alloca i32
 //          store %n, %n.addr
 //          ...
 //          call @coro.begin
@@ 14 lines not shown @@ for (auto const &E : Spills) {
     for (User *U : CurrentValue->users()) {
       Instruction *I = cast<Instruction>(U);
       if (!DT.dominates(CoroBegin, I)) {
         LLVM_DEBUG(dbgs() << "will move: " << *I << "\n");

         // TODO: Make this more robust. Currently if we run into a situation
         // where simple instruction move won't work we panic and
         // report_fatal_error.
+        auto isArg = dyn_cast_or_null<Argument>(CurrentValue);
+        if (!isArg || CurrentValue->getType()->isPointerTy()) {
         for (User *UI : I->users()) {
-          if (!DT.dominates(CoroBegin, cast<Instruction>(UI)))
+          if (!DT.dominates(CoroBegin, cast<Instruction>(UI))) {
             report_fatal_error("cannot move instruction since its users are not"
                                " dominated by CoroBegin");
         }
+        }
         NeedsMoving.push_back(I);
       }
     }
   }
+  }

   Instruction *InsertPt = CoroBegin->getNextNode();
   for (Instruction *I : NeedsMoving)
     I->moveBefore(InsertPt);
 }

 // Splits the block at a particular instruction unless it is the first
 // instruction in the block with a single predecessor.
@@ 14 lines not shown @@ static void splitAround(Instruction *I, const Twine &Name) {
   splitBlockIfNotFirst(I, Name);
   splitBlockIfNotFirst(I->getNextNode(), "After" + Name);
 }

 void coro::buildCoroutineFrame(Function &F, Shape &Shape) {
   // Lower coro.dbg.declare to coro.dbg.value, since we are going to rewrite
   // access to local variables.
   LowerDbgDeclare(F);

   Shape.PromiseAlloca = Shape.CoroBegin->getId()->getPromise();
   if (Shape.PromiseAlloca) {
     Shape.CoroBegin->getId()->clearPromise();
   }

   // Make sure that all coro.save, coro.suspend and the fallthrough coro.end
   // intrinsics are in their own blocks to simplify the logic of building up
   // SuspendCrossing data.
@@ 31 lines not shown @@ for (int Repeat = 0; Repeat < 4; ++Repeat) {
     LLVM_DEBUG(dump("Materializations", Spills));
     rewriteMaterializableInstructions(Builder, Spills);
     Spills.clear();
   }

   // Collect the spills for arguments and other not-materializable values.
   for (Argument &A : F.args())
     for (User *U : A.users())
-      if (Checker.isDefinitionAcrossSuspend(A, U))
+      if (Checker.isDefinitionAcrossSuspend(A, U) && !A.getType()->isPointerTy())
         Spills.emplace_back(&A, U);

   for (Instruction &I : instructions(F)) {
     // Values returned from coroutine structure intrinsics should not be part
     // of the Coroutine Frame.
     if (isCoroutineStructureIntrinsic(I) || &I == Shape.CoroBegin)
       continue;
     // The Coroutine Promise always included into coroutine frame, no need to
     // check for suspend crossing.
     if (Shape.PromiseAlloca == &I)
       continue;

     for (User *U : I.users())
-      if (Checker.isDefinitionAcrossSuspend(I, U)) {
+      if (Checker.isDefinitionAcrossSuspend(I, U) &&
+          allUsersDominatedByCoroBegin(I, *Shape.CoroBegin, F)) {
         // We cannot spill a token.
         if (I.getType()->isTokenTy())
           report_fatal_error(
               "token definition is separated from the use by a suspend point");
         Spills.emplace_back(&I, U);
       }
   }
   LLVM_DEBUG(dump("Spills", Spills));
   moveSpillUsesAfterCoroBegin(F, Spills, Shape.CoroBegin);
   Shape.FrameTy = buildFrameType(F, Shape, Spills);
   Shape.FramePtr = insertSpills(Spills, Shape);
 }

test/Transforms/Coroutines/coro-split-mem2reg.ll

This file was added.

; Tests that coro-split can handle the case when a memory reference
; which crosses a suspend point is promoted to a register reference
; by mem2reg.
; RUN: opt < %s -mem2reg -coro-early -coro-split -coro-elide -S | FileCheck %s

%Allocator = type { void (%Allocator) }

declare i8* @customalloc(void (%Allocator) nonnull, %Allocator* nonnull, i32)

define i8* @amain(%Allocator, i32 %scalar, i16) {
Entry:
  %allocStore = alloca %Allocator*, align 8
  %id = call token @llvm.coro.id(i32 16, i8* null, i8* null, i8* null)
  store %Allocator* %0, %Allocator** %allocStore, align 8
  %allocArg = getelementptr inbounds %Allocator, %Allocator* %0, i32 0, i32 0
  %allocPtr = load void (%Allocator), void (%Allocator)* %allocArg, align 8
  %allocSuccess = call fastcc i8* @customalloc(void (%Allocator) %allocPtr, %Allocator* %0, i32 %scalar)
  %coroBegin = call i8* @llvm.coro.begin(token %id, i8* %allocSuccess)
  br label %CoroSuspend

CoroSuspend: ; preds = %Entry
  %suspend = call i8 @llvm.coro.suspend(token none, i1 true)
  switch i8 %suspend, label %Suspend [
    i8 0, label %InvalidResume
    i8 1, label %CheckFree
  ]

Suspend: ; preds = %CheckFree, %CoroSuspend
  %end = call i1 @llvm.coro.end(i8* null, i1 false)
  ret i8* %coroBegin

InvalidResume: ; preds = %CoroSuspend
  unreachable

CheckFree: ; preds = %CoroSuspend
  %allocLoad = load %Allocator, %Allocator* %allocStore, align 8
  %allocLoadPtr = getelementptr inbounds %Allocator, %Allocator* %allocLoad, i32 0, i32 0
  %allocLoadVoid = load void (%Allocator), void (%Allocator)* %allocLoadPtr, align 8
  %scalarStore = alloca i32
  store i32 %scalar, i32* %scalarStore
  call i8* @llvm.coro.free(token %id, i8* %coroBegin)
  br label %Suspend
}

; CHECK: amain.Frame = type { void (%amain.Frame), void (%amain.Frame), i1, i1, i32 }

; CHECK-LABEL: @amain(
; CHECK-NOT: %allocStore =
; CHECK-NOT: store %Allocator* %0, %Allocator** %allocStore

; CHECK-LABEL: @amain.destroy(
; CHECK-NOT: %allocStore.reload.addr =
; CHECK: ret void

; CHECK-LABEL: @amain.cleanup(
; CHECK-NOT: %allocStore.reload.addr =
; CHECK: ret void

declare token @llvm.coro.id(i32, i8* readnone, i8* nocapture readonly, i8*)
declare i64 @llvm.coro.size.i64()
declare i8* @llvm.coro.begin(token, i8* writeonly)
declare i8 @llvm.coro.suspend(token, i1)
declare i1 @llvm.coro.end(i8*, i1)
declare i8* @llvm.coro.free(token, i8* nocapture readonly)