This is an archive of the discontinued LLVM Phabricator instance.

[IR] return nullptr in Instruction::getInsertionPointAfterDef for CallBrInst
AbandonedPublic

Authored by nickdesaulniers on Dec 15 2022, 1:42 PM.

Details

Summary

A recommended in
https://reviews.llvm.org/D135997#3991427.

I will fold this in to D135997 if it is accepted in code review on phab.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 1:42 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
nickdesaulniers requested review of this revision.Dec 15 2022, 1:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 1:42 PM
nikic added a comment.Dec 15 2022, 2:35 PM

This looks fine, apart from the coro-debug.ll bit, that I'm not familiar with.

llvm/lib/IR/Instruction.cpp
141–143

Maybe comment on why? (Def is available in multiple successors, there's no single dominating insertion point.)

ChuanqiXu added inline comments.Dec 15 2022, 6:05 PM
llvm/test/Transforms/Coroutines/coro-debug.ll
196–198

We don't care about the inserted checks in the test. It should be fine to check the llvm.dbg.declare is in the basic block of DEFAULT_DEST. So maybe we can check these 2 lines are not empty or we can check there is no new BB declaration before llvm.dbg.declare.

Not very familiar with this area, but it looks as though currently InstCombinerImpl::transformConstExprCastCall in InstCombineCalls.cpp will need updating as well. Specifically it may transform a CallBr instruction and insert a Cast for the return, for which it calls NewCall->getInsertionPointAfterDef() and asserts that the return is non-null. Not sure what the solution would be, but it might require dropping support for CallBr in transformConstExprCastCall if the return type is changed: there's already a block in there that bails out of transforming invoke or callbr instructions if they are used in a PHI node, since there's no place to insert the Cast in that case.

nickdesaulniers marked an inline comment as done.
  • add comment, fix InstCombinerImpl::transformConstExprCastCall

Not very familiar with this area, but it looks as though currently InstCombinerImpl::transformConstExprCastCall in InstCombineCalls.cpp will need updating as well. Specifically it may transform a CallBr instruction and insert a Cast for the return, for which it calls NewCall->getInsertionPointAfterDef() and asserts that the return is non-null. Not sure what the solution would be, but it might require dropping support for CallBr in transformConstExprCastCall if the return type is changed: there's already a block in there that bails out of transforming invoke or callbr instructions if they are used in a PHI node, since there's no place to insert the Cast in that case.

Good catch! (I think we were ok since there's already a guard on the callee being a Function and not an InlineAsm and today we don't have frontends generate callbr to Functions; but we could, so I've cleaned that all up, PTAL).

llvm/test/Transforms/Coroutines/coro-debug.ll
196–198

I'm not sure how best to express that to FileCheck.

; CHECK-NEXT-NOT: {{.*}}:

?

ChuanqiXu added inline comments.Dec 20 2022, 5:44 PM
llvm/test/Transforms/Coroutines/coro-debug.ll
196–198

I feel it is a good way to check there is no new BB declaration before llvm.dbg.declare

MaskRay added inline comments.
llvm/test/Transforms/Coroutines/coro-debug.ll
196–198

Actually I think utils/update_test_checks.py may not be a bad choice for this large test file.

If it is not time to migrate the test, I think the patch as-is using ; CHECK-NEXT: %1 = load i8*, i8** ; ... looks good.
It doesn't appear that there is more maintenance burden than ; CHECK-NEXT-NOT: {{.*}}:

void accepted this revision.Dec 21 2022, 2:42 PM
This revision is now accepted and ready to land.Dec 21 2022, 2:42 PM
llvm/test/Transforms/Coroutines/coro-debug.ll
196–198

Curiously, I tried removing existing CHECK lines from this test, then running ./llvm/utils/update_test_checks.py on this. The result doesn't pass llvm-lit -vv llvm/test/Transforms/Coroutines/coro-debug.ll which may be why this test did not use ./llvm/utils/update_test_checks.py in the first place.

ChuanqiXu added inline comments.Dec 21 2022, 6:29 PM
llvm/test/Transforms/Coroutines/coro-debug.ll
196–198

Personally I prefer to not use utils/update_test_checks.py in coroutine's tests. Since CoroSplit pass will generate many codes and it is pretty hard to read. Currently when I see the coroutine's tests, I can know the pattern of the interesting part easily instead of reading tons of CHECK (that's what I feel when I read the test generated by utils/update_test_checks.py).

And for this patch, it is true that ; CHECK-NEXT: %1 = load i8*, i8** ; ... wouldn't matter a lot. But I still feel better to make the tests focus on the things it want to test.

llvm/test/Transforms/Coroutines/coro-debug.ll
196–198

I never actually tested that ; CHECK-NEXT-NOT: {{.*}}: would be viable. Why I try it:

llvm-project/llvm/test/Transforms/Coroutines/coro-debug.ll:196:9: error: unsupported -NOT combo on prefix 'CHECK'

MaskRay added a subscriber: dblaikie.EditedDec 22 2022, 11:39 AM

I agree that ; CHECK-NOT: {{.*}}: is uncommon and doesn't seem particularly useful in this case.
Actually, I think the test has a fair chance that it may bit rot and for maintainability we probably should a comment about the C++ source and some comments about what the test file test.

CC @dblaikie for the coro-debug.ll debate. The test was added from D33614.

we probably should a comment about the C++ source

Maybe there is not a C++ source for this test file. On the one side, the Coroutines intrinsics in LLVM is intended to be an extension for LLVM, which shouldn't be dependent on the frontend language. I mean, maybe it is not possible to find a C++ program which can generate the same or similar LLVM IR with the test in LLVM Coroutines tests. On the other side, you may be interested that why we don't test directly for the IR generated from a real C++ program. The answer from me is that it'll generate too many codes from a real C++ coroutines program due to the grammar of C++ (coroutines). Previously when I tried to use the IR generated from C++ directly, I found it is too big and complex so that I'll be confusing about the test (or in another word, I feel developers can't be focus on the test in that way).

So as a result, many tests in LLVM coroutines are written by hand instead of using the generated IR from a real C++ program.

and some comments about what the test file test.

For this test file, the things it test should be: the debug information for a coroutine function should be remained in its split functions. (Background: a coroutine will be split into pieces). And personally I feel it is already addressed in the top of the file.

Actually, I think the test has a fair chance that it may bit rot

We indeed lack some test for coroutines. But for the specific test, I feel it is good as far as I know.

llvm/test/Transforms/Coroutines/coro-debug.ll
196–198

I think we can refactor them into:

; CHECK: [[DEFAULT_DEST]]:
; CHECK-NOT: {{.*}}:
; CHECK: call void @llvm.dbg.declare(metadata i32 %[[CALLBR_RES]]

Since currently there is no NEXT relationships.

I agree that ; CHECK-NOT: {{.*}}: is uncommon and doesn't seem particularly useful in this case.

I read it as "check that there's not another label between the previous checked label and the following instruction" - which seems OK to me? I might be inclined to firm it up a bit by adding {{$}} at the end to make it more clear that it's checking for a label and not a : that might appear anywhere else in a match line. (if it needs to be resilient to autogenerated comments after the label - yeah, more regexing would be OK, {{( ;.*)?$}} or whatever is necessary to make that work seems OK to me)

Actually, I think the test has a fair chance that it may bit rot and for maintainability we probably should a comment about the C++ source and some comments about what the test file test.

CC @dblaikie for the coro-debug.ll debate. The test was added from D33614.

I'm not sure I'm following all aspects of the discussion here - one is whether we should include the full C++ source and make the IR match the source, which will make it longer but maybe more explainable/maintainable in some ways (handcrafting a few IR instructions is one thing, but if it's so complicated/long we can't reasonably handcraft/modify it, then maybe it's better to generate it from clang instead). Especially for debug information, we tend to use frontend-generated IR, as simplified as possible. Though this test case might be a different situation - the IR is long, but not unmaintainably so, I think? How long's the frontend-generated IR (after as much IR optimization as is acceptable while still preserving the interesting thing to test?)?

How long's the frontend-generated IR (after as much IR optimization as is acceptable while still preserving the interesting thing to test?)?

For most tests under llvm/test/Transform/Coroutines, there is not a corresponding frontend source due to the design of coroutine will generate too many codes.

For example, for the following simple coroutine which returns 43 simply:

#include <cstdio>
#include <coroutine>
#include <exception>
#include <cassert>

template <typename T> struct task {
	struct promise_type {
		T value{123};
		std::coroutine_handle<> caller{std::noop_coroutine()};
		
		struct final_awaiter: std::suspend_always {
			auto await_suspend(std::coroutine_handle<promise_type> me) const noexcept {
				return me.promise().caller;
			}
		};
		
		constexpr auto initial_suspend() const noexcept {
			return std::suspend_always();
		}
		constexpr auto final_suspend() const noexcept {
			return final_awaiter{};
		}
		auto unhandled_exception() noexcept {
			// ignore
		}
		constexpr void return_value(T v) noexcept {
			value = v;
		} 
		constexpr auto & get_return_object() noexcept {
			return *this;
		}
	};
	
	using coroutine_handle = std::coroutine_handle<promise_type>;
	
	promise_type & promise{nullptr};
	
	task(promise_type & p) noexcept: promise{p} { }
	
	~task() noexcept {
		coroutine_handle::from_promise(promise).destroy();
	}
	
	auto await_ready() noexcept {
        return false;
    }

    auto await_suspend(std::coroutine_handle<> caller) noexcept {
        promise.caller = caller;
        return coroutine_handle::from_promise(promise);
    }

    constexpr auto await_resume() const noexcept {
        return promise.value;
    }
	
	// non-coroutine access to result
	auto get() noexcept {
		const auto handle = coroutine_handle::from_promise(promise);
		
		if (!handle.done()) {
			handle.resume();
		}
		
        return promise.value;
	}
};


auto a() noexcept -> task<int> {
	co_return 42;
}

Note that there is only a coroutine a() in the above example. The frontend generated code (with -g) will consist of 1144 lines of IR codes. I get it by:

clang++ -std=c++20 src.cpp -O3 -S -emit-llvm -g -Xclang -disable-llvm-passes -o frontend-generated.ll

And for the optimized (but before coroutine splitting) IR, it'll still consist of 645 lines of IR code. (I get it by inserting codes in CoroSplit.cpp). So I feel it is too lengthy for a lit test.

BTW, I feel the discussion is a little bit far from the revision itself. I suggest the revision land first if it removes the unnecessary check in coro-debug.ll. Then we can discuss the quality/maintainability for the test of coroutines somewhere else.

Note that there is only a coroutine a() in the above example. The frontend generated code (with -g) will consist of 1144 lines of IR codes. I get it by:

clang++ -std=c++20 src.cpp -O3 -S -emit-llvm -g -Xclang -disable-llvm-passes -o frontend-generated.ll

And for the optimized (but before coroutine splitting) IR, it'll still consist of 645 lines of IR code. (I get it by inserting codes in CoroSplit.cpp). So I feel it is too lengthy for a lit test.

That's with all LLVM passes disabled - it might be interesting if there's something more manageable with some optimization passes applied? (I guess if you don't disable them from clang, -O3 ends up lowering the IR beyond the coroutine abstracitons - so you'd have to do this manually with opt or something to only apply some optimizations and stop before the coroutines were lowered too far/beyond what's needed for this test)

Though, yeah, it still wouldn't surprise me if it's infeasibly long.

That said - I'm still confused about the CHECK-NOT for the label - that looked roughly OK to me/didn't seem too brittle (at least with checking for the end of line, if that's practical)

Note that there is only a coroutine a() in the above example. The frontend generated code (with -g) will consist of 1144 lines of IR codes. I get it by:

clang++ -std=c++20 src.cpp -O3 -S -emit-llvm -g -Xclang -disable-llvm-passes -o frontend-generated.ll

And for the optimized (but before coroutine splitting) IR, it'll still consist of 645 lines of IR code. (I get it by inserting codes in CoroSplit.cpp). So I feel it is too lengthy for a lit test.

That's with all LLVM passes disabled - it might be interesting if there's something more manageable with some optimization passes applied? (I guess if you don't disable them from clang, -O3 ends up lowering the IR beyond the coroutine abstracitons - so you'd have to do this manually with opt or something to only apply some optimizations and stop before the coroutines were lowered too far/beyond what's needed for this test)

As far as I know, I don't the method to generate the IR before Coro-splitting automatically. In the past I always tried to insert codes in CoroSplit pass manually. I didn't feel tired for it.

Though, yeah, it still wouldn't surprise me if it's infeasibly long.

That said - I'm still confused about the CHECK-NOT for the label - that looked roughly OK to me/didn't seem too brittle (at least with checking for the end of line, if that's practical)

In the test, we don't care about the inserted instructions. We only care about that the @llvm.dbg.declare lives in the DEFAULT_DEST BB. So it is slightly better to me if we only check the interesting part. The good part if one day someone else made a similar change (insert some instructions to the BB or remove some instructions in the BB), then he probably don't need to worry about the test. Personally, I feel bad when I made a change and I ran the test and many test failed but I found they are not related to my change. So I feel better to add CHECK-NOT to skip these tests.

Note that there is only a coroutine a() in the above example. The frontend generated code (with -g) will consist of 1144 lines of IR codes. I get it by:

clang++ -std=c++20 src.cpp -O3 -S -emit-llvm -g -Xclang -disable-llvm-passes -o frontend-generated.ll

And for the optimized (but before coroutine splitting) IR, it'll still consist of 645 lines of IR code. (I get it by inserting codes in CoroSplit.cpp). So I feel it is too lengthy for a lit test.

That's with all LLVM passes disabled - it might be interesting if there's something more manageable with some optimization passes applied? (I guess if you don't disable them from clang, -O3 ends up lowering the IR beyond the coroutine abstracitons - so you'd have to do this manually with opt or something to only apply some optimizations and stop before the coroutines were lowered too far/beyond what's needed for this test)

As far as I know, I don't the method to generate the IR before Coro-splitting automatically. In the past I always tried to insert codes in CoroSplit pass manually. I didn't feel tired for it.

Sorry, I didn't follow this - could you rephrase?

Though, yeah, it still wouldn't surprise me if it's infeasibly long.

That said - I'm still confused about the CHECK-NOT for the label - that looked roughly OK to me/didn't seem too brittle (at least with checking for the end of line, if that's practical)

In the test, we don't care about the inserted instructions. We only care about that the @llvm.dbg.declare lives in the DEFAULT_DEST BB. So it is slightly better to me if we only check the interesting part. The good part if one day someone else made a similar change (insert some instructions to the BB or remove some instructions in the BB), then he probably don't need to worry about the test. Personally, I feel bad when I made a change and I ran the test and many test failed but I found they are not related to my change. So I feel better to add CHECK-NOT to skip these tests.

Yeah, roughly following here & I think that's my understanding as well.

@MaskRay Could you describe more/restate your concerns with the particular CHECK-NOT: {{.*}}: you're referring to?

Note that there is only a coroutine a() in the above example. The frontend generated code (with -g) will consist of 1144 lines of IR codes. I get it by:

clang++ -std=c++20 src.cpp -O3 -S -emit-llvm -g -Xclang -disable-llvm-passes -o frontend-generated.ll

And for the optimized (but before coroutine splitting) IR, it'll still consist of 645 lines of IR code. (I get it by inserting codes in CoroSplit.cpp). So I feel it is too lengthy for a lit test.

That's with all LLVM passes disabled - it might be interesting if there's something more manageable with some optimization passes applied? (I guess if you don't disable them from clang, -O3 ends up lowering the IR beyond the coroutine abstracitons - so you'd have to do this manually with opt or something to only apply some optimizations and stop before the coroutines were lowered too far/beyond what's needed for this test)

As far as I know, I don't the method to generate the IR before Coro-splitting automatically. In the past I always tried to insert codes in CoroSplit pass manually. I didn't feel tired for it.

Sorry, I didn't follow this - could you rephrase?

Sorry. I thought you were asking: "if there is an automatic/manageable way we can get the IR applied with some optimizations but before coro-spliting?". Then my reply is "No, I don't know such methods. I always create it by editing the codes."

  • rebase, format, fix coro-debug.ll test as per @ChuanqiXu
nickdesaulniers marked 5 inline comments as done.Jan 10 2023, 11:52 AM
ChuanqiXu accepted this revision.Jan 10 2023, 5:55 PM

LGTM. Thanks for your patience!

Small point about the CallBr guard, but no other issues from my PoV!

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3381–3383

This might be more aggressive than necessary, since if the return type doesn't change or the call instruction has no use, a PHI does not need to be inserted; this could be moderated by moving this to the block at line 3409 (if (OldRetTy != NewRetTy)), and checking whether the return value has any uses. With that said, this is a minor point since as you said this won't even be touched at this point, so doesn't need to block merging imo.

3412

See above comment.

nikic added inline comments.Jan 11 2023, 10:03 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3381–3383

This entire transform is not relevant for callbr, it only works on calls to functions, not calls to inline asm. You could replace this with assert(!isa<CallBrInst>(Call)).

nickdesaulniers planned changes to this revision.Jan 17 2023, 12:39 PM
nickdesaulniers marked an inline comment as done.Jan 17 2023, 12:55 PM
nickdesaulniers added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3381–3383

adding the assert causes Transforms/InstCombine/freeze.ll to fail, hitting that assertion.

nickdesaulniers marked 3 inline comments as done.
This revision is now accepted and ready to land.Jan 17 2023, 1:06 PM
nikic added inline comments.Jan 17 2023, 1:21 PM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3381–3383

Where did you place the assert? It needs to be after the callee check above, not at the start of the function.

nickdesaulniers marked an inline comment as done.
  • use assert (in correct location) as per @nikic
efriedma accepted this revision.Jan 18 2023, 11:33 AM
void accepted this revision.Jan 18 2023, 3:25 PM
  • rebase, format
MaskRay accepted this revision.Feb 6 2023, 9:44 AM