This is an archive of the discontinued LLVM Phabricator instance.

[Coroutine][Debug] Add line and column number to suspension point id
Abandoned · Public

Authored by avogelsgesang on Aug 19 2022, 8:36 AM.

Details

Summary

Corresponding discussion: https://discourse.llvm.org/t/rfc-debug-info-for-coroutine-suspension-locations/64721

This commit is not ready to be merged! I am posting it for review mostly
to get early feedback on the overall approach. Test cases etc. are
currently still missing. Also, if we reach consensus that we want to
pursue this approach, I still need to update
https://clang.llvm.org/docs/DebuggingCoroutines.html.

This commit improves the debug representation of the "suspension point
id" inside a coroutine frame. So far, the suspension point was presented
to the debugger as a simple integer. There was no simple way to map
this integer back to the corresponding line/column numbers of the
suspension point.

With this change, the suspension point is instead represented as an
enum whose individual enum values are named line_*_column_*.
Furthermore, this commit renames __coro_index to
__suspension_point. I think this name is better suited for the
debugger because it is more familiar to end users: as a C++ programmer, I
usually think about suspension points instead of their indices.

When printing a coroutine frame we now get

$1 = {__resume_fn = 0x555555555940 <test(int&)>, __destroy_fn = 0x555555555f10 <test(int&)>,
  __promise = {<No data fields>}, __suspension_point = __suspension_point::line_23_column_36,
  ...

instead of

$1 = {__resume_fn = 0x555555555940 <test(int&)>, __destroy_fn = 0x555555555f10 <test(int&)>,
  __promise = {<No data fields>}, __coro_index = 1 '001',
  ...
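
For illustration, the enum representation described above can be approximated in plain C++. This is only a sketch under stated assumptions: the real type would exist only in the debug info, the enumerator line_21_column_5 and both numeric values are made up, the one-byte underlying type is a guess, and the name drops the leading underscores because such identifiers are reserved in user code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the enum that the debug info would describe.
// Each enumerator names the source location of one suspension point; its
// numeric value is the index stored in the coroutine frame.
enum class suspension_point : std::uint8_t {
    line_21_column_5 = 0,   // made-up location, for illustration only
    line_23_column_36 = 1,  // location taken from the example output above
};

// Roughly what a debugger does when it prints an enum-typed field:
// translate the stored integer back into the enumerator's meaning.
std::string describe(suspension_point p) {
    switch (p) {
    case suspension_point::line_21_column_5:
        return "line 21, column 5";
    case suspension_point::line_23_column_36:
        return "line 23, column 36";
    }
    return "unknown";
}
```

With such a type, the raw index stored in the frame is enough for the debugger to show a source location without any extra scripting.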

Event Timeline

avogelsgesang created this revision. Aug 19 2022, 8:36 AM
Herald added a project: Restricted Project. Aug 19 2022, 8:36 AM
avogelsgesang requested review of this revision. Aug 19 2022, 8:36 AM
ChuanqiXu added a comment. Edited Aug 19 2022, 8:53 AM

I haven't looked into the details.

It looks like we need to store some debug information as strings for each coroutine. One concern is that we might generate too much debug information and increase the binary size too much.

I had thought about adding such facilities on the debugger side, e.g., we could generate a map from the index to the line number and store such pairs. Then we could read these values in debugging scripts.
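
One possible reading of this suggestion, sketched as ordinary C++ rather than as actual debug info or a debugger script: the compiler would emit a small table of (index, line, column) entries per coroutine, and tooling would look up the index found in the frame. The SuspensionLocation type, the table contents, and the lookup function are all hypothetical.

```cpp
#include <array>
#include <optional>

// Hypothetical record: one entry per suspension point of one coroutine.
struct SuspensionLocation {
    unsigned index;   // value stored in the frame's __coro_index field
    unsigned line;    // source line of the suspension point
    unsigned column;  // source column of the suspension point
};

// Made-up table for a made-up coroutine with two suspension points.
constexpr std::array<SuspensionLocation, 2> kTable{{
    {0, 21, 5},
    {1, 23, 36},
}};

// Map a raw index back to its source location, if the table knows it.
std::optional<SuspensionLocation> lookup(unsigned index) {
    for (const auto& entry : kTable) {
        if (entry.index == index) {
            return entry;
        }
    }
    return std::nullopt;
}
```

Compared with encoding locations in enumerator names, a table like this stores numbers instead of strings, which is where the binary-size concern comes in.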


In the end, I didn't move forward with it, since I found it might not be that useful or worthwhile in my cases. For example, consider a coroutine frame with a continuation field, which records the continuation of that coroutine. In the example below, the continuation of a bar coroutine may refer to coro, and there is generally only one bar() called in coro(), so we could get the suspended location. (This is not precise, but it is workable in our situations.) I am not sure whether your coroutine type uses a continuation.

C++
Coro coro() {
    co_await bar();
}
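
The continuation idea can be sketched without real coroutines. In the sketch below, Frame is a hypothetical, heavily simplified stand-in for a coroutine frame (real frames are laid out by the compiler, and the continuation lives in the promise of the coroutine type), and async_stack is a made-up helper that follows the continuation links much like a debugging script would.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified frame: only the fields needed for the walk.
struct Frame {
    std::string function;       // name of the coroutine this frame belongs to
    const Frame* continuation;  // frame to resume when this one finishes, or nullptr
};

// Follow the continuation links from the innermost suspended coroutine
// outwards, collecting the coroutine names along the way.
std::vector<std::string> async_stack(const Frame* innermost) {
    std::vector<std::string> names;
    for (const Frame* f = innermost; f != nullptr; f = f->continuation) {
        names.push_back(f->function);
    }
    return names;
}
```

For the example above, a frame for bar whose continuation points at the frame for coro yields the stack bar, coro. Because coro() contains a single co_await bar(), that is enough to tell where coro is suspended.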

BTW, for scripts that get the asynchronous stack, I have https://github.com/alibaba/async_simple/blob/main/dbg/LazyStack.py. It is written for my coroutine type, but I guess it is easy to adapt to other coroutine types. Hope this helps.

And I think it may be better to discuss such topics in https://discourse.llvm.org so that more people can see them.

> I think it may be better to discuss such topics in https://discourse.llvm.org so that more people can see them.

Good point! Posted this for discussion in https://discourse.llvm.org/t/rfc-debug-info-for-coroutine-suspension-locations/64721.
I also mentioned your concern about debug info size there.

> e.g., we can generate a map from the index to the line number and store such pairs. Then we could get these values in debugging scripts.

I didn't completely understand this proposed alternative and hence didn't include it in the Discourse thread. If you could elaborate on this in the Discourse, that would be appreciated!

> and there is generally only one bar() called in coro(), so we could get the suspended location.

Yes, this works for a large number of simple coroutine usages. However, I tend to have multiple calls to boost::asio::async_read in the same coroutine, and then this trick no longer works.

Also, while this technique works nicely for people with a deep understanding of how coroutines work, I fear it requires too much technical knowledge for widespread adoption. I would prefer a "my debugger still works reasonably well, even if I use coroutines" kind of user experience.

> for the scripts to get the asynchronous stack, I have https://github.com/alibaba/async_simple/blob/main/dbg/LazyStack.py

Thanks for sharing! This is indeed very useful.

avogelsgesang edited the summary of this revision. Aug 19 2022, 10:14 AM
avogelsgesang abandoned this revision. Aug 25 2022, 3:27 PM

I will likely pursue a different approach here