This is an archive of the discontinued LLVM Phabricator instance.

[docs] Add document "Debugging C++ Coroutines"
ClosedPublic

Authored by ChuanqiXu on Jun 13 2022, 2:54 AM.

Details

Summary

Previously, in D99179, I tried to construct debug information for coroutine frames in the middle end to improve the debuggability of coroutines. However, I forgot to add release notes to inform people, and documentation to help them use it. My bad. @probinson pointed this out in https://github.com/llvm/llvm-project/issues/55916.

So I am adding the user documentation now. Since I am not a native speaker, any suggestions will be really appreciated.

I am not adding a release note, since the feature has already shipped in released versions.

I added 'clang-language-wg' since I feel the folks there might be interested. Although this is not part of the standard, it matters for the user experience.

Diff Detail

Event Timeline

ChuanqiXu created this revision. Jun 13 2022, 2:54 AM
Herald added a project: Restricted Project. Jun 13 2022, 2:54 AM
Herald added a subscriber: arphaman.
ChuanqiXu edited the summary of this revision. Jun 13 2022, 2:59 AM
ChuanqiXu added reviewers: probinson, dblaikie, jmorse, aprantl, aaron.ballman, erichkeane, Restricted Project.

The organization and content of the document are quite good, and I learned quite a lot, so thank you! I found the prose itself a bit difficult to read, so I did my best to reword each section as I could. Feel free to take/modify/review whatever you'd like of it.

Thanks!

clang/docs/DebuggingCoroutines.rst
11

This is my attempt at a rewrite of this intro section, but feel free to keep/change what you'd like. I'm attempting to improve readability despite only minor knowledge of the feature/debug-ability, so please make sure to fact check it! (This likely applies to everything I suggest here). I also might be misusing the 'tick' marks (these things) at times, so you might wish to clean those up too.

For performance and other architectural reasons, the C++ Coroutines feature in the Clang compiler is implemented in two parts of the compiler. Semantic analysis is performed in Clang, and Coroutine construction and optimization take place in the LLVM middle-end.

However, this design forces us to generate insufficient debugging information.  Typically, the compiler generates debug information in the Clang frontend, as debug information is highly language specific.  However, this is not possible for Coroutine frames because the frames are constructed in the LLVM middle-end. 

To mitigate this problem, the LLVM middle end attempts to generate some debug information, which is unfortunately incomplete, since much of the language specific information is missing in the middle end.

This document describes how to use this debug information to better debug Coroutines.
28
Due to the recent nature of C++20 Coroutines, the terminology used to describe the concepts of Coroutines is not settled.  This section defines a common, understandable terminology to be used consistently throughout this document.
34
A `coroutine function` is any function that contains any of the Coroutine Keywords `co_await`, `co_yield`, or `co_return`.  A `coroutine type` is a possible return type of one of these `coroutine functions`.  `Task` and `Generator` are common examples of coroutine types.
42
By technical definition, a `coroutine` is a suspendable function. However, this document typically uses `coroutine` to refer to an individual instance.  For example:
erichkeane added inline comments. Jun 13 2022, 10:51 AM
clang/docs/DebuggingCoroutines.rst
53
In practice, we typically say "`Coros` contains 3 coroutines" in the above example, though this is not strictly correct.  More technically, this should say "`Coros` contains 3 coroutine instances" or "`Coros` contains 3 coroutine objects."

In this document, we follow common practice of using `coroutine` to refer to an individual `coroutine instance`, since the terms `coroutine instance` and `coroutine object` aren't sufficiently defined in this case.
63

The C++ Standard uses coroutine state to describe the allocated storage. In the compiler, we use coroutine frame to describe the generated data structure that contains the necessary coroutine information.

72
77
78
79
83
In the debugger, the coroutine function's name is obtainable from the address of the coroutine frame.
90
Every coroutine has a `promise_type`, which is shared by common coroutine types. When stopped at a breakpoint inside a coroutine, printing the `promise_type` in a debugger can be done by:

Note: can we better specify/more precisely state 'common coroutine types' here?

96

I suggest using the 'full form' of the GDB/LLDB names for these.

98

It is also possible to print the promise_type of a coroutine from the address of the coroutine frame. For example, if the address of a coroutine is 0x416eb0, and the type of the promise_type is task::promise_type, printing the promise_type can be done by:

105

Note, I'll stop suggesting this now, so please change throughout the document if you agree here.

107
This is possible because the `promise_type` is guaranteed by the ABI to be at a 16 byte offset from the coroutine frame.
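The offset can be illustrated with a mock layout. This is only a sketch under an assumption: that on a 64-bit target the frame starts with two function pointers (resume and destroy) followed by the promise, which is where the 16 bytes come from. `MockPromise`, `MockFrame`, and `promise_from_frame` are hypothetical names; the real frame type is generated by the compiler.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical promise type standing in for task::promise_type.
struct MockPromise {
  int result;
};

// Mock of the beginning of a coroutine frame (assumed layout).
struct MockFrame {
  void (*resume)(void *);  // first pointer-sized slot
  void (*destroy)(void *); // second pointer-sized slot
  MockPromise promise;     // follows the two pointers
};

// What a debugger does conceptually: add two pointer sizes to the frame
// address and reinterpret the result as the promise type.
MockPromise *promise_from_frame(void *frame_address) {
  auto *bytes = static_cast<unsigned char *>(frame_address);
  return reinterpret_cast<MockPromise *>(bytes + 2 * sizeof(void *));
}
```

With 8-byte pointers, `2 * sizeof(void *)` is 16, matching the offset stated above. A promise with stricter alignment requirements could change the real layout, so treat the mock as illustrative only.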
113
LLVM generates the debug information for the coroutine frame in the LLVM middle end, which permits printing of the coroutine frame in the debugger.  Much like the `promise_type`, when stopped at a breakpoint inside a coroutine we can print the coroutine frame by:
121
Just as printing the `promise_type` is possible from the coroutine address, printing the details of the coroutine frame from an address is also possible:
136
The above is possible because:
(1) The name of the debug type of the coroutine frame is the `linkage_name`, plus the `__coroutine_frame_ty` suffix because each coroutine function shares the same coroutine type.
(2) The coroutine function name is accessible from the address of the coroutine frame.
143
148
The print examples below use the following definition:
211
In debug mode (`O0` + `g`), printing this coroutine will look like:
  • I don't really like this sentence, particularly the 2nd half? 'look like' feels crummy here... perhaps we can wordsmith a bit better?
218
In the above, the values of `v` and `a` are clearly expressed, as are the temporary values for `await_counter` (`class_await_counter_1` and `class_await_counter_2`) and `std::suspend_always` (`struct_std__suspend_always_0` and `struct_std__suspend_always_3`).  The index of the current suspension point of the coroutine is emitted as `__coro_index`.  In the above example, the `__coro_index` value of `1` corresponds to `class_await_counter_1`.
223
However, when optimizations are enabled, the printed result changes drastically:
229
Unused values are optimized out, as is the name of the local variable `a`. The only information available is the value of a 32-bit integer.  In this simple case, it is pretty clear that `__int_32_0` represents `a`. However, this becomes much less clear in more complex examples.
235
Another concern with optimization is that the value of a variable may not properly express the intended value in the source code.  For example:
254
When debugging step-by-step, the value of `__int_32_0` seemingly does not change, despite being frequently incremented, and instead is always `43`.  While this might be surprising, this is a result of the optimizer recognizing that it can eliminate most of the load/store operations.  The above code gets optimized to the equivalent of:
275
It should now be obvious why the value of `__int_32_0` remains unchanged throughout the function.  It is important to recognize that `__int_32_0` does not directly correspond to `a`, but is instead a variable generated to assist the compiler in code generation. The variables in an optimized coroutine frame should not be thought of as directly representing the variables in the C++ source, although they are related to them.
284
An important concept for debugging coroutines is to understand suspended points, which are where the coroutine is currently suspended and awaiting.

For simple cases like the above, inspecting the value of the `__coro_index` variable in the coroutine frame works well.

However, it is not quite so simple in really complex situations.  In these cases, it is necessary to use the coroutine libraries to insert the line number.

For example:
318
In this case, we use `std::source_location` to store the line number of the await inside the `promise_type`.  Since we can locate the coroutine function from the address of the coroutine, we can identify suspended points this way as well.

The downside here is that this comes at the price of additional runtime cost. This is consistent with the C++ philosophy of "Pay for what you use".
327
Another important requirement to debug a coroutine is to print the asynchronous stack to identify the asynchronous caller of the coroutine.  As many implementations of coroutine types store `std::coroutine_handle<> continuation` in the promise type, identifying the caller should be trivial.  The `continuation` is typically the awaiting coroutine for the current coroutine, that is, the asynchronous parent.

Since the `promise_type` is obtainable from the address of a coroutine and contains the corresponding continuation (which itself is a coroutine with a `promise_type`), it should be trivial to print the entire asynchronous stack.

This logic should be quite easily captured in a debugger script.
342

I don't have a good idea for this, but 'living coroutine' is likely not exactly what we mean here? Could there perhaps be a better term we could come up with? Perhaps List the running coroutines? Or perhaps active coroutines?

345
Another useful task when debugging coroutines is to enumerate the list of active coroutines, as is often done with threads.  While technically possible, this task is not recommended in production code as it is costly at runtime. One such solution is to store the list of currently running coroutines in a collection:
353
367
In the above code snippet, we save the address of every active coroutine in the `active_coroutines` `unordered_set`.  As before, once we know the address of the coroutine we can derive the function, `promise_type`, and other members of the frame.  Thus, we could print the list of active coroutines from that collection.

Please note that the above is expensive from a storage perspective, and requires some level of locking (not pictured) on the collection to prevent data races.
aprantl added inline comments. Jun 13 2022, 5:09 PM
clang/docs/DebuggingCoroutines.rst
13

coroutines

14

optimizing.
Could you spell-check this?

ChuanqiXu updated this revision to Diff 436702. Jun 14 2022, 1:56 AM

Address comments.

ChuanqiXu marked 34 inline comments as done. Jun 14 2022, 2:19 AM
ChuanqiXu added inline comments.
clang/docs/DebuggingCoroutines.rst
11

Many thanks for the detailed and valuable review! It is really helpful.

14

Sorry about that. I've checked the new revision with Grammarly. I hope that gets rid of these problems.

90

I've updated the new statement.

211

I replaced 'look like' with 'be'. Does this feel better?

218

The last statement is not quite right. To avoid further misunderstanding, I've tried to add more words.

235

I feel 'note' is more suitable than 'concern' here.

342

Yeah, I mean 'living' here. In my mind, if a coroutine has been created, it is living. The terms 'active' and 'running' might be ambiguous, since it is unclear whether a suspended coroutine counts as active or not.

BTW, the story here is: more than one developer told me that they want to observe all the coroutines just like threads ('info threads' in the debugger). So I made a similar demo. (It was more complicated, since we tried different methods to avoid data races, like ConcurrentHashSet, a thread-local set, etc.) But it wasn't put into production due to performance issues.

The reason I wrote this section is that I believe other people will want similar features. So I wrote a rough draft of the solution here. I believe this is enough to inspire developers. And if they want to list only running coroutines or only inactive coroutines, they now know how to implement it.

ChuanqiXu updated this revision to Diff 436712. Jun 14 2022, 2:23 AM
ChuanqiXu marked 4 inline comments as done.

Remove trailing spaces.

jryans added a subscriber: jryans. Jun 14 2022, 5:41 AM

Great work overall, this is very useful to have written up! 😄

clang/docs/DebuggingCoroutines.rst
83

Looking at the raw view (https://reviews.llvm.org/file/data/bxv4276x45yq7iul4suk/PHID-FILE-x3f2y5ttxw33aiuzhv3q/clang_docs_DebuggingCoroutines.rst), it seems like this line uses tab indentation while the one above it uses spaces, leading to potential misalignment.

Did another quick read through and made a handful of 'nit' level suggestions. Since much of it is my words, I'd love it if someone else could take a read through as a double-check.

Also, if you have the ability (I THINK this would work if you use Chrome? https://chrome.google.com/webstore/detail/rst-restructuredtext-view/jdplmdagmppjjcjfbbhhpbhilpogmkoe?hl=en-US), please double check that the formatting 'looks right'. This is a sizable document and I'd hate for silly RST file shenanigans to make it look silly.

clang/docs/DebuggingCoroutines.rst
96
97

Odd English colloquialism, but this is the more commonly used version.

106
108
109
211

Yep! LGTM!

234
235

Ok, reads alright to me.

aaron.ballman added inline comments. Jun 14 2022, 7:17 AM
clang/docs/DebuggingCoroutines.rst
2–4

RST requires that the number of underline characters matches what's being underlined, so this causes build errors.

36–37

Too many underlines here.

44–45

Way too many underlines here.

58

Are there one too many ` in this line?

74–75

Too few underlines here (I'll stop commenting -- please go through the file and fix all of the underlines).

erichkeane added inline comments. Jun 14 2022, 7:18 AM
clang/docs/DebuggingCoroutines.rst
58

Aaron is, of course, right here :)

avogelsgesang added inline comments. Jun 14 2022, 9:13 PM
clang/docs/DebuggingCoroutines.rst
113

would

print std::coroutine_handle<task::promise_type>::from_address(0x416eb0).promise()

also work? Or will the debugger choke on evaluating that expression?

Maybe that's easier to teach/remember, as it doesn't rely on the ABI, but only on the C++ standard.

ChuanqiXu marked 18 inline comments as done.

Address comments.

ChuanqiXu marked an inline comment as done. Jun 14 2022, 11:53 PM

Did another quick read through and made a handful of 'nit' level suggestions. Since much of it is my words, I'd love it if someone else could take a read through as a double-check.

Also, if you have the ability (I THINK this would work if you use Chrome? https://chrome.google.com/webstore/detail/rst-restructuredtext-view/jdplmdagmppjjcjfbbhhpbhilpogmkoe?hl=en-US), please double check that the formatting 'looks right'. This is a sizable document and I'd hate for silly RST file shenanigans to make it look silly.

The extension doesn't work for me, but I've tried building this document in https://overbits.herokuapp.com/rsteditor/. At least it builds, and I didn't find any significant formatting errors.

clang/docs/DebuggingCoroutines.rst
2–4

Thanks! This is my first time writing an RST file, so I don't know the rules well.

83

My bad. I've replaced all the tabs with spaces now.

113

Yeah, ABI independence is important, so I added another paragraph for it. But it doesn't work well when optimization is turned on, since both from_address(void*) and promise() are generally optimized out because they are so small. It works well under O0, at least.

ChuanqiXu retitled this revision from [docs] Add Debugging C++ Coroutines to [docs] Add document "Debugging C++ Coroutines". Jun 15 2022, 12:38 AM
erichkeane added inline comments. Jun 15 2022, 6:33 AM
clang/docs/DebuggingCoroutines.rst
118
124

I'd suggest something more like:

The functions from_address(void*) and promise() are often small enough to be removed during optimization, so this method may not be possible.

ChuanqiXu updated this revision to Diff 437426. Jun 15 2022, 7:51 PM
ChuanqiXu marked an inline comment as done.

Address comments.

ChuanqiXu marked 2 inline comments as done. Jun 15 2022, 7:51 PM
ChuanqiXu updated this revision to Diff 442760. Jul 6 2022, 7:59 PM

Rebasing.

Since all the dependent patches have landed, we can move forward on this patch. @erichkeane what do you think about the current status?

erichkeane accepted this revision. Jul 7 2022, 5:59 AM

I'm happy as it sits!

This revision is now accepted and ready to land. Jul 7 2022, 5:59 AM

I'm happy as it sits!

Many thanks for the review!

This revision was landed with ongoing or failed builds. Jul 7 2022, 8:31 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. Jul 7 2022, 8:31 PM
Herald added a subscriber: cfe-commits.