This is an archive of the discontinued LLVM Phabricator instance.

[Debug] [Coroutine] Adjust the scope and name for coroutine frame
ClosedPublic

Authored by ChuanqiXu on Jun 13 2022, 2:46 AM.

Details

Summary

Previously the scope of debug type of __coro_frame is limited in the current function. It looked good at the first sight. But it prevent us to print the type in splitted functions and other functions. Also the debug type is different for different coroutine functions. So it makes sense to rename the debug type to make it related to the function name.

After the patch, we could print the layout of any coroutine frame once we know the address. We could find the method in https://reviews.llvm.org/D127626.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Jun 13 2022, 2:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 13 2022, 2:46 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
ChuanqiXu requested review of this revision.Jun 13 2022, 2:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 13 2022, 2:46 AM
ChuanqiXu retitled this revision from [Debug] [Coroutines] Enlarge the scope of the debug type of __coro_frame to [Debug] [Coroutine] Adjust the scope and name for coroutine frame.Jun 13 2022, 2:56 AM
ChuanqiXu edited the summary of this revision. (Show Details)
ChuanqiXu set the repository for this revision to rG LLVM Github Monorepo.

I believe there's particular syntax that's supported by demanglers for suffixes - (specifically, at least for itanium, using "." to separate the suffix from the symbol name will support demanglers doing good things with the name when demangling) - could you check whether that applies here, if we have any tools/utility functions that handle this sort of suffix additions in a portable/tool-compatible way, etc, and do that?

Address comments.

I believe there's particular syntax that's supported by demanglers for suffixes - (specifically, at least for itanium, using "." to separate the suffix from the symbol name will support demanglers doing good things with the name when demangling) - could you check whether that applies here, if we have any tools/utility functions that handle this sort of suffix additions in a portable/tool-compatible way, etc, and do that?

For c++filt, the result for _ZL9coro_taski.coro_frame_ty is 'coro_task(int) [clone .coro_frame_ty]' and for llvm-cxxfilt, the corresponding result is coro_task(int) (.coro_frame_ty). I feel things are better now.

I believe there's particular syntax that's supported by demanglers for suffixes - (specifically, at least for itanium, using "." to separate the suffix from the symbol name will support demanglers doing good things with the name when demangling) - could you check whether that applies here, if we have any tools/utility functions that handle this sort of suffix additions in a portable/tool-compatible way, etc, and do that?

For c++filt, the result for _ZL9coro_taski.coro_frame_ty is 'coro_task(int) [clone .coro_frame_ty]' and for llvm-cxxfilt, the corresponding result is coro_task(int) (.coro_frame_ty). I feel things are better now.

Oh, this one might be problematic. Now the debugger complains for the following command:

  (gdb) # 0x418eb0 is the address of a coroutine frame
  (gdb )p (_ZL9coro_taski.coro_frame_ty)*0x418eb0
Attempt to extract a component of a value that is not a structure.

It looks like the debugger treat _ZL9coro_taski.coro_frame_ty as a function name to me.

I believe there's particular syntax that's supported by demanglers for suffixes - (specifically, at least for itanium, using "." to separate the suffix from the symbol name will support demanglers doing good things with the name when demangling) - could you check whether that applies here, if we have any tools/utility functions that handle this sort of suffix additions in a portable/tool-compatible way, etc, and do that?

For c++filt, the result for _ZL9coro_taski.coro_frame_ty is 'coro_task(int) [clone .coro_frame_ty]' and for llvm-cxxfilt, the corresponding result is coro_task(int) (.coro_frame_ty). I feel things are better now.

Oh, this one might be problematic. Now the debugger complains for the following command:

  (gdb) # 0x418eb0 is the address of a coroutine frame
  (gdb )p (_ZL9coro_taski.coro_frame_ty)*0x418eb0
Attempt to extract a component of a value that is not a structure.

It looks like the debugger treat _ZL9coro_taski.coro_frame_ty as a function name to me.

oh, _ZL9coro_taski is a function name - but because you were appending "__coro_frame_ty" to the end of it you were breaking the mangling and causing the consumer to no longer recognize it as a function?

hmm, I guess because the mangling includes parameter types it interprets it as a function. Fair enough I suppose.

Yeah, maybe there's some other separator that could be used that'd at least make it easier for a human to know where the mangled name ends and the suffix begins that doesn't confuse a debugger - that way the human could always pass the mangled portion to a demangler (c++filt, etc) to get the original function name? (or maybe using a prefix instead of a suffix, and the "_Z" would stand out enough to make it easy for a human to copy/paste "_Z" to the end of the name and demangle that? Not sure.

I believe there's particular syntax that's supported by demanglers for suffixes - (specifically, at least for itanium, using "." to separate the suffix from the symbol name will support demanglers doing good things with the name when demangling) - could you check whether that applies here, if we have any tools/utility functions that handle this sort of suffix additions in a portable/tool-compatible way, etc, and do that?

For c++filt, the result for _ZL9coro_taski.coro_frame_ty is 'coro_task(int) [clone .coro_frame_ty]' and for llvm-cxxfilt, the corresponding result is coro_task(int) (.coro_frame_ty). I feel things are better now.

Oh, this one might be problematic. Now the debugger complains for the following command:

  (gdb) # 0x418eb0 is the address of a coroutine frame
  (gdb )p (_ZL9coro_taski.coro_frame_ty)*0x418eb0
Attempt to extract a component of a value that is not a structure.

It looks like the debugger treat _ZL9coro_taski.coro_frame_ty as a function name to me.

oh, _ZL9coro_taski is a function name - but because you were appending "__coro_frame_ty" to the end of it you were breaking the mangling and causing the consumer to no longer recognize it as a function?

hmm, I guess because the mangling includes parameter types it interprets it as a function. Fair enough I suppose.

Yeah, maybe there's some other separator that could be used that'd at least make it easier for a human to know where the mangled name ends and the suffix begins that doesn't confuse a debugger - that way the human could always pass the mangled portion to a demangler (c++filt, etc) to get the original function name?

Do you mean something like this: _ZN14_ZL9coro_taski13coro_frame_tyE? The detangled result would be _ZL9coro_taski::coro_frame_ty. So the human could pass the mangled portion _ZL9coro_taski to the demangler.

But this one is not satisfying to me. My original idea is that users could get the coroutine type name easily by concatenating the linkage name and the suffix __coro_frame_ty. But in the above fashion, things goes complicated. They need to count the size of the linkage name, concatenating "13coro_frame_ty" and put the result in _ZN and E. I feel like it is not user friendly enough.

For the demangle part, I think the user could erase the suffix "__coro_frame_ty" easily. For tooling side, I am willing to see if I can implement it in llvm/Demangler.

(or maybe using a prefix instead of a suffix, and the "_Z" would stand out enough to make it easy for a human to copy/paste "_Z" to the end of the name and demangle that?

I feel like this is not friendly enough. It requires the user to know "_Z" stands for the start of a mangled name. But I think most of the programmer lack the knowledge.

I believe there's particular syntax that's supported by demanglers for suffixes - (specifically, at least for itanium, using "." to separate the suffix from the symbol name will support demanglers doing good things with the name when demangling) - could you check whether that applies here, if we have any tools/utility functions that handle this sort of suffix additions in a portable/tool-compatible way, etc, and do that?

For c++filt, the result for _ZL9coro_taski.coro_frame_ty is 'coro_task(int) [clone .coro_frame_ty]' and for llvm-cxxfilt, the corresponding result is coro_task(int) (.coro_frame_ty). I feel things are better now.

Oh, this one might be problematic. Now the debugger complains for the following command:

  (gdb) # 0x418eb0 is the address of a coroutine frame
  (gdb )p (_ZL9coro_taski.coro_frame_ty)*0x418eb0
Attempt to extract a component of a value that is not a structure.

It looks like the debugger treat _ZL9coro_taski.coro_frame_ty as a function name to me.

oh, _ZL9coro_taski is a function name - but because you were appending "__coro_frame_ty" to the end of it you were breaking the mangling and causing the consumer to no longer recognize it as a function?

hmm, I guess because the mangling includes parameter types it interprets it as a function. Fair enough I suppose.

Yeah, maybe there's some other separator that could be used that'd at least make it easier for a human to know where the mangled name ends and the suffix begins that doesn't confuse a debugger - that way the human could always pass the mangled portion to a demangler (c++filt, etc) to get the original function name?

Do you mean something like this: _ZN14_ZL9coro_taski13coro_frame_tyE? The detangled result would be _ZL9coro_taski::coro_frame_ty. So the human could pass the mangled portion _ZL9coro_taski to the demangler.

No, sorry, I didn't mean to adjust the mangled name, or nest it inside another - I mean some character that's not valid in a mangled name (like '.', but clearly not '.' itself, since that has other issues as you've noted) to make it clear which portion is the mangled name and which part is the suffix.

But this one is not satisfying to me. My original idea is that users could get the coroutine type name easily by concatenating the linkage name and the suffix __coro_frame_ty. But in the above fashion, things goes complicated. They need to count the size of the linkage name, concatenating "13coro_frame_ty" and put the result in _ZN and E. I feel like it is not user friendly enough.

For the demangle part, I think the user could erase the suffix "__coro_frame_ty" easily. For tooling side, I am willing to see if I can implement it in llvm/Demangler.

I'm not sure it'd be appropriate to add to the demangler as-is, since it's not part of the mangling spec. Maybe it's worth a discussion with the itanium mangling folks. https://github.com/itanium-cxx-abi

(or maybe using a prefix instead of a suffix, and the "_Z" would stand out enough to make it easy for a human to copy/paste "_Z" to the end of the name and demangle that?

I feel like this is not friendly enough. It requires the user to know "_Z" stands for the start of a mangled name. But I think most of the programmer lack the knowledge.

Still seems marginally more likely to be able to be separated by a user and demangled than as a suffix, where figuring out the end of the mangled name and the suffix is pretty non-obvious.

I'm not sure it'd be appropriate to add to the demangler as-is, since it's not part of the mangling spec. Maybe it's worth a discussion with the itanium mangling folks. https://github.com/itanium-cxx-abi

I've posted an issue at https://github.com/itanium-cxx-abi/cxx-abi/issues/142. Maybe we could discuss the issue there?

I'm not sure it'd be appropriate to add to the demangler as-is, since it's not part of the mangling spec. Maybe it's worth a discussion with the itanium mangling folks. https://github.com/itanium-cxx-abi

I've posted an issue at https://github.com/itanium-cxx-abi/cxx-abi/issues/142. Maybe we could discuss the issue there?

Sure

avogelsgesang added a subscriber: avogelsgesang.EditedJun 17 2022, 2:00 AM

Can we somehow achieve that for the source code

std::lazy<int> my_coro_task_function(int x) {
     while (--x) {
          co_await do_something();
     }
     co_return;
}

I can use

(gdb) # 0x418eb0 is the address of a coroutine frame
(gdb )p (my_coro_task_function::__coro_frame_ty)*0x418eb0

in the debugger?

As a user of clang/gdb, I do not want to deal with linkage names at all.
I would prefer if I could use the non-mangled function name and simply append ::__coro_frame_ty to it.
In my mental model, the coroutine frame type is a coroutine-function-local type. As such, writing it as a type nested inside the function scope feels pretty natural to me...

Can we somehow achieve that for the source code

std::lazy<int> my_coro_task_function(int x) {
     while (--x) {
          co_await do_something();
     }
     co_return;
}

I can use

(gdb) # 0x418eb0 is the address of a coroutine frame
(gdb )p (my_coro_task_function::__coro_frame_ty)*0x418eb0

in the debugger?

As a user of clang/gdb, I do not want to deal with linkage names at all.
I would prefer if I could use the non-mangled function name and simply append ::__coro_frame_ty to it.
In my mental model, the coroutine frame type is a coroutine-function-local type. As such, writing it as a type nested inside the function scope feels pretty natural to me...

We must not be able to achieve it due to the overloading problem. Long story short, the syntax above is ambiguous if there is another coroutine function with signature my_coro_task_function(double x).

From our use experience, your concern doesn't matter really. This is how we debugging coroutines internally:
(1) Write a gdb scripts, let's call it debug-coro.py. And it implemented a gdb command named: print-coro-frame
(2) In gdb, when we want to debug a coroutine, we could run:

(gdb) source debug-coro.py

(3) The it is simple to print the coroutine frame by:

(gdb) #0x418eb0 is the address of a coroutine frame
(gdb) print-coro-frame 0x418eb0

Then we could get the layout of that coroutine. So the usual user wouldn't see linkage name really. The reason why I didn't post the gdb scripts in D127626 is that I feel it might be bad to post gdb script in Clang community. (I feel lldb is competitor with gdb).

palves added a subscriber: palves.Jun 17 2022, 8:08 AM

Oh, this one might be problematic. Now the debugger complains for the following command:

  (gdb) # 0x418eb0 is the address of a coroutine frame
  (gdb )p (_ZL9coro_taski.coro_frame_ty)*0x418eb0
Attempt to extract a component of a value that is not a structure.

It looks like the debugger treat _ZL9coro_taski.coro_frame_ty as a function name to me.

oh, _ZL9coro_taski is a function name - but because you were appending "__coro_frame_ty" to the end of it you were breaking the mangling and causing the consumer to no longer recognize it as a function?

This is no different from any other vendor extension in the mangled name (those separated by "."), like clones. GDB is seeing the dot, and thinking
that you're trying to access a struct field called "coro_frame_ty", and then complains that _ZL9coro_task isn't a structure. You can use single quotes
around the linkage name to disambiguate. For example, using the program gdb's gdb.cp/cold-clone.exp testcase, we see the same:

(gdb) p _ZL3barv.cold
Attempt to extract a component of a value that is not a structure.
(gdb) p '_ZL3barv.cold'
$1 = {<text variable, no debug info>} 0x555555555065 <bar() [clone .cold]>
(gdb)

I think "." is right separator for appending vendor extensions to the linkage name. It's what the Itanium spec specifies. Suffixes other than "." will lead to more complicated code, I think. E.g., for HIP externalized symbols, LLVM already uses "." . Dealing with "." in that case in gdb/libiberty was easy. Dealing with a suffix like __foo is harder (dot separator is part of the grammar), and it is potentially doing to slow down demangling.

Can we somehow achieve that for the source code

std::lazy<int> my_coro_task_function(int x) {
     while (--x) {
          co_await do_something();
     }
     co_return;
}

I can use

(gdb) # 0x418eb0 is the address of a coroutine frame
(gdb )p (my_coro_task_function::__coro_frame_ty)*0x418eb0

in the debugger?

As a user of clang/gdb, I do not want to deal with linkage names at all.
I would prefer if I could use the non-mangled function name and simply append ::__coro_frame_ty to it.
In my mental model, the coroutine frame type is a coroutine-function-local type. As such, writing it as a type nested inside the function scope feels pretty natural to me...

We must not be able to achieve it due to the overloading problem. Long story short, the syntax above is ambiguous if there is another coroutine function with signature my_coro_task_function(double x).

I don't see where the problem is. This seems very much the same as printing a local static variable, when you have overloads. E.g., with:

int func (int)
{
  static int local = 1;
  return local;
}

int func (long)
{
  static int local = 2;
  return local;
}

int main () {}

You can disambiguate (again) with single quotes. Vis:

(gdb) p 'func(int)'::local
$1 = 1
(gdb) p 'func(long)'::local
$2 = 2
(gdb)

John gives the suggestion for the DIName of coroutine frame in C++ version in https://github.com/itanium-cxx-abi/cxx-abi/issues/142::

<local-name> ::= Z <function encoding> E <entity name> [<discriminator>]

So that the frame name in the example would be _ZZL9coro_taskiE13coro_frame_ty. And both C++filt and llvm-cxxfilt would detangle it as we want as coro_task(int)::coro_frame_ty. I guess this is what @dblaikie want initially. Then John throws another problem that the solution above is not suit for C-style names. Although we're talking about C++ coroutine only, we could use extern "C" in C++ to prevent mangling actually. (It should be rare actually. Since the use of C++ coroutines generally require the involvement of coroutine types like 'lazy' or 'generator'. But such coroutine types are not C compatible generally.)

Then @avogelsgesang and @palves throw another requirement to avoid to see mangled names in gdb. I feel like this might not be necessary at first since the users wouldn't see the linkage name from our use experience. Although @palves throws the solution for locals, but I found it doesn't work local scope types:

C++
int main() {
  struct S {
    int a;
    bool b;
  };
  S s;
}
(gdb) p main::s
$2 = {a = -8000, b = 255}
(gdb) p main::S
A syntax error in expression, near `'.
(gdb) p 'main'::S
A syntax error in expression, near `'.
(gdb) p 'main()'::S
A syntax error in expression, near `'.

I am not sure if it is unable to make.


So there are 2 problems we saw:
(1) How do we handle unmangled names?
(2) Do we need to enable the user to print the type without knowing mangled names?

For the first problem, I prefer to use '.' as the separator since @palves gives reason why I failed earlier. And for the second problem, my answer is no due to above reasons. But it would be good if we have solutions.

Looks like C locals

John gives the suggestion for the DIName of coroutine frame in C++ version in https://github.com/itanium-cxx-abi/cxx-abi/issues/142::

<local-name> ::= Z <function encoding> E <entity name> [<discriminator>]

So that the frame name in the example would be _ZZL9coro_taskiE13coro_frame_ty. And both C++filt and llvm-cxxfilt would detangle it as we want as coro_task(int)::coro_frame_ty. I guess this is what @dblaikie want initially. Then John throws another problem that the solution above is not suit for C-style names. Although we're talking about C++ coroutine only, we could use extern "C" in C++ to prevent mangling actually. (It should be rare actually. Since the use of C++ coroutines generally require the involvement of coroutine types like 'lazy' or 'generator'. But such coroutine types are not C compatible generally.)

Then @avogelsgesang and @palves throw another requirement to avoid to see mangled names in gdb. I feel like this might not be necessary at first since the users wouldn't see the linkage name from our use experience. Although @palves throws the solution for locals, but I found it doesn't work local scope types:

C++
int main() {
  struct S {
    int a;
    bool b;
  };
  S s;
}
(gdb) p main::s
$2 = {a = -8000, b = 255}
(gdb) p main::S
A syntax error in expression, near `'.
(gdb) p 'main'::S
A syntax error in expression, near `'.
(gdb) p 'main()'::S
A syntax error in expression, near `'.

I am not sure if it is unable to make.


So there are 2 problems we saw:
(1) How do we handle unmangled names?
(2) Do we need to enable the user to print the type without knowing mangled names?

For the first problem, I prefer to use '.' as the separator since @palves gives reason why I failed earlier. And for the second problem, my answer is no due to above reasons. But it would be good if we have solutions.

Looks like at least clang (it's not part of the ABI so clang and GCC do things differently) uses "func.var" to encode local static variables in C, so if "func.var" has some precedent for C and works OK for C++, I think that's probably a reasonably generic solution, that doesn't depend on detecting mangled names and remangling them (which is awkward when there's different manglings, etc).

Looks like C locals

John gives the suggestion for the DIName of coroutine frame in C++ version in https://github.com/itanium-cxx-abi/cxx-abi/issues/142::

<local-name> ::= Z <function encoding> E <entity name> [<discriminator>]

So that the frame name in the example would be _ZZL9coro_taskiE13coro_frame_ty. And both C++filt and llvm-cxxfilt would detangle it as we want as coro_task(int)::coro_frame_ty. I guess this is what @dblaikie want initially. Then John throws another problem that the solution above is not suit for C-style names. Although we're talking about C++ coroutine only, we could use extern "C" in C++ to prevent mangling actually. (It should be rare actually. Since the use of C++ coroutines generally require the involvement of coroutine types like 'lazy' or 'generator'. But such coroutine types are not C compatible generally.)

Then @avogelsgesang and @palves throw another requirement to avoid to see mangled names in gdb. I feel like this might not be necessary at first since the users wouldn't see the linkage name from our use experience. Although @palves throws the solution for locals, but I found it doesn't work local scope types:

C++
int main() {
  struct S {
    int a;
    bool b;
  };
  S s;
}
(gdb) p main::s
$2 = {a = -8000, b = 255}
(gdb) p main::S
A syntax error in expression, near `'.
(gdb) p 'main'::S
A syntax error in expression, near `'.
(gdb) p 'main()'::S
A syntax error in expression, near `'.

I am not sure if it is unable to make.


So there are 2 problems we saw:
(1) How do we handle unmangled names?
(2) Do we need to enable the user to print the type without knowing mangled names?

For the first problem, I prefer to use '.' as the separator since @palves gives reason why I failed earlier. And for the second problem, my answer is no due to above reasons. But it would be good if we have solutions.

Looks like at least clang (it's not part of the ABI so clang and GCC do things differently) uses "func.var" to encode local static variables in C, so if "func.var" has some precedent for C and works OK for C++, I think that's probably a reasonably generic solution, that doesn't depend on detecting mangled names and remangling them (which is awkward when there's different manglings, etc).

Yeah, this is implemented in current revision. I agree with it at least it is simpler from the perspective of both implementers and users. Would you like to give this a LGTM?

dblaikie accepted this revision.Jul 6 2022, 10:13 AM

Sounds good, thanks for working through/discussing the alternatives.

This revision is now accepted and ready to land.Jul 6 2022, 10:13 AM
This revision was landed with ongoing or failed builds.Jul 6 2022, 7:37 PM
This revision was automatically updated to reflect the committed changes.