This is an archive of the discontinued LLVM Phabricator instance.

[RFC] [Coroutines] Enable printing coroutine frame in debugger if program is compiled with -g
ClosedPublic

Authored by ChuanqiXu on Mar 23 2021, 5:11 AM.

Details

Summary

Although the coroutine frame is constructed and maintained by the compiler and the programmer shouldn't care about the coroutine frame by the design of C++20 coroutine,
a lot of programmers told me that they want to see the layout of the coroutine frame strongly. Although C++ is designed as an abstract layer so that the programmers shouldn't care about the actual memory in bits, many experienced C++ programmers are familiar with assembler and debugger to see the memory layout in fact, After I was been told they want to see the coroutine frame about 3 times, I think it is an actual and desired demand.

However, the debug information is constructed in the front end and coroutine frame is constructed in the middle end. This is a natural and clear gap. So I could only try to construct the debug information in the middle end after coroutine frame constructed. I am not clear whether this violates the guide principles of LLVM.

Here is an example:

struct test {
int i;
int j;
double k;
};

struct coro {
  struct promise_type {
    struct test str2;
    promise_type() {
        str2.i = 1;
        str2.j = 2;
        str2.k = 3;
    }
    coro get_return_object() {
      auto handle = coroutine_handle<promise_type>::from_promise(*this);
      return coro(handle);
    }
    suspend_always initial_suspend() { return {}; }
    suspend_always final_suspend() noexcept { return {}; }
    void return_void() __attribute__((noinline)) {}
    void unhandled_exception() {}
  };

  coroutine_handle<promise_type> handle;
  coro(coroutine_handle<promise_type> handle)
      : handle(handle) {}
};

class Obj {
public:
  Obj(int a) : a(a) {}

    coro foo(struct test t, std::shared_ptr<struct test> SP) noexcept {
      int localStr = t.i;
      int j = t.i;
      struct test t1 = {1, ++t.i, t.k};
      printf("str");

      co_await suspend_always();
      j = t.j;
      t1.i++;
      t1.j++;
      localStr += t.i + t.j;
      a = 10;
      if (t.i > 1000) {
        foo(t, std::move(SP));
      } else {
        printf("%d, %d, %d\n", SP->i, t.i, t1.j); // 2, 1
      }

      co_await suspend_always();
      j = t.k;
      ++localStr;
      printf("%d,\n", localStr); // 3, 2

      t1.i++;
    }

private:
    int a;
};

After we added simple main function and compiled it with:

clang++ -g -O2 -fcoroutines-ts -mllvm -enhance-debug-with-coroutine ToSeeCoroFrame.cpp -o a.out

Then we can stop the program in gdb/lldb in the middle of the coroutine, we can get the coroutine frame layout by p __coro_frame:

$1 = {__resume_fn = 0x4013b0 <Obj::foo(test, std::shared_ptr<test>)>, __destroy_fn = 0x401530 <Obj::foo(test, std::shared_ptr<test>)>, __promise = {str2 = {i = 1, j = 2, k = 3}}, class_std__shared_ptr_0 = {class_std____shared_ptr = {
      struct_test_Ptr = 0x0, class_std____shared_count = {class_std___Sp_counted_base_Ptr = 0x0}}}, class_Obj_Ptr_1 = 0x7fffffffe250, int64_2 = 1095216660718, double_3 = 1, struct_test_Ptr_4 = 0x416ec0,
  class_std___Sp_counted_base_Ptr_5 = 0x416eb0, class_std___Sp_counted_base_Ptr_6 = 0x0, __coro_index = 0 '\000'}

One hard part is we need construct the name for variables since there isn't a map from llvm variables to DIVar. Then here is the strategy this patch uses:

  • The name __resume_fn , __destroy_fn and __coro_index are constructed by the patch.
  • Then the name __promise comes from the dbg.variable of corresponding dbg.declare of PromiseAlloca, which shows highest priority to construct the debug information for the member of coroutine frame.
  • Then if the member is struct, we would try to get the name of the llvm struct directly. Then replace ':' and '.' with '_' to make it printable for debugger.
  • If the member is a basic type like integer or double, we would try to emit the corresponding name.
  • Then if the member is a Pointer Type, we would add Ptr after corresponding pointee type.
  • Otherwise, we would name it with 'UnknownType'.

The ability to construct coroutine is controlled by an option so this patch won't affect other parts. Some parts of this patch is duplicated with D97673 and D96938. So we can't apply them all without conflicts.

It is a little unnatural to construct coroutine frame debug information. The implementation and design for the patch may have many problems. Any opinion is well come.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Mar 23 2021, 5:11 AM
ChuanqiXu requested review of this revision.Mar 23 2021, 5:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2021, 5:11 AM

From a layering perspective, this is not something we would usually do. One immediate problem is that LLVM should be mostly language agnostic. For example the Swift compiler also uses to CoroSplit pass, and you wouldn't want to inject C++ type information into Swift functions. Maybe it would be cleaner for Clang to register an LLVM pass to run after CoroSplit that inserts the C++-specific debug info?

From a layering perspective, this is not something we would usually do. One immediate problem is that LLVM should be mostly language agnostic. For example the Swift compiler also uses to CoroSplit pass, and you wouldn't want to inject C++ type information into Swift functions. Maybe it would be cleaner for Clang to register an LLVM pass to run after CoroSplit that inserts the C++-specific debug info?

Sorry for I forgot the cases for swift and milr. Your suggestion makes sense. I put buildFrameDebugInfo in CoroFrame.cpp since the FrameDataInfo is maintained in this file only and this information would loss if buildCoroutineFrame ends. I think we need to do a big refactoring to extract the analysis part of the coroutine as an Analysis pass. This would be beneficial only for this patch. And a workaround maybe:

- if (EnhanceDebugability)
-    buildFrameDebugInfo(F, Shape, FrameData);
+ if (EnhanceDebugability && Shape.ABI == coro::ABI::Switch)
+    buildFrameDebugInfo(F, Shape, FrameData);

Trying to agree on frame layout across multiple passes seems really brittle to me, since nearly any change to the function could trigger a change in the frame layout.

I assume that DWARF doesn't really have a concept of language-independent or mixed-language types. Maybe generating a C-language DWARF struct to describe the frame layout could be triggered by the presence of C-language debug info?

Trying to agree on frame layout across multiple passes seems really brittle to me, since nearly any change to the function could trigger a change in the frame layout.

Yeah. There are two assumptions:

  • Optimization passes may not transform a struct allocated on the heap.
  • If the coroutine frame is Coro-elided to be an alloca, then any optimization passes who transform the alloca is responsible for maintaining the debug information.

To my knowledge, I don't find a pass who would transform a struct allocated on the heap. Then if there would be one, I think this pass is responsible for maintaining the debug information too.

I assume that DWARF doesn't really have a concept of language-independent or mixed-language types. Maybe generating a C-language DWARF struct to describe the frame layout could be triggered by the presence of C-language debug info?

I guess @aprantl means the way I constructed the frame struct. This patch assumes there would be a promise in the coroutine frame. However, there is no promise in swift or mlir coroutine.

Maybe generating a C-language DWARF struct to describe the frame layout could be triggered by the presence of C-language debug info?

The code could check for the presence of one of the DW_LANG_C_plus_plus derivatives (cf. isCPlusPlus() in Dwarf.h in the containing DICompileUnit to check that this is indeed C++ and just not run in any other case.

Maybe generating a C-language DWARF struct to describe the frame layout could be triggered by the presence of C-language debug info?

The code could check for the presence of one of the DW_LANG_C_plus_plus derivatives (cf. isCPlusPlus() in Dwarf.h in the containing DICompileUnit to check that this is indeed C++ and just not run in any other case.

Do you think it is better to search for DW_LANG_C_plus_plus debug information than judging the ABI of Coroutine?

Do you think it is better to search for DW_LANG_C_plus_plus debug information than judging the ABI of Coroutine?

I think that would be better. To give a hypothetical example, flang could decide to emit Fortran coroutines using the C++ ABI for interoperability with C++ code, but then you wouldn't want to inject C++ types into the Fortran debug info.

Only generates debug information for coroutine frame by searching DW_LANG_C_plus_plusderivatives.

Only generating debug information for coroutine frame by searching DW_LANG_C_plus_plus derivatives.

dblaikie added inline comments.Apr 2 2021, 12:26 PM
llvm/test/Transforms/Coroutines/coro-debug-coro-frame.ll
4–16

This testing looks a bit more scant than I'd expect - to take a random example, the production code change in this patch includes what appears to be creating a type called __resume_fn - but the existence, spelling, and use of that type doesn't seem to be tested here. (same goes for lots of other stuff in the patch - looks like it's creating a whole bunch of debug metadata that doesn't seem to be tested here)

ChuanqiXu updated this revision to Diff 335478.Apr 6 2021, 4:26 AM
ChuanqiXu added a reviewer: dblaikie.

Enhance Test Cases as the comment address.

Testing looks a bit better - could probably be more comprehensive.

General architecture of the change: Yeah, I dunno about creating new structures and things (& corresponding debug info) during optimizations, bit unfortunate/not something we've done before - but I don't know of much better, given my cursory understanding of the current architecture & lowering of the coroutines this seems understandable if unfortunate.

Happy for @aprantl to continue with review for the broader design, but I'll keep reading & chime in if I have further thoughts.

Perhaps if restructuring the passes is not practical - maybe the pass could be parameterized by the frontend that's creating the pass pipeline? insert some kind of callback... hmm, nope, that doesn't work because you might -disable-llvm-optzns to get LLVM IR out from clang then use llc (or disable frontend optimizations in an LTO build, etc) - yeah, not sure.

llvm/test/Transforms/Coroutines/coro-debug-coro-frame.ll
17

Probably worth checking the scope and line - since those are specified by the code being added in this patch (eg: if the patch did have a bug - passing the wrong line or scope, the test would still pass) - similarly for other fields in other records.

(fundamental design discussion/decision might be worth a broader design discussion on llvm-dev with a bunch of debug info folks - not sure)

Testing looks a bit better - could probably be more comprehensive.

General architecture of the change: Yeah, I dunno about creating new structures and things (& corresponding debug info) during optimizations, bit unfortunate/not something we've done before - but I don't know of much better, given my cursory understanding of the current architecture & lowering of the coroutines this seems understandable if unfortunate.

Happy for @aprantl to continue with review for the broader design, but I'll keep reading & chime in if I have further thoughts.

Perhaps if restructuring the passes is not practical - maybe the pass could be parameterized by the frontend that's creating the pass pipeline? insert some kind of callback... hmm, nope, that doesn't work because you might -disable-llvm-optzns to get LLVM IR out from clang then use llc (or disable frontend optimizations in an LTO build, etc) - yeah, not sure.

Thanks for reviewing! I would try to enhance the test case.

maybe the pass could be parameterized by the frontend that's creating the pass pipeline? insert some kind of callback... hmm, nope, that doesn't work because you might -disable-llvm-optzns to get LLVM IR out from clang then use llc (or disable frontend optimizations in an LTO build, etc) - yeah, not sure.

From my perspective, it looks like the callback to frontend wouldn't work since clang frontend is worked on AST instead of LLVM IR. If we want a callback to clang to handle this, we need to preserve the frontend information like AST to the LLVM IR. This seems to be a hardest method.

(fundamental design discussion/decision might be worth a broader design discussion on llvm-dev with a bunch of debug info folks - not sure)

I would try to discuss it in llvm-dev. It seems like there two things we need to discuss:

  • Is it ok to construct debug information in LLVM Passes? If yes, we could construct corresponding interface to IR module instead of coroutine module.
  • If no, how can we solve these problems? One thought I had is to emit debug type information for LLVM types in the frontend. But I am not sure if it would obstacle the optimizations.

Normally we don't want passes making educated guesses about things and codifying that into debug info, but in this specific instance i don't think it's unreasonable, since the debug frame is an artificial type. Ideally you would find the debug info that's present and use that to describe the frame layout as much as possible. The way frame layout works should allow that kind of reversal well enough.

The biggest conceptual problem is that, for the fields that you can't find debug info for, you'll have to either come up with a C type from the LLVM type or leave them out the layout entirely. The latter seems like it would make this much less useful, but the former is really problematic because the sizes of C's fundamental types are so unportable.

Normally we don't want passes making educated guesses about things and codifying that into debug info, but in this specific instance i don't think it's unreasonable, since the debug frame is an artificial type. Ideally you would find the debug info that's present and use that to describe the frame layout as much as possible. The way frame layout works should allow that kind of reversal well enough.

The biggest conceptual problem is that, for the fields that you can't find debug info for, you'll have to either come up with a C type from the LLVM type or leave them out the layout entirely. The latter seems like it would make this much less useful, but the former is really problematic because the sizes of C's fundamental types are so unportable.

The strategy now when we can't find corresponding debug info is to handle struct and basic type separately. For example, if a struct type with name "%class.foo" in LLVM, this patch would try to name it with class_foo. If it is basic type like i32 or i64, this patch would call it i32 or i64 directly instead of trying to get the corresponding C type like int or long which is not approximate.

ChuanqiXu updated this revision to Diff 336019.Apr 8 2021, 12:29 AM

Rebase with trunk and address the comments.

Given everything that was said, it looks like this approach is the best one, even if it is unusual.

llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1037

I think we might be able to get away without adding the new variables to RetainedNodes. IIRC, the RetainedNodes is there to guarantee that variables that are optimized out still make it into the debug info even if no dbg.value points to them. But for artificial variables like these, there is probably no point in preserving them because they are only interesting if they actually have a value.

ChuanqiXu added inline comments.Apr 8 2021, 8:46 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1037

If we don't add the new variables to RetianedNodes, when user try to pint the frame in the debugger, user may get no symbol __coro_frame in context. And now user may get __coro_frame is optimized out. I think the latter may be more friendly to users.

aprantl added inline comments.Apr 9 2021, 12:38 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1037

If that is a desired outcome, this works for me.

aprantl added inline comments.Apr 9 2021, 12:52 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
346

FieldAlignMap.insert({V, Align})?

930

unsigned ... ;

934

NameCache.insert({ResumeIndex, "__resume_fn"});

aprantl added inline comments.Apr 9 2021, 12:52 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
777

string stream?

778

We should use the constructor that takes a StringRef not the one that takes a char*

854

could this be written as a range-based for?

871

Perhaps use one of the llvm string stream classes here to minimize all the std::string copying?

875

I would also use insert here to make it clearer that we are adding a new entry and not overwriting an existing one

965

NameCache.insert({Index, DIVarCache[V]->getName())});

972

OffsetCache.insert({ResumeIndex, {8, 0}};

Address the comments and enhance the test case.

ChuanqiXu added inline comments.Apr 11 2021, 11:54 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
854

Since we need to use the index I for StructLayout later, I choose the index based for. If we use range-based for, we may need to edit the StructLayout module.

Something @probinson mentioned in the llvm-dev thread is that any debug info this creates should be sure that it doesn't produce duplicate unique entities - if it has to create any support types. Does the code correctly only produce one of them, even if there are multiple coroutines being lowered?

ChuanqiXu updated this revision to Diff 337376.Apr 14 2021, 2:12 AM

Address comments in thread: https://groups.google.com/g/llvm-dev/c/hfZ6z_MJFOw

  • Add artificial flags to all the types created by this pass.
  • Use __ prefix for type created to avoid name collisions.
  • Add tests to show two coroutines in one module would reuse the created base types.

Something @probinson mentioned in the llvm-dev thread is that any debug info this creates should be sure that it doesn't produce duplicate unique entities - if it has to create any support types. Does the code correctly only produce one of them, even if there are multiple coroutines being lowered?

Thanks for reminding! I don't know why I don't get the responding from @probinson even in the spam list. And I find it here: https://groups.google.com/g/llvm-dev/c/hfZ6z_MJFOw.
It looks like it wouldn't create redundant base types. But it would still create one coroutine frame type for each coroutine. It makes sense because the layout of coroutine is different with each other (Although it could be the same, as the updated test case tells).
And I didn't do type cache in the module level actually. So I guess it is handled by the infrastructure of the DI system.

Something @probinson mentioned in the llvm-dev thread is that any debug info this creates should be sure that it doesn't produce duplicate unique entities - if it has to create any support types. Does the code correctly only produce one of them, even if there are multiple coroutines being lowered?

Thanks for reminding! I don't know why I don't get the responding from @probinson even in the spam list. And I find it here: https://groups.google.com/g/llvm-dev/c/hfZ6z_MJFOw.
It looks like it wouldn't create redundant base types. But it would still create one coroutine frame type for each coroutine. It makes sense because the layout of coroutine is different with each other (Although it could be the same, as the updated test case tells).
And I didn't do type cache in the module level actually. So I guess it is handled by the infrastructure of the DI system.

Hmm - does the test case cover this situation where there are two coroutine frames independently constructed/lowered, but they share basic types? My first look at some of the basic types created only shows them being used in one frame.

Also, can this end up creating multiple types and variables with the same name in the same scope? If so, maybe those types and variables (the coro_frame types and variables specifically, from what I can see) should be unnamed instead of duplicately named? (or their names should somehow be uniqued, but I imagine that's non-trivial, having to look to see what coroutines have been lowered already/what names are already used)

llvm/lib/Transforms/Coroutines/CoroFrame.cpp
762–768

I'd probably write this with llvm::find_if, if I'm understanding the code correctly it would be equival/ent to this:

auto DDIs = FindDbgDeclareUses(V);
auto I = llvm::find_if(DDIs, [](DbgDeclareInst* DDI) { return DDI->getExpression()->getNumElements() == 0; });
if (I != DDIs.end())
  DIVarCache.insert({V, DDI->getVariable()});
783–788
824–825

This does two map lookups - best to do it with one by reusing the found iterator:

if (DIType *DT = DITypeCache.lookup(Ty))
  return DT;

(assuming there aren't any null values in the cache)

ChuanqiXu updated this revision to Diff 339090.Apr 20 2021, 8:32 PM

Address the inline comments.

Something @probinson mentioned in the llvm-dev thread is that any debug info this creates should be sure that it doesn't produce duplicate unique entities - if it has to create any support types. Does the code correctly only produce one of them, even if there are multiple coroutines being lowered?

Thanks for reminding! I don't know why I don't get the responding from @probinson even in the spam list. And I find it here: https://groups.google.com/g/llvm-dev/c/hfZ6z_MJFOw.
It looks like it wouldn't create redundant base types. But it would still create one coroutine frame type for each coroutine. It makes sense because the layout of coroutine is different with each other (Although it could be the same, as the updated test case tells).
And I didn't do type cache in the module level actually. So I guess it is handled by the infrastructure of the DI system.

Hmm - does the test case cover this situation where there are two coroutine frames independently constructed/lowered, but they share basic types? My first look at some of the basic types created only shows them being used in one frame.

Also, can this end up creating multiple types and variables with the same name in the same scope? If so, maybe those types and variables (the coro_frame types and variables specifically, from what I can see) should be unnamed instead of duplicately named? (or their names should somehow be uniqued, but I imagine that's non-trivial, having to look to see what coroutines have been lowered already/what names are already used)

Thanks for the detailed review!

Hmm - does the test case cover this situation where there are two coroutine frames independently constructed/lowered, but they share basic types? My first look at some of the basic types created only shows them being used in one frame.

Yes, two coroutines would share basic types like i32 and double. I updated the test case to cover that.

Also, can this end up creating multiple types and variables with the same name in the same scope? If so, maybe those types and variables (the coro_frame types and variables specifically, from what I can see) should be unnamed instead of duplicately named? (or their names should somehow be uniqued, but I imagine that's non-trivial, having to look to see what coroutines have been lowered already/what names are already used)

Maybe I don't get your point correctly. I think it is necessary to create name for __coro_frame (the created variables) and member types including (__resume_fn, __destroy_fn and __coro_index). Since the users of debugger want to see coroutine frame by p __coro_frame in the debugger, I worries if we don't create names, then the end user can't print the variable or they can't get the meaning. It may be OK different coroutines uses the same name since their scopes are not the same. I think it is a little bit like:

void foo() {
    strcut {
      // ... whatever
    } local_type;
    local_type my_var;
}
void bar() {
    strcut {
      // ... whatever
    } local_type;
    local_type my_var;
}

Both foo and bar have local_type and my_var. It wouldn't be confusing since we can know the context in the debugger.

I'm mostly happy with this now. Do you think it makes sense to factor out all this code into a new file, so it doesn't get confused

llvm/lib/Transforms/Coroutines/CoroFrame.cpp
341

What does this function add over FieldAlignMap.lookup(V)?

871

Unresolved

873

I would use https://llvm.org/doxygen/classllvm_1_1raw__string__ostream.html here, since it knows about StringRefs. Name.str() will create another temporary copy that is then copied into the stream buffer.

883

How about "Build artificial debug info for C++ coroutine frames to allow users to inspect the contents of the frame directly."?

888

declare

924

I believe it would be best to set the Artificial flag on everything that is inserted by this function, since there is no source code that these types and variables belong to.

Something @probinson mentioned in the llvm-dev thread is that any debug info this creates should be sure that it doesn't produce duplicate unique entities - if it has to create any support types. Does the code correctly only produce one of them, even if there are multiple coroutines being lowered?

Thanks for reminding! I don't know why I don't get the responding from @probinson even in the spam list. And I find it here: https://groups.google.com/g/llvm-dev/c/hfZ6z_MJFOw.
It looks like it wouldn't create redundant base types. But it would still create one coroutine frame type for each coroutine. It makes sense because the layout of coroutine is different with each other (Although it could be the same, as the updated test case tells).
And I didn't do type cache in the module level actually. So I guess it is handled by the infrastructure of the DI system.

Hmm - does the test case cover this situation where there are two coroutine frames independently constructed/lowered, but they share basic types? My first look at some of the basic types created only shows them being used in one frame.

Also, can this end up creating multiple types and variables with the same name in the same scope? If so, maybe those types and variables (the coro_frame types and variables specifically, from what I can see) should be unnamed instead of duplicately named? (or their names should somehow be uniqued, but I imagine that's non-trivial, having to look to see what coroutines have been lowered already/what names are already used)

Thanks for the detailed review!

Hmm - does the test case cover this situation where there are two coroutine frames independently constructed/lowered, but they share basic types? My first look at some of the basic types created only shows them being used in one frame.

Yes, two coroutines would share basic types like i32 and double. I updated the test case to cover that.

Also, can this end up creating multiple types and variables with the same name in the same scope? If so, maybe those types and variables (the coro_frame types and variables specifically, from what I can see) should be unnamed instead of duplicately named? (or their names should somehow be uniqued, but I imagine that's non-trivial, having to look to see what coroutines have been lowered already/what names are already used)

Maybe I don't get your point correctly. I think it is necessary to create name for __coro_frame (the created variables) and member types including (__resume_fn, __destroy_fn and __coro_index). Since the users of debugger want to see coroutine frame by p __coro_frame in the debugger, I worries if we don't create names, then the end user can't print the variable or they can't get the meaning. It may be OK different coroutines uses the same name since their scopes are not the same. I think it is a little bit like:

void foo() {
    strcut {
      // ... whatever
    } local_type;
    local_type my_var;
}
void bar() {
    strcut {
      // ... whatever
    } local_type;
    local_type my_var;
}

Both foo and bar have local_type and my_var. It wouldn't be confusing since we can know the context in the debugger.

Ah, if the types and variables are always local to independent functions (please confirm that two of these entities can't appear in the same scope), that's fine by me.

ChuanqiXu updated this revision to Diff 339460.Apr 21 2021, 9:19 PM

Address the comments.

I'm mostly happy with this now. Do you think it makes sense to factor out all this code into a new file, so it doesn't get confused

Thanks for reviewing! I am a little confused about the sentence for 'factor out all this code into a new file'. Does it suggest me to move the code in this patch to outer of coroutine module as a basic infrastructure? Or does it suggest me to move these code into a new file in coroutine module?

If it means we need to move these codes as a basic infrastructure, it may not make sense due to all of us agree on we could only construct debug info in the special case coroutine.
If it means we need to move these codes to a new file in the coroutine module, my opinion is that it is more easy to read these codes if we keep them in the CoroFrame.cpp. Since the total lines of codes added in CoroFrame.cpp is about 370, I think it may be OK to remain these codes in the CoroFrame.cpp now. I have a plan to extract the analysis part of CoroFrame.cpp into Analysis module as an analysis pass. After that, the lines of codes in CoroFrame.cpp could be less than now. So it may not be so confusing.

Ah, if the types and variables are always local to independent functions (please confirm that two of these entities can't appear in the same scope), that's fine by me.

Yes, the types (__coro_frame_ty) and variables (__coro_frame) are always local. Without inlining, there would be one and only coroutine frame in each coroutine. If inlining is turned on, there may be multiple coroutine frame in one function. And as far as I know, the debug info during inlining have been handled already. I mean, there wouldn't be two of __coro_frame appear in one DIScope.

llvm/lib/Transforms/Coroutines/CoroFrame.cpp
341

Compared with FieldAlignMap.lookup(V), this function add an assertion. I use this style to match the style in the context.

lxfind added inline comments.Apr 22 2021, 11:51 AM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
345

You will only be calling setAlign and setOffset once for each field, which is different from the index (index is set twice). So you should only be asserting count=0 here and also for setOffset.

371–372

Please add comment to the fields

900

How does debug_compile_units work? Could it contain a mix of C++ and other languages?

911

I am always curious, when would PromiseAlloca to not be present?

1487

Builder should no longer be initialized with CB->getNextNode, but just the context

2653

redundant "only"

llvm/lib/Transforms/Coroutines/CoroInternal.h
136–137

Comment about the purpose

Address the comments.

ChuanqiXu added inline comments.Apr 22 2021, 10:58 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
371–372

Add comments for when would these fields be initialized. So others could use them after that. I didn't constraint the usage in the comments since if we use them in other places, the comments may be inconsistent with the code.

911

Done. It would happen in test cases. I should had edit the test cases and add an assertion here.

1487

IRBuilder<> Builder(CB->getNextNode()) tells where builder should insert instructions. I think it is not redundant.

llvm/lib/Transforms/Coroutines/CoroInternal.h
136–137

I think the meaning of the new added fields are clear from their name. And for the purpose, I wouldn't like to constrain it in the comments. Because if we use them in other places in the future, it is very easy to become inconsistent.

Update format.

lxfind added inline comments.Apr 23 2021, 7:18 AM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
900

Does this mean that if there is any C++ files, we don't build debug info?

913

Sorry I didn't mean to ask for this change, but more of curiosity, I see that in many places we check whether PromiseAlloca is nullptr under switch-lowering. So I wonder if there exists legit cases where PromiseAloca is null?

1487

If you look closely, this insert position is never used. In fact, it's confusing to set the insert position to the next instruction of CoroBegin after this patch, because now the next instruction of CoroBegin is the bitcast of the frame pointer. You don't want to insert anything before the frame pointer construction anyway.

dblaikie added inline comments.Apr 23 2021, 12:24 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
900

aside, I guess this should probably be using llvm::none_of - it'll shortcircuit (won't keep testing every CU if it finds one that is C++) and is more clear about the semantic goal of the test compared to the !count current phrasing.

But also: Why is the code searching all the CUs? Wouldn't it be better to test the CU of the Function being passed in only? Then this code won't behave differently under LTO?

ChuanqiXu updated this revision to Diff 340429.Apr 25 2021, 9:48 PM

Address the comments.

ChuanqiXu added inline comments.Apr 25 2021, 9:54 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
900

@dblaikie Thanks for reminding! Previously I am not so familiar with the API system for debug. It makes more sense to test the CU for the Function only.

This wouldn't behave differently under LTO since the related intrinsic would be lowered after we construct the coroutine frame.

913

To my knowledge, there wouldn't be the cases the PromiseAlloca is null even the promise_type in C++ are an empty class. I guess other codes are also workarounds to the test cases. We could clean them in future.

1487

Done. You are right.

ChuanqiXu added inline comments.Apr 25 2021, 11:02 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
904

I am not sure if the check for DIS->getUnit() is redundant.

ChuanqiXu updated this revision to Diff 340485.Apr 26 2021, 5:00 AM

Fix clang-tidy problems.

dblaikie added inline comments.Apr 26 2021, 10:33 AM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
900

@dblaikie Thanks for reminding! Previously I am not so familiar with the API system for debug. It makes more sense to test the CU for the Function only.

This wouldn't behave differently under LTO since the related intrinsic would be lowered after we construct the coroutine frame.

I don't understand this, sorry. Do you mean that this lowering pass always happens before LTO? So the old code was only ever really (in the common use case) going to see one CU in the CU list anyway?

ChuanqiXu added inline comments.Apr 26 2021, 6:55 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
900

Yes, this lowering pass always happens before LTO.

Internally, in CoroSplit Pass, who would call the codes in CoroFrame, it would collect if the current function has attribute CORO_PRESPLIT_ATTR to decide whether to run it actually. After CoroSplit is done actually, the attribute CORO_PRESPLIT_ATTR would be removed, so a function shouldn't run CoroSplit pass twice actually.

dblaikie added inline comments.Apr 26 2021, 7:09 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
900

Fair enough - thanks for the details.

aprantl accepted this revision.May 12 2021, 7:07 PM

Sorry for the delay. LGTM assuming @dblaikie is happy too.

This revision is now accepted and ready to land.May 12 2021, 7:07 PM
This revision was landed with ongoing or failed builds.May 12 2021, 9:43 PM
This revision was automatically updated to reflect the committed changes.

Sorry for the delay. LGTM assuming @dblaikie is happy too.

Yeah - sounds good. I don't have any further feedback.

@ChuanqiXu the use of DW_ATE_address was a surprise to our debugger; it is a very unusual encoding, and there are no other uses of it within LLVM. I see there was no Release Note regarding the description of coroutine frames, which might have helped us notice.

More importantly, I see two places where DW_ATE_address is used in constructing the DWARF, and yet the tests do not look for it. Please enhance the tests to cover these cases. I can file an issue if you would like, to keep track of this.

Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2022, 5:49 AM

@ChuanqiXu the use of DW_ATE_address was a surprise to our debugger; it is a very unusual encoding, and there are no other uses of it within LLVM. I see there was no Release Note regarding the description of coroutine frames, which might have helped us notice.

More importantly, I see two places where DW_ATE_address is used in constructing the DWARF, and yet the tests do not look for it. Please enhance the tests to cover these cases. I can file an issue if you would like, to keep track of this.

Hmm, yeah, that's curious. Also seems noteworthy (I only checked quickly, so maybe I missed something) there seems to be no mention of DW_ATE_address in the test cases. Maybe missing test coverage?

But to produce more common/likely-usable DWARF would be to use a pointer_type with no DW_ATE attribute (that's what Clang/LLVM produce for "void*" (DW_TAG_pointer_type with no attributes))

@ChuanqiXu the use of DW_ATE_address was a surprise to our debugger; it is a very unusual encoding, and there are no other uses of it within LLVM. I see there was no Release Note regarding the description of coroutine frames, which might have helped us notice.

Oh, sorry for forgetting about the notes to users. Would try to add one.

More importantly, I see two places where DW_ATE_address is used in constructing the DWARF, and yet the tests do not look for it. Please enhance the tests to cover these cases. I can file an issue if you would like, to keep track of this.

But to produce more common/likely-usable DWARF would be to use a pointer_type with no DW_ATE attribute (that's what Clang/LLVM produce for "void*" (DW_TAG_pointer_type with no attributes))

So, it looks like it is not as easy as I imaged to add a test simply, is it? If it is the case, maybe we need an issue actually.

@ChuanqiXu the use of DW_ATE_address was a surprise to our debugger; it is a very unusual encoding, and there are no other uses of it within LLVM. I see there was no Release Note regarding the description of coroutine frames, which might have helped us notice.

Oh, sorry for forgetting about the notes to users. Would try to add one.

More importantly, I see two places where DW_ATE_address is used in constructing the DWARF, and yet the tests do not look for it. Please enhance the tests to cover these cases. I can file an issue if you would like, to keep track of this.

But to produce more common/likely-usable DWARF would be to use a pointer_type with no DW_ATE attribute (that's what Clang/LLVM produce for "void*" (DW_TAG_pointer_type with no attributes))

So, it looks like it is not as easy as I imaged to add a test simply, is it? If it is the case, maybe we need an issue actually.

What makes this hard to test? I'd expect putting "assert(false)" in whatever code is currently creating the DW_ATE_address type, testing which test fails, rerunning the test without the assertion and dumping its output should reveal the DW_ATE_address in its output that should be tested (& possibly changed to use a DW_TAG_pointer_type instead)

I have filed https://github.com/llvm/llvm-project/issues/55916 for the problems described in the post-commit review comments.

@ChuanqiXu the use of DW_ATE_address was a surprise to our debugger; it is a very unusual encoding, and there are no other uses of it within LLVM. I see there was no Release Note regarding the description of coroutine frames, which might have helped us notice.

Oh, sorry for forgetting about the notes to users. Would try to add one.

More importantly, I see two places where DW_ATE_address is used in constructing the DWARF, and yet the tests do not look for it. Please enhance the tests to cover these cases. I can file an issue if you would like, to keep track of this.

But to produce more common/likely-usable DWARF would be to use a pointer_type with no DW_ATE attribute (that's what Clang/LLVM produce for "void*" (DW_TAG_pointer_type with no attributes))

So, it looks like it is not as easy as I imaged to add a test simply, is it? If it is the case, maybe we need an issue actually.

What makes this hard to test? I'd expect putting "assert(false)" in whatever code is currently creating the DW_ATE_address type, testing which test fails, rerunning the test without the assertion and dumping its output should reveal the DW_ATE_address in its output that should be tested (& possibly changed to use a DW_TAG_pointer_type instead)

Sorry, I haven't looked into the details. I would try to look at it later.