This is an archive of the discontinued LLVM Phabricator instance.

[Coroutine][DebugInfo] Enhance the ability of coroutine to debug parameters under O2
AbandonedPublic

Authored by dongAxis1944 on Jan 11 2021, 12:21 AM.

Details

Summary

Coroutine copy arguments to the coroutine frame, so it will case coroutine might drop some debug info.
This patch will propagate the debugging information of the parameters to the resume function of the coroutine to help lldb and gdb print out the parameters in certain situations.

test plan:
check-llvm

Diff Detail

Event Timeline

dongAxis1944 created this revision.Jan 11 2021, 12:21 AM
dongAxis1944 requested review of this revision.Jan 11 2021, 12:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 11 2021, 12:21 AM
dongAxis1944 edited the summary of this revision. (Show Details)Jan 11 2021, 12:22 AM

Hi Yifeng,

Thanks for improving this. Can you add testcases to illustrate the "certain situations" you mentioned? Also, can you elaborate on the propagation approach? DeleteUselessDbgInfo looks potentially expensive, is it really needed?

aprantl added inline comments.Jan 12 2021, 1:27 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1103

Nit: According to the LLVM coding style this should be a full sentence:
// Copy debug intrinsics from the entry block if one exists.

1104

Why not just
DIBuilder DIB(*M, false)?

1147

This comment is redundant with the function name, I would just skip it.

1159

. at the end

1162

why filter out artificial arguments? This will filter out the this pointer for example.

1166

Why not handle all of them in loop? That would give the frontend more freedom.

1220

Aren't we unconditionally create DIB above?

1220

and then do for (DebugIntrinsic : DebugIntrinsics) here

1225

// Try to get the scope from the gep or use the original one.

1228

Why do we need to do this manually? Shouldn't salvageDebugInfo be able to translate a GEP into a combination of DW_OP_plus and DW_OP_deref?

llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
99

This statement is not correct.

Imagine the following code:

void f(int i) {
  if (i==42) ...;
 ...
}

in IR:

if.then:
  dbg.value(i32 42, DILocalVariable("i"), DIExpression())
  // optimized out
  br %if.end
if.end:
  // Let's assume "i" has been optimized away for here on.
  dbg.value(i32 undef, DILocalVariable("i"), DIExpression())

If we remove the undef dbg.value, the debug info will make it look as if the value for i was unconditionally 42 on all paths, which would be wrong in all paths where the branch wasn't taken.

My point is, it is never safe to remove a debug intrinsic. We may be able to remove them inside of the CoroFrame code based on special knowledge that we have there, but it can't be a general helper function in Local.cpp, because what the function does isn't generally correct.

149

We can't do this here, unfortunately.

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

Thanks for reviewing, I learn a lot.

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

i will upadte patch and remove DeleteUselessDbgInfo

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

Hi @aprantl,
According to the standard, coroutine needs to copy parameters (calling constructor) passed to the coroutine function by the original caller into the coroutine frame. So in frontend, we fake local variables for this behavior, that why we see two variables in debuginfo.

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

Hi @aprantl,
According to the standard, coroutine needs to copy parameters (calling constructor) passed to the coroutine function by the original caller into the coroutine frame. So in frontend, we fake local variables for this behavior, that why we see two variables in debuginfo.

So, this is not the bug, shold we avoid this by using anonymous fake variables?

I think if we use fake variable to solve this, it might cause the patch https://reviews.llvm.org/D93497 fail to print variable under O0 level

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

Hi @aprantl,
According to the standard, coroutine needs to copy parameters (calling constructor) passed to the coroutine function by the original caller into the coroutine frame. So in frontend, we fake local variables for this behavior, that why we see two variables in debuginfo.

So, this is not the bug, shold we avoid this by using anonymous fake variables?

I think if we use fake variable to solve this, it might cause the patch https://reviews.llvm.org/D93497 fail to print variable under O0 level

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

Hi @aprantl,
According to the standard, coroutine needs to copy parameters (calling constructor) passed to the coroutine function by the original caller into the coroutine frame. So in frontend, we fake local variables for this behavior, that why we see two variables in debuginfo.

So, this is not the bug, shold we avoid this by using anonymous fake variables?

If I understand correctly, what ends up as the "real" function parameters in DWARF are the function arguments passed into the coro entry point, and the artificial variables are what is stored in the CoroFrame, and those are the only ones visible in the .resume funclets. Is this correct? If yes, we probably do want both to be visible, because they end up in different funclets.

This comment was removed by dongAxis1944.

Maybe we can make the two "variables" use only one dwarf var.
It might solve this problem, I think. Do you have any ideas?

I think if we use fake variable to solve this, it might cause the patch https://reviews.llvm.org/D93497 fail to print variable under O0 level

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

Hi @aprantl,
According to the standard, coroutine needs to copy parameters (calling constructor) passed to the coroutine function by the original caller into the coroutine frame. So in frontend, we fake local variables for this behavior, that why we see two variables in debuginfo.

So, this is not the bug, shold we avoid this by using anonymous fake variables?

If I understand correctly, what ends up as the "real" function parameters in DWARF are the function arguments passed into the coro entry point, and the artificial variables are what is stored in the CoroFrame, and those are the only ones visible in the .resume funclets. Is this correct? If yes, we probably do want both to be visible, because they end up in different funclets.

I think if we use fake variable to solve this, it might cause the patch https://reviews.llvm.org/D93497 fail to print variable under O0 level

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

Hi @aprantl,
According to the standard, coroutine needs to copy parameters (calling constructor) passed to the coroutine function by the original caller into the coroutine frame. So in frontend, we fake local variables for this behavior, that why we see two variables in debuginfo.

So, this is not the bug, shold we avoid this by using anonymous fake variables?

If I understand correctly, what ends up as the "real" function parameters in DWARF are the function arguments passed into the coro entry point, and the artificial variables are what is stored in the CoroFrame, and those are the only ones visible in the .resume funclets. Is this correct? If yes, we probably do want both to be visible, because they end up in different funclets.

Sorry for the late reply. With O1 or above, yes, we can only see artificial variables in .resume function. IIUC, we want see "real" function parameters in ramp function and artificial variables in resume function and they do have different scope, then the debugger may print them correctly. am i right?

I think if we use fake variable to solve this, it might cause the patch https://reviews.llvm.org/D93497 fail to print variable under O0 level

Do we understand *why* the frontend is emitting the variables twice? If that just a bug in the clang frontend (which would be my first guess) then we should aim at fixing the frontend instead. If the frontend is trying to do something different, I would love to understand what that is.

Hi @aprantl,
According to the standard, coroutine needs to copy parameters (calling constructor) passed to the coroutine function by the original caller into the coroutine frame. So in frontend, we fake local variables for this behavior, that why we see two variables in debuginfo.

So, this is not the bug, shold we avoid this by using anonymous fake variables?

If I understand correctly, what ends up as the "real" function parameters in DWARF are the function arguments passed into the coro entry point, and the artificial variables are what is stored in the CoroFrame, and those are the only ones visible in the .resume funclets. Is this correct? If yes, we probably do want both to be visible, because they end up in different funclets.

Sorry for the late reply. With O1 or above, yes, we can only see artificial variables in .resume function. IIUC, we want see "real" function parameters in ramp function and artificial variables in resume function and they do have different scope, then the debugger may print them correctly. am i right?

@aprantl any update on this?

If I understand correctly, what ends up as the "real" function parameters in DWARF are the function arguments passed into the coro entry point, and the artificial variables are what is stored in the CoroFrame, and those are the only ones visible in the .resume funclets. Is this correct? If yes, we probably do want both to be visible, because they end up in different funclets.

With O1 or above, yes, we can only see artificial variables in .resume function. IIUC, we want see "real" function parameters in ramp function and artificial variables in resume function and they do have different scope, then the debugger may print them correctly. am i right?

I don't think that we want the variables in the resume functions to be artificial. As far as the end-user is concerned, these are the real variables, the fact that they they have been processed by the compiler in order to pass them to the resume function is an implementation detail that the end-user shouldn't need to know about. I think it would be best to change the Clang frontend to emit these variables not as artificial. They are currently being marked as artificial because they are derived from an implicit declaration. We could either change the AST to drop the implicit bit, or special-case them in CGDebugInfo.

LLDB (and maybe other debuggers too) hide all artificial variables by default, so it would valuable to not mark them as artificial.

If I understand correctly, what ends up as the "real" function parameters in DWARF are the function arguments passed into the coro entry point, and the artificial variables are what is stored in the CoroFrame, and those are the only ones visible in the .resume funclets. Is this correct? If yes, we probably do want both to be visible, because they end up in different funclets.

With O1 or above, yes, we can only see artificial variables in .resume function. IIUC, we want see "real" function parameters in ramp function and artificial variables in resume function and they do have different scope, then the debugger may print them correctly. am i right?

I don't think that we want the variables in the resume functions to be artificial. As far as the end-user is concerned, these are the real variables, the fact that they they have been processed by the compiler in order to pass them to the resume function is an implementation detail that the end-user shouldn't need to know about. I think it would be best to change the Clang frontend to emit these variables not as artificial. They are currently being marked as artificial because they are derived from an implicit declaration. We could either change the AST to drop the implicit bit, or special-case them in CGDebugInfo.

LLDB (and maybe other debuggers too) hide all artificial variables by default, so it would valuable to not mark them as artificial.

@aprantl, In some cases, the ramp function may contains both of the variables, this may confuse the debuggers.

hi @junparser & @aprantl

As @junparser mentioned, there are some senses that both variables will be used. When it happens, gdb or lldb will fail to print some variables under o1 and above.
For solving this problems, we would like to merge two variables into one.
Do you have any suggestions?

Thanks in advance.

If I understand correctly, what ends up as the "real" function parameters in DWARF are the function arguments passed into the coro entry point, and the artificial variables are what is stored in the CoroFrame, and those are the only ones visible in the .resume funclets. Is this correct? If yes, we probably do want both to be visible, because they end up in different funclets.

With O1 or above, yes, we can only see artificial variables in .resume function. IIUC, we want see "real" function parameters in ramp function and artificial variables in resume function and they do have different scope, then the debugger may print them correctly. am i right?

I don't think that we want the variables in the resume functions to be artificial. As far as the end-user is concerned, these are the real variables, the fact that they they have been processed by the compiler in order to pass them to the resume function is an implementation detail that the end-user shouldn't need to know about. I think it would be best to change the Clang frontend to emit these variables not as artificial. They are currently being marked as artificial because they are derived from an implicit declaration. We could either change the AST to drop the implicit bit, or special-case them in CGDebugInfo.

LLDB (and maybe other debuggers too) hide all artificial variables by default, so it would valuable to not mark them as artificial.

@aprantl, In some cases, the ramp function may contains both of the variables, this may confuse the debuggers.

hi @junparser & @aprantl

As @junparser mentioned, there are some senses that both variables will be used. When it happens, gdb or lldb will fail to print some variables under o1 and above.
For solving this problems, we would like to merge two variables into one.
Do you have any suggestions?

Thanks in advance.

If I understand correctly, what ends up as the "real" function parameters in DWARF are the function arguments passed into the coro entry point, and the artificial variables are what is stored in the CoroFrame, and those are the only ones visible in the .resume funclets. Is this correct? If yes, we probably do want both to be visible, because they end up in different funclets.

With O1 or above, yes, we can only see artificial variables in .resume function. IIUC, we want see "real" function parameters in ramp function and artificial variables in resume function and they do have different scope, then the debugger may print them correctly. am i right?

I don't think that we want the variables in the resume functions to be artificial. As far as the end-user is concerned, these are the real variables, the fact that they they have been processed by the compiler in order to pass them to the resume function is an implementation detail that the end-user shouldn't need to know about. I think it would be best to change the Clang frontend to emit these variables not as artificial. They are currently being marked as artificial because they are derived from an implicit declaration. We could either change the AST to drop the implicit bit, or special-case them in CGDebugInfo.

LLDB (and maybe other debuggers too) hide all artificial variables by default, so it would valuable to not mark them as artificial.

@aprantl, In some cases, the ramp function may contains both of the variables, this may confuse the debuggers.

Also, under O0, the ramp and resume contain both of the variables as well since we keep everything.

dongAxis1944 abandoned this revision.Mar 3 2021, 9:52 PM