This is an archive of the discontinued LLVM Phabricator instance.

[Debug] [Coroutines] Get rid of DW_ATE_address
ClosedPublic

Authored by ChuanqiXu on Jun 13 2022, 2:48 AM.

Details

Summary

Fixing https://github.com/llvm/llvm-project/issues/55916

This patch tries to get rid of DW_ATE_address and enhance the test coverage.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Jun 13 2022, 2:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 13 2022, 2:48 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
ChuanqiXu requested review of this revision.Jun 13 2022, 2:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 13 2022, 2:48 AM
ChuanqiXu set the repository for this revision to rG LLVM Github Monorepo.
dblaikie added inline comments.Jun 14 2022, 1:31 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

How come this one became unsigned char rather than pointer type?

llvm/test/Transforms/Coroutines/coro-debug-coro-frame.ll
16–17

CHECK-NOT only applies between one CHECK and the next, so this probably won't do much/not enough to carry its weight, I'd guess?

(--implicit-check-not as an argument to FileCheck anti-checks for the value everywhere, but even then - I'm happy to drop this testing if the source code no longer mentions DW_ATE_address anywhere anyway, and the things being emitted are being actively tested for instead)

37

Why does this type have a name? I think it should probably be nameless - so it should merge with other instances of the same pointer type created for the original source code, etc.

72–73

why are there changes to the IR here? I guess they're necessary to create the interesting debug info IR being tested?

Address comments.

ChuanqiXu marked an inline comment as done.Jun 14 2022, 11:29 PM
ChuanqiXu added inline comments.
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

The intention of the 'else' section is to create a DIType for an unresolved type. And the unresolved type lives in the coroutine frame. So it is a little bit odd for me to treat it as a pointer type. Since the size of the unresolved type might be larger than a pointer.

The choice to use DW_ATE_unsigned_char might not be correct. Here is my thoughts. I want to treat the unresolved type as bytes. I want to make it as something like "I don't know what you are so I will present the bytes directly". And DW_ATE_unsigned_char is the closest thing I found in DW_ATE* things. How do you think about it?

This is covered in the test as vector types. Since we failed to recognize vector types now, we would treat it as a unresolved type with 128 size.

llvm/test/Transforms/Coroutines/coro-debug-coro-frame.ll
37

The name here is try to provide more information as much as possible. Ideally, I think this one should be the int* type. It shows the the ability of type resolver in current coroutine compiler is not strong enough. Given this revision aims to get rid of DW_ATE_address, I would try to fix the problem in other revisions.

For the duplicated DITypes problem, this is tested in the following function 'bar' that the generated DIType would be reused.

72–73

This is intended to test the result of unresolved type.

dblaikie added inline comments.Jun 15 2022, 10:50 AM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

So this is going to create an arbitrarily large "unsigned char" type (that might be tens of bytes, etc?) - is it at least a multiple of whole bytes? Maybe it should be a char array instead?

llvm/test/Transforms/Coroutines/coro-debug-coro-frame.ll
37

The name here is try to provide more information as much as possible. Ideally, I think this one should be the int* type.

Any reason it can't be int*? Currently it's void*.

Given this revision aims to get rid of DW_ATE_address, I would try to fix the problem in other revisions.

Sure

For the duplicated DITypes problem, this is tested in the following function 'bar' that the generated DIType would be reused.

I was thinking of the types created from the original C++ source that has pointers in it too - not other types created by the coroutine lowering code.

ChuanqiXu added inline comments.Jun 15 2022, 7:26 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

So this is going to create an arbitrarily large "unsigned char" type (that might be tens of bytes, etc?)

Yes.

is it at least a multiple of whole bytes? Maybe it should be a char array instead?

From my understanding, I feel like the allowed choices is listed in https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/Dwarf.def#L917-L937. I feel like a char array might be good. But I am not sure how to make it.

llvm/test/Transforms/Coroutines/coro-debug-coro-frame.ll
37

Any reason it can't be int*? Currently it's void*.

It is due to the current mechanism to solve LLVM types to DITypes is not mature enough. My bad. Let's enhance them later.

I was thinking of the types created from the original C++ source that has pointers in it too - not other types created by the coroutine lowering code.

It would try to find @dbg.declare intrinsics for variables in the frame. If founded, it would reuse that one. But if it failed to find one, it would only try to create one.

dblaikie added inline comments.Jun 16 2022, 2:22 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

You can check Clang's code for making array types - DIBuilder::createArrayType, passed in the char basic type.

llvm/test/Transforms/Coroutines/coro-debug-coro-frame.ll
37

Any reason it can't be int*? Currently it's void*.

It is due to the current mechanism to solve LLVM types to DITypes is not mature enough. My bad. Let's enhance them later.

Not sure I follow - but sure, can handle that in a follow-up patch.

I was thinking of the types created from the original C++ source that has pointers in it too - not other types created by the coroutine lowering code.

It would try to find @dbg.declare intrinsics for variables in the frame. If founded, it would reuse that one. But if it failed to find one, it would only try to create one.

Sure - but what I mean is if it tries to create one it could create one that's more similar to the ones the frontend would create - a named pointer type is a weird beast that we wouldn't usually create in the frontend (pointer types don't have names, unless it's via a typedef - the typedef has the name, the pointer type does not).

ChuanqiXu added inline comments.Jun 17 2022, 12:18 AM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

Thanks! But during the process of implementing it, I find it is more complex than necessary. We need to judge if it is larger than a byte or if it is a multiple of bytes. So that we have at least 3 cases to handle. In my mind, we only need to provide two properties when we meet unknowing types: start address and the size. So that the users could debug in the pretty old fashion (assume the size of the unknown type is 128 in the example):

(gdb) x/128x the_variable_address_of_unknow_types

And the current implementation looks not bad in this direction. I totally understand that it looks not good in the code: how come we get a char in 128 bits ?! It looks odd indeed. But from my experience, the users wouldn't be strict with the Dwarf Type in the case since the type name starts with 'Unknown'. What I want to say here is that: it might not be good indeed but it is not so bad too. The root problem here might be that we're going to construct debug info in middle end. So we would meet some weird things I guess.

llvm/test/Transforms/Coroutines/coro-debug-coro-frame.ll
37

Sure - but what I mean is if it tries to create one it could create one that's more similar to the ones the frontend would create - a named pointer type is a weird beast that we wouldn't usually create in the frontend (pointer types don't have names, unless it's via a typedef - the typedef has the name, the pointer type does not).

Got it. Will do in following patches.

dblaikie added inline comments.Jun 17 2022, 2:46 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

I'm not sure I understand the complexity - I'd asked earlier if it was always a byte multiple, which I think you said it was - in that case I'd always create an array of that many bytes. I don't think this should be complicated.

This would be more likely to be supported by DWARF Consumers (I'm not sure what they'd do with a huge integer type) & probably render more meaningfully for users (rather than as one big integer, array printers in debuggers do smart user-customizable things like printing only the first N array elements and then using ..., etc).

Treat unresolved type as array of bytes instead of a big char.

ChuanqiXu added inline comments.Jun 19 2022, 11:03 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

I'm not sure I understand the complexity - I'd asked earlier if it was always a byte multiple, which I think you said it was - in that case I'd always create an array of that many bytes. I don't think this should be complicated.

Sorry for the confusion. My bad. For the question if it is always a byte multiple, according to my experiments these days, it is always a byte multiple in C++ due to the alignment requirement. I also added an assert (Size % 8 == 0) and run it for some workloads. It looked fine. But for LLVM IR, we could make the assertion failure simply by making a vector type of <9 x i1>. So I add the if part in this revision.

This would be more likely to be supported by DWARF Consumers (I'm not sure what they'd do with a huge integer type) & probably render more meaningfully for users (rather than as one big integer, array printers in debuggers do smart user-customizable things like printing only the first N array elements and then using ..., etc).

You know debugger much than me. I'd follow it.

dblaikie accepted this revision.Jun 24 2022, 11:51 AM

Think this is alright?

llvm/lib/Transforms/Coroutines/CoroFrame.cpp
920–934

OK, sounds good!

This revision is now accepted and ready to land.Jun 24 2022, 11:51 AM
This revision was landed with ongoing or failed builds.Jul 6 2022, 7:48 PM
This revision was automatically updated to reflect the committed changes.