This is an archive of the discontinued LLVM Phabricator instance.

Introduce DW_OP_LLVM_convert
ClosedPublic

Authored by markus on Jan 11 2019, 4:04 AM.

Details

Summary

Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows for a convenient way to perform type conversions on the Dwarf expression stack. As an additional bonus it paves the way for using other Dwarf v5 ops that need to reference a base_type.

  • Use the new operation in llvm::replaceAllDbgUsesWith where it replaces a complex (and broken) shift and mask based expression to perform sign- and zero-extension.
  • Update the AsmPrinter parts to allow for referencing a base_type from Dwarf expressions.

Diff Detail

Repository
rL LLVM

Event Timeline

markus created this revision.Jan 11 2019, 4:04 AM
bjope added a subscriber: bjope.Jan 11 2019, 4:58 AM
aprantl added inline comments.Jan 11 2019, 8:13 AM
lib/Transforms/Utils/Local.cpp
1854 ↗(On Diff #181241)

nit: . at the end.

1855 ↗(On Diff #181241)

I haven't had any coffee yet, but shouldn't that be FromBits and From ?:

00001110 >> 4-1 * ~0 << 4 | 00001110
        1       * ~0 << 4 | 00001110 
        11111111     << 4 | 00001110 
        11110000          | 00001110
                     11111110

ii) It is probably not safe to assume that the consumer/debugger would set the high bits to zero in zero extension in e.g. the case that the variable has been spilled to memory.

Why? The consumer should have truncate the fragment to its DW_AT_size, right?

bjope added inline comments.Jan 11 2019, 8:30 AM
lib/Transforms/Utils/Local.cpp
1855 ↗(On Diff #181241)

This method is replacing one dbg use with another. So "from" is the old value and "to" is the new value. Here we replace a old large value (e.g. i32) by a new smaller value (e.g. i16). So we sign extend from ToBits to FromBits to convert the new value back into something that represents the old value in the debugger. I think I needed both coffee and lunch to understand that we extend from To to From here.

vsk added a comment.Jan 11 2019, 9:15 AM

Thanks for the patch.

t’s not obvious to me from skimming what the bug is with sign extension expressions. Could you describe what goes wrong, and maybe share a small program which shows the debugger behaving incorrectly?

bjope added a comment.Jan 11 2019, 9:36 AM
In D56587#1354394, @vsk wrote:

Thanks for the patch.

t’s not obvious to me from skimming what the bug is with sign extension expressions. Could you describe what goes wrong, and maybe share a small program which shows the debugger behaving incorrectly?

For the signed case the old DIExpression calculated

(signbit * -1) | x

which always resulted in -1 if the sign bit was set and x if the sign bit was unset. So the problem was that we modified the low bits when doing the OR.

What we really want to do is to copy the sign bit into the extended bits, hence the addition of the DW_OP_shl to get

((signbit * -1)  << "number of bits to extend from") | x
aprantl added inline comments.Jan 11 2019, 10:21 AM
lib/Transforms/Utils/Local.cpp
1863 ↗(On Diff #181241)

General question: Should we generate a DWARF 5 type conversion here and lower it to this sequence for DWARF 4 and lower in DwarfExpression.cpp to save memory?

In D56587#1354394, @vsk wrote:

Thanks for the patch.

t’s not obvious to me from skimming what the bug is with sign extension expressions. Could you describe what goes wrong, and maybe share a small program which shows the debugger behaving incorrectly?

What @bjope said and as for an example consider the following on x86

#include <stdlib.h>
short foo(short x) {
  int y = x;
  return y;
}
int main(int argc, char **argv) {
  return foo(strtol(argv[1], NULL, 0));
}

Compiled in a somewhat contrived manner

clang dbg.c -O0 -g3 -S -emit-llvm
sed -i 's/optnone//' dbg.ll
opt -mem2reg -instcombine dbg.ll -S -o dbg.opt.ll
llc -O0 dbg.opt.ll -o dbg.s
gcc dbg.s -o dbg
gdb --args ./dbg 0xf000
(gdb) b foo
Breakpoint 1 at 0x400517: file dbg.c, line 4.
(gdb) r
Starting program: /repo/elavkje/llvm/my-dbg-test/dwarf-sext/dbg 0xf000

Breakpoint 1, foo (x=-4096) at dbg.c:4
4             return y;
(gdb) whatis x
type = short
(gdb) p y
$1 = -1
(gdb) whatis y
type = int
(gdb)

It is clear that y should not read as -1 at that point.

ii) It is probably not safe to assume that the consumer/debugger would set the high bits to zero in zero extension in e.g. the case that the variable has been spilled to memory.

Why? The consumer should have truncate the fragment to its DW_AT_size, right?

But wouldn't the DW_AT_size at this point be for the wider type? If it was the case that the consumer could automatically handle the zext then I don't see why it would not also be able to handle the sext by itself.
I think the following example illustrates where it goes wrong. Again contrived as we have to do some tricks to get the consumer to read the value from memory.

typedef unsigned short T;
T foo(T d0, T d1, T d2, T d3, T d4, T d5, T d6, T d7, T d8, T x) {
    unsigned int y = x;
    return y;
}
int main(int argc, char **argv) {
    return foo(0, 1, 2, 3, 4, 5, 6, 7, 8, -1);
}
clang dbg.c -O0 -g3 -S -emit-llvm
sed -i 's/optnone//' dbg.ll
opt -mem2reg -instcombine dbg.ll -S -o dbg.opt.ll
llc -O3 dbg.opt.ll -o dbg.s

Now modify the assembly in the caller to make sure that there are non-zero bits on the stack next to the 'x' argument.
i.e. replace

pushq   $65535                  # imm = 0xFFFF

with

pushq   $-1                  # imm = 0xFFFF

and finally assemble and test

gcc dbg.s -o dbg
gdb ./dbg
(gdb) b foo
Breakpoint 1 at 0x4004c8: file dbg.c, line 4.
(gdb) run
Starting program: /repo/elavkje/llvm/my-dbg-test/dwarf-sext/dbg 

Breakpoint 1, foo (d0=0, d1=1, d2=2, d3=3, d4=4, d5=5, d6=6, d7=7, d8=8, x=65535) at dbg.c:4
4           return y;
(gdb) whatis x
type = T
(gdb) whatis T
type = unsigned short
(gdb) p y
$1 = 4294967295
(gdb) whatis y
type = unsigned int
(gdb)

and it should be clear that zero extension did not happen. Unless my thinking is flawed of course, and that would not be the first time :)

Unfortunately I think that there are additional complications in that our sext/zext computations will have to also depend on the endianness of the target and if the value is stored in memory or not (as having the value loaded as the 'wrong type' would require bytes to be swapped on a big endian target before it could be sext/zext by this expression).

Probably inserting a pseudo op here lowering it at a later stage when it is known if the value will reside in memory would be the right thing to do. Not sure if a DWARF 5 DW_OP_convert would be the easiest option as its argument references another DIE and it seems that would require larger infrastructure changes (but I really don't know anything about this).

Probably inserting a pseudo op here lowering it at a later stage when it is known if the value will reside in memory would be the right thing to do. Not sure if a DWARF 5 DW_OP_convert would be the easiest option as its argument references another DIE and it seems that would require larger infrastructure changes (but I really don't know anything about this).

Referencing a DIE in DIExpression would need a bit of additional work. DIExpression stores its operands in an array of uint64_t, so in order to support MDNode operands we'd have to add them as actual MDNode operands. For example, expr_op_iterator could know which DIExpression operations take MDNode arguments and inject them in the right places. Slightly more complicated would be actually finding a matching DIType in the debug info that we want to point to. If we want to convert to exactly the type of the DIVariable, we can use that type, otherwise we might have to generate new DIBasicTypes on the fly.

Even though this sounds complicated, I still think that it is preferable to encode type conversions as such rather than generating really complicated expressions that happen to have the same effect in the underspecified DWARF 4 stack language.

Referencing a DIE in DIExpression would need a bit of additional work. DIExpression stores its operands in an array of uint64_t, so in order to support MDNode operands we'd have to add them as actual MDNode operands. For example, expr_op_iterator could know which DIExpression operations take MDNode arguments and inject them in the right places. Slightly more complicated would be actually finding a matching DIType in the debug info that we want to point to. If we want to convert to exactly the type of the DIVariable, we can use that type, otherwise we might have to generate new DIBasicTypes on the fly.

Even though this sounds complicated, I still think that it is preferable to encode type conversions as such rather than generating really complicated expressions that happen to have the same effect in the underspecified DWARF 4 stack language.

Yes, long term it is likely the best solution.

I played a bit with trying to insert a DW_OP_convert before I came up with this patch but was clueless on how to retrieve the DIE offset of the DIType when the expression was emitted as that section (.debug_info?) hadn't been emitted yet. If I could get some useful advice on tackling that issue I can have another go at it when I get back to work tomorrow.

Referencing a DIE in DIExpression would need a bit of additional work. DIExpression stores its operands in an array of uint64_t, so in order to support MDNode operands we'd have to add them as actual MDNode operands. For example, expr_op_iterator could know which DIExpression operations take MDNode arguments and inject them in the right places. Slightly more complicated would be actually finding a matching DIType in the debug info that we want to point to. If we want to convert to exactly the type of the DIVariable, we can use that type, otherwise we might have to generate new DIBasicTypes on the fly.

Even though this sounds complicated, I still think that it is preferable to encode type conversions as such rather than generating really complicated expressions that happen to have the same effect in the underspecified DWARF 4 stack language.

Yes, long term it is likely the best solution.

I played a bit with trying to insert a DW_OP_convert before I came up with this patch but was clueless on how to retrieve the DIE offset of the DIType when the expression was emitted as that section (.debug_info?) hadn't been emitted yet. If I could get some useful advice on tackling that issue I can have another go at it when I get back to work tomorrow.

You wouldn't hardcode the offset in IR, you'd refer to the DIType as an MDNode reference and then teach the backend to resolve the reference, similar to how DIE references are resolved in the entire DIType class hierarchy.

In LLVM Assembly this could look like:

!1 = !DIBasicType(name: "short", ...)
!2 = !DIExpression(DW_OP_convert, !1)

but the actual implementation would be as I outlined in my previous reply.

markus added a comment.EditedJan 15 2019, 4:02 AM

You wouldn't hardcode the offset in IR, you'd refer to the DIType as an MDNode reference and then teach the backend to resolve the reference, similar to how DIE references are resolved in the entire DIType class hierarchy.

In LLVM Assembly this could look like:

!1 = !DIBasicType(name: "short", ...)
!2 = !DIExpression(DW_OP_convert, !1)

but the actual implementation would be as I outlined in my previous reply.

Right, I got that part. I have a somewhat temporary solution in place for that which brings me to DwarfExpression::addExpression where I can pickup the corresponding DIBasicType when I see a dwarf::DW_OP_convert. Proceeding from that point seems much more difficult though since the .debug_info section has not been emitted yet and no DIE offsets are available. The code here expects me to emit directly into a ByteStreamer object.. Maybe it is possible to implement some parallel data structure inside DebugLocDwarfExpression to keep track of the convert ops and allow us to insert the DIE offset at a later point when it is available.

I didn't realize you were talking about DwarfExpression, sorry . I'm not really sure what the best solution is because I don't quite understand MCLabels that well. Naively I was assuming that we would emit an MCLabel that is defined as the difference between DIE offset and .debug_info start label. Since we need to emit DIE references all over .debug_info I would start by looking at how it's done there. The fact that it is in a different section may complicate things though, I'm not sure.

bjope added inline comments.Jan 15 2019, 2:22 PM
lib/Transforms/Utils/Local.cpp
1856 ↗(On Diff #181241)

I guess this still is wrong, at least if we end up with a DWARF location description for a memory location.

If for example the variable is 32-bits, and we want to describe it using a 16-bit value, then we will get something like this:

call void @llvm.dbg.value(metadata i16 %value, metadata "!variable", metadata DIExpression(DW_OP_dup, DW_OP_constu, 15, DW_OP_shr, DW_OP_lit0, DW_OP_not, DW_OP_mul, DW_OP_constu, 16, DW_OP_shl, DW_OP_or, DW_OP_stack_value)

In llc this will become a DBG_VALUE. The value could either end up referring to a 16-bit register, or it could refer to a 16-bit stack slot (e.g. if this is an input argument passed on the stack). In the latter case we typically end up prepending the DIExpression with DW_OP_fbreg and an offset. The memory location will point to the 16-bit value.

But we do not really express that the debugger should read a 16-bit value here, right? The debugger will only see that the variable is 32-bits, so it will read 32-bits, right?

For a little endian target we will get garbage in bits 16-31 (since we read outside the 16-bit stack slot). For a big endian target we will get the wanted value in bits 16-31 and garbage in bits 0-15. Either way, the result would be wrong. For little endian we would need to clear bit 16-31 before the OR with the sign-extension mask. For big endian we aren't even operating on the correct bits.

I'm not really sure what happens if the debugger finds a 16-bit register location for the 32-bit variable. Do we know that it only us reading 16-bits to the value stack?

One solution could be to use DW_OP_deref_size when reading from memory, to specify that we only want to read 16 bits. I'm not sure exactly how DwarfExpression could know when this is needed. I guess we can not add the DW_OP_deref_size already here, because it would be wrong in case of ending up with a register location. But maybe we still need to do something more also for the register location scenario when using this approach.

An alternative solution is to describe the variable using two dbg.value intrinsics. One using a fragment for bits 0-15, and another one using a fragment expression for bits 16-31. I guess it would look something like this:

call void @llvm.dbg.value(metadata i16 %value, metadata "!variable", metadata DIExpression(DW_OP_LLVM_fragment 0, 16)
call void @llvm.dbg.value(metadata i16 %value, metadata "!variable", metadata DIExpression(DW_OP_constu, 15, DW_OP_shr, DW_OP_lit0, DW_OP_not, DW_OP_mul, DW_OP_LLVM_fragment 16, 16)

I've seen the discussion about DW_OP_convert. Would DW_OP_convert help in telling the debugger that any derefs should be 16 bits in this case. Then I guess that still would be good for DWARF5.

Similar problem as described above also exists for the zext case below. At least for big endian when dereferencing memory, since we get the wrong value in the least significant bits when reading 32 bits from a 16-bit stack slot.

markus updated this revision to Diff 182271.Jan 17 2019, 6:47 AM

Uploading a prototype implementation using the DW_OP_LLVM_fragment as @bjope described. There are still some issues that need to be worked out with this so it is not final but rather I would like some input on if going down this track would be considered an acceptable solution. If there are clear objections to this approach it would be good to hear them early.

aprantl added inline comments.Jan 17 2019, 8:58 AM
lib/Transforms/Utils/Local.cpp
1844 ↗(On Diff #182271)

I think we should make a public helper function to emit z/sext operations for a DIExpression. Either as a member of DIExpression or as a freestanding function in Local.h. I'm sure this will come in handy elsewhere.

With the fragment approach we still have problems in e.g. lib/CodeGen/PrologEpilogInserter.cpp:1105 where the stack location of a function argument is prepended to the expression and a sized DW_OP_deref would need to be inserted.

While the above is likely solvable I think it is a fair question to ask if would not be cleaner to simply go for a pseudo op approach such as https://reviews.llvm.org/D57010 ?

markus updated this revision to Diff 183857.Jan 28 2019, 7:32 AM

Insert DW_OP_deref_size in Prologue Epilogue Inserter.

markus marked an inline comment as done.Jan 28 2019, 7:39 AM
markus added inline comments.
lib/Transforms/Utils/Local.cpp
1896 ↗(On Diff #183857)

Unfortunately I run into problems when I try having arbitrarily sized fragments ( https://bugs.llvm.org/show_bug.cgi?id=40462 ) so I have to make them byte sized here which produces quite a lot of them ...

markus updated this revision to Diff 184736.Feb 1 2019, 6:37 AM

Time to switch the approach. This time try adding support for typed Dwarf5 stack ops such as DW_OP_convert.

As an example in IR a sext would be represented as (where the DW_OP_convert arguments are indices into dbg.dwarf5.type.tbl):

call void @llvm.dbg.value(metadata i8 %x, metadata !15, metadata !DIExpression(DW_OP_convert, 0, DW_OP_convert, 1, DW_OP_stack_value)), !dbg !17
!dbg.dwarf5.type.tbl = !{!7, !8}
!7 = !DIBasicType(name: "s8", size: 8, encoding: DW_ATE_signed)
!8 = !DIBasicType(name: "s32", size: 32, encoding: DW_ATE_signed)

In assembly (x86) we get a labeled base_type entries in .debug_info

.Ltype_tbl0:                                                                                                                                                                                                       
        .byte   2                       # Abbrev [2] 0x2a:0x7 DW_TAG_base_type                                                                                                                                     
        .long   .Linfo_string3          # DW_AT_name                                                                                                                                                               
        .byte   5                       # DW_AT_encoding                                                                                                                                                           
        .byte   1                       # DW_AT_byte_size                                                                                                                                                          
.Ltype_tbl1:                                                                                                                                                                                                       
        .byte   2                       # Abbrev [2] 0x31:0x7 DW_TAG_base_type                                                                                                                                     
        .long   .Linfo_string4          # DW_AT_name                                                                                                                                                               
        .byte   5                       # DW_AT_encoding                                                                                                                                                           
        .byte   4                       # DW_AT_byte_size                                                                                                                                                          
        .byte   3                       # Abbrev [3] 0x38:0x38 DW_TAG_subprogram

that are referenced from the debug expression as in

.byte   168                     # DW_OP_convert
.uleb128 .Ltype_tbl0-.Lcu_begin0

As I see it the most significant benefit of this approach is that it would open up for using other typed DW5 ops as well.

Feedback please :)

Generally, I much prefer using DW_OP_convert because it's more space efficient and semantically unambiguous.

I have a couple of unsorted ideas:

call void @llvm.dbg.value(metadata i8 %x, metadata !15, metadata !DIExpression(DW_OP_convert, 0, DW_OP_convert, 1, DW_OP_stack_value)),

the type reference should be a metadata operand. Otherwise you'll need to implement support in llvm-link etc, too.
I would just implement the type reference as normal metadata operands in DIExpression and bake the knowledge that DW_OP_convert consumes one of the metadata operands into the DIExpression operand iterator.

MDNodes are uniqued, so just creating a new DIBasicType using a throwaway DIBuilder is cheap. That said, we need to find the types references by DIExpressions so they can be emitted in the .debug_info section.

On the other extreme, the only thing we really need from the basic types used by DW_OP_convert is the size and signedness. We could just encode that directly in the expression as DW_OP_LLVM_convert, 1, 32 for a signed int32_t or something like that. That doesn't solve the problem for how to determine which types we need to emit into the debug info, but it would be a very straightforward self-contained encoding up until AsmPrinter.

markus added a comment.Feb 4 2019, 5:44 AM

the type reference should be a metadata operand. Otherwise you'll need to implement support in llvm-link etc, too.
I would just implement the type reference as normal metadata operands in DIExpression and bake the knowledge that DW_OP_convert consumes one of the metadata operands into the DIExpression operand iterator.

That makes sense. I am working on that right now.

On the other extreme, the only thing we really need from the basic types used by DW_OP_convert is the size and signedness. We could just encode that directly in the expression as DW_OP_LLVM_convert, 1, 32 for a signed int32_t or something like that. That doesn't solve the problem for how to determine which types we need to emit into the debug info, but it would be a very straightforward self-contained encoding up until AsmPrinter.

I'd much prefer a generic solution that makes it easy (at least wrt this) to use other typed DW5 ops as well down the road.

the type reference should be a metadata operand. Otherwise you'll need to implement support in llvm-link etc, too.
I would just implement the type reference as normal metadata operands in DIExpression and bake the knowledge that DW_OP_convert consumes one of the metadata operands into the DIExpression operand iterator.

That makes sense. I am working on that right now.

If you choose this path, one way to solve the issue of what to do with the dangling types that are produced by the conversion operations would be to create them in a separate DICompileUnit with a compiler-generated name and location. This way the types should get uniqued automatically during LTO, and you don't need to worry about llvm-link & friends.

On the other extreme, the only thing we really need from the basic types used by DW_OP_convert is the size and signedness. We could just encode that directly in the expression as DW_OP_LLVM_convert, 1, 32 for a signed int32_t or something like that. That doesn't solve the problem for how to determine which types we need to emit into the debug info, but it would be a very straightforward self-contained encoding up until AsmPrinter.

I'd much prefer a generic solution that makes it easy (at least wrt this) to use other typed DW5 ops as well down the road.

What other DWARF5 operations do you have in mind that would need pointers into the debug type hierarchy?

markus added a comment.Feb 5 2019, 6:54 AM

If you choose this path, one way to solve the issue of what to do with the dangling types that are produced by the conversion operations would be to create them in a separate DICompileUnit with a compiler-generated name and location. This way the types should get uniqued automatically during LTO, and you don't need to worry about llvm-link & friends.

Not sure that I understand this comment right now. Perhaps I will run into the issue you describe later on but for now my mind is not there just yet :)

What other DWARF5 operations do you have in mind that would need pointers into the debug type hierarchy?

For example I imagine that DW_OP_regval_type, DW_OP_deref_type and DW_OP_const_type could come in handy to set the type of the expression stack. The "each element
of the stack is the size of an address on the target machine" of Dwarf4 is a rather annoying limitation for our target.

markus updated this revision to Diff 185303.Feb 5 2019, 7:17 AM
Dwarf5 DW_OP_convert support.

- Extend DIExpression to accept MDNode (really DIBasicType) operands.
Since we are keeping references to the MDNodes outside the "Metadata
machinery" we need to be careful and use TrackingMDNodeRef to support
any RAUW operations that might happen during e.g. IR import of a
forward-reference.

- Modify replaceAllDbgUsesWith to generate dwarf::DW_OP_convert
operations. These ops need to reference a DIBasicType and hence uses the
support from the previous bullet point.

- Update AsmPrinter to allow DIExpressions (emitted in .debug_loc) to
reference DIBasicTypes (emitted in .debug_info). Various MCSymbols are
emitted to realize the references.

Not at all ready but as it looks right now would these changes be considered to be heading in the right direction?

markus marked an inline comment as done.Feb 5 2019, 7:23 AM
markus added inline comments.
lib/Transforms/Utils/Local.cpp
1862 ↗(On Diff #185303)

Why is this the case? I was expecting that since I use TrackingMDRef the nodes would not be considered to have zero uses and as a result be deleted until the trackers let go.

bjope added inline comments.Feb 5 2019, 8:27 AM
lib/Transforms/Utils/Local.cpp
1867 ↗(On Diff #185303)

I still have trouble to understand how/if helps the debugger (or llc) to know how many bits to dereference. I assume that llc needs to prepend a DW_OP_deref_type in case of a memory location, or DW_OP_regval_type in case of a register location to get the correct type on the expression stack for the original value. And then we only need one DW_OP_convert to convert to the final type.
In case we think that it is ok to dereference more bits than ToBits (i.e. the smaller size), then we need to adjust the address to take care of endianess somewhere.
If we go down this road, then I think we need some hack to get DW_OP_deref_type/DW_OP_regval_type in place first. Or what do you think?

If you choose this path, one way to solve the issue of what to do with the dangling types that are produced by the conversion operations would be to create them in a separate DICompileUnit with a compiler-generated name and location. This way the types should get uniqued automatically during LTO, and you don't need to worry about llvm-link & friends.

Not sure that I understand this comment right now. Perhaps I will run into the issue you describe later on but for now my mind is not there just yet :)

My mistake, I mistakenly thought that DIBasicTypes would be owned by a DICompileUnit, but that is not the case, they are freestanding.

What other DWARF5 operations do you have in mind that would need pointers into the debug type hierarchy?

For example I imagine that DW_OP_regval_type, DW_OP_deref_type and DW_OP_const_type could come in handy to set the type of the expression stack. The "each element
of the stack is the size of an address on the target machine" of Dwarf4 is a rather annoying limitation for our target.

All three of these operations expect a DW_TAG_basic_type as argument. So we could easily introduce a DW_OP_LLVM_basic_type <signedness> <bits> operation that gets expanded in AsmPrinter into DW_TAG_basic_types like you do now and thus avoid having to deal with TrackingMDRefs. This would also reduce the memory footprint of all DIExpressions that don't need a type argument.

markus marked an inline comment as done.Feb 6 2019, 1:06 AM
markus added inline comments.
lib/Transforms/Utils/Local.cpp
1867 ↗(On Diff #185303)

I still have trouble to understand how/if helps the debugger (or llc) to know how many bits to dereference.

Ok, lets try to clarify that then so that we all can agree here.

I assume that llc needs to prepend a DW_OP_deref_type in case of a memory location, or DW_OP_regval_type in case of a register location to get the correct type on the expression stack for the original value. And then we only need one DW_OP_convert to convert to the final type.

Yes, down the road that would be ideal as I see it. In the meantime we could do with using two DW_OP_convert ops in sequence as in this patch.

In case we think that it is ok to dereference more bits than ToBits (i.e. the smaller size), then we need to adjust the address to take care of endianess somewhere.

I think that we immediately need to start using DW_OP_deref_size instead of DW_OP_deref to cover the endianess effects, but this is really a separate bug/issue. Modifying the address seems a less desirable way to achieve this.

If we go down this road, then I think we need some hack to get DW_OP_deref_type/DW_OP_regval_type in place first. Or what do you think?

I think that we can start using two DW_OP_convert in sequence and then treat the DW_OP_deref_size as a separate issue and finally DW_OP_deref_type and DW_OP_regval_type as long term goals.

Makes sense?

markus updated this revision to Diff 185542.Feb 6 2019, 6:42 AM

Dropping all the metadata stuff and specifying the bit size and type encoding directly in the expression vector. e.g.

!DIExpression(DW_OP_convert, 8, 5, DW_OP_convert, 32, 5, DW_OP_stack_value)

This approach certainly has its advantages and require far less changes to existing code.

markus added a comment.Feb 6 2019, 6:45 AM

All three of these operations expect a DW_TAG_basic_type as argument. So we could easily introduce a DW_OP_LLVM_basic_type <signedness> <bits> operation that gets expanded in AsmPrinter into DW_TAG_basic_types like you do now and thus avoid having to deal with TrackingMDRefs. This would also reduce the memory footprint of all DIExpressions that don't need a type argument.

Yep, lets do that but I actually think we can do without a DW_OP_LLVM_basic_type and encode it directly into the expression vector as I have done in the new patch.

aprantl added a comment.EditedFeb 6 2019, 9:14 AM

Thanks, I appreciate your perseverance even though we make you go through so many revisions :-)

Since we are not using the DWARF semantics by having an extra bitsize and encoding parameter, I would recommend to use a new enumerator DW_OP_LLVM_convert to avoid confusion. This is also why we use DW_OP_LLVM_fragment instead of DW_OP_bit_piece. We should then also document the new operator in the LangRef.rst or SourceLevelDebugging.rst (wherever DW_OP_LLVM_fragment is documented). Bonus points (but really not required!) for pretty-printing the encoding as DW_ATE_encoding enumerator in LLVM assembly.

markus updated this revision to Diff 185753.Feb 7 2019, 6:02 AM

Updated to DW_OP_LLVM_convert as suggested, did the pretty printing and updated a bunch of llvm-lit tests to cope with the newly generated labels. Now the IR format looks like this:

!DIExpression(DW_OP_LLVM_convert, 16, DW_ATE_signed, DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_stack_value)

There are some open issues that I would like feedback on. I will add inline comments at those locations.

markus marked 4 inline comments as done.Feb 7 2019, 6:30 AM
markus added inline comments.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
946 ↗(On Diff #185753)

If there were multiple Dwarf CUs in the same LLVM Module this would not work right. We need to emit base types for each DwarfCompileUnit but only those types that are used by DwarfExpressions in that unit. So appears to make sense to put the MarkusNodes inside the CU.

2248 ↗(On Diff #185753)

For DWO how do we find the label into the corresponding .debug_info and how do we emit the base types? Would the code in DwarfDebug.cpp:946 work? I guess that I need to create some test cases for DWO.

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
31 ↗(On Diff #185753)

I think that we need one of these per DwarfCompileUnit.

400 ↗(On Diff #185753)

A DwarfExpression should always know which CU it belongs to right?

aprantl added inline comments.Feb 7 2019, 1:37 PM
docs/LangRef.rst
4670 ↗(On Diff #185753)

Thanks, this looks good!

lib/AsmParser/LLParser.cpp
4837 ↗(On Diff #185753)

Can you also add a round-trip test to test/Assembler/diexpression.ll?

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
946 ↗(On Diff #185753)

Would creating a separate CU just for our basic types help in any way?

bjope added inline comments.Feb 7 2019, 2:52 PM
lib/Transforms/Utils/Local.cpp
1867 ↗(On Diff #185303)

I think that we can start using two DW_OP_convert in sequence and then treat the DW_OP_deref_size as a separate issue and finally DW_OP_deref_type and DW_OP_regval_type as long term goals.

Makes sense?

Makes a little sense at least. I guess it all depends on if this is supposed to be "complete" or just a partial solution. If we go for the latter then you need to update the description (and probably also add some code comments here mentioning that this still gives faulty result in certain situations). Extra credits for adding test cases that show that we still do wrong in some situations.

At least it seems like we agree that for a "complete" solution we need to express how many bits that should be dereferenced instead of the first DW_OP_convert. With this patch we still present wrong values in the debugger sometimes (at least for big endian platforms, right?).

aprantl added inline comments.Feb 7 2019, 3:21 PM
include/llvm/CodeGen/DIE.h
685 ↗(On Diff #185753)

It seems excessive to burden every DIE object with an extra 8 bytes for something we'll only need on a handful of basic type DIEs. Would it be possible to store this in a DenseMap on the side instead?

dstenb added a subscriber: dstenb.Feb 8 2019, 6:56 AM

How should DW_OP_convert be handled when targeting DWARF versions earlier than 5? There is the GNU extension DW_OP_GNU_convert, which GDB seems to have had support for since 2011. The operation seems to be the identical to the final version that got into DWARFv5, so LLDB should be able to handle the two variants transparently. Can we emit that GNU extension (under some limitations)?

How should DW_OP_convert be handled when targeting DWARF versions earlier than 5? There is the GNU extension DW_OP_GNU_convert, which GDB seems to have had support for since 2011. The operation seems to be the identical to the final version that got into DWARFv5, so LLDB should be able to handle the two variants transparently. Can we emit that GNU extension (under some limitations)?

That still leaves the question of what to do when we're not going to emit a convert operator. I don't object to DW_OP_GNU_convert in principle, but I do object to requiring all debuggers to support it.

How should DW_OP_convert be handled when targeting DWARF versions earlier than 5? There is the GNU extension DW_OP_GNU_convert, which GDB seems to have had support for since 2011. The operation seems to be the identical to the final version that got into DWARFv5, so LLDB should be able to handle the two variants transparently. Can we emit that GNU extension (under some limitations)?

That seems fine for debugger-tuning=gdb (and eventually lldb once we implemented support for it). Is emitting the byzantine shift/mask expression that this review started with in all other cases an option for a reasonably large subset of the uses?

How should DW_OP_convert be handled when targeting DWARF versions earlier than 5? There is the GNU extension DW_OP_GNU_convert, which GDB seems to have had support for since 2011. The operation seems to be the identical to the final version that got into DWARFv5, so LLDB should be able to handle the two variants transparently. Can we emit that GNU extension (under some limitations)?

That seems fine for debugger-tuning=gdb (and eventually lldb once we implemented support for it). Is emitting the byzantine shift/mask expression that this review started with in all other cases an option for a reasonably large subset of the uses?

If we have something nice for GDB and LLDB, and Paul's OK with one of those for Sony's stuff - then who would we be maintaining this extra code for? I'd be OK saying that backwards/sidewards/unspecified compatibility might not be worth it? If someone comes with a need, it could be resurrected/implemented - and until then, we could emit no location at all, potentially.

markus updated this revision to Diff 186209.Feb 11 2019, 3:33 AM
markus retitled this revision from Fix sign/zero extension in Dwarf expressions. to Introduce DW_OP_LLVM_convert.
markus edited the summary of this revision. (Show Details)

Addresses most of the open issues I had. The major question that remains is what to do when not targeting Dwarf v5.

markus marked an inline comment as done.Feb 11 2019, 3:42 AM
markus added inline comments.
include/llvm/CodeGen/DIE.h
685 ↗(On Diff #185753)

I agree and it does seem like the most logical place for such a DenseMap would be inside the CU. It is not clear however how to make that available to AsmPrinter::emitDwarfDIE without violating the current interfaces.

How should DW_OP_convert be handled when targeting DWARF versions earlier than 5? There is the GNU extension DW_OP_GNU_convert, which GDB seems to have had support for since 2011. The operation seems to be the identical to the final version that got into DWARFv5, so LLDB should be able to handle the two variants transparently. Can we emit that GNU extension (under some limitations)?

That seems fine for debugger-tuning=gdb (and eventually lldb once we implemented support for it). Is emitting the byzantine shift/mask expression that this review started with in all other cases an option for a reasonably large subset of the uses?

If we have something nice for GDB and LLDB, and Paul's OK with one of those for Sony's stuff - then who would we be maintaining this extra code for? I'd be OK saying that backwards/sidewards/unspecified compatibility might not be worth it? If someone comes with a need, it could be resurrected/implemented - and until then, we could emit no location at all, potentially.

IIUC what we have are:

  • v5: Everybody is okay with DW_OP_convert.
  • v4 GDB: clearly DW_OP_GNU_convert is the way to go.
  • v4 Sony: do the complicated thing.
  • v4 LLDB: either teach it about DW_OP_GNU_convert or do the complicated thing like Sony.

I didn't see any not-the-complicated-thing proposal for cases that don't/can't/won't know about DW_OP_GNU_convert?

How should DW_OP_convert be handled when targeting DWARF versions earlier than 5? There is the GNU extension DW_OP_GNU_convert, which GDB seems to have had support for since 2011. The operation seems to be the identical to the final version that got into DWARFv5, so LLDB should be able to handle the two variants transparently. Can we emit that GNU extension (under some limitations)?

That seems fine for debugger-tuning=gdb (and eventually lldb once we implemented support for it). Is emitting the byzantine shift/mask expression that this review started with in all other cases an option for a reasonably large subset of the uses?

If we have something nice for GDB and LLDB, and Paul's OK with one of those for Sony's stuff - then who would we be maintaining this extra code for? I'd be OK saying that backwards/sidewards/unspecified compatibility might not be worth it? If someone comes with a need, it could be resurrected/implemented - and until then, we could emit no location at all, potentially.

IIUC what we have are:

  • v5: Everybody is okay with DW_OP_convert.
  • v4 GDB: clearly DW_OP_GNU_convert is the way to go.
  • v4 Sony: do the complicated thing.
  • v4 LLDB: either teach it about DW_OP_GNU_convert or do the complicated thing like Sony.

I didn't see any not-the-complicated-thing proposal for cases that don't/can't/won't know about DW_OP_GNU_convert?

I don't think anyone mentioned it, that's why I was bringing it up - there's always an option to not render any location if it's not possible/worth the work. That's all I was asking - is it worth the complexity? (I wasn't sure anyone needed it - but sounds like Sony does, reckon it's worth the tradeoff in complexity in LLVM compared to the work required to support this in the Sony debugger?)

I don't think anyone mentioned it, that's why I was bringing it up - there's always an option to not render any location if it's not possible/worth the work. That's all I was asking - is it worth the complexity? (I wasn't sure anyone needed it - but sounds like Sony does, reckon it's worth the tradeoff in complexity in LLVM compared to the work required to support this in the Sony debugger?)

NVPTX also would need it, because they are stuck on DWARF v2.

I don't think anyone mentioned it, that's why I was bringing it up - there's always an option to not render any location if it's not possible/worth the work. That's all I was asking - is it worth the complexity? (I wasn't sure anyone needed it - but sounds like Sony does, reckon it's worth the tradeoff in complexity in LLVM compared to the work required to support this in the Sony debugger?)

NVPTX also would need it, because they are stuck on DWARF v2.

Any ideas if NVPTX hit this case? my understanding was that NVPTX has a fairly restrictive set of code or actions that can be used.

markus updated this revision to Diff 186431.Feb 12 2019, 2:47 AM

Removed Label from DIE (added DenseMap to CU and pass it around). Rebased to top of trunk.

aprantl added inline comments.Feb 12 2019, 9:21 AM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1937 ↗(On Diff #186431)

This is to inject the reference to the basic type die, right?

This code should probably be factored out into a relocate/fixupTypeRefs() helper function. I also assume that you need to apply the same fixup for the case of a single, non-debug_lo, inline DW_AT_location, right?

The fact that the placeholder is encoded as a LEB128 sounds really dangerous. If we ever support any branching operations, it will mess with the offsets. Can we assume that the finalized DIE ref will always be a DW_OP_ref_addr or something with a fixed size? Could we make the placeholder the same fixed size, too?

If that doesn't work, the right solution is probably to defer the emission of DwarfExpressions until here, which we could do in a separate, preparatory commit.

markus marked an inline comment as done.Feb 12 2019, 11:49 AM
markus added inline comments.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1937 ↗(On Diff #186431)

This is to inject the reference to the basic type die, right?

Yes

This code should probably be factored out into a relocate/fixupTypeRefs() helper function. I also assume that you need to apply the same fixup for the case of a single, non-debug_lo, inline DW_AT_location, right?

Sounds reasonable. Not sure I could easily find where in the code the inline expressions are inserted though. If you could point at a file and line number that would be helpful. I guess another option would be to force these expressions (the ones containing a base type reference) to always end up in .debug_loc right?

The fact that the placeholder is encoded as a LEB128 sounds really dangerous. If we ever support any branching operations, it will mess with the offsets. Can we assume that the finalized DIE ref will always be a DW_OP_ref_addr or something with a fixed size? Could we make the placeholder the same fixed size, too?

The spec states that the finalized base type DIE offset is encoded as a ULEB128 so not much choice about that but the value we pick up here (the one inserted in DwarfExpression::addExpression) is just a index so we could certainly encode that in a fixed size integer. If branches were to be introduced at a later point I imagine that the branch target in the emitted dwarf would be a label (MCSymbol) but in the intermediate expression vector a simple offset would probably suffice.

If that doesn't work, the right solution is probably to defer the emission of DwarfExpressions until here, which we could do in a separate, preparatory commit.

I think that would be a good thing to do but unfortunately it seems far from easy to get rid of the stuff that is in between.

aprantl added inline comments.Feb 12 2019, 12:46 PM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1937 ↗(On Diff #186431)

If you could point at a file and line number that would be helpful.

git grep addBlock.*DW_AT_location lib/CodeGen/AsmPrinter
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:    addBlock(*VariableDIE, dwarf::DW_AT_location, DwarfExpr->finalize());
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:        addBlock(*VariableDie, dwarf::DW_AT_location, DwarfExpr.finalize());
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:  addBlock(*VariableDie, dwarf::DW_AT_location, DwarfExpr.finalize());
lib/CodeGen/AsmPrinter/DwarfUnit.cpp:        addBlock(ParamDIE, dwarf::DW_AT_location, Loc);

The spec states that the finalized base type DIE offset is encoded as a ULEB128 so not much choice about that but the value we pick up here (the one inserted in DwarfExpression::addExpression) is just a index so we could certainly encode that in a fixed size integer.

I see. I didn't think about emitting branch targets a label differences, I thought we'd just hardcode the offsets. I guess we can defer this until it becomes an issue. Encoding the temporary die reference with a fixed size would probably still be a good idea, just to keep this code simpler.

probinson added inline comments.Feb 12 2019, 1:02 PM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1937 ↗(On Diff #186431)

Re the ULEB as how to find the base_type DIEs, the unstated assumption is that the base_type DIEs would be emitted unconditionally at the top of the CU, so everyone can just use them as needed. If you want to emit base_types lazily... don't do that.
Re branches, there are already branch operators, if that's what you're talking about (DW_OP_skip and _bra).

dblaikie added inline comments.Feb 12 2019, 1:36 PM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1937 ↗(On Diff #186431)

I think maybe you don't need to make that assumption - you can produce a label difference as a uleb (though that may get complicated).

Yeah, we use it in debug_rnglists, for instance:

.uleb128 .Lfunc_end0-.Lfunc_begin0 #   length

& even if I change that to be a label difference between labels that bound the uleb itself (ie: where the difference would vary depending on the number of bytes required for the uleb) clang still assembles it at least... :)

markus marked 2 inline comments as done.Feb 13 2019, 12:29 AM
markus added inline comments.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1937 ↗(On Diff #186431)

Re the ULEB as how to find the base_type DIEs, the unstated assumption is that the base_type DIEs would be emitted unconditionally at the top of the CU, so everyone can just use them as needed. If you want to emit base_types lazily... don't do that.

I played with emitting the base_type DIEs directly after the CU header but since the size of that header varies and due to the phase ordering of how the debug info is emitted I still need label differences to be able to locate the base_type DIEs in a robust manner. Right now I would not say that they are emitted lazily but rather we find out which ones we need and then emit them in table form. Still need the labels though.

Re branches, there are already branch operators, if that's what you're talking about (DW_OP_skip and _bra).

Where do you see these branch operators being used? I can't find them.

1937 ↗(On Diff #186431)

I think maybe you don't need to make that assumption - you can produce a label difference as a uleb (though that may get complicated).

Yes, that is what I currently do. It looks like this

.byte   168                     # DW_OP_convert
.uleb128 .Lbase_type0-.Lcu_begin0

& even if I change that to be a label difference between labels that bound the uleb itself (ie: where the difference would vary depending on the number of bytes required for the uleb) clang still assembles it at least... :)

Yep, I thought about that too when I realized that I need to add some prototype support for our downstream assembler. Came to the conclusion that I could treat the ULEB128s as fixed size by sign/zero extending them, so that should simplify things a lot even though it is not space efficient... Either way it is not a problem for this review how the assembler solves it :)

+ @ABataev re the question whether NVPTX runs into the situation described in this review.

The Sony debugger guys are okay with using the GCC operator in a pre-v5 expression. So, tentatively, for all debugger tunings, we can emit that instead of the more complicated expression. That way we are emitting compliant expressions, and the info doesn't just disappear sometimes (a much worse outcome IMO). The only remaining question is my hypothetical about NVPTX.

Re branch operators, I thought Adrian was throwing that out there as a general concern; yes branch operators exist, and yes we don't use them currently. As David says, the assembler knows how to convert a label difference into a ULEB and it will all Just Work. If/when we ever need it to.

I see now that the inlined places i.e. these:

git grep addBlock.*DW_AT_location lib/CodeGen/AsmPrinter
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:    addBlock(*VariableDIE, dwarf::DW_AT_location, DwarfExpr->finalize());
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:        addBlock(*VariableDie, dwarf::DW_AT_location, DwarfExpr.finalize());
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:  addBlock(*VariableDie, dwarf::DW_AT_location, DwarfExpr.finalize());
lib/CodeGen/AsmPrinter/DwarfUnit.cpp:        addBlock(ParamDIE, dwarf::DW_AT_location, Loc);

work quite differently and that will need some additional work.

The good news is that here it seems we might just have the infrastructure (DIEValue etc ) to realize the base_type reference in a much cleaner way.

+ @ABataev re the question whether NVPTX runs into the situation described in this review.

The Sony debugger guys are okay with using the GCC operator in a pre-v5 expression. So, tentatively, for all debugger tunings, we can emit that instead of the more complicated expression. That way we are emitting compliant expressions, and the info doesn't just disappear sometimes (a much worse outcome IMO). The only remaining question is my hypothetical about NVPTX.

Re branch operators, I thought Adrian was throwing that out there as a general concern; yes branch operators exist, and yes we don't use them currently. As David says, the assembler knows how to convert a label difference into a ULEB and it will all Just Work. If/when we ever need it to.

NVPTX supports only DWARF2 and does not know anything about DWARF5 operations. Also, it does not support any type of the expression in the DWARF sections, except for <section_name>+-<int_offset>.

NVPTX supports only DWARF2 and does not know anything about DWARF5 operations. Also, it does not support any type of the expression in the DWARF sections, except for <section_name>+-<int_offset>.

Isn't the latter a rather restrictive limitation that should be addressed in the NVPTX assembler?

NVPTX supports only DWARF2 and does not know anything about DWARF5 operations. Also, it does not support any type of the expression in the DWARF sections, except for <section_name>+-<int_offset>.

Isn't the latter a rather restrictive limitation that should be addressed in the NVPTX assembler?

I agree, but this the limitation we have to live with, unfortunately. Either we support that limitation to have the debug info for NVPTX, or we don't have debug info for NVPTX. I cannot make NVidia to update their ptxas tool to support complex expressions in DWARF sections.

markus updated this revision to Diff 187015.Feb 15 2019, 7:36 AM

Addressed the inlined location expressions.

Got rid of all the label differences introduced in previous patches and instead use padded ULEB128s of a fixed size (4 bytes).

Using fixed size allows us to leverage the existing framework much better and avoid spraying symbols all over to place just to be able to cope with the variable sized .uleb128 directives. One drawback is that debug output is perhaps slightly less space efficient but that is only a matter of a few bytes each time a base_type is referenced from a DW_OP_convert.

Comments please (I know that I need to add lit tests and clean up comments).

markus updated this revision to Diff 187203.Feb 18 2019, 2:34 AM
markus edited the summary of this revision. (Show Details)
  • Did some cleanup
  • Added tests
  • Emit DW_OP_convert for Dwarf5
  • Emit legacy shift and mask expression for Dwarf4 and lower
  • Added llc option -generate-typed-dwarf5-expr to force emission regardless of targeted Dwarf version
aprantl added inline comments.Feb 18 2019, 10:12 AM
lib/CodeGen/AsmPrinter/DIE.cpp
512 ↗(On Diff #187203)

under the current style you may drop the duplicate doxygen comments con the function implementations.

516 ↗(On Diff #187203)

what happens when this assertion isn't met in a release compiler? Is that a purely hypothetical scenario?

lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
1198 ↗(On Diff #187203)

for (auto &I : reverse(ExprRefedBaseTypes))

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
403 ↗(On Diff #187203)

Why not use a SmallDenseSet for CU.ExprRefedBaseTypes?

test/DebugInfo/Generic/convert-inlined.ll
32 ↗(On Diff #187203)

Sorry about the extra work this causes, but it would really pay off to also have a separate peraratory commit that adds dwarfdump support for DW_OP_convert so this is printed as DW_OP_convert (0x000022 "DW_ATE_signed_32") as well as dwarfdump --verify support that ensures that the offset actually points to a type DIE.

markus marked 3 inline comments as done.Feb 19 2019, 1:34 AM
markus added inline comments.
lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

what happens when this assertion isn't met in a release compiler?

The debug info would be corrupted.

Is that a purely hypothetical scenario?

With ULEB128PadSize = 4 we are fine as long as the types are placed within 256MB from the start of the .debug_info section. Since we take care to insert the types immediately after the CU this will not be an issue for the case of a single CU. However by using llvm-link it is possible to put several CUs in the same module and that could push us closer to the 256MB limit.

If we set ULEB128PadSize = 5 then the limit becomes 32GB and that should put us on the safe side for a considerable future (keeping in mind that this limit is for object files and not the final linked executable).

(One could argue that symbols and label differences, effectively pushing the problem to the assembler, are the right way to go here and I would tend to agree with that but unfortunately that causes havoc in the existing DIE framework as it relies on being able to pre-compute sizes.)

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
403 ↗(On Diff #187203)

I don't think that would work as we rely on being able to index into it.

test/DebugInfo/Generic/convert-inlined.ll
32 ↗(On Diff #187203)

I could look into that but before I spend too much time on dwarfdump I would like to get confirmation that would be the last remaining issue.

markus marked an inline comment as done.Feb 19 2019, 2:28 AM
markus added inline comments.
lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

However by using llvm-link it is possible to put several CUs in the same module and that could push us closer to the 256MB limit.

Actually forget that part. I see now that is irrelevant since the offset is within CU and not from start of .debug_info. I got confused looking at the output of readelf as that tool prints .debug_info address even though it is encoded as a CU offset.

So given this I think the 256MB limit is perfectly reasonable.

aprantl added inline comments.
lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

(One could argue that symbols and label differences, effectively pushing the problem to the assembler, are the right way to go here and I would tend to agree with that but unfortunately that causes havoc in the existing DIE framework as it relies on being able to pre-compute sizes.)

@dblaikie has dealt with similar problems in the past while refactoring AsmPrinter to support DWOs, perhaps he has some ideas?

So given this I think the 256MB limit is perfectly reasonable.

A good measure to decide this is to look at the size of an LTO build of Clang with all targets, which is usually a good proxy for what we can expect from large programs.

aprantl added inline comments.Feb 19 2019, 8:51 AM
lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

When you say

causes havoc in the existing DIE framework as it relies on being able to pre-compute sizes

is the problem that we don't know the offsets of the location list entries ahead of time since they depend on the position of the referenced type DIEs and thus we don't know the size of the DW_AT_location attributes, or is the problem only within the location list section?

markus marked 2 inline comments as done.Feb 19 2019, 10:24 AM
markus added inline comments.
lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

(One could argue that symbols and label differences, effectively pushing the problem to the assembler, are the right way to go here and I would tend to agree with that but unfortunately that causes havoc in the existing DIE framework as it relies on being able to pre-compute sizes.)

@dblaikie has dealt with similar problems in the past while refactoring AsmPrinter to support DWOs, perhaps he has some ideas?

While I do think that solving this with label differences would be the right thing to do in general I do not think it is the right thing to do here as the changes become much larger than they need to be.

So given this I think the 256MB limit is perfectly reasonable.

A good measure to decide this is to look at the size of an LTO build of Clang with all targets, which is usually a good proxy for what we can expect from large programs.

Yes, to clarify this. Since the base types that we generate are inserted immediately after the CU DIE we would need to insert a huge amount of these (>16 million I believe) to hit the 256MB limit. This should be quite impossible since we don't create entries for duplicate types and there simply aren't that many unique ones :)

516 ↗(On Diff #187203)

is the problem that we don't know the offsets of the location list entries ahead of time since they depend on the position of the referenced type DIEs and thus we don't know the size of the DW_AT_location attributes, or is the problem only within the location list section?

One of the problems is that as soon as we emit a .uleb128 directive of a label difference we do not know that size of that and hence cannot compute the size of the block it is in without emitting more labels (begin and end for that block) and then do a label difference of that too. This is not how the DIE handling is currently designed (IIUC) as it relies on being able to compute the size of each DIE. This is especially a problem for the inlined DW_AT_location attributes.

This is why having padded / fixed size ULEB128s, as we have in the current patch, makes things much easier.

aprantl added inline comments.Feb 19 2019, 10:34 AM
lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

So we only insert the special type DIEs into the very first DW_TAG_compile_unit? Then this is fine. It wasn't clear to me whether in the LTO case we'd inject the types into every CU. There's no need to do that, so if we don't then.we're good.

I'm a touch confused by the whole discussion here - so I'll write some things & perhaps folks can correct my understanding where necessary.

The issue is that a location expression (either in a direct location, or in a loclist) needs to reference a DIE.

Sounds like that DIE reference is necessarily CU-local (because we're talking about precomputing the offset - and we could only do that if it's CU-local).

We already emit other CU-local DIE references in attributes (eg: DW_AT_specification, etc) references with hardcoded 4 bytes in size- so why would it be problematic to emit this one in the same way (with a padded ULEB128 that we know will give us 4 bytes of offset to work with)?

& yeah, maybe if someone wants to save us some size (at the cost of whatever computational complexity this invokes in the assembler) it could be replaced with label differences (doing label differences for all DIE references would be nifty for manual editing/mucking about with DWARF, but not a big deal).

markus marked an inline comment as done.Feb 20 2019, 6:06 AM

I'm a touch confused by the whole discussion here - so I'll write some things & perhaps folks can correct my understanding where necessary.

I think that we all are. Maybe it is because we tried some many different approaches in the same review that it is a bit convoluted now.

The issue is that a location expression (either in a direct location, or in a loclist) needs to reference a DIE.

Correct.

Sounds like that DIE reference is necessarily CU-local (because we're talking about precomputing the offset - and we could only do that if it's CU-local).

Also correct. The spec says that the reference is an offset into the current CU.

We already emit other CU-local DIE references in attributes (eg: DW_AT_specification, etc) references with hardcoded 4 bytes in size- so why would it be problematic to emit this one in the same way (with a padded ULEB128 that we know will give us 4 bytes of offset to work with)?

Agree. It should be no more problematic than what is done in other places already. The 4 byte ULEB128 gives us 28 bits to encode the offset in but as reasoned in other comments there is no way that limit can be reached.

& yeah, maybe if someone wants to save us some size (at the cost of whatever computational complexity this invokes in the assembler) it could be replaced with label differences (doing label differences for all DIE references would be nifty for manual editing/mucking about with DWARF, but not a big deal).

Agree.

lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

So we only insert the special type DIEs into the very first DW_TAG_compile_unit? Then this is fine. It wasn't clear to me whether in the LTO case we'd inject the types into every CU. There's no need to do that, so if we don't then.we're good.

The special type DIEs are local to the CU that use them so they will not necessarily only be inserted into the first one but rather into the ones where they are used. This is not a problem though since according to the spec 'the operand is an unsigned LEB128 number that represents the offset of a debugging information entry in the current compilation unit' i.e. the offset is relative to the current CU so there is no chance that it will exceed the 256MB limit since they are inserted immediately after the DW_TAG_compile_unit DIE.

I have created a review for the requested 'llvm-dwarfdump' updates (the printing part) in https://reviews.llvm.org/D58442

markus updated this revision to Diff 187577.Feb 20 2019, 7:05 AM

+ Addressed review remark (use range-based loop instead of iterators).
+ Updated tests (probably forgot during last rebase. Verified the values used now with a clean master)

aprantl added inline comments.Feb 20 2019, 8:16 AM
lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

If they are CU-relative I agree that should be perfectly safe, thanks.

What happens if I llvm-link two CUs that both contain the same DIExpression. Is it possible for two location list entries to be uniques across compile units? I think the answer is no, right?

markus marked an inline comment as done.Feb 20 2019, 11:21 AM
markus added inline comments.
lib/CodeGen/AsmPrinter/DIE.cpp
516 ↗(On Diff #187203)

What happens if I llvm-link two CUs that both contain the same DIExpression. Is it possible for two location list entries to be uniques across compile units? I think the answer is no, right?

Correct. The data structure is per CU so nothing happens across such units.

markus updated this revision to Diff 187774.Feb 21 2019, 5:50 AM
  • updated to top of trunk to have the llvm-dwarfdump updates committed earlier today
  • updated tests and made additional improvements to llvm-dwarfdump
  • fixed segfault bug where we crashed on release builds but not debug builds (the allocation of DIEBaseTypeRef)

Are we good to land this now or what is remaining?

There's a nonzero chance that this patch will break llvm/tools/dsymutil. I'll see if I can come up with a testcase for that (hopefully later today).

lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
1196 ↗(On Diff #187774)

... so their offsets fit into the 5 bits reserved inside the location expressions.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1938 ↗(On Diff #187774)

You might want to reword this comment to be more assertive :-)

Is my understanding correct that we only need to do this here because the inline DW_AT_location (DW_FORM_block, ..) are emitted ahead of time and thus have the correct offsets injected from the get go?

Could you please move this out into a fixupLocEntryDIEReferences() (or something) function?

1978 ↗(On Diff #187774)

Didn't your dwarfdump patch have this info in an enum or am I confusing things?

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
29 ↗(On Diff #187774)

Why is this needed? Shouldn't we key off the debugger tuning flag?

402 ↗(On Diff #187774)

Can you add a comment that explains what happens in the non-location-list cases?

probinson added inline comments.Feb 21 2019, 11:31 AM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
946 ↗(On Diff #185753)

A separate CU for basic types would not be usable by DW_OP_convert? because it uses CU-relative offsets to find them.

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
29 ↗(On Diff #187774)

The debugger-tuning design is that we use it only in the DwarfDebug ctor to set other control flags, which can all be influenced independently. Tuning and its associated flags are not per-CU, although DWARF version is.

399 ↗(On Diff #187774)

Pre-v5 needs to emit the GNU op, not the standard op.

Could you please add a third testcase with two DICompileUnits, both of which have a DIExpression with a DW_OP_LLVM_convert that will use the same type and then verify that we emit the type in both CUs? This will help guard against later modifications breaking this scenario.

Thanks!

test/DebugInfo/Generic/convert-debugloc.ll
144 ↗(On Diff #187774)

I assume most of these attributes aren't needed?

markus updated this revision to Diff 187926.Feb 22 2019, 4:52 AM
  • Added test with two CUs.
  • Updated some comments.
  • Removed the -generate-typed-dwarf5-expr option, will only produce DW_OP_convert for Dwarf v5, for all prior versions the mask & shift expression is used. I suggest that we forget about DW_OP_GNU_convert and debugger tunings for this review. If that is really desired it can be added later in a separate patch.
markus marked 6 inline comments as done.Feb 22 2019, 5:07 AM
markus added inline comments.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1938 ↗(On Diff #187774)

Is my understanding correct that we only need to do this here because the inline DW_AT_location (DW_FORM_block, ..) are emitted ahead of time and thus have the correct offsets injected from the get go?

Sort of. When these are inserted in DwarfExpression::addExpression (for both location-lists and inlined) the offset of the base_type DIE is not known so we need to insert a placeholder. For the the location-list case the data structure is unfortunately a plain byte stream so we need this elaborate state machine to extract the placeholder here.

Could you please move this out into a fixupLocEntryDIEReferences() (or something) function?

Since this is a state machine and hence keeps state I think that putting it in a separate function would only make it messier.

1978 ↗(On Diff #187774)

DWARFExpression does contain similar information but it is private there. Not sure if refactoring would be worthwhile.

I don't think anyone mentioned it, that's why I was bringing it up - there's always an option to not render any location if it's not possible/worth the work. That's all I was asking - is it worth the complexity? (I wasn't sure anyone needed it - but sounds like Sony does, reckon it's worth the tradeoff in complexity in LLVM compared to the work required to support this in the Sony debugger?)

NVPTX also would need it, because they are stuck on DWARF v2.

Any ideas if NVPTX hit this case? my understanding was that NVPTX has a fairly restrictive set of code or actions that can be used.

Ping on this - still wondering if anyone needs the complicated code or if we could get away with the GNU extension + DWARFv5 standard form.

I don't think anyone mentioned it, that's why I was bringing it up - there's always an option to not render any location if it's not possible/worth the work. That's all I was asking - is it worth the complexity? (I wasn't sure anyone needed it - but sounds like Sony does, reckon it's worth the tradeoff in complexity in LLVM compared to the work required to support this in the Sony debugger?)

NVPTX also would need it, because they are stuck on DWARF v2.

Any ideas if NVPTX hit this case? my understanding was that NVPTX has a fairly restrictive set of code or actions that can be used.

Ping on this - still wondering if anyone needs the complicated code or if we could get away with the GNU extension + DWARFv5 standard

not sure, if NVPTX can hit this. It has no stack at all, so, maybe, this op won't be generated at all

markus updated this revision to Diff 188533.Feb 27 2019, 7:43 AM
  • Rewrote the expression parsing state machine in DwarfDebug::emitDebugLocEntry to use DWARFExpression as this avoids duplication of code describing ops and their arguments (got inspired by the dsymutil review).
aprantl added inline comments.Feb 27 2019, 9:14 AM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1947 ↗(On Diff #188533)

Nice. Should we do the same thing as in dsymutil here and check for Encoding::BaseTypeRef?

markus updated this revision to Diff 188710.Feb 28 2019, 4:43 AM
  • Removed explicit check for DW_OP_convert and replaced with a generic handling of operands that should work with all types (regardless if it is first, second or both that is BaseTypeRef).

We're almost there!

lib/CodeGen/AsmPrinter/DIE.cpp
521 ↗(On Diff #188710)

ditto.. it should be on the declaration in the header file.

512 ↗(On Diff #187203)

please remove this comment from the implementation.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1934 ↗(On Diff #188710)

Remove the first word.

1947 ↗(On Diff #188710)

Could you please still factor this into a static fixupBaseTypeRefs(or something along those lines) function for better readability?

1950 ↗(On Diff #188710)

Please add an assert that fails if the opcode is DW_OP_const_type as it is not supported by this loop.

1952 ↗(On Diff #188710)

Where is this getting copied?

markus marked 2 inline comments as done.Mar 1 2019, 12:48 AM

We're almost there!

Yay :)

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1947 ↗(On Diff #188710)

Sure, but I don't understand where to place the cut to improve readability. I.e. what should go into fixupBaseTypeRefs and what should remain in emitDebugLocEntry?

1952 ↗(On Diff #188710)

It is not getting copied at all, 'Encoding::SizeNA` indicates that the operation does not have an operand in this slot. Or maybe I am not understanding the question?

aprantl accepted this revision.Mar 1 2019, 8:12 AM
aprantl added inline comments.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1952 ↗(On Diff #188710)

My bad, I was thinking about a non-base-type operand, but it is actually handled in the else branch!

1952 ↗(On Diff #188710)

On second thought, I think we can leave it as is, too.

This revision is now accepted and ready to land.Mar 1 2019, 8:12 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2019, 1:48 AM
Herald added a subscriber: ormris. · View Herald Transcript
ormris removed a subscriber: ormris.Mar 19 2019, 8:54 AM