This is an archive of the discontinued LLVM Phabricator instance.

Debug info for variables whose type is shrinked to bool
ClosedPublic

Authored by NikolaPrica on Jul 28 2017, 7:01 AM.

Details

Summary

During the globalopt transformation, when shrinking variable type into boolean, debug information for static global variables is dispatched. As a result there is no DW_AT_location and the debugger is not able to read such variables.

Diff Detail

Event Timeline

NikolaPrica created this revision.Jul 28 2017, 7:01 AM
aprantl added inline comments.Jul 28 2017, 8:30 AM
lib/Transforms/IPO/GlobalOpt.cpp
1604

I'm not familiar with this transformation: Do we need to add a DIExpression to mask out all but the first bit (i.e. can multiple bools be packed into the same uint32_t such that it could confuse debuggers)?

test/Transforms/GlobalOpt/static-global-boolean-dwarf.c
2 ↗(On Diff #108641)

We shouldn't require a frontend to be available when running tests. Can you use thew output of clang -S -emit-llvm instead an run it through opt?

NikolaPrica added inline comments.Aug 1 2017, 12:39 AM
lib/Transforms/IPO/GlobalOpt.cpp
1604

The debug info which is provided here with addDebugInfo later generates address of the variable (DW_OP_addr: xxxx) in DW_AT_location. If we provide here metadata which is for 1byte variable the debugger would get confused because the enum type is written as 4-byte and he would try to read 4 bytes. This is just temporary fix until proper way to handle this is found.

Changing test file from .c to .ll

Sorry for spaming. First time using Phabricator. Uploaded wrong diff.

probinson added inline comments.Aug 1 2017, 7:33 AM
test/Transforms/GlobalOpt/static-global-boolean-dwarf.ll
2 ↗(On Diff #109057)

It would be more usual to pipe the output of opt directly to llvm-dwarfdump, unless opt can't do that.
(It helps the tests run faster to avoid writing files unless you really have to.)

128 ↗(On Diff #109057)

These checks verify that something named "foo" has debug info of some kind, but without verifying any details. I'd rather see verification of the location expression, and possibly the type, given that you are planning to change those in the future to describe the shrunk variable.

aprantl added inline comments.Aug 1 2017, 8:51 AM
lib/Transforms/IPO/GlobalOpt.cpp
1604

If I understood you correctly then the best way to represent this would be a DW_OP_LLVM_fragment /*offset*/0 /*bitsize*/1 (or 8?)
expression. This will get lowered into a DW_OP_bit_piece to tell the debugger that this location is describing of the first n bits of the variable.

NikolaPrica edited the summary of this revision. (Show Details)
NikolaPrica added a reviewer: petarj.

This patch implements new DIExpression :

 val * (ValOther - ValInit) + ValInit:
 DW_OP_deref DW_OP_constu <ValMinus>
DW_OP_mul DW_OP_constu <ValInit> DW_OP_plus DW_OP_stack_value

The other expression which was suggested was :

// val * 42 | ((~val)&1) * 87
DW_OP_dup DW_OP_deref DW_OP_constu 42 DW_OP_mul DW_OP_swap
DW_OP_deref DW_OP_not DW_OP_lit1 DW_OP_and DW_OP_constu 87 DW_OP_mul
DW_OP_or DW_OP_stack_value

I have chosen to implement first because it is simpler.
I have also added simpler test also with concrete DW_AT_location description .

aprantl edited edge metadata.Aug 10 2017, 9:40 AM

This patch implements new DIExpression :

val * (ValOther - ValInit) + ValInit:

If we are doing this expression, there should be an assertion in the code that ValInit < ValOther.

 DW_OP_deref DW_OP_constu <ValMinus>
DW_OP_mul DW_OP_constu <ValInit> DW_OP_plus DW_OP_stack_value

Why does Val need to be dereferenced?

If we use our concrete example with 42 and 87 again:

start: <val> // either 0 or 1
DW_OP_deref // not sure why this is needed, but let's assume we still have 0 or 1 on the stack
DW_OP_constu (87-42) // <0 or 1> <45>
DW_OP_mul // <0 or 45>
DW_OP_constu 42 // <0 or 45> <42>
DW_OP_plus // <42 or 87>
DW_OP_stack_value

Apart from the DW_OP_deref, this seems very good, I'm slightly worried about the assumption that the init value will always be the smaller one.
What happens if any of the two values is negative (because of the typeless DWARF <5 expression stack — not sure if this will be a problem).

The other expression which was suggested was :

// val * 42 | ((~val)&1) * 87
DW_OP_dup DW_OP_deref DW_OP_constu 42 DW_OP_mul DW_OP_swap
DW_OP_deref DW_OP_not DW_OP_lit1 DW_OP_and DW_OP_constu 87 DW_OP_mul
DW_OP_or DW_OP_stack_value

I have chosen to implement first because it is simpler.
I have also added simpler test also with concrete DW_AT_location description .

lib/Transforms/IPO/GlobalOpt.cpp
1641

typo: emit

NikolaPrica retitled this revision from Debug info for variables whos type is shrinked to bool to Debug info for variables whose type is shrinked to bool.
NikolaPrica marked an inline comment as done.Aug 28 2017, 9:21 AM
val * (ValOther - ValInit) + ValInit:

If we are doing this expression, there should be an assertion in the code that ValInit < ValOther.

I have tested this for edge cases such as both negative values, LONG_MAX, LONG_INT etc and there was no problem for this values. Only flaw for this is ugly long DW_AT_location expression for long values.

 DW_OP_deref DW_OP_constu <ValMinus>
DW_OP_mul DW_OP_constu <ValInit> DW_OP_plus DW_OP_stack_value

Why does Val need to be dereferenced?

DW_OP_deref needs to be used because we have address of new variable on stack, not its value.

Apart from the DW_OP_deref, this seems very good, I'm slightly worried about the assumption that the init value will always be the smaller one.
What happens if any of the two values is negative (because of the typeless DWARF <5 expression stack — not sure if this will be a problem).

For dwarf versions lower than 4 gdb reads value of DW_AT_location expression as address giving message "Cannot access memory at address 0x1e". This may be a problem but it is still better than optimized-out?

aprantl accepted this revision.Aug 28 2017, 9:36 AM
val * (ValOther - ValInit) + ValInit:

If we are doing this expression, there should be an assertion in the code that ValInit < ValOther.

I have tested this for edge cases such as both negative values, LONG_MAX, LONG_INT etc and there was no problem for this values. Only flaw for this is ugly long DW_AT_location expression for long values.

Cool. Thanks for checking!

 DW_OP_deref DW_OP_constu <ValMinus>
DW_OP_mul DW_OP_constu <ValInit> DW_OP_plus DW_OP_stack_value

Why does Val need to be dereferenced?

DW_OP_deref needs to be used because we have address of new variable on stack, not its value.

Oh, because it is a global variable! Makes sense.

Apart from the DW_OP_deref, this seems very good, I'm slightly worried about the assumption that the init value will always be the smaller one.
What happens if any of the two values is negative (because of the typeless DWARF <5 expression stack — not sure if this will be a problem).

For dwarf versions lower than 4 gdb reads value of DW_AT_location expression as address giving message "Cannot access memory at address 0x1e". This may be a problem but it is still better than optimized-out?

That makes sense. I'm fine with not attempting to fix this.

test/Transforms/GlobalOpt/integer-bool-dwarf.ll
44

please strip out all non-essential attributes (everything in quotes)

This revision is now accepted and ready to land.Aug 28 2017, 9:36 AM

In fully-compliant mode we should just not emit any DWARF expression that ends with a DW_OP_stack_value in DWARF < 4. However, LLDB and GDB (at least the version Apple used to ship in Xcode) support a couple of expressions that are outside of the specification but do work in practice, so I'm reluctant to drop support for that by being stricter.
That said, Darwin has been using DWARF 4 for a while now, so an argument could be made that it is okay to regress on older deployment targets for the sake of correctness/standards compliance.

NikolaPrica marked an inline comment as not done.

Updated test case

dblaikie accepted this revision.Aug 29 2017, 12:13 PM
dblaikie added inline comments.
test/Transforms/GlobalOpt/integer-bool-dwarf.ll
84–92

Probably put these CHECK lines up the top, above the IR - that's where they usually go for test cases.

NikolaPrica marked 3 inline comments as done.

Updating test case.

This revision was automatically updated to reflect the committed changes.