This is an archive of the discontinued LLVM Phabricator instance.

[llvm-dwarfdump][Statistics] Handle LTO cases with cross CU referencing
ClosedPublic

Authored by dmilosevic141 on Nov 9 2021, 12:09 AM.

Details

Summary

With link-time optimizations enabled, resulting DWARF may end up containing cross CU references (through the DW_AT_abstract_origin attribute).
Consider the following example:

// sum.c
__attribute__((always_inline)) int sum(int a, int b)
{
     return a + b;
}
// main.c
extern int sum(int, int);
int main()
{
     int a = 5, b = 10, c = sum(a, b);
     return 0;
}

Compiled as follows:

$ clang -g -flto -fuse-ld=lld main.c sum.c -o main

Results in the following DWARF:

-- sum.c CU: abstract instance tree
...
0x000000b0:   DW_TAG_subprogram
                DW_AT_name	("sum")
                DW_AT_decl_file	("sum.c")
                DW_AT_decl_line	(1)
                DW_AT_prototyped	(true)
                DW_AT_type	(0x000000d3 "int")
                DW_AT_external	(true)
                DW_AT_inline	(DW_INL_inlined)

0x000000bc:     DW_TAG_formal_parameter
                  DW_AT_name	("a")
                  DW_AT_decl_file	("sum.c")
                  DW_AT_decl_line	(1)
                  DW_AT_type	(0x000000d3 "int")

0x000000c7:     DW_TAG_formal_parameter
                  DW_AT_name	("b")
                  DW_AT_decl_file	("sum.c")
                  DW_AT_decl_line	(1)
                  DW_AT_type	(0x000000d3 "int")
...
-- main.c CU: concrete inlined instance tree
...
0x0000006d:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin	(0x00000000000000b0 "sum")
                  DW_AT_low_pc	(0x00000000002016ef)
                  DW_AT_high_pc	(0x00000000002016f1)
                  DW_AT_call_file	("main.c")
                  DW_AT_call_line	(5)
                  DW_AT_call_column	(0x19)

0x00000081:       DW_TAG_formal_parameter
                    DW_AT_location	(DW_OP_reg0 RAX)
                    DW_AT_abstract_origin	(0x00000000000000bc "a")

0x00000088:       DW_TAG_formal_parameter
                    DW_AT_location	(DW_OP_reg2 RCX)
                    DW_AT_abstract_origin	(0x00000000000000c7 "b")
...

Note that each entry within the concrete inlined instance tree in the main.c CU has a DW_AT_abstract_origin attribute which refers to a corresponding entry within the abstract instance tree in the sum.c CU.
llvm-dwarfdump --statistics did not properly report DW_TAG_formal_parameters/DW_TAG_variables from concrete inlined instance trees which had 0% location coverage and which referred to a different CU, mainly because information about abstract instance trees and their parameters/variables was stored locally - just for the currently processed CU, rather than globally - for all CUs.
In particular, if the concrete inlined instance tree from the example above was to look like this (i.e. parameter b has 0% location coverage, hence why it's missing):

0x0000006d:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin	(0x00000000000000b0 "sum")
                  DW_AT_low_pc	(0x00000000002016ef)
                  DW_AT_high_pc	(0x00000000002016f1)
                  DW_AT_call_file	("main.c")
                  DW_AT_call_line	(5)
                  DW_AT_call_column	(0x19)

0x00000081:       DW_TAG_formal_parameter
                    DW_AT_location	(DW_OP_reg0 RAX)
                    DW_AT_abstract_origin	(0x00000000000000bc "a")

llvm-dwarfdump --statistics would have not reported b as such.

Diff Detail

Event Timeline

dmilosevic141 created this revision.Nov 9 2021, 12:09 AM
dmilosevic141 requested review of this revision.Nov 9 2021, 12:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 9 2021, 12:09 AM
Herald added a subscriber: MaskRay. · View Herald Transcript

Thanks for this! Some initial comments inline.

llvm/test/tools/llvm-dwarfdump/X86/LTO_CCU_zero_loc_cov.ll
87–90

I think we don't need these attributes, so we can get rid of them.

96

we can get rid of the " (https://github.com/dmilosevic141/llvm-project.git 16083be64e5aa8ae2b32be0dbfb3b383b3c562ea)" part.

99

here as well

100
101
129–131

all these three locations can be replaced with just one (and also all the other locations with the same scope could be removed/replaced)

llvm/tools/llvm-dwarfdump/Statistics.cpp
47

is this clang-formatted?

djtodoro added a project: debug-info.

Please also be clear in the patch title that we are fixing the statistics part of the llvm-dwarfdump.

dmilosevic141 retitled this revision from [llvm-dwarfdump] Implementation of an add-on for handling LTO cases with CCU referencing to [llvm-dwarfdump][Statistics] Handle LTO cases with cross CU referencing.Nov 9 2021, 3:16 AM
dmilosevic141 edited the summary of this revision. (Show Details)
dmilosevic141 updated this revision to Diff 385761.EditedNov 9 2021, 3:54 AM

Thanks for the suggestions!

I've updated the following:

  • Changed the title so that it is clear that we're fixing the --statistics part of the llvm-dwarfdump and slightly updated the summary.
  • clang-formatted the missed line within Statistics.cpp. Thanks!
  • Removed the unnecessary git links and hashes, as well as the paths towards the source files within the .ll file.
  • Reduced the number of function attributes, so that the .ll file only contains the necessary ones.
  • Reduced the number of DILocations, so that the .ll file only contains the necessary ones.

Hi,

Note that each entry within the concrete inlined instance tree in the main.c CU has a DW_AT_abstract_origin attribute which refers to a corresponding entry within the abstract instance tree in the sum.c CU.
llvm-dwarfdump (its --statistics option, to be precise) did not properly report DW_TAG_formal_parameters/DW_TAG_variables which had 0% location coverage, mainly because information about abstract instance trees and their parameters/variables was stored locally - just for the currently processed CU, rather than globally - for all CUs.

You state that the variables should have "0% location coverage", but the variables in your example each have a location that covers the entirety of its inlined scope (100% coverage):

0x0000006d:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin	(0x00000000000000b0 "sum")
                  DW_AT_low_pc	(0x00000000002016ef)
                  DW_AT_high_pc	(0x00000000002016f1)
                  DW_AT_call_file	("main.c")
                  DW_AT_call_line	(5)
                  DW_AT_call_column	(0x19)

0x00000081:       DW_TAG_formal_parameter
                    DW_AT_location	(DW_OP_reg0 RAX)
                    DW_AT_abstract_origin	(0x00000000000000bc "a")

Am I missing something?

I added just one inline nit/question but I haven't fully reviewed this - please don't let my questions block the review process.

llvm/tools/llvm-dwarfdump/Statistics.cpp
48

Is there a reason that this cannot be just llvm::SmallVector<DIELocation> (i.e. drop the unique_ptr wrapper)?

dmilosevic141 marked 7 inline comments as done.
dmilosevic141 edited the summary of this revision. (Show Details)

Thanks @Orlando!
As far as the example in the summary goes, it was meant to showcase cross CU referencing in general, hence why the variables in the concrete inlined instance tree had full location coverage. I've updated the summary by adding a concrete example of what it looks like when a variable in a concrete inlined instance tree has 0% location coverage, which is what this patch actually fixes.
I've also dropped the std::unique_ptr wrapper around CrossCUReferencingDIELocationTy as there really was no reason for it.

djtodoro added inline comments.Nov 10 2021, 6:12 AM
llvm/tools/llvm-dwarfdump/Statistics.cpp
730

This actually fixes this TODO.

dmilosevic141 marked an inline comment as done.

Removed a trailing whitespace within the .ll file which prevented patch application in pre-merge checks.

Some extra nits included. I'll take another look into the functionality itself.

llvm/test/tools/llvm-dwarfdump/X86/LTO_CCU_zero_loc_cov.ll
14–30

All these should also start with double ';;'

64

The comment as well should also start with double ';;'

  • Rebased.
  • Updated comments within the .ll file. Thanks @djtodoro!
dmilosevic141 marked 2 inline comments as done.

Ran git-clang-format so that changes made match latest clang-format style.

Thanks @Orlando!
As far as the example in the summary goes, it was meant to showcase cross CU referencing in general, hence why the variables in the concrete inlined instance tree had full location coverage. I've updated the summary by adding a concrete example of what it looks like when a variable in a concrete inlined instance tree has 0% location coverage, which is what this patch actually fixes.

Thanks for the explanation, that clears up my confusion!

djtodoro accepted this revision.Nov 15 2021, 3:12 AM

LGTM. Please leave this for a couple of days, to see if someone has additional comments.

This revision is now accepted and ready to land.Nov 15 2021, 3:12 AM