This is an archive of the discontinued LLVM Phabricator instance.

[llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc
ClosedPublic

Authored by djtodoro on Feb 4 2021, 8:26 AM.

Details

Summary

The presence or absence of an inlined variable (or formal parameter) that has only a DW_AT_abstract_origin reference (without DW_AT_location) should not change the location coverage.
That is, for both:

DW_TAG_inlined_subroutine
              DW_AT_abstract_origin (0x0000004e "f")
              DW_AT_low_pc  (0x0000000000000010)
              DW_AT_high_pc (0x0000000000000013)
  DW_TAG_formal_parameter
              DW_AT_abstract_origin       (0x0000005a "b")
NULL

and

DW_TAG_inlined_subroutine
            DW_AT_abstract_origin (0x0000004e "f")
            DW_AT_low_pc  (0x0000000000000010)
            DW_AT_high_pc (0x0000000000000013)
NULL

we should report 0% location coverage. If we add a DW_AT_location, the coverage should improve in both cases.
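
For context, a hedged sketch of C++ source that can produce such DIEs (illustrative only - not taken from the patch's tests; whether the optimizer actually drops the location depends on the compiler and flags):

// sketch.cpp -- e.g.: clang++ -O2 -g -c sketch.cpp
__attribute__((always_inline)) static int f(int b) {
  return b + 1; // 'b' may lose its location under optimization, leaving a
                // DW_TAG_formal_parameter with only DW_AT_abstract_origin
}
int caller(int b) { return f(b); }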

Diff Detail

Event Timeline

djtodoro created this revision.Feb 4 2021, 8:26 AM
djtodoro requested review of this revision.Feb 4 2021, 8:26 AM
djtodoro edited the summary of this revision. (Show Details)Feb 4 2021, 8:53 AM
jmorse added a comment.Feb 4 2021, 9:22 AM

Thanks for the patch -- do I understand correctly that any coverage of an inlined variable will lead to it being removed from the "InlinedVariables" vector? I'm wondering what happens if we have two inlining sites: one where an inlined variable is optimised out, another where the variable has a location. If we're keeping global data and deleting elements from it when coverage is observed, I think that'll make the output dependent on the order in which the inlining sites are encountered.

If a site without coverage is seen before a site with coverage, it'll correctly get a zero-coverage record from InlinedVariables. But if you flip the order, the site with coverage will erase the entry from InlinedVariables, and the site without coverage won't know about the uncovered variable.

I think this patch also relies on the abstract DW_AT_inline copy of the subprogram being encountered before any call sites, and I don't know that that's guaranteed. (Fixing this is probably hard, hence I, uh, avoided looking at it.)

llvm/tools/llvm-dwarfdump/Statistics.cpp
30

Nit, this can probably be a SmallVector, right?

573–574

Oh, I've forgotten to replace the vector with a map for concrete inlined instances.

I think this patch also relies on the abstract DW_AT_inline copy of the subprogram being encountered before any call sites, and I don't know that that's guaranteed. (Fixing this is probably hard, hence I, uh, avoided looking at it.)

Hm... yes... it could be a problem if that isn't guaranteed... I'll try to think about it.

djtodoro updated this revision to Diff 321695.Feb 5 2021, 3:35 AM
  • Replace the vector representing inline vars with a map
  • Replace std::vector with SmallVector
  • Improve the test with multiple concrete inlined instances
jmorse added a comment.Feb 5 2021, 4:43 AM

Looking good, with a couple of comments; the harder case of the out-of-order DW_AT_inline copy still wants solving.

llvm/test/tools/llvm-dwarfdump/X86/locstats-for-inlined-vars.yaml
37–38

Could I request a third inlining site with no DIE at all for this variable? That'll ensure we test the behaviour that D94976 is trying to create.

llvm/tools/llvm-dwarfdump/Statistics.cpp
460

This doesn't need to be in the outer scope, does it? IMO better to keep the declaration where it was, as it's only used in that block.

580–582

I get the feeling modifying this vector while iterating over it is vulnerable to re-ordering / re-allocating.

jmorse added a comment.Feb 5 2021, 5:08 AM

Thinking out loud, and motivated by memory pressure: the per-inlining-site record doesn't need to be maintained before or after we visit the inlining site, right? We might be able to get away with keeping a "current scope's inlined variables" vector on the stack and passing it down through collectStatsRecursive. That'll automagically release the vector when we're done with the scope.

llvm-dwarfdump --statistics already takes a while to examine a stage2reldeb clang binary, it'd be best to avoid allocating more global data than we need.
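
A rough sketch of that shape (hypothetical types and signature - the real collectStatsRecursive in Statistics.cpp takes different parameters):

#include <vector>

// Illustrative stand-ins; the real code uses DWARFDie and richer state.
struct DieStub {
  bool IsInlinedSubroutine = false;
  bool IsVariable = false;
  std::vector<DieStub> Children;
};

// "Per-scope vector on the stack" idea: each inlined scope owns a vector
// of its not-yet-covered variables, released automatically when the
// recursion returns from that scope.
static void collectStatsRecursive(const DieStub &Die,
                                  std::vector<const DieStub *> *InlinedVars) {
  std::vector<const DieStub *> ScopeVars; // lives only for this scope
  if (Die.IsInlinedSubroutine)
    InlinedVars = &ScopeVars; // children below record themselves here
  if (Die.IsVariable && InlinedVars)
    InlinedVars->push_back(&Die);
  for (const DieStub &Child : Die.Children)
    collectStatsRecursive(Child, InlinedVars);
  // On return, ScopeVars (and any zero-coverage reporting done from it)
  // goes away with the stack frame -- no global data accumulates.
}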

djtodoro added a comment.EditedFeb 5 2021, 5:56 AM

@jmorse Thanks for the review!

It makes sense. I tried that approach, but I ran into an issue and dropped that implementation. I'll try to do it that way now, since we have a working prototype.

llvm/test/tools/llvm-dwarfdump/X86/locstats-for-inlined-vars.yaml
37–38

Sure. I'll add it in the next revision.

llvm/tools/llvm-dwarfdump/Statistics.cpp
460

It is being used at the end of the function again.

580–582

Right... It will be removed completely if we keep the vector on the stack.

djtodoro updated this revision to Diff 321751.Feb 5 2021, 6:53 AM
  • Maintain one array of inlined vars per inlined instance by creating it on the stack
  • Improve the test

TODO: Handle the case when a DW_TAG_subprogram with DW_AT_inline comes after its inlined instances

djtodoro updated this revision to Diff 322073.Feb 8 2021, 3:48 AM
  • Code refactor
  • Handle the case when a DW_AT_inline copy of subprogram is out of order
    • Add test for that

Looking good; a few comments inline, thanks for pushing forwards.

llvm/tools/llvm-dwarfdump/Statistics.cpp
471–473

Should this be setting InlinedVarsPtr to nullptr? Inlined variables from a higher-up-the-stack inlining site might still be present and pointed to by the InlinedVarsPtr argument.

666–668

Isn't this "removing" twice, once with std::remove, the other with the erase method?

671

Hmmmmm, this isn't going to collect variables that are nested within lexical_blocks, I guess, and fixing that probably requires more recursion. Ouch.

731

I reckon we'll need to bump the version number for this.

dblaikie added inline comments.Feb 11 2021, 9:59 AM
llvm/tools/llvm-dwarfdump/Statistics.cpp
666–668

Ah the joys of C++. This code is correct and an application of the classic "erase remove" idiom ( https://en.wikipedia.org/wiki/Erase%E2%80%93remove_idiom ).

Though we do have LLVM wrappers that tidy this up a bit: llvm::erase_if and llvm::erase_value.
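
For illustration, the classic idiom next to the LLVM wrapper (a minimal sketch; llvm::erase_value is declared in llvm/ADT/STLExtras.h in the LLVM of this era):

#include "llvm/ADT/STLExtras.h"
#include <algorithm>
#include <vector>

void removeZeros(std::vector<int> &V) {
  // Erase-remove idiom: std::remove shifts the kept elements forward and
  // returns the new logical end; erase() then drops the leftover tail.
  V.erase(std::remove(V.begin(), V.end(), 0), V.end());
  // Equivalent single call using the LLVM wrapper:
  llvm::erase_value(V, 0);
}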

djtodoro added inline comments.Feb 12 2021, 6:43 AM
llvm/tools/llvm-dwarfdump/Statistics.cpp
471–473

It makes sense.

506–514

This piece of code needs some changes in order to handle lexical scopes...

666–668

Yeah, it is an idiom for eliminating elements from STL containers.

@dblaikie I think using the std:: ones is the recommended approach, right? Or are you proposing the llvm:: ones?

671

Oh yeah... inlined functions can also have lexical scopes... I will add a dedicated test case and handle it.

731

I thought we only needed to update this when adding new metrics...
Thanks! I will update it.

dblaikie added inline comments.Feb 12 2021, 11:58 AM
llvm/tools/llvm-dwarfdump/Statistics.cpp
666–668

generally we use the llvm ones - that's why we have them: they simplify the API (making it range-based instead of iterator-pair-based) and in this case wrap the common erase+remove idiom into a single, more legible function call (avoiding the confusion @jmorse ran into when reading this code, for instance)

djtodoro added inline comments.Feb 17 2021, 7:38 AM
llvm/tools/llvm-dwarfdump/Statistics.cpp
666–668

OK, it sounds reasonable to me. Thanks.

djtodoro updated this revision to Diff 324312.Feb 17 2021, 8:01 AM
  • Handle nested nodes (added a test for that)
  • Use llvm::erase_value instead of the erase-remove idiom

llvm-dwarfdump --statistics already takes a while to examine a stage2reldeb clang binary, it'd be best to avoid allocating more global data than we need.

(Off-topic for this review:) D96871 is a small optimization of llvm-dwarfdump --statistics.

jmorse accepted this revision.Feb 19 2021, 2:35 AM

Ship it!

This revision is now accepted and ready to land.Feb 19 2021, 2:35 AM

This patch seems to be causing a slowdown while running dwarfdump --statistics. It looks like more than 50% of the time is spent in the collectZeroCovInlinedVars function.

In the function collectStatsForObjectFile, the variable InlinedFnsToBeProcessed is created outside the for loop, so the inlined functions to be processed are appended and processed multiple times. I verified that moving the creation of that variable into the for loop makes the slowdown go away.

Can you please take a look?
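
For illustration, a minimal analog of the reported shape (names mirror the discussion; the real code in collectStatsForObjectFile differs):

#include <vector>

// Minimal analog: a worklist declared outside the per-CU loop keeps
// growing, so entries from earlier CUs are re-processed on every later
// iteration; declaring it inside the loop resets it per CU.
void processUnits(const std::vector<std::vector<int>> &Units) {
  // Slow shape: std::vector<int> Worklist;  // declared out here
  for (const auto &CU : Units) {
    std::vector<int> Worklist;          // reset for each CU
    Worklist.insert(Worklist.end(), CU.begin(), CU.end());
    for (int Entry : Worklist) {
      (void)Entry;                      // ... process the entry ...
    }
  }
}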

Thanks for finding this! It does solve the problem -- do you want me to patch it, or would you like to post the fix yourself?

After moving the variable InlinedFnsToBeProcessed inside the for loop, I noticed a difference in two dwarfdump statistics for quite a large binary built with fission=yes:

#variables processed by location statistics
#variables with 0% of parent scope covered by DW_AT_location

Since I am not very familiar with this code, may I ask you to investigate and apply the patch?

Hmm, looking at this and trying to reproduce the behavior makes me think this wouldn't be the right fix & that the change in numbers is probably reflective of a bug in the fix - specifically in LTO situations (where the abstract origin of the function may be in a different CU than the one it's referenced from/inlined into).

But there might be a reasonable alternative fix/improvement - pruning the contents of the list once it has been handled. Possibly sorting the list so it's easy to visit just the elements that need it.

I have taken a look into this and come to a similar conclusion.
This can be reproduced with Split DWARF -- building e.g. the GDB project with -gsplit-dwarf enabled triggers the issue. I struggled to craft a simple test case (from a high level) that meets all the requirements, but it is very hard. That is why I sent this email: https://lists.llvm.org/pipermail/llvm-dev/2021-April/149735.html.

Ah, hmm - why would this come up in Split DWARF, do you think? Split DWARF doesn't have any cross-CU references (the format has no way to encode them) so I'd expect no DIEs to be added to the InlinedFnsToBeProcessed list?

The example I had in mind is a bit different (I haven't looked into cross-CU support yet, but we should have covered it).
Let me first explain the two main variables here (a sketch of this bookkeeping follows at the end of this comment):

GlobalInlinedFnInfo -- keeps track of the DW_TAG_subprogram copies with DW_AT_inline, so that when we encounter a DW_TAG_inlined_subroutine we know how many of its variables have 0% location coverage
InlinedFnsToBeProcessed -- since the recursive walk can encounter a DW_TAG_inlined_subroutine before the DW_TAG_subprogram copy with DW_AT_inline that it references (the DWARF standard does not prescribe an order), this variable keeps the DW_TAG_inlined_subroutine DIEs that should be processed after we have finished walking the DWARF tree

Now let us consider a big project (say, GDB) compiled with -gsplit-dwarf. The main executable will have a debug_info section like:

$ llvm-dwarfdump gdb
0x0000000b: DW_TAG_compile_unit
           DW_AT_stmt_list   (0x00000000)
           DW_AT_comp_dir    ("/gdb")
           DW_AT_GNU_pubnames        (true)
           DW_AT_GNU_dwo_name        ("gdb.dwo")
           DW_AT_GNU_dwo_id  (0xb8ed71b1bb823706)
           DW_AT_low_pc      (0x0000000000482370)
           DW_AT_high_pc     (0x000000000048239a)
           DW_AT_GNU_addr_base       (0x00000000)
0x00000030: Compile Unit: length = 0x00000030, format = DWARF32, version = 0x0004, abbr_offset = 0x001a, addr_size = 0x08 (next unit at 0x00000064)

0x0000003b: DW_TAG_compile_unit
           DW_AT_stmt_list   (0x00000069)
           DW_AT_comp_dir    ("/gdb")
           DW_AT_GNU_pubnames        (true)
           DW_AT_GNU_dwo_name        ("amd64-tdep.dwo")
           DW_AT_GNU_dwo_id  (0x97c5a9e4ce1a43bf)
           DW_AT_GNU_ranges_base     (0x00000000)
           DW_AT_low_pc      (0x00000000004823a0)
           DW_AT_high_pc     (0x000000000048e982)
           DW_AT_GNU_addr_base       (0x00000018)
0x00000064: Compile Unit: length = 0x0000002c, format = DWARF32, version = 0x0004, abbr_offset = 0x0037, addr_size = 0x08 (next unit at 0x00000094)

0x0000006f: DW_TAG_compile_unit
....

We keep filling both of these variables (GlobalInlinedFnInfo and InlinedFnsToBeProcessed) across all the CUs, but there can be a situation where an inlined_subroutine references a subprogram and we accidentally use the variable info of a subprogram from a previous CU that happens to have the same DIE offset -- that is why the numbers differ in this case.
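
A hedged sketch of that bookkeeping (types and names are illustrative stand-ins, not the exact ones in Statistics.cpp; note the map is keyed by raw DIE offset, which is exactly what collides once the containers outlive a single CU):

#include <cstdint>
#include <map>
#include <vector>

using DieOffset = uint64_t;   // NB: only unique within one unit's section
struct VarInfo { /* per-variable info from the abstract DW_AT_inline copy */ };

// Abstract DW_AT_inline subprograms seen so far, keyed by DIE offset.
std::map<DieOffset, std::vector<VarInfo>> GlobalInlinedFnInfo;
// Inlined instances whose abstract origin hasn't been walked yet; these
// get resolved in a second pass after the whole tree has been visited.
std::vector<DieOffset> InlinedFnsToBeProcessed;

void onInlinedSubroutine(DieOffset AbstractOrigin) {
  auto It = GlobalInlinedFnInfo.find(AbstractOrigin);
  if (It == GlobalInlinedFnInfo.end()) {
    InlinedFnsToBeProcessed.push_back(AbstractOrigin); // defer
    return;
  }
  // ... report 0% coverage for each variable in It->second that has no
  // DIE with DW_AT_location at this inlining site ...
}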

Hmm, OK. If the intent is that this change isn't meant to work for/fix the LTO case - then changing the scope sounds like something worth doing immediately. (& the Split DWARF discussion sounds like a place where the current buggy code would exhibit different (incorrect) behavior.) I can probably try to come up with a test case that demonstrates that buggy behavior.

GlobalInlinedFnInfo -- keeps track of the DW_TAG_subprogram copies with DW_AT_inline, so that when we encounter a DW_TAG_inlined_subroutine we know how many of its variables have 0% location coverage

As an aside: this issue can come up for the concrete DW_TAG_subprogram that refers to the abstract origin, not only for the DW_TAG_inlined_subroutines that refer to it, i.e.:

$ cat test.cpp
void f1();
__attribute__((always_inline)) void f2() {
  int i;
  f1();
}
void f3() {
  f2();
}
blaikie@blaikie-linux2:~/dev/scratch$ clang++-tot -O3 test.cpp -c -g && llvm-dwarfdump-tot test.o | grep "DW_TAG\|abstract_origin\|DW_AT_name"
0x0000000b: DW_TAG_compile_unit
              DW_AT_name        ("test.cpp")
0x0000002a:   DW_TAG_subprogram
                DW_AT_abstract_origin   (0x0000005b "_Z2f2v")
0x0000003d:     DW_TAG_variable
                  DW_AT_abstract_origin (0x00000067 "i")
0x00000042:     DW_TAG_GNU_call_site
                  DW_AT_abstract_origin (0x00000050 "_Z2f1v")
0x00000050:   DW_TAG_subprogram
                DW_AT_name      ("f1")
0x0000005b:   DW_TAG_subprogram
                DW_AT_name      ("f2")
0x00000067:     DW_TAG_variable
                  DW_AT_name    ("i")
0x00000073:   DW_TAG_base_type
                DW_AT_name      ("int")
0x0000007a:   DW_TAG_subprogram
                DW_AT_name      ("f3")
0x00000093:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin (0x0000005b "_Z2f2v")
0x000000a7:     DW_TAG_GNU_call_site
                  DW_AT_abstract_origin (0x00000050 "_Z2f1v")

So currently LLVM still has the "emitting a DIE with only an abstract origin for variables that are optimized away" issue in the concrete out-of-line definition (0x2a/0x3d above), though it's been fixed in the inlined case (0x93 above). And this patch for the statistics addresses the latter case (the inlined subroutine) but doesn't address the former (which would matter if LLVM were improved not to produce that basically dead concrete variable DIE).

We keep filling both of these variables (GlobalInlinedFnInfo and InlinedFnsToBeProcessed) across all the CUs, but there can be a situation where an inlined_subroutine references a subprogram and we accidentally use the variable info of a subprogram from a previous CU that happens to have the same DIE offset -- that is why the numbers differ in this case.

Yep, that sounds reasonably buggy & in the absence of LTO - or if this code isn't intended to handle the LTO case (which it seems it isn't) - moving the variables to the inner scope for now, pending work to address LTO, sounds right to me.

Here's a test case for the scope change fix/bug with split-dwarf:
a.cpp:

__attribute__((optnone)) static void x() {
}
__attribute__((always_inline)) static void y() {
  int var;
  x();
}
void f1() {
  y();
}

b.cpp:

__attribute__((optnone)) static void x() {
}
__attribute__((always_inline)) static void y() {
  int var;
  x();
}
void f2() {
  y();
}

c.cpp:

void f1();
void f2();
int main() {
  f1();
  f2();
}
$ clang++ a.cpp b.cpp c.cpp -g -gsplit-dwarf -O1
$ llvm-dwarfdump --statistics a.out

(you could compile c.cpp without debug info, or put main in a or b.cpp at the end, I think it'd still work).

With the current code, this reports:

"#variables processed by location statistics": 3,                                                             
"#variables with 0% of parent scope covered by DW_AT_location": 3,    
...
"#local vars processed by location statistics": 3,                                                                        
"#local vars with 0% of parent scope covered by DW_AT_location": 3,

Moving the variables into the nested scope causes these to all change to 2, which is correct.

So currently LLVM still has the "emitting a DIE with only an abstract origin for variables that are optimized away" issue in the concrete out-of-line definition (0x2a/0x3d above), though it's been fixed in the inlined case (0x93 above). And this patch for the statistics addresses the latter case (the inlined subroutine) but doesn't address the former (which would matter if LLVM were improved not to produce that basically dead concrete variable DIE).

Oh, yes, I think D54217 introduced a bug here (since we should consider a DW_TAG_subprogram with an abstract_origin to be an inlined function), and I guess we should do something like this:

diff --git a/llvm/tools/llvm-dwarfdump/Statistics.cpp b/llvm/tools/llvm-dwarfdump/Statistics.cpp
index 3758a56da7a2..4d0cf6b5fc8d 100644
--- a/llvm/tools/llvm-dwarfdump/Statistics.cpp
+++ b/llvm/tools/llvm-dwarfdump/Statistics.cpp
@@ -462,9 +462,15 @@ static void collectStatsRecursive(DWARFDie Die, std::string FnPrefix,
     return;

   // Handle any kind of lexical scope.
-  const bool IsFunction = Tag == dwarf::DW_TAG_subprogram;
+  const bool HasAbstractOriginAttr =
+      Die.find(dwarf::DW_AT_abstract_origin) != None;
+  const bool IsFunction =
+      ((Tag == dwarf::DW_TAG_subprogram) && !HasAbstractOriginAttr);
   const bool IsBlock = Tag == dwarf::DW_TAG_lexical_block;
-  const bool IsInlinedFunction = Tag == dwarf::DW_TAG_inlined_subroutine;
+  const bool IsInlinedFunction =
+      (Tag == dwarf::DW_TAG_inlined_subroutine ||
+       (Tag == dwarf::DW_TAG_subprogram && HasAbstractOriginAttr));
+
   InlinedVarsTy InlinedVars;
   // Get the vars of the inlined fn, so the locstats
   // reports the missing vars (with coverage 0%).

Moving the variables into the nested scope causes these to all change to 2, which is correct.

Great! Thanks a lot!
This is the patch: https://reviews.llvm.org/D100951.

Oh, yes, I think D54217 introduced a bug here (since we should consider a DW_TAG_subprogram with an abstract_origin to be an inlined function)...

More or less - though I think we'll need to find a name other than "inlined" for these "concrete and inlined instances with abstract origins", otherwise it's liable to be confusing to future readers/maintainers.

Great! Thanks a lot!
This is the patch: https://reviews.llvm.org/D100951.

Thanks!

Oh, and it'd also be really good/important to make this work correctly with LTO (though that'll be more challenging/expensive due to the memory usage - it'll mean basically saving the details of every abstract origin (which means identifying them as such, since they don't explicitly identify themselves as abstract origins - the expensive part) and then searching through those whenever we see a concrete-or-inlined instance (& if the search fails, putting the concrete-or-inlined instance in the list of things to resolve against future abstract origins as they are parsed)). Will you be able to do that?

More or less - though I think we'll need to find a name other than "inlined" for these "concrete and inlined instances with abstract origins", otherwise it's liable to be confusing to future readers/maintainers.

Here is the patch -- D101025.

Oh, and it'd also be really good/important to make this work correctly with LTO (though that'll be more challenging/expensive due to the memory usage - it'll mean basically saving the details of every abstract origin (which means identifying them as such, since they don't explicitly identify themselves as abstract origins - the expensive part) and then searching through those whenever we see a concrete-or-inlined instance (& if the search fails, putting the concrete-or-inlined instance in the list of things to resolve against future abstract origins as they are parsed)). Will you be able to do that?

Sure, I'll put it on my TODO list. I have some open issues I'm working on -- as soon as I'm finished with them, I'll start working on this. The first step is to make a simple test case.

Thanks - happy to help with that any time. (Roughly speaking: one translation unit with a function that's always_inline, and another translation unit that calls that function - link them together with LTO enabled and you'll get cross-CU inlining.)
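
A hedged sketch of that recipe (file names and flags are illustrative; an LTO-capable linker is assumed):

// a.cpp
__attribute__((always_inline)) int f(int x) {
  int v = x + 1; // candidate for a variable DIE with only an abstract origin
  return v;
}

// b.cpp
int f(int);
int main() { return f(1); }

// Build and inspect:
//   clang++ -flto -O2 -g a.cpp b.cpp -o a.out
//   llvm-dwarfdump --statistics a.out
// With LTO, f is inlined into main across translation units, so the
// inlined_subroutine's abstract origin lives in a different CU than the
// inlining site.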