This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Merge partially matching chains of inlined locations
ClosedPublic

Authored by dstenb on Jan 25 2023, 8:26 AM.

Details

Summary

For example, if you have a chain of inlined functions like this:

 1 #include <stdlib.h>
 2 int g1 = 4, g2 = 6;
 3
 4 static inline void bar(int q) {
 5   if (q > 5)
 6     abort();
 7 }
 8
 9 static inline void foo(int q) {
10   bar(q);
11 }
12
13 int main() {
14   foo(g1);
15   foo(g2);
16   return 0;
17 }

with optimizations you could end up with a single abort call for the two
inlined instances of foo(). When merging the locations for those inlined
instances, you would previously end up with a 0:0 location in main().
Leaving out that inlined chain from the abort call's location could make
troubleshooting difficult in some cases.

This patch changes DILocation::getMergedLocation() to try to handle such
cases. The function is rewritten to first find a common starting point
for the two locations (same subprogram and inlined-at location), and
then traverse the inlined-at chains in reverse, looking for matches in
each subprogram. For each subprogram, the merge function finds the
nearest common scope for the two locations and the matching line and
column (or sets them to 0 if they do not match).

In the example above, the abort call will get a location in bar() at
6:5, inlined in foo() at 10:3, inlined in main() at 0:0 (since the two
inlined calls are on different lines, but in the same scope).
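
To make the strategy concrete, here is a simplified sketch. The names
mergeInlinedChains and nearestCommonScope are made up for illustration; the
real DILocation::getMergedLocation() differs in detail and handles chains
that diverge or have different lengths more carefully.

#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/LLVMContext.h"
using namespace llvm;

// Hypothetical helper: walk up B's scope chain until we reach a scope that
// also encloses A; fall back to B's subprogram (both locations are assumed
// to be in the same subprogram when this is called).
static DIScope *nearestCommonScope(DILocalScope *SA, DILocalScope *SB) {
  SmallPtrSet<DIScope *, 8> AScopes;
  for (DIScope *S = SA; S; S = S->getScope())
    AScopes.insert(S);
  for (DIScope *S = SB; S; S = S->getScope())
    if (AScopes.count(S))
      return S;
  return SB->getSubprogram();
}

// Hypothetical, simplified merge of two locations with inlined-at chains.
static DILocation *mergeInlinedChains(LLVMContext &Ctx, DILocation *A,
                                      DILocation *B) {
  // Collect the inlined-at chains, innermost frame first.
  SmallVector<DILocation *, 8> AChain, BChain;
  for (DILocation *L = A; L; L = L->getInlinedAt())
    AChain.push_back(L);
  for (DILocation *L = B; L; L = L->getInlinedAt())
    BChain.push_back(L);

  // Walk both chains from the outermost frame inwards, merging one frame per
  // subprogram. Lines/columns that differ are set to 0, and the previously
  // merged frame becomes the inlined-at location of the next one.
  DILocation *MergedInlinedAt = nullptr;
  auto AIt = AChain.rbegin(), BIt = BChain.rbegin();
  for (; AIt != AChain.rend() && BIt != BChain.rend(); ++AIt, ++BIt) {
    DILocation *LA = *AIt, *LB = *BIt;
    if (LA->getScope()->getSubprogram() != LB->getScope()->getSubprogram())
      break; // The chains diverge; stop at the common prefix.
    DIScope *Scope = nearestCommonScope(LA->getScope(), LB->getScope());
    unsigned Line = LA->getLine() == LB->getLine() ? LA->getLine() : 0;
    unsigned Col = LA->getColumn() == LB->getColumn() ? LA->getColumn() : 0;
    MergedInlinedAt = DILocation::get(Ctx, Line, Col, Scope, MergedInlinedAt);
  }
  return MergedInlinedAt;
}

For the example above this produces exactly the chain described: main() at
0:0, then foo() at 10:3, then bar() at 6:5.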

I have not seen anything in the DWARF standard that would disallow
inlining a non-zero location at 0:0 in the inlined-at function, and both
LLDB and GDB seem to accept these locations (with D142552 needed for
LLDB to handle cases where the file, line and column number are all 0).
One incompatibility with GDB is that it seems to ignore 0-line locations
in some cases, but I am not aware of any specific issue that this patch
produces related to that.

With x86-64 LLDB (trunk) you previously got:

frame #0: 0x00007ffff7a44930 libc.so.6`abort
frame #1: 0x00005555555546ec a.out`main at merge.c:0

and will now get:

frame #0: 0x[...] libc.so.6`abort
frame #1: 0x[...] a.out`main [inlined] bar(q=<unavailable>) at merge.c:6:5
frame #2: 0x[...] a.out`main [inlined] foo(q=<unavailable>) at merge.c:10:3
frame #3: 0x[...] a.out`main at merge.c:0

and with x86-64 GDB (11.1) you will get:

(gdb) bt
#0  0x00007ffff7a44930 in abort () from /lib64/libc.so.6
#1  0x00005555555546ec in bar (q=<optimized out>) at merge.c:6
#2  foo (q=<optimized out>) at merge.c:10
#3  0x00005555555546ec in main ()

Diff Detail

Event Timeline

dstenb created this revision.Jan 25 2023, 8:26 AM
dstenb requested review of this revision.Jan 25 2023, 8:26 AM

As far as I can tell it is valid DWARF to inline at line 0.

I guess that one of the main questions here is whether debuggers accept this. From what I have tested (LLDB trunk and GDB 9 and 11), they seem to handle it, with the caveat regarding the LLDB patch D142552 and that GDB seems to ignore line-0 locations in some cases. I do not know how the Sony debugger, dbx, or any other debugger handles it.

Oh, neat idea - but I wonder what it does to the DIE tree, rather than just the line table? Do we end up with 3 inlined_subroutines? (one for each of the original calls, if they have any unique instructions after inlining, then one for this synthesized call? I guess that's not strictly worse - the instructions were from some sort of inlining... )

Oh, neat idea - but I wonder what it does to the DIE tree, rather than just the line table? Do we end up with 3 inlined_subroutines? (one for each of the original calls, if they have any unique instructions after inlining, then one for this synthesized call? I guess that's not strictly worse - the instructions were from some sort of inlining... )

Yes, we will end up with 3 inlined_subroutines in the example above (as both original calls have unique instructions):

$ grep -A7 DW_TAG_inlined_subroutine after.txt
0x00000066:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin	(0x0000004a "foo")
                  DW_AT_low_pc	(0x00000000000006d1)
                  DW_AT_high_pc	(0x00000000000006da)
                  DW_AT_call_file	("/upstream/llvm-project/a.c")
                  DW_AT_call_line	(14)
                  DW_AT_call_column	(0x03)

0x00000073:       DW_TAG_inlined_subroutine
                    DW_AT_abstract_origin	(0x0000003d "bar")
                    DW_AT_low_pc	(0x00000000000006d1)
                    DW_AT_high_pc	(0x00000000000006da)
                    DW_AT_call_file	("/upstream/llvm-project/a.c")
                    DW_AT_call_line	(10)
                    DW_AT_call_column	(0x03)

--
0x00000081:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin	(0x0000004a "foo")
                  DW_AT_low_pc	(0x00000000000006da)
                  DW_AT_high_pc	(0x00000000000006e3)
                  DW_AT_call_file	("/upstream/llvm-project/a.c")
                  DW_AT_call_line	(15)
                  DW_AT_call_column	(0x03)

0x0000008e:       DW_TAG_inlined_subroutine
                    DW_AT_abstract_origin	(0x0000003d "bar")
                    DW_AT_low_pc	(0x00000000000006da)
                    DW_AT_high_pc	(0x00000000000006e3)
                    DW_AT_call_file	("/upstream/llvm-project/a.c")
                    DW_AT_call_line	(10)
                    DW_AT_call_column	(0x03)

--
0x0000009c:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin	(0x0000004a "foo")
                  DW_AT_low_pc	(0x00000000000006e7)
                  DW_AT_high_pc	(0x00000000000006ec)
                  DW_AT_call_file	("/upstream/llvm-project/a.c")
                  DW_AT_call_line	(0)

0x000000a8:       DW_TAG_inlined_subroutine
                    DW_AT_abstract_origin	(0x0000003d "bar")
                    DW_AT_low_pc	(0x00000000000006e7)
                    DW_AT_high_pc	(0x00000000000006ec)
                    DW_AT_call_file	("/upstream/llvm-project/a.c")
                    DW_AT_call_line	(10)
                    DW_AT_call_column	(0x03)
dstenb updated this revision to Diff 492719.EditedJan 27 2023, 5:58 AM

When building a Clang binary with this patch to see how often the new case occurs, I encountered a bug where the BRIt iterator was invalidated by further insertion into BLocs. I have fixed that and added a unit test.
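
As a generic illustration of that kind of bug (a standalone sketch; only the pattern is shared with the actual ALocs/BLocs code): iterators into a std::vector are invalidated when the vector reallocates, so a position held across an insertion has to be recomputed from an index.

#include <cstddef>
#include <vector>

int main() {
  std::vector<int> Locs = {1, 2, 3};

  // Buggy pattern: keeping a (reverse) iterator across an insertion.
  //   auto RIt = Locs.rbegin();
  //   Locs.push_back(4);  // May reallocate; RIt is now dangling.
  //   int X = *RIt;       // Undefined behavior.

  // Safer pattern: remember the index, insert, then rebuild the iterator.
  std::size_t Idx = Locs.size() - 1;                  // Element we care about.
  Locs.push_back(4);
  auto RIt = Locs.rbegin() + (Locs.size() - Idx - 1); // Points at element Idx again.
  return *RIt == 3 ? 0 : 1;                           // Exits 0: RIt still points at 3.
}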

Oh, neat idea - but I wonder what it does to the DIE tree, rather than just the line table? Do we end up with 3 inlined_subroutines? (one for each of the original calls, if they have any unique instructions after inlining, then one for this synthesized call? I guess that's not strictly worse - the instructions were from some sort of inlining... )

Yes, we will end up with 3 inlined_subroutines in the example above (as both original calls have unique instructions):

Fair enough - makes sense!

How's the performance?

(@aprantl @JDevlieghere - this seems pretty reasonable/good to me in terms of direction (Haven't looked in detail at the code, but at a glance it seems plausible, if perf is OK) - how about you folks?)

llvm/lib/IR/DebugInfoMetadata.cpp
195

Probably add an explicit return type to the lambda, rather than a cast on the return expression here.
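
To illustrate the suggestion (a generic example with made-up names, not the patch's actual lambda):

#include <cstdint>

int main() {
  int Line = 7;

  // Cast on the return expression:
  auto MergedLineA = [&] { return static_cast<uint64_t>(Line); };

  // Explicit return type on the lambda, as suggested in the review:
  auto MergedLineB = [&]() -> uint64_t { return Line; };

  return MergedLineA() == MergedLineB() ? 0 : 1;
}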

jmorse added a subscriber: jmorse.Jan 27 2023, 10:34 AM

I've run out of time to read the code this week, but it sounds like a good approach to me, I'll give my debugger colleagues a prod about it too.

As far as I can tell it is valid DWARF to inline at line 0.

I guess that one of the main questions here is whether debuggers accept this. From what I have tested (LLDB trunk and GDB 9 and 11), they seem to handle it, with the caveat regarding the LLDB patch D142552 and that GDB seems to ignore line-0 locations in some cases. I do not know how the Sony debugger, dbx, or any other debugger handles it.

To speak for LLDB here: I support emitting an inlined-at: line 0. If LLDB can't deal with it (though I don't see why) we can fix LLDB.

dstenb added a comment.EditedFeb 3 2023, 6:53 AM

Oh, neat idea - but I wonder what it does to the DIE tree, rather than just the line table? Do we end up with 3 inlined_subroutines? (one for each of the original calls, if they have any unique instructions after inlining, then one for this synthesized call? I guess that's not strictly worse - the instructions were from some sort of inlining... )

Yes, we will end up with 3 inlined_subroutines in the example above (as both original calls have unique instructions):

How's the performance?

Sorry, I have been blocked by other work. I have now measured compilation speed for an opt 8.0 binary with the RelWithDebInfo build type, using patched and unpatched versions of trunk (both built as Release).

This was done on a shared machine, so the load varied a bit over time, but I've built opt 40 times with each binary and these are the user time results:

# With patch
[67.32661666666667, 67.33988333333333, 68.00153333333333, 69.95165, 70.0251, 70.08698333333332, 70.10561666666666, 70.15785000000001, 70.1597, 70.16915, 70.1803, 70.22085, 70.2247, 70.30366666666666, 70.30761666666666, 70.31343333333334, 70.31875, 70.43641666666667, 70.5036, 70.50515, 70.51246666666667, 70.53808333333333, 70.57755, 70.60194999999999, 70.64415000000001, 70.84625, 71.6538, 73.79506666666667, 73.89473333333333, 74.56911666666666, 76.15891666666667, 76.41226666666667, 79.76478333333333, 81.38295, 81.68751666666667, 86.29401666666666, 87.57535, 89.4209, 95.79031666666667, 109.24031666666667]
mean: 74.44997666666667
median: 70.50880833333333
std: 8.485396943096646

# Without patch
[66.41356666666667, 67.65480000000001, 70.14276666666667, 70.14448333333334, 70.18646666666667, 70.19085, 70.2014, 70.21263333333333, 70.26146666666666, 70.26425, 70.26745, 70.28258333333333, 70.28881666666666, 70.36748333333333, 70.41218333333335, 70.44623333333334, 70.47338333333333, 70.48428333333334, 70.51248333333334, 70.51689999999999, 70.54706666666667, 70.55458333333333, 70.55946666666667, 70.57926666666667, 70.58081666666666, 70.70003333333334, 70.99738333333333, 71.0452, 71.10905, 72.0509, 77.86593333333333, 79.43435000000001, 79.80055, 84.38, 88.10063333333333, 97.71156666666667, 98.03155, 101.29769999999999, 106.61161666666666, 115.31475]
mean: 75.92492250000001
median: 70.53198333333333
std: 11.548023516184701

So, it seems that this patch is not affecting the compile time noticeably, at least for this normal case.

dstenb added a comment.Feb 3 2023, 7:00 AM

For the RelWithDebInfo opt 8.0 binaries that were mentioned in the previous comment, the number of inlined subroutine DIEs increased by ~2.8%:

$ llvm-dwarfdump build-without/bin/opt | grep -E ':\s+DW_TAG_inlined_subroutine' | wc -l
1726835
$ llvm-dwarfdump build-with/bin/opt | grep -E ':\s+DW_TAG_inlined_subroutine' | wc -l
1775581

and this is the output from llvm-dwarfdump --statistics:

--- stats-without.txt	2023-02-03 14:53:06.328707620 +0100
+++ stats-with.txt	2023-02-03 14:51:35.704846099 +0100
@@ -3 +3 @@
-  "file": "build-without/bin/opt",
+  "file": "build-with/bin/opt",
@@ -5,4 +5,4 @@
-  "#functions": 175812,
-  "#functions with location": 175360,
-  "#inlined functions": 1726835,
-  "#inlined functions with abstract origins": 1726835,
+  "#functions": 177591,
+  "#functions with location": 177139,
+  "#inlined functions": 1775581,
+  "#inlined functions with abstract origins": 1775581,
@@ -10 +10 @@
-  "#source variables": 4042638,
+  "#source variables": 4133726,
@@ -12 +12 @@
-  "#call site entries": 1726835,
+  "#call site entries": 1775581,
@@ -15 +15 @@
-  "sum_all_variables(#bytes in parent scope)": 325501149,
+  "sum_all_variables(#bytes in parent scope)": 325501167,
@@ -22 +22 @@
-  "sum_all_local_vars(#bytes in parent scope)": 165535273,
+  "sum_all_local_vars(#bytes in parent scope)": 165535291,
@@ -26 +26 @@
-  "#bytes within inlined functions": 17090162,
+  "#bytes within inlined functions": 17532650,
@@ -35,11 +35,11 @@
-  "#bytes in .debug_info": 150806439,
-  "#bytes in .debug_abbrev": 2926872,
-  "#bytes in .debug_line": 19552340,
-  "#bytes in .debug_str": 55235762,
-  "#bytes in .debug_addr": 9078944,
-  "#bytes in .debug_loclists": 26593774,
-  "#bytes in .debug_rnglists": 7188993,
-  "#bytes in .debug_str_offsets": 24633440,
-  "#bytes in .debug_line_str": 113086,
-  "#variables processed by location statistics": 2579495,
-  "#variables with 0% of parent scope covered by DW_AT_location": 25017,
+  "#bytes in .debug_info": 151752899,
+  "#bytes in .debug_abbrev": 2959240,
+  "#bytes in .debug_line": 19702676,
+  "#bytes in .debug_str": 55419330,
+  "#bytes in .debug_addr": 9205856,
+  "#bytes in .debug_loclists": 26594317,
+  "#bytes in .debug_rnglists": 7304436,
+  "#bytes in .debug_str_offsets": 24650788,
+  "#bytes in .debug_line_str": 113071,
+  "#variables processed by location statistics": 2581225,
+  "#variables with 0% of parent scope covered by DW_AT_location": 26747,
@@ -57 +57 @@
-  "#variables - entry values with 0% of parent scope covered by DW_AT_location": 18853,
+  "#variables - entry values with 0% of parent scope covered by DW_AT_location": 18849,
@@ -69,2 +69,2 @@
-  "#params processed by location statistics": 2219749,
-  "#params with 0% of parent scope covered by DW_AT_location": 10094,
+  "#params processed by location statistics": 2221618,
+  "#params with 0% of parent scope covered by DW_AT_location": 11963,
@@ -82 +82 @@
-  "#params - entry values with 0% of parent scope covered by DW_AT_location": 4540,
+  "#params - entry values with 0% of parent scope covered by DW_AT_location": 4539,
@@ -94,2 +94,2 @@
-  "#local vars processed by location statistics": 359746,
-  "#local vars with 0% of parent scope covered by DW_AT_location": 14923,
+  "#local vars processed by location statistics": 359607,
+  "#local vars with 0% of parent scope covered by DW_AT_location": 14784,
@@ -107 +107 @@
-  "#local vars - entry values with 0% of parent scope covered by DW_AT_location": 14313,
+  "#local vars - entry values with 0% of parent scope covered by DW_AT_location": 14310,

The binary grew by 2 MB (0.6%):

$ du -h build-without/bin/opt build-with/bin/opt 
323M	build-without/bin/opt
325M	build-with/bin/opt

Numbers look fairly good - wonder if you could get this run through the LLVM perf tracker to get some more stable/precise numbers?
& any chance you could use something like bloaty ( https://github.com/google/bloaty ) or by hand comparison on a per-section basis on the clang binary? Getting a sense of, specifically, how much bigger this makes .debug_info (but any incidental changes to other sections too) might be helpful

Numbers look fairly good - wonder if you could get this run through the LLVM perf tracker to get some more stable/precise numbers?
& any chance you could use something like bloaty ( https://github.com/google/bloaty ) or by hand comparison on a per-section basis on the clang binary? Getting a sense of, specifically, how much bigger this makes .debug_info (but any incidental changes to other sections too) might be helpful

I feel like mentioning “llvm-dwarfdump --show-section-sizes“ :)

Numbers look fairly good - wonder if you could get this run through the LLVM perf tracker to get some more stable/precise numbers?
& any chance you could use something like bloaty ( https://github.com/google/bloaty ) or by hand comparison on a per-section basis on the clang binary? Getting a sense of, specifically, how much bigger this makes .debug_info (but any incidental changes to other sections too) might be helpful

I feel like mentioning “llvm-dwarfdump --show-section-sizes“ :)

Indeed - I usually use llvm-objdump -h when I just want section sizes (it's useful to have the DWARF ones in the context of the others, even if sometimes I'm interested only in the change in DWARF sections) - but the thing that I find especially useful about Bloaty is the comparison mode - showing per-section % changes and overall % changes, etc.

djtodoro added a comment.EditedFeb 5 2023, 11:53 PM

(Sorry for diverging from the topic of the patch :)) I see. It would be really beneficial if we added something like that to our dwarfdump. I may find some free slots to implement that. Thanks.

Numbers look fairly good - wonder if you could get this run through the LLVM perf tracker to get some more stable/precise numbers?

Just to make sure, is that https://llvm-compile-time-tracker.com/? I can try; otherwise I'll build binaries on a smaller machine that is unused during the nights, to get rid of most of the noise.

& any chance you could use something like bloaty ( https://github.com/google/bloaty ) or by hand comparison on a per-section basis on the clang binary? Getting a sense of, specifically, how much bigger this makes .debug_info (but any incidental changes to other sections too) might be helpful

Here are results for the above mentioned RelWithDebInfo opt build:

$ bloaty normal/build-with/bin/opt -- normal/build-without/bin/opt 
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +101%  +924Ki  [ = ]       0    .debug_info
  +100%  +179Ki  [ = ]       0    .debug_str
  +101%  +146Ki  [ = ]       0    .debug_line
  +101%  +123Ki  [ = ]       0    .debug_addr
  +102%  +112Ki  [ = ]       0    .debug_rnglists
  +101% +31.6Ki  [ = ]       0    .debug_abbrev
  +100% +16.9Ki  [ = ]       0    .debug_str_offsets
  +100%    +543  [ = ]       0    .debug_loclists
  +100%     -18  [ = ]       0    .debug_line_str
  +100%  +324Mi  +100% +35.6Mi    TOTAL

RelWithDebInfo + Full LTO opt binary:

$ bloaty full-lto/build-with/bin/opt -- full-lto/build-without/bin/opt 
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +101%  +800Ki  [ = ]       0    .debug_info
  +101%  +152Ki  [ = ]       0    .debug_line
  +102%  +130Ki  [ = ]       0    .debug_rnglists
  +101%  +101Ki  [ = ]       0    .debug_addr
  +100% +92.8Ki  [ = ]       0    .debug_str
  +100% +2.85Ki  [ = ]       0    .debug_str_offsets
  +100%    +120  [ = ]       0    .debug_abbrev
  +100%      -5  [ = ]       0    .debug_loclists
  +100%     -14  [ = ]       0    .debug_line_str
  +101%  +243Mi  +100% +37.7Mi    TOTAL

RelWithDebInfo + Thin LTO opt binary:

$ bloaty thin-lto/build-with/bin/opt -- thin-lto/build-without/bin/opt 
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +101%  +790Ki  [ = ]       0    .debug_info
  +101%  +148Ki  [ = ]       0    .debug_line
  +102%  +121Ki  [ = ]       0    .debug_rnglists
  +101%  +105Ki  [ = ]       0    .debug_addr
  +100%  +102Ki  [ = ]       0    .debug_str
  +101% +34.7Ki  [ = ]       0    .debug_abbrev
  +100% +10.1Ki  [ = ]       0    .debug_str_offsets
  +100% +1.04Ki  [ = ]       0    .debug_loclists
  +100%     -15  [ = ]       0    .debug_line_str
  +100% -3.07Ki  [ = ]       0    .strtab
  +100%  +332Mi  +100% +38.0Mi    TOTAL
330668	normal/build-without/bin/opt
332204	normal/build-with/bin/opt # ~0.5% increase.

248560	full-lto/build-without/bin/opt
249840	full-lto/build-with/bin/opt # ~0.5% increase.

339144	thin-lto/build-without/bin/opt
340456	thin-lto/build-with/bin/opt # ~0.4% increase.

So, around a 0.5% increase of the total binary size, with at most a 1-2% increase in some of the debug information sections for these RelWithDebInfo builds. I used the opt binary since I had used that before in my earlier comments, but the diffs look the same for a clang binary.

I wonder what the increase is with ThinLTO?

@dblaikie, @ayermolo: I have now run measurements on an otherwise unused machine to get rid of the noise.

The measurements are from 21 opt binary builds per build type (RelWithDebInfo with/without ThinLTO), using Clang built with the Release build type as before. The time is user+sys (I had missed the system time last round) taken from bash's time.

# With patch:
mean: 81.391325 (81m23.479476s)
median: 81.387383 (81m23.243000s)
std: 0.052490

# Without patch:
mean: 81.216625 (81m12.997476s)
median: 81.216417 (81m12.985000s)
std: 0.058820

# With patch (ThinLTO build):
mean: 90.751952 (90m45.117095s)
median: 90.761333 (90m45.680000s)
std: 0.045923

# Without patch (ThinLTO build):
mean: 90.619793 (90m37.187571s)
median: 90.600950 (90m36.057000s)
std: 0.085509

So, it seems like this patch adds ~10 seconds of CPU time for these ~80-90 minute builds.

Is this a sufficient measurement?

@ayermolo: And in case you meant the effect on the section and file sizes for a ThinLTO build, those are available in my previous comment!

Ah, sorry, I am blind.
Another data point: I used this patch on some internal build with all the bells and whistles, and the increase was about 28 MB total, which is negligible.

Numbers look fairly good - wonder if you could get this run through the LLVM perf tracker to get some more stable/precise numbers?

Just to make sure, is that https://llvm-compile-time-tracker.com/? I can try; otherwise I'll build binaries on a smaller machine that is unused during the nights, to get rid of most of the noise.

Yeah, that's the one I meant.

& any chance you could use something like bloaty ( https://github.com/google/bloaty ) or by hand comparison on a per-section basis on the clang binary? Getting a sense of, specifically, how much bigger this makes .debug_info (but any incidental changes to other sections too) might be helpful

Here are results for the above mentioned RelWithDebInfo opt build:

$ bloaty normal/build-with/bin/opt -- normal/build-without/bin/opt 
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +101%  +924Ki  [ = ]       0    .debug_info
  +100%  +179Ki  [ = ]       0    .debug_str
  +101%  +146Ki  [ = ]       0    .debug_line
  +101%  +123Ki  [ = ]       0    .debug_addr
  +102%  +112Ki  [ = ]       0    .debug_rnglists
  +101% +31.6Ki  [ = ]       0    .debug_abbrev
  +100% +16.9Ki  [ = ]       0    .debug_str_offsets
  +100%    +543  [ = ]       0    .debug_loclists
  +100%     -18  [ = ]       0    .debug_line_str
  +100%  +324Mi  +100% +35.6Mi    TOTAL

I'm a bit confused by all these reports - at least with my bloaty locally, "+100%" would mean the section doubled in size (100% growth) but it shows "+100%" even for a section that shrunk (.debug_line_str) slightly... Confusing.

but the overall numbers still sound OK.

I'm a bit confused by all these reports - at least with my bloaty locally, "+100%" would mean the section doubled in size (100% growth) but it shows "+100%" even for a section that shrunk (.debug_line_str) slightly... Confusing.

but the overall numbers still sound OK.

Yes, I thought that the output format looked odd. That was with bloaty built from main (52948c), which we can compare to v1.1 (for a small example binary):

$ diff -u <(./bloaty-1.1 with.out -- without.out) <(./bloaty-52948c1 with.out -- without.out)
--- /dev/fd/63	2023-02-11 01:25:09.785684383 +0100
+++ /dev/fd/62	2023-02-11 01:25:09.785684383 +0100
@@ -1,8 +1,8 @@
     FILE SIZE        VM SIZE    
  --------------  -------------- 
-   +11%     +26  [ = ]       0    .debug_info
-  +6.5%     +15  [ = ]       0    .debug_abbrev
-  +6.7%      +8  [ = ]       0    .debug_addr
-  +0.1%      +6  [ = ]       0    [Unmapped]
-  +0.5%      +1  [ = ]       0    .debug_line
-  +0.3%     +56  [ = ]       0    TOTAL
+  +115%     +26  [ = ]       0    .debug_info
+  +109%     +15  [ = ]       0    .debug_abbrev
+  +114%      +8  [ = ]       0    .debug_addr
+  +111%      +6  [ = ]       0    .debug_str_offsets
+  +101%      +1  [ = ]       0    .debug_line
+  +100% +17.1Ki  +100% +2.61Ki    TOTAL

Although they find the same number of bytes, v1.1 reports a +11% increase for .debug_info for this example, compared to +115% when using bloaty from main. Perhaps it is broken on main?

I don't have access to run v1.1 on the binaries in question now, but if we look at the ratios for the "#bytes" diff lines from llvm-dwarfdump in one of my earlier comments we can see that the ratios are smaller than how I interpreted bloaty's output, similar to this small example.

*nod* All a bit confusing, unfortunately - but I /think/ the numbers sound OK.

compile-time-tracker results?

compile-time-tracker results?

I have gotten results now, but I will not have time to look at them tonight: https://llvm-compile-time-tracker.com/compare.php?from=f7b10467b63f09ab74e67d4002b3e11601091882&to=d200692f433fbc5b8edaa68e92ab04124d0c7e6a

compile-time-tracker results?

I have gotten results now, but I will not have time to look at them tonight: https://llvm-compile-time-tracker.com/compare.php?from=f7b10467b63f09ab74e67d4002b3e11601091882&to=d200692f433fbc5b8edaa68e92ab04124d0c7e6a

Looking at the instruction count we have the following geomeans:

# NewPM-O3:
geomean         72506M         72501M (-0.01%)
# NewPM-ReleaseThinLTO:
geomean         95895M         95901M (+0.01%)
# NewPM-ReleaseLTO-g:
geomean        110051M        110076M (+0.02%)
# NewPM-O0-g:
geomean         21463M         21473M (+0.05%)

and cycles:

# NewPM-O3:
geomean         54478M         54520M (+0.08%)
# NewPM-ReleaseThinLTO:
geomean         70324M         70367M (+0.06%)
# NewPM-ReleaseLTO-g:
geomean         82451M         82524M (+0.09%)
# NewPM-O0-g:
geomean         16404M         16423M (+0.12%)

I assume that the NewPM-O3 and NewPM-ReleaseThinLTO configurations are not interesting since they do not use -g (there are no file-size changes for those), and that NewPM-O0-g is not interesting since we'd only (?) inline always_inline functions (there are no file-size changes there either). So, there is a slight instruction and cycle count increase for the geometric mean for NewPM-ReleaseLTO-g, but that is similar to the uninteresting configurations. I assume that the uninteresting configurations may move a bit due to different program layout (cache effects, etc.).

In terms of file size, there is a slight increase for the geometric mean, in the same ballpark as what I saw with my opt compilations.

size-file, NewPM-ReleaseLTO-g:

Benchmark          Old         New
kimwitu++          3610KiB     3638KiB (+0.77%)
sqlite3            1911KiB     1917KiB (+0.31%)
consumer-typeset   1382KiB     1383KiB (+0.05%)
Bullet             2448KiB     2453KiB (+0.21%)
tramp3d-v4         12535KiB    12552KiB (+0.14%)
mafft              579KiB      580KiB (+0.17%)
ClamAV             2213KiB     2221KiB (+0.39%)
lencod             2514KiB     2522KiB (+0.34%)
SPASS              2616KiB     2629KiB (+0.47%)
7zip               5495KiB     5505KiB (+0.18%)
geomean            2589KiB     2597KiB (+0.30%)

I've run out of time to read the code this week, but it sounds like a good approach to me, I'll give my debugger colleagues a prod about it too.

Hi! Did you have a chance to talk with your colleagues regarding this?

perf report looks ok to me, thanks!

dstenb updated this revision to Diff 499059.Feb 21 2023, 12:37 AM

Address comment and fix the patch for non-assert builds.

dstenb marked an inline comment as done.Feb 21 2023, 12:39 AM
dstenb added inline comments.
llvm/lib/IR/DebugInfoMetadata.cpp
195

Sorry, I did not catch this comment!

dblaikie accepted this revision.Feb 21 2023, 1:24 PM

Reckon this is worth a go - maybe wait for @aprantl to sign off too, though.

llvm/lib/IR/DebugInfoMetadata.cpp
155–157

Would it be easier to express/read as arithmetic on rbegin, rather than begin->arithmetic->reverse?

This revision is now accepted and ready to land.Feb 21 2023, 1:24 PM
dstenb marked an inline comment as done.Feb 22 2023, 6:32 AM
dstenb added inline comments.
llvm/lib/IR/DebugInfoMetadata.cpp
155–157

The corresponding initializations based on rbegin would be:

ARIt = ALocs.rbegin() + (ALocs.size() - IT->second - 1);
BRIt = BLocs.rbegin() + (BLocs.size() - I - 1);

I think that they are about the same in terms of readability, so I'm happy to pick either.

dblaikie added inline comments.Feb 22 2023, 12:20 PM
llvm/lib/IR/DebugInfoMetadata.cpp
155–157

Yeah, I think I'd lean /slightly/ towards the latter/not having to explicitly construct a reverse iterator, but no big deal either way, indeed. Your call.
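
For reference, the two formulations are equivalent; a small standalone example with a plain std::vector (the container and index names here are just stand-ins for the actual ALocs code):

#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

int main() {
  std::vector<int> ALocs = {10, 20, 30, 40};
  std::size_t I = 1; // Forward index of the element the iterator should point at.

  // begin -> arithmetic -> reverse: base() points one past the element.
  auto RIt1 = std::make_reverse_iterator(ALocs.begin() + I + 1);

  // Arithmetic on rbegin, as suggested in the review.
  auto RIt2 = ALocs.rbegin() + (ALocs.size() - I - 1);

  assert(*RIt1 == 20 && *RIt2 == 20); // Both point at the same element.
  return 0;
}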

dstenb added a comment.EditedMar 1 2023, 2:42 AM

Reckon this is worth a go - maybe wait for @aprantl to sign off too, though.

Thanks for the review!

Any thoughts on this, @aprantl, or others? If not, I'll attempt to land this at the end of this week or in the beginning of next.

I'm fine with taking this, thanks!

aprantl accepted this revision.Mar 3 2023, 8:39 AM
dstenb added a comment.Mar 6 2023, 5:23 AM

Thanks! I'll land this shortly.

This revision was landed with ongoing or failed builds.Mar 6 2023, 5:37 AM
This revision was automatically updated to reflect the committed changes.