This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/Transforms/Scalar/
-
Transforms/
-
Scalar/
1/1
EarlyCSE.cpp
-
test/DebugInfo/X86/
-
DebugInfo/
-
X86/
-
earlycse-load-marks-undef.ll

Differential D80691

Proposed fix for PR46114
Needs ReviewPublic

Authored by helloqirun on May 27 2020, 11:22 PM.

Download Raw Diff

Details

Reviewers

davide
Orlando

Summary

The problem occurs when removing duplicate dbg intrinsic as follows:

call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31

The first dbg.value(.. i32 0) is removed by the backward scan in llvm::RemoveRedundantDbgInstrs.

The root cause is in EarlyCSE.

Before EarlyCSE, we have

call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
%1 = load i32, i32* @a, align 4, !dbg !32, !tbaa !27
call void @llvm.dbg.value(metadata i32 %1, metadata !22, metadata !DIExpression()), !dbg !31

When removing %1 = load i32, i32* @a, align 4, !dbg !32, !tbaa !27, we should salvage the debug info early, before replacing all uses with %0.

The IR after applying the patch:

call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
call void @llvm.dbg.value(metadata i32 undef, metadata !22, metadata !DIExpression()), !dbg !31

Diff Detail

Event Timeline

helloqirun created this revision.May 27 2020, 11:22 PM

Herald added a project: Restricted Project. · View Herald TranscriptMay 27 2020, 11:22 PM

Herald added subscribers: llvm-commits, kosarev, hiraditya, aprantl. · View Herald Transcript

Harbormaster failed remote builds in B58168: Diff 266747!May 27 2020, 11:25 PM

helloqirun edited the summary of this revision. (Show Details)May 27 2020, 11:35 PM

Hi @helloqirun,

Thanks for working on this!

Assuming your IR is from the code you provided in your reproducer PR46114 [1],
compiled with clang at 709c52b9553 (18 May 2020) I see this:

$ cat test.c
int a = 1;
short b;
char c;
int main() {
  int d = a, l_52 = 0;
  b = (l_52 = a) - c; //   qirun0
  if (d) {
    int e=1;
  }
}

$ clang test.c -O3 -g -mllvm -print-after-all
...
*** IR Dump After SROA *** // BEFORE EARLY CSE
; Function Attrs: nounwind uwtable
define dso_local i32 @main() #0 !dbg !17 {
entry:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i32, i32* @a, align 4, !dbg !32, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %1, metadata !22, metadata !DIExpression()), !dbg !31
  %2 = load i8, i8* @c, align 1, !dbg !33, !tbaa !34
  %conv = sext i8 %2 to i32, !dbg !33
  ...

*** IR Dump After Early CSE ***
; Function Attrs: nounwind uwtable
define dso_local i32 @main() #0 !dbg !17 {
entry:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i8, i8* @c, align 1, !dbg !32, !tbaa !33
  %conv = sext i8 %1 to i32, !dbg !32
  ...

During EarlyCSE we detect that %0 = load i32, i32* @a is the same as
%1 = load i32, i32* @a, so the latter is removed. We rewrite the dbg.value
which uses %1 to use %0 instead, because %1 is being removed. IIUC this
behaviour in EarlyCSE looks correct, and makes me feel like the issue
comes from somewhere else?

[1] https://bugs.llvm.org/show_bug.cgi?id=46114

Hi @Orlando
Thanks! Yes, if we want to have two dbg.value in the IR, it works fine. That's precisely what was before e5f07080b8a, i.e., the final IR will contain

call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31

and lldb can print the correct value of 0.

In e5f07080b8a, we introduced the function RemoveRedundantDbgInstrs to eliminate the redundant dbg.value. So now, SimplyCFG will call RemoveRedundantDbgInstrs and then removeRedundantDbgInstrsUsingBackwardScan to remove the call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31.

From the comment, I see that RemoveRedundantDbgInstrs seems to be elegant and it has been used in other locations. For some reasons, removeRedundantDbgInstrsUsingBackwardScan sees that the last dbg.value is always the latest.

So I see the following options:
(1) Change earlyCSE to make it compatible with the logic in removeRedundantDbgInstrsUsingBackwardScan
(2) fix the logic in removeRedundantDbgInstrsUsingBackwardScan, but we also need to make sure that it works for all calling contexts
(3) undo the introduction of RemoveRedundantDbgInstrs, but we will see duplicate dbg.values

After my initial investigation, I think (1) may be the most promising way to fix this issue.

In D80691#2059463, @Orlando wrote:

Hi @helloqirun,

Thanks for working on this!

Assuming your IR is from the code you provided in your reproducer PR46114 [1],
compiled with clang at 709c52b9553 (18 May 2020) I see this:

$ cat test.c
int a = 1;
short b;
char c;
int main() {
  int d = a, l_52 = 0;
  b = (l_52 = a) - c; //   qirun0
  if (d) {
    int e=1;
  }
}

$ clang test.c -O3 -g -mllvm -print-after-all
...
*** IR Dump After SROA *** // BEFORE EARLY CSE
; Function Attrs: nounwind uwtable
define dso_local i32 @main() #0 !dbg !17 {
entry:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i32, i32* @a, align 4, !dbg !32, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %1, metadata !22, metadata !DIExpression()), !dbg !31
  %2 = load i8, i8* @c, align 1, !dbg !33, !tbaa !34
  %conv = sext i8 %2 to i32, !dbg !33
  ...

*** IR Dump After Early CSE ***
; Function Attrs: nounwind uwtable
define dso_local i32 @main() #0 !dbg !17 {
entry:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i8, i8* @c, align 1, !dbg !32, !tbaa !33
  %conv = sext i8 %1 to i32, !dbg !32
  ...

[1] https://bugs.llvm.org/show_bug.cgi?id=46114

Hi @Orlando

During EarlyCSE we detect that %0 = load i32, i32* @a is the same as
%1 = load i32, i32* @a, so the latter is removed. We rewrite the dbg.value
which uses %1 to use %0 instead, because %1 is being removed. IIUC this
behaviour in EarlyCSE looks correct, and makes me feel like the issue
comes from somewhere else?

There are cases where EarlyCSE needs to remove the first (the following case). The instruction `%0 = load i32, i32* @a, align 4` can be the product of both EarlyCSE DCE (the following case) and EarlyCSE CSE LOAD (the case in the original PR).

The EarlyCSE DCE case is compatible with the current logic in `RemoveRedundantDbgInstrs'. So I propose to patch the EarlyCSE CSE LOAD case.

Consider the following test case

$ cat abc1.c
int a = 1;
short b;
char c;
int main() {
  int d = a, l_52 = 0;
  b = (l_52 = a) - c; 
   {
    int e=d;
  }
}

The IR before EarlyCSE is (which is the same as my original PR)

%0 = load i32, i32* @a, align 4, !dbg !25, !tbaa !26
call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !30
call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !30
  %1 = load i32, i32* @a, align 4, !dbg !31, !tbaa !26
call void @llvm.dbg.value(metadata i32 %1, metadata !22, metadata !DIExpression()), !dbg !30
%2 = load i8, i8* @c, align 1, !dbg !32, !tbaa !33
%conv = sext i8 %2 to i32, !dbg !32
...

The IR after EarlyCSE is

call void @llvm.dbg.value(metadata i32 undef, metadata !21, metadata !DIExpression()), !dbg !25
call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !25
%0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !25
%1 = load i8, i8* @c, align 1, !dbg !31, !tbaa !32
%conv = sext i8 %1 to i32, !dbg !31

In this case, removeRedundantDbgInstrsUsingBackwardScan will remove the last dbg.value. So lldb can print the correct value.

I had another look at this and followed it up on the ticket (PR46114).

I had another look at this and followed it up on the ticket (PR46114).

Sorry, I didn't notice that you had replied here because I didn't refresh the page!

aprantl added inline comments.May 28 2020, 10:41 AM

llvm/lib/Transforms/Scalar/EarlyCSE.cpp
1139	`// Salvage ...` Does this fit on 80 columns?

Re-formatted the comment.

There are a bunch of places where we have this pattern:
salvageKnowledge(); salvageDI(); eraseFromParent().

I wonder whether we can consolidate this -- because there are probably other latent bugs in EarlyCSE [from visual inspection].

Harbormaster completed remote builds in B58284: Diff 266967.May 28 2020, 2:53 PM

Please add a test as well.

I think that this patch reduces variable coverage by undefing dbg.values which
it cannot salvage that would otherwise be RAUWd.

I'd expect your call to salvageDebugInfoOrMarkUndef to insert an 'undef'
dbg.value for 'l_52' for your original reproducer because you cannot salvage a
load operation. When I build with this patch locally, that is what I get. lldb
(and gdb) tells me that the value for 'l_52' has been optimized out.

With your patch applied, using your original reproducer:

$ cat test.c
int a = 1;
short b;
char c;
int main() {
  int d = a, l_52 = 0;
  b = (l_52 = a) - c; //   qirun0
  if (d) {
    int e=1;
  }
}

$ clang-with-your-patch -O3 -g test.c
$ lldb a.out
...
(lldb) s
Process 21970 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x0000000000400486 a.out`main at test.c:6:20
   3   	char c;
   4   	int main() {
   5   	  int d = a, l_52 = 0;
-> 6   	  b = (l_52 = a) - c; //   qirun0
   7   	  if (d) {
   8   	    int e=1;
   9   	  }
(lldb) p l_52
error: Couldn't materialize: couldn't get the value of variable l_52: no location, value may have been optimized out

My lldb is built at 9b56cc9361a471a3ce57da4c98f60e377cc43026 (1st April 2020),
and clang 709c52b9553 (18th May 2020). What do you see in lldb when you build
your original reproducer with this patch?

There are cases where EarlyCSE needs to remove the first (the following case). The instruction %0 = load i32, i32* @a, align 4 can be the product of both EarlyCSE DCE (the following case) and EarlyCSE CSE LOAD (the case in the original PR).

My understanding is that DCE and CSE do not have equal implications for
debug-info. DCE means the value is going away because it isn't used. In this
case we should salvage any debug users (i.e. dbg.values using the value),
because the value will no longer exist. For CSE the value still exists
somewhere, so we instead want to update debug users to use the value which we
are keeping. This is what the line Inst.replaceAllUsesWith(Op); is doing
without your patch. However, with your patch applied, the salvage call replaces
the dbg.values with a constant if it can salvage, or undef if it cannot. This
means that the replaceAllUsesWith (RAUW) cannot update the debug users.

Thanks! Yes, if we want to have two dbg.value in the IR, it works fine. That's precisely what was before e5f07080b8a, i.e., the final IR will contain

call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31
and lldb can print the correct value of 0.

I wouldn't expect this to result in you seeing a value of '0' in the debugger.

From SourceLevelDebugging.rst [1]:

A dbg.value terminates the effect of any preceding dbg.values for (any overlapping fragments of) the specified variable.
The dbg.value’s position in the IR defines where in the instruction stream the variable’s value changes.

Debug intrinsics (e.g. llvm.dbg.value) will not be lowered into "real"
instructions. This means that you cannot step onto them. Instead, they mark
out ranges of instructions where a variable location is live. So when you have
a contiguous pack of them like this referring to the same variable only the
last one has any effect.

For sanity, please could you run the following command on the reproducer built with
clang with and without your patch?

$ llvm-dwarfdump a.out -name l_52

Thanks,
Orlando

[1] https://llvm.org/docs/SourceLevelDebugging.html#object-lifetime-in-optimized-code

This revision now requires changes to proceed.May 29 2020, 1:43 AM

In D80691#2062323, @Orlando wrote:
I think that this patch reduces variable coverage by undefing dbg.values which
it cannot salvage that would otherwise be RAUWd.

I'd expect your call to salvageDebugInfoOrMarkUndef to insert an 'undef'
dbg.value for 'l_52' for your original reproducer because you cannot salvage a
load operation. When I build with this patch locally, that is what I get. lldb
(and gdb) tells me that the value for 'l_52' has been optimized out.

With your patch applied, using your original reproducer:
$ cat test.c
int a = 1;
short b;
char c;
int main() {
  int d = a, l_52 = 0;
  b = (l_52 = a) - c; //   qirun0
  if (d) {
    int e=1;
  }
}

$ clang-with-your-patch -O3 -g test.c
$ lldb a.out
...
(lldb) s
Process 21970 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x0000000000400486 a.out`main at test.c:6:20
   3   	char c;
   4   	int main() {
   5   	  int d = a, l_52 = 0;
-> 6   	  b = (l_52 = a) - c; //   qirun0
   7   	  if (d) {
   8   	    int e=1;
   9   	  }
(lldb) p l_52
error: Couldn't materialize: couldn't get the value of variable l_52: no location, value may have been optimized out
My lldb is built at 9b56cc9361a471a3ce57da4c98f60e377cc43026 (1st April 2020),
and clang 709c52b9553 (18th May 2020). What do you see in lldb when you build
your original reproducer with this patch?

There are cases where EarlyCSE needs to remove the first (the following case). The instruction %0 = load i32, i32* @a, align 4 can be the product of both EarlyCSE DCE (the following case) and EarlyCSE CSE LOAD (the case in the original PR).

My understanding is that DCE and CSE do not have equal implications for
debug-info. DCE means the value is going away because it isn't used. In this
case we should salvage any debug users (i.e. dbg.values using the value),
because the value will no longer exist. For CSE the value still exists
somewhere, so we instead want to update debug users to use the value which we
are keeping. This is what the line Inst.replaceAllUsesWith(Op); is doing
without your patch. However, with your patch applied, the salvage call replaces
the dbg.values with a constant if it can salvage, or undef if it cannot. This
means that the replaceAllUsesWith (RAUW) cannot update the debug users.

Thanks! Yes, if we want to have two dbg.value in the IR, it works fine. That's precisely what was before e5f07080b8a, i.e., the final IR will contain

call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31
and lldb can print the correct value of 0.

I wouldn't expect this to result in you seeing a value of '0' in the debugger.

Indeed it shows a 0, as I mentioned in the summary. See the following.

$ clang -v
clang version 10.0.0 (https://github.com/llvm/llvm-project.git 632deb6bd04022945468faef2dcaa8c9fdf1b0fd)
$ clang -g -O3 abc.c
$ cat abc.ll
; ModuleID = 'abc.c'
source_filename = "abc.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@a = dso_local local_unnamed_addr global i32 1, align 4, !dbg !0
@c = common dso_local local_unnamed_addr global i8 0, align 1, !dbg !9
@b = common dso_local local_unnamed_addr global i16 0, align 2, !dbg !6
; Function Attrs: nofree norecurse nounwind uwtable
define dso_local i32 @main() local_unnamed_addr #0 !dbg !17 {
entry:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i8, i8* @c, align 1, !dbg !32, !tbaa !33
  %conv = sext i8 %1 to i32, !dbg !32


$ lldb-trunk -s cmds -b a.out
(lldb) target create "a.out"
(lldb) b 6
Breakpoint 1: where = a.out`main + 6 at abc.c:6:20, address = 0x0000000000400486
(lldb) r
Process 13524 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000400486 a.out`main at abc.c:6:20
   3   	char c;
   4   	int main() {
   5   	  int d = a, l_52 = 0;
-> 6   	  b = (l_52 = a) - c; //   qirun0
   7   	  if (d) {
   8   	    int e=1;
   9   	  }

Process 13524 launched: '/home/absozero/bisect/fix/a.out' (x86_64)
(lldb) p l_52
(int) $0 = 0

$ llvm-dwarfdump a.out -name l_52
a.out:	file format ELF64-x86-64

0x00000097: DW_TAG_variable
              DW_AT_location	(0x00000000:
                 [0x0000000000400486, 0x0000000000400499): DW_OP_consts +0, DW_OP_stack_value)
              DW_AT_name	("l_52")
              DW_AT_decl_file	("/home/absozero/bisect/fix/abc.c")
              DW_AT_decl_line	(5)
              DW_AT_type	(0x0000003f "int")

That's precisely the motivation behind my patch. I don't know an immediate way in earlyCSE to remove the second dbg.value. The backward scan in RemoveRedundantDbgInstrs will always eliminate the first dbg.value.

In D80691#2062229, @djtodoro wrote:

Please add a test as well.

Thanks, will do.

I have added the test case and made a minor change. The RemoveRedundantDbgInstrs call will always keep the second dbg.value. If we do RAUW first, the corresponding value will be replaced -- that causes PR46114. With this patch, we set it as undef. If I missed anything, please let me know. I think @Orlando's suggestion on merging the locations will be a good improvement.

Based on @Orlando's comment. I am attaching the dwarfdump as follows.

The version before the adoption of RemoveRedundantDbgInstrs.

entry:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i8, i8* @c, align 1, !dbg !32, !tbaa !33
  %conv = sext i8 %1 to i32, !dbg !32
  %sub = sub nsw i32 %0, %conv, !dbg !34
  %conv1 = trunc i32 %sub to i16, !dbg !35

$ llvm-dwarfdump a.out -name l_52
a.out:	file format ELF64-x86-64

0x00000097: DW_TAG_variable
              DW_AT_location	(0x00000000:
                 [0x0000000000400486, 0x0000000000400499): DW_OP_consts +0, DW_OP_stack_value)
              DW_AT_name	("l_52")
              DW_AT_decl_file	("/home/absozero/bisect/abc.c")
              DW_AT_decl_line	(5)
              DW_AT_type	(0x0000003f "int")

We can see two dbg.values. But it's still incorrect as the value of l_52 will always be 0.

The version before applying the patch.

entry:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i8, i8* @c, align 1, !dbg !32, !tbaa !33
  %conv = sext i8 %1 to i32, !dbg !32

$ llvm-dwarfdump a.out -name l_52
a.out:	file format ELF64-x86-64

0x00000097: DW_TAG_variable
              DW_AT_location	(0x00000000:
                 [0x0000000000400486, 0x000000000040048f): DW_OP_reg0 RAX)
              DW_AT_name	("l_52")
              DW_AT_decl_file	("/home/absozero/bisect/abc.c")
              DW_AT_decl_line	(5)
              DW_AT_type	(0x0000003f "int")

RemoveRedundantDbgInstrs eliminates the first dbg.value and lldb gives a wrong value (PR46144).

Version after applying the patch

entry:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 undef, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i8, i8* @c, align 1, !dbg !32, !tbaa !33
  %conv = sext i8 %1 to i32, !dbg !32

$ llvm-dwarfdump a.out -name l_52
a.out:	file format ELF64-x86-64

0x00000097: DW_TAG_variable
              DW_AT_name	("l_52")
              DW_AT_decl_file	("/home/absozero/bisect/abc.c")
              DW_AT_decl_line	(5)
              DW_AT_type	(0x0000003f "int")

This is quite a long reply so I've got a very small "summary" at the end.

Asking for the final output (looking at the dwarf) was possibly counterproductive because other
transformations may have an impact and misdirect our attention. Looking at the pass alone will be
helpful; the test should provide a good reference point for this discussion.

If we do RAUW first, the corresponding value will be replaced -- that causes PR46114. With this
patch, we set it as undef. If I missed anything, please let me know.

What I was trying to say is that I believe that the RAUW is correct, and that it is not the cause of
PR46114. Ignoring PR46114 and focusing only on this patch for now, using the IR from your test, I'll
try to explain my reasoning for why the RAUW is correct and how your patch reduces variable location
availability:

Before EarlyCSE:
%0 = load i32, i32* @a, align 4

; Value of "l_52" is 0.
call void @llvm.dbg.value(metadata i32 0, metadata !20, metadata !DIExpression()), !dbg !24

%1 = load i32, i32* @a, align 4, !dbg !25, !tbaa !26

; Value of "l_52" is found in virtual register %1.
call void @llvm.dbg.value(metadata i32 %1, metadata !20, metadata !DIExpression()), !dbg !24

%sub = sub nsw i32 %1, 0

!20 = !DILocalVariable(name: "l_52",

EarlyCSE detects that the loads into %0 and %1 are exactly the same, so we can remove the later one,
%1. Without your patch, we replace all uses (including debug uses) with %0, because we've proven
that %0 and %1 have the same value.

After EarlyCSE:
%0 = load i32, i32* @a, align 4

; Value of "l_52" is 0.
call void @llvm.dbg.value(metadata i32 0, metadata !20, metadata !DIExpression()), !dbg !24

; This has been removed by EarlyCSE.
; %1 = load i32, i32* @a, align 4, !dbg !25, !tbaa !26

; Value of "l_52" is now found in virtual register %0, previously %1.
call void @llvm.dbg.value(metadata i32 %0, metadata !20, metadata !DIExpression()), !dbg !24

; Sub now uses %0 instead of %1, because %1 has been deleted and they were equivalent.
%sub = sub nsw i32 %0, 0

!20 = !DILocalVariable(name: "l_52",

And again, without my comments, before RemoveRedundantDbgInstrs runs.

After EarlyCSE:
%0 = load i32, i32* @a, align 4
call void @llvm.dbg.value(metadata i32 0, metadata !20, metadata !DIExpression()), !dbg !24
call void @llvm.dbg.value(metadata i32 %0, metadata !20, metadata !DIExpression()), !dbg !24
%sub = sub nsw i32 %0, 0

!20 = !DILocalVariable(name: "l_52",

Notice that the dbg.values are adjacent in the IR and refer to the same variable ("l_52"). The first
says "after '%0 = load ...' the value of l_52 is 0". The second says "after '%0 = load ...' the
value of l_52 is in virtual register %0". It is safe to remove the first one because the second
renders it redundant. This is what RemoveRedundantDbgInstrs does:

After RemoveRedundantDbgInstrs:
%0 = load i32, i32* @a, align 4
; The following dbg.value has been removed.
; call void @llvm.dbg.value(metadata i32 0, metadata !20, metadata !DIExpression()), !dbg !24
call void @llvm.dbg.value(metadata i32 %0, metadata !20, metadata !DIExpression()), !dbg !24
%sub = sub nsw i32 %0, 0

!20 = !DILocalVariable(name: "l_52",

The meaning of the debug-info remains unchanged. We're still saying "after '%0 = load ...' the
value of l_52 is in virtual register %0".

With your patch, we instead see this after EarlyCSE:

After EarlyCSE with your patch:
%0 = load i32, i32* @a, align 4
call void @llvm.dbg.value(metadata i32 0, metadata !20, metadata !DIExpression()), !dbg !24
call void @llvm.dbg.value(metadata i32 undef metadata !20, metadata !DIExpression()), !dbg !24
%sub = sub nsw i32 %0, 0

!20 = !DILocalVariable(name: "l_52",

'undef' dbg.value means that we can't provide a variable location from here until the next
dbg.value. In this example there is no next dbg.value, so we are saying that we cannot provide the
value of "l_52" at any point in the program. However, hopefully I've been able to illustrate that %0
is a valid variable location for l_52 at this point in the IR.

Summary:
Just to be clear; I agree that the behaviour you show in the ticket PR46114 [1] is a bug, I just
think that this patch covers up the problem by reducing variable location coverage, instead of
fixing it. I've written up a possible cause of the bug on the ticket itself.

I hope this helps, please let me know if I've been unclear anywhere.

[1] https://bugs.llvm.org/show_bug.cgi?id=46114

Thanks @Orlando, I think I agree with your reasoning. My patch on introducing an undef is based on the same reasoning (but I could be totally wrong).

Let's see what we should do instead.

Before earlycse, we have:

  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i32, i32* @a, align 4, !dbg !32, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %1, metadata !22, metadata !DIExpression()), !dbg !31

!26 = !DILocation(line: 5, column: 11, scope: !17)
!31 = !DILocation(line: 0, scope: !17)
!32 = !DILocation(line: 6, column: 15, scope: !17)

after earlycse, we have:

  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
; I omitted the variable !21 here since it's irrelevant.
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31

!21 = !DILocalVariable(name: "d", scope: !17, file: !3, line: 5, type: !12)
!22 = !DILocalVariable(name: "l_52", scope: !17, file: !3, line: 5, type: !12)
!26 = !DILocation(line: 5, column: 11, scope: !17)
!31 = !DILocation(line: 0, scope: !17)
!32 = !DILocation(line: 6, column: 20, scope: !17)

Could you please help me verify the following?
Based on my understanding, it seems that when eliminating the second load i32, i32* @a in EarlyCSE, we should use the location !32 instead of !26 for the resulting load i32, i32* @a.
Thanks.

In D80691#2067707, @helloqirun wrote:
Thanks @Orlando, I think I agree with your reasoning. My patch on introducing an undef is based on the same reasoning (but I could be totally wrong).

Let's see what we should do instead.

Before earlycse, we have:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i32, i32* @a, align 4, !dbg !32, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %1, metadata !22, metadata !DIExpression()), !dbg !31

!26 = !DILocation(line: 5, column: 11, scope: !17)
!31 = !DILocation(line: 0, scope: !17)
!32 = !DILocation(line: 6, column: 15, scope: !17)
after earlycse, we have:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
; I omitted the variable !21 here since it's irrelevant.
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31

!21 = !DILocalVariable(name: "d", scope: !17, file: !3, line: 5, type: !12)
!22 = !DILocalVariable(name: "l_52", scope: !17, file: !3, line: 5, type: !12)
!26 = !DILocation(line: 5, column: 11, scope: !17)
!31 = !DILocation(line: 0, scope: !17)
!32 = !DILocation(line: 6, column: 20, scope: !17)
Could you please help me verify the following?
Based on my understanding, it seems that when eliminating the second load i32, i32* @a in EarlyCSE, we should use the location !32 instead of !26 for the resulting load i32, i32* @a.
Thanks.

After EarlyCSE it may be wrong to say that the load %0 = load ... comes from either line 5 (!26)
or line 6 (!32) because it now comes from source code from both lines. There is a method to handle
this case called Instruction::applyMergedLocation. If my reasoning is correct then we'd just need
this:

    + InVal.DefInst->applyMergedLocation(InVal.DefInst->getDebugLoc().get(),
    +                                    Inst.getDebugLoc());
      if (!Inst.use_empty())
        Inst.replaceAllUsesWith(Op);
      salvageKnowledge(&Inst, &AC);
      salvageKnowledge(&Inst, &AC);
      removeMSSA(Inst);

This will give the instruction line 0 for your reproducer. I know a lot less about line numbers than
variable locations, so I'd want someone else to look at that first. @probinson, @jmorse does it
look like that's the correct thing to do here?

Either way, looking a little closer at the ticket, other than the line number issue mentioned above
(which I'm not confident about), I don't think behaviour in the ticket is actually a bug. The
debugger step line (copied below) shows that we're stopping on column 20 of line 6, which is on the
right hand side of the first '=', and IIUC it's reasonable to expect l_52 to no longer be 0 here.

Copied from https://bugs.llvm.org/show_bug.cgi?id=46114
Process 2616 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000400486 a.out`main at abc.c:6:20
   3   	char c;
   4   	int main() {
   5   	  int d = a, l_52 = 0;
-> 6   	  b = (l_52 = a) - c; //   stop here
   7   	  if (d) {
   8   	    int e=1;
   9   	  }

(lldb) p l_52
(int) $0 = 1

In D80691#2068585, @Orlando wrote:
In D80691#2067707, @helloqirun wrote:
Thanks @Orlando, I think I agree with your reasoning. My patch on introducing an undef is based on the same reasoning (but I could be totally wrong).

Let's see what we should do instead.

Before earlycse, we have:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  %1 = load i32, i32* @a, align 4, !dbg !32, !tbaa !27
  call void @llvm.dbg.value(metadata i32 %1, metadata !22, metadata !DIExpression()), !dbg !31

!26 = !DILocation(line: 5, column: 11, scope: !17)
!31 = !DILocation(line: 0, scope: !17)
!32 = !DILocation(line: 6, column: 15, scope: !17)
after earlycse, we have:
  %0 = load i32, i32* @a, align 4, !dbg !26, !tbaa !27
; I omitted the variable !21 here since it's irrelevant.
  call void @llvm.dbg.value(metadata i32 0, metadata !22, metadata !DIExpression()), !dbg !31
  call void @llvm.dbg.value(metadata i32 %0, metadata !22, metadata !DIExpression()), !dbg !31

!21 = !DILocalVariable(name: "d", scope: !17, file: !3, line: 5, type: !12)
!22 = !DILocalVariable(name: "l_52", scope: !17, file: !3, line: 5, type: !12)
!26 = !DILocation(line: 5, column: 11, scope: !17)
!31 = !DILocation(line: 0, scope: !17)
!32 = !DILocation(line: 6, column: 20, scope: !17)
Could you please help me verify the following?
Based on my understanding, it seems that when eliminating the second load i32, i32* @a in EarlyCSE, we should use the location !32 instead of !26 for the resulting load i32, i32* @a.
Thanks.
After EarlyCSE it may be wrong to say that the load %0 = load ... comes from either line 5 (!26)
or line 6 (!32) because it now comes from source code from both lines. There is a method to handle
this case called Instruction::applyMergedLocation. If my reasoning is correct then we'd just need
this:
    + InVal.DefInst->applyMergedLocation(InVal.DefInst->getDebugLoc().get(),
    +                                    Inst.getDebugLoc());
      if (!Inst.use_empty())
        Inst.replaceAllUsesWith(Op);
      salvageKnowledge(&Inst, &AC);
      salvageKnowledge(&Inst, &AC);
      removeMSSA(Inst);
This will give the instruction line 0 for your reproducer. I know a lot less about line numbers than
variable locations, so I'd want someone else to look at that first. @probinson, @jmorse does it
look like that's the correct thing to do here?

Either way, looking a little closer at the ticket, other than the line number issue mentioned above
(which I'm not confident about), I don't think behaviour in the ticket is actually a bug. The
debugger step line (copied below) shows that we're stopping on column 20 of line 6, which is on the
right hand side of the first '=', and IIUC it's reasonable to expect l_52 to no longer be 0 here.
Copied from https://bugs.llvm.org/show_bug.cgi?id=46114
Process 2616 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000400486 a.out`main at abc.c:6:20
   3   	char c;
   4   	int main() {
   5   	  int d = a, l_52 = 0;
-> 6   	  b = (l_52 = a) - c; //   stop here
   7   	  if (d) {
   8   	    int e=1;
   9   	  }

(lldb) p l_52
(int) $0 = 1

@Orlando I think you are probably right. It should indeed prints $0=1. I appreciate your patience and my apology for the report. Shall I close the original PR?

@Orlando I think you are probably right. It should indeed prints $0=1. I appreciate your patience and my apology for the report. Shall I close the original PR?

No problem, happy to help. I still wonder whether the line numbers need merging. I'd keep this patch
open for now to see if anyone has any input on the that. But I suppose it is a separate issue so it
makes sense to close the bugzilla ticket, yes.

Revision Contents

Path

Size

llvm/

lib/

Transforms/

Scalar/

EarlyCSE.cpp

6 lines

test/

DebugInfo/

X86/

earlycse-load-marks-undef.ll

65 lines

Diff 267509

llvm/lib/Transforms/Scalar/EarlyCSE.cpp

Show First 20 Lines • Show All 1,130 Lines • ▼ Show 20 Lines	if (MemInst.isValid() && MemInst.isLoad()) {
Value *Op = getOrCreateResult(InVal.DefInst, Inst.getType());		Value *Op = getOrCreateResult(InVal.DefInst, Inst.getType());
if (Op != nullptr) {		if (Op != nullptr) {
LLVM_DEBUG(dbgs() << "EarlyCSE CSE LOAD: " << Inst		LLVM_DEBUG(dbgs() << "EarlyCSE CSE LOAD: " << Inst
<< " to: " << *InVal.DefInst << '\n');		<< " to: " << *InVal.DefInst << '\n');
if (!DebugCounter::shouldExecute(CSECounter)) {		if (!DebugCounter::shouldExecute(CSECounter)) {
LLVM_DEBUG(dbgs() << "Skipping due to debug counter\n");		LLVM_DEBUG(dbgs() << "Skipping due to debug counter\n");
continue;		continue;
}		}
if (!Inst.use_empty())		if (!Inst.use_empty()) {
		aprantlUnsubmitted Done Reply Inline Actions `// Salvage ...` Does this fit on 80 columns? aprantl: `// Salvage ...` Does this fit on 80 columns?
		// Salvage debug info early to be compatible with
		// removeRedundantDbgInstrsUsingBackwardScan() used in SimplfyCFG.
		salvageDebugInfoOrMarkUndef(Inst);
Inst.replaceAllUsesWith(Op);		Inst.replaceAllUsesWith(Op);
		}
salvageKnowledge(&Inst, &AC);		salvageKnowledge(&Inst, &AC);
removeMSSA(Inst);		removeMSSA(Inst);
Inst.eraseFromParent();		Inst.eraseFromParent();
Changed = true;		Changed = true;
++NumCSELoad;		++NumCSELoad;
continue;		continue;
}		}
}		}
▲ Show 20 Lines • Show All 326 Lines • Show Last 20 Lines

llvm/test/DebugInfo/X86/earlycse-load-marks-undef.ll

This file was added.

				; RUN: opt -mtriple=x86_64-- %s -S --early-cse -o - \| FileCheck %s
				; Ensure that the second dbg.value is undef as RemoveRedundantDbgInstrs
				; will remove the first dbg.value.
				; Bugzilla #46114

				@a = dso_local global i32 1, align 4, !dbg !0
				@c = dso_local global i8 0, align 1, !dbg !9
				@b = dso_local global i16 0, align 2, !dbg !6

				define dso_local void @main() #0 !dbg !15 {
				entry:
				%0 = load i32, i32* @a, align 4
				call void @llvm.dbg.value(metadata i32 0, metadata !20, metadata !DIExpression()), !dbg !24
				%1 = load i32, i32* @a, align 4, !dbg !25, !tbaa !26
				; CHECK: call void @llvm.dbg.value(metadata i32 undef
				call void @llvm.dbg.value(metadata i32 %1, metadata !20, metadata !DIExpression()), !dbg !24
				%sub = sub nsw i32 %1, 0
				%conv1 = trunc i32 %sub to i16
				store i16 %conv1, i16* @b, align 2, !dbg !30, !tbaa !31
				%tobool = icmp ne i32 %0, 0
				unreachable
				}

				; Function Attrs: nounwind readnone speculatable willreturn
				declare void @llvm.dbg.value(metadata, metadata, metadata) #1

				attributes #0 = { "use-soft-float"="false" }
				attributes #1 = { nounwind readnone speculatable willreturn }

				!llvm.dbg.cu = !{!2}
				!llvm.module.flags = !{!13, !14}

				!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
				!1 = distinct !DIGlobalVariable(name: "a", scope: !2, file: !3, line: 1, type: !12, isLocal: false, isDefinition: true)
				!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "clang version 11.0.0 (https://github.com/llvm/llvm-project.git 20c9bb44ec1a4a795215ff6964d264219f9b05f2)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !4, globals: !5, splitDebugInlining: false, nameTableKind: None)
				!3 = !DIFile(filename: "earlycse-load-marks-undef.ll", directory: "/temp/bz46114")
				!4 = !{}
				!5 = !{!0, !6, !9}
				!6 = !DIGlobalVariableExpression(var: !7, expr: !DIExpression())
				!7 = distinct !DIGlobalVariable(name: "b", scope: !2, file: !3, line: 2, type: !8, isLocal: false, isDefinition: true)
				!8 = !DIBasicType(name: "short", size: 16, encoding: DW_ATE_signed)
				!9 = !DIGlobalVariableExpression(var: !10, expr: !DIExpression())
				!10 = distinct !DIGlobalVariable(name: "c", scope: !2, file: !3, line: 3, type: !11, isLocal: false, isDefinition: true)
				!11 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
				!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!13 = !{i32 2, !"Debug Info Version", i32 3}
				!14 = !{i32 1, !"wchar_size", i32 4}
				!15 = distinct !DISubprogram(name: "main", scope: !3, file: !3, line: 4, type: !16, scopeLine: 4, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !2, retainedNodes: !18)
				!16 = !DISubroutineType(types: !17)
				!17 = !{!12}
				!18 = !{!19, !20, !21}
				!19 = !DILocalVariable(name: "d", scope: !15, file: !3, line: 5, type: !12)
				!20 = !DILocalVariable(name: "l_52", scope: !15, file: !3, line: 5, type: !12)
				!21 = !DILocalVariable(name: "e", scope: !22, file: !3, line: 8, type: !12)
				!22 = distinct !DILexicalBlock(scope: !23, file: !3, line: 7, column: 10)
				!23 = distinct !DILexicalBlock(scope: !15, file: !3, line: 7, column: 7)
				!24 = !DILocation(line: 0, scope: !15)
				!25 = !DILocation(line: 6, column: 15, scope: !15)
				!26 = !{!27, !27, i64 0}
				!27 = !{!"int", !28, i64 0}
				!28 = !{!"omnipotent char", !29, i64 0}
				!29 = !{!"Simple C/C++ TBAA"}
				!30 = !DILocation(line: 6, column: 5, scope: !15)
				!31 = !{!32, !32, i64 0}
				!32 = !{!"short", !28, i64 0}

This is an archive of the discontinued LLVM Phabricator instance.

Proposed fix for PR46114Needs ReviewPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 267509

llvm/lib/Transforms/Scalar/EarlyCSE.cpp

llvm/test/DebugInfo/X86/earlycse-load-marks-undef.ll

Proposed fix for PR46114
Needs ReviewPublic