This is an archive of the discontinued LLVM Phabricator instance.

Differential D24164

Remove debug info when hoisting instruction from then/else branch.
ClosedPublic

Authored by danielcdh on Sep 1 2016, 3:50 PM.

Details

Reviewers

kcc
chandlerc
dblaikie
davidxl
echristo

Commits

rG87823f8e4d86: Remove debug info when hoisting instruction from then/else branch.
rL280995: Remove debug info when hoisting instruction from then/else branch.

Summary

The hoisted instruction is executed speculatively. Keeping its original debug location hurts the debugging experience, because the user would see gdb step into code that is not expected to execute on the path taken. It also hurts sample profile accuracy by assigning an incorrect frequency to source lines within the then/else branch.
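
To make the proposed policy concrete, here is a minimal, self-contained C++ sketch. It does not use LLVM's API: `Inst` and `hoistCommonPrefix` are hypothetical names invented for illustration, and line 0 is assumed to stand for "location unknown". Two branch blocks start with identical instructions; hoisting keeps one copy, and when the two copies carry different source lines the hoisted copy loses its line instead of inheriting one branch's line arbitrarily.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction; not an LLVM type.
struct Inst {
  std::string op;  // textual form of the operation, e.g. "store x"
  unsigned line;   // source line, 0 meaning "unknown" (an assumption)
  bool isCall;     // the patch under review leaves calls untouched
};

// Move the common leading instructions of Then/Else to the end of Head.
// Returns the number of hoisted instructions.
std::size_t hoistCommonPrefix(std::vector<Inst> &Head,
                              std::vector<Inst> &Then,
                              std::vector<Inst> &Else) {
  std::size_t N = 0;
  while (N < Then.size() && N < Else.size() && Then[N].op == Else[N].op)
    ++N;
  for (std::size_t I = 0; I < N; ++I) {
    Inst Merged = Then[I];
    // The merged instruction runs on both paths, so neither branch's line
    // describes it. Drop the line unless both copies agree or it is a call.
    if (!Merged.isCall && Then[I].line != Else[I].line)
      Merged.line = 0;
    Head.push_back(Merged);
  }
  Then.erase(Then.begin(), Then.begin() + static_cast<std::ptrdiff_t>(N));
  Else.erase(Else.begin(), Else.begin() + static_cast<std::ptrdiff_t>(N));
  return N;
}
```

Applied to the `x = i` example from the test case, the load and store on lines 7 and 10 would be hoisted with line 0, while the calls to `bar()` and `baz()` stay in their branches with lines 8 and 11.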

Event Timeline

danielcdh updated this revision to Diff 70085. Sep 1 2016, 3:50 PM

danielcdh retitled this revision to "Remove debug info when hoisting instruction from then/else branch."

danielcdh updated this object.

danielcdh added reviewers: davidxl, dblaikie.

danielcdh added a subscriber: llvm-commits.

I would worry that this might produce a worse experience for a sanitizer user, for example, but I'm not sure. (The sanitizer user won't have any trail back to why the store in this example is happening. Granted, the way the code currently works, they may be told the store is happening in the wrong branch, which isn't great either.)

I wouldn't mind getting Eric's, Chandler's, and/or Kostya's perspective on this.

As for the mechanics: you might want to shore up the CHECK lines to actually trace the lines back to the instructions. Depending on the ordering of the DILocations being emitted is probably a bit too brittle.

danielcdh added reviewers: chandlerc, eric_niebler. Sep 1 2016, 8:18 PM

danielcdh edited reviewers, added: echristo; removed: eric_niebler.

andreadb added a subscriber: andreadb. Sep 2 2016, 1:54 AM

In D24164#532594, @dblaikie wrote:

I would worry that this might produce a worse experience for a sanitizer user, for example, but I'm not sure. (The sanitizer user won't have any trail back to why the store in this example is happening. Granted, the way the code currently works, they may be told the store is happening in the wrong branch, which isn't great either.)

I don't know about the sanitizer user experience. However, based on my experience with sample PGO, this kind of change tends to improve the quality of the sample profile from AutoFDO.

@Dehao,
we have identified a couple of similar issues where the debug location of a tail-merged instruction is incorrectly set. We already have plans to send a couple of small patches to fix these issues (those would need D24180); in our sample-PGO experiments, this kind of small fix for hoisted/tail-merged instructions tends to have a very positive impact on the quality of the sample profile.
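
The profile-quality point can be illustrated with a toy sample aggregation. This is only a sketch under stated assumptions: `Sampled` and `samplesPerLine` are hypothetical names, samples are simply summed per source line (which is not claimed to be how AutoFDO aggregates them), and line 0 is treated as "unknown" and skipped.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical record: one machine instruction, the source line it claims,
// and how many profile samples landed on it.
struct Sampled {
  unsigned line;       // 0 is assumed to mean "no known source line"
  unsigned long hits;  // samples observed on this instruction
};

// Sum samples per source line, ignoring instructions with unknown lines.
std::map<unsigned, unsigned long>
samplesPerLine(const std::vector<Sampled> &Insts) {
  std::map<unsigned, unsigned long> PerLine;
  for (const Sampled &S : Insts)
    if (S.line != 0)
      PerLine[S.line] += S.hits;
  return PerLine;
}
```

Suppose the `then` branch runs 10 times and the `else` branch 90 times. A hoisted store that still claims line 7 (inside the `then` branch) collects all 100 samples, so line 7 looks ten times hotter than the `bar()` call on line 8 in the same branch. With the location dropped, those 100 samples are not credited to either branch.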

danielcdh added a reviewer: kcc. Sep 2 2016, 8:14 AM

eric_niebler added a subscriber: eric_niebler. Sep 2 2016, 10:09 AM

This comment was removed by eric_niebler.

In D24164#532702, @andreadb wrote:

@Dehao,
we have identified a couple of similar issues where the debug location of a tail-merged instruction is incorrectly set. We already have plans to send a couple of small patches to fix these issues (those would need D24180); in our sample-PGO experiments, this kind of small fix for hoisted/tail-merged instructions tends to have a very positive impact on the quality of the sample profile.

I also have a related debug-info fix for tail-merging (attached below), but haven't had time to make a unittest and send out for review. It would be great if your patch can cover this case, so that I don't need to prepare a patch for it :)

Could you send us your debug info patches for sample pgo so that we can test them on our workload to see performance impact?

About D24180, we just use -mllvm -use-unknown-locations=true for similar purposes.

Dehao

Index: ../lib/CodeGen/BranchFolding.cpp
--- ../lib/CodeGen/BranchFolding.cpp (revision 280454)
+++ ../lib/CodeGen/BranchFolding.cpp (working copy)
@@ -939,6 +939,11 @@
     MachineBasicBlock *MBB = SameTails[commonTailIndex].getBlock();
 
+    // Remove debug info
+    for (MachineInstr &MI : *MBB)
+      if (!MI.isDebugValue())
+        MI.setDebugLoc(nullptr);
+
     // Recompute common tail MBB's edge weights and block frequency.
     setCommonTailEdgeWeights(*MBB);

@Dehao,
we have identified a couple of similar issues where the debug location of a tail-merged instruction is incorrectly set. We already have plans to send a couple of small patches to fix these issues (those would need D24180); in our sample-PGO experiments, this kind of small fix for hoisted/tail-merged instructions tends to have a very positive impact on the quality of the sample profile.

I also have a related debug-info fix for tail-merging (attached below), but haven't had time to make a unittest and send out for review. It would be great if your patch can cover this case, so that I don't need to prepare a patch for it :)

Could you send us your debug info patches for sample pgo so that we can test them on our workload to see performance impact?

Sure, I will send you our patches (we have a few fixes, not only the tail-merging one, that we can share). If you don't mind, I will send those patches to you on Monday, as it is almost the end of the day here.

About D24180, we just use -mllvm -use-unknown-locations=true for similar purposes.

Interesting, I will look at it.
At the moment I am using D24180 and it seems to be doing a good job with sample pgo. In my experiments, it fixes a few nasty cases where instructions at the top of a basic block are implicitly attributed to the source line for the physically preceding instruction simply because they don't have a debug loc. Anyway, I will explain all the details in the email that I will send to you :-).

I guess your BranchFolding.cpp patch (quoted in full in your comment above) works well because you specify -use-unknown-locations=true (our patch is almost identical, by the way).

Cheers,
Andrea

In D24164#533199, @andreadb wrote:

About D24180, we just use -mllvm -use-unknown-locations=true for similar purposes.

Interesting, I will look at it.
At the moment I am using D24180 and it seems to be doing a good job with sample pgo. In my experiments, it fixes a few nasty cases where instructions at the top of a basic block are implicitly attributed to the source line for the physically preceding instruction simply because they don't have a debug loc. Anyway, I will explain all the details in the email that I will send to you :-).

Using -mllvm -use-unknown-locations=true is a superset of D24180. The latter tries to be smarter about where the line-0 attributions are really needed, in order to minimize the size penalty in the debug line table. Also, D24180 works for CodeView as well as DWARF.
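
The trade-off discussed here (implicit attribution to the preceding line versus an explicit line-0 entry that costs table space) can be sketched with a toy line table. This is an illustration only, not LLVM's emitter: `MInst`, `Row`, `buildLineTable`, and `lineFor` are hypothetical names, and the sketch models the simple "always emit line 0" behaviour rather than D24180's more selective approach.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical machine instruction: address, source line, and whether it
// carries a debug location at all.
struct MInst {
  std::uint64_t addr;
  unsigned line;
  bool hasLoc;
};

// One line-table row: applies from addr up to the next row's addr.
struct Row {
  std::uint64_t addr;
  unsigned line;
};

// Build a toy line table. Instructions without a location emit no row,
// unless EmitLineZero is set, in which case they emit an explicit line-0 row.
std::vector<Row> buildLineTable(const std::vector<MInst> &Insts,
                                bool EmitLineZero) {
  std::vector<Row> Rows;
  for (const MInst &MI : Insts) {
    unsigned Line;
    if (MI.hasLoc)
      Line = MI.line;
    else if (EmitLineZero)
      Line = 0;
    else
      continue;  // no row: the previous row keeps covering this address
    if (Rows.empty() || Rows.back().line != Line)
      Rows.push_back({MI.addr, Line});
  }
  return Rows;
}

// Line a consumer would report for Addr: the last row at or before it.
unsigned lineFor(const std::vector<Row> &Rows, std::uint64_t Addr) {
  unsigned Line = 0;
  for (const Row &R : Rows) {
    if (R.addr > Addr)
      break;
    Line = R.line;
  }
  return Line;
}
```

For an instruction at address 0x4 with no location, sitting between code for line 6 and code for line 8, the table without line-0 rows reports line 6 for it, while the table with line-0 rows reports 0 at the cost of one extra row.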

ping...

Chandler, Eric and Kostya, could you shed some light on this patch?

Thanks,
Dehao

echristo added inline comments. Sep 6 2016, 6:27 PM

lib/Transforms/Utils/SimplifyCFG.cpp
1256	I think you have a use after free here, I2->eraseFromParent() above.
1256	This needs a comment about the point of what we're doing. That said, while this may help the AFDO type of use case, I think it might make the debug experience worse for sanitizers/debuggers. For the easy case, take one of the instructions that's being hoisted and make it a load of NULL. If we assert/trap/etc. on this, then the backtrace will give us a fairly unhelpful line in the basic block that doesn't correlate to any of the source that the user wrote.

dblaikie added inline comments. Sep 6 2016, 6:29 PM

lib/Transforms/Utils/SimplifyCFG.cpp
1256	That's my concern as well - but conversely, even if we did pick a location for this, half the time we'd pick the wrong one which would also be confusing for the user (we could diagnose a null dereference that we would describe as coming from the 'if' block, even though the 'else' was taken - the user could easily see a contradiction here that may be quite confusing). So, dunno what the right call is - just has some complications.

echristo added inline comments. Sep 6 2016, 6:32 PM

lib/Transforms/Utils/SimplifyCFG.cpp
1256	Well, we'd at least pick a line of code that resembled what failed rather than something from another block...

dblaikie added inline comments. Sep 6 2016, 6:35 PM

lib/Transforms/Utils/SimplifyCFG.cpp
1256	Yep. Alternatively we could use line zero - more correct, technically, but probably no better for most debuggers/tools (& would make the line tables bigger - coming back to Paul's recent patch/direction/discussion)

danielcdh added inline comments. Sep 6 2016, 6:41 PM

lib/Transforms/Utils/SimplifyCFG.cpp
1256	So the impact of this patch (or changing it to line 0) will be: for debugging/sanitizers, it will make an already-bad situation a little worse; for AFDO, it will make a very bad situation much better. Looks like it's still a win?

andreadb added inline comments. Sep 7 2016, 4:10 AM

lib/Transforms/Utils/SimplifyCFG.cpp
1256	It indeed looks like a win to me. As David said, the debug location was already wrong in the first place. To me, this is clearly one of those cases where we should use line zero. That said, I will let Eric/Paul/David decide, as I don't claim to be a debug info expert.

Still need to fix the use after free and add the comment.

That said, also please add a FIXME to use location 0 rather than ignoring the location completely.

Thanks!

-eric

I'm pretty sure DWARF inherently can't attribute the same instruction to two source locations, so I think it is preferable to say "don't know" all the time than to give the wrong answer half the time.
I'd be curious how the sanitizers actually handle the line-0 case.

In D24164#537288, @probinson wrote:

I'm pretty sure DWARF inherently can't attribute the same instruction to two source locations, so I think it is preferable to say "don't know" all the time than to give the wrong answer half the time.

Right, but I'd prefer an answer that's slightly inaccurate to one that attributes the location to a source line that just doesn't even resemble it :)

I'd be curious how the sanitizers actually handle the line-0 case.

Ditto.

update

Herald added a subscriber: mehdi_amini. Sep 8 2016, 11:06 AM

WFM.

-eric

This revision is now accepted and ready to land. Sep 8 2016, 11:08 AM

danielcdh closed this revision. Sep 8 2016, 3:01 PM

andreadb mentioned this in D27468: When GVN removes a redundant load, it should not modify the debug location of the dominating load. Dec 6 2016, 1:06 PM

Revision Contents

Path                                          Size
lib/Transforms/Utils/SimplifyCFG.cpp          8 lines
test/Transforms/SimplifyCFG/remove-debug.ll   83 lines

Diff 70728

lib/Transforms/Utils/SimplifyCFG.cpp

 unsigned KnownIDs[] = {LLVMContext::MD_tbaa,
                        LLVMContext::MD_invariant_load,
                        LLVMContext::MD_nonnull,
                        LLVMContext::MD_invariant_group,
                        LLVMContext::MD_align,
                        LLVMContext::MD_dereferenceable,
                        LLVMContext::MD_dereferenceable_or_null,
                        LLVMContext::MD_mem_parallel_loop_access};
 combineMetadata(I1, I2, KnownIDs);
 
+// If the debug loc for I1 and I2 are different, as we are combining them
+// into one instruction, we do not want to select debug loc randomly from
+// I1 or I2. Instead, we set the 0-line DebugLoc to note that we do not
+// know the debug loc of the hoisted instruction.
+if (!isa<CallInst>(I1) && I1->getDebugLoc() != I2->getDebugLoc())
+  I1->setDebugLoc(DebugLoc());
+
 I2->eraseFromParent();
 Changed = true;
 
 I1 = &*BB1_Itr++;
 I2 = &*BB2_Itr++;
 // Skip debug info if it is not identical.
 DbgInfoIntrinsic *DBI1 = dyn_cast<DbgInfoIntrinsic>(I1);
 DbgInfoIntrinsic *DBI2 = dyn_cast<DbgInfoIntrinsic>(I2);
 if (!DBI1 || !DBI2 || !DBI1->isIdenticalToWhenDefined(DBI2)) {
   while (isa<DbgInfoIntrinsic>(I1))
     I1 = &*BB1_Itr++;
   while (isa<DbgInfoIntrinsic>(I2))

test/Transforms/SimplifyCFG/remove-debug.ll

This file was added.

; RUN: opt < %s -simplifycfg -S | FileCheck %s

; TODO: Track the actual DebugLoc of the hoisted instruction when no-line
; DebugLoc is supported (https://reviews.llvm.org/D24180)
; CHECK: line: 6
; CHECK-NOT: line: 7
; CHECK: line: 8
; CHECK: line: 9
; CHECK-NOT: line: 10
; CHECK: line: 11

; Checks if the debug info for hoisted "x = i" is removed
; int x;
; void bar();
; void baz();
;
; void foo(int i) {
;   if (i == 0) {
;     x = i;
;     bar();
;   } else {
;     x = i;
;     baz();
;   }
; }

target triple = "x86_64-unknown-linux-gnu"

@x = global i32 0, align 4

; Function Attrs: uwtable
define void @_Z3fooi(i32) #0 !dbg !6 {
  %2 = alloca i32, align 4
  store i32 %0, i32* %2, align 4, !tbaa !8
  %3 = load i32, i32* %2, align 4, !dbg !12, !tbaa !8
  %4 = icmp eq i32 %3, 0, !dbg !13
  br i1 %4, label %5, label %7, !dbg !12

; <label>:5:
  %6 = load i32, i32* %2, align 4, !dbg !14, !tbaa !8
  store i32 %6, i32* @x, align 4, !dbg !15, !tbaa !8
  call void @_Z3barv(), !dbg !16
  br label %9, !dbg !17

; <label>:7:
  %8 = load i32, i32* %2, align 4, !dbg !18, !tbaa !8
  store i32 %8, i32* @x, align 4, !dbg !19, !tbaa !8
  call void @_Z3bazv(), !dbg !20
  br label %9

; <label>:9:
  ret void, !dbg !21
}

declare void @_Z3barv() #1

declare void @_Z3bazv() #1

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4}

!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1)
!1 = !DIFile(filename: "a", directory: "b/")
!2 = !{}
!3 = !{i32 2, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{}
!6 = distinct !DISubprogram(unit: !0)
!7 = !DISubroutineType(types: !2)
!8 = !{!9, !9, i64 0}
!9 = !{!"int", !10, i64 0}
!10 = !{!"omnipotent char", !11, i64 0}
!11 = !{!"Simple C++ TBAA"}
!12 = !DILocation(line: 6, column: 7, scope: !6)
!13 = !DILocation(line: 6, column: 9, scope: !6)
!14 = !DILocation(line: 7, column: 9, scope: !6)
!15 = !DILocation(line: 7, column: 7, scope: !6)
!16 = !DILocation(line: 8, column: 5, scope: !6)
!17 = !DILocation(line: 9, column: 3, scope: !6)
!18 = !DILocation(line: 10, column: 9, scope: !6)
!19 = !DILocation(line: 10, column: 7, scope: !6)
!20 = !DILocation(line: 11, column: 5, scope: !6)
!21 = !DILocation(line: 13, column: 1, scope: !6)
