This is an archive of the discontinued LLVM Phabricator instance.

[GVN] Add default debug location when constructing PHI nodes
AbandonedPublic

Authored by StephenTozer on Mar 15 2019, 8:21 AM.

Download Raw Diff

Details

Reviewers

probinson
brzycki
gbedwell
aprantl
vsk
twoh

Summary

The GVN pass will attempt to eliminate repeated load instructions with non-local dependencies where possible by constructing PHI nodes. The PHI node that directly replaces the load instruction is currently assigned the debug location associated with that load; however, any other predecessor PHI nodes constructed as part of this operation have their debug information left blank. This patch changes the PHI construction code to add a default Line 0 Function-scope debug location for when no more specific information is available.

Diff Detail

Event Timeline

StephenTozer created this revision.Mar 15 2019, 8:21 AM

Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2019, 8:21 AM

Herald added subscribers: llvm-commits, hiraditya. · View Herald Transcript

aprantl added inline comments.Mar 15 2019, 8:53 AM

llvm/lib/Transforms/Utils/SSAUpdater.cpp
283	This comment describes what the next line is doing, which is pretty obvious :-) Could you please change it to describe why we are doing that? Basically what you wrote in the review description above?

Update comment to give more context to the use of a Line 0 debug location.

Does this need to be line zero, or could it be no location (the absence of a dbgloc)?

Zero is necessary for inlinable calls (since they can't have no dbgloc - for long standing reasons) or I guess for some other cases where we want to be very clear that this instruction isn't from this line of code. But otherwise it's generally acceptable to have no location (which is what debug loc merging does except for calls) & let any machine code end up with the location of the preceeding instructions.

Using zero locations can/will grow the line table size (having to describe different locations (zero, no-zero, zero, non-zero) for finer slices of the instruction stream) and potentially make debugging more difficult (depending on how the consumers renders this experience to users).

I believe this is a patch to fix PR37964?

In D59417#1431042, @probinson wrote:

I believe this is a patch to fix PR37964?

Hrm - if debugify diagnoses all instances of instructions without locations as buggy, that sounds very noisy to me. How does it handle all the other cases of merged locations that end up with no location? (if my recollection is accurate and the code is still doing this - merged locations that aren't the same and that aren't calls get no location, not a zero location)

In D59417#1431111, @dblaikie wrote:

In D59417#1431042, @probinson wrote:

I believe this is a patch to fix PR37964?

Hrm - if debugify diagnoses all instances of instructions without locations as buggy, that sounds very noisy to me. How does it handle all the other cases of merged locations that end up with no location? (if my recollection is accurate and the code is still doing this - merged locations that aren't the same and that aren't calls get no location, not a zero location)

It looks like DILocation::getMergedLocation() always returns a zero location, and that Instruction::applyMergedLocation() no longer restricts itself to calls. This changed with D45396+D45397: we wanted mem2reg to set merged locations on phis to help users narrow down the scope of a crashing instruction. Phis often cannot assume a previous location (say, if they're at the start of a block).

If it's ok for an instruction to have no location, I think we should try to either define those cases in a way that can be checked, or relax any verifiers if that proves too difficult. I'm not sure what those cases are, though. I think it's reasonable to require locations on newly-created instructions, or to update the location of a moved instruction using a neighboring instruction.

In D59417#1431153, @vsk wrote:

In D59417#1431111, @dblaikie wrote:

In D59417#1431042, @probinson wrote:

I believe this is a patch to fix PR37964?

Hrm - if debugify diagnoses all instances of instructions without locations as buggy, that sounds very noisy to me. How does it handle all the other cases of merged locations that end up with no location? (if my recollection is accurate and the code is still doing this - merged locations that aren't the same and that aren't calls get no location, not a zero location)

It looks like DILocation::getMergedLocation() always returns a zero location, and that Instruction::applyMergedLocation() no longer restricts itself to calls. This changed with D45396+D45397: we wanted mem2reg to set merged locations on phis to help users narrow down the scope of a crashing instruction.

Thanks for citing the previous reviews & reminding me of the facts of the matter (clearly I asked the same questions then and got the right answers/data to justify the change).

Phis often cannot assume a previous location (say, if they're at the start of a block).

Not sure I follow this bit - setting a zero location doesn't make them any easier to debug, does it? Or was it that the "cascade location" (whatever the previous instruction was) the source of confusion?

As for instructions at the start of the block - I'm pretty sure that a while back @probinson made a change to ensure that cascade locations couldn't occur at the start of a block (that we'd set to line zero down in the AsmPrinter/DwarfDebug if that occurred) so we shouldn't end up with super weird/layout-based cascade locations for instructions generated from phis at the start of blocks.

Aren't phis *always* at the start of a block?
If that's the case, I'm tempted to say phis should be allowed to have no source location, and we should fix the debugify verifier accordingly. Sometimes a phi does have an appropriate source location, which is fine. I supposed sometimes a phi ought to have a real location, but doesn't, and the verifier will stop helping us to find those cases; but it will save a bit of memory not to attach source locations to every phi in the world, and have no practical effect on the final line table (because those will turn into line-0 anyway).

Phis often cannot assume a previous location (say, if they're at the start of a block).

Not sure I follow this bit - setting a zero location doesn't make them any easier to debug, does it? Or was it that the "cascade location" (whatever the previous instruction was) the source of confusion?

Setting a zero location on the phi shows the user that the crash occurred in 'inner()', even though it was inlined into 'outer()'. That makes things less confusing (say, if 'inner' is a lot shorter than 'outer'). Of course, the only reason this is helpful is because the zero location on the phi cascades down to the crashy StoreInst.

As for instructions at the start of the block - I'm pretty sure that a while back @probinson made a change to ensure that cascade locations couldn't occur at the start of a block (that we'd set to line zero down in the AsmPrinter/DwarfDebug if that occurred) so we shouldn't end up with super weird/layout-based cascade locations for instructions generated from phis at the start of blocks.

That makes sense. What if a block has multiple phis? I.e. if the first phi in a block has a location but the second phi doesn't, is that a bug, or do we expect the location to cascade?

In D59417#1431369, @vsk wrote:

As for instructions at the start of the block - I'm pretty sure that a while back @probinson made a change to ensure that cascade locations couldn't occur at the start of a block (that we'd set to line zero down in the AsmPrinter/DwarfDebug if that occurred) so we shouldn't end up with super weird/layout-based cascade locations for instructions generated from phis at the start of blocks.

That makes sense. What if a block has multiple phis? I.e. if the first phi in a block has a location but the second phi doesn't, is that a bug, or do we expect the location to cascade?

Locations cascade, with one exception: If the first instruction after a label does not have an explicit source location, we emit line 0.
This is at the end of CodeGen when we're turning machine instructions into assembler text, so the phis have long since been translated to machine instructions or evaporated, but the idea is the same.

In D59417#1431496, @probinson wrote:

In D59417#1431369, @vsk wrote:

As for instructions at the start of the block - I'm pretty sure that a while back @probinson made a change to ensure that cascade locations couldn't occur at the start of a block (that we'd set to line zero down in the AsmPrinter/DwarfDebug if that occurred) so we shouldn't end up with super weird/layout-based cascade locations for instructions generated from phis at the start of blocks.

That makes sense. What if a block has multiple phis? I.e. if the first phi in a block has a location but the second phi doesn't, is that a bug, or do we expect the location to cascade?

Locations cascade, with one exception: If the first instruction after a label does not have an explicit source location, we emit line 0.

That sounds like a sensible rule. Presumably that line 0 location has the correct scope set. Shall I relax the debugify verifier so that it only diagnoses non-phi instructions without locations? I.e. are phis special in some way, or would the relaxed check still be too strict (& any 'no location' diagnostic could be noisy)?

In D59417#1433403, @vsk wrote:

In D59417#1431496, @probinson wrote:

In D59417#1431369, @vsk wrote:

As for instructions at the start of the block - I'm pretty sure that a while back @probinson made a change to ensure that cascade locations couldn't occur at the start of a block (that we'd set to line zero down in the AsmPrinter/DwarfDebug if that occurred) so we shouldn't end up with super weird/layout-based cascade locations for instructions generated from phis at the start of blocks.

That makes sense. What if a block has multiple phis? I.e. if the first phi in a block has a location but the second phi doesn't, is that a bug, or do we expect the location to cascade?

Locations cascade, with one exception: If the first instruction after a label does not have an explicit source location, we emit line 0.

That sounds like a sensible rule. Presumably that line 0 location has the correct scope set. Shall I relax the debugify verifier so that it only diagnoses non-phi instructions without locations? I.e. are phis special in some way, or would the relaxed check still be too strict (& any 'no location' diagnostic could be noisy)?

AsmPrinter/DwarfDebug are careful to track the most recent emitted location, and change only the line number. This gets the desired effect in DWARF (line 0 = no specific location) without a lot of bloat resetting the file & column numbers as well.

Somebody else should chime in about relaxing the verifier, as it was my idea and I often don't know what I'm talking about. :-)

Closed, since as per the discussion above this change is undesired and unnecessary.

vsk mentioned this in D100446: [Transforms][Debugify] Fix "Missing line" false alarm on PHI nodes.Apr 22 2021, 1:57 PM

Revision Contents

Path

Size

llvm/

lib/

Transforms/

Utils/

SSAUpdater.cpp

7 lines

test/

Transforms/

GVN/

debugloc.ll

5 lines

Diff 190844

llvm/lib/Transforms/Utils/SSAUpdater.cpp

Show All 13 Lines
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/TinyPtrVector.h"		#include "llvm/ADT/TinyPtrVector.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/IR/BasicBlock.h"		#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/CFG.h"		#include "llvm/IR/CFG.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
		#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DebugLoc.h"		#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/IR/Use.h"		#include "llvm/IR/Use.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/IR/ValueHandle.h"		#include "llvm/IR/ValueHandle.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
▲ Show 20 Lines • Show All 244 Lines • ▼ Show 20 Lines	public:
}		}

/// CreateEmptyPHI - Create a new PHI instruction in the specified block.		/// CreateEmptyPHI - Create a new PHI instruction in the specified block.
/// Reserve space for the operands but do not fill them in yet.		/// Reserve space for the operands but do not fill them in yet.
static Value CreateEmptyPHI(BasicBlock BB, unsigned NumPreds,		static Value CreateEmptyPHI(BasicBlock BB, unsigned NumPreds,
SSAUpdater *Updater) {		SSAUpdater *Updater) {
PHINode *PHI = PHINode::Create(Updater->ProtoType, NumPreds,		PHINode *PHI = PHINode::Create(Updater->ProtoType, NumPreds,
Updater->ProtoName, &BB->front());		Updater->ProtoName, &BB->front());
		// The PHI nodes created here may not have any reasonable mapping to the
		aprantlUnsubmitted Not Done Reply Inline Actions This comment describes what the next line is doing, which is pretty obvious :-) Could you please change it to describe why we are doing that? Basically what you wrote in the review description above? aprantl: This comment describes what the next line is doing, which is pretty obvious :-) Could you…
		// source code; use Line 0 as a default location.
		if (BB->getParent()->getSubprogram()) {
		PHI->setDebugLoc(DILocation::get(BB->getContext(), 0, 0,
		BB->getParent()->getSubprogram()));
		}
return PHI;		return PHI;
}		}

/// AddPHIOperand - Add the specified value as an operand of the PHI for		/// AddPHIOperand - Add the specified value as an operand of the PHI for
/// the specified predecessor block.		/// the specified predecessor block.
static void AddPHIOperand(PHINode PHI, Value Val, BasicBlock *Pred) {		static void AddPHIOperand(PHINode PHI, Value Val, BasicBlock *Pred) {
PHI->addIncoming(Val, Pred);		PHI->addIncoming(Val, Pred);
}		}
▲ Show 20 Lines • Show All 206 Lines • Show Last 20 Lines

llvm/test/Transforms/GVN/debugloc.ll

	; RUN: opt < %s -gvn -S \| FileCheck %s			; RUN: opt < %s -gvn -S \| FileCheck %s
				; CHECK-LABEL: @foo(
				; CHECK-SAME: !dbg ![[FNDBG:[0-9]+]]
	; CHECK: {{^}}for.body:			; CHECK: {{^}}for.body:
	; CHECK-NEXT: [[VREG1:%[^ ]+]] = phi{{.}}[[VREG2:%[^ ]+]],{{.}}%.sink,			; CHECK-NEXT: [[VREG1:%[^ ]+]] = phi{{.}}[[VREG2:%[^ ]+]],{{.}}%.sink,
	; CHECK-NOT: !dbg			; CHECK: !dbg ![[DBG:[0-9]+]]
	; CHECK-SAME: {{$}}			; CHECK-SAME: {{$}}
	; CHECK: {{^}}for.inc:			; CHECK: {{^}}for.inc:
	; CHECK-NEXT: [[VREG2]] = phi{{.}}%inc,{{.}}[[VREG1]]			; CHECK-NEXT: [[VREG2]] = phi{{.}}%inc,{{.}}[[VREG1]]
				; CHECK: ![[DBG]] = !DILocation(line: 0, scope: ![[FNDBG]]

	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	@g = external local_unnamed_addr global i32, align 4			@g = external local_unnamed_addr global i32, align 4

	; Function Attrs: nounwind uwtable			; Function Attrs: nounwind uwtable
	define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr #0 !dbg !4 {			define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr #0 !dbg !4 {
	entry:			entry:
	▲ Show 20 Lines • Show All 62 Lines • Show Last 20 Lines