This is an archive of the discontinued LLVM Phabricator instance.

Differential D6497

Prevent the creation of empty location list ranges.
Needs Review, Public

Authored by friss on Dec 2 2014, 8:04 PM.

Details

Reviewers

dblaikie
samsonov
echristo
aprantl

Summary

We happen to create such ranges today depending on the scheduling of the
DBG_VALUE instructions. An empty range is not only useless, it can also
be a real issue if it happens to span the 0-0 address range (address ranges
in location lists are function relative). 0-0 is defined as the end of
location list marker, thus if such a range is generated it will hide the
other entries of the list. A DWARF consumer like llvm-dwarfdump that
tries to read the .debug_loc section linearly will get totally lost (this
way of reading .debug_loc could arguably be considered an llvm-dwarfdump
bug though).
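
To make the hazard concrete, here is a minimal sketch of a linear reader for a pre-DWARF 5 `.debug_loc` list, where a pair of zero offsets terminates the list. `LocEntry` and `countVisibleEntries` are hypothetical names used only for illustration, not LLVM APIs, and the sketch ignores base-address selection entries and the location expression bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified location list entry: a pair of function-relative
// offsets. Real entries also carry a location expression.
struct LocEntry {
  uint64_t Begin;
  uint64_t End;
};

// Counts the entries a consumer sees before it reaches the end-of-list
// marker, which is encoded as a (0, 0) pair.
std::size_t countVisibleEntries(const std::vector<LocEntry> &List) {
  std::size_t Visible = 0;
  for (const LocEntry &E : List) {
    if (E.Begin == 0 && E.End == 0)
      break; // An empty 0-0 range is indistinguishable from the terminator.
    ++Visible;
  }
  return Visible;
}
```

With a list such as `{0,0}, {0,4}, {4,16}, {0,0}`, the leading empty range makes the reader stop immediately, so the two real entries are never seen.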

Note that the patch as-is replaces a std::pair by a struct containing 'first'
and 'second' members. I did this to expose only the logic changes in this
patch. I can do proper renaming as a first step if we agree on the patch's
direction.

Note also that the patch doesn't contain a real test for the empty range
deletion. It might be tricky to come up with something, but I can try if
we agree that the patch DTRT. The testing part of this patch is the fixup
of 2 tests that happened to pass by mistake as the variables they were
testing referred to invalid empty location lists. Once these ranges are gone,
the variable is gone too. I XFAILed one test that wouldn't test anything
if modified and I removed the failing parts of the other one (turns out it
was already failing on linux targets).

Event Timeline

friss updated this revision to Diff 16849.Dec 2 2014, 8:04 PM

friss retitled this revision from to Prevent the creation of empty location list ranges..

friss updated this object.

friss added reviewers: echristo, dblaikie, aprantl, samsonov.

friss added a subscriber: Unknown Object (MLST).

Yeah, I'm mildly curious to get a better understanding of where this comes up & whether we should just be dropping the dbg.value intrinsics earlier rather than waiting until we find that they describe nothing of interest - but it seems totally plausible that that wouldn't be easy/clean/convenient to do and that this (what you've proposed) is the right place to do it.

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp
48	I don't follow what's going on here - maybe a comment is in order? (not sure why it's only pushing back if the back() is true). I guess whatever the answer is also explains why SeenInstructionInRange starts with size 1 rather than empty. OK, I think I see what's going on here - rather than having an entry in SeenInstructionInRange per live range, you avoid having them in the case where two ranges start together (or are not "valid" before the second one starts, at least). Bit subtle, but I think I get it - an alternative idea/question: How do you avoid having to walk all the elements of SeenInstructionInRange when performing validateOpenRanges? I would've imagined/expected that you'd need to set all the elements of the container to true, not just the last one, no? In any case - could we just keep booleans directly in the InstrRanges - and validateOpenRanges would iterate through InstrRanges and set them all to true?
50	The pointer to the SmallVector element could end up dangling if the SmallVector is reallocated.
lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h
39	Do we need one of these on every InstrRange? I suppose this comes up if we have multiple open InstrRanges at the same time? Do we support/implement that scenario? (I imagine we probably don't - in which case we only need "NonEmpty" for the currently open InstrRange - so perhaps InstrRanges should be a struct of SmallVector<InstrRange> + bool *NonEmpty, so we just have it once per InstrRanges, rather than once for every range within the InstrRanges)
48	'validate' seems a bit vague, but maybe sufficient
test/DebugInfo/X86/block-capture.ll
9 ↗	(On Diff #16849)	Just because we don't generate any locations doesn't mean we should omit the variable entirely - maybe a comment explaining that the missing variable is a (rather severe) bug, even if it doesn't have any location (another, perhaps less severe, bug, most likely)

In D6497#4, @dblaikie wrote:

Yeah, I'm mildly curious to get a better understanding of where this comes up & whether we should just be dropping the dbg.value intrinsics earlier rather than waiting until we find that they describe nothing of interest - but it seems totally plausible that that wouldn't be easy/clean/convenient to do and that this (what you've proposed) is the right place to do it.

In my opinion this is the only place where we can safely do that, because it's the only place where we are sure that Instructions are at their final place in the asm flow.

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp
48	Basically each entry in SeenInstructionInRange represents the status of a series of DBG_VALUEs that start at the same PC. If the last entry is false, we can just reuse it: our DBG_VALUE starts at the same PC as all the current still-empty ranges. If the last entry is true, we need to add a new one that will stay false until the next real instruction. Thus the container always contains only 'true' elements except maybe the last one, which indicates that there are newly opened ranges that haven't seen an instruction yet. And the fact that a range is opened first doesn't mean it'll get closed first. We need to keep the old 'true' values that are pointed to live. It's a bit subtle, but it's the only way I could think of that keeps the pass algorithmic complexity the same. I fear that iterating the InstrRanges could lead to some really bad pathological cases. If you want to store the flag directly in the InstrRange, you need a way to iterate only the non-validated ones to mark them non-empty. This isn't trivial as the primary storage of the InstrRange is in a SmallVector, thus their address isn't stable (I haven't investigated if there would be a drawback to storing them in a list instead). I prefer the simplicity of just toggling the status for all the current entries by just modifying a bool value that everybody points to.
50	Bummer... I knew I had a good reason for having implemented that as a list first. And then I added the call to SeenInstructionInRange.resize() and thought that this would definitely be more efficient with a vector.
lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h
39	We have multiple ranges in the same vector with different statuses. Today it's due to what I consider a bug: we never close ranges that aren't described by a register (i.e. a constant variable). But ultimately, even for registers it might make sense to have overlapping ranges for the same variable (the fact that a new DBG_VALUE describes the variable doesn't mean that the value mysteriously disappeared from the previous register). So yes, I explicitly designed this solution to support that scenario.
48	In an earlier iteration the added field in InstrRange was called 'Valid'. I could rename that to markOpenRangesNonEmpty().
test/DebugInfo/X86/block-capture.ll
9 ↗	(On Diff #16849)	I thought this was implicit due to the XFAIL, but I can make it even more visible.

aprantl added inline comments.Dec 3 2014, 9:23 AM

test/DebugInfo/X86/block-capture.ll
9 ↗	(On Diff #16849)	I'll have a look at this and see how we can fix the test case.
test/DebugInfo/X86/dbg-value-inlined-parameter.ll
34	If we drop the DW_TAG_formal_parameter completely, the function signature doesn't match the subroutine type any more. I think the formal parameter should still be there, even if it has no location.

Implement SeenInstructionInRange as a forward_list to guarantee element pointer stability.
Add a more in-depth comment.
Rename validateOpenRanges to markOpenRangesNonEmpty.
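
The shared-flag scheme discussed in the inline comments can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: `SeenFlags` and `flagForNewRange` are hypothetical names, and the newest flag lives at the front of a `std::forward_list` rather than at the back of a vector. The point is that `push_front` on a `std::forward_list` never moves existing nodes, so pointers handed out earlier stay valid, which is not guaranteed when a `SmallVector` reallocates.

```cpp
#include <forward_list>

// Hypothetical helper: one bool per group of ranges opened at the same PC.
class SeenFlags {
  std::forward_list<bool> Flags; // front() is the most recent flag.

public:
  SeenFlags() { Flags.push_front(false); }

  // Called when a range is opened. Reuses the newest flag while no real
  // instruction has been seen since it was created, otherwise adds a new one.
  const bool *flagForNewRange() {
    if (Flags.front())
      Flags.push_front(false);
    return &Flags.front();
  }

  // Called for every real instruction. Constant time: every still-empty open
  // range points at the newest flag, so flipping it updates all of them.
  void markOpenRangesNonEmpty() { Flags.front() = true; }
};
```

A range would store the returned pointer and, when it is closed, keep itself only if the pointed-to flag is true.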

I regenerated block-capture.ll from source in r223492.

dblaikie added inline comments.Dec 8 2014, 9:45 AM

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp
49	So 'SeenInstructionInRange' will just keep growing as the function grows? Should we try to limit it in any way. Perhaps an alternative strategy would be to have a "currently unseen" std::shared_ptr<bool>, and then in "markOpenRangesNonEmpty" it would be: if (CurrentlySeen) CurrentlySeen = true; CurrentlySeen = llvm::make_unique<bool>(false); and then in "endInstrRange" the std::shared_ptr<bool> would be: if (Ranges.back().NonEmpty) { Ranges.back().second = &MI; Ranges.back().NonEmpty = nullptr; } ... (would this make it any easier to move the NonEmpty std::shared_ptr<bool> to the InstrRanges rather than into its elements (so we just have one per range set, rather than one per range)? I forget) In any case, hoping Alexey can weigh in, as the person who (re)wrote most of this code most recently.

friss added inline comments.Dec 8 2014, 12:22 PM

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp
49	Yes, SeenInstructionInRange only grows with the function. It will however never contain more items than there are ranges, thus the space complexity of the DbgValueHistoryCalculator stays the same. Your suggestion would work, but I'm not sure it is easier to grok for the reader :-) Putting the status in the range set rather than in the range itself would require tracking how many open ranges are empty for this set, which shouldn't be an issue. Piggybacking on your shared_ptr idea, here's another one requiring at most one shared_ptr<bool> at a time and with lifetime semantics a bit simpler than in your proposal: make NonEmpty a weak_ptr (and call it Empty). The value of the pointed bool doesn't matter, but the weak_ptr validity is the real information. markOpenRanges becomes: if (SeenInstructionInRange.use_count()) // Are there empty open ranges? SeenInstructionInRange = make_shared<bool>(true); // This invalidates all current weak_ptrs and in endInstrRange: if (!Ranges.back().Empty.lock()) Ranges.back().second = &MI; else Ranges.pop_back(); There is some hidden complexity in the shared_ptr handling though (management of the shared_ptr use lists), which makes me believe that the proposed solution is still more efficient. I don't think this is a big deal because these lists should be short, but I wouldn't be surprised if some pathological case shows up one day. (and yes, Alexey's input would be much welcome)
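
A rough, self-contained sketch of the weak_ptr idea from the comment above, using hypothetical names (`RangeSketch`, `RangeTracker`) and plain ints in place of `MachineInstr` pointers. One deliberate difference from the pseudocode in the comment: `std::shared_ptr::use_count()` counts only shared owners, not weak_ptrs, so this sketch tracks "are there empty open ranges?" with a separate `TokenInUse` flag, which is an assumption of this illustration.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for InstrRange.
struct RangeSketch {
  int Start = 0;
  int End = -1;              // -1 means the range is still open.
  std::weak_ptr<bool> Empty; // Still valid => no instruction seen yet.
};

class RangeTracker {
  std::vector<RangeSketch> Ranges;
  std::shared_ptr<bool> Token = std::make_shared<bool>(true);
  bool TokenInUse = false; // True if some open range observes Token.

public:
  void startRange(int Start) {
    RangeSketch R;
    R.Start = Start;
    R.Empty = Token; // Observe the current token without owning it.
    Ranges.push_back(R);
    TokenInUse = true;
  }

  // A real instruction was seen: replacing the token expires every
  // weak_ptr handed out so far, marking those ranges non-empty.
  void markOpenRangesNonEmpty() {
    if (TokenInUse) {
      Token = std::make_shared<bool>(true);
      TokenInUse = false;
    }
  }

  // Closes the most recent range, or drops it if it is still empty.
  void endRange(int End) {
    if (Ranges.empty())
      return;
    if (Ranges.back().Empty.expired())
      Ranges.back().End = End;
    else
      Ranges.pop_back();
  }

  const std::vector<RangeSketch> &ranges() const { return Ranges; }
};
```

Starting and ending a range with no call to `markOpenRangesNonEmpty()` in between leaves `ranges()` empty, while a range that saw an instruction is kept with its end set.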

Finally looked at this patch, sorry for the delay.

First of all, I dislike the extension of InstrRange structure, and adding the extra list to DbgValueHistoryMap. After we returned from calculateDbgValueHistory(), we don't need and never use InstrRange::NonEmpty pointers, and the list, so it's not nice we leave this extra data lying around.

The comment about InstrRange explicitly tells that instruction range *may not* be terminated - it means the range is assumed to be valid either until the start of the next range, or until the end of function. However, you pop_back() "empty" instruction ranges only in endInstrRange() function. It means you may still end up with empty ranges if they were never closed.

Can we do the following instead: have another member:

std::map<const MDNode *, InstrRange> emptyRanges;

which would describe the current, opened and empty, range for a given variable. It means that smth. like

const InstrRange& getCurrentRangeForVar(const MDNode *Var);

would return either the entry from emptyRanges, or the last value of VarInstrRanges vector.

Then startInstrRange would simply add new range, and markOpenRangesNonEmpty() would move the contents of emptyRanges to VarInstrRanges. At the end of calculateDbgValueHistory, we can just discard the contents of emptyRanges.
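
The proposal above might look roughly like this. It is only a sketch under assumptions: `HistorySketch` and `numRanges` are hypothetical, ints stand in for `const MDNode *` and `const MachineInstr *`, and 0 stands in for a null end pointer. It keeps at most one pending empty range per variable, as in the proposed map type.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Var = int;   // Stands in for const MDNode *.
using Instr = int; // Stands in for const MachineInstr *; 0 means "none".
using InstrRange = std::pair<Instr, Instr>;

class HistorySketch {
  std::map<Var, std::vector<InstrRange>> VarInstrRanges;
  std::map<Var, InstrRange> EmptyRanges; // Open ranges with no instruction yet.

public:
  // A new range starts out empty. A previous pending range for the same
  // variable is overwritten (the one-pending-range limitation).
  void startInstrRange(Var V, Instr MI) { EmptyRanges[V] = {MI, 0}; }

  // A real instruction was seen: promote all pending ranges.
  void markOpenRangesNonEmpty() {
    for (const auto &KV : EmptyRanges)
      VarInstrRanges[KV.first].push_back(KV.second);
    EmptyRanges.clear();
  }

  void endInstrRange(Var V, Instr MI) {
    if (EmptyRanges.erase(V))
      return; // The current range never covered an instruction: drop it.
    auto It = VarInstrRanges.find(V);
    if (It != VarInstrRanges.end() && !It->second.empty() &&
        It->second.back().second == 0)
      It->second.back().second = MI;
  }

  // To be called at the end of the history calculation.
  void dropEmptyRanges() { EmptyRanges.clear(); }

  std::size_t numRanges(Var V) const {
    auto It = VarInstrRanges.find(V);
    return It == VarInstrRanges.end() ? 0 : It->second.size();
  }
};
```

Note that `markOpenRangesNonEmpty()` walks the pending ranges each time it has something to promote, which is the kind of extra bookkeeping the earlier pointer-based scheme tried to avoid.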

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp
232	Hm, this code is run after the register allocator, can you use isTransient() here? If no, maybe we can introduce `MachineInstr::isPseudoInstruction()` and use it in `MachineInstr::isTransient()` instead?
lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h
57	Note that this can leave your `SeenInstructionInRange` list with a single "true" value. This would probably still work, though.

In D6497#12, @samsonov wrote:

Finally looked at this patch, sorry for the delay.

First of all, I dislike the extension of InstrRange structure, and adding the extra list to DbgValueHistoryMap. After we returned from calculateDbgValueHistory(), we don't need and never use InstrRange::NonEmpty pointers, and the list, so it's not nice we leave this extra data lying around.

I can understand that and especially the liveness of the NonEmpty field bothers me. However, let me restate my goal: I wanted to find a way to keep the pass at the same algorithmic complexity. DbgValueHistoryCalculator has already shown up on profiles and I wanted to avoid introducing new potentially costly bookkeeping logic. That being said, I might be too careful. (And I even failed at that, because the destructor of the list<> will do a O(<number of Dbg_VALUE>) walk anyway).

The comment about InstrRange explicitly tells that instruction range *may not* be terminated - it means the range is assumed to be valid either until the start of the next range,

(not really related to the actual patch: I do not see why the beginning of a variable range should close the previous one for that variable. A variable might be present in 2 registers at some point; this is even explicitly stated in the DWARF standard. I even think you can't get the end result right if you try to prevent this from happening.)

or until the end of function. However, you pop_back() "empty" instruction ranges only in endInstrRange() function. It means you may still end up with empty ranges if they were never closed.

This is true, however the empty ranges could only appear at the very end of the function which is less of an issue. For an empty range to happen there, something would need to have generated a DBG_VALUE after the last instruction of the function (Is that even possible? isn't there a terminator requirement like at the higher level?). But yes, this pass runs so late (meaning the DBG_VALUEs could have been mishandled by so many things) that we need to take care of that case if we want to be exhaustive.

Can we do the following instead: have another member:
std::map<const MDNode *, InstrRange> emptyRanges;
which would describe the current, opened and empty, range for a given variable.

You can have multiple open ranges for a variable (range entries describing constants will never be closed for example). I do not like the limitation to only one pending range. Thus we'd need to make the map value type a list.

It means that smth. like
const InstrRange& getCurrentRangeForVar(const MDNode *Var);
would return either the entry from emptyRanges, or the last value of VarInstrRanges vector.
Then startInstrRange would simply add new range, and markOpenRangesNonEmpty() would move the contents of emptyRanges to VarInstrRanges. At the end of calculateDbgValueHistory, we can just discard the contents of emptyRanges.

I'll try to use a simpler approach with a map<MDNode *, list<InstrRange>> if everybody agrees that the additional bookkeeping shouldn't be an issue.

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp
232	I reused the way AsmPrinter::EmitFunctionBody() counts the real emitted instructions. isTransient() seems to DTRT, I could use that.
lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h
57	I considered adding a comment for that pointing out that it doesn't matter. The important thing is that the list shouldn't be empty. Now that I look at it again, even though the comment wouldn't hurt, it would actually be longer than resetting the front value to false :-)

In D6497#100223, @friss wrote:

In D6497#12, @samsonov wrote:

Finally looked at this patch, sorry for the delay.

First of all, I dislike the extension of InstrRange structure, and adding the extra list to DbgValueHistoryMap. After we returned from calculateDbgValueHistory(), we don't need and never use InstrRange::NonEmpty pointers, and the list, so it's not nice we leave this extra data lying around.

I can understand that and especially the liveness of the NonEmpty field bothers me. However, let me restate my goal: I wanted to find a way to keep the pass at the same algorithmic complexity. DbgValueHistoryCalculator has already shown up on profiles and I wanted to avoid introducing new potentially costly bookkeeping logic. That being said, I might be too careful. (And I even failed at that, because the destructor of the list<> will do a O(<number of Dbg_VALUE>) walk anyway).

The comment about InstrRange explicitly tells that instruction range *may not* be terminated - it means the range is assumed to be valid either until the start of the next range,

(not really relating to that actual patch: I do not see why the beginning of a variable range should close the previous one for that variable. A variable might be present in 2 registers at some point it is even explicitly stated in the Dwarf standard. I even think you can't get the end result right if you try to prevent this from happening.)

I'm not saying that ranges should be non-overlapping - they shouldn't in the general case. I was just saying that apparently it was the case when this code was refactored, and this assumption seems to be baked in there. For instance, DbgValueHistoryMap::getRegisterForVar() returns a single value, apparently assuming that a variable can be stored in a single register. We'd need to carefully audit the code, see how it works with the new DwarfDebug::buildLocationList method, etc. I'm sort of surprised that the latter works with the current structure of DbgValueHistoryMap::InstrRanges.

In D6497#100852, @samsonov wrote:

In D6497#100223, @friss wrote:

In D6497#12, @samsonov wrote:

Finally looked at this patch, sorry for the delay.

First of all, I dislike the extension of InstrRange structure, and adding the extra list to DbgValueHistoryMap. After we returned from calculateDbgValueHistory(), we don't need and never use InstrRange::NonEmpty pointers, and the list, so it's not nice we leave this extra data lying around.

I can understand that and especially the liveness of the NonEmpty field bothers me. However, let me restate my goal: I wanted to find a way to keep the pass at the same algorithmic complexity. DbgValueHistoryCalculator has already shown up on profiles and I wanted to avoid introducing new potentially costly bookkeeping logic. That being said, I might be too careful. (And I even failed at that, because the destructor of the list<> will do a O(<number of Dbg_VALUE>) walk anyway).

The comment about InstrRange explicitly tells that instruction range *may not* be terminated - it means the range is assumed to be valid either until the start of the next range,

(not really relating to that actual patch: I do not see why the beginning of a variable range should close the previous one for that variable. A variable might be present in 2 registers at some point it is even explicitly stated in the Dwarf standard. I even think you can't get the end result right if you try to prevent this from happening.)

I'm not saying that ranges should be non-overlapping - they shouldn't in the general case. I was just saying that apparently it was the case when this code was refactored, and this assumption seems to be baked in there. For instance, DbgValueHistoryMap::getRegisterForVar() returns a single value, apparently assuming that a variable can be stored in a single register. We'd need to carefully audit the code, see how it works with the new DwarfDebug::buildLocationList method, etc. I'm sort of surprised that the latter works with the current structure of DbgValueHistoryMap::InstrRanges.

Yeah, my comment was more about the state of the code than about your reply. I'd really like to find time to really revamp that part of the debug info flow. The fact is that this is usually only used for optimized code and nobody really cares today about the quality of the debug information in that case. It's also used for -O0 ASAN instrumented code, but it's still something that very few people (if any) will notice if it gives wrong results.

Fred

Reimplement in a more straightforward way.

So this commit doesn't pass the testsuite as a new test added last week by
Adrian would generate an empty range and thus be cleaned up by this patch.
I'm still posting it for reference. Another issue I noticed: the patch handles
locations the same way as the current logic does, always using the latest
started range. This makes no real sense as the instruction that makes us
close the range might be clobbering the register of any of the currently
opened ranges for the variables.

If you think it is still a good idea to apply something like that, just tell
me. Otherwise, I'm going to try to find some time to finally rewrite the pass
more in depth, because each time I touch it I realize a bit more how broken
it is.

Revision Contents

Path                                                   Size
lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h     13 lines
lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp   52 lines
test/DebugInfo/X86/dbg-value-inlined-parameter.ll      4 lines

Diff 17350

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h

	//===-- llvm/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h ----- C++ ---===//			//===-- llvm/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h ----- C++ ---===//
	//			//
	// The LLVM Compiler Infrastructure			// The LLVM Compiler Infrastructure
	//			//
	// This file is distributed under the University of Illinois Open Source			// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.			// License. See LICENSE.TXT for details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_LIB_CODEGEN_ASMPRINTER_DBGVALUEHISTORYCALCULATOR_H			#ifndef LLVM_LIB_CODEGEN_ASMPRINTER_DBGVALUEHISTORYCALCULATOR_H
	#define LLVM_LIB_CODEGEN_ASMPRINTER_DBGVALUEHISTORYCALCULATOR_H			#define LLVM_LIB_CODEGEN_ASMPRINTER_DBGVALUEHISTORYCALCULATOR_H

	#include "llvm/ADT/MapVector.h"			#include "llvm/ADT/MapVector.h"
	#include "llvm/ADT/SmallVector.h"			#include "llvm/ADT/SmallVector.h"
				#include <forward_list>

	namespace llvm {			namespace llvm {

	class MachineFunction;			class MachineFunction;
	class MachineInstr;			class MachineInstr;
	class MDNode;			class MDNode;
	class TargetRegisterInfo;			class TargetRegisterInfo;

	// For each user variable, keep a list of instruction ranges where this variable			// For each user variable, keep a list of instruction ranges where this variable
	// is accessible. The variables are listed in order of appearance.			// is accessible. The variables are listed in order of appearance.
	class DbgValueHistoryMap {			class DbgValueHistoryMap {
	// Each instruction range starts with a DBG_VALUE instruction, specifying the			// Each instruction range starts with a DBG_VALUE instruction, specifying the
	// location of a variable, which is assumed to be valid until the end of the			// location of a variable, which is assumed to be valid until the end of the
	// range. If end is not specified, location is valid until the start			// range. If end is not specified, location is valid until the start
	// instruction of the next instruction range, or until the end of the			// instruction of the next instruction range, or until the end of the
	// function.			// function.
				// We must avoid generating empty ranges. An empty range conveys no
				// information, and even worse, an empty range with start/end addresses
				// 0 (the addresses are function relative) will be interpreted as the
				// end of location list marker (Making the location list appear as empty).
	public:			public:
	typedef std::pair<const MachineInstr *, const MachineInstr *> InstrRange;			typedef std::pair<const MachineInstr *, const MachineInstr *> InstrRange;
	typedef SmallVector<InstrRange, 4> InstrRanges;			typedef SmallVector<InstrRange, 4> InstrRanges;
	typedef MapVector<const MDNode *, InstrRanges> InstrRangesMap;			typedef MapVector<const MDNode *, InstrRanges> InstrRangesMap;
	private:			private:
	InstrRangesMap VarInstrRanges;			InstrRangesMap VarInstrRanges;
				InstrRangesMap EmptyRanges;
	public:			public:
				DbgValueHistoryMap() {}

	void startInstrRange(const MDNode *Var, const MachineInstr &MI);			void startInstrRange(const MDNode *Var, const MachineInstr &MI);
	void endInstrRange(const MDNode *Var, const MachineInstr &MI);			void endInstrRange(const MDNode *Var, const MachineInstr &MI);
				void markOpenRangesNonEmpty();
				void dropEmptyRanges();
				void tryToPromoteEmptyRange(const MDNode *Var);

	// Returns register currently describing @Var. If @Var is currently			// Returns register currently describing @Var. If @Var is currently
	// unaccessible or is not described by a register, returns 0.			// unaccessible or is not described by a register, returns 0.
	unsigned getRegisterForVar(const MDNode *Var) const;			unsigned getRegisterForVar(const MDNode *Var) const;

	bool empty() const { return VarInstrRanges.empty(); }			bool empty() const { return VarInstrRanges.empty(); }
	void clear() { VarInstrRanges.clear(); }			void clear() { VarInstrRanges.clear(); }
	InstrRangesMap::const_iterator begin() const { return VarInstrRanges.begin(); }			InstrRangesMap::const_iterator begin() const { return VarInstrRanges.begin(); }
	InstrRangesMap::const_iterator end() const { return VarInstrRanges.end(); }			InstrRangesMap::const_iterator end() const { return VarInstrRanges.end(); }
	};			};

	void calculateDbgValueHistory(const MachineFunction *MF,			void calculateDbgValueHistory(const MachineFunction *MF,
	const TargetRegisterInfo *TRI,			const TargetRegisterInfo *TRI,
	DbgValueHistoryMap &Result);			DbgValueHistoryMap &Result);
	}			}

	#endif			#endif

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp

Show All 38 Lines	void DbgValueHistoryMap::startInstrRange(const MDNode *Var,
assert(MI.isDebugValue() && "not a DBG_VALUE");		assert(MI.isDebugValue() && "not a DBG_VALUE");
auto &Ranges = VarInstrRanges[Var];		auto &Ranges = VarInstrRanges[Var];
if (!Ranges.empty() && Ranges.back().second == nullptr &&		if (!Ranges.empty() && Ranges.back().second == nullptr &&
Ranges.back().first->isIdenticalTo(&MI)) {		Ranges.back().first->isIdenticalTo(&MI)) {
DEBUG(dbgs() << "Coalescing identical DBG_VALUE entries:\n"		DEBUG(dbgs() << "Coalescing identical DBG_VALUE entries:\n"
<< "\t" << Ranges.back().first << "\t" << MI << "\n");		<< "\t" << Ranges.back().first << "\t" << MI << "\n");
return;		return;
}		}
Ranges.push_back(std::make_pair(&MI, nullptr));		EmptyRanges[Var].push_back(std::make_pair(&MI, nullptr));
}		}
		dblaikie: I don't follow what's going on here - maybe a comment is in order? (Not sure why it's only pushing back if the back() is true.) I guess whatever the answer is also explains why SeenInstructionInRange starts with size 1 rather than empty.
		OK, I think I see what's going on here - rather than having an entry in SeenInstructionInRange per live range, you avoid having them in the case where two ranges start together (or are not "valid" before the second one starts, at least). Bit subtle, but I think I get it.
		An alternative idea/question: how do you avoid having to walk all the elements of SeenInstructionInRange when performing validateOpenRanges? I would've imagined/expected that you'd need to set all the elements of the container to true, not just the last one, no? In any case - could we just keep booleans directly in the InstrRanges, and have validateOpenRanges iterate through InstrRanges and set them all to true?
		friss (Author): Basically each entry in SeenInstructionInRange represents the status of a series of DBG_VALUEs that start at the same PC. If the last entry is false, we can just reuse it: our DBG_VALUE starts at the same PC as all the current still-empty ranges. If the last entry is true, we need to add a new one that will stay false until the next real instruction. Thus the container always contains only 'true' elements, except maybe the last one, which indicates that there are newly opened ranges that haven't seen an instruction yet. And the fact that a range is opened first doesn't mean it'll get closed first, so we need to keep alive the old 'true' values that are still pointed to.
		It's a bit subtle, but it's the only way I could think of that keeps the pass's algorithmic complexity the same. I fear that iterating the InstrRanges could lead to some really bad pathological cases. If you want to store the flag directly in the InstrRange, you need a way to iterate only the non-validated ones to mark them non-empty. This isn't trivial, as the primary storage of the InstrRange is a SmallVector, so their addresses aren't stable (I haven't investigated whether there would be a drawback to storing them in a list instead). I prefer the simplicity of toggling the status for all the current entries by modifying a single bool value that everybody points to.
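The shared-flag bookkeeping described in this comment can be sketched in isolation. This is only an illustration under simplifying assumptions, not the code from the patch: `Range`, `RangeTracker`, and the integer positions are hypothetical stand-ins for the instruction ranges and `MachineInstr` pointers, there is a single stack of ranges rather than one per variable, and `std::deque` is used because its `push_back` does not invalidate pointers to existing elements.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for an instruction range. End == -1 means "open".
struct Range {
  int Begin;
  int End;
  const bool *SeenInstr; // flag shared by all ranges opened at the same point
};

class RangeTracker {
  // Invariant from the discussion: every flag is true except possibly the
  // last one, which covers ranges that have not seen a real instruction yet.
  std::deque<bool> SeenInstructionInRange{false};
  std::vector<Range> Ranges;

public:
  void startRange(int Pos) {
    // Reuse the last flag while it is still false; otherwise add a new one.
    if (SeenInstructionInRange.back())
      SeenInstructionInRange.push_back(false);
    Ranges.push_back(Range{Pos, -1, &SeenInstructionInRange.back()});
  }

  // Called for every real instruction: one write validates all open ranges.
  void markOpenRangesNonEmpty() { SeenInstructionInRange.back() = true; }

  // Closes the most recent range. Returns false if it was empty and dropped.
  bool endRange(int Pos) {
    assert(!Ranges.empty() && Ranges.back().End == -1);
    if (!*Ranges.back().SeenInstr) {
      Ranges.pop_back();
      return false;
    }
    Ranges.back().End = Pos;
    return true;
  }

  std::size_t numRanges() const { return Ranges.size(); }
  std::size_t numFlags() const { return SeenInstructionInRange.size(); }
};
```

The point of the design is that marking every open range as non-empty costs a single store, regardless of how many ranges are open.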

		dblaikie: So 'SeenInstructionInRange' will just keep growing as the function grows? Should we try to limit it in any way? Perhaps an alternative strategy would be to have a "currently unseen" std::shared_ptr<bool>, and then in "markOpenRangesNonEmpty" it would be: `if (CurrentlySeen) CurrentlySeen = true; CurrentlySeen = llvm::make_unique<bool>(false);` and then in "endInstrRange" the std::shared_ptr<bool> would be: `if (Ranges.back().NonEmpty) { Ranges.back().second = &MI; Ranges.back().NonEmpty = nullptr; } ...` (would this make it any easier to move the NonEmpty std::shared_ptr<bool> to the InstrRanges rather than into its elements (so we just have one per range set, rather than one per range)? I forget)
		In any case, hoping Alexey can weigh in, as the person who (re)wrote most of this code most recently.
		friss (Author): Yes, SeenInstructionInRange only grows with the function. It will however never contain more items than there are ranges, thus the space complexity of the DbgValueHistoryCalculator stays the same. Your suggestion would work, but I'm not sure it is easier to grok for the reader :-) Putting the status in the range set rather than in the range itself would require tracking how many open ranges are empty for this set, which shouldn't be an issue.
		Piggybacking on your shared_ptr idea, here's another one requiring at most one shared_ptr<bool> at a time, with lifetime semantics a bit simpler than in your proposal: make NonEmpty a weak_ptr (and call it Empty). The value of the pointed-to bool doesn't matter; the weak_ptr's validity is the real information. markOpenRanges becomes: `if (SeenInstructionInRange.use_count()) // Are there empty open ranges?` `SeenInstructionInRange = make_shared<bool>(true); // This invalidates all current weak_ptrs` and in endInstrRange: `if (!Ranges.back().Empty.lock()) Ranges.back().second = &MI; else Ranges.pop_back();`
		There is some hidden complexity in the shared_ptr handling though (management of the shared_ptr use lists), which makes me believe that the proposed solution is still more efficient. I don't think this is a big deal because these lists should be short, but I wouldn't be surprised if some pathological case shows up one day. (And yes, Alexey's input would be very welcome.)
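The weak_ptr variant floated in this reply can be sketched as follows. It is a rough illustration rather than a tested design for the pass: `Range`, `WeakRangeTracker`, and the integer positions are hypothetical, and the `HasUnvalidated` flag is an assumption added here to avoid reallocating the token on every instruction, since `shared_ptr::use_count()` counts only shared owners and so cannot tell whether any weak_ptr observers exist.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an instruction range. End == -1 means "open".
struct Range {
  int Begin;
  int End;
  std::weak_ptr<bool> Empty; // still valid => no real instruction seen yet
};

class WeakRangeTracker {
  // Only the lifetime of the token matters, not the bool it holds.
  std::shared_ptr<bool> Token = std::make_shared<bool>(true);
  bool HasUnvalidated = false; // assumption: tracks whether a reset is needed
  std::vector<Range> Ranges;

public:
  void startRange(int Pos) {
    Ranges.push_back(Range{Pos, -1, Token});
    HasUnvalidated = true;
  }

  // Replacing the token destroys the old one, which expires every weak_ptr
  // handed out since the last real instruction.
  void markOpenRangesNonEmpty() {
    if (!HasUnvalidated)
      return;
    Token = std::make_shared<bool>(true);
    HasUnvalidated = false;
  }

  // Closes the most recent range. Returns false if it was empty and dropped.
  bool endRange(int Pos) {
    assert(!Ranges.empty() && Ranges.back().End == -1);
    if (!Ranges.back().Empty.expired()) {
      Ranges.pop_back();
      return false;
    }
    Ranges.back().End = Pos;
    return true;
  }

  std::size_t numRanges() const { return Ranges.size(); }
};
```

At most one token is alive at a time, which bounds the extra storage, at the price of one allocation per batch of newly opened ranges.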
void DbgValueHistoryMap::endInstrRange(const MDNode *Var,		void DbgValueHistoryMap::endInstrRange(const MDNode *Var,
		dblaikie: The pointer to the SmallVector element could end up dangling if the SmallVector is reallocated.
		friss (Author): Bummer... I knew I had a good reason for having implemented that as a list first. And then I added the call to SeenInstructionInRange.resize() and thought that this would definitely be more efficient with a vector.
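One way to sidestep the dangling-pointer hazard while keeping contiguous storage is to hand out indices instead of element pointers. The sketch below is an assumption about how that could look, not something proposed in the review: `FlagPool` and `Handle` are hypothetical names, and `std::vector<char>` stands in for the SmallVector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flags are addressed by index, so growth of the underlying buffer cannot
// invalidate a handle the way it invalidates a raw element pointer.
class FlagPool {
  std::vector<char> Flags; // char avoids the std::vector<bool> proxy type

public:
  using Handle = std::size_t;

  Handle add(bool Value) {
    Flags.push_back(Value ? 1 : 0);
    return Flags.size() - 1;
  }

  void set(Handle H, bool Value) { Flags.at(H) = Value ? 1 : 0; }
  bool get(Handle H) const { return Flags.at(H) != 0; }
};
```

A linked list or `std::deque` also keeps element addresses stable on append; indices simply trade one extra lookup for contiguous storage.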
const MachineInstr &MI) {		const MachineInstr &MI) {
		if (!EmptyRanges[Var].empty())
		return EmptyRanges[Var].pop_back();

auto &Ranges = VarInstrRanges[Var];		auto &Ranges = VarInstrRanges[Var];
// Verify that the current instruction range is not yet closed.		// Verify that the current instruction range is not yet closed.
assert(!Ranges.empty() && Ranges.back().second == nullptr);		assert(!Ranges.empty() && Ranges.back().second == nullptr);
// For now, instruction ranges are not allowed to cross basic block		// For now, instruction ranges are not allowed to cross basic block
// boundaries.		// boundaries.
assert(Ranges.back().first->getParent() == MI.getParent());		assert(Ranges.back().first->getParent() == MI.getParent());

Ranges.back().second = &MI;		Ranges.back().second = &MI;
}		}

		void DbgValueHistoryMap::markOpenRangesNonEmpty() {
		if (EmptyRanges.empty())
		return;

		for (auto &Ranges : make_range(EmptyRanges.begin(), EmptyRanges.end()))
		for (auto &Range : Ranges.second)
		VarInstrRanges[Ranges.first].push_back(Range);

		EmptyRanges.clear();
		}

		void DbgValueHistoryMap::dropEmptyRanges() { EmptyRanges.clear(); }

		void DbgValueHistoryMap::tryToPromoteEmptyRange(const MDNode *Var) {
		if (EmptyRanges.empty())
		return;

		auto I = EmptyRanges.find(Var);
		if (I == EmptyRanges.end())
		return;

		VarInstrRanges[Var].push_back(I->second.back());
		I->second.pop_back();
		EmptyRanges.clear();
		}

unsigned DbgValueHistoryMap::getRegisterForVar(const MDNode *Var) const {		unsigned DbgValueHistoryMap::getRegisterForVar(const MDNode *Var) const {
const auto &I = VarInstrRanges.find(Var);		auto I = EmptyRanges.find(Var);
		if (I == EmptyRanges.end()) {
		I = VarInstrRanges.find(Var);
if (I == VarInstrRanges.end())		if (I == VarInstrRanges.end())
return 0;		return 0;
		}
const auto &Ranges = I->second;		const auto &Ranges = I->second;
if (Ranges.empty() || Ranges.back().second != nullptr)		if (Ranges.empty() || Ranges.back().second != nullptr)
return 0;		return 0;
return isDescribedByReg(*Ranges.back().first);		return isDescribedByReg(*Ranges.back().first);
}		}

namespace {		namespace {
// Maps physreg numbers to the variables they describe.		// Maps physreg numbers to the variables they describe.
▲ Show 20 Lines • Show All 117 Lines • ▼ Show 20 Lines	for (const auto &MBB : *MF) {
for (const auto &MI : MBB) {		for (const auto &MI : MBB) {
if (!MI.isDebugValue()) {		if (!MI.isDebugValue()) {
// Not a DBG_VALUE instruction. It may clobber registers which describe		// Not a DBG_VALUE instruction. It may clobber registers which describe
// some variables.		// some variables.
applyToClobberedRegisters(MI, TRI, [&](unsigned RegNo) {		applyToClobberedRegisters(MI, TRI, [&](unsigned RegNo) {
if (ChangingRegs.test(RegNo))		if (ChangingRegs.test(RegNo))
clobberRegisterUses(RegVars, RegNo, Result, MI);		clobberRegisterUses(RegVars, RegNo, Result, MI);
});		});

		if (!MI.isTransient())
		samsonov: Hm, this code is run after the register allocator, can you use isTransient() here? If no, maybe we can introduce `MachineInstr::isPseudoInstruction()` and use it in `MachineInstr::isTransient()` instead?
		friss (Author): I reused the way AsmPrinter::EmitFunctionBody() counts the real emitted instructions. isTransient() seems to DTRT, I could use that.
		Result.markOpenRangesNonEmpty();
continue;		continue;
}		}

assert(MI.getNumOperands() > 1 && "Invalid DBG_VALUE instruction!");		assert(MI.getNumOperands() > 1 && "Invalid DBG_VALUE instruction!");
// Use the base variable (without any DW_OP_piece expressions)		// Use the base variable (without any DW_OP_piece expressions)
// as index into History. The full variables including the		// as index into History. The full variables including the
// piece expressions are attached to the MI.		// piece expressions are attached to the MI.
DIVariable Var = MI.getDebugVariable();		DIVariable Var = MI.getDebugVariable();
Show All 10 Lines	for (const auto &MBB : *MF) {
// Make sure locations for register-described variables are valid only		// Make sure locations for register-described variables are valid only
// until the end of the basic block (unless it's the last basic block, in		// until the end of the basic block (unless it's the last basic block, in
// which case let their liveness run off to the end of the function).		// which case let their liveness run off to the end of the function).
if (!MBB.empty() && &MBB != &MF->back()) {		if (!MBB.empty() && &MBB != &MF->back()) {
for (auto I = RegVars.begin(), E = RegVars.end(); I != E;) {		for (auto I = RegVars.begin(), E = RegVars.end(); I != E;) {
auto CurElem = I++; // CurElem can be erased below.		auto CurElem = I++; // CurElem can be erased below.
if (ChangingRegs.test(CurElem->first))		if (ChangingRegs.test(CurElem->first))
clobberRegisterUses(RegVars, CurElem, Result, MBB.back());		clobberRegisterUses(RegVars, CurElem, Result, MBB.back());
		else
		// If a variable uses a register that isn't clobbered
		// anywhere in the function, its range will be extended. The
		// range needs to survive this pass though.
		for (const auto *Var : CurElem->second)
		Result.tryToPromoteEmptyRange(Var);
}		}
}		}

		Result.dropEmptyRanges();
}		}
}		}
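The staging approach used on the right-hand side of this diff (new ranges wait in `EmptyRanges` until a real instruction is seen) can be modeled outside LLVM roughly as follows. This is a simplified sketch, not the patch itself: variables are plain integer ids, positions are integers, and the coalescing of identical DBG_VALUEs, `tryToPromoteEmptyRange`, and `dropEmptyRanges` are left out. `StagedHistory` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Range = std::pair<int, int>;                  // (begin, end); end == -1 is open
using RangeMap = std::map<int, std::vector<Range>>; // keyed by a variable id

class StagedHistory {
  RangeMap Committed; // ranges known to contain a real instruction
  RangeMap Staged;    // ranges opened since the last real instruction

public:
  void startRange(int Var, int Pos) { Staged[Var].push_back({Pos, -1}); }

  void endRange(int Var, int Pos) {
    // A range that is still staged never saw an instruction: drop it.
    auto S = Staged.find(Var);
    if (S != Staged.end() && !S->second.empty()) {
      S->second.pop_back();
      return;
    }
    auto &Ranges = Committed[Var];
    assert(!Ranges.empty() && Ranges.back().second == -1);
    Ranges.back().second = Pos;
  }

  // Called for each real instruction: promote every staged range.
  void markOpenRangesNonEmpty() {
    for (auto &Entry : Staged)
      for (auto &R : Entry.second)
        Committed[Entry.first].push_back(R);
    Staged.clear();
  }

  std::size_t committedCount(int Var) const {
    auto I = Committed.find(Var);
    return I == Committed.end() ? 0 : I->second.size();
  }
};
```

Compared with the shared-flag scheme discussed in the inline comments, promotion here walks every staged range on each real instruction, but no pointers into resizable storage are kept.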

test/DebugInfo/X86/dbg-value-inlined-parameter.ll


	;CHECK: DW_TAG_inlined_subroutine			;CHECK: DW_TAG_inlined_subroutine
	;CHECK-NEXT: DW_AT_abstract_origin {{.*}} "foo"			;CHECK-NEXT: DW_AT_abstract_origin {{.*}} "foo"
	;CHECK-NEXT: DW_AT_low_pc [DW_FORM_addr]			;CHECK-NEXT: DW_AT_low_pc [DW_FORM_addr]
	;CHECK-NEXT: DW_AT_high_pc [DW_FORM_data4]			;CHECK-NEXT: DW_AT_high_pc [DW_FORM_data4]
	;CHECK-NEXT: DW_AT_call_file			;CHECK-NEXT: DW_AT_call_file
	;CHECK-NEXT: DW_AT_call_line			;CHECK-NEXT: DW_AT_call_line

				;FIXME: We shouldn't drop the sp parameter.
				aprantl: If we drop the DW_TAG_formal_parameter completely, the function signature doesn't match the subroutine type any more. I think the formal parameter should still be there, even if it has no location.
	;CHECK: DW_TAG_formal_parameter			;CHECK: DW_TAG_formal_parameter
	;FIXME: Linux shouldn't drop this parameter either...
	;CHECK-NOT: DW_TAG			;CHECK-NOT: DW_TAG
	;DARWIN: DW_AT_abstract_origin {{.*}} "sp"
	;DARWIN: DW_TAG_formal_parameter
	;CHECK: DW_AT_abstract_origin {{.*}} "nums"			;CHECK: DW_AT_abstract_origin {{.*}} "nums"
	;CHECK-NOT: DW_TAG_formal_parameter			;CHECK-NOT: DW_TAG_formal_parameter

	%struct.S1 = type { float*, i32 }			%struct.S1 = type { float*, i32 }

	@p = common global %struct.S1 zeroinitializer, align 8			@p = common global %struct.S1 zeroinitializer, align 8

	define i32 @foo(%struct.S1* nocapture %sp, i32 %nums) nounwind optsize ssp {			define i32 @foo(%struct.S1* nocapture %sp, i32 %nums) nounwind optsize ssp {