This is an archive of the discontinued LLVM Phabricator instance.

Fix prologue end handling when code compiled by gcc
ClosedPublic

Authored by tberghammer on Sep 10 2015, 7:01 AM.

Details

Reviewers

Commits

rG6127f033581a: Fix prologue end handling when code compiled by gcc
rLLDB247788: Fix prologue end handling when code compiled by gcc
rL247788: Fix prologue end handling when code compiled by gcc

Summary

Fix prologue end handling when code compiled by gcc

GCC does not use the is_prologue_end flag to mark the first instruction
after the prologue. Instead, it issues a line table entry for
the first instruction of the prologue and one for the first instruction
after the prologue. If the prologue is 0 instructions long, then
the 2 line entries will have the same file address.

We remove these duplicate entries because they violate the DWARF spec
and can cause confusion in the debugger. To avoid losing the
information about the end of the prologue, we should set the prologue end
flag for line entries that represent more than 1 entry.
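
The approach described in this summary can be sketched as follows. This is a minimal, self-contained illustration and not the LLDB source: `LineEntry` and `AppendLineEntry` are simplified, hypothetical stand-ins that keep only the fields needed to show the idea.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a line table entry.
struct LineEntry {
    uint64_t file_addr;
    uint32_t line;
    bool is_prologue_end;
};

// Append `entry` to `entries`. If it has the same address as the last entry,
// replace that entry instead (keeping a 1 to 1 address mapping) and set
// is_prologue_end so the information carried by the duplicate is not lost.
void AppendLineEntry(std::vector<LineEntry> &entries, LineEntry entry) {
    if (!entries.empty() && entries.back().file_addr == entry.file_addr) {
        entry.is_prologue_end = true;
        entries.back() = entry;
    } else {
        entries.push_back(entry);
    }
}
```

With this sketch, appending two entries at address 0x1000 followed by one at 0x1008 leaves two entries in the sequence, and the surviving entry at 0x1000 has `is_prologue_end` set.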

Diff Detail

Event Timeline

tberghammer updated this revision to Diff 34435. Sep 10 2015, 7:01 AM

tberghammer retitled this revision from to Fix prologue end handling when code compiled by gcc.

tberghammer updated this object.

tberghammer added a reviewer: clayborg.

tberghammer added a subscriber: lldb-commits.

see inlined comments.

source/Symbol/LineTable.cpp
107–117	I am not sure I like this solution. Now if we ever have two line entries with the same address we will automatically mark the item as the prologue end? This seems like a hack and it will mark all sorts of line entries as being "is_prologue_end = true" all throughout the line table.

This revision now requires changes to proceed. Sep 10 2015, 10:46 AM

tberghammer added inline comments. Sep 10 2015, 1:18 PM

source/Symbol/LineTable.cpp
107–117	I agree that this is a hack to work around a bug in gcc (and I don't really like it either), but there is one thing to consider based on the comment in lines 100-105. It is invalid DWARF to have 2 line entries with the same file address, so if the compiler is correct, this code path will never be activated. Apart from the issue with duplicate entries being generated for functions without a prologue, I have never seen a case where we have 2 line entries for the same address (which line would we show to the user in that case?). If we don't want to use this (hackish) solution, then the other option is to not remove duplicate line entries at all, since we use the 2nd line entry as the end of the prologue when none of the entries has the is_prologue_end flag. That would result in a cleaner solution for this specific issue (which works around a gcc bug), but it gives up the 1 to 1 mapping that we want to keep based on the comment in lines 100-105 (it was added by rL211212).
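
The alternative mentioned in this comment relies on a fallback: when no entry carries the flag, the second line entry of the function is treated as the end of the prologue. The rough sketch below uses a hypothetical `FindPrologueEndIndex` helper and a simplified `LineEntry` (neither is taken from LLDB) to show why removing the duplicate breaks that fallback for a function with an empty prologue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a line table entry.
struct LineEntry {
    uint64_t file_addr;
    uint32_t line;
    bool is_prologue_end;
};

// Returns the index in [first, last) of the entry treated as the end of the
// prologue for one function. Prefers an explicit is_prologue_end flag and
// otherwise falls back to the function's second line entry, if there is one.
std::size_t FindPrologueEndIndex(const std::vector<LineEntry> &entries,
                                 std::size_t first, std::size_t last) {
    for (std::size_t i = first; i < last; ++i) {
        if (entries[i].is_prologue_end)
            return i;
    }
    return (first + 1 < last) ? first + 1 : first;
}
```

If both GCC entries at 0x1000 are kept, the fallback picks the second one, which is still at 0x1000, so the prologue is correctly seen as empty. If the duplicate is removed and no flag is set, the fallback picks the next entry at a later address and the first line of the function body is skipped.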

Hi Greg, what do you think about my inline suggestion? Are you fine with removing the original hack that removes duplicate entries from the line table, and then solving the problem of duplicate line entries by always returning the last entry when we have multiple line entries for the same address?

Maybe we can try still removing duplicates, but remembering the first index where we had a duplicate line entry. If we don't get a prologue end, then we go back to the index we remembered for the first duplicate and if it is valid, modify that entry to say "prologue_end = true"?

One other question for clarification: Is GCC emitting prologue_end, but only emitting it on the first line entry? And then we overwrite it with the second and remove the prologue_end, or does GCC just plain not emit prologue_end? If so, what happens when we have a line table that doesn't have two entries for the prologue with the same address? Do we just not have a prologue_end in a sequence in that case?

In D12757#246497, @clayborg wrote:

Maybe we can try still removing duplicates, but remembering the first index where we had a duplicate line entry. If we don't get a prologue end, then we go back to the index we remembered for the first duplicate and if it is valid, modify that entry to say "prologue_end = true"?

Remembering the first duplicate entry isn't really possible because a line table covers several functions and we need the prologue_end marker for each function. If we want to go in this direction, then we have to couple the line table with the function ranges (including the function ranges for inline functions), which I am pretty sure we want to avoid. It would cause a significant performance hit because it would require a full DWARF parsing.
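
To make this objection concrete, here is one possible reading of the "remember the first duplicate" proposal, written as a small sketch. `SequenceBuilder`, its members, and the simplified `LineEntry` are all hypothetical names introduced for illustration and do not exist in LLDB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a line table entry.
struct LineEntry {
    uint64_t file_addr;
    uint32_t line;
    bool is_prologue_end;
};

// Hypothetical builder: removes duplicates, remembers only the FIRST index
// where a duplicate was seen, and patches that entry when the sequence ends
// without any explicit prologue_end flag.
struct SequenceBuilder {
    std::vector<LineEntry> entries;
    bool has_first_duplicate = false;
    std::size_t first_duplicate = 0;
    bool saw_prologue_end = false;

    void Append(LineEntry entry) {
        saw_prologue_end = saw_prologue_end || entry.is_prologue_end;
        if (!entries.empty() && entries.back().file_addr == entry.file_addr) {
            if (!has_first_duplicate) {
                has_first_duplicate = true;
                first_duplicate = entries.size() - 1;
            }
            entries.back() = entry;
        } else {
            entries.push_back(entry);
        }
    }

    void Finish() {
        if (!saw_prologue_end && has_first_duplicate)
            entries[first_duplicate].is_prologue_end = true;
    }
};
```

When one sequence contains two functions that both have an empty prologue, only the first function's entry ends up flagged. Flagging the second one as well would require knowing where each function starts, which is the coupling with function ranges that the comment wants to avoid.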

One other question for clarification: Is GCC emitting prologue_end, but only emitting it on the first line entry? And then we overwrite it with the second and remove the prologue_end, or does GCC just plain not emit prologue_end? If so, what happens when we have a line table that doesn't have two entries for the prologue with the same address? Do we just not have a prologue_end in a sequence in that case?

I have never seen GCC emit a prologue_end marker on any architecture I tested, and based on some online threads I am pretty sure this is the case for all architectures. It emits 1 line entry for the first address of the function and then another line entry for the first non-prologue instruction of the function.

Blech... Ok, one more try: does GCC always emit the same line and file with the same address? If so we could do:

{
    // GCC doesn't use the is_prologue_end flag to mark the first instruction after the prologue.
    // Instead, it issues a line table entry for the first instruction of the prologue and one
    // for the first instruction after the prologue. If the prologue is 0 instructions long, the
    // 2 line entries will have the same file address. Removing one would remove our ability to
    // properly detect the location of the end of the prologue, so we set the prologue_end flag
    // to preserve this information (setting the prologue_end flag for an entry that is after
    // the prologue end has no effect).
    entry.is_prologue_end = entry.file == entries.back().file && entry.line == entries.back().line;
    entries.back() = entry;
}

Otherwise we could settle on just making sure the file is the same:

{
    // GCC doesn't use the is_prologue_end flag to mark the first instruction after the prologue.
    // Instead, it issues a line table entry for the first instruction of the prologue and one
    // for the first instruction after the prologue. If the prologue is 0 instructions long, the
    // 2 line entries will have the same file address. Removing one would remove our ability to
    // properly detect the location of the end of the prologue, so we set the prologue_end flag
    // to preserve this information (setting the prologue_end flag for an entry that is after
    // the prologue end has no effect).
    entry.is_prologue_end = entry.file == entries.back().file;
    entries.back() = entry;
}

The line number is not always the same for the 2 entries (the first one points to the opening '{' and the second one to the first instruction of the function). I added the check for the file name, but I would be quite surprised if we found a scenario where the file names don't match (it would imply some LTO, which would kill most of the debugging experience anyway).
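
The difference between the two proposed checks can be illustrated with a small sketch. The `LineEntry` struct, the `DuplicateCheck` enum, and `AppendLineEntry` below are simplified, hypothetical names for illustration only; the file is represented as a plain integer index, which is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a line table entry.
struct LineEntry {
    uint64_t file_addr;
    uint32_t file; // assumed to be an index identifying the source file
    uint32_t line;
    bool is_prologue_end;
};

enum class DuplicateCheck { FileAndLine, FileOnly };

// Replace the last entry when the address repeats, setting is_prologue_end
// according to the chosen check; otherwise append the entry.
void AppendLineEntry(std::vector<LineEntry> &entries, LineEntry entry,
                     DuplicateCheck check) {
    if (!entries.empty() && entries.back().file_addr == entry.file_addr) {
        const LineEntry &prev = entries.back();
        const bool same_file = entry.file == prev.file;
        const bool same_line = entry.line == prev.line;
        entry.is_prologue_end = (check == DuplicateCheck::FileOnly)
                                    ? same_file
                                    : (same_file && same_line);
        entries.back() = entry;
    } else {
        entries.push_back(entry);
    }
}
```

For a pair of entries at the same address whose lines differ (the opening brace and the first statement), the file-and-line check leaves the flag unset, while the file-only check sets it. A duplicate that comes from a different file is not flagged by either check.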

Closed by commit rL247788: Fix prologue end handling when code compiled by gcc (authored by tberghammer). Sep 16 2015, 5:38 AM

This revision was automatically updated to reflect the committed changes.

The point of the file name check is to catch the case where you had nested inlines that share the same address, and the compiler (errantly, but...) decided to emit duplicate entries at the same address for the two levels of inlining. The inlining could of course be from the current file, but at least at -O0 it is much more common that they will come from somewhere else (like std::whatever)

I see. Thank you for the clarification

Revision Contents

Path                          Size
source/Symbol/LineTable.cpp   10 lines

Diff 34435

source/Symbol/LineTable.cpp

     entry_collection &entries = seq->m_entries;
     // Replace the last entry if the address is the same, otherwise append it. If we have multiple
     // line entries at the same address, this indicates illegal DWARF so this "fixes" the line table
     // to be correct. If not fixed this can cause a line entry's address that when resolved back to
     // a symbol context, could resolve to a different line entry. We really want a 1 to 1 mapping
     // here to avoid these kinds of inconsistencies. We will need to revisit this if the DWARF line
     // tables are updated to allow multiple entries at the same address legally.
     if (!entries.empty() && entries.back().file_addr == file_addr)
+    {
+        // GCC doesn't use the is_prologue_end flag to mark the first instruction after the prologue.
+        // Instead, it issues a line table entry for the first instruction of the prologue and one
+        // for the first instruction after the prologue. If the prologue is 0 instructions long, the
+        // 2 line entries will have the same file address. Removing one would remove our ability to
+        // properly detect the location of the end of the prologue, so we set the prologue_end flag
+        // to preserve this information (setting the prologue_end flag for an entry that is after
+        // the prologue end has no effect).
+        entry.is_prologue_end = true;
         entries.back() = entry;
+    }

[Inline comment] clayborg: I am not sure I like this solution. Now if we ever have two line entries with the same address…
[Inline comment] tberghammer (author): I agree that this is a hack to work around a bug in gcc (and I don't really like it either)…

     else
         entries.push_back (entry);
 }

 void
 LineTable::InsertSequence (LineSequence* sequence)
 {
     assert(sequence != nullptr);