This is an archive of the discontinued LLVM Phabricator instance.

breakpad: Add FUNC records to the symtab
ClosedPublic

Authored by labath on Jan 11 2019, 4:35 AM.

Details

Summary

This patch extends SymbolFileBreakpad::AddSymbols to include the symbols
from the FUNC records too. These symbols come from the debug info and
have a size associated with them, so they are given preference in case
there is a PUBLIC record for the same address.

To achieve this, I first pre-process the symbols into a temporary
DenseMap, and then insert the uniqued symbols into the module's symtab.

Event Timeline

labath created this revision.Jan 11 2019, 4:35 AM
clayborg added inline comments.Jan 15 2019, 8:40 AM
source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp
29–48 ↗(On Diff #181246)

Move to BreakpadLine.cpp/.h?

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h
20–23 ↗(On Diff #181246)

Should we have a "Line" class in lldb_private::breakpad::Line? All functions that parse a breakpad line could be in this new class? Seeing as we have symbol files, object files and the process plug-in might need to parse these lines, seems like it would be cleaner? Maybe in BreakpadLine.cpp/.h?

namespace lldb_private {
namespace breakpad {
class Line {
  enum class Token { Unknown, Module, Info, File, Func, Public, Stack };
  Line(llvm::StringRef s);
  Token parseToken();
  static llvm::StringRef tokenToString(Token t);
};
source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
27

Move this functionality into llvm::breakpad::Line?

186

change "llvm::StringRef line" to "lldb_private::breakpad::Line" and remove second parameter as the "Line" class can know its token type?

202–209

If we make a Line class this code would become:

lldb_private::breakpad::Line bp_line(line);
if (bp_line.GetToken() != Token::Func)
  continue;
parse(bp_line);
214

If we make a Line class this code would become:

parse(bp_line);
clayborg added inline comments.Jan 15 2019, 8:43 AM
source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
201

Should the iterator for Token::Func just return "FUNC" objects only? Maybe we add a Token::Line to the Token enumeration and then add an optional second parameter to the iterator? So any code that would want a "FUNC" and its lines, would do:

for (llvm::StringRef line: lines(*m_obj_file, Token::Func, Token::Line)) {
}
209

If we modify the iterator this would become:

parse(lldb_private::breakpad::Line(line));
labath updated this revision to Diff 182254.Jan 17 2019, 4:39 AM
labath marked 7 inline comments as done.

Thanks for the review. I've refactored the code to separate (and centralize) the
breakpad parsing from the part which does presents the information to lldb.

I've done this slightly differently than suggested (the new file is called
BreakpadRecords, as that's how breakpad calls them, and there as separate class
for each record type instead of just a function), but I believe it addresses the
root problem.

The main reason I created classes for everything is because I thought I'd make
the iterator class return those. I may still do that at some point but now it
did not seem worth it now because it makes it harder to report parsing errors,
and it did not make the call sites much cleaner.

labath added inline comments.Jan 17 2019, 4:41 AM
source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
27

I haven't moved this part to the new file because (unlike everything else in that file) this depends on the ObjectFile class. Theoretically it can be moved to ObjectFileBreapad.h (or a new file) if needed, but ideally I'd like to avoid anyone else needing to parse the breakpad file contents. If the minidump process plugin for instance needed to access some of this information, I would have it go through the SymbolFile interface, which can present it in a nicer fashion.

201

With the new parse functions this should not be necessary as you can just say "try to parse this line as a FUNC record", and that will automatically reject LINE record as well as any other malformed lines. I think that would make sense if I made the iterator perform the parsing internally and return already parsed records.

I didn't do that (for now) because returning parse errors from an iterator gets weird, and it didn't seem worth the small amount of code it would save.

clayborg accepted this revision.Jan 17 2019, 11:04 AM

Much cleaner! Thanks

This revision is now accepted and ready to land.Jan 17 2019, 11:04 AM
This revision was automatically updated to reflect the committed changes.