This is an archive of the discontinued LLVM Phabricator instance.

Differential D95232

Symbolizer - Teach symbolizer to work directly on object file.
ClosedPublic

Authored by pvellien on Jan 22 2021, 6:45 AM.

Details

Reviewers

samsonov
ychen
rnk
scott.linder
MaskRay
t-tye

Commits

rG12999d749d72: [Symbolize] Teach symbolizer to work directly on object file.

Summary

This patch provides an additional interface to LLVMSymbolizer so that it can work directly on object files. There is an existing method, symbolizeCode, which takes an object file; this patch provides similar overloads for symbolizeInlinedCode, symbolizeData, and symbolizeFrame. This can be useful for clients who already have in-memory object files to symbolize.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

pvellien created this revision. Jan 22 2021, 6:45 AM

Herald added a subscriber: hiraditya. Jan 22 2021, 6:45 AM

pvellien requested review of this revision. Jan 22 2021, 6:45 AM

Herald added a project: Restricted Project. Jan 22 2021, 6:45 AM

Herald added a subscriber: llvm-commits.

pvellien added reviewers: scott.linder, tony-tye. Jan 25 2021, 9:25 AM

scott.linder added inline comments. Jan 25 2021, 3:44 PM

llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
73–74	See below, it seems like this is just a bug currently?
579–602	The logic for `CoffObject` is not repeated elsewhere, which doesn't seem intended? It seems like this method should also be overloaded, i.e. rather than repeat the "try to find it, if not create it" logic in each overload above, just implement `LLVMSymbolizer::getOrCreateModuleInfo(const ObjectFile &)` too, and the bodies elsewhere should become identical.

I'd also consider just making the private `Common` methods templates over the type of the first argument, and have the overloads be one-line wrappers:

    template <typename T>
    Expected<DILineInfo>
    symbolizeCodeCommon(T ModuleSpecifier, object::SectionedAddress ModuleOffset) {
      SymbolizableModule *Info;
      if (auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier))
        Info = InfoOrErr.get();
      else
        return InfoOrErr.takeError();
      // ... the existing body in terms of Info and ModuleOffset
    }

    Expected<DILineInfo>
    LLVMSymbolizer::symbolizeCode(const std::string &ModuleName,
                                  object::SectionedAddress ModuleOffset) {
      return symbolizeCodeCommon(ModuleName, ModuleOffset);
    }

    Expected<DILineInfo>
    LLVMSymbolizer::symbolizeCode(const ObjectFile &ModuleObjectFile,
                                  object::SectionedAddress ModuleOffset) {
      return symbolizeCodeCommon(ModuleObjectFile, ModuleOffset);
    }

I think this minimizes the amount of boilerplate, short of using the preprocessor, while still presenting the nice overloaded interface rather than the templated one.

scott.linder added inline comments. Jan 25 2021, 3:55 PM

llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
579–602	Small amendment: I'd change the snippet above slightly to begin the method body with:

    auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier);
    if (auto Err = InfoOrErr.takeError())
      return Err;
    SymbolizableModule *Info = *InfoOrErr;

Which is shorter, follows the convention of most of the rest of the file, and follows https://llvm.org/docs/ProgrammersManual.html#recoverable-errors
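The overload-plus-template shape discussed in these inline comments can be sketched outside LLVM. The following is only a minimal illustration under stated assumptions: `FakeObject`, `Module`, `LineInfo`, and `SketchSymbolizer` are hypothetical stand-ins, and `std::optional` replaces `llvm::Expected`. It shows two public overloads forwarding to one private template, which resolves the module through an overloaded lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-ins; none of these are LLVM types.
struct FakeObject { std::string FileName; };
struct Module { std::string Name; };
struct LineInfo { std::string ModuleName; std::uint64_t Address; };

class SketchSymbolizer {
public:
  // The public interface stays overloaded rather than templated.
  std::optional<LineInfo> symbolizeCode(const std::string &ModuleName,
                                        std::uint64_t Addr) {
    return symbolizeCodeCommon(ModuleName, Addr);
  }
  std::optional<LineInfo> symbolizeCode(const FakeObject &Obj,
                                        std::uint64_t Addr) {
    return symbolizeCodeCommon(Obj, Addr);
  }
  std::size_t cachedModules() const { return Modules.size(); }

private:
  // One shared body; overload resolution on getOrCreateModuleInfo picks
  // the right lookup for the specifier type.
  template <typename T>
  std::optional<LineInfo> symbolizeCodeCommon(const T &ModuleSpecifier,
                                              std::uint64_t Addr) {
    Module *Info = getOrCreateModuleInfo(ModuleSpecifier);
    if (!Info)
      return std::nullopt;
    return LineInfo{Info->Name, Addr};
  }

  // "Try to find it, if not create it", keyed by name.
  Module *getOrCreateModuleInfo(const std::string &ModuleName) {
    if (ModuleName.empty())
      return nullptr;
    auto It = Modules.try_emplace(ModuleName, Module{ModuleName}).first;
    return &It->second;
  }
  // The object overload reuses the same cache through the file name.
  Module *getOrCreateModuleInfo(const FakeObject &Obj) {
    return getOrCreateModuleInfo(Obj.FileName);
  }

  std::map<std::string, Module> Modules;
};
```

In this sketch both overloads share one cache entry when the object's file name matches the string key, which is a design choice of the illustration and not a statement about the patch.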

Refactor code

scott.linder edited reviewers, added: t-tye; removed: tony-tye. Feb 1 2021, 11:18 AM

This LGTM now, but I would give @ychen and @MaskRay time to take a look.

I'm also curious where the TODO discussed at https://reviews.llvm.org/D63521?id=206255#inline-568633 went, and if it was intentional?

llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
614	Could you add a `FIXME:` comment here to the effect of "Should this handle COFF specially like getOrCreateModuleInfo(const std::string &) does?"

Added FIXME

Harbormaster completed remote builds in B87831: Diff 321318. Feb 3 2021, 11:50 PM

ping

I'm accepting this as there hasn't been any response from others and I don't know who else might be interested, but if anyone has concerns please feel free to review post-commit, and we can revert as necessary.

This revision is now accepted and ready to land. Feb 12 2021, 10:24 AM

This revision was landed with ongoing or failed builds. Feb 12 2021, 10:30 AM

Closed by commit rG12999d749d72: [Symbolize] Teach symbolizer to work directly on object file. (authored by scott.linder).

This revision was automatically updated to reflect the committed changes.

scott.linder added a commit: rG12999d749d72: [Symbolize] Teach symbolizer to work directly on object file..

Generally there should be test coverage - otherwise these APIs are likely to be cleaned up as unused, even if the chance of actual bugs in them is small (since they're just simple wrappers).

Perhaps some unit testing would be suitable?

In D95232#2564264, @dblaikie wrote:

Generally there should be test coverage - otherwise these APIs are likely to be cleaned up as unused, even if the chance of actual bugs in them is small (since they're just simple wrappers).

Perhaps some unit testing would be suitable?

Hi dave, thanks for your comments. I was worried about this thing. But there are no existing tools in llvm that work on object file directly expect there is a usage for one existing API - symbolizeCode. I don't know how it is reliable to add unit tests, I would highly appreciate your thoughts on this. thanks

In D95232#2565937, @pvellien wrote:

In D95232#2564264, @dblaikie wrote:

Generally there should be test coverage - otherwise these APIs are likely to be cleaned up as unused, even if the chance of actual bugs in them is small (since they're just simple wrappers).

Perhaps some unit testing would be suitable?

Hi dave, thanks for your comments. I was worried about this thing. But there are no existing tools in llvm that work on object file directly expect there is a usage for one existing API - symbolizeCode.

Sorry, I'm not quite following this last sentence, could you rephrase it?

The first part (" But there are no existing tools in llvm that work on object file directly ") doesn't seem quite right, so far as I'm reading it - llvm-symbolizer, the command line tool, can operate on object files and many of the llvm-symbolizer tests operate on object files. (which also makes me wonder about this change in general - llvm-symbolizer can already work on object files, so why are new APIs required, given the functionality already exists/is used/tested?)

I don't know how it is reliable to add unit tests, I would highly appreciate your thoughts on this. thanks

In D95232#2566156, @dblaikie wrote:

In D95232#2565937, @pvellien wrote:

In D95232#2564264, @dblaikie wrote:

Generally there should be test coverage - otherwise these APIs are likely to be cleaned up as unused, even if the chance of actual bugs in them is small (since they're just simple wrappers).

Perhaps some unit testing would be suitable?

Hi dave, thanks for your comments. I was worried about this thing. But there are no existing tools in llvm that work on object file directly expect there is a usage for one existing API - symbolizeCode.

Sorry, I'm not quite following this last sentence, could you rephrase it?

The first part (" But there are no existing tools in llvm that work on object file directly ") doesn't seem quite right, so far as I'm reading it - llvm-symbolizer, the command line tool, can operate on object files and many of the llvm-symbolizer tests operate on object files. (which also makes me wonder about this change in general - llvm-symbolizer can already work on object files, so why are new APIs required, given the functionality already exists/is used/tested?)

I don't know how it is reliable to add unit tests, I would highly appreciate your thoughts on this. thanks

Oh sorry, I mean an in-memory object file rather than an on-disk object file. This is similar to this patch: https://reviews.llvm.org/D63521, but it adds overloads for the missing functions. Does that help? Thanks

From my reading of the history of this component, I think DebugInfo/Symbolize was extracted from llvm-symbolizer because the sanitizer runtime needs it. Later, other LLVM internal tools (sanitizer runtime, sancov, sanstats, llvm-xray, etc.) started using the API as well.
I can't find usage in open-source projects. There could be some, but I speculate that if we do some refactoring the friction will be small.

Currently the std::string overloads are mainly used. It seems to me that we don't need the std::string overloads. They can likely all switch to the const ObjectFile & overloads.
I will take a stab at cleaning up the call sites.

@pvellien @scott.linder The concern with not testing the new API is that they are otherwise unused (I guess that you may have downstream projects which may adopt them soon) and may be deleted by other contributors as dead code.
If you have some specific use cases, contributing unittests would probably be a good idea.
I'll try refactoring the API, if const std::string& is replaced with const ObjectFile & overloads, then the API will be used by in-tree code and will be less likely deleted as dead code.

BTW: symbolizeData is special. Technically, symbolizeData does not do different things than symbolizeCode; the difference is that it returns a DIGlobal, which has the size information. Many older binary formats before ELF don't record size information. It is currently only used by hwasan and tsan for diagnostics (D96322).

addr2line does not have a "DATA" symbolization mode. I'd be really curious how you'd use that API.

Thanks for taking a look @MaskRay

(CC @jhenderson @rnk )

The issues with the const std::string & parameter symbolize* API:

The LLVMSymbolizer instance needs to construct ObjectFile, which may be a duplicate if the application has an existing ObjectFile instance.
The file reading error is buried in the API. The application cannot easily tell there is an IO error or incorrect address error and may keep symbolizing more addresses after an IO error. In addition, the user may want to handle the IO error themselves.
It does magic things (opening an auxiliary file) which may be of security concern for some usage: it constructs a .dsym path for a Mach-O file; it constructs a path from build ID for an ELF file; it constructs a .gnu_debuglink path if the section is present in an ELF file; it constructs a PDB path for a PE file. Personally I'd like the feature to open auxiliary files to be specified explicitly by the user on ELF.

The const ObjectFile & symbolize* does not have the ability to open magic auxiliary files. This is not clear from the API.

If we want to keep just one set of symbolize* overloads and go for const ObjectFile &, we perhaps need to store ObjectFile handles instead of StringRef handles for LLVMSymbolizer's internal maps, and make Expected<SymbolizableModule *> LLVMSymbolizer::getOrCreateModuleInfo(const std::string &ModuleName) public.

I want to check with folks whether some refactoring is needed in this area.
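One of the points above, that a name-based API buries the file-reading error inside symbolization, can be illustrated with a small sketch. It is not LLVM code: `LoadError`, `LoadedObject`, `loadObject`, and `symbolize` are hypothetical names, and the address range is made up. The idea shown is only that when the caller owns the loaded object, a load failure surfaces once at load time, and the per-address call can fail only because of the address.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-ins, not LLVM types.
enum class LoadError { None, NotFound };

struct LoadedObject {
  std::string Name;
  std::uint64_t Begin; // first valid address
  std::uint64_t End;   // one past the last valid address
};

// Loading is the caller's job, so an I/O-style failure is reported here,
// before any address is symbolized.
std::optional<LoadedObject> loadObject(const std::string &Path,
                                       LoadError &Err) {
  if (Path.empty()) {
    Err = LoadError::NotFound;
    return std::nullopt;
  }
  Err = LoadError::None;
  return LoadedObject{Path, 0x1000, 0x2000}; // made-up address range
}

// With an already-loaded object, the only failure left is a bad address.
std::optional<std::string> symbolize(const LoadedObject &Obj,
                                     std::uint64_t Addr) {
  if (Addr < Obj.Begin || Addr >= Obj.End)
    return std::nullopt;
  return Obj.Name + "+" + std::to_string(Addr - Obj.Begin);
}
```

A caller in this sketch can stop after a failed `loadObject` instead of repeatedly symbolizing addresses against a module that never loaded.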

This can be useful for clients who already have in-memory object files to symbolize.

I'd like to hear more about the use case. For in-memory object files, how do you handle .gnu_debuglink and the build ID stuff? It is possible that you don't want to handle this.

MaskRay added a subscriber: jhenderson. Feb 16 2021, 1:14 PM

I'm not sure if this is what @dblaikie was referring to, but it seems like a good way to test new public-facing APIs would be to write gtest unit tests. I don't think there's any precedent for this for the symbolizer API, so it may not be particularly easy. Refactoring llvm-symbolizer to make use of the new API is probably a good idea, and it may help with the test coverage, but there is a risk that a future refactor beyond that will cause llvm-symbolizer to no longer use the API, and therefore the test coverage is lost.

In D95232#2566234, @MaskRay wrote:

From my reading of the history of this component, I think DebugInfo/Symbolize was extracted from llvm-symbolizer because the sanitizer runtime needs it. Later, other LLVM internal tools (sanitizer runtime, sancov, sanstats, llvm-xray, etc.) started using the API as well.
I can't find usage in open-source projects. There could be some, but I speculate that if we do some refactoring the friction will be small.

Currently the std::string overloads are mainly used. It seems to me that we don't need the std::string overloads. They can likely all switch to the const ObjectFile & overloads.
I will take a stab at cleaning up the call sites.

@pvellien @scott.linder The concern with not testing the new API is that they are otherwise unused (I guess that you may have downstream projects which may adopt them soon) and may be deleted by other contributors as dead code.
If you have some specific use cases, contributing unittests would probably be a good idea.
I'll try refactoring the API, if const std::string& is replaced with const ObjectFile & overloads, then the API will be used by in-tree code and will be less likely deleted as dead code.

@MaskRay Yes, we planned to use those APIs for the heterogeneous address sanitizer, where we have device (GPU) object code embedded in a data section of the executable. If we want to symbolize the device object code, it would be good to have llvm-symbolizer symbolize directly on the in-memory instance rather than dumping the data to a file and passing the file name to llvm-symbolizer.

In D95232#2566553, @MaskRay wrote:

(CC @jhenderson @rnk )

The issues with the const std::string & parameter symbolize* API:

The LLVMSymbolizer instance needs to construct ObjectFile, which may be a duplicate if the application has an existing ObjectFile instance.

The file reading error is buried in the API. The application cannot easily tell there is an IO error or incorrect address error and may keep symbolizing more addresses after an IO error. In addition, the user may want to handle the IO error themselves.

It does magic things (opening an auxiliary file) which may be of security concern for some usage: it constructs a .dsym path for a Mach-O file; it constructs a path from build ID for an ELF file; it constructs a .gnu_debuglink path if the section is present in an ELF file; it constructs a PDB path for a PE file. Personally I'd like the feature to open auxiliary files to be specified explicitly by the user on ELF.

The const ObjectFile & symbolize* does not have the ability to open magic auxiliary files. This is not clear from the API.

If we want to keep just one set of symbolize* overloads and go for const ObjectFile &, we perhaps need to store ObjectFile handles instead of StringRef handles for LLVMSymbolizer's internal maps, and make Expected<SymbolizableModule *> LLVMSymbolizer::getOrCreateModuleInfo(const std::string &ModuleName) public.

I want to check with folks whether some refactoring is needed in this area.

This can be useful for clients who already have in-memory object files to symbolize.

I'd like to hear more about the use case. For in-memory object files, how do you handle .gnu_debuglink and the build ID stuff? It is possible that you don't want to handle this.

We are not handling the split DWARF scenario.

In D95232#2567721, @jhenderson wrote:

I'm not sure if this is what @dblaikie was referring to, but it seems like a good way to test new public-facing APIs would be to write gtest unit tests. I don't think there's any precedent for this for the symbolizer API, so it may not be particularly easy. Refactoring llvm-symbolizer to make use of the new API is probably a good idea, and it may help with the test coverage, but there is a risk that a future refactor beyond that will cause llvm-symbolizer to no longer use the API, and therefore the test coverage is lost.

@jhenderson So if we are about to write unit tests for the new symbolizer APIs, how should one proceed? I would like to get your thoughts, since there are no unit tests for the symbolizer.

In D95232#2566553, @MaskRay wrote:

(CC @jhenderson @rnk )

The issues with the const std::string & parameter symbolize* API:

The LLVMSymbolizer instance needs to construct ObjectFile, which may be a duplicate if the application has an existing ObjectFile instance.

The file reading error is buried in the API. The application cannot easily tell there is an IO error or incorrect address error and may keep symbolizing more addresses after an IO error. In addition, the user may want to handle the IO error themselves.

It does magic things (opening an auxiliary file) which may be of security concern for some usage: it constructs a .dsym path for a Mach-O file; it constructs a path from build ID for an ELF file; it constructs a .gnu_debuglink path if the section is present in an ELF file; it constructs a PDB path for a PE file. Personally I'd like the feature to open auxiliary files to be specified explicitly by the user on ELF.

These ^ are the reasons you're suggesting it would be good to refactor the existing code to take an ObjectFile instead of a std::string file name?

The const ObjectFile & symbolize* does not have the ability to open magic auxiliary files. This is not clear from the API.

This ^ is a limitation of these newly proposed ObjectFile APIs? So they won't be entirely compatible with the existing string-based APIs? So it would be difficult/not possible to refactor the existing code to use these APIs?

If we want to keep just one set of symbolize* overloads and go for const ObjectFile &, we perhaps need to store ObjectFile handles instead of StringRef handles for LLVMSymbolizer's internal maps, and make Expected<SymbolizableModule *> LLVMSymbolizer::getOrCreateModuleInfo(const std::string &ModuleName) public.

What's the current lifetime management of the StringRefs in these internal maps? If they come from multiple different external calls passing in strings, that seems like it'd have lifetime problems (might not be clear the caller needs to keep the underlying string data alive) & would have similar problems with ObjectFile lifetime semantics. But if that's the existing contract/complication, perhaps it's not causing any particular problems.
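The lifetime question raised here can be made concrete with a small sketch. It does not describe how LLVMSymbolizer's internal maps actually work; `ModuleCache` and its members are hypothetical. The sketch shows one way to avoid depending on the caller's string staying alive: look up with a non-owning view, but copy the name into an owning key when an entry is created.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Hypothetical cache, not LLVMSymbolizer's actual data structure.
class ModuleCache {
public:
  // Returns a stable id for Name, creating an entry on first use.
  int getOrCreate(std::string_view Name) {
    auto It = Ids.find(Name); // heterogeneous lookup, no copy made
    if (It != Ids.end())
      return It->second;
    int Id = static_cast<int>(Ids.size());
    Ids.emplace(std::string(Name), Id); // copy: the cache owns the key
    return Id;
  }
  std::size_t size() const { return Ids.size(); }

private:
  // std::less<> enables lookup by std::string_view without a temporary.
  std::map<std::string, int, std::less<>> Ids;
};
```

Because the key is copied on insertion, a later lookup still succeeds after the string used for the first call has been destroyed. An object-keyed cache would need an equivalent rule about who keeps the object alive.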

In D95232#2572950, @dblaikie wrote:

In D95232#2566553, @MaskRay wrote:

(CC @jhenderson @rnk )

The issues with the const std::string & parameter symbolize* API:

The LLVMSymbolizer instance needs to construct ObjectFile, which may be a duplicate if the application has an existing ObjectFile instance.

The file reading error is buried in the API. The application cannot easily tell there is an IO error or incorrect address error and may keep symbolizing more addresses after an IO error. In addition, the user may want to handle the IO error themselves.

It does magic things (opening an auxiliary file) which may be of security concern for some usage: it constructs a .dsym path for a Mach-O file; it constructs a path from build ID for an ELF file; it constructs a .gnu_debuglink path if the section is present in an ELF file; it constructs a PDB path for a PE file. Personally I'd like the feature to open auxiliary files to be specified explicitly by the user on ELF.

These ^ are the reasons you're suggesting it would be good to refactor the existing code to take an ObjectFile instead of a std::string file name?

Yes

The const ObjectFile & symbolize* does not have the ability to open magic auxiliary files. This is not clear from the API.

This ^ is a limitation of these newly proposed ObjectFile APIs? So they won't be entirely compatible with the existing string-based APIs? So it would be difficult/not possible to refactor the existing code to use these APIs?

I think we can pass the file with debug info as an additional parameter, as suggested by @MaskRay. Regarding refactoring all the use cases within LLVM to the new APIs, I'm not sure how complicated it would be in tools such as symbolizer, objdump, etc.

If we want to keep just one set of symbolize* overloads and go for const ObjectFile &, we perhaps need to store ObjectFile handles instead of StringRef handles for LLVMSymbolizer's internal maps, and make Expected<SymbolizableModule *> LLVMSymbolizer::getOrCreateModuleInfo(const std::string &ModuleName) public.

What's the current lifetime management of the StringRefs in these internal maps? If they come from multiple different external calls passing in strings, that seems like it'd have lifetime problems (might not be clear the caller needs to keep the underlying string data alive) & would have similar problems with ObjectFile lifetime semantics. But if that's the existing contract/complication, perhaps it's not causing any particular problems.

pvellien added a subscriber: b-sumner. Feb 25 2021, 12:35 AM

In D95232#2571890, @pvellien wrote:

In D95232#2567721, @jhenderson wrote:

I'm not sure if this is what @dblaikie was referring to, but it seems like a good way to test new public-facing APIs would be to write gtest unit tests. I don't think there's any precedent for this for the symbolizer API, so it may not be particularly easy. Refactoring llvm-symbolizer to make use of the new API is probably a good idea, and it may help with the test coverage, but there is a risk that a future refactor beyond that will cause llvm-symbolizer to no longer use the API, and therefore the test coverage is lost.

@jhenderson So if we are about to write unit tests for the new symbolizer APIs, how should one proceed? I would like to get your thoughts, since there are no unit tests for the symbolizer.

Sorry for the delay - I've got a very big workload currently and haven't had a chance to get through all review comments from the past week or so. You'd need to write CMakeLists.txt and add a source file for the tests, much like there already is in DebugInfoDWARFTests. You'd then likely want to identify some method to generate test inputs with the required properties. The exact nature of this would depend on what exactly you want to test, but two options might be to reuse the DwarfGenerator code in the DWARF tests, or to use YAML inputs as some other tests for libObject and other places do, to create the object. I can't really give you any more precise details than that, I'm afraid, as I don't know exactly the best way forward without spending time attempting it myself, but I hope these give you some ideas to start with.

Revision Contents

Path	Size
llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h	24 lines
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp	114 lines

Diff 323388

llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h

(52 lines not shown)
 public:
   LLVMSymbolizer() = default;
   LLVMSymbolizer(const Options &Opts) : Opts(Opts) {}

   ~LLVMSymbolizer() {
     flush();
   }

+  // Overloads accepting ObjectFile does not support COFF currently
   Expected<DILineInfo> symbolizeCode(const ObjectFile &Obj,
                                      object::SectionedAddress ModuleOffset);
   Expected<DILineInfo> symbolizeCode(const std::string &ModuleName,
                                      object::SectionedAddress ModuleOffset);
   Expected<DIInliningInfo>
+  symbolizeInlinedCode(const ObjectFile &Obj,
+                       object::SectionedAddress ModuleOffset);
+  Expected<DIInliningInfo>
   symbolizeInlinedCode(const std::string &ModuleName,
                        object::SectionedAddress ModuleOffset);

+  Expected<DIGlobal> symbolizeData(const ObjectFile &Obj,
+                                   object::SectionedAddress ModuleOffset);
   Expected<DIGlobal> symbolizeData(const std::string &ModuleName,
                                    object::SectionedAddress ModuleOffset);
   Expected<std::vector<DILocal>>
+  symbolizeFrame(const ObjectFile &Obj, object::SectionedAddress ModuleOffset);
+  Expected<std::vector<DILocal>>
   symbolizeFrame(const std::string &ModuleName,
                  object::SectionedAddress ModuleOffset);
   void flush();

   static std::string
   DemangleName(const std::string &Name,
                const SymbolizableModule *DbiModuleDescriptor);

 private:
   // Bundles together object file with code/data and object file with
   // corresponding debug info. These objects can be the same.
   using ObjectPair = std::pair<const ObjectFile *, const ObjectFile *>;

+  template <typename T>
   Expected<DILineInfo>
-  symbolizeCodeCommon(SymbolizableModule *Info,
+  symbolizeCodeCommon(const T &ModuleSpecifier,
+                      object::SectionedAddress ModuleOffset);
+  template <typename T>
+  Expected<DIInliningInfo>
+  symbolizeInlinedCodeCommon(const T &ModuleSpecifier,
+                             object::SectionedAddress ModuleOffset);
+  template <typename T>
+  Expected<DIGlobal> symbolizeDataCommon(const T &ModuleSpecifier,
+                                         object::SectionedAddress ModuleOffset);
+  template <typename T>
+  Expected<std::vector<DILocal>>
+  symbolizeFrameCommon(const T &ModuleSpecifier,
                        object::SectionedAddress ModuleOffset);

   /// Returns a SymbolizableModule or an error if loading debug info failed.
   /// Only one attempt is made to load a module, and errors during loading are
   /// only reported once. Subsequent calls to get module info for a module that
   /// failed to load will return nullptr.
   Expected<SymbolizableModule *>
   getOrCreateModuleInfo(const std::string &ModuleName);
+  Expected<SymbolizableModule *> getOrCreateModuleInfo(const ObjectFile &Obj);

   Expected<SymbolizableModule *>
   createModuleInfo(const ObjectFile *Obj,
                    std::unique_ptr<DIContext> Context,
                    StringRef ModuleName);

   ObjectFile *lookUpDsymFile(const std::string &Path,
                              const MachOObjectFile *ExeObj,
(40 lines not shown)

llvm/lib/DebugInfo/Symbolize/Symbolize.cpp

Show All 33 Lines
#include "llvm/Support/Path.h"		#include "llvm/Support/Path.h"
#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cstring>		#include <cstring>

namespace llvm {		namespace llvm {
namespace symbolize {		namespace symbolize {

		template <typename T>
Expected<DILineInfo>		Expected<DILineInfo>
LLVMSymbolizer::symbolizeCodeCommon(SymbolizableModule *Info,		LLVMSymbolizer::symbolizeCodeCommon(const T &ModuleSpecifier,
object::SectionedAddress ModuleOffset) {		object::SectionedAddress ModuleOffset) {

		auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier);
		if (!InfoOrErr)
		return InfoOrErr.takeError();

		SymbolizableModule Info = InfoOrErr;

// A null module means an error has already been reported. Return an empty		// A null module means an error has already been reported. Return an empty
// result.		// result.
if (!Info)		if (!Info)
return DILineInfo();		return DILineInfo();

// If the user is giving us relative addresses, add the preferred base of the		// If the user is giving us relative addresses, add the preferred base of the
// object to the offset before we do the query. It's what DIContext expects.		// object to the offset before we do the query. It's what DIContext expects.
if (Opts.RelativeAddresses)		if (Opts.RelativeAddresses)
ModuleOffset.Address += Info->getModulePreferredBase();		ModuleOffset.Address += Info->getModulePreferredBase();

DILineInfo LineInfo = Info->symbolizeCode(		DILineInfo LineInfo = Info->symbolizeCode(
ModuleOffset, DILineInfoSpecifier(Opts.PathStyle, Opts.PrintFunctions),		ModuleOffset, DILineInfoSpecifier(Opts.PathStyle, Opts.PrintFunctions),
Opts.UseSymbolTable);		Opts.UseSymbolTable);
if (Opts.Demangle)		if (Opts.Demangle)
LineInfo.FunctionName = DemangleName(LineInfo.FunctionName, Info);		LineInfo.FunctionName = DemangleName(LineInfo.FunctionName, Info);
return LineInfo;		return LineInfo;
}		}

Expected<DILineInfo>		Expected<DILineInfo>
LLVMSymbolizer::symbolizeCode(const ObjectFile &Obj,		LLVMSymbolizer::symbolizeCode(const ObjectFile &Obj,
object::SectionedAddress ModuleOffset) {		object::SectionedAddress ModuleOffset) {
StringRef ModuleName = Obj.getFileName();		return symbolizeCodeCommon(Obj, ModuleOffset);
		scott.linderUnsubmitted Not Done Reply Inline Actions See below, it seems like this is just a bug currently? scott.linder: See below, it seems like this is just a bug currently?
auto I = Modules.find(ModuleName);
if (I != Modules.end())
return symbolizeCodeCommon(I->second.get(), ModuleOffset);

std::unique_ptr<DIContext> Context = DWARFContext::create(Obj);
Expected<SymbolizableModule *> InfoOrErr =
createModuleInfo(&Obj, std::move(Context), ModuleName);
if (!InfoOrErr)
return InfoOrErr.takeError();
return symbolizeCodeCommon(*InfoOrErr, ModuleOffset);
}		}

Expected<DILineInfo>		Expected<DILineInfo>
LLVMSymbolizer::symbolizeCode(const std::string &ModuleName,		LLVMSymbolizer::symbolizeCode(const std::string &ModuleName,
object::SectionedAddress ModuleOffset) {		object::SectionedAddress ModuleOffset) {
Expected<SymbolizableModule *> InfoOrErr = getOrCreateModuleInfo(ModuleName);		return symbolizeCodeCommon(ModuleName, ModuleOffset);
if (!InfoOrErr)
return InfoOrErr.takeError();
return symbolizeCodeCommon(*InfoOrErr, ModuleOffset);
}		}

Expected<DIInliningInfo>		template <typename T>
LLVMSymbolizer::symbolizeInlinedCode(const std::string &ModuleName,		Expected<DIInliningInfo> LLVMSymbolizer::symbolizeInlinedCodeCommon(
object::SectionedAddress ModuleOffset) {		const T &ModuleSpecifier, object::SectionedAddress ModuleOffset) {
SymbolizableModule *Info;		auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier);
if (auto InfoOrErr = getOrCreateModuleInfo(ModuleName))		if (!InfoOrErr)
Info = InfoOrErr.get();
else
return InfoOrErr.takeError();		return InfoOrErr.takeError();

		SymbolizableModule Info = InfoOrErr;

// A null module means an error has already been reported. Return an empty		// A null module means an error has already been reported. Return an empty
// result.		// result.
if (!Info)		if (!Info)
return DIInliningInfo();		return DIInliningInfo();

// If the user is giving us relative addresses, add the preferred base of the		// If the user is giving us relative addresses, add the preferred base of the
// object to the offset before we do the query. It's what DIContext expects.		// object to the offset before we do the query. It's what DIContext expects.
if (Opts.RelativeAddresses)		if (Opts.RelativeAddresses)
ModuleOffset.Address += Info->getModulePreferredBase();		ModuleOffset.Address += Info->getModulePreferredBase();

DIInliningInfo InlinedContext = Info->symbolizeInlinedCode(		DIInliningInfo InlinedContext = Info->symbolizeInlinedCode(
ModuleOffset, DILineInfoSpecifier(Opts.PathStyle, Opts.PrintFunctions),		ModuleOffset, DILineInfoSpecifier(Opts.PathStyle, Opts.PrintFunctions),
Opts.UseSymbolTable);		Opts.UseSymbolTable);
if (Opts.Demangle) {		if (Opts.Demangle) {
for (int i = 0, n = InlinedContext.getNumberOfFrames(); i < n; i++) {		for (int i = 0, n = InlinedContext.getNumberOfFrames(); i < n; i++) {
auto *Frame = InlinedContext.getMutableFrame(i);		auto *Frame = InlinedContext.getMutableFrame(i);
Frame->FunctionName = DemangleName(Frame->FunctionName, Info);		Frame->FunctionName = DemangleName(Frame->FunctionName, Info);
}		}
}		}
return InlinedContext;		return InlinedContext;
}		}

		Expected<DIInliningInfo>
		LLVMSymbolizer::symbolizeInlinedCode(const ObjectFile &Obj,
		object::SectionedAddress ModuleOffset) {
		return symbolizeInlinedCodeCommon(Obj, ModuleOffset);
		}

		Expected<DIInliningInfo>
		LLVMSymbolizer::symbolizeInlinedCode(const std::string &ModuleName,
		object::SectionedAddress ModuleOffset) {
		return symbolizeInlinedCodeCommon(ModuleName, ModuleOffset);
		}

		template <typename T>
Expected<DIGlobal>		Expected<DIGlobal>
LLVMSymbolizer::symbolizeData(const std::string &ModuleName,		LLVMSymbolizer::symbolizeDataCommon(const T &ModuleSpecifier,
object::SectionedAddress ModuleOffset) {		object::SectionedAddress ModuleOffset) {
SymbolizableModule *Info;
if (auto InfoOrErr = getOrCreateModuleInfo(ModuleName))		auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier);
Info = InfoOrErr.get();		if (!InfoOrErr)
else
return InfoOrErr.takeError();		return InfoOrErr.takeError();

		SymbolizableModule *Info = *InfoOrErr;
// A null module means an error has already been reported. Return an empty		// A null module means an error has already been reported. Return an empty
// result.		// result.
if (!Info)		if (!Info)
return DIGlobal();		return DIGlobal();

// If the user is giving us relative addresses, add the preferred base of		// If the user is giving us relative addresses, add the preferred base of
// the object to the offset before we do the query. It's what DIContext		// the object to the offset before we do the query. It's what DIContext
// expects.		// expects.
if (Opts.RelativeAddresses)		if (Opts.RelativeAddresses)
ModuleOffset.Address += Info->getModulePreferredBase();		ModuleOffset.Address += Info->getModulePreferredBase();

DIGlobal Global = Info->symbolizeData(ModuleOffset);		DIGlobal Global = Info->symbolizeData(ModuleOffset);
if (Opts.Demangle)		if (Opts.Demangle)
Global.Name = DemangleName(Global.Name, Info);		Global.Name = DemangleName(Global.Name, Info);
return Global;		return Global;
}		}

		Expected<DIGlobal>
		LLVMSymbolizer::symbolizeData(const ObjectFile &Obj,
		object::SectionedAddress ModuleOffset) {
		return symbolizeDataCommon(Obj, ModuleOffset);
		}

		Expected<DIGlobal>
		LLVMSymbolizer::symbolizeData(const std::string &ModuleName,
		object::SectionedAddress ModuleOffset) {
		return symbolizeDataCommon(ModuleName, ModuleOffset);
		}

		template <typename T>
Expected<std::vector<DILocal>>		Expected<std::vector<DILocal>>
LLVMSymbolizer::symbolizeFrame(const std::string &ModuleName,		LLVMSymbolizer::symbolizeFrameCommon(const T &ModuleSpecifier,
object::SectionedAddress ModuleOffset) {		object::SectionedAddress ModuleOffset) {
SymbolizableModule *Info;		auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier);
if (auto InfoOrErr = getOrCreateModuleInfo(ModuleName))		if (!InfoOrErr)
Info = InfoOrErr.get();
else
return InfoOrErr.takeError();		return InfoOrErr.takeError();

		SymbolizableModule *Info = *InfoOrErr;
// A null module means an error has already been reported. Return an empty		// A null module means an error has already been reported. Return an empty
// result.		// result.
if (!Info)		if (!Info)
return std::vector<DILocal>();		return std::vector<DILocal>();

// If the user is giving us relative addresses, add the preferred base of		// If the user is giving us relative addresses, add the preferred base of
// the object to the offset before we do the query. It's what DIContext		// the object to the offset before we do the query. It's what DIContext
// expects.		// expects.
if (Opts.RelativeAddresses)		if (Opts.RelativeAddresses)
ModuleOffset.Address += Info->getModulePreferredBase();		ModuleOffset.Address += Info->getModulePreferredBase();

return Info->symbolizeFrame(ModuleOffset);		return Info->symbolizeFrame(ModuleOffset);
}		}

		Expected<std::vector<DILocal>>
		LLVMSymbolizer::symbolizeFrame(const ObjectFile &Obj,
		object::SectionedAddress ModuleOffset) {
		return symbolizeFrameCommon(Obj, ModuleOffset);
		}

		Expected<std::vector<DILocal>>
		LLVMSymbolizer::symbolizeFrame(const std::string &ModuleName,
		object::SectionedAddress ModuleOffset) {
		return symbolizeFrameCommon(ModuleName, ModuleOffset);
		}

void LLVMSymbolizer::flush() {		void LLVMSymbolizer::flush() {
ObjectForUBPathAndArch.clear();		ObjectForUBPathAndArch.clear();
BinaryForPath.clear();		BinaryForPath.clear();
ObjectPairForPathArch.clear();		ObjectPairForPathArch.clear();
Modules.clear();		Modules.clear();
}		}

namespace {		namespace {
[363 lines not shown]	LLVMSymbolizer::getOrCreateModuleInfo(const std::string &ModuleName) {
auto ObjectsOrErr = getOrCreateObjectPair(BinaryName, ArchName);		auto ObjectsOrErr = getOrCreateObjectPair(BinaryName, ArchName);
if (!ObjectsOrErr) {		if (!ObjectsOrErr) {
// Failed to find valid object file.		// Failed to find valid object file.
Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>());		Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>());
return ObjectsOrErr.takeError();		return ObjectsOrErr.takeError();
}		}
ObjectPair Objects = ObjectsOrErr.get();		ObjectPair Objects = ObjectsOrErr.get();

std::unique_ptr<DIContext> Context;		std::unique_ptr<DIContext> Context;
// If this is a COFF object containing PDB info, use a PDBContext to		// If this is a COFF object containing PDB info, use a PDBContext to
// symbolize. Otherwise, use DWARF.		// symbolize. Otherwise, use DWARF.
if (auto CoffObject = dyn_cast<COFFObjectFile>(Objects.first)) {		if (auto CoffObject = dyn_cast<COFFObjectFile>(Objects.first)) {
const codeview::DebugInfo *DebugInfo;		const codeview::DebugInfo *DebugInfo;
StringRef PDBFileName;		StringRef PDBFileName;
auto EC = CoffObject->getDebugPDBInfo(DebugInfo, PDBFileName);		auto EC = CoffObject->getDebugPDBInfo(DebugInfo, PDBFileName);
if (!EC && DebugInfo != nullptr && !PDBFileName.empty()) {		if (!EC && DebugInfo != nullptr && !PDBFileName.empty()) {
using namespace pdb;		using namespace pdb;
std::unique_ptr<IPDBSession> Session;		std::unique_ptr<IPDBSession> Session;

PDB_ReaderType ReaderType =		PDB_ReaderType ReaderType =
Opts.UseDIA ? PDB_ReaderType::DIA : PDB_ReaderType::Native;		Opts.UseDIA ? PDB_ReaderType::DIA : PDB_ReaderType::Native;
if (auto Err = loadDataForEXE(ReaderType, Objects.first->getFileName(),		if (auto Err = loadDataForEXE(ReaderType, Objects.first->getFileName(),
Session)) {		Session)) {
Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>());		Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>());
// Return along the PDB filename to provide more context		// Return along the PDB filename to provide more context
return createFileError(PDBFileName, std::move(Err));		return createFileError(PDBFileName, std::move(Err));
}		}
Context.reset(new PDBContext(*CoffObject, std::move(Session)));		Context.reset(new PDBContext(*CoffObject, std::move(Session)));
}		}
}		}
if (!Context)		if (!Context)
Context = DWARFContext::create(*Objects.second, nullptr, Opts.DWPName);		Context = DWARFContext::create(*Objects.second, nullptr, Opts.DWPName);
scott.linder (inline comment):
The logic for `CoffObject` is not repeated elsewhere, which doesn't seem intended? It seems like this method should also be overloaded, i.e. rather than repeat the "try to find it, if not create it" logic in each overload above, just implement `LLVMSymbolizer::getOrCreateModuleInfo(const ObjectFile &)` too, and the bodies elsewhere should become identical.

I'd also consider just making the private `Common` methods templates over the type of the first argument, and have the overloads be one-line wrappers:

    template <typename T>
    Expected<DILineInfo> symbolizeCodeCommon(T ModuleSpecifier, object::SectionedAddress ModuleOffset) {
      SymbolizableModule *Info;
      if (auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier))
        Info = InfoOrErr.get();
      else
        return InfoOrErr.takeError();
      // ... the existing body in terms of Info and ModuleOffset
    }

    Expected<DILineInfo> LLVMSymbolizer::symbolizeCode(const std::string &ModuleName, object::SectionedAddress ModuleOffset) {
      return symbolizeCodeCommon(ModuleName, ModuleOffset);
    }

    Expected<DILineInfo> LLVMSymbolizer::symbolizeCode(const ObjectFile &ModuleObjectFile, object::SectionedAddress ModuleOffset) {
      return symbolizeCodeCommon(ModuleObjectFile, ModuleOffset);
    }

I think this minimizes the amount of boilerplate, short of using the preprocessor, while still presenting the nice overloaded interface rather than the templated one.
scott.linder (inline comment):
Small amendment: I'd change the snippet above slightly to begin the method body with:

    auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier);
    if (auto Err = InfoOrErr.takeError())
      return Err;
    SymbolizableModule *Info = *InfoOrErr;

Which is shorter, follows the convention of most of the rest of the file, and follows https://llvm.org/docs/ProgrammersManual.html#recoverable-errors
return createModuleInfo(Objects.first, std::move(Context), ModuleName);		return createModuleInfo(Objects.first, std::move(Context), ModuleName);
}		}
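
The refactoring suggested in the inline comment above (a templated private `Common` method behind one-line public overloads) can be sketched outside of LLVM roughly as follows. This is only an illustration under stated assumptions, not the patch itself: `FakeObjectFile`, `ModuleInfo`, and `Symbolizer` are hypothetical stand-ins, `std::optional` takes the place of `llvm::Expected`, and the "symbolized" result is simply the adjusted address.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <optional>
#include <string>

// Hypothetical stand-ins for ObjectFile and SymbolizableModule.
struct FakeObjectFile {
  std::string FileName;
  uint64_t PreferredBase;
};

struct ModuleInfo {
  std::string Name;
  uint64_t PreferredBase;
};

class Symbolizer {
public:
  explicit Symbolizer(bool RelativeAddresses)
      : RelativeAddresses(RelativeAddresses) {}

  // Public interface: plain overloads, each a one-line wrapper.
  std::optional<uint64_t> symbolizeCode(const std::string &ModuleName,
                                        uint64_t Offset) {
    return symbolizeCodeCommon(ModuleName, Offset);
  }
  std::optional<uint64_t> symbolizeCode(const FakeObjectFile &Obj,
                                        uint64_t Offset) {
    return symbolizeCodeCommon(Obj, Offset);
  }

private:
  // Shared body, templated over the kind of module specifier. Overload
  // resolution on getOrCreateModuleInfo selects the matching lookup.
  template <typename T>
  std::optional<uint64_t> symbolizeCodeCommon(const T &ModuleSpecifier,
                                              uint64_t Offset) {
    ModuleInfo *Info = getOrCreateModuleInfo(ModuleSpecifier);
    if (!Info)
      return std::nullopt;
    if (RelativeAddresses)
      Offset += Info->PreferredBase;
    return Offset;
  }

  // In this sketch the name-based lookup only consults the cache.
  ModuleInfo *getOrCreateModuleInfo(const std::string &ModuleName) {
    auto It = Modules.find(ModuleName);
    return It == Modules.end() ? nullptr : It->second.get();
  }

  // The object-based lookup creates and caches an entry on first use.
  ModuleInfo *getOrCreateModuleInfo(const FakeObjectFile &Obj) {
    auto It = Modules.find(Obj.FileName);
    if (It != Modules.end())
      return It->second.get();
    auto Inserted = Modules.emplace(
        Obj.FileName, std::make_unique<ModuleInfo>(
                          ModuleInfo{Obj.FileName, Obj.PreferredBase}));
    assert(Inserted.second && "find() above guarantees a fresh key");
    return Inserted.first->second.get();
  }

  bool RelativeAddresses;
  std::map<std::string, std::unique_ptr<ModuleInfo>> Modules;
};
```

Callers see only the two `symbolizeCode` overloads; the template stays private, which is the trade-off the comment argues for.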

		Expected<SymbolizableModule *>
		LLVMSymbolizer::getOrCreateModuleInfo(const ObjectFile &Obj) {
		StringRef ObjName = Obj.getFileName();
		auto I = Modules.find(ObjName);
		if (I != Modules.end())
		return I->second.get();

		std::unique_ptr<DIContext> Context = DWARFContext::create(Obj);
		// FIXME: handle COFF object with PDB info to use PDBContext
scott.linder (inline comment): Could you add a `FIXME:` comment here to the effect of "Should this handle COFF specially like getOrCreateModuleInfo(const std::string &) does?"
		return createModuleInfo(&Obj, std::move(Context), ObjName);
		}
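
Both `getOrCreateModuleInfo` overloads follow a look-up-then-create pattern on the `Modules` map, and the string overload additionally stores a null entry when loading fails, which is why the callers treat a null module as "an error has already been reported". A minimal sketch of that caching behaviour, assuming hypothetical names (`Module`, `ModuleCache`, and a caller-supplied `Loader` callback that returns null on failure):

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for SymbolizableModule.
struct Module {
  std::string Name;
};

class ModuleCache {
public:
  // The loader returns a null pointer when the module cannot be created.
  using Loader = std::function<std::unique_ptr<Module>(const std::string &)>;

  explicit ModuleCache(Loader L) : Load(std::move(L)) {}

  // Returns the cached module, or nullptr if loading failed now or earlier.
  Module *getOrCreate(const std::string &Name) {
    auto It = Modules.find(Name);
    if (It != Modules.end())
      return It->second.get(); // May be a cached failure (nullptr).

    // A failed load yields a null unique_ptr; it is stored as well, so the
    // loader is not retried for the same name until flush() is called.
    auto Inserted = Modules.emplace(Name, Load(Name));
    return Inserted.first->second.get();
  }

  // Drops every cached entry, including cached failures.
  void flush() { Modules.clear(); }

private:
  Loader Load;
  std::map<std::string, std::unique_ptr<Module>> Modules;
};
```

Keying the cache by file name, as the `ObjectFile` overload in the diff does with `Obj.getFileName()`, means two lookups with the same name share one entry.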

namespace {		namespace {

// Undo these various manglings for Win32 extern "C" functions:		// Undo these various manglings for Win32 extern "C" functions:
// cdecl - _foo		// cdecl - _foo
// stdcall - _foo@12		// stdcall - _foo@12
// fastcall - @foo@12		// fastcall - @foo@12
// vectorcall - foo@@12		// vectorcall - foo@@12
// These are all different linkage names for 'foo'.		// These are all different linkage names for 'foo'.
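
The four decorations listed in the comment above can be undone with a few string operations. The following is a standalone sketch using `std::string_view`, not the function from Symbolize.cpp (which is elided here); the name `stripWin32Decoration` is hypothetical, and the sketch assumes the input is an extern "C" name rather than a C++-mangled one.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: strip Win32 extern "C" decorations from a symbol name.
std::string_view stripWin32Decoration(std::string_view Name) {
  // Drop a leading '_' (cdecl, stdcall) or '@' (fastcall).
  if (!Name.empty() && (Name.front() == '_' || Name.front() == '@'))
    Name.remove_prefix(1);

  // Drop a trailing "@<digits>" argument-size suffix, if present.
  std::size_t At = Name.rfind('@');
  if (At != std::string_view::npos && At + 1 < Name.size() &&
      std::all_of(Name.begin() + At + 1, Name.end(),
                  [](unsigned char C) { return std::isdigit(C) != 0; }))
    Name = Name.substr(0, At);

  // vectorcall uses "@@", so one more '@' may remain at the end.
  if (!Name.empty() && Name.back() == '@')
    Name.remove_suffix(1);

  return Name;
}
```

With this sketch, `_foo`, `_foo@12`, `@foo@12`, and `foo@@12` all reduce to `foo`.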
[59 lines not shown]