This is an archive of the discontinued LLVM Phabricator instance.


Differential D101340

Allows for dsymutil crashes to generate reproducible information
Abandoned, Public

Authored by JDevlieghere on Apr 26 2021, 8:14 PM.


Details

Reviewers

friss
aprantl
void
ruiu
bkramer
feg208

Summary

When dsymutil crashes, this change lets the user collect and submit the object files
and a map file so the crash can be reproduced.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	3,590 ms	x64 debian > libarcher.races::lock-unrelated.c

Event Timeline

feg208 created this revision. Apr 26 2021, 8:14 PM

Herald added a subscriber: dang. Apr 26 2021, 8:14 PM

feg208 requested review of this revision. Apr 26 2021, 8:14 PM

Herald added a project: Restricted Project. Apr 26 2021, 8:14 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B101074: Diff 340717. Apr 26 2021, 8:55 PM

Fixed clang-tidy warnings

Harbormaster completed remote builds in B101077: Diff 340721. Apr 26 2021, 10:01 PM

Missing more clang-tidy checks

Harbormaster completed remote builds in B101181: Diff 340867. Apr 27 2021, 9:46 AM

How is collect-crash-inputs different from gen-reproducer? It seems like they're essentially doing the same thing. Can we implement the crash recovery in terms of llvm::dsymutil::Reproducer?

This alters the prior commit by incorporating the Reproducer for the purpose of persisting the files on crash

Forgot to remove raise(SIGSEGV)

Harbormaster completed remote builds in B101558: Diff 341391. Apr 28 2021, 9:27 PM

Harbormaster completed remote builds in B101559: Diff 341392. Apr 28 2021, 9:30 PM

Cleans up clang-tidy/format issues and some minor branch peccadillos

Harbormaster completed remote builds in B101726: Diff 341635. Apr 29 2021, 4:04 PM

clang-format misses

Harbormaster completed remote builds in B102115: Diff 342164. May 1 2021, 11:31 AM

feg208 added reviewers: void, ruiu, bkramer. May 24 2021, 1:08 PM

JDevlieghere added inline comments. May 24 2021, 2:13 PM

llvm/tools/dsymutil/Reproducer.cpp
57–80	If we split this into a separate class, this can remain here, and have the crash variant print the "PLEASE ATTACH THE FILES IN THE FOLLOWING DIRECTORY TO THE BUG REPORT" in the generate crash class.
llvm/tools/dsymutil/Reproducer.h
84	I think we probably want a separate class that corresponds to `GenerateForCrash`.
llvm/tools/dsymutil/dsymutil.cpp
490–491	Spurious newline
713–737	This seems like it should be part of the reproducers class.

Addresses some review comments

I think I addressed all the comments? If so I can't push the changes into the repo. Could you? Assuming there aren't outstanding concerns of course.

Harbormaster completed remote builds in B106017: Diff 347536. May 24 2021, 6:42 PM

Broke a regression test

Harbormaster completed remote builds in B106028: Diff 347557. May 24 2021, 8:41 PM

I am wondering if I could get someone to look at this? I think I have addressed the review comments but I'd be happy to fix anything I missed.

Pinging @JDevlieghere ^

JDevlieghere added inline comments. Jun 4 2021, 9:05 AM

llvm/tools/dsymutil/Reproducer.cpp
37	This should be consistent with the header. It's fine (and common) to use the same name for the argument and member variable.
38	LLVM generally does not prefer braced initializer lists (https://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor)
106	You should use `LLVM_FALLTHROUGH`, otherwise this won't work if the host compiler is not clang.
108–110	No braces around single line statements (https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements)
llvm/tools/dsymutil/Reproducer.h
24	Maybe `GenerateOnCrash`?
40	LLVM doesn't use exceptions (and generally builds without them).
50	If we're going to store the root, this might as well take a `std::string` too and move it into `Root`.
52	Rather than returning a small string with arbitrary lengths, the common pattern in llvm is to let the caller decide on the storage type and pass in a `llvm::SmallVectorImpl<char>&` that the function then populates. This also avoids any potential copies.
llvm/tools/dsymutil/dsymutil.cpp
493–495	I don't think `auto` is a good fit here (https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable)
682–686	Unrelated whitespace change?
717
721–729	Why is this needed?
751

I rolled up all the comments save one where I just have an outstanding question and then I'll roll that up as well

llvm/tools/dsymutil/Reproducer.cpp
38	Got it. Thanks for passing this along. I really should have reviewed it before this submission but I certainly will before the next
llvm/tools/dsymutil/dsymutil.cpp
682–686	yeah. Sorry it snuck in.
721–729	So my understanding is that when dsymutil crashes we want the mapping.yaml file, the binary and the associated .o file (assuming it's compiled with -g). The latter maybe only for some workflows? Reusing this gets us the object file but at the cost of parsing the debug map. So maybe I should just copy the binary in? I can do that or just parse enough to recreate the object files. Or maybe I am misunderstanding this entirely?

Gets some review comments

Harbormaster completed remote builds in B108057: Diff 350396. Jun 7 2021, 1:27 PM

JDevlieghere requested changes to this revision. Jun 11 2021, 11:06 AM

JDevlieghere added inline comments.

llvm/tools/dsymutil/Reproducer.cpp
56–80	We need the whole directory, so might as well ask the reporter to include it as a whole. ******************** PLEASE ATTACH THE FOLLOWING DIRECTORY TO THE BUG REPORT:
101	The `nullptr` here is redundant.
llvm/tools/dsymutil/Reproducer.h
40	Don't use auto here, it's not obvious from context what the type is.
llvm/tools/dsymutil/dsymutil.cpp
492	Let's be consistent with all the surrounding code and the style guide.
721–729
744–753	I actually don't think this is going to work. You're creating the reproducer after we've crashed and you're not telling dsymutil to use the VFS from that Reproducer instance. It's the VFS that records all access and is what will allow us to copy all the files in place if we decide we need to keep the reproducer. The way I would imagine this to work is:
1. Parse the command line options. If we crash at this point we won't have a reproducer, which I think is acceptable.
2. Initialize the reproducer.
3. Call CRC.RunSafely to do the actual work. We pass it the reproducer, so it captures all file accesses, which is everything we need for the reproducer.
4a. If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer.
4b. If we don't crash, but `--gen-reproducer` was passed, we do the same.
Does that make sense?
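
The flow proposed in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the patch: `SketchReproducer`, `SketchOptions`, `runSafelySketch` and `runTool` are hypothetical names, and `runSafelySketch` is only a stand-in for `CRC.RunSafely` in which the callback returns `false` to simulate a crash. Printing the replay invocation (step 4a) is left out.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the reproducer. The real one would hand out a VFS
// that records every file access; here we just keep a list of paths.
struct SketchReproducer {
  std::vector<std::string> AccessedFiles;
  bool Generated = false;
  void recordAccess(const std::string &Path) { AccessedFiles.push_back(Path); }
  void generate() { Generated = true; }
};

// Assumed options; SimulateCrash exists only so the sketch can be exercised.
struct SketchOptions {
  bool GenReproducer = false;
  bool SimulateCrash = false;
  std::vector<std::string> Inputs;
};

// Stand-in for CRC.RunSafely: returns true when the work completed and false
// when it "crashed". The callback's return value simulates the crash.
bool runSafelySketch(const std::function<bool()> &Work) { return Work(); }

// Steps 2 to 4 of the proposed flow. Step 1 (option parsing) is assumed to
// have happened already and is represented by Opts.
SketchReproducer runTool(const SketchOptions &Opts) {
  SketchReproducer Repro;                      // Step 2: initialize first.
  bool Succeeded = runSafelySketch([&] {       // Step 3: guarded work.
    for (const std::string &Input : Opts.Inputs)
      Repro.recordAccess(Input);               // All accesses are captured.
    return !Opts.SimulateCrash;
  });
  if (!Succeeded || Opts.GenReproducer)        // Steps 4a and 4b.
    Repro.generate();
  return Repro;
}
```

The point of the ordering is that the reproducer exists before the guarded work starts, so the file accesses are already recorded by the time a crash is detected.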

This revision now requires changes to proceed. Jun 11 2021, 11:06 AM

I think this covers the comments

llvm/tools/dsymutil/dsymutil.cpp
492	Oh right. Oops.
744–753	Yeah this makes sense and answers my confusion about the right way to go here.

Rolled up review comments

Harbormaster completed remote builds in B108980: Diff 351668. Jun 12 2021, 10:13 AM

Fixes the broken tests

This is not yet complete. I missed step 4a: "If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer." It currently doesn't print the command line.

Added command line print

Harbormaster completed remote builds in B109034: Diff 351744. Jun 13 2021, 3:48 PM

Clang-tidy checks cleaned up and test fixes

Forgot the clang-tidy checks

Harbormaster completed remote builds in B109155: Diff 351945. Jun 14 2021, 12:51 PM

I just wanted to follow up and see whether my changes at least addressed the concerns raised, or whether you think there is further work here.

JDevlieghere added inline comments. Jun 28 2021, 11:02 AM

llvm/tools/dsymutil/Reproducer.cpp
106	Why is this a fallthrough? It seems like now, if we fail to create the `ReproducerGenerateForCrash` instance, we'll try creating a regular `ReproducerGenerate` and only then print the error message? I think we should just handle the EC immediately.
llvm/tools/dsymutil/Reproducer.h
66–68	Do we still need the ability to specify the mode?
97
100	Why do we need a move ctor?
110	I'm not sure if this is really something we want, with any real-world project the number of object files can quickly become overwhelming. Even if we do, this should probably be a property of the `ReproducerGenerate` and not be specific for the crash one.

Addresses some review comments

I have a few outstanding questions

llvm/tools/dsymutil/Reproducer.cpp
106	I don't think so. But maybe my read is incorrect? The unique_ptr would still get created and thus the nullptr check wouldn't pass and so the ec check below would still do its thing.
llvm/tools/dsymutil/Reproducer.h
66–68	That's there for the classof dyn_cast business.
100	So the answer is found around line 734 in dsymutil.cpp. We pass a unique_ptr for the Reproducer into the main function, so if the user selects -gen-reproducer and the program then crashes, we need to move the ReproducerGenerate object into the GenerateForCrash so we get the expected messages.
110	We pass the Reproducer unique_ptr into the main function, so in the case where no -gen-reproducer is passed and the program didn't crash, we need to make sure we don't dump the files. That was the intent here. But I can (and did) move it to the parent class. So a question that arises from your comment: should we clean up the dumped files on a crash back to just the binary plus mapping file?

Harbormaster completed remote builds in B111414: Diff 355076. Jun 28 2021, 6:34 PM

I was thinking about it some more and a bunch of complexity here seems to come from the fact that some stuff happens within the CrashRecoveryContext (i.e. dsymutilMain). I believe we can avoid the majority of it by parsing the option, initializing the reproducer, and then running the remaining work in the RunSafely lambda.

I also think we should be able to avoid the dyn_cast and rely on polymorphism instead. For example we could have a virtual generate(bool crashed) method that is called after CRC.RunSafely. For ReproducerGenerate it would unconditionally generate the reproducer. For ReproducerGenerateOnCrash it would only do so if crashed == true.
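
A minimal sketch of this polymorphic approach is shown below. Only the `generate(bool crashed)` idea comes from the comment; the class names, the `Generated` flag and the `finish` helper are hypothetical and stand in for the real reproducer classes, which would copy files and write the mapping instead of setting a flag.

```cpp
// Base class: reproducers are off, so generate() does nothing.
class ReproducerSketch {
public:
  virtual ~ReproducerSketch() = default;
  // Called once after the guarded work has finished.
  virtual void generate(bool Crashed) { (void)Crashed; }
  bool wasGenerated() const { return Generated; }

protected:
  bool Generated = false;
};

// Corresponds to the "always generate" mode.
class GenerateSketch : public ReproducerSketch {
public:
  void generate(bool Crashed) override {
    (void)Crashed;
    Generated = true; // Unconditionally write the reproducer.
  }
};

// Corresponds to the "generate on crash" mode.
class GenerateOnCrashSketch : public ReproducerSketch {
public:
  void generate(bool Crashed) override {
    if (Crashed)
      Generated = true; // Only keep the reproducer after a crash.
  }
};

// The caller needs no dyn_cast; it forwards the crash status and lets the
// dynamic type decide what to do.
bool finish(ReproducerSketch &Repro, bool Crashed) {
  Repro.generate(Crashed);
  return Repro.wasGenerated();
}
```

With this shape the mode enum and `classof` are no longer needed to pick the behavior at the call site, which is the simplification the comment is after.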

llvm/tools/dsymutil/Reproducer.cpp
106	I would prefer to handle the error immediately and make every switch case self contained.
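
One way to read "self contained" here is that each case constructs its object, checks the error code, and returns, with no fallthrough between cases. The sketch below illustrates that shape under stated assumptions: the types `Repro`, `GenerateRepro`, `GenerateOnCrashRepro`, `CreateResult` and the `SimulateFailure` parameter are hypothetical, and a plain struct replaces the `llvm::Expected` return type used by the real `createReproducer`.

```cpp
#include <memory>
#include <system_error>
#include <utility>

enum class Mode { GenerateOnCrash, Generate, Off };

struct Repro {
  virtual ~Repro() = default;
  Mode Kind = Mode::Off;
};

struct GenerateRepro : Repro {
  // Fail is a test hook that simulates a directory-creation error.
  GenerateRepro(std::error_code &EC, bool Fail) {
    Kind = Mode::Generate;
    if (Fail)
      EC = std::make_error_code(std::errc::permission_denied);
  }
};

struct GenerateOnCrashRepro : GenerateRepro {
  GenerateOnCrashRepro(std::error_code &EC, bool Fail)
      : GenerateRepro(EC, Fail) {
    Kind = Mode::GenerateOnCrash;
  }
};

// Plain stand-in for llvm::Expected<std::unique_ptr<Reproducer>>.
struct CreateResult {
  std::unique_ptr<Repro> Instance;
  std::error_code EC;
};

CreateResult create(Mode M, bool SimulateFailure) {
  switch (M) {
  case Mode::GenerateOnCrash: {
    std::error_code EC;
    auto R = std::make_unique<GenerateOnCrashRepro>(EC, SimulateFailure);
    if (EC) // Handle the error right here, no fallthrough.
      return {nullptr, EC};
    return {std::move(R), {}};
  }
  case Mode::Generate: {
    std::error_code EC;
    auto R = std::make_unique<GenerateRepro>(EC, SimulateFailure);
    if (EC)
      return {nullptr, EC};
    return {std::move(R), {}};
  }
  case Mode::Off:
    return {std::make_unique<Repro>(), {}};
  }
  // Not reachable for valid enum values.
  return {nullptr, std::make_error_code(std::errc::invalid_argument)};
}
```

The duplication between the two generate cases is the price of this style; the benefit is that a failure in one mode can never lead to a second construction attempt in another.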

I took a stab at the way I had in mind in https://reviews.llvm.org/D127441

Herald added a project: Restricted Project. Jun 9 2022, 2:04 PM

JDevlieghere commandeered this revision. Jun 15 2022, 11:06 AM

JDevlieghere abandoned this revision.

JDevlieghere edited reviewers, added: feg208; removed: JDevlieghere.

Revision Contents

Path	Size
llvm/test/tools/dsymutil/X86/reproducer.test	2 lines
llvm/tools/dsymutil/Reproducer.h	47 lines
llvm/tools/dsymutil/Reproducer.cpp	74 lines
llvm/tools/dsymutil/dsymutil.cpp	73 lines

Diff 355076

llvm/test/tools/dsymutil/X86/reproducer.test

[69 lines not shown]
  CHECK: DW_TAG_const_type
  CHECK-NEXT: DW_AT_type (0x00000079
  CHECK: DW_TAG_base_type
  CHECK-NEXT: DW_AT_name ("char")
  CHECK-NEXT: DW_AT_encoding (DW_ATE_signed_char)
  CHECK-NEXT: DW_AT_byte_size (0x01)
  CHECK: NULL

- REPRODUCER: reproducer written
+ REPRODUCER: Reproducer Directory
  ERROR: error: cannot parse the debug map
  CONFLICT: cannot combine --gen-reproducer and --use-reproducer

llvm/tools/dsymutil/Reproducer.h

  //===- tools/dsymutil/Reproducer.h ------------------------------*- C++ -*-===//
  //
  // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
  // See https://llvm.org/LICENSE.txt for license information.
  // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
  //
  //===----------------------------------------------------------------------===//
  #ifndef LLVM_TOOLS_DSYMUTIL_REPRODUCER_H
  #define LLVM_TOOLS_DSYMUTIL_REPRODUCER_H
  #include "llvm/Support/Error.h"
  #include "llvm/Support/FileCollector.h"
  #include "llvm/Support/VirtualFileSystem.h"
+ #include <string>
  namespace llvm {
  namespace dsymutil {
  /// The reproducer mode.
  enum class ReproducerMode {
    Generate,
+   GenerateOnCrash,

JDevlieghereAuthorUnsubmitted

Done

Maybe GenerateOnCrash?


    Use,
    Off,
  };
  /// The reproducer class manages the sate related to reproducers in dsymutil.
  /// Instances should be created with Reproducer::createReproducer. An instance
  /// of this class is returned when reproducers are off. The VFS returned by
  /// this instance is the real file system.
  class Reproducer {
  public:
    Reproducer();
    virtual ~Reproducer();
    IntrusiveRefCntPtr<vfs::FileSystem> getVFS() const { return VFS; }
+   const std::string &getRoot() const { return Root; }

JDevlieghereAuthorUnsubmitted

Done

IntrusiveRefCntPtr<vfs::FileSystem> getVFS() const { return VFS; }

- const auto &getRoot() const noexcept { return Root; }

+ const auto &getRoot() const { return Root; }

/// Create a Reproducer instance based on the given mode.

LLVM doesn't use exceptions (and generally builds without them).


JDevlieghereAuthorUnsubmitted

Done

IntrusiveRefCntPtr<vfs::FileSystem> getVFS() const { return VFS; }

- const auto &getRoot() const { return Root; }

+ const std::string &getRoot() const { return Root; }

/// Create a Reproducer instance based on the given mode.

Don't use auto here, it's not obvious from context what the type is.


+   /// Get the mode of this reproducer
+   ReproducerMode getMode() const { return Mode; }
    /// Create a Reproducer instance based on the given mode.
    static llvm::Expected<std::unique_ptr<Reproducer>>
    createReproducer(ReproducerMode Mode, StringRef Root);
  protected:
+   Reproducer(std::string Root, ReproducerMode Mode);

JDevlieghereAuthorUnsubmitted

Done

If we're going to store the root, this might as well take a std::string too and move it into Root.


+   void getMappingFileName(SmallVectorImpl<char> &Mapping) const;

JDevlieghereAuthorUnsubmitted

Done

Rather than returning a small string with arbitrary lengths, the common pattern in llvm is to let the caller decide on the storage type and pass in a llvm::SmallVectorImpl<char>& that the function then populates. This also avoids any potential copies.

    IntrusiveRefCntPtr<vfs::FileSystem> VFS;
+   /// The path to the reproducer.
+   std::string Root;
+   ReproducerMode Mode = ReproducerMode::Off;
  };
  /// Reproducer instance used to generate a new reproducer. The VFS returned by
  /// this instance is a FileCollectorFileSystem that tracks every file used by
  /// dsymutil.
  class ReproducerGenerate : public Reproducer {
  public:
-   ReproducerGenerate(std::error_code &EC);
+   ReproducerGenerate(std::error_code &EC,
+                      ReproducerMode Mode = ReproducerMode::Generate);
    ~ReproducerGenerate() override;

JDevlieghereAuthorUnsubmitted

Not Done

Do we still need the ability to specify the mode?


feg208Unsubmitted

Done

That's there for the classof dyn_cast business.

- private:
-   /// The path to the reproducer.
-   std::string Root;
+   /// Define classof to be able to use isa<>, cast<>, dyn_cast<>, etc
+   static bool classof(const Reproducer *R) {
+     return R->getMode() == ReproducerMode::Generate;
+   }
+   void dodump(bool DoDataDump);
+   ReproducerGenerate(ReproducerGenerate &&);
+ protected:
    /// The FileCollector used by the FileCollectorFileSystem.
    std::shared_ptr<FileCollector> FC;
+   bool DumpFiles = true;
+ };

JDevlieghereAuthorUnsubmitted

Done

I think we probably want a separate class that corresponds to GenerateForCrash.


+ class ReproducerGenerateForCrash : public ReproducerGenerate {
+ public:
+   ReproducerGenerateForCrash(std::error_code &EC);
+   ReproducerGenerateForCrash(ReproducerGenerate &&);
+   ~ReproducerGenerateForCrash() override;
+   /// Define classof to be able to use isa<>, cast<>, dyn_cast<>, etc
+   static bool classof(const Reproducer *R) {
+     return R->getMode() == ReproducerMode::GenerateOnCrash;
+   }
  };
  /// Reproducer instance used to use an existing reproducer. The VFS returned by

JDevlieghereAuthorUnsubmitted

Not Done

std::shared_ptr<FileCollector> FC;

};

- class ReproducerGenerateForCrash : public ReproducerGenerate {

+ class ReproducerGenerateOnCrash : public ReproducerGenerate {

public:


  /// this instance is a RedirectingFileSystem that remaps paths to their
  /// counterpart in the reproducer.
  class ReproducerUse : public Reproducer {

JDevlieghereAuthorUnsubmitted

Not Done

Why do we need a move ctor?


feg208Unsubmitted

Done

So the answer is found around line 734 in dsymutil.cpp. We pass a unique_ptr for the Reproducer into the main function, so if the user selects -gen-reproducer and the program then crashes, we need to move the ReproducerGenerate object into the GenerateForCrash so we get the expected messages.

  public:
    ReproducerUse(StringRef Root, std::error_code &EC);
    ~ReproducerUse() override;
- private:
-   /// The path to the reproducer.
-   std::string Root;
  };
  } // end namespace dsymutil
  } // end namespace llvm
  #endif // LLVM_TOOLS_DSYMUTIL_REPRODUCER_H

JDevlieghereAuthorUnsubmitted

Not Done

I'm not sure if this is really something we want, with any real-world project the number of object files can quickly become overwhelming. Even if we do, this should probably be a property of the ReproducerGenerate and not be specific for the crash one.


feg208Unsubmitted

Done

We pass the Reproducer unique_ptr into the main function, so in the case where no -gen-reproducer is passed and the program didn't crash, we need to make sure we don't dump the files. That was the intent here. But I can (and did) move it to the parent class. So a question that arises from your comment: should we clean up the dumped files on a crash back to just the binary plus mapping file?

llvm/tools/dsymutil/Reproducer.cpp

  //===- Reproducer.cpp -----------------------------------------------------===//
  //
  // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
  // See https://llvm.org/LICENSE.txt for license information.
  // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
  //
  //===----------------------------------------------------------------------===//
  #include "Reproducer.h"
  #include "llvm/Support/Path.h"
+ #include <cstdlib>
+ #include <memory>
+ #include <utility>
  using namespace llvm;
  using namespace llvm::dsymutil;
  static std::string createReproducerDir(std::error_code &EC) {
    SmallString<128> Root;
    if (const char *Path = getenv("DSYMUTIL_REPRODUCER_PATH")) {
      Root.assign(Path);
      EC = sys::fs::create_directory(Root);
    } else {
      EC = sys::fs::createUniqueDirectory("dsymutil", Root);
    }
    return EC ? "" : std::string(Root);
  }
+ void Reproducer::getMappingFileName(SmallVectorImpl<char> &Mapping) const {
+   Mapping.assign(Root.begin(), Root.end());
+   sys::path::append(Mapping, "mapping.yaml");
+ }
  Reproducer::Reproducer() : VFS(vfs::getRealFileSystem()) {}
  Reproducer::~Reproducer() = default;
- ReproducerGenerate::ReproducerGenerate(std::error_code &EC)
+ Reproducer::Reproducer(std::string Root, ReproducerMode Mode)

JDevlieghereAuthorUnsubmitted

Done

Reproducer::~Reproducer() = default;

- Reproducer::Reproducer(StringRef Rt)

+ Reproducer::Reproducer(StringRef Root)

: VFS{vfs::getRealFileSystem()}, Root{std::move(Rt)} {}

This should be consistent with the header. It's fine (and common) to use the same name for the argument and member variable.


-     : Root(createReproducerDir(EC)), FC() {
+     : VFS(vfs::getRealFileSystem()), Root(std::move(Root)), Mode(Mode) {}

JDevlieghereAuthorUnsubmitted

Done

LLVM generally does not prefer braced initializer lists (https://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor)


feg208Unsubmitted

Done

Got it. Thanks for passing this along. I really should have reviewed it before this submission but I certainly will before the next


+ ReproducerGenerate::ReproducerGenerate(std::error_code &EC, ReproducerMode Mode)
+     : Reproducer(createReproducerDir(EC), Mode), FC() {
    if (!Root.empty())
      FC = std::make_shared<FileCollector>(Root, Root);
    VFS = FileCollector::createCollectorVFS(vfs::getRealFileSystem(), FC);
  }
+ ReproducerGenerate::ReproducerGenerate(ReproducerGenerate &&Other)
+     : Reproducer(static_cast<Reproducer &&>(Other)), FC(std::move(Other.FC)) {}
  ReproducerGenerate::~ReproducerGenerate() {
    if (!FC)
      return;
    FC->copyFiles(false);
-   SmallString<128> Mapping(Root);
-   sys::path::append(Mapping, "mapping.yaml");
+   SmallString<128> Mapping;
+   getMappingFileName(Mapping);
    FC->writeMapping(Mapping.str());
-   outs() << "reproducer written to " << Root << '\n';
+   outs() << "Reproducer Directory: " << getRoot() << '\n';
+ }
+ ReproducerGenerateForCrash::ReproducerGenerateForCrash(std::error_code &EC)
+     : ReproducerGenerate(EC, ReproducerMode::GenerateOnCrash) {}
+ void ReproducerGenerate::dodump(bool DoDataDump) { DumpFiles = DoDataDump; }
+ ReproducerGenerateForCrash::ReproducerGenerateForCrash(
+     ReproducerGenerate &&Other)
+     : ReproducerGenerate(std::move(Other)) {}
+ ReproducerGenerateForCrash::~ReproducerGenerateForCrash() {
+   if (!DumpFiles) {
+     FC.reset();
+     return;
+   }
+   dbgs() << "\n********************\n\n"
+             "PLEASE ATTACH THE FOLLOWING DIRECTORY TO THE BUG "
+             "REPORT:\nBinar(ies|y), object file(s) and associated debug map "
+             "are located at:\n";
+   // Once the reproducer exits then the mapping will be dumped
+   // by the parent class destructor
  }

JDevlieghereAuthorUnsubmitted

Done

If we split this into a separate class, this can remain here, and have the crash variant print the "PLEASE ATTACH THE FILES IN THE FOLLOWING DIRECTORY TO THE BUG REPORT" in the generate crash class.


JDevlieghereAuthorUnsubmitted

Done

We need the whole directory, so might as well ask the reporter to include it as a whole.

********************
PLEASE ATTACH THE FOLLOWING DIRECTORY TO THE BUG REPORT:


  ReproducerUse::~ReproducerUse() = default;
- ReproducerUse::ReproducerUse(StringRef Root, std::error_code &EC) {
-   SmallString<128> Mapping(Root);
-   sys::path::append(Mapping, "mapping.yaml");
+ ReproducerUse::ReproducerUse(StringRef Root, std::error_code &EC)
+     : Reproducer{Root.str(), ReproducerMode::Use} {
+   SmallString<128> Mapping;
+   getMappingFileName(Mapping);
    ErrorOr<std::unique_ptr<MemoryBuffer>> Buffer =
        vfs::getRealFileSystem()->getBufferForFile(Mapping.str());
    if (!Buffer) {
      EC = Buffer.getError();
      return;
    }
    VFS = llvm::vfs::getVFSFromYAML(std::move(Buffer.get()), nullptr, Mapping);
  }
  llvm::Expected<std::unique_ptr<Reproducer>>
  Reproducer::createReproducer(ReproducerMode Mode, StringRef Root) {
-   switch (Mode) {
+   std::unique_ptr<Reproducer> Repro;

JDevlieghereAuthorUnsubmitted

Done

Reproducer::createReproducer(ReproducerMode Mode, StringRef Root) {

- std::unique_ptr<Reproducer> Repro{nullptr};

+ std::unique_ptr<Reproducer> Repro;

std::error_code EC;

The nullptr here is redundant.


-   case ReproducerMode::Generate: {
    std::error_code EC;
-     std::unique_ptr<Reproducer> Repro =
-         std::make_unique<ReproducerGenerate>(EC);
+   switch (Mode) {
+   case ReproducerMode::GenerateOnCrash:
+     Repro = std::make_unique<ReproducerGenerateForCrash>(EC);
+     LLVM_FALLTHROUGH;

JDevlieghereAuthorUnsubmitted

Not Done

Repro = std::make_unique<ReproducerGenerateForCrash>(EC);

- [[clang::fallthrough]];

+ LLVM_FALLTHROUGH;

case ReproducerMode::Generate:

You should use LLVM_FALLTHROUGH, otherwise this won't work if the host compiler is not clang.


JDevlieghereAuthorUnsubmitted

Not Done

Why is this a fallthrough? It seems like now, if we fail to create the ReproducerGenerateForCrash instance, we'll try creating a regular ReproducerGenerate and only then print the error message? I think we should just handle the EC immediately.


feg208Unsubmitted

Done

I don't think so. But maybe my read is incorrect? The unique_ptr would still get created and thus the nullptr check wouldn't pass and so the ec check below would still do its thing.


JDevlieghereAuthorUnsubmitted

Not Done

I would prefer to handle the error immediately and make every switch case self contained.


+   case ReproducerMode::Generate:
+     if (Repro == nullptr)
+       Repro = std::make_unique<ReproducerGenerate>(EC);

JDevlieghereAuthorUnsubmitted

Done

No braces around single line statements (https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements)


      if (EC)
        return errorCodeToError(EC);
      return std::move(Repro);
-   }
-   case ReproducerMode::Use: {
-     std::error_code EC;
-     std::unique_ptr<Reproducer> Repro =
-         std::make_unique<ReproducerUse>(Root, EC);
+   case ReproducerMode::Use:
+     Repro = std::make_unique<ReproducerUse>(Root, EC);
      if (EC)
        return errorCodeToError(EC);
      return std::move(Repro);
-   }
    case ReproducerMode::Off:
      return std::make_unique<Reproducer>();
    }
    llvm_unreachable("All cases handled above.");
  }

llvm/tools/dsymutil/dsymutil.cpp

Show All 17 Lines

#include "MachOUtils.h" #include "MachOUtils.h"

#include "Reproducer.h" #include "Reproducer.h"

#include "llvm/ADT/STLExtras.h" #include "llvm/ADT/STLExtras.h"

#include "llvm/ADT/SmallString.h" #include "llvm/ADT/SmallString.h"

#include "llvm/ADT/SmallVector.h" #include "llvm/ADT/SmallVector.h"

#include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringExtras.h"

#include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringRef.h"

#include "llvm/ADT/Triple.h" #include "llvm/ADT/Triple.h"

#include "llvm/Config/config.h"

#include "llvm/DebugInfo/DIContext.h" #include "llvm/DebugInfo/DIContext.h"

#include "llvm/DebugInfo/DWARF/DWARFContext.h" #include "llvm/DebugInfo/DWARF/DWARFContext.h"

#include "llvm/DebugInfo/DWARF/DWARFVerifier.h" #include "llvm/DebugInfo/DWARF/DWARFVerifier.h"

#include "llvm/Object/Binary.h" #include "llvm/Object/Binary.h"

#include "llvm/Object/MachO.h" #include "llvm/Object/MachO.h"

#include "llvm/Option/Arg.h" #include "llvm/Option/Arg.h"

#include "llvm/Option/ArgList.h" #include "llvm/Option/ArgList.h"

#include "llvm/Option/Option.h" #include "llvm/Option/Option.h"

#include "llvm/Support/CommandLine.h" #include "llvm/Support/CommandLine.h"

#include "llvm/Support/CrashRecoveryContext.h"

#include "llvm/Support/FileCollector.h" #include "llvm/Support/FileCollector.h"

#include "llvm/Support/FileSystem.h" #include "llvm/Support/FileSystem.h"

#include "llvm/Support/InitLLVM.h" #include "llvm/Support/InitLLVM.h"

#include "llvm/Support/ManagedStatic.h" #include "llvm/Support/ManagedStatic.h"

#include "llvm/Support/Path.h" #include "llvm/Support/Path.h"

#include "llvm/Support/PrettyStackTrace.h"

#include "llvm/Support/TargetSelect.h" #include "llvm/Support/TargetSelect.h"

#include "llvm/Support/ThreadPool.h" #include "llvm/Support/ThreadPool.h"

#include "llvm/Support/WithColor.h" #include "llvm/Support/WithColor.h"

#include "llvm/Support/raw_ostream.h" #include "llvm/Support/raw_ostream.h"

#include "llvm/Support/thread.h" #include "llvm/Support/thread.h"

#include <algorithm> #include <algorithm>

#include <cstdint> #include <cstdint>

#include <cstdlib> #include <cstdlib>

struct DsymutilOptions {

bool ForceKeepFunctionForStatic = false; bool ForceKeepFunctionForStatic = false;

std::string SymbolMap; std::string SymbolMap;

std::string OutputFile; std::string OutputFile;

std::string Toolchain; std::string Toolchain;

std::string ReproducerPath; std::string ReproducerPath;

std::vector<std::string> Archs; std::vector<std::string> Archs;

std::vector<std::string> InputFiles; std::vector<std::string> InputFiles;

unsigned NumThreads; unsigned NumThreads;

ReproducerMode ReproMode = ReproducerMode::Off; ReproducerMode ReproMode = ReproducerMode::GenerateOnCrash;

dsymutil::LinkOptions LinkOpts; dsymutil::LinkOptions LinkOpts;

}; };

/// Return a list of input files. This function has logic for dealing with the /// Return a list of input files. This function has logic for dealing with the

/// special case where we might have dSYM bundles as input. The function /// special case where we might have dSYM bundles as input. The function

/// returns an error when the directory structure doesn't match that of a dSYM /// returns an error when the directory structure doesn't match that of a dSYM

/// bundle. /// bundle.

static Expected<std::vector<std::string>> getInputs(opt::InputArgList &Args, static Expected<std::vector<std::string>> getInputs(opt::InputArgList &Args,

getOutputFileName(StringRef InputFile, const DsymutilOptions &Options) {

} }

sys::path::append(Path, "Contents", "Resources"); sys::path::append(Path, "Contents", "Resources");

std::string ResourceDir = std::string(Path.str()); std::string ResourceDir = std::string(Path.str());

sys::path::append(Path, "DWARF", sys::path::filename(DwarfFile)); sys::path::append(Path, "DWARF", sys::path::filename(DwarfFile));

return OutputLocation(std::string(Path.str()), ResourceDir); return OutputLocation(std::string(Path.str()), ResourceDir);

} }

int main(int argc, char **argv) { static int dsymutilMain(std::unique_ptr<Reproducer> &Repro, int Argc,

InitLLVM X(argc, argv); char **Argv) {

std::string ProgName(Argv[0]);

// Parse arguments.

DsymutilOptTable T; DsymutilOptTable T;

unsigned MAI; unsigned MAI;

unsigned MAC; unsigned MAC;

ArrayRef<const char *> ArgsArr = makeArrayRef(argv + 1, argc - 1); ArrayRef<const char *> ArgsArr = makeArrayRef(Argv + 1, Argc - 1);

opt::InputArgList Args = T.ParseArgs(ArgsArr, MAI, MAC); auto Args = T.ParseArgs(ArgsArr, MAI, MAC);

JDevlieghereAuthorUnsubmitted

Done

Spurious newline


void *P = (void *)(intptr_t)getOutputFileName; void *P = (void *)(intptr_t)getOutputFileName;

JDevlieghereAuthorUnsubmitted

Not Done

static int dsymutilMain(int Argc, char **Argv) {

- std::string ProgName{Argv[0]};

+ std::string ProgName(Argv[0]);

std::pair<DsymutilOptTable, llvm::opt::InputArgList> ArgPair =

Let's be consistent with all the surrounding code and the style guide.


feg208Unsubmitted

Done

Oh right. Oops.


std::string SDKPath = sys::fs::getMainExecutable(argv[0], P); std::string SDKPath = sys::fs::getMainExecutable(Argv[0], P);

SDKPath = std::string(sys::path::parent_path(SDKPath)); SDKPath = std::string(sys::path::parent_path(SDKPath));

JDevlieghereAuthorUnsubmitted

Done

I don't think auto is a good fit here (https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable)


for (auto *Arg : Args.filtered(OPT_UNKNOWN)) { for (auto *Arg : Args.filtered(OPT_UNKNOWN)) {

WithColor::warning() << "ignoring unknown option: " << Arg->getSpelling() WithColor::warning() << "ignoring unknown option: " << Arg->getSpelling()

<< '\n'; << '\n';

} }

if (Args.hasArg(OPT_help)) { if (Args.hasArg(OPT_help)) {

T.printHelp( T.printHelp(

outs(), (std::string(argv[0]) + " [options] <input files>").c_str(), outs(), (ProgName + " [options] <input files>").c_str(),

"manipulate archived DWARF debug symbol files.\n\n" "manipulate archived DWARF debug symbol files.\n\n"

"dsymutil links the DWARF debug information found in the object files\n" "dsymutil links the DWARF debug information found in the object files\n"

"for the executable <input file> by using debug symbols information\n" "for the executable <input file> by using debug symbols information\n"

"contained in its symbol table.\n", "contained in its symbol table.\n",

false); false);

return EXIT_SUCCESS; return EXIT_SUCCESS;

} }

static int dsymutilMain(std::unique_ptr<Reproducer> &Repro, int Argc,

auto &Options = *OptionsOrErr; auto &Options = *OptionsOrErr;

InitializeAllTargetInfos(); InitializeAllTargetInfos();

InitializeAllTargetMCs(); InitializeAllTargetMCs();

InitializeAllTargets(); InitializeAllTargets();

InitializeAllAsmPrinters(); InitializeAllAsmPrinters();

auto Repro = auto IsRepro =

Reproducer::createReproducer(Options.ReproMode, Options.ReproducerPath); Reproducer::createReproducer(Options.ReproMode, Options.ReproducerPath);

if (!Repro) { if (!IsRepro) {

WithColor::error() << toString(Repro.takeError()); WithColor::error() << toString(IsRepro.takeError());

return EXIT_FAILURE; return EXIT_FAILURE;

} }

Options.LinkOpts.VFS = (*Repro)->getVFS(); Repro = std::move(*IsRepro);

Options.LinkOpts.VFS = Repro->getVFS();

for (const auto &Arch : Options.Archs) for (const auto &Arch : Options.Archs)

if (Arch != "*" && Arch != "all" && if (Arch != "*" && Arch != "all" &&

!object::MachOObjectFile::isValidArch(Arch)) { !object::MachOObjectFile::isValidArch(Arch)) {

WithColor::error() << "unsupported cpu architecture: '" << Arch << "'\n"; WithColor::error() << "unsupported cpu architecture: '" << Arch << "'\n";

return EXIT_FAILURE; return EXIT_FAILURE;

} }

for (auto &Map : *DebugMapPtrsOrErr) {

// FIXME: The DwarfLinker can have some very deep recursion that can max // FIXME: The DwarfLinker can have some very deep recursion that can max

// out the (significantly smaller) stack when using threads. We don't // out the (significantly smaller) stack when using threads. We don't

// want this limitation when we only have a single thread. // want this limitation when we only have a single thread.

if (S.ThreadsRequested == 1) if (S.ThreadsRequested == 1)

LinkLambda(OS, Options.LinkOpts); LinkLambda(OS, Options.LinkOpts);

else else

Threads.async(LinkLambda, OS, Options.LinkOpts); Threads.async(LinkLambda, OS, Options.LinkOpts);

} }

Threads.wait(); Threads.wait();

if (!AllOK) if (!AllOK)

JDevlieghereAuthorUnsubmitted

Done

Unrelated whitespace change?


feg208Unsubmitted

Done

yeah. Sorry it snuck in.


return EXIT_FAILURE; return EXIT_FAILURE;

if (NeedsTempFiles) { if (NeedsTempFiles) {

if (!MachOUtils::generateUniversalBinary(TempFiles, if (!MachOUtils::generateUniversalBinary(TempFiles,

OutputLocationOrErr->DWARFFile, OutputLocationOrErr->DWARFFile,

Options.LinkOpts, SDKPath)) Options.LinkOpts, SDKPath))

return EXIT_FAILURE; return EXIT_FAILURE;

} }

if (stat) {

"object file format."; "object file format.";

return EXIT_FAILURE; return EXIT_FAILURE;

} }

return EXIT_SUCCESS; return EXIT_SUCCESS;

} }

// NOLINTNEXTLINE(readability-identifier-naming)

int main(int argc, char **argv) {

InitLLVM X(argc, argv);

setBugReportMsg(

"PLEASE submit a bug report to " BUG_REPORT_URL

JDevlieghereAuthorUnsubmitted

Done

if (!Repro) {

- dbgs() << "Unable to create reproducer due to "

+ WithColor::error() << "Unable to create reproducer due to "

<< toString(Repro.takeError()) << '\n';


" and include the crash backtrace, object files and debug map.\n");

CrashRecoveryContext::Enable();

CrashRecoveryContext CRC;

CRC.DumpStackAndCleanupOnFailure = true;

int Rval = 0;

std::unique_ptr<Reproducer> Repro;

if (!CRC.RunSafely([&Rval, &Repro, argc, argv]() {

Rval = dsymutilMain(Repro, argc, argv);

})) {

bool DidRepro = false;

if (Repro != nullptr) {

if (auto *RPtr =

JDevlieghereAuthorUnsubmitted

Not Done

Why is this needed?


feg208Unsubmitted

Done

So my understanding is that when dsymutil crashes we want the mapping.yaml file, the binary, and the associated .o file (assuming it's compiled with -g). The latter maybe only for some workflows? Reusing this gets us the object file, but at the cost of parsing the debug map. So maybe I should just copy the binary in? I can do that, or just parse enough to recreate the object files. Or maybe I am misunderstanding this entirely?


JDevlieghereAuthorUnsubmitted

Not Done

return;

}

- auto VFS = (*Repro)->getVFS();

- for (auto &InputFile : Options.InputFiles) {

- auto DebugMapPtrsOrErr =

- parseDebugMap(VFS, InputFile, Options.Archs, {}, false, false, false);

- if (auto EC = DebugMapPtrsOrErr.getError()) {

- dbgs() << "cannot parse the debug map for '" << InputFile

- << "': " << EC.message() << '\n';

- }

}

// NOLINTNEXTLINE(readability-identifier-naming)


llvm::dyn_cast<ReproducerGenerateForCrash>(Repro.get())) {

Repro.reset(); // Generate reproducer on teardown

DidRepro = true;

} else if (auto *RPtr = llvm::dyn_cast<ReproducerGenerate>(Repro.get())) {

ReproducerGenerateForCrash CrashRepro(std::move(*RPtr));

DidRepro = true;

}

JDevlieghereAuthorUnsubmitted

Done

This seems like it should be part of the reproducers class.


if (!DidRepro)

WithColor::error() << "Unable to create crash reproducer.\n";

dbgs() << "Command: dsymutil ";

for (auto ArgIndex = 1; ArgIndex < argc; ArgIndex++) {

dbgs() << argv[ArgIndex] << " ";

}

dbgs() << "\n";

return CRC.RetCode;

}

if (Repro != nullptr) {

if (auto *RPtr = llvm::dyn_cast<ReproducerGenerateForCrash>(Repro.get())) {

RPtr->dodump(false);

}

JDevlieghereAuthorUnsubmitted

Done

} else {

- dbgs() << "Failed to parse options in crash dump due to "

+ WithColor::error() << "failed to parse options in crash dump due to "

<< toString(OptionsOrErr.takeError()) << '\n';


return Rval;

}

JDevlieghereAuthorUnsubmitted

Not Done

I actually don't think this is going to work. You're creating the reproducer after we've crashed and you're not telling dsymutil to use the VFS from that Reproducer instance. It's the VFS that records all access and is what will allow us to copy all the files in place if we decide we need to keep the reproducer.

The way I would imagine this to work is:

1. Parse the command line options. If we crash at this point we won't have a reproducer, which I think is acceptable.
2. Initialize the reproducer.
3. Call CRC.RunSafely to do the actual work. We pass it the reproducer, so it captures all file accesses, which is everything we need for the reproducer.

4a. If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer.
4b. If we don't crash, but --gen-reproducer was passed, we do the same.

Does that make sense?


feg208Unsubmitted

Done

Yeah this makes sense and answers my confusion about the right way to go here.



Revision Contents

Diff 355076

llvm/test/tools/dsymutil/X86/reproducer.test

llvm/tools/dsymutil/Reproducer.h

llvm/tools/dsymutil/Reproducer.cpp

llvm/tools/dsymutil/dsymutil.cpp
