This is an archive of the discontinued LLVM Phabricator instance.

Allows for dsymutil crashes to generate reproduceable information
AbandonedPublic

Authored by JDevlieghere on Apr 26 2021, 8:14 PM.

Details

Summary

This change to dsymutil allows for, when dsymutil crashes, the user to collect and submit
the object files and a map file so the crash can be reproduced

Diff Detail

Unit TestsFailed

Event Timeline

feg208 created this revision.Apr 26 2021, 8:14 PM
feg208 requested review of this revision.Apr 26 2021, 8:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 26 2021, 8:14 PM
feg208 updated this revision to Diff 340721.Apr 26 2021, 9:21 PM

Fixed clang-tidy warnings

feg208 updated this revision to Diff 340867.Apr 27 2021, 8:58 AM

Missing more clang-tidy checks

How is collect-crash-inputs different from gan-reproducer? It seems like they're essentially doing the same thing. Can we implement the crash recovery in terms of llvm::dsymutil::Reproducer?

feg208 updated this revision to Diff 341391.Apr 28 2021, 8:54 PM

This alters the prior commit by incorporating the Reproducer for the purpose of persisting the files on crash

feg208 updated this revision to Diff 341392.Apr 28 2021, 8:57 PM

Forgot to remove raise(SIGSEGV)

feg208 updated this revision to Diff 341635.Apr 29 2021, 2:10 PM

Cleans up clang-tidy/format issues and some minor branch pecadillos

feg208 updated this revision to Diff 342164.May 1 2021, 10:38 AM

clang-format misses

JDevlieghere added inline comments.May 24 2021, 2:13 PM
llvm/tools/dsymutil/Reproducer.cpp
57–80

If we split this into a separate class, this can remain here, and have the crash variant print the "PLEASE ATTACH THE FILES IN THE FOLLOWING DIRECTORY TO THE BUG REPORT" in the generate crash class.

llvm/tools/dsymutil/Reproducer.h
84

I think we probably want a separate class that corresponds to GenerateForCrash.

llvm/tools/dsymutil/dsymutil.cpp
490–491

Spurious newline

713–737

This seems like it should be part of the reproducers class.

feg208 updated this revision to Diff 347536.May 24 2021, 5:58 PM
feg208 marked 4 inline comments as done.

Addresses some review comments

I think I addressed all the comments? If so I can't push the changes into the repo. Could you? Assuming there aren't outstanding concerns of course.

feg208 updated this revision to Diff 347557.May 24 2021, 7:56 PM

Broke a regression test

feg208 added a comment.Jun 2 2021, 9:33 AM

I am wondering if I could get someone to look at this? I think I have addressed the review comments but I'd be happy to fix anything I missed

I think I have addressed the review comments but I'd be happy to fix anything I missed

Pinging @JDevlieghere ^

JDevlieghere added inline comments.Jun 4 2021, 9:05 AM
llvm/tools/dsymutil/Reproducer.cpp
37

This should be consistent with the header. It's fine (and common) to use the same name for the argument and member variable.

38
106

You should use LLVM_FALLTHROUGH, otherwise this won't work if the host compiler is not clang.

108–110
llvm/tools/dsymutil/Reproducer.h
24

Maybe GenerateOnCrash?

40

LLVM doesn't use exceptions (and generally builds without them).

50

If we're going to store the root, this might as well take a std::string too and move it into Root.

52

Rather than returning a small string with arbitrary lengths, the common pattern in llvm is to take let the caller decide on the storage type and pass in a llvm::SmallVectorImpl<char>& that the function then populates. This also avoids any potential copies.

llvm/tools/dsymutil/dsymutil.cpp
493–495
682–686

Unrelated whitespace change?

717
721–729

Why is this needed?

751
feg208 marked 11 inline comments as done.Jun 7 2021, 12:54 PM

I rolled up all the comments save one where I just have an outstanding question and then I'll roll that up as well

llvm/tools/dsymutil/Reproducer.cpp
38

Got it. Thanks for passing this along. I really should have reviewed it before this submission but I certainly will before the next

llvm/tools/dsymutil/dsymutil.cpp
682–686

yeah. Sorry it snuck in.

721–729

So my understanding is that when dsymutil crashes we want the mapping.yaml file, the binary and associated .o file (assuming its compiled with -g). The latter maybe for some workflows? Reusing this gets us the object file but at the cost of parsing the debug map. So maybe I should just copy the binary in? I can do that or just parse enough to recreate the object files. Or maybe I am misunderstanding this entirely?

feg208 updated this revision to Diff 350396.Jun 7 2021, 12:54 PM
feg208 marked 2 inline comments as done.

Gets some review comments

JDevlieghere requested changes to this revision.Jun 11 2021, 11:06 AM
JDevlieghere added inline comments.
llvm/tools/dsymutil/Reproducer.cpp
56–80

We need the whole directory, so might as well ask the reporter to include it as a whole.

********************
PLEASE ATTACH THE FOLLOWING DIRECTORY TO THE BUG REPORT:
101

The nullptr here is redundant.

llvm/tools/dsymutil/Reproducer.h
40

Don't use auto here, it's not obvious from context what the type is.

llvm/tools/dsymutil/dsymutil.cpp
492

Let's be consistent with all the surrounding code and the style guide.

721–729
744–753

I actually don't think this is going to work. You're creating the reproducer after we've crashed and you're not telling dsymutil to use the VFS from that Reproducer instance. It's the VFS that records all access and is what will allow us to copy all the files in place if we decide we need to keep the reproducer.

The way I would imagine this to work is:

  1. Parse the command line options. If we crash at this point we won't have a reproducer, which I think is acceptable.
  2. Initialize the reproducer.
  3. Call CRC.RunSafely to do the actual work. We pass it the reproducer, so it captures all file accesses, which is everything we need for the reproducer.

4a. If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer.
4b. If we don't crash, but --gen-reproducer was passed, we do the same

Does that make sense?

This revision now requires changes to proceed.Jun 11 2021, 11:06 AM
feg208 marked 3 inline comments as done.Jun 12 2021, 9:28 AM

I think this covers the comments

llvm/tools/dsymutil/dsymutil.cpp
492

Oh right. Oops.

744–753

Yeah this makes sense and answers my confusion about the right way to go here.

feg208 updated this revision to Diff 351668.Jun 12 2021, 9:28 AM

Rolled up review comments

feg208 updated this revision to Diff 351742.Jun 13 2021, 2:34 PM

Fixes the broken tests

This is not yet complete. I missed step 4a. "4a. If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer." It currently doesn't print the command line

feg208 updated this revision to Diff 351744.Jun 13 2021, 3:03 PM

Added command line print

feg208 updated this revision to Diff 351925.Jun 14 2021, 10:46 AM

Clang-tidy checks cleaned up and test fixes

feg208 updated this revision to Diff 351945.Jun 14 2021, 11:40 AM

Forgot the clang-tidy checks

I'd just wanted to follow up and see if, minimally, my changes addressed the concerns raised or if you think there is further work here?

JDevlieghere added inline comments.Jun 28 2021, 11:02 AM
llvm/tools/dsymutil/Reproducer.cpp
106

Why is this a fallthrough? It seems like now, if we fail to create the ReproducerGenerateForCrash instance, we'll try creating a regular ReproducerGenerate and only then print the error message? I think we should just handle the EC immediately.

llvm/tools/dsymutil/Reproducer.h
66–68

Do we still need the ability to specify the mode?

97
100

Why do we need a move ctor?

110

I'm not sure if this is really something we want, with any real-world project the number of object files can quickly become overwhelming. Even if we do, this should probably be a property of the ReproducerGenerate and not be specific for the crash one.

feg208 updated this revision to Diff 355076.Jun 28 2021, 5:34 PM

Addresses some review comments

I have a few outstanding questions

llvm/tools/dsymutil/Reproducer.cpp
106

I don't think so. But maybe my read is incorrect? The unique_ptr would still get created and thus the nullptr check wouldn't pass and so the ec check below would still do its thing.

llvm/tools/dsymutil/Reproducer.h
66–68

thats there for the classof dyn_cast business.

100

So the answer is found around line 734 in dsymutil.cpp. Since we pass a unique_ptr into the main function for the Reproducer if the user selects -gen-reproducer and the program then crashes we need to move the ReproducerGenerate object into the GenerateForCrash so we get the expected messages.

110

Since we are passing the Reproducer unique_ptr into the main function and then, in the case where no -gen-reproducer is passed and the program didn't crash we need to make sure we don't dump the files. That was the intent here. But I can (and did) move it to the parent class. So a question that arises from your comment...should we clean up the dumped files on a crash back to just the binary plus mapping file?

I was thinking about it some more and a bunch of complexity here seems to come from the fact that some stuff happens within the CrashRecoveryContext (i.e. dsymutilMain). I believe we can avoid the majority of it by parsing the option, initializing the reproducer, and then running the remaining work in the RunSafely lamba.

I also think we should be able to avoid the dyn_cast and rely on polymorphism instead. For example we could have a virtual generate(bool crashed) method that is called after CRC.RunSafely. For ReproducerGenerate it would unconditionally generate the reproducer. For ReproducerGenerateOnCrash it would only do so if crashed == true.

llvm/tools/dsymutil/Reproducer.cpp
106

I would prefer to handle the error immediately and make every switch case self contained.

I took a stab at the way I had in mind in https://reviews.llvm.org/D127441

Herald added a project: Restricted Project. · View Herald TranscriptJun 9 2022, 2:04 PM
JDevlieghere commandeered this revision.Jun 15 2022, 11:06 AM
JDevlieghere abandoned this revision.
JDevlieghere edited reviewers, added: feg208; removed: JDevlieghere.