This change to dsymutil allows for, when dsymutil crashes, the user to collect and submit
the object files and a map file so the crash can be reproduced
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
3,590 ms | x64 debian > libarcher.races::lock-unrelated.c |
Event Timeline
How is collect-crash-inputs different from gan-reproducer? It seems like they're essentially doing the same thing. Can we implement the crash recovery in terms of llvm::dsymutil::Reproducer?
This alters the prior commit by incorporating the Reproducer for the purpose of persisting the files on crash
llvm/tools/dsymutil/Reproducer.cpp | ||
---|---|---|
57–80 | If we split this into a separate class, this can remain here, and have the crash variant print the "PLEASE ATTACH THE FILES IN THE FOLLOWING DIRECTORY TO THE BUG REPORT" in the generate crash class. | |
llvm/tools/dsymutil/Reproducer.h | ||
84 | I think we probably want a separate class that corresponds to GenerateForCrash. | |
llvm/tools/dsymutil/dsymutil.cpp | ||
490–491 | Spurious newline | |
713–737 | This seems like it should be part of the reproducers class. |
I think I addressed all the comments? If so I can't push the changes into the repo. Could you? Assuming there aren't outstanding concerns of course.
I am wondering if I could get someone to look at this? I think I have addressed the review comments but I'd be happy to fix anything I missed
I think I have addressed the review comments but I'd be happy to fix anything I missed
Pinging @JDevlieghere ^
llvm/tools/dsymutil/Reproducer.cpp | ||
---|---|---|
37 | This should be consistent with the header. It's fine (and common) to use the same name for the argument and member variable. | |
38 | LLVM generally does not prefer braced initializer lists (https://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor) | |
106 | You should use LLVM_FALLTHROUGH, otherwise this won't work if the host compiler is not clang. | |
108–110 | No braces around single line statements (https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements) | |
llvm/tools/dsymutil/Reproducer.h | ||
24 | Maybe GenerateOnCrash? | |
40 | LLVM doesn't use exceptions (and generally builds without them). | |
50 | If we're going to store the root, this might as well take a std::string too and move it into Root. | |
52 | Rather than returning a small string with arbitrary lengths, the common pattern in llvm is to take let the caller decide on the storage type and pass in a llvm::SmallVectorImpl<char>& that the function then populates. This also avoids any potential copies. | |
llvm/tools/dsymutil/dsymutil.cpp | ||
493–495 | I don't think auto is a good fit here (https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable) | |
682–686 | Unrelated whitespace change? | |
717 | ||
721–729 | Why is this needed? | |
751 |
I rolled up all the comments save one where I just have an outstanding question and then I'll roll that up as well
llvm/tools/dsymutil/Reproducer.cpp | ||
---|---|---|
38 | Got it. Thanks for passing this along. I really should have reviewed it before this submission but I certainly will before the next | |
llvm/tools/dsymutil/dsymutil.cpp | ||
682–686 | yeah. Sorry it snuck in. | |
721–729 | So my understanding is that when dsymutil crashes we want the mapping.yaml file, the binary and associated .o file (assuming its compiled with -g). The latter maybe for some workflows? Reusing this gets us the object file but at the cost of parsing the debug map. So maybe I should just copy the binary in? I can do that or just parse enough to recreate the object files. Or maybe I am misunderstanding this entirely? |
llvm/tools/dsymutil/Reproducer.cpp | ||
---|---|---|
56–80 | We need the whole directory, so might as well ask the reporter to include it as a whole. ******************** PLEASE ATTACH THE FOLLOWING DIRECTORY TO THE BUG REPORT: | |
101 | The nullptr here is redundant. | |
llvm/tools/dsymutil/Reproducer.h | ||
40 | Don't use auto here, it's not obvious from context what the type is. | |
llvm/tools/dsymutil/dsymutil.cpp | ||
492 | Let's be consistent with all the surrounding code and the style guide. | |
721–729 | ||
744–753 | I actually don't think this is going to work. You're creating the reproducer after we've crashed and you're not telling dsymutil to use the VFS from that Reproducer instance. It's the VFS that records all access and is what will allow us to copy all the files in place if we decide we need to keep the reproducer. The way I would imagine this to work is:
4a. If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer. Does that make sense? |
This is not yet complete. I missed step 4a. "4a. If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer." It currently doesn't print the command line
I'd just wanted to follow up and see if, minimally, my changes addressed the concerns raised or if you think there is further work here?
llvm/tools/dsymutil/Reproducer.cpp | ||
---|---|---|
106 | Why is this a fallthrough? It seems like now, if we fail to create the ReproducerGenerateForCrash instance, we'll try creating a regular ReproducerGenerate and only then print the error message? I think we should just handle the EC immediately. | |
llvm/tools/dsymutil/Reproducer.h | ||
66–68 | Do we still need the ability to specify the mode? | |
97 | ||
100 | Why do we need a move ctor? | |
110 | I'm not sure if this is really something we want, with any real-world project the number of object files can quickly become overwhelming. Even if we do, this should probably be a property of the ReproducerGenerate and not be specific for the crash one. |
I have a few outstanding questions
llvm/tools/dsymutil/Reproducer.cpp | ||
---|---|---|
106 | I don't think so. But maybe my read is incorrect? The unique_ptr would still get created and thus the nullptr check wouldn't pass and so the ec check below would still do its thing. | |
llvm/tools/dsymutil/Reproducer.h | ||
66–68 | thats there for the classof dyn_cast business. | |
100 | So the answer is found around line 734 in dsymutil.cpp. Since we pass a unique_ptr into the main function for the Reproducer if the user selects -gen-reproducer and the program then crashes we need to move the ReproducerGenerate object into the GenerateForCrash so we get the expected messages. | |
110 | Since we are passing the Reproducer unique_ptr into the main function and then, in the case where no -gen-reproducer is passed and the program didn't crash we need to make sure we don't dump the files. That was the intent here. But I can (and did) move it to the parent class. So a question that arises from your comment...should we clean up the dumped files on a crash back to just the binary plus mapping file? |
I was thinking about it some more and a bunch of complexity here seems to come from the fact that some stuff happens within the CrashRecoveryContext (i.e. dsymutilMain). I believe we can avoid the majority of it by parsing the option, initializing the reproducer, and then running the remaining work in the RunSafely lamba.
I also think we should be able to avoid the dyn_cast and rely on polymorphism instead. For example we could have a virtual generate(bool crashed) method that is called after CRC.RunSafely. For ReproducerGenerate it would unconditionally generate the reproducer. For ReproducerGenerateOnCrash it would only do so if crashed == true.
llvm/tools/dsymutil/Reproducer.cpp | ||
---|---|---|
106 | I would prefer to handle the error immediately and make every switch case self contained. |
Maybe GenerateOnCrash?