Page MenuHomePhabricator

Allows for dsymutil crashes to generate reproduceable information
Needs ReviewPublic

Authored by feg208 on Apr 26 2021, 8:14 PM.

Details

Summary

This change to dsymutil allows for, when dsymutil crashes, the user to collect and submit
the object files and a map file so the crash can be reproduced

Diff Detail

Unit TestsFailed

TimeTest
3,590 msx64 debian > libarcher.races::lock-unrelated.c
Script: -- : 'RUN: at line 13'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/lock-unrelated.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/deflake.bash /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/lock-unrelated.c

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
feg208 updated this revision to Diff 340721.Apr 26 2021, 9:21 PM

Fixed clang-tidy warnings

feg208 updated this revision to Diff 340867.Apr 27 2021, 8:58 AM

Missing more clang-tidy checks

How is collect-crash-inputs different from gan-reproducer? It seems like they're essentially doing the same thing. Can we implement the crash recovery in terms of llvm::dsymutil::Reproducer?

feg208 updated this revision to Diff 341391.Apr 28 2021, 8:54 PM

This alters the prior commit by incorporating the Reproducer for the purpose of persisting the files on crash

feg208 updated this revision to Diff 341392.Apr 28 2021, 8:57 PM

Forgot to remove raise(SIGSEGV)

feg208 updated this revision to Diff 341635.Apr 29 2021, 2:10 PM

Cleans up clang-tidy/format issues and some minor branch pecadillos

feg208 updated this revision to Diff 342164.May 1 2021, 10:38 AM

clang-format misses

JDevlieghere added inline comments.May 24 2021, 2:13 PM
llvm/tools/dsymutil/Reproducer.cpp
57–80

If we split this into a separate class, this can remain here, and have the crash variant print the "PLEASE ATTACH THE FILES IN THE FOLLOWING DIRECTORY TO THE BUG REPORT" in the generate crash class.

llvm/tools/dsymutil/Reproducer.h
84

I think we probably want a separate class that corresponds to GenerateForCrash.

llvm/tools/dsymutil/dsymutil.cpp
490–491

Spurious newline

713–737

This seems like it should be part of the reproducers class.

feg208 updated this revision to Diff 347536.May 24 2021, 5:58 PM
feg208 marked 4 inline comments as done.

Addresses some review comments

I think I addressed all the comments? If so I can't push the changes into the repo. Could you? Assuming there aren't outstanding concerns of course.

feg208 updated this revision to Diff 347557.May 24 2021, 7:56 PM

Broke a regression test

feg208 added a comment.Jun 2 2021, 9:33 AM

I am wondering if I could get someone to look at this? I think I have addressed the review comments but I'd be happy to fix anything I missed

I think I have addressed the review comments but I'd be happy to fix anything I missed

Pinging @JDevlieghere ^

JDevlieghere added inline comments.Jun 4 2021, 9:05 AM
llvm/tools/dsymutil/Reproducer.cpp
37

This should be consistent with the header. It's fine (and common) to use the same name for the argument and member variable.

38
106

You should use LLVM_FALLTHROUGH, otherwise this won't work if the host compiler is not clang.

108–110
llvm/tools/dsymutil/Reproducer.h
24

Maybe GenerateOnCrash?

40

LLVM doesn't use exceptions (and generally builds without them).

50

If we're going to store the root, this might as well take a std::string too and move it into Root.

52

Rather than returning a small string with arbitrary lengths, the common pattern in llvm is to take let the caller decide on the storage type and pass in a llvm::SmallVectorImpl<char>& that the function then populates. This also avoids any potential copies.

llvm/tools/dsymutil/dsymutil.cpp
493–495
682–686

Unrelated whitespace change?

717
721–729

Why is this needed?

751
feg208 marked 11 inline comments as done.Jun 7 2021, 12:54 PM

I rolled up all the comments save one where I just have an outstanding question and then I'll roll that up as well

llvm/tools/dsymutil/Reproducer.cpp
38

Got it. Thanks for passing this along. I really should have reviewed it before this submission but I certainly will before the next

llvm/tools/dsymutil/dsymutil.cpp
682–686

yeah. Sorry it snuck in.

721–729

So my understanding is that when dsymutil crashes we want the mapping.yaml file, the binary and associated .o file (assuming its compiled with -g). The latter maybe for some workflows? Reusing this gets us the object file but at the cost of parsing the debug map. So maybe I should just copy the binary in? I can do that or just parse enough to recreate the object files. Or maybe I am misunderstanding this entirely?

feg208 updated this revision to Diff 350396.Jun 7 2021, 12:54 PM
feg208 marked 2 inline comments as done.

Gets some review comments

JDevlieghere requested changes to this revision.Jun 11 2021, 11:06 AM
JDevlieghere added inline comments.
llvm/tools/dsymutil/Reproducer.cpp
56–80

We need the whole directory, so might as well ask the reporter to include it as a whole.

********************
PLEASE ATTACH THE FOLLOWING DIRECTORY TO THE BUG REPORT:
101

The nullptr here is redundant.

llvm/tools/dsymutil/Reproducer.h
40

Don't use auto here, it's not obvious from context what the type is.

llvm/tools/dsymutil/dsymutil.cpp
492

Let's be consistent with all the surrounding code and the style guide.

721–729
744–753

I actually don't think this is going to work. You're creating the reproducer after we've crashed and you're not telling dsymutil to use the VFS from that Reproducer instance. It's the VFS that records all access and is what will allow us to copy all the files in place if we decide we need to keep the reproducer.

The way I would imagine this to work is:

  1. Parse the command line options. If we crash at this point we won't have a reproducer, which I think is acceptable.
  2. Initialize the reproducer.
  3. Call CRC.RunSafely to do the actual work. We pass it the reproducer, so it captures all file accesses, which is everything we need for the reproducer.

4a. If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer.
4b. If we don't crash, but --gen-reproducer was passed, we do the same

Does that make sense?

This revision now requires changes to proceed.Jun 11 2021, 11:06 AM
feg208 marked 3 inline comments as done.Jun 12 2021, 9:28 AM

I think this covers the comments

llvm/tools/dsymutil/dsymutil.cpp
492

Oh right. Oops.

744–753

Yeah this makes sense and answers my confusion about the right way to go here.

feg208 updated this revision to Diff 351668.Jun 12 2021, 9:28 AM

Rolled up review comments

feg208 updated this revision to Diff 351742.Jun 13 2021, 2:34 PM

Fixes the broken tests

This is not yet complete. I missed step 4a. "4a. If we crash, we generate the reproducer and print the dsymutil invocation to replay the reproducer." It currently doesn't print the command line

feg208 updated this revision to Diff 351744.Jun 13 2021, 3:03 PM

Added command line print

feg208 updated this revision to Diff 351925.Jun 14 2021, 10:46 AM

Clang-tidy checks cleaned up and test fixes

feg208 updated this revision to Diff 351945.Jun 14 2021, 11:40 AM

Forgot the clang-tidy checks

I'd just wanted to follow up and see if, minimally, my changes addressed the concerns raised or if you think there is further work here?

JDevlieghere added inline comments.Mon, Jun 28, 11:02 AM
llvm/tools/dsymutil/Reproducer.cpp
106

Why is this a fallthrough? It seems like now, if we fail to create the ReproducerGenerateForCrash instance, we'll try creating a regular ReproducerGenerate and only then print the error message? I think we should just handle the EC immediately.

llvm/tools/dsymutil/Reproducer.h
66–68

Do we still need the ability to specify the mode?

97
100

Why do we need a move ctor?

110

I'm not sure if this is really something we want, with any real-world project the number of object files can quickly become overwhelming. Even if we do, this should probably be a property of the ReproducerGenerate and not be specific for the crash one.

feg208 updated this revision to Diff 355076.Mon, Jun 28, 5:34 PM

Addresses some review comments

I have a few outstanding questions

llvm/tools/dsymutil/Reproducer.cpp
106

I don't think so. But maybe my read is incorrect? The unique_ptr would still get created and thus the nullptr check wouldn't pass and so the ec check below would still do its thing.

llvm/tools/dsymutil/Reproducer.h
66–68

thats there for the classof dyn_cast business.

100

So the answer is found around line 734 in dsymutil.cpp. Since we pass a unique_ptr into the main function for the Reproducer if the user selects -gen-reproducer and the program then crashes we need to move the ReproducerGenerate object into the GenerateForCrash so we get the expected messages.

110

Since we are passing the Reproducer unique_ptr into the main function and then, in the case where no -gen-reproducer is passed and the program didn't crash we need to make sure we don't dump the files. That was the intent here. But I can (and did) move it to the parent class. So a question that arises from your comment...should we clean up the dumped files on a crash back to just the binary plus mapping file?