Add a script that calls C-Reduce on an input file and given the clang crash script, which is used to generate an interestingness test for C-Reduce.
Details
Diff Detail
- Repository
- rC Clang
Event Timeline
Thanks for this! Functionally, this looks good. My comments are mostly just simplicity/readability nitpicks.
clang/utils/creduce-clang-crash.py | ||
---|---|---|
36 ↗ | (On Diff #189796) | nit: this isn't necessary; with doesn't introduce its own scope |
37 ↗ | (On Diff #189796) | nit , 'r' is implied; please remove |
38 ↗ | (On Diff #189796) | Hrm. Looks like there are cases where these crash reproducers are multiple lines, though the actually meaningful one (to us) is always the last one? If so, it may be cleaner to say cmd = f.readlines()[-1] here |
43 ↗ | (On Diff #189796) | nit: can replace with subprocess.check_output unless we explicitly want to ignore the return value (in which case, we should probably still call wait() anyway?) |
49 ↗ | (On Diff #189796) | please pipes.quote(llvm_not) and pipes.quote(cmd) |
54 ↗ | (On Diff #189796) | Why are we escaping the graves and single-quotes? Also, nit: when possible with regex strings, please use raw strings. Doing so makes it way easier to reason about what Python's going to transform in the string literal before the regex engine gets to see it. |
58 ↗ | (On Diff #189796) | nit: please pipes.quote instead of adding manual quotes |
62 ↗ | (On Diff #189796) | nit: please simplify to del matches[:-5] |
63 ↗ | (On Diff #189796) | nit: pipes.quote please |
76 ↗ | (On Diff #189796) | nit: probably better to concatenate these; otherwise, I think we'll end up with a lot of spaces between these sentences? e.g. help="The path to the llvm-not executable. " "Required if [...]") |
108 ↗ | (On Diff #189796) | I hate being *that* person, this technically isn't portable Python. I don't honestly know if we care about not-CPython being able to run our code, but the fix is to simply use with open(...) as f: instead of this one-liner. (Specifically, CPython uses refcounts, so testfile will always be closed promptly, unless CPython decides someday to make files cyclic with themselves. GC'ed python implementations might take a while to flush this file and clean it up, which would be bad...) |
113 ↗ | (On Diff #189796) | Does creduce try and jump out into its own pgid? If not, I think this try/except block can be replaced with sys.exit(subprocess.call([creduce, testfile, file_to_reduce])) |
120 ↗ | (On Diff #189796) | nit: this is implied; please remove |
The new diffs should not be relative to the previously uploaded diff, but to the git master/svn trunk.
For it to be really useful for the majority of bugs, it would be nice to figure out automatically how to get the preprocessing step done and filter out the # lines afterwards. That part alone significantly cuts down the creduce time.
@gbiv already got all my shell quoting comments.
I think we should do one more round of fixes, we can commit that for you, and then move on to the next steps.
clang/utils/creduce-clang-crash.py | ||
---|---|---|
109 ↗ | (On Diff #189796) | We could try validating that the interestingness test passes here. If it doesn't, that's a bug in this script, I suppose. CReduce already does this for the user, but it's not clear with our usage model how to do this. |
110 ↗ | (On Diff #189796) | Let's add a TODO (or FIXME, that's more LLVM-y) here to add a step that runs the full pre-processor with -E and -P here. As we've discussed, it often doesn't work, but when it does, it avoids all those issues with #defines, comments, etc, breaking up topformflat. This doesn't have to be in the first version, of course. I see @joerg added a comment about this as well. |
I think we should do one more round of fixes, we can commit that for you, and then move on to the next steps
+1. This looks good to me with outstanding nits fixed
and filter out the # lines afterwards.
AIUI, the very-recently-merged https://github.com/csmith-project/creduce/pull/183 and https://github.com/csmith-project/creduce/pull/188 should hopefully handle the majority of those?
clang/utils/creduce-clang-crash.py | ||
---|---|---|
1 ↗ | (On Diff #190164) | nit: please specify a python version here, since the world is steadily making python == python3 (if pipes.quote is working, I assume you'll want #!/usr/bin/env python2) |
clang/utils/creduce-clang-crash.py | ||
---|---|---|
1 ↗ | (On Diff #190164) | Please don't do that, there is no python2.exe or python3.exe on Windows. #!/usr/bin/env python is the simplest thing that could work. |
(A few python style comments; feel free to ignore, and feel free to land no matter what you do with the comments.)
clang/utils/creduce-clang-crash.py | ||
---|---|---|
1 ↗ | (On Diff #190164) | There's no python2 on mac either. #!/usr/bin/env python is the correct first line for executable python scripts. |
29 ↗ | (On Diff #190189) | There's distutils.spawn.find_executable() which does this for you. |
68 ↗ | (On Diff #190189) | It's a somewhat common pattern to put a module docstring at the top of the file instead of a comment which explains usage, and then say description=__doc__ here; see llvm/utils/gn/build/symlink_or_copy.py for an example. |
90 ↗ | (On Diff #190189) | It's a bit more common to say return 1 here and do sys.exit(main()) at the bottom. |
I think that addresses all of the concerns people have put forward; given rnk's comment about one more round of fixes, this LGTM. Will check this in for you shortly.
Thanks again!
clang/utils/creduce-clang-crash.py | ||
---|---|---|
1 ↗ | (On Diff #190164) | TIL. Thank you both for the counterexamples :) |
In case anyone is interested, for the CHERI fork of LLVM/Clang I added a similar script that contains additional features such as inferring the crash message (so that you get the minimal reproducer for the issue that you are trying to reduce and not some obscure parser crash for invalid input), dealing with infinite loops, generating a test case with a mostly sensible RUN: line.
Furthermore, it will attempt to use bugpoint if it is a backend rather than a frontend crash since that is much much faster than creduce.
It should allows you to run $LLVM_BUILDIR/bin/creduce_crash_testcase.py /tmp/clang-reproducers.sh and then give a sensible minimal test case back
I can try to clean that up and remove the CHERI-specific compiler flags and llvm-lit substitutions.
For reference that script can be found here: https://github.com/CTSRD-CHERI/llvm-project/blob/master/clang/utils/creduce_crash_testcase.py