This is an archive of the discontinued LLVM Phabricator instance.

[llvm-reduce] keep terminator instructions during ReduceGlobalVars
Abandoned · Public

Authored by dwightguth on Nov 12 2021, 11:12 AM.

Details

Reviewers
aeubanks
Summary

Previously we were deleting instructions that use a global variable when
we delete the global variable. However, this obviously does not work if
the instruction is a terminator instruction as the resulting basic block
will have no terminator, resulting in invalid IR. To fix this, we simply
ensure that terminator instructions are never added to the list of
instructions to delete.
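
The filtering described in the summary can be sketched with a toy model. This is not the llvm-reduce source: the `Instruction` class, the `reduce_block` helper, and the string operands are all hypothetical stand-ins, and the only LLVM fact assumed is the set of terminator opcodes. It shows users of a removed global being deleted unless they are terminators, while surviving uses are rewritten to undef.

```python
from dataclasses import dataclass, field

# Opcodes that end a basic block in LLVM IR.
TERMINATORS = {
    "ret", "br", "switch", "indirectbr", "invoke", "callbr",
    "resume", "catchswitch", "catchret", "cleanupret", "unreachable",
}


@dataclass
class Instruction:
    """Hypothetical stand-in for an IR instruction (not the LLVM class)."""
    opcode: str
    operands: list = field(default_factory=list)

    def is_terminator(self):
        return self.opcode in TERMINATORS


def reduce_block(block, removed_globals):
    """Drop non-terminator users of removed globals; keep terminators.

    Surviving uses of a removed global are rewritten to "undef", which
    models replacing all uses of the global with an undef value.
    """
    kept = []
    for inst in block:
        uses_removed = any(op in removed_globals for op in inst.operands)
        if uses_removed and not inst.is_terminator():
            continue  # safe to delete: the block keeps its terminator
        inst.operands = [
            "undef" if op in removed_globals else op for op in inst.operands
        ]
        kept.append(inst)
    return kept


block = [
    Instruction("load", ["@g"]),
    Instruction("store", ["%x", "@h"]),
    Instruction("ret", ["@g"]),
]
reduced = reduce_block(block, {"@g"})
```

In this sketch the `load` of `@g` is dropped, the unrelated `store` is untouched, and the `ret` stays in place with its operand replaced, so the block still ends in a terminator.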

Event Timeline

dwightguth requested review of this revision. Nov 12 2021, 11:12 AM
dwightguth created this revision.
Herald added a project: Restricted Project. Nov 12 2021, 11:12 AM

Can we not delete instructions in this pass? Just do the RAUW of global vars and leave the instruction deleting to another pass?

I can definitely give that a try. I was trying to leave the code as is as much as possible, but I don't mind tackling this cleanup, since it makes sense to me and shouldn't be too hard to implement. I won't get to it until Monday, though.

I actually just realized that I don't think what you suggested is going to work. If you replace all the uses of the variable with undef but don't delete the instructions that rely on that undef, you're left with loads and stores to undef, which trigger undefined behavior. Any interestingness test that happens to execute the program is then likely to segfault and be judged uninteresting, so many global variables will not be deleted even though they are not actually needed. I think it's better to leave this code as it is.

llvm-reduce does not preserve semantics in any way at all; it's purely meant to create a small reproducer for some crash.

Sure, but it can also be used to produce a small test case for a program that was miscompiled. It won't do that effectively if individual delta passes, when run, cause the reduced program to crash unnecessarily in uninteresting ways. If we make this change, we will be making the tool considerably less useful for minimizing programs that are miscompiled by the compiler.

I echo what @aeubanks said.

I think you'll want another tool that specifically does that since this tool specifically does not do that. For example, something that runs specific optimization passes one at a time, perhaps on a function at a time for function passes.

I genuinely don't understand the concern here. There are two common types of bugs in the LLVM compiler. The first is when the compiler itself crashes. For this, all the interestingness test in llvm-reduce needs to do is run llc and report the input as interesting if the compiler crashes. In this case, it's true that you don't care what the program does when linked or executed. The second is when the compiler miscompiles a program. To test whether the reduced program is interesting in this case, it's necessary to actually link and run it. Right now, llvm-reduce can be used effectively to reduce test cases that the compiler miscompiles. It's rather slow, but it works. If we make this change, it will no longer be able to reduce global variables effectively in this case, because removing a global variable will make the reduced program segfault, rendering that reduction attempt uninteresting. As a result, the final reduced program will still contain every live global variable, which seems undesirable to me in a minimizer.

For reference, this is exactly what I am currently using llvm-reduce to do: I have a bug in the code generator that I am trying to construct a minimal example for. I can't test compilation passes because the bug isn't in a compiler pass, it's in the backend. I can't use bugpoint because bugpoint relies on having a "safe" compiler for the program, and I don't have one of those. If you know of another tool that I can use to minimize llvm IR for the purposes of reproducing miscompilations, I'd be happy to use it instead. But the claim that it's specifically not designed to reduce miscompilations seems vacuous to me when it is demonstrably currently capable of doing exactly that.
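
The miscompilation-style interestingness test described above can be sketched as follows. This is a minimal illustration rather than the script actually used in this review: the function names, the output file names, and the expected output `b"42\n"` are hypothetical, and it assumes the known-good output of the program is available and that the miscompiled program exits normally with different output. The `run` parameter exists only so the logic can be exercised without a toolchain.

```python
import subprocess


def is_miscompile(ir_path, good_output, run=subprocess.run, timeout=10):
    """Return True if the IR builds and runs cleanly but prints wrong output."""
    build_steps = [
        ["llc", "-filetype=obj", ir_path, "-o", "reduced.o"],
        ["clang", "reduced.o", "-o", "reduced.exe"],
    ]
    for cmd in build_steps:
        if run(cmd, capture_output=True).returncode != 0:
            return False  # does not compile or link: uninteresting
    try:
        result = run(["./reduced.exe"], capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # hangs: uninteresting
    if result.returncode != 0:
        # Crashed at run time, e.g. a segfault from a load or store
        # through undef: uninteresting for a miscompilation hunt.
        return False
    return result.stdout != good_output


def main(argv):
    """llvm-reduce treats exit status 0 from the test as 'interesting'."""
    # b"42\n" is a placeholder for the known-good output of the program.
    return 0 if is_miscompile(argv[1], b"42\n") else 1
```

A wrapper script would call `sys.exit(main(sys.argv))` and be passed to llvm-reduce through its `--test` option. The run-time crash check is the point of contention in this thread: a reduction that leaves loads and stores to undef fails that check, so the reduction is rejected.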

Basically no delta pass preserves semantics, so I'm very surprised that you're able to get anything useful to link. For example, arguably the most useful reduction, ReduceInstructions.cpp, RAUWs instructions with undef, which is clearly not semantics-preserving in almost every case.

And yet... I have managed to use it so far to reduce 84MB of bitcode to 2.5MB of bitcode. It is still reducing.

dwightguth abandoned this revision. Nov 17 2021, 1:19 PM

There doesn't seem to be an appetite for changes that improve the usability of llvm-reduce as a tool for reducing miscompilations, despite it already being demonstrably quite useful for that purpose. I don't currently have much motivation to spend time reworking this revision into a form that would be accepted when doing so would not be useful to me. If someone else wants to take this change over and implement it in the form suggested, that's their choice. Barring any change to the consensus, I'm going to abandon this change.