This is an archive of the discontinued LLVM Phabricator instance.

[Verifier] Allow undef token argument to llvm.experimental.gc.result
ClosedPublic

Authored by dbakunevich on Sep 22 2022, 4:34 AM.

Details

Summary

The verifier checks that the token of “experimental_gc_result” intrinsic is some kind of call. Therefore, checks that the token is not poison or undef.
See: https://github.com/llvm/llvm-project/issues/57871

Diff Detail

Event Timeline

dbakunevich created this revision.Sep 22 2022, 4:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 4:34 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
dbakunevich requested review of this revision.Sep 22 2022, 4:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 4:34 AM
apilipenko accepted this revision.Sep 22 2022, 10:59 AM
apilipenko added inline comments.
llvm/lib/IR/Verifier.cpp
5234–5239
llvm/test/Verifier/gc_result_token.ll
22

Add a new line.

This revision is now accepted and ready to land.Sep 22 2022, 10:59 AM
nikic requested changes to this revision.Sep 22 2022, 11:26 AM
nikic added a subscriber: nikic.

Can you please explain in more detail what the motivation for this change is? I've read your patch description, and I've read the linked issue, but I still don't understand why this change is necessary. The fact that a fuzzer can produce IR that does not verify is not surprising -- there is only an issue if IR that originally verifies stops verifying after the application of an IR optimization pass.

This revision now requires changes to proceed.Sep 22 2022, 11:26 AM

As part of the optimization in the unreachable code, we remove tokens, thereby replacing them with undef/poison in intrinsics. But the verifier falls on the assertion, within of what it sees token poison in unreachable code, which in turn is incorrect.
The original code was in the form of a Java program that crashed on this assert. As part of the search for a bug, based on ir obtained from java, a mini test was written that reflects the problem. If you need, I can send Java code with this bug.

This structural constraint (gc.result being tied to a statepoint instruction) can be violated in dead code. We've recently fixed a similar issue for gc.relocates in D128904.

nikic resigned from this revision.Sep 27 2022, 5:54 AM
nikic retitled this revision from Fixed bug on token undef/poison in unreachable code to [Verifier] Allow undef token argument to llvm.experimental.gc.result.

Thanks for the explanation. I do wonder how we end up in the situation where the gc.statepoint gets removed, but the gc.result is not also removed.

llvm/lib/IR/Verifier.cpp
5234

Just isa<UndefValue>() is sufficient, it also handles poison.

llvm/test/Verifier/gc_result_token.ll
9

Branch is unnecessary here -- you want label_2 to be unreachable in the CFG, not just dynamically dead.

This revision is now accepted and ready to land.Sep 27 2022, 5:56 AM
mkazantsev added inline comments.Sep 28 2022, 6:31 AM
llvm/test/Verifier/gc_result_token.ll
9

No, this is fine. If a pass is unable to modify CFG, it still has a right to replace operands with undef under if(false).

mkazantsev added a comment.EditedSep 28 2022, 6:32 AM

Thanks for the explanation. I do wonder how we end up in the situation where the gc.statepoint gets removed, but the gc.result is not also removed.

@nikic gc.statepoint doesn't need to be removed. It's enough that gc.result is sunk into a block under if(false), and then some opt understood this and replace its operand with poison.

Why it didn't remove the call at all is a good question, but it could have its reasons.

This revision was landed with ongoing or failed builds.Oct 19 2022, 6:52 AM
This revision was automatically updated to reflect the committed changes.