This is an archive of the discontinued LLVM Phabricator instance.

Differential D134427

[Verifier] Allow undef token argument to llvm.experimental.gc.result
ClosedPublic

Authored by dbakunevich on Sep 22 2022, 4:34 AM.

Details

Reviewers

mkazantsev
apilipenko
nikic

Commits

rGfecfd012523f: [Verifier] Allow undef/poison token argument to llvm.experimental.gc.result

Summary

The verifier checks that the token operand of the `experimental_gc_result` intrinsic is a call. This patch skips that check when the token is poison or undef.
See: https://github.com/llvm/llvm-project/issues/57871
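
To make the intended rule concrete, here is a minimal standalone sketch of the decision the verifier makes after this change. It does not use LLVM: `TokenKind` and `checkGcResultToken` are hypothetical names invented for illustration, and only the diagnostic string is taken from the patch.

```cpp
#include <string>

// Toy stand-ins for the kinds of value that can appear as the token
// operand of gc.result. These are NOT the real LLVM classes.
enum class TokenKind { StatepointCall, OtherCall, Undef, Poison, NonCall };

// Returns an empty string when the token operand is acceptable, otherwise
// the diagnostic text that the verifier reports.
std::string checkGcResultToken(TokenKind token) {
  // Behavior added by this patch: undef and poison tokens are accepted,
  // because they can legitimately show up in dead code.
  if (token == TokenKind::Undef || token == TokenKind::Poison)
    return "";
  // Pre-existing behavior: any other token must come from a gc.statepoint call.
  if (token != TokenKind::StatepointCall)
    return "gc.result operand #1 must be from a statepoint";
  return "";
}
```

Under this model, `checkGcResultToken(TokenKind::Poison)` yields no diagnostic, while `checkGcResultToken(TokenKind::OtherCall)` still reports the statepoint error.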

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dbakunevich created this revision. (Sep 22 2022, 4:34 AM)

Herald added a project: Restricted Project. (Sep 22 2022, 4:34 AM)

Herald added a subscriber: hiraditya.

dbakunevich requested review of this revision. (Sep 22 2022, 4:34 AM)

Herald added a project: Restricted Project. (Sep 22 2022, 4:34 AM)

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B188142: Diff 462133. (Sep 22 2022, 4:34 AM)

apilipenko accepted this revision. (Sep 22 2022, 10:59 AM)

apilipenko added inline comments.

llvm/lib/IR/Verifier.cpp
5234–5239
llvm/test/Verifier/gc_result_token.ll
22	Add a new line.

This revision is now accepted and ready to land. (Sep 22 2022, 10:59 AM)

Can you please explain in more detail what the motivation for this change is? I've read your patch description, and I've read the linked issue, but I still don't understand why this change is necessary. The fact that a fuzzer can produce IR that does not verify is not surprising -- there is only an issue if IR that originally verifies stops verifying after the application of an IR optimization pass.

This revision now requires changes to proceed. (Sep 22 2022, 11:26 AM)

When optimizing unreachable code, we remove tokens and replace them with undef/poison in the intrinsics that use them. The verifier then fails its check because it sees a poison token in unreachable code, and rejecting that IR is incorrect.
The original reproducer was a Java program that crashed on this assertion. While tracking down the bug, I wrote a small test that reflects the problem, based on the IR obtained from the Java code. If needed, I can send the Java code that triggers the bug.

This structural constraint (gc.result being tied to a statepoint instruction) can be violated in dead code. We've recently fixed a similar issue for gc.relocates in D128904.

Thanks for the explanation. I do wonder how we end up in the situation where the gc.statepoint gets removed, but the gc.result is not also removed.

llvm/lib/IR/Verifier.cpp
5234	Just `isa<UndefValue>()` is sufficient, it also handles poison.
llvm/test/Verifier/gc_result_token.ll
9	Branch is unnecessary here -- you want label_2 to be unreachable in the CFG, not just dynamically dead.
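
The first inline comment above relies on `PoisonValue` being a subclass of `UndefValue` in LLVM, so a single `isa<UndefValue>()` test covers both. The following standalone sketch models that relationship without LLVM; the structs are simplified stand-ins, and `isaModel` is a hypothetical helper built on `dynamic_cast` rather than LLVM's own `isa<>` machinery.

```cpp
// Miniature of the relevant class hierarchy. Only the inheritance
// relationship is modeled; these are not the real LLVM classes.
struct Value {
  virtual ~Value() = default;
};
struct UndefValue : Value {};
struct PoisonValue : UndefValue {};  // poison "is an" undef in the class hierarchy
struct CallBase : Value {};

// Simplified analogue of llvm::isa<T>(V): true when v's dynamic type is T
// or a class derived from T.
template <typename T>
bool isaModel(const Value &v) {
  return dynamic_cast<const T *>(&v) != nullptr;
}
```

In this model, `isaModel<UndefValue>` is true for both an `UndefValue` and a `PoisonValue`, which is why the extra `isa<PoisonValue>()` test in the first version of the patch was redundant.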

This revision is now accepted and ready to land. (Sep 27 2022, 5:56 AM)

dbakunevich updated this revision to Diff 463529. (Sep 28 2022, 6:25 AM)

mkazantsev added inline comments. (Sep 28 2022, 6:31 AM)

llvm/test/Verifier/gc_result_token.ll
9	No, this is fine. If a pass is unable to modify CFG, it still has a right to replace operands with undef under if(false).

In D134427#3817919, @nikic wrote:

Thanks for the explanation. I do wonder how we end up in the situation where the gc.statepoint gets removed, but the gc.result is not also removed.

@nikic gc.statepoint doesn't need to be removed. It's enough that gc.result is sunk into a block under if(false), and then some optimization recognizes this and replaces its operand with poison.

Why it didn't remove the call at all is a good question, but it could have its reasons.

Harbormaster completed remote builds in B189157: Diff 463529. (Sep 28 2022, 7:43 AM)

This revision was landed with ongoing or failed builds. (Oct 19 2022, 6:52 AM)

Closed by commit rGfecfd012523f: [Verifier] Allow undef/poison token argument to llvm.experimental.gc.result (authored by dbakunevich).

This revision was automatically updated to reflect the committed changes.

dbakunevich added a commit: rGfecfd012523f: [Verifier] Allow undef/poison token argument to llvm.experimental.gc.result.

Revision Contents

Path                                     Size
llvm/lib/IR/Verifier.cpp                 7 lines
llvm/test/Verifier/gc_result_token.ll    21 lines

Diff 468898

llvm/lib/IR/Verifier.cpp

    case Intrinsic::experimental_gc_statepoint:
      Check(Call.getParent()->getParent()->hasGC(),
            "Enclosing function does not use GC.", Call);
      verifyStatepoint(Call);
      break;
    case Intrinsic::experimental_gc_result: {
      Check(Call.getParent()->getParent()->hasGC(),
            "Enclosing function does not use GC.", Call);
+     auto *Statepoint = Call.getArgOperand(0);
+     if (isa<UndefValue>(Statepoint))
+       break;
      // Are we tied to a statepoint properly?
-     const auto *StatepointCall = dyn_cast<CallBase>(Call.getArgOperand(0));
+     const auto *StatepointCall = dyn_cast<CallBase>(Statepoint);
      const Function *StatepointFn =
          StatepointCall ? StatepointCall->getCalledFunction() : nullptr;
      Check(StatepointFn && StatepointFn->isDeclaration() &&
                StatepointFn->getIntrinsicID() ==
                    Intrinsic::experimental_gc_statepoint,
            "gc.result operand #1 must be from a statepoint", Call,
            Call.getArgOperand(0));

Inline comments:

nikic (line 5234): Just `isa<UndefValue>()` is sufficient, it also handles poison.

apilipenko (lines 5234–5239), suggested change:

            "Enclosing function does not use GC.", Call);
-     if (isa<UndefValue>(Call.getArgOperand(0)) ||
-         isa<PoisonValue>(Call.getArgOperand(0)))
+     auto *Statepoint = Call.getArgOperand(0);
+     if (isa<UndefValue>(Statepoint) || isa<PoisonValue>(Statepoint))
        break;
      // Are we tied to a statepoint properly?
-     const auto *StatepointCall = dyn_cast<CallBase>(Call.getArgOperand(0));
+     const auto *StatepointCall = dyn_cast<CallBase>(Statepoint);
      const Function *StatepointFn =

llvm/test/Verifier/gc_result_token.ll

This file was added.

; RUN: opt -S -passes=verify < %s | FileCheck %s

target triple = "x86_64-unknown-linux-gnu"

define void @foo() gc "statepoint_example" personality ptr @P {
; CHECK-NOT: gc.result operand #1 must be from a statepoint
entry:
  br label %label_1
label_1:
; CHECK: ret void
  ret void

label_2:
; CHECK: token poison
  %call = call noundef i32 @llvm.experimental.gc.result.i32(token poison)
  unreachable
}

declare i32 @llvm.experimental.gc.result.i32(token)

declare ptr @P()

Inline comments:

nikic (line 9): Branch is unnecessary here -- you want label_2 to be unreachable in the CFG, not just dynamically dead.

mkazantsev (line 9): No, this is fine. If a pass is unable to modify CFG, it still has a right to replace operands with undef under if(false).

apilipenko (line 22): Add a new line.