This is an archive of the discontinued LLVM Phabricator instance.

[MachineVerifier] Fix crash on early clobbered subreg operands.
ClosedPublic

Authored by dfukalov on May 30 2022, 8:41 AM.

Details

Summary

MachineVerifier tried to checkLivenessAtDef() ignoring it is actually a subreg.

The issue was with processing two subregs of the same reg are used in the same
instruction (e.g. inline asm): "def early-clobber" and other just "def".

Diff Detail

Event Timeline

dfukalov created this revision.May 30 2022, 8:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 30 2022, 8:41 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
dfukalov requested review of this revision.May 30 2022, 8:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 30 2022, 8:41 AM
arsenm added inline comments.Jun 8 2022, 5:58 AM
llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

tests in test/MachineVerifier should run none to just run the verifier. Shouldn't be running codegen passes

10–13

Don't need the register section

20

Set the register class like %2:vreg_64?

dfukalov updated this revision to Diff 435534.Jun 9 2022, 7:09 AM

Beautified test.

llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

The crush is observed only when the verifier runs after a pass that has preserved liveintervals analysis.

arsenm added inline comments.Jun 9 2022, 6:59 PM
llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

llc -march=amdgcn -run-pass liveintervals -verify-machineinstrs should work. Don't want to rely on a pass that could make changes for testing this

foad added a subscriber: qcolombet.

I think this needs more explanation. It seems like it only fails verification when the same instruction has both a normal subreg def and an early-clobber subreg def of the same register. Is that correct? And the problem is that the super range starts at the early clobber slot (16e), but the subrange for the normal subreg def starts at the reg slot (16r):

%0 [16e,32r:0) 0@16e  L0000000000000003 [16e,32r:0) 0@16e  L000000000000000C [16r,32r:0) 0@16r  weight:0.000000e+00
...
*** Bad machine code: Inconsistent valno->def ***
- function:    sub0
- basic block: %bb.0  (0xbdc57a8) [0B;64B)
- instruction: 16B	INLINEASM &"" [attdialect], $0:[regdef-ec:VGPR_32], def undef early-clobber %0.sub0:vreg_64, $1:[regdef:VGPR_32], def undef %0.sub1:vreg_64
- operand 5:   undef %0.sub1:vreg_64
- liverange:   [16e,32r:0) 0@16e
- v. register: %0
- ValNo:       0 (def 16e)
- at:          16r

I think your fix looks OK but I'd also like to hear opinions from @MatzeB or @qcolombet.

llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

I have a vague recollection that -run-pass liveintervals -verify-machineinstrs isn't enough because the pass manager frees the liveintervals analysis as soon as it is no longer required, before the verifier has a chance to run. But I could be wrong.

Jay, you are completely correct. Sorry I didn't mention the same info as in dependent review. Updated the description.

llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

Yes, you're right, the verifier doesn't get liveinterval analysis in -run-pass liveintervals -verify-machineinstrs case. Regcoalescer seemed to me the smallest one from a few passes allowed to run unto crash.

dfukalov edited the summary of this revision. (Show Details)Jun 10 2022, 3:24 AM
arsenm added inline comments.Jun 10 2022, 7:38 AM
llvm/lib/CodeGen/MachineVerifier.cpp
2485

This does have a SubRangeCheck flag. Also wouldn't you want to perform the same checks over each subrange?

llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

I think I figured out some trick to get live intervals preserved here but I forget what it was. I think you also need to use -verify-coalescing to get a verifier run before the pass actually runs, instead of -verify-machineinstrs

dfukalov added inline comments.Jun 10 2022, 1:21 PM
llvm/lib/CodeGen/MachineVerifier.cpp
2485

The checks over each subrange is performed in the if() just below this code.

llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

Actually in -run-pass=liveintervals,simple-register-coalescing -verify-machineinstrs case the verifier instance that runs into crash is performed after liveintervals pass and before coalescer. On the other hand -run-pass=liveintervals,simple-register-coalescing -verify-coalescing runs into verifier within coalescer (but before main work) so it seems a little more fragile.

arsenm added inline comments.Jun 13 2022, 5:34 PM
llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

I've been running into this problem a lot recently. I think we need to add a flag to force a dependency on the live intervals to MachineVerifier. We'll probably find a good number of failures in the test suite if it's forced on by default

foad added inline comments.Jun 14 2022, 3:13 AM
llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

I had a patch to do this with no flag. I've posted it as D127731. Yes there are plenty of failures.

@MatzeB, @qcolombet, would you please have a look?

Gentle ping...

dfukalov updated this revision to Diff 449097.Aug 1 2022, 12:38 PM

Using pipeliner instead of RC as dummy pass.

MatzeB added a comment.Aug 1 2022, 1:23 PM

Sorry, but at a first glance this looks like you are just disabling verifier checks.

The summary says that checkLivenessAtDef() would ignore that something is a subreg definition, but there are SubRangeCheck parameters and a LaneMask parameter, so this doesn't seem to be true in general.

Is this really an incorrect check? Or does LLVM compute wrong liveness information?

Could you share a dump of the computed liveintervals for the given example here? (I don't have an amdgpu-enabled build ready at the moment).

llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

Why not just -run-pass=liveintervals?

MatzeB added inline comments.Aug 1 2022, 1:27 PM
llvm/test/MachineVerifier/verifier-ec-subreg-liveness.mir
2

Oh, this is because machine verifier has no addUsedIfAvailable<LiveIntervals>(); at the moment. Maybe add a comment about it, or the next reviewer will just wonder again...

MatzeB requested changes to this revision.Aug 1 2022, 1:40 PM

I guess Jay already posted the liveness info. It seems this is related to CalcLiveRangeUtilBase::createDeadDef moving definitions to the early clobber slot when there's an early-clobber and a normal def. Judging by the longer explanation in that function this is probably deliberate and we have to blame the MachineVerifier here. But that means you should fix the machine verifier and not just disable a whole subset of tests!

I think to adapt to this, when you see an operand defined at a normal position but the live range starting at an early-clobber spot you need to do a 2nd pass making sure that there is another early-clobber def and in that case skip the warning!

This revision now requires changes to proceed.Aug 1 2022, 1:40 PM

I guess Jay already posted the liveness info. It seems this is related to CalcLiveRangeUtilBase::createDeadDef moving definitions to the early clobber slot when there's an early-clobber and a normal def. Judging by the longer explanation in that function this is probably deliberate and we have to blame the MachineVerifier here. But that means you should fix the machine verifier and not just disable a whole subset of tests!

The actual issue here is we are calling checkLivenessAtDef(MO, MONum, DefIdx, *LI, Reg) where DefIdx is the slot for the checked operand that is subreg but LI is the interval for the whole superreg.
So it just seemed me inconsistent.
Additional checks for subregs case are performed in the following lines 2442-2451.

I think to adapt to this, when you see an operand defined at a normal position but the live range starting at an early-clobber spot you need to do a 2nd pass making sure that there is another early-clobber def and in that case skip the warning!

So if you mean that such usage of checkLivenessAtDef() is correct I can modify the function to skip warning.

dfukalov updated this revision to Diff 449269.Aug 2 2022, 5:28 AM

Moved warning skipping into checkLivenessAtDef() to get back some checks.

foad added inline comments.Aug 2 2022, 5:54 AM
llvm/lib/CodeGen/MachineVerifier.cpp
2312

Do we need an "else" case here? You could at least check SlotIndex::isSameInstr(VNI->def, DefIdx). And maybe check that VNI-def is the early clobber slot and DefIdx is the normal slot.

And (as @MatzeB said) maybe even scan the instruction to check that there is an early-clobber def of the same superregister somewhere. Or maybe that would be better handled by a separate check that any range with subranges has the same start idx as one of the subranges.

dfukalov updated this revision to Diff 451849.EditedAug 11 2022, 7:16 AM

Addresses comments.

@foad please review I correctly implemented
"VNI-def is the early clobber slot and DefIdx is the normal slot"
as
VNI->def.isEarlyClobber() && DefIdx > VNI->def?

@MatzeB I double-checked that early-clobber def of superreg has a corresponding
subreg early-clobber def: I hacked range being checked in debugger and got the
following:

%0 [16e,32r:0) 0@16r L..3 [16r,32r:0) 0@16r L..C [16r,32r:0) 0@16r weight:0.000000e+00
and
%0 [16r,32r:0) 0@16e L..3 [16r,32r:0) 0@16r L..C [16r,32r:0) 0@16r weight:0.000000e+00

generate *** Bad machine code: Live segment must begin at MBB entry or valno def ***
in MachineVerifier::verifyLiveRangeSegment()

%0 [16e,32r:0) 0@16e L..3 [16r,32r:0) 0@16r L..C [16r,32r:0) 0@16r weight:0.000000e+00
generates *** Bad machine code: Non-PHI, non-early clobber def must be at a register slot ***
in MachineVerifier::verifyLiveRangeValue()

both of them are called from MachineVerifier::visitMachineFunctionAfter() (added mention to the comment)

dfukalov added inline comments.Sep 1 2022, 6:39 AM
llvm/lib/CodeGen/MachineVerifier.cpp
2304

@foad would you please approve the condition means your suggestion "VNI-def is the early clobber slot and DefIdx is the normal slot"?

foad added inline comments.Sep 5 2022, 1:36 AM
llvm/lib/CodeGen/MachineVerifier.cpp
2301

Don't need the last set of parens on this line.

2304

Looks OK but I think (!VNI->def.isEarlyClobber() || !DefIdx.isRegister()) might be clearer.

dfukalov updated this revision to Diff 457950.Sep 5 2022, 5:19 AM

Addressed comment.

llvm/lib/CodeGen/MachineVerifier.cpp
2301

Formally yes, but without the parens I got

warning: '&&' within '||' [-Wlogical-op-parentheses]
note: place parentheses around the '&&' expression to silence this warning
foad added inline comments.Sep 5 2022, 5:24 AM
llvm/lib/CodeGen/MachineVerifier.cpp
2301

By "the last set" I meant the parens around VNI->def != DefIdx.

dfukalov updated this revision to Diff 457968.Sep 5 2022, 6:28 AM

Removed redundant parenses.

foad accepted this revision.Sep 5 2022, 6:31 AM

LGTM, thanks!

This revision was not accepted when it landed; it landed in state Needs Review.Sep 5 2022, 7:08 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.