This is an archive of the discontinued LLVM Phabricator instance.

StackMap: Fix assertion on undef operands for anyregc
Needs ReviewPublic

Authored by arsenm on Mar 28 2022, 6:19 AM.

Details

Summary

There is some special handling of undef operands here which I don't
understand. For some reason a special offset value is used and the
type is changed to Constant, and then later was replaced with
ConstantIndex based on the dummy offset value. This also seems to be
inconsistently treated in the existing code. See the test changes in
statepoint-fixup-undef.mir since this stops the use of the constant
pool with the special offset in one of the run lines. I don't
understand why an undef operand would end up being encoded as anything
in the final binary.

Also have no idea if the new test checks are meaningful.

This avoids test failures in a future patch which starts discarding
liveness information on register allocation failures.

I also noticed we seem to be lacking verification for PATCHPOINT's
operand restrictions. In particular you hit assertions if the operand
for the number of operands isn't correct.

Diff Detail

Unit TestsFailed

Event Timeline

arsenm created this revision.Mar 28 2022, 6:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 28 2022, 6:19 AM
arsenm requested review of this revision.Mar 28 2022, 6:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 28 2022, 6:19 AM
Herald added a subscriber: wdng. · View Herald Transcript
dantrushin added a comment.EditedMar 28 2022, 8:42 AM

StackMap contents is opaque for compiler and is interpreted by runtime. With your change, relocating garbage collector can get some garbage in register and try to relocate (move) it. This will result in immediate crash.
Also, StackMap can encode operand types in it. Suddenly changing operand size from .quad to .long might also break runtime's invariants.

I think proper fix would be to add parameter to StackMaps::parseOperand() which will tell if replacement of undef is legal or not. For call arguments is should be false while for variable part it must be true.
(And then StackMaps::recordPatchPoint() in case of anyreg convention will have to split call to recordStackMapOpers into two calls - one for call arguments and second for variable arguments)

StackMap contents is opaque for compiler and is interpreted by runtime. With your change, relocating garbage collector can get some garbage in register and try to relocate (move) it. This will result in immediate crash.
Also, StackMap can encode operand types in it. Suddenly changing operand size from .quad to .long might also break runtime's invariants.

I think proper fix would be to add parameter to StackMaps::parseOperand() which will tell if replacement of undef is legal or not. For call arguments is should be false while for variable part it must be true.
(And then StackMaps::recordPatchPoint() in case of anyreg convention will have to split call to recordStackMapOpers into two calls - one for call arguments and second for variable arguments)

I tried this and am getting confused about the operand structure for PATCHPOINT and why there need to be 2 calls. It seems like it's already splitting the set of operands since getStackMapStartIdx is different for anyregcc.

I keep coming back to questioning why do this in the first place. i.e. the original change in D94703 doesn't make any sense to me. If the motivation is purely to avoid the verifier error, the correct solution is to handle undef in later passes to preserve correct liveness. From a lowering perspective, undef doesn't exist and shouldn't receive special treatment. If I revert that patch, the verifier error is introduced in FixupStatepointCallerSaved, which blindly inserts spills. It could instead skip the spill for undef registers (or would need to propagate the undef on the new store value operand)

StackMap contents is opaque for compiler and is interpreted by runtime. With your change, relocating garbage collector can get some garbage in register and try to relocate (move) it. This will result in immediate crash.
Also, StackMap can encode operand types in it. Suddenly changing operand size from .quad to .long might also break runtime's invariants.

I think proper fix would be to add parameter to StackMaps::parseOperand() which will tell if replacement of undef is legal or not. For call arguments is should be false while for variable part it must be true.
(And then StackMaps::recordPatchPoint() in case of anyreg convention will have to split call to recordStackMapOpers into two calls - one for call arguments and second for variable arguments)

I tried this and am getting confused about the operand structure for PATCHPOINT and why there need to be 2 calls. It seems like it's already splitting the set of operands since getStackMapStartIdx is different for anyregcc.

I keep coming back to questioning why do this in the first place. i.e. the original change in D94703 doesn't make any sense to me. If the motivation is purely to avoid the verifier error, the correct solution is to handle undef in later passes to preserve correct liveness. From a lowering perspective, undef doesn't exist and shouldn't receive special treatment. If I revert that patch, the verifier error is introduced in FixupStatepointCallerSaved, which blindly inserts spills. It could instead skip the spill for undef registers (or would need to propagate the undef on the new store value operand)

D122605 is an alternative that avoids touching the undef operands

I'm sorry, I perhaps wasn't clear - English is not my native language. Let me try again.

First, StackMaps are used to communicate some (opaque to compiler) state to runtime.
Among other things, STATEPOINT passes set of pointers which need relocation to GC.
0xFEFEFEFE is a special sentinel value which means dead pointer. Now you just passing garbage to GC. (Theoretically this might be OK, practically I don't think so)

Second, current LLVM implementation supports passing pointers only in callee saved registers in StackMaps. If assigned by RA to other register, it must be spilled and that spill location must be recorded in StackMap instead.
This is what FixupStatepointCallerSaved pass does. Now you encode forbidden register in StackMap. Runtime will crash or behave strangely

Third, runtime might have enough knowledge of StackMap contents to wish to validate data types in it.
This is why 64-bit sentinel constant is used. And 64 bit constants are encoded via constant pool indexing. Now it will blow up.

PATCHPOINT is special pseudo with fancy encoding requirements. With anyregcc convention it requires function return value and arguments to be encoded in StackMap as well.
Obviously, it has different encoding requirements for THAT part.

I don't know if you really use PATHPOINT instruction or simply fixing some artificial test, but we use STATEPOINT instructions as described and you just broke it.

And, by the way, you test case does not make any sens to me. Is it correct to have a dead DEF with undef USE? What that would mean?

0xFEFEFEFE is a special sentinel value which means dead pointer. Now you just passing garbage to GC. (Theoretically this might be OK, practically I don't think so)

It seems to me like you are blessing undef registers with special semantics, which is not what undef is for. If the value was undefined, it's undefined. The undef marker just informs the register liveness tracking of this. It doesn't make sense to have a lowering pass try to fix up the undefined value to something to avoid a crash. Transforming this is basically creating a mini-sanitizer/hardening.

PATCHPOINT is special pseudo with fancy encoding requirements. With anyregcc convention it requires function return value and arguments to be encoded in StackMap as well.
Obviously, it has different encoding requirements for THAT part.

I don't know if you really use PATHPOINT instruction or simply fixing some artificial test, but we use STATEPOINT instructions as described and you just broke it.

I don't use it and an simply trying to fix lit test failures after an unrelated patch.

And, by the way, you test case does not make any sens to me. Is it correct to have a dead DEF with undef USE? What that would mean?

Yes. These flags maintain liveness and don't imply any particular semantic meaning. It means there's no relation between the def and the use and the register is available between the two points. In the cases I'm trying to fix, the liveness is artificial. The register allocation failed, so the liveness information is essentially garbage. My patch will produce flags like this to avoid triggering verifier errors after fatal compile errors.

0xFEFEFEFE is a special sentinel value which means dead pointer. Now you just passing garbage to GC. (Theoretically this might be OK, practically I don't think so)

It seems to me like you are blessing undef registers with special semantics, which is not what undef is for. If the value was undefined, it's undefined. The undef marker just informs the register liveness tracking of this. It doesn't make sense to have a lowering pass try to fix up the undefined value to something to avoid a crash. Transforming this is basically creating a mini-sanitizer/hardening.

OK, what about non-callee saved register appearing in stackmap, which is not supported?

JFYI, the original special handling for undef was for diagnostic quality, nothing more. We wanted a fixed value for undef to avoid arbitrarily complex and broken stackmap entries when the input IR contained undef. Such stackmaps should be dynamically dead, but having to defer all validation until runtime was non-ideal. Using a constant value in place of undef gave us a way to constrain how broken the stackmap entries appeared. As a result, the JIT could meaningfully post-validate a stackmap section and detect errors eagerly.

(This is different than what Denis said above in an subtle way - the signal value does *not* mean "dead pointer". It's simply a constraint on the structure of a dynamically dead stackmap entry. Same in practice, slightly different in motivation.)

I think we can use any consistent handling for undef pointers in gc-live list. (Er, side note: structure of deopt is important, we can't "just drop" a field from it.)

I don't think anyone is using patchpoint. I suspect the semantics of it have bit rotten enough to be hard to make many reasonable assumptions. statepoint is in active use.