This is an archive of the discontinued LLVM Phabricator instance.

Bug fix on stable hash calculation for machine operands RegisterMask and RegisterLiveOut
ClosedPublic

Authored by yozhu on Sep 10 2022, 1:21 AM.

Details

Summary

MachineOperand::getRegMask() returns a pointer to register mask. We should hash the raw content of register mask instead of its pointer.

Diff Detail

Event Timeline

yozhu created this revision.Sep 10 2022, 1:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2022, 1:21 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
yozhu requested review of this revision.Sep 10 2022, 1:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2022, 1:21 AM
kyulee added inline comments.Sep 10 2022, 8:02 AM
llvm/lib/CodeGen/MachineStableHash.cpp
122–135

Strictly speaking, the mask pointer can be invalid or the mask size can be longer than int value.
Should we iterate it similar to https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/MachineOperand.cpp#L318?

Is it possible to add a test case to prevent this from regressing in the future? (It;s fine if the answer is "no"; I'm not familiar enough with this part of the code to say.)

yozhu updated this revision to Diff 459324.Sep 10 2022, 10:39 PM

Address comment on needing to consider regmask size; it is not necessarily a uint32_t.

yozhu marked an inline comment as done.Sep 10 2022, 10:45 PM

Is it possible to add a test case to prevent this from regressing in the future? (It;s fine if the answer is "no"; I'm not familiar enough with this part of the code to say.)

I'm not sure if it is feasible to add a small unit test for this. Stable hash calculated on machine operand doesn't exist in compiler output or produced OBJ or binary, and also it is hard to tell whether a hash value is calculated from hashing pointers or raw data. The problem was manifested as non-determinism in binary. If we test determinism, then the input need to be relatively large (larger than usual unit test) so any inherent randomness could be exposed relatively reliably. But determinism test will verify the whole toolset instead of only testing machine IR stable hash calculation.

kyulee accepted this revision.Sep 12 2022, 9:38 AM
This revision is now accepted and ready to land.Sep 12 2022, 9:38 AM
MatzeB added a subscriber: MatzeB.Sep 12 2022, 1:39 PM
MatzeB added inline comments.
llvm/include/llvm/CodeGen/MachineOperand.h
644–646

I am not sure this is a good API if it behaves differently depending on whether the instruction is added to a machine function / block yet. This can be very misleading for people using this function, so I think it may have been better not to add a publicly visible function here and rather deal with it within the hash function directly...

llvm/lib/CodeGen/MachineStableHash.cpp
123–135

So this means the hash value changes depending on whether the instruction is attached to a MachineFunction yet?. I find a hash value changing like that dangerous / unexpected to users. Did you try whether you can assert/abort instead if the instruction is not attached to a function?

efriedma added inline comments.
llvm/lib/CodeGen/MachineOperand.cpp
394

While you're here, can you also fix llvm::hash_value?

MatzeB added a comment.EditedSep 12 2022, 1:53 PM

An alternative approach if we cannot guarantee the availabily of a MachineFunction reference, may be to change machine operands to store ArrayRef<Register> RegMask instead of uint32_t *RegMask. It shouldn't affect memory usage because MachineOperand size is determined by the biggest union member anyway which is already 2 pointers...

yozhu added inline comments.Sep 12 2022, 2:47 PM
llvm/lib/CodeGen/MachineOperand.cpp
394

Yes, will do. Thanks for pointing this out!

yozhu added a comment.Sep 12 2022, 7:50 PM

Discussed with @MatzeB and @kyulee offline, and we will put up a new diff to remove the new API (so to get rid of confusion) and to duplicate the code in the few places where we need to hash MO. For the concern that MO will have different hash values between when it has an associated MF and when it not, we will add an assert (suggesting that in most cases, if not all, we shouldn't hash MO if it doesn't belong to a MF) but will leave the fallback code as in the current change (for Release build, in case it is a valid scenario somewhere but not covered by the current set of tests).