Page MenuHomePhabricator

[GlobalISel]: Fix some non determinism exposed in CSE due to not notifying observers about mutations + add verification for CSE
AcceptedPublic

Authored by aditya_nandakumar on Sep 3 2019, 3:43 PM.

Details

Summary

While investigating some non determinism (CSE doesn't produce wrong code, it just doesn't CSE some times) in GISel CSE on an out of tree target, I realized that the core issue was that there were lots of code that mutates (setReg, setRegClass etc), but doesn't notify observers (CSE in this case but this could be any other observer). In order to make the Observer be available in various parts of code and to avoid having to thread it through various API, the MachineFunction now has the observer as field. This allows it to be easily used in helper functions such as constrainOperandRegClass. One of the other option I considered was to have some sort of unique_ptr<GISelContext> GISelCtxt and move the various GISel related properties into this as well but for now I didn't do this.

Also added some invariant verification method in CSEInfo which can catch these issues (when CSE is enabled). Right now it's at the end the legalizer and a few tests in AArch64,AMDGPU,X86 fail. I haven't fixed these yet, but will get to them hopefully this week while we get some eyes on this patch. Also it's not clear to me if this should be always run at the end of each pass (where CSE is enabled in DEBUG?)?

One of the options considered was to automatically track mutations but this is very invasive and the overhead is likely a lot.

Diff Detail

Repository
rL LLVM

Event Timeline

dsanders edited the summary of this revision. (Show Details)Sep 3 2019, 3:50 PM
paquette added inline comments.Sep 4 2019, 11:33 AM
llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
298–299

Should this be under #ifndef NDEBUG?

aditya_nandakumar marked an inline comment as done.Sep 4 2019, 11:51 AM
aditya_nandakumar added inline comments.
llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
298–299

Definitely. Thanks for catching.

qcolombet added inline comments.Sep 4 2019, 11:59 AM
llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
298–299

I would rather have verify returns a bool and do:
assert(!CSEInfo || CSEInfo->Info());

dsanders accepted this revision.Sep 5 2019, 3:27 PM

LGTM with the changes Jessica and Quentin mentioned (or the XRay style Error verify() version).

llvm/include/llvm/CodeGen/GlobalISel/GISelChangeObserver.h
104

It predates this patch but the reference to CSEInfo here is inaccurate. That's only one usage of the delegate mechanism.

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2271–2272

Is there ever a time when we want to do one of these and not the other? As far as I can see, installing the observer is always coupled with installing the delegate

llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
298–299

I would rather have verify returns a bool and do:
assert(!CSEInfo || CSEInfo->Info());

Offline, I commented to Aditya that the precedent was for verify() functions that return void and abort on failure but I do agree with this.

That said, I would like to point out that there's another possibility of returning an Error object like in XRay's BlockVerifier.h. That style has the benefit of being able to report exactly what's wrong and potentially annotate the state dump with indications as to which bit is invalid/stale.

This revision is now accepted and ready to land.Sep 5 2019, 3:27 PM

Ping. Can we combine the delegate and observer installers?

Ping. Can we combine the delegate and observer installers?

Thanks Amara for the feedback. The reason I haven't responded here (or committed this) is that I haven't had the chance to track and fix the issues in the various backends where this fails (probably missing Observer calls). Maybe for now, I will disable the verification by default and get this patch in, and we can work on enabling it by default in a follow up patch?

Ping. Can we combine the delegate and observer installers?

Thanks Amara for the feedback. The reason I haven't responded here (or committed this) is that I haven't had the chance to track and fix the issues in the various backends where this fails (probably missing Observer calls). Maybe for now, I will disable the verification by default and get this patch in, and we can work on enabling it by default in a follow up patch?

Either way is fine for me, it can wait until the bugs are found.