This is an archive of the discontinued LLVM Phabricator instance.

[llvm-diff] Explicitly check ConstantStructs for differences
ClosedPublic

Authored by void on Jun 22 2021, 11:49 AM.

Details

Summary

A ConstantStruct is renamed when the LLVM context sees a new one. This
makes global variable initializers appear different when they aren't.
Instead, check the ConstantStruct for equivalence.

Diff Detail

Event Timeline

void requested review of this revision.Jun 22 2021, 11:49 AM
void created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJun 22 2021, 11:49 AM
rjmccall added inline comments.Jun 22 2021, 12:03 PM
llvm/tools/llvm-diff/DifferenceEngine.cpp
477

Hmm. I don't think it's okay to just compare the number of elements: if the types might be different, they might have different layouts.

void added inline comments.Jun 22 2021, 12:26 PM
llvm/tools/llvm-diff/DifferenceEngine.cpp
477

The individual elements are checked in the loop below. I made it recursive because I couldn't count on the types to be uniqued, because of the naming differences.

rjmccall added inline comments.Jun 22 2021, 1:01 PM
llvm/tools/llvm-diff/DifferenceEngine.cpp
477

Sure, I see that, but you could have different struct types with different layouts but compatible elements. Because one is packed and the other isn't, or so on.

void updated this revision to Diff 353782.Jun 22 2021, 2:01 PM

Check for packed structures to ensure they have the same data layout.

void added inline comments.Jun 22 2021, 2:02 PM
llvm/tools/llvm-diff/DifferenceEngine.cpp
477

Ah! I see what you mean. There's a method called isLayoutIdentical, but it checks each element's pointer value. The only other thing it checks is the Packed attribute. I added that.

That should work, thanks. What's going on with StructStack?

void added a comment.Jun 22 2021, 8:33 PM

That should work, thanks. What's going on with StructStack?

Structures are the only types which can reference themselves. If I don't stop processing already seen types then it's possible they could infinitely recurse. The stack / std::find code is very hackish. I'm open to suggestions on making it non-gross. :-)

That should work, thanks. What's going on with StructStack?

Structures are the only types which can reference themselves. If I don't stop processing already seen types then it's possible they could infinitely recurse. The stack / std::find code is very hackish. I'm open to suggestions on making it non-gross. :-)

Well, they can't reference themselves as sub-aggregates; you'll never see another field of this exact type.

void updated this revision to Diff 354089.Jun 23 2021, 3:26 PM

Check to ensure we don't analyze the global variable we're currently analyzing. This causes an infinite loop.

void added a comment.Jun 23 2021, 3:27 PM

That should work, thanks. What's going on with StructStack?

Structures are the only types which can reference themselves. If I don't stop processing already seen types then it's possible they could infinitely recurse. The stack / std::find code is very hackish. I'm open to suggestions on making it non-gross. :-)

Well, they can't reference themselves as sub-aggregates; you'll never see another field of this exact type.

Doh! Of course. I found the actual problem. A global variable may occur in its own initializer as part of a GEP or bitcast. I added a more sane check. PTAL

Oh, yeah, if we recurse into initializers in equivalentAsOperands then we'll have that problem for sure. I don't think you can just check at the top level of a struct initializer, though — you need to be able to do this check in recursive positions, like if the element initializer is a bitcast. I would suggest doing the check specifically in the case where you've found global variables.

Also, you should record both GVL and GVR in the difference engine you construct, and then do something like if (GVL == SavedGVL) return GVR == SavedGVR;. If you see a self-reference in the LHS but the RHS is like a null pointer or some other global, that's a difference, not something we should skip! This will make it a sort of maximal fixed point.

I would suggest just setting SavedGVL and SavedGVR on the temporary difference engine instead of passing them in to all uses of equivalentOnOperands.

void updated this revision to Diff 354101.Jun 23 2021, 4:14 PM

Improve the check of the global variable initializers. Both LHS and RHS need to
match the global variable at the same time when recursing or they don't match.

rjmccall added inline comments.Jun 23 2021, 4:18 PM
llvm/tools/llvm-diff/DifferenceEngine.cpp
572

The other constructors doesn't initialize these. Might be best to just = nullptr them and then override them, either here or separately after construction.

void updated this revision to Diff 354103.Jun 23 2021, 4:21 PM

Initialize Saved[LR]HS to nullptr.

rjmccall accepted this revision.Jun 23 2021, 4:24 PM

Thanks for bearing with all the comments. :) LGTM.

This revision is now accepted and ready to land.Jun 23 2021, 4:24 PM
void added a comment.Jun 23 2021, 4:26 PM

Thanks for bearing with all the comments. :) LGTM.

Not at all! Thanks for the review. :-) You made it better than it was.

This revision was landed with ongoing or failed builds.Jun 23 2021, 4:27 PM
This revision was automatically updated to reflect the committed changes.