This is an archive of the discontinued LLVM Phabricator instance.

[BasicAA] Model implicit trunc of GEP indices
ClosedPublic

Authored by nikic on Oct 1 2021, 2:48 PM.

Details

Summary

GEP indices larger than the GEP index size are implicitly truncated to the index size. BasicAA currently doesn't model this, resulting in incorrect alias analysis results.

Fix this by explicitly modelling truncation in CastedValue in the same way we do zext and sext. Additionally we need to disable a number of optimizations for truncated values, in particular "non-zero" and "non-equal" may no longer hold after truncation. I believe the constant offset heuristic is also not necessarily correct for truncated values, but wasn't able to come up with a test for that one.

A possible followup here would be to use the new mechanism to model explicit trunc as well (which should be much more common, as it is the canonical form). This is straightforward, but omitted here to separate the correctness fix from the analysis improvement.

(Side note: While I say "index size" above, BasicAA currently uses the pointer size instead. Something for another day...)

Diff Detail

Event Timeline

nikic created this revision.Oct 1 2021, 2:48 PM
nikic requested review of this revision.Oct 1 2021, 2:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 1 2021, 2:48 PM
nikic updated this revision to Diff 379058.Oct 12 2021, 8:46 AM

Rebase & ping

Quick scan only. I don't have a lot of context on this code.

Approach wise, it would seem lower risk to model explicit truncates before adding the implicit trunc on gep operands. (i.e. easier to write test cases, less likelihood of wide spread miscompiles if we got it slightly wrong, less interaction w/e.g. "pointer size" piece) Can I ask why you approached it this was instead?

llvm/lib/Analysis/BasicAliasAnalysis.cpp
268

Naming wise, "Extended" no longer seems correct. Can you change this to something like CastedValue? (In a separate NFC is encouraged.)

292

I don't believe this is correct. What if we trunced off non-zero bits and are no zero extending?

Quick scan only. I don't have a lot of context on this code.

Approach wise, it would seem lower risk to model explicit truncates before adding the implicit trunc on gep operands. (i.e. easier to write test cases, less likelihood of wide spread miscompiles if we got it slightly wrong, less interaction w/e.g. "pointer size" piece) Can I ask why you approached it this was instead?

We already model explicit truncs correctly, simply by dint of treating them as opaque values. I'm interested in implicit truncs here because they're currently being miscompiled.

llvm/lib/Analysis/BasicAliasAnalysis.cpp
292

This method replaces V with zext(NewV). So we now have something like trunc(zext(NewV)). If we extend less than we truncate, then we'll be left with just a trunc(NewV) (this case). If we truncate less than we extend, we're left with just zext(NewV), which gets folded into the outer zext/sext (the existing case below).

nikic updated this revision to Diff 380088.Oct 15 2021, 1:02 PM
nikic edited the summary of this revision. (Show Details)

Rebase over rename

reames accepted this revision.Oct 22 2021, 10:36 AM

LGTM

I'm a bit hesitant to LGTM this as I'm not super familiar with the code, but since no one else seems to be reviewing it and I don't spot any obvious problems, I'm doing so. Hopefully the new code is at least "more correct" than the old code.

This revision is now accepted and ready to land.Oct 22 2021, 10:36 AM
This revision was landed with ongoing or failed builds.Oct 22 2021, 2:47 PM
This revision was automatically updated to reflect the committed changes.