Page MenuHomePhabricator

[LVI] Normalize pointer behavior
ClosedPublic

Authored by nikic on Wed, Nov 6, 10:59 AM.

Details

Summary

Related to D69686. As noted there, LVI currently behaves differently for integer and pointer values: For integers, the block value is always valid inside the basic block, while for pointers it is only valid at the end of the basic block. I believe the integer behavior is the correct one, and CVP relies on it via its getConstantRange() uses.

The reason for the special pointer behavior is that LVI checks whether a pointer is dereferenced in a given basic block and marks it as non-null in that case. Of course, this information is valid only after the dereferencing instruction, or in conservative approximation, at the end of the block.

This patch changes the treatment of dereferencability: Instead of including it inside the block value, we instead treat it as something similar to an assume (it essentially is a non-nullness assume) and incorporate this information in intersectAssumeOrGuardBlockValueConstantRange() if the context instruction is the terminator of the basic block. This happens either when determining an edge-value internally in LVI, or when a terminator was explicitly passed to getValueAt(). The latter case makes this change not fully NFC, because we can now fold terminator icmps based on the dereferencability information in the same block. This is the reason why I changed one JumpThreading test (it would optimize the condition away without the change).

Of course, we do not want to recompute dereferencability on each intersectAssume call, so we need a new cache for this. The dereferencability analysis requires walking the entire basic block and computing underlying objects of all memory operands. This was previously done separately for each queried pointer value. In the new implementation (both because this makes the caching simpler, and because it is faster), I instead only walk the full BB once and cache all the dereferenced pointers. So the traversal is now performed only once per BB, instead of once per queried pointer value.

I think the overall model now makes more sense than before, and there will be no more pitfalls due to differing integer/pointer behavior.

Diff Detail

Event Timeline

nikic created this revision.Wed, Nov 6, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptWed, Nov 6, 10:59 AM
reames requested changes to this revision.Wed, Nov 6, 6:50 PM

This is definitely heading in the right direction. Nicely done overall, see suggestions as inline comments.

llvm/lib/Analysis/LazyValueInfo.cpp
267–269

Minor: I think this can become a free function as opposed to a member function. It doesn't appear to be using member state.

692

Please add a todo about using address space property (i.e. null is defined) instead of the raw AS==0 check here.

850

Ok, I don't think this placement quite works. The challenge is that your changing the point of the block scan from the furthest reached block during the backwards walk (entry, or point of overdefine) to instead scan the query source block.

I'd suggest adjusting the two call sites you remove to use the new caching logic, and then only do the scan if a) context is terminator, or b) context is not in same block (i.e. full block case)

This revision now requires changes to proceed.Wed, Nov 6, 6:50 PM
nikic marked an inline comment as done.Thu, Nov 7, 12:27 AM
nikic added inline comments.
llvm/lib/Analysis/LazyValueInfo.cpp
850

The challenge is that your changing the point of the block scan from the furthest reached block during the backwards walk (entry, or point of overdefine) to instead scan the query source block.

I believe the behavior of the scan essentially stays the same: intersectAssume() is not only used to calculate the final result of the query, it is also invoked internally by LVI when computing edge values. So if you query a pointer in BB2 with incoming edge from BB1, then the first thing we do is calculate the edge value BB1 -> BB2, at which point we will intersectAssume() with the terminator of BB1.

I'd suggest adjusting the two call sites you remove to use the new caching logic, and then only do the scan if a) context is terminator, or b) context is not in same block (i.e. full block case)

We'd have to insert checks into getEdgeValue() (always) and getValueAt() / getValueInBlock() (if context is terminator), which are exactly the places where we already use intersectAssume().

nikic updated this revision to Diff 228253.Thu, Nov 7, 8:24 AM
nikic marked 2 inline comments as done.

Make eraseValueFromPerBlockValueCache a free function, add todo.

reames accepted this revision.Fri, Nov 8, 8:17 AM

LGTM. Very nice cleanup, thank you for doing this.

llvm/lib/Analysis/LazyValueInfo.cpp
850

You are correct. I'd misremembered the code and hadn't checked the callers before commenting. Objection withdrawn.

This revision is now accepted and ready to land.Fri, Nov 8, 8:17 AM
This revision was automatically updated to reflect the committed changes.