Page MenuHomePhabricator

fix major imprecision in LazyValueInfo

Authored by regehr on Apr 25 2016, 9:00 AM.



LVI was neglecting to call the transfer functions for instructions where either operand was the full constant range. This dropped a bunch of precision on the floor for operations such as "and x, 1" or "zext x" or "lshr x, 3" where we learn something from the operation even when x is the full range. This patch fixes the precision bug by forcing LVI to unconditionally run the transfer functions.

This patch also fixes a minor precision bug where LVI was only considering binary instructions of the form "op x, C" where C is a constant. This loses precision by neglecting the "op x, y" case where y is not a constant but we still know something about it.

We can measure the precision of LVI by looking at how much entropy it eliminates from a program. If LVI takes a 32-bit variable and constrains its range to [0..256) we can say that we gain 24 bits of information. By totaling up the number of bits we gain across an entire compilation, we can get an idea about how well LVI is working.

Before this patch, here's the information gained while compiling SPEC CINT 2006:

LVI bits 380433 (2.2%)
Known bits 2226534 (12.9%)
Total bits 17237799

After this patch:

LVI bits 2784947 (16.1%)
Known bits 2232534 (12.9%)
Total bits 17253797

So now LVI has become a stronger analysis than the known bits analysis, which is generally a pretty decent analysis, so this is good. There's plenty more gain to be squeezed out of LVI but this was the lowest hanging fruit.

LLVM with this patch passes tests and also compiles a working SPEC CINT 2006. More testing would be good.

Bug was reported here:

Diff Detail


Event Timeline

regehr updated this revision to Diff 54861.Apr 25 2016, 9:00 AM
regehr retitled this revision from to fix major imprecision in LazyValueInfo.
regehr updated this object.
regehr added a reviewer: reames.
regehr set the repository for this revision to rL LLVM.
reames edited edge metadata.Apr 25 2016, 11:40 AM

The data and motivation for this patch is obvious, but code wise, it's a ways off. Let's start with the basics:

  1. Always running the transfer rules is straight-forward and unlikely to be expensive compile time wise. Forking the analysis at binary operator is not as clearly safe. I would *strongly* prefer to see this patch split so that we can get the first part in and evaluate the second part separately.

This should be rebased on the refactoring change I just made in revision 267438. This will make the handling for unary vs binary far more obvious. I'd suggest writing the unary case by itself with test cases, and post that for review.

For the unary case, we can simply promote the input to a full range if we can't find a constant range input. For the binary case, we can do the same (provided the RHS is a constant). Once we introduce non-constant RHS, the cost model becomes a bit tricker.

Hi Philip, since this is a relatively straightforward change (or should be) and you're knee deep in the code anyway, I would certainly not be offended if you pushed whatever parts of this into the code base yourself in whatever order makes the most sense.

See [LVI] Infer local facts from unary expressions, That's the first part.

regehr abandoned this revision.Apr 28 2016, 8:35 AM

Mostly subsumed by changes implemented by Philip; I'll introduce the remaining changes (taking advantage of LVI information on both sides of a binop) in a different revision.