Introduce support for ctlz intrinsic, including exhaustive testing. Among other things, LVI may now be able to propagate information about constant ranges lattice values. This change makes sure to provide a precise result where possible, or a more conservative one (e.g., should the first argument value be zero and is_zero_poison true), but always correct.
Details
Diff Detail
Event Timeline
llvm/test/Analysis/LazyValueAnalysis/lvi-for-ctlz.ll | ||
---|---|---|
32 | Test looks pretty small for all the different paths in the patch. You don't test the i1 false case for example |
llvm/test/Analysis/LazyValueAnalysis/lvi-for-ctlz.ll | ||
---|---|---|
32 | For the second argument i1 false the result should always be more precise than the current one, but I'll make sure to test it soon as well. |
This needs to be implemented inside ConstantRange and tested exhaustively inside ConstantRangeTest. See the handling of Intrinsic::abs as a possible sample to follow (it also has a semantics-controlling argument).
Moved the implementation in ConstantRange.cpp, and tested exhaustively. For the sake of completeness and rigor, I left the test in LVI, further improved: it shouldn’t harm, and there are similar unit tests in ConstantRangeTest.cpp which are also in test/Analysis.
The unit test currently fails both for correctness and optimality checks:
/var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:137 Value of: rangeContains(CR, APInt(BitWidth, Elem), Inputs) Actual: false (empty-set does not contain 0 for inputs: full-set, ) Expected: true /var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:137 Value of: rangeContains(CR, APInt(BitWidth, Elem), Inputs) Actual: false (empty-set does not contain -1 for inputs: full-set, ) Expected: true /var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:145 Value of: CR.isEmptySet() Actual: false Expected: true /var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:177 Value of: NotPreferred(PossibleCR) Actual: false (Inputs = full-set, CR = full-set, BetterCR = [0,-1)) Expected: true /var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:145 Value of: CR.isEmptySet() Actual: false Expected: true /var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:177 Value of: NotPreferred(PossibleCR) Actual: false (Inputs = [0,2), CR = full-set, BetterCR = [3,4)) Expected: true /var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:177 Value of: NotPreferred(PossibleCR) Actual: false (Inputs = [0,3), CR = full-set, BetterCR = [2,4)) Expected: true /var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:177 Value of: NotPreferred(PossibleCR) Actual: false (Inputs = [0,4), CR = full-set, BetterCR = [2,4)) Expected: true /var/lib/buildkite-agent/builds/llvm-project/llvm/unittests/IR/ConstantRangeTest.cpp:177 Value of: NotPreferred(PossibleCR) Actual: false (Inputs = [0,5), CR = full-set, BetterCR = [1,4)) Expected: true [...]
You can run the unit tests using ninja -Cbuild check-llvm-unit. Or ninja -Cbuild IRTests && build/unittests/IR/IRTests to run just this test suite.
llvm/include/llvm/ADT/APInt.h | ||
---|---|---|
1741 ↗ | (On Diff #492170) | I'd rather not have this API on APInt. Getting leading zeros as an APInt is pretty unusual. |
llvm/lib/IR/ConstantRange.cpp | ||
1684 ↗ | (On Diff #492170) | Constant ranges are understood modulo poison. The value is in the range or it is poison. As such, if ZeroIsPoison is set, we should return a better range that doesn't have to account for zero input, not a worse range. |
1700 ↗ | (On Diff #492170) | Can Lower be larger than Upper for a non-wrapped set? |
llvm/test/Analysis/LazyValueAnalysis/lvi-for-ctlz.ll | ||
2 | Having an IR test is fine, but please do not test LVI debug output. Just check the resulting IR change using update_test_checks.py. |
@nikic, thank you, I'm having a look at these now. Just a few considerations.
- I'm not entirely sure about the unit test failures: should the returned range simply not include zero, if the incoming interval may contain zero and ZeroIsPoison is true, as you noted? I reckoned that being more conservative (unknown/overdefined) could be better here. Same applies for isEmptySet, we may want to return overdefined, I guess (I'm looking at Value of: CR.isEmptySet(), Actual: false, Expected: true)?.
- If we don't want to have ctlz API on APInt, could it be fine to overload TestUnaryOpExhaustive so that it accepts a callable reference to std::optional<unsigned>(const APInt &) too?
- range.ll now fails as the metadata's range is no longer being used now. I'm gradually realizing that this needs to be handled properly in CVP as well. Could it be fine to revert the ret i1 true to this change, considering that the range is now [0,16] (verification here) for this time, and postpone the handling in CVP for a separate dedicated patch?
It's not necessary to make any conservative choices here. If ZeroIsPoison is set, you can assume that the input range does not contain zero, e.g. by intersecting it out.
- If we don't want to have ctlz API on APInt, could it be fine to overload TestUnaryOpExhaustive so that it accepts a callable reference to std::optional<unsigned>(const APInt &) too?
I think we should just create the APInt in the two places where it's needed.
- range.ll now fails as the metadata's range is no longer being used now. I'm gradually realizing that this needs to be handled properly in CVP as well. Could it be fine to revert the ret i1 true to this change, considering that the range is now [0,16] (verification here) for this time, and postpone the handling in CVP for a separate dedicated patch?
This issue should already be fixed by https://github.com/llvm/llvm-project/commit/2e9bc1b8614c9422573cf2f4728525787b0cb0cb.
llvm/lib/IR/ConstantRange.cpp | ||
---|---|---|
1700 ↗ | (On Diff #492170) | It seems it is possible with negative numbers as well ([0,-7)), non-wrapped set. |
This looks on the right track now.
llvm/lib/IR/ConstantRange.cpp | ||
---|---|---|
1737 ↗ | (On Diff #493846) | This looks basically fine, but a bit overly complicated. This should be sufficient: ConstantRange ConstantRange::ctlz(bool ZeroIsPoison) const { if (isEmptySet()) return getEmpty(); APInt Zero = APInt::getZero(getBitWidth()); if (ZeroIsPoison && contains(Zero)) { // ZeroIsPoison is set, and zero is contained. We discern three cases, in // which a zero can appear: // 1) Lower is zero, handling cases of kind [0, 1), [0, 2), etc. // 2) Upper is zero, wrapped set, handling cases of kind [3, 0], etc. // 3) Zero contained in a wrapped set, e.g., [3, 2), [3, 1), etc. if (getLower().isZero()) { if ((getUpper() - 1).isZero()) { // We have in input interval of kind [0, 1). In this case we cannot // really help but return empty-set. return getEmpty(); } // Compute the resulting range by excluding zero from Lower. return ConstantRange( APInt(getBitWidth(), (getUpper() - 1).countLeadingZeros()), APInt(getBitWidth(), (getLower() + 1).countLeadingZeros() + 1)); } else if ((getUpper() - 1).isZero()) { // Compute the resulting range by excluding zero from Upper. return ConstantRange( Zero, APInt(getBitWidth(), getLower().countLeadingZeros() + 1)); } else { return ConstantRange(Zero, APInt(getBitWidth(), getBitWidth())); } } // Zero is either safe or not in the range. The output range is composed by // the result of countLeadingZero of the two extremes. return getNonEmpty( APInt(getBitWidth(), getUnsignedMax().countLeadingZeros()), APInt(getBitWidth(), getUnsignedMin().countLeadingZeros() + 1)); } |
llvm/test/Analysis/LazyValueAnalysis/lvi-for-ctlz.ll | ||
2 | I don't think that as written, these tests really test anything. There needs to be a comparison involving the ctlz that can be folded away, or similar. |
llvm/test/Analysis/LazyValueAnalysis/lvi-for-ctlz.ll | ||
---|---|---|
2 | I feel like a suitable test could be the following one: int lol(int b) { if (b < 65536) { int n = __builtin_clz(b); if (n < 8) return 0; else return 1; } return 2; } which now could be simplified in a return (b < 65536) ? 1 : 2;. However, as far as I understand this, CVP needs to be extended as well in order to be able to obtain the above constant folding. Is that correct? |
llvm/test/Analysis/LazyValueAnalysis/lvi-for-ctlz.ll | ||
---|---|---|
2 | @nikic, perhaps a better example could be the following one? That would still require a change in CVP if I'm not mistaken though, correct? int test_ctlz(int b) { if (b < 65536) { int n = __builtin_clz(b); if (n < 8 && n > 2) return 0; else return 1; } return 2; } |
llvm/test/Analysis/LazyValueAnalysis/lvi-for-ctlz.ll | ||
---|---|---|
2 | Wouldn't something like this work as a test? int test_ctlz(unsigned b) { if (b < 65536) { unsigned n = __builtin_clz(b); return n >= 16; // Fold to 1 } return 2; } |
Can you please rebase over https://github.com/llvm/llvm-project/commit/14fcdd7f9d7b3973661efc5a426da18e077155bf? I think that should work as a CVP test for this functionality, if I got the numbers right.
@nikic, rebased and removed the old test. Might be wrong, but it seems like O2 is already handling the test case you suggested; whereas the latter one I previously suggested (the one with if (n < 8 && n > 2) return 0;, which can never happen) does not seem to be reduced to a icmp, select, ret as of now.
Can you please rerun update_test_checks.py on the CorrelatedPropagation/range.ll test? There should be some diffs there.
Right. It's not really important that individual pass tests aren't folded by -O2. One construct more complex cases that LVI can handle that InstCombine can't. It's just more straightforward to test the basic case.
llvm/test/Transforms/CorrelatedValuePropagation/range.ll | ||
---|---|---|
954 ↗ | (On Diff #497674) | Did you use the right build to produce this? This is not the expected diff, and it still fails in pre-merge checks. |
llvm/test/Transforms/CorrelatedValuePropagation/range.ll | ||
---|---|---|
954 ↗ | (On Diff #497674) | Sorry, my bad, confused builds and must have accidentally passed the wrong --opt-bin path. Checked twice, we should be good now. |
Having an IR test is fine, but please do not test LVI debug output. Just check the resulting IR change using update_test_checks.py.