Page MenuHomePhabricator

[llvm][Z3][NFC] Improve mkBitvector performance
ClosedPublic

Authored by steakhal on Apr 19 2020, 5:03 AM.

Details

Summary

We convert APSInts to Z3 Bitvectors in an inefficient way for most cases.
We should not serialize to std::string just to pass an int64 integer.

For the vast majority of cases, we use at most 64-bit width integers (at least
in the Clang Static Analyzer). We should simply call the Z3_mk_unsigned_int64
and Z3_mk_int64 instead of the Z3_mk_numeral as stated in the Z3 docs.
Which says:

It (Z3_mk_unsigned_int64, etc.) is slightly faster than Z3_mk_numeral since
it is not necessary to parse a string.

If the APSInt is wider than 64 bits, we will use the Z3_mk_numeral with a
SmallString instead of a heap-allocated std::string.

Diff Detail

Event Timeline

steakhal created this revision.Apr 19 2020, 5:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 19 2020, 5:03 AM
xazax.hun accepted this revision.Apr 19 2020, 11:49 AM

Mostly looks good to me. I do believe that this is an optimization but it would be nice to have a small benchmark :) In case it is too much work, I do not insist though.

llvm/lib/Support/Z3Solver.cpp
730

I am not sure if the unlikely is justified. The likelihood might depend on the analyzed code and it is possible to write code where this is the frequent case.

This revision is now accepted and ready to land.Apr 19 2020, 11:49 AM
steakhal marked an inline comment as done.Apr 19 2020, 2:23 PM

Mostly looks good to me. I do believe that this is an optimization but it would be nice to have a small benchmark :) In case it is too much work, I do not insist though.

I'm curious too.

Though I'm not sure how to accomplish that. AFAIK you measured such.
How should I measure this?
Should I use eg. the Clang SA and analyze different projects? (which, how many, etc)
Wouldn't have this method introduce too much noise?
Or should I measure this by a microbenchmark?

I'm really in favor of the microbenchmark approach.

llvm/lib/Support/Z3Solver.cpp
730

The analyzer works on code, but in code, we can not have larger integer literals than long long which is (probably) not wider than 64 bits.
I think the rest of the LLVM users are the concern here.
Should I remove then?

Though, we could further improve the performance, since if the literal's value is small enough to fit into 64-bits, the bitwidth of the storage doesn't matter much. Though, I considered this sort of a hack. In this way, the code is more readable IMO.

ddcc added inline comments.Apr 19 2020, 2:59 PM
llvm/lib/Support/Z3Solver.cpp
730

Aren't __int128_t and __uint128_t types support by Clang? I'm not sure how those integer literals would be generated though, unless an intermediate pass merges together two 64-bit literals.

Mostly looks good to me. I do believe that this is an optimization but it would be nice to have a small benchmark :) In case it is too much work, I do not insist though.

I'm curious too.

Though I'm not sure how to accomplish that. AFAIK you measured such.
How should I measure this?
Should I use eg. the Clang SA and analyze different projects? (which, how many, etc)
Wouldn't have this method introduce too much noise?
Or should I measure this by a microbenchmark?

I'm really in favor of the microbenchmark approach.

It is interesting to see if using Z3 (not just the crosscheck) on some projects will show any difference. I would start with this as this might be the easiest to set up. If we do not see any difference on two or three projects we can try doing some microbenchmarks instead. In a microbenchmark, you can just call mkBitvector with random values (a good question is how to choose the distribution of the types).

NoQ added inline comments.Apr 20 2020, 2:43 AM
llvm/lib/Support/Z3Solver.cpp
730

We don't need those to be literals in the code. They can be computed at run-time and the analyzer will simply follow the computations.

steakhal marked an inline comment as done.Apr 20 2020, 4:11 AM
steakhal added inline comments.
llvm/lib/Support/Z3Solver.cpp
730

@ddcc

Aren't int128_t and uint128_t types support by Clang?

I assume yes, but probably somewhat rare use-case. That is the reason why I marked that path unlikely.
But after this discussion I will probably remove that.

@NoQ

We don't need those to be literals in the code. They can be computed at run-time and the analyzer will simply follow the computations.

AFAIK this function (mkBitvector) would be called only for concrete integers. That integer could have been introduced by the analyzer itself, but at some point, it was a concrete integer. As an example, the ProgramState::assumeInBound transforms an assumption using a new concrete integer MIN.
I'm not expecting the analyzer to use wider integers than the modeled code itself uses. And as the majority of the analyzed code simply uses 32 bit or 64-bit integers, all the integers should be at most 64 bit wide during an analysis.

That was the reasoning behind marking that condition LLVM_UNLIKELY.

NoQ added inline comments.Apr 20 2020, 4:44 AM
llvm/lib/Support/Z3Solver.cpp
730

at some point, it was a concrete integer

__uint128_t x = 2;
__uint128_t y = 2;
__uint128_t z = x * y; // 'z' is a nonloc::ConcreteInt '4 U128b'
                       // that is neither a literal nor hardcoded
                       // in the analyzer.
steakhal marked 4 inline comments as done.Apr 20 2020, 9:40 AM

I made my first google benchmark. Yeey.
Yeah, It's probably flawed :|



Since the benchmark library does not support 128 bit wide integers, I used the INT64_MIN-INT64_MAX value range for testing instead.
It should not matter much, though.

To be honest I expected a bit more improvement.

PS: Can someone tell me why does the uint128 test-case have only two instances?
That case should not differ from the rest of the cases... I'm confused.

llvm/lib/Support/Z3Solver.cpp
730

You are right.

This revision was automatically updated to reflect the committed changes.