This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/include/clang/StaticAnalyzer/Core/PathSensitive/
-
include/
-
clang/
-
StaticAnalyzer/
-
Core/
-
PathSensitive/
1
MemRegion.h

Differential D131707

[analyzer][NFC] Cache the result of getLocationType in TypedValueRegion
Needs ReviewPublic

Authored by ASDenysPetrov on Aug 11 2022, 11:41 AM.

Download Raw Diff

Details

Reviewers

steakhal
martong
NoQ
xazax.hun
isuckatcs

Summary

Fix FIXME. Produce QualType once in getLocationType at the first call. Then cache the result and return it for the next calls.

Diff Detail

Event Timeline

ASDenysPetrov created this revision.Aug 11 2022, 11:41 AM

Herald added a project: Restricted Project. · View Herald TranscriptAug 11 2022, 11:41 AM

Herald added subscribers: manas, dkrupp, donat.nagy and 6 others. · View Herald Transcript

ASDenysPetrov requested review of this revision.Aug 11 2022, 11:41 AM

Herald added a subscriber: cfe-commits. · View Herald TranscriptAug 11 2022, 11:41 AM

Harbormaster completed remote builds in B180739: Diff 451935.Aug 11 2022, 2:36 PM

ASDenysPetrov retitled this revision from [analyzer]{NFC} Cache the result of getLocationType in TypedValueRegion to [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion.Aug 11 2022, 11:40 PM

Please measure the performance implications/gains of this patch to justify this change.
Preferably test it on multiple kinds of projects ranging from old-school C and even some modern C++ projects.
Let me know your findings and if you need some help setting it up.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
549–550	I would appreciate it if you assert that the resulting type cannot be `null`, which would imply that if `CachedLocationType.isNull()` means that we haven't cached it yet.

@steakhal
Thank you for the review.
Could you please do testing for me, since I'm on Windows and have no prepared testing environment as I know you do.
I'll add an assertion.

Add assertion accroding to the remark.

ASDenysPetrov added a comment.Aug 15 2022, 8:45 AM

This comment was removed by ASDenysPetrov.

In D131707#3723403, @ASDenysPetrov wrote:

@steakhal
Thank you for the review.
Could you please do testing for me, since I'm on Windows and have no prepared testing environment as I know you do.

ATM it seems like I don't have the bandwidth. What I meant is choosing the right projects to test etc.
Maybe @balazske could give some helping hands. AFAIK it shouldn't be hard for him to enqueue a benchmark.
What do you say @balazske, could you please have a differential measurement in release configuration on the test set demonstrating the performance gain of this patch compared to the baseline?

Harbormaster completed remote builds in B181296: Diff 452689.Aug 15 2022, 9:42 AM

Hmm at a glance I disagree with the FIXME, this doesn't look like a valuable optimization. This operation is typically constant-time anyway so it's probably not worth the memory impact and the added complexity.

The easiest way to roughly estimate impact is to take sqlite3.c (which is a famous large-ish project that gets compiled as a single huge file to ensure that optimizations have access to the entire program) and measure time and memory with UNIX time. Not very representative but very easy to execute. If it shows promising results I'm open to further investigation in this area.

I also recommend consulting a performance profiler before implementing optimizations, to avoid premature optimizations. IIUC the picture hasn't changed much in the recent years:

From low-level perspective the majority of time is spent accessing and manipulating ImmutableMaps and other immutable data structures in the program state;
From high-level perspective almost half the time is spent in mark-and-sweep garbage collection in ExprEngine::removeDead(). Which can't really be cut as running removeDead() less often only increases the analysis time as it makes the maps larger and therefore slower to access. And it's also hard to optimize because the procedure is very complicated and fragile and very few people actually understand how it works.

So I think the most valuable optimizations are low-level optimizations to ImmutableMap. There were a few suggestions on the mailing list to use something more modern than the AVL trees under the hood but I don't think authors found much success with those.

In D131707#3724747, @NoQ wrote:

So I think the most valuable optimizations are low-level optimizations to ImmutableMap. There were a few suggestions on the mailing list to use something more modern than the AVL trees under the hood but I don't think authors found much success with those.

I am not sure how optimized our ImmutableMap is, I would not be surprised if there were some small gains here and there. I don't remember the entire discussion, but I think some of the mentioned alternative data structures are:

Patricia Trees, see C. Okasaki, A. Gill. Fast Mergeable Integer Maps. In Workshop on ML (1998)
Finger Trees, see https://en.wikipedia.org/wiki/Finger_tree
Abandon the concept of trees and implement an immutable HashMap

Just wanted to dump all of this, in case there are some takers :)

So I think the most valuable optimizations are low-level optimizations to ImmutableMap. There were a few suggestions on the mailing list to use something more modern than the AVL trees under the hood but I don't think authors found much success with those.

Back then I was experimenting by replacing some of our ImmutableMap instances with an alternative implementation. I've started with Environment::ExprBindings because profiling showed that as the most frequently used. I've chosen immer::map as an alternative immutable map implementation. It seemed to be very promising because of

the applied Relaxed Radix Balanced Tree implementation
efficient batch manipulations .

Results: I could make the LIT tests pass, except two checks. The performance was not promising though. The run-time was similar that we had with the AVL trees, however, the memory consumption was 5-10x higher! Consequently, I've abandoned the prototype. But, I think an efficient batch manipulation implementation with the existing AVL based ImmutableMap might be worth to experiment with further.

From high-level perspective almost half the time is spent in mark-and-sweep garbage collection in ExprEngine::removeDead(). Which can't really be cut as running removeDead() less often only increases the analysis time as it makes the maps larger and therefore slower to access. And it's also hard to optimize because the procedure is very complicated and fragile and very few people actually understand how it works.

Yes, this is also aligned with my measurements. There are analyzer options to manipulate when the removeDead is called, before every Stmt, or before every basic block, or never. Actually, the last option is meaningless, because renders both the engine and many checkers faulty. The before every basic block option might still be worth to experiment with, but I could not find reasonable arguments to change to that yet (I cannot recall, but some lit tests were failing miserably with that option too).

Revision Contents

Path

Size

clang/

include/

clang/

StaticAnalyzer/

Core/

PathSensitive/

MemRegion.h

17 lines

Diff 452689

clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h

Show First 20 Lines • Show All 523 Lines • ▼ Show 20 Lines	public:
static bool classof(const MemRegion* R) {		static bool classof(const MemRegion* R) {
unsigned k = R->getKind();		unsigned k = R->getKind();
return k >= BEGIN_TYPED_REGIONS && k <= END_TYPED_REGIONS;		return k >= BEGIN_TYPED_REGIONS && k <= END_TYPED_REGIONS;
}		}
};		};

/// TypedValueRegion - An abstract class representing regions having a typed value.		/// TypedValueRegion - An abstract class representing regions having a typed value.
class TypedValueRegion : public TypedRegion {		class TypedValueRegion : public TypedRegion {
		mutable QualType CachedLocationType;

void anchor() override;		void anchor() override;

protected:		protected:
TypedValueRegion(const MemRegion* sReg, Kind k) : TypedRegion(sReg, k) {		TypedValueRegion(const MemRegion* sReg, Kind k) : TypedRegion(sReg, k) {
assert(classof(this));		assert(classof(this));
}		}

public:		public:
virtual QualType getValueType() const = 0;		virtual QualType getValueType() const = 0;

QualType getLocationType() const override {		QualType getLocationType() const override {
// FIXME: We can possibly optimize this later to cache this value.		if (CachedLocationType.isNull()) {
QualType T = getValueType();		QualType T = getValueType();
ASTContext &ctx = getContext();		ASTContext &C = getContext();
if (T->getAs<ObjCObjectType>())		CachedLocationType = T->isObjCObjectType() ? C.getObjCObjectPointerType(T)
return ctx.getObjCObjectPointerType(T);		: C.getPointerType(T);
return ctx.getPointerType(getValueType());		assert(!CachedLocationType.isNull() &&
		steakhalUnsubmitted Not Done Reply Inline Actions I would appreciate it if you assert that the resulting type cannot be `null`, which would imply that if `CachedLocationType.isNull()` means that we haven't cached it yet. steakhal: I would appreciate it if you assert that the resulting type cannot be `null`, which would imply…
		"Cached value supposed to be non-null.");
		}
		return CachedLocationType;
}		}

QualType getDesugaredValueType(ASTContext &Context) const {		QualType getDesugaredValueType(ASTContext &Context) const {
QualType T = getValueType();		QualType T = getValueType();
return T.getTypePtrOrNull() ? T.getDesugaredType(Context) : T;		return T.getTypePtrOrNull() ? T.getDesugaredType(Context) : T;
}		}

static bool classof(const MemRegion* R) {		static bool classof(const MemRegion* R) {
▲ Show 20 Lines • Show All 1,051 Lines • Show Last 20 Lines