This is an archive of the discontinued LLVM Phabricator instance.

[RS4GC] Cache BDVs and bases alogn with IsKnownBase flag (NFC)
ClosedPublic

Authored by dmakogon on May 5 2022, 3:49 AM.

Details

Summary

This refactors RS4GC to cache results returned by findBaseDefiningValue. That function performs a recursive search for a base or at least a BDV for a given value.
Imagine we have the following IR:

%b = call i8* @foo()
%d1 = getelementptr %b, 1
%d2 = getelementptr %d1, 1

and imagine we call findBaseDefiningValue on %d2. The function would make a recursive call on gep operand which is %d1 and then another one on %b, then it'll return %b as base value for %d2. If later we call it on %d1, we would do the same recursive call on %b to again infer that it is a base value.
With this change findBaseDefiningValue caches returned result on each recursive call, so in the example above when being called on %d1 we'd already know that it's base is %b from the cache and wouldn't make a recursive call.

This patch also gets rid of BaseDefiningValueResult by caching the IsKnownBase flags for BDVs and bases.

Diff Detail

Event Timeline

dmakogon created this revision.May 5 2022, 3:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2022, 3:49 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
dmakogon requested review of this revision.May 5 2022, 3:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2022, 3:49 AM

General questions: there are multiple places where we call eraseFromParent. Is there a guarantee that after this the cache is valid? Any way to verify this?

llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
291

Remove?

dmakogon updated this revision to Diff 428590.May 11 2022, 2:18 AM

Removed comment with debug prints

General questions: there are multiple places where we call eraseFromParent. Is there a guarantee that after this the cache is valid? Any way to verify this?

Actually we never erase any instructions that may happen to be in the cache. In some places where eraseFromParent is called we do an explicit assert that the value to be erased is not in the cache.
In other places the values being erased are definitely not present in the cache: there's a stage when some dummy calls are inserted and then erased shortly after. And last place where the values are erased is when we replace the existing statepoints with new ones with new live variables.

It looks like all you want BaseDefiningValueResult for is simply keeping one boolean for each Value you know the answer for. To me it looks like you simply need a map Value -> bool, and isKnownBaseResult should just return result from this map. After that, we don't need the structure BaseDefiningValueResult at all.

What I don't like in current approach is that we are not guarded against situation when we have two BaseDefiningValueResult instances with same value but different booleans, and it's not what we want in reality. What we want is map.

Am I missing something here?

llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
128

nit: ///, \p V.

250

if (MustBeBase) assert(IsKnownBase)? I think it's more understandable.

289

Why not just return LHS == RHS?

mkazantsev added inline comments.May 11 2022, 3:16 AM
llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
260–261

I guess this comment above would need an update.

dmakogon updated this revision to Diff 428626.May 11 2022, 4:45 AM
dmakogon retitled this revision from [RS4GC] Cache BaseDefiningValueResult instead of BDV (NFC) to [RS4GC] Cache IsKnownBase for bases and BDVs (NFC).

Removed BaseDefiningValueResult struct. Instead we now have a cache which stores whether a base or BDV is a known base.

mkazantsev requested changes to this revision.May 11 2022, 4:55 AM
mkazantsev added inline comments.
llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
476–481

Should it be findBaseDefiningValueCached? Otherwise you don't benefit from caching in recursion.

543–546

nit: tab

This revision now requires changes to proceed.May 11 2022, 4:55 AM
mkazantsev accepted this revision.May 11 2022, 5:02 AM

LGTM with nit.

This revision is now accepted and ready to land.May 11 2022, 5:02 AM

Pls assert that you don't overwrite values in your cache.

mkazantsev added inline comments.May 11 2022, 5:09 AM
llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
741

const IsKnownBase ... ?

742

Pls add assert with descriptive message what happens if the value is missing

mkazantsev requested changes to this revision.May 11 2022, 5:31 AM

Withdrawing LGTM until asserts are added. I also think you should shortcut through cached method, unless there are reasons not to.

This revision now requires changes to proceed.May 11 2022, 5:31 AM
dmakogon updated this revision to Diff 428893.May 12 2022, 3:06 AM
dmakogon retitled this revision from [RS4GC] Cache IsKnownBase for bases and BDVs (NFC) to [RS4GC] Cache BDVs and bases alogn with IsKnownBase flag (NFC).
dmakogon edited the summary of this revision. (Show Details)

Implemented caching in findBaseDefiningValue. Added a weak assert in setKnownBase - it checks if the key is already present in the map and if so fails the assert if trying to push a different value from the cached one.

dmakogon updated this revision to Diff 428899.May 12 2022, 3:09 AM

const qualifier in isKnownBase

dmakogon marked 6 inline comments as done.May 12 2022, 3:10 AM
mkazantsev accepted this revision.May 12 2022, 3:21 AM

That looks much safer now! Thanks. LGTM.

llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
442–446

Constants are so trivial that we might not cache them at all. But it's up to you.

This revision is now accepted and ready to land.May 12 2022, 3:21 AM
This revision was landed with ongoing or failed builds.May 13 2022, 12:21 AM
This revision was automatically updated to reflect the committed changes.