This is an archive of the discontinued LLVM Phabricator instance.

[RS4GC] Remove the hardcoded GC strategy names (v2)
ClosedPublic

Authored by znix on Jan 5 2023, 11:07 PM.

Details

Summary

Previously, RewriteStatepointsForGC had a hardcoded list of GC
strategies for which it would run, and using it with a custom strategy
required patching LLVM.

The locic for selecting the variables that are considered managed was
also hardcoded to use pointers in address space 1, rather than
delegating to GCStrategy::isGCManagedPointer.

This patch fixes both of these flaws: this pass now applies to all
functions whose GCStrategy returns true for useStatepoints, and checking
if a pointer is managed or not is also now done by the strategy.

One potentially questionable design decision in this change: the pass will
be enabled for all GC strategies that use statepoints. It seems unlikely
this would be a problem - consumers that don't use this pass probably
aren't adding it to the pass manager anyway - but if you had two different
GC strategies and only one wants this pass enabled then that'd need a new
flag in GCStrategy, which I can add if anyone thinks it's necessary.

This is an updated version of D140458, rebased to account for LLVM's
changes since D140504 (required by this patch) landed.

Please note I don't have commit access, so if approved I'd need someone else to land it.

Diff Detail

Event Timeline

znix created this revision.Jan 5 2023, 11:07 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2023, 11:07 PM
znix requested review of this revision.Jan 5 2023, 11:07 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2023, 11:07 PM
znix edited the summary of this revision. (Show Details)Jan 5 2023, 11:11 PM
zero9178 added inline comments.Jan 9 2023, 7:04 AM
llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
116

Use of std::map is generally discouraged unless you require any of its properties.
As far as I can tell you're just doing pure insert and lookups on it, so you'll probably want to use StringMap from LLVM.

3046

You are doing 3 lookups in the map here, which can be reduced to just one.
The trick is to use insert({F.getGC(), nullptr}), which will return an iterator and bool as pairs. The iterator will either point to an existing entry if it was NOT inserted, or to the now inserted entry. The bool is true if inserted.
That way you just need to check if it was inserted, if it was, assign to second of the entry the result of getGCStrategy.
At the end also just return second of the iterator pointing to the entry.

BTW, there is no need to create new revision to reflect trunk changes. You could simply rebase your local branch and then update diffs for the existing review. This works well and is common practice.

As for review itself.
First, I don't think that GCStrategy cache really belongs here. If I want to use GCStrategy say, in Verifier or in other Transform passes, should I add yet another GCStrategy cache everywhere? I don't think so.
The question is where to implement it. There was an attempt to move it to LLVMContext (see D6811), but it was reverted later (commit 56a03938f7a536dc2cef3c42ed67b3f218a1c758).
As hinted in mentioned commit we can create an IR analysis pass for that (which would duplicate CodeGen's GCModuleInfo to some extent, but I think we can live with that).
There even was arguing if GCStrategy are singletons or not (see discussions in D100559). @reames thinks they are and I'm leaning to agree with him. But if they don't then bare llvm::getGCStrategy() would suffice.

Second, a bunch of static functions with long parameter lists start looking ridiculously. Local class which keeps per-function state would look cleaner IMHO, but that can be done separately from this change.

llvm/docs/Statepoints.rst
581–582

This is not true anymore ('The pass assumes that all addrspace(1) pointers are non-integral pointer types')? Perhaps simply delete this sentence?

585

This seems to repeat what was said in first sentence of paragraph.

But English is not my native language, so I'd leave review of this to native speakers.

Added couple folks who should be interested in this stuff :-)

znix added a comment.Jan 12 2023, 2:26 PM

I'll work on making a GCStrategyAnalysisPass then. Given RS4GC supports both the new and legacy pass manager, I assume this new pass needs to support both, too?

With regards to moving all the static RS4GC functions to a class, should I make that a separate diff, a separate commit within the diff, or something similar? I ask because there will be a lot of mostly-whitespace changes as all the function arguments are indented, which would make the patch quite long.

Apologies for this very basic question, but how do I update a patch using Arcanist? I had a look on the docs page about Phabricator reviews, and it didn't mention how to do them.

Thanks,
Campbell

I'll work on making a GCStrategyAnalysisPass then. Given RS4GC supports both the new and legacy pass manager, I assume this new pass needs to support both, too?

I think yes. Even though Legacy PM is deprecated, supporting it is just several extra lines.

With regards to moving all the static RS4GC functions to a class, should I make that a separate diff, a separate commit within the diff, or something similar? I ask because there will be a lot of mostly-whitespace changes as all the function arguments are indented, which would make the patch quite long.

This usually done as a separate NFC (non-functional change) patch.
Note that such change is not a blocker for this review - we can do it later.

Apologies for this very basic question, but how do I update a patch using Arcanist? I had a look on the docs page about Phabricator reviews, and it didn't mention how to do them.

Simply do arc diff again. As long as your commit message has Differential Revision line, arcanist is smart enough to update existing review instead of creating new one.

Please don't couple all uses of statepoints to RS4GC. Adding another function to GCStrategy should be straight forward, please do it.

llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
3042

I'm confused by your implementation here. getGCStrategy uses the register of GCStrategy objects internally. If calling that works, why do you need another caching layer here at all?

As a bit of history, the GC registery was initially in CodeGen and not available in the rest of LLVM due to some weird linkage problems. I think - though I don't quite remember details, so check - that got fixed years ago.

dantrushin added inline comments.Jan 13 2023, 12:20 PM
llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
3042

@reames: llvm::getGCStrategy() calls GCStrategy::instantiate() for every query (creating new instance).
CodeGen's GCModuleInfo::getGCStrategy() implements GC strategy cache on top of that. I guess, in attempt to have them as singletons (?).
Do you think we don't need it in IR?
I also found your previous attempt to move ownership of GCStrategies to LLVM Context which was reverted and you mentioned that it
makes sense to have them in Immutable Analysis pass. Was its only purpose solve that linkage problem?
If yes, we can drop this GCStrategy cache completely and call llvm::getGCStrategy()directly.

BTW, that linkage problem was fixed recently by Campbell.

@znix : I suggest you to remove GCStrategy cache from this patch (and use llvm::getGCStrategy() instead where necessary). I think it will unblock review.
We can figure out caching later/separately.

znix updated this revision to Diff 489971.Jan 17 2023, 4:09 PM

Get rid of the strategy cache

znix updated this revision to Diff 489999.Jan 17 2023, 5:31 PM

Rephrase the Statepoint documentation to properly reflect this patch's changes.

znix updated this revision to Diff 490002.Jan 17 2023, 5:48 PM

Make individual GCStrategies op-in to RS4GC support.

Please make a separate NFC review with GCStrategy changes. Besides being common llvm review practice it will allow downstream projects to handle this API change without breakage.
With separate reviews/commit we will have time to update our downstream strategies before RS4GC starts using new API.
If they come in single commit, it will immediately break all existing RS4GC users.

llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
147

I think getting GCStrategy from function should be moved inside this function. See comments below

153

Same here. Move it inside and you don't need change header file

189

Ditto

335–337

You can use optional::value_or here:

return GC->isGCManagerPointer(T).value_or(true);
3055

Returning false and then asserting looks confusing, this will raise questions like what Strategy really is; is it different from F.getGC; etc.
I think it would be much mo readable to get GCStrategy from function right here and use it.

3061

Moving this assert to GCStrategy::useRS4GC() looks more logical to me.

3072

You won't need this if you move getting GCStrategy inside shouldRewriteStatepointsIn()

znix updated this revision to Diff 490694.Jan 19 2023, 6:04 PM
znix marked 13 inline comments as done.
  • Split the GCStrategy::UseRS4GC into a separate diff
  • Reword some changes to Statepoints.rst and mention UseRS4GC in them
  • Don't pass the GCStrategy into shouldRewriteStatepointsIn or runOnFunction
  • Use optional::value_or in isGCPointerType
dantrushin accepted this revision.Jan 20 2023, 6:28 AM

LGTM.
I'll wait for a few days before landing it to let other reviewers speak up.

llvm/docs/Statepoints.rst
558–559

pass, not patch?

This revision is now accepted and ready to land.Jan 20 2023, 6:28 AM
This revision was landed with ongoing or failed builds.Jan 25 2023, 9:48 AM
This revision was automatically updated to reflect the committed changes.