This is an archive of the discontinued LLVM Phabricator instance.

Statepoint infrastructure for garbage collection
Closed, Public

Authored by reames on Oct 8 2014, 2:24 PM.

Details

Reviewers

chandlerc
theraven
nicholas
atrick
sanjoy
ributzka
hfinkel

Commits

rGf61232264edf: [Statepoints 4/4] Statepoint infrastructure for garbage collection…
rL223143: [Statepoints 4/4] Statepoint infrastructure for garbage collection…

Summary

The attached patch implements an approach to supporting garbage collection in LLVM that has been mentioned on the mailing list a number of times by now. There are a couple of issues that need to be addressed before submission, but I wanted to get this up to give maximal time for review.

The statepoint intrinsics are intended to enable precise root tracking through the compiler so as to support garbage collectors of all types. Our testing to date has focused on fully relocating collectors (where pointers can change at any safepoint poll, or call site), but the infrastructure should support collectors of other styles. The addition of the statepoint intrinsics to LLVM should have no impact on the compilation of any program which does not contain them. There are no side tables created, no extra metadata, and no inhibited optimizations.

A statepoint works by transforming a call site (or safepoint poll site) into an explicit relocation operation. It is the frontend's responsibility (or eventually the safepoint insertion pass we've developed, but that's not part of this patch) to ensure that any live pointer to a GC object is correctly added to the statepoint and explicitly relocated. The relocated value is just a normal SSA value (as seen by the optimizer), so merges of relocated and unrelocated values are just normal phis. The explicit relocation operation, the fact that the statepoint is assumed to clobber all memory, and the optimizer's standard semantics ensure that the relocations flow through IR optimizations correctly.

During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straightforward, and closely matches the implementation of the patchpoint intrinsics. We leverage the existing StackMap section format, which is already used by the patchpoint intrinsics, to report where pointer values live. Unlike a patchpoint, these locations are known (by the backend) to be writeable during the call. This enables the garbage collector to transparently read and update pointer values if required. We do optimize lowering in certain well known cases (constant pointers, a.k.a. null, being the key one).
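As a rough illustration of why the spill slots must be writeable, here is a minimal C++ sketch of a collector walking one frame. It is a toy model under stated assumptions, not the StackMap section format and not code from this patch: `StackMapRecord`, `ForwardingTable`, and `updateFrame` are hypothetical names, and a frame is modelled as a plain vector of word-sized slots.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical model (not LLVM's StackMap format): one record per
// statepoint, listing the frame slots that hold spilled GC pointers.
struct StackMapRecord {
  std::vector<std::size_t> PointerSlots;
};

// Maps the old address of each moved object to its new address.
using ForwardingTable = std::unordered_map<std::uintptr_t, std::uintptr_t>;

// Rewrite every recorded slot whose object has moved. Slots that are not
// listed in the record are never touched, even if their bit pattern happens
// to look like a moved pointer. Returns the number of slots updated.
inline std::size_t updateFrame(std::vector<std::uintptr_t> &Frame,
                               const StackMapRecord &Record,
                               const ForwardingTable &Moved) {
  std::size_t Updated = 0;
  for (std::size_t Slot : Record.PointerSlots) {
    auto It = Moved.find(Frame.at(Slot));
    if (It == Moved.end())
      continue; // Object did not move (or the slot holds null).
    Frame[Slot] = It->second;
    ++Updated;
  }
  return Updated;
}
```

The compiled code reloads from the same slots after the call returns, which is how an update made by the collector becomes visible to the rest of the function.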

There are a few areas of this patch which could use improvement:

The test coverage could be improved. Most of the tests we've actually been using are built on top of the safepoint insertion mechanism (not included here) and our runtime. We need to improve the IR level tests for optimizer semantics (i.e. not doing illegal transforms), and lowering. There are some minimal tests in place for the lowering of simple statepoints.
The documentation needs revision, but should be reasonably complete.
Many functions are missing doxygen comments.
There's a hack in place to force the use of RSP+Offset addressing vs RBP-Offset addressing for references in the StackMap section. This works, shouldn't break anyone else, but should definitely be cleaned up. The choice of addressing preference should be up to the runtime.

When reviewing, I would greatly appreciate feedback on which issues need to be fixed before submission and those which can be addressed afterwards. It is my plan to actively maintain and enhance this infrastructure over the next few months (and years). It's already been developed out of tree entirely too long (our fault!), and I'd like to move to incremental work in tree as quickly as feasible.

Planned enhancements after submission:

The ordering of arguments in statepoints is essentially historical cruft at this point. I'm open to suggestions on how to make this more approachable. Reordering arguments would (preferably) be a post commit action.
Support for relocatable pointers in callee saved registers over call sites. This will require the notation of an explicit relocation pseudo op and support for it throughout the backend (particularly the register allocator).
Optimizations for non-relocating collectors. For example, the clobber semantics of the spill slots aren't needed if the collector isn't relocating roots.
Further optimizations to reduce the cost of spilling around each statepoint (when required at all).
Support for invokable statepoints.
Once this has baked in tree for a while, I plan to delete the existing gc_root code. It is unsound, and essentially unused.

In addition to the enhancements to the infrastructure in the currently proposed patch, we're also working on a number of follow up changes:

Verification passes to confirm that safepoints were inserted in a semantically valid way (i.e. no memory access of a value after it has been inserted)
A transformation pass to convert naive IR to include both safepoint polling sites, and statepoints on every non-leaf call. This transformation pass can be used at initial IR creation time to simplify the frontend authors' work, but is also designed to run on *fully optimized* IR, provided the initial IR meets certain (fairly loose) restrictions.
A transformation pass to convert normal loads and stores into user provided load and store barriers.
Further optimizations to reduce the number of safepoints required, and improve the infrastructure as a whole.

We've been working on these topics for a while, but the follow on patches aren't quite as mature as what's being proposed now. Once these pieces stabilize a bit, we plan to upstream them as well. For those who are curious, our work on those topics is available here: https://github.com/AzulSystems/llvm-late-safepoint-placement

Diff Detail

Repository: rL LLVM

Event Timeline

reames updated this revision to Diff 14604. Oct 8 2014, 2:24 PM

reames retitled this revision from to Statepoint infrastructure for garbage collection.

reames updated this object.

reames edited the test plan for this revision.

reames added reviewers: hfinkel, chandlerc, nicholas, sanjoy, atrick, ributzka, theraven.

reames added a subscriber: Unknown Object (MLST).

+Filip Pizlo

Updating statepoint changes to:

include some reasonable documentation. What's here still needs serious work, but at least it's reasonably complete and doesn't contain obvious errors.
remove the hack to use Indirect vs Direct locations.
rebase on TOT

The previous update didn't actually include some of the changes I'd meant it to.

I don't know enough about this topic to comment on the details of the ideas (although they seem reasonable and match what had been discussed on the mailing list), but I have read part of the patch, found a few typos, and have a few inline comments.

Could you also include more context in your patches (-U999999 if you produce your patches by svn/git diff)? It generally makes it easier to review changes on Phabricator.
Thanks for working on this.

include/llvm/CodeGen/MachineInstr.h
91 ↗	(On Diff #14677)	What are these 'BEGIN CHANGE/END CHANGE' comments ?
include/llvm/IR/Intrinsics.td
500 ↗	(On Diff #14677)	Is this necessary ? I don't see in the documentation any case where the statepoint can throw an exception. And if the goal is just to say it can have side-effects on memory, that is the default if there is no annotation according to Intrinsics.td
502 ↗	(On Diff #14677)	These (and gc_relocate) should probably have the attribute IntrNoMem if I understood correctly your documentation.
lib/CodeGen/StackMaps.cpp
285 ↗	(On Diff #14677)	contigious -> contiguous
lib/IR/LLVMContext.cpp
252 ↗	(On Diff #14677)	Unnecessary whitespace change.
lib/IR/Statepoint.cpp
15 ↗	(On Diff #14677)	Could be simplified to "return F && F->getIntrinsicID() == Intrinsic::statepoint"
32 ↗	(On Diff #14677)	Could be simplified to "return CS.isCall() && isGCRelocate(CS.getInstruction())"
47 ↗	(On Diff #14677)	Could be simplified to "return CS.isCall() && isGCResult(CS.getInstruction())"
96 ↗	(On Diff #14677)	This should probably be changed before upstreaming if it depends on your runtime.
lib/IR/Verifier.cpp
1120 ↗	(On Diff #14677)	It is not clear to me why the const_cast is required: I is a const Instruction , and isStatepoint accepts a const Instruction .
2561 ↗	(On Diff #14677)	gc.result -> gc.relocate
lib/Target/X86/X86FrameLowering.h
74 ↗	(On Diff #14677)	Why not just use the override keyword ?
lib/Target/X86/X86ISelLowering.cpp
20621 ↗	(On Diff #14677)	devirge -> diverge
lib/Target/X86/X86MCInstLower.cpp
861 ↗	(On Diff #14677)	likelly -> likely
862 ↗	(On Diff #14677)	Is this invariant something general to LLVM and documented as such, or specific to your application (I am asking because the comment talks about a 'VM')?
870 ↗	(On Diff #14677)	Same question: where does this guarantee come from?
test/Verifier/statepoint-non-gc-ptr.ll
1 ↗	(On Diff #14677)	It is not clear exactly what this tests

respond to review comments, remove a few pieces of code which aren't relevant for basic functionality, and add experimental prefix to the names.

morisset, I think I've addressed all your comments. Let me know if you see anything I missed.

include/llvm/CodeGen/MachineInstr.h
91 ↗	(On Diff #14677)	Removed.
include/llvm/IR/Intrinsics.td
500 ↗	(On Diff #14677)	Removed. This was an artefact of work to support invokable statepoints which aren't part of this patch.
502 ↗	(On Diff #14677)	You're right. This change will need more testing than I can easily give it at the moment. Do you mind if this change lands separately once this is integrated and I've merged it back into our tree?
lib/IR/Verifier.cpp
1120 ↗	(On Diff #14677)	It's not. This is just old code that hadn't been updated. Fixed. Thanks.
lib/Target/X86/X86MCInstLower.cpp
862 ↗	(On Diff #14677)	This is preliminary support for atomically patchable calls. For now, I just documented the 32 bit offset requirement and removed the rest. This will be a separate change proposal down the road.

I think a change like this might be more compelling if you could give more detail on how it would actually help (I can't find the detail I'm looking for in your blog posts). It seems like the value of this patch is that it will work with late safepoint placement, but it'd be nice to see some examples of cases where late safepoint placement gives you something that early safepoint placement (ie by the frontend) doesn't. It kind of feels like either approach will work well with only non-gc values, and neither approach will be able to do much optimization when you do function calls. I'm not trying to claim that that's necessarily true, but it'd be easier to understand your point if there was some example IR.

Kevin,

Let me try to answer the point you're getting at. In doing so, I want
to explicitly separate the statepoint intrinsics which are currently up
for review, and the future late safepoint placement. The statepoint
intrinsics have value separate from the late safepoint placement
approach, and I want to justify them on their own merits.

The basic problem we're trying to solve with these intrinsics is
supporting fully relocating collectors. By definition, such a collector
needs to be precise w.r.t. root tracking. Even worse, we need to ensure
that *all copies* of a pointer are updated. It is not acceptable to
make two copies of a pointer, update one of them, then use the other for
a memory access.

If the compiler is allowed to introduce derived pointers (i.e. pointer
valued temporaries created by the compiler which point somewhere within
an object, or outside it, but associated with it), we also need to track
which *object* each *pointer* to be updated is associated with. This is
required to safely update the pointers.
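
To make the base/derived relationship concrete, here is a minimal C++ sketch of how a collector could recompute a derived pointer once it knows where the base object moved. This is an illustration under assumptions, not code from the patch: `relocateDerived` is a hypothetical helper, and addresses are modelled as plain integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: given the old and new address of the base object,
// compute the new value of a pointer derived from it. The offset is
// preserved exactly, so a derived pointer that sat before the object start
// or past its end stays at the same distance from the relocated object.
// Unsigned arithmetic wraps, so "negative" offsets work without special
// cases.
inline std::uintptr_t relocateDerived(std::uintptr_t OldBase,
                                      std::uintptr_t NewBase,
                                      std::uintptr_t OldDerived) {
  std::uintptr_t Offset = OldDerived - OldBase;
  return NewBase + Offset;
}
```

Note that the helper needs `OldBase` as an input. A derived pointer that points one past the end of its object may have the same numeric value as the start of a neighbouring object, so the collector cannot recover the base from the pointer's value alone.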

For the sake of argument, let's say our frontend does safepoint insertion.

There are a couple of approaches which seem like they might work; let's
explore each in turn:

1. We could use patchpoints to record all the values needed for the GC
stack map. This mostly works, but requires that the patchpoint not be
marked readonly or readnone (to prevent illegal reorderings). That
could be a usage convention. The real problem is that the compiler is
still free to introduce multiple *copies* of an SSA value over the
patchpoint. (This is completely legal under SSA semantics.) When it
does so, it creates a situation where the gc could fail to update a
pointer which will then be dereferenced. That's a bug. Worth stating
explicitly, I believe the patchpoint scheme would be sufficient *if you
do not ever relocate a root*.

2. We could use gc.root. gc.root defines the allocas, but does not
define the call format, or any of the mechanisms to ensure proper
relocation. As such, it *by itself* is not viable. Also, gc.root
inherently assumes every value will have a stack slot. Without *heavy*
reengineering, there's no way to have a gc pointer in a callee saved
register over a call site. This is an unfortunate limitation. Any call
representation without explicit relocation suffers from the same bug as
the patchpoint scheme.

3. We could combine gc.root allocas and patchpoints. This essentially
combines the flaws (no gc pointers in callee saved registers over calls,
and missed copies), with no benefit.

The statepoint intrinsics are basically the patchpoint option above, but
with relocation made explicit in the IR. While it's still legal for the
optimizer to create a copy of the value feeding a statepoint, that's now
okay. By construction, there can be no use of the original SSA value
(and thus the copy) after the statepoint. Instead, the explicitly
relocated value is used.
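
The difference between the two schemes can be sketched with a toy C++ model. Everything here is a hypothetical illustration rather than LLVM code: `collect` stands in for a relocating collector that updates only the locations it was told about, and the two `useAfter...` functions stand in for compiled code on either side of a call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Ptr = std::uintptr_t;

// Toy collector: the object at From moves to To, and only the root
// locations the collector was told about are updated.
inline void collect(const std::vector<Ptr *> &Roots, Ptr From, Ptr To) {
  for (Ptr *Root : Roots)
    if (*Root == From)
      *Root = To;
}

// Patchpoint-style scheme: one location is recorded and updated in place,
// but the compiler was free to keep a second copy of the same SSA value.
// That copy is not recorded, so it still holds the old address afterwards.
inline Ptr useAfterPatchpointStyleCall(Ptr Obj, Ptr MovedTo) {
  Ptr Recorded = Obj;
  Ptr Copy = Obj; // Copy introduced by the compiler, unknown to the GC.
  collect({&Recorded}, Obj, MovedTo);
  return Copy; // Stale: this is the address the object used to have.
}

// Statepoint-style scheme: the relocated pointer is a new value produced by
// the call sequence, and it is the only value later code may use.
inline Ptr statepointAndRelocate(Ptr Obj, Ptr MovedTo) {
  Ptr Slot = Obj;
  collect({&Slot}, Obj, MovedTo);
  return Slot;
}

inline Ptr useAfterStatepointStyleCall(Ptr Obj, Ptr MovedTo) {
  Ptr Copy = Obj; // Copies of the input are harmless: dead after the call.
  return statepointAndRelocate(Copy, MovedTo);
}
```

In the second function the copy still exists, but nothing after the call can name it, which is the property the explicit relocation is meant to provide.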

To summarize: We need (something like) statepoints for correctness of
fully relocating collectors.

(The points I'm making here are somewhat subtle. If it would help to
have IR examples here, ask. I'm deferring writing them because it's
time consuming.)

Other advantages of the statepoint approach:

The gc.relocate intrinsic (part of the statepoint proposal) also makes
it explicit in the IR what the base object of each pointer to be
relocated is. This isn't *required* (you could encode the same
information in the arguments of the statepoint), but making it explicit
is much cleaner.

The explicit relocation notation has the potential to be extended into
the backend. With some register allocator changes (not part of this
patch!), we could support gc pointers in callee saved registers. This
is possible with the (incorrect) patchpoint scheme. It is possible, but
*hard*, with the gc.root scheme.

The posted patch includes a couple of small optimizations (i.e. null
forwarding) that help performance, but could (probably) be implemented
on top of another scheme. We have a number of planned optimizations on
the statepoint mechanism.

Now, let me finally bring up late safepoint placement. The only real
impact on this patch is that, to date, we have only focused on the
*correctness* of a statepoint passing through the optimizer. We have
not attempted to teach the optimizer about how to leverage one or
perform optimizations over one. There's room for improvement here (i.e.
not completely blocking inlining), but we prefer to approach this
problem by simply inserting them late. You could instead choose to
insert them at generation time, and teach the optimizer about their
semantics. That *strategy choice* is independent of the representation
chosen provided that representation is *correct*.

Yours,
Philip

Sorry yes, I am comparing this approach to gc.root; it seems like gc.root and statepoints are similar in that they both take a spill+reload "reduce the code generator's ability to use copies" approach as compared to a hypothetical "track all copies so that they can be updated in place". It seems like gc.root provides much of the same functionality as statepoint -- gc.root definitely should be able to support a relocating GC as well, and I guess I haven't heard of it being "fundamentally broken" outside of a late-safepoint-placement strategy. So far the arguments I've seen for statepoints over gc.root are

easier to save roots in callee-save registers
easier to automatically generate gc annotations on arbitrary IR, such as post-compiler-optimizations.

I guess I'm wondering if I'm missing other benefits, and what your thoughts are on whether this would be enough to save statepoints from the same fate as gc.root.

jyh added a subscriber: jyh. Oct 16 2014, 10:25 AM

mjacob added a subscriber: mjacob. Oct 23 2014, 5:22 AM

vkalintiris added a subscriber: vkalintiris. Oct 23 2014, 6:31 AM

Hi Philip,

I looked only at parts of the patch so far. I guess certain things could have been broken out into smaller patches that are not part of the statepoints itself, such as the changes to MachineInstr, intrinsic munging, etc.

In general I noticed that the patch requires some love when it comes to the LLVM coding standards (e.g. CamelCase for variables). Besides that nitpicking, the code looks really nice.

Please see my inline comments and questions.

Thanks

Cheers,
Juergen

include/llvm/CodeGen/MachineInstr.h
91 ↗	(On Diff #14905)	Are you really hitting the limitations of uint8_t for mem references? Changing this by one byte increases the overall size of a MachineInstr from 64 bytes to 72 bytes (due to padding).
include/llvm/Target/Target.td
859 ↗	(On Diff #14905)	Is this a hard requirement for STATEPOINTs to work?
include/llvm/Target/TargetFrameLowering.h
202–203 ↗	(On Diff #14905)	Yup, couldn't agree more.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7330–7334 ↗	(On Diff #14905)	I think we had something similar to this before. I think we changed it because we didn't want to modify the LLVM IR by creating a new function call at this point.
7351 ↗	(On Diff #14905)	nullptr

Responding to review comments, updated change set coming soon.

include/llvm/CodeGen/MachineInstr.h
91 ↗	(On Diff #14905)	We were at one point. I'm not sure we still are. I'm fine removing this in the change which initially gets upstreamed and revisiting this separately.
include/llvm/Target/Target.td
859 ↗	(On Diff #14905)	You know, I'm not sure. I've forgotten why that was originally added. I think it's probably legacy junk at this point. I'll remove it and see if I run into any failures.
include/llvm/Target/TargetFrameLowering.h
202–203 ↗	(On Diff #14905)	Is this a "fix before submit" comment? Or just general agreement?
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7330–7334 ↗	(On Diff #14905)	I don't much care for this either, but at this point, it's fairly well tested and works. I'm fine cleaning this up, but do you mind if we land this first then work incrementally?

Revised to include changes requested by reviewers. Andy, Juergen, can I get a LGTM here? If you have additional comments, I'm happy to keep making changes once in tree, but keeping the diffs up to date and clean is becoming a real pain.

p.s. This is *not* against TOT. I had to include one change from TOT (in IR/Functions.cpp) to cleanup some now redundant code, but this is otherwise based on the same revision as previous. I will of course rebase before submission.

We should try to get fpizlo's thoughts on GC-related stuff like this.

There are still a lot of coding standard violations. I marked a few of them ...

include/llvm/CodeGen/MachineInstr.h
91 ↗	(On Diff #15885)	Whitespace
include/llvm/IR/Intrinsics.td
502 ↗	(On Diff #15885)	How does that work? Doesn't "anyfloat" take the concrete type from one of the arguments?
include/llvm/IR/Statepoint.h
1 ↗	(On Diff #15885)	Copyright Header
2 ↗	(On Diff #15885)	I don't think we use leading and trailing "__".
25 ↗	(On Diff #15885)	CamelCase
56 ↗	(On Diff #15885)	CamelCase
61 ↗	(On Diff #15885)	CamelCase
75 ↗	(On Diff #15885)	CamelCase
141 ↗	(On Diff #15885)	CamelCase
lib/CodeGen/SelectionDAG/StatepointLowering.cpp
134 ↗	(On Diff #15885)	CamelCase
135 ↗	(On Diff #15885)	ditto
161 ↗	(On Diff #15885)	CamelCase
192–194 ↗	(On Diff #15885)	CamelCase
198 ↗	(On Diff #15885)	CamelCase
200 ↗	(On Diff #15885)	CamelCase
201–202 ↗	(On Diff #15885)	ditto
234 ↗	(On Diff #15885)	CamelCase
246–248 ↗	(On Diff #15885)	...
lib/IR/Statepoint.cpp
1 ↗	(On Diff #15885)	Copyright Header
lib/IR/Verifier.cpp
1119–1120 ↗	(On Diff #15885)	Would be nice to put this in a "isStatepoint(const Value *)" method instead.

This revision is now accepted and ready to land. Nov 21 2014, 1:36 PM

Sorry for the big delay, I don't generally get blocks of time large enough to review a patch this size. It's my fault for not insisting that you split it up into docs, IR representation, lowering to MI, and stackmap generation. Anyway, I can give a LGTM now as long as you address Juergen's coding convention comments and my comments inline and below.

Overall, I'm happy because we've agreed on a long-term plan for a generalized llvm.patchpoint to cover this use case and others. I just need to send out that proposal. In the meantime, this intrinsic lets you bootstrap the functionality.

It's somewhat shady that you create an IR instruction during SelectionDAG (CallInst::Create). It's probably OK to use a temporary instruction like this (in fact we used to do it for patchpoint), but not ideal. I think we can live with it until a new uber-patchpoint comes along (no need to address it now).

Question: It looks like lowering may require statepoint and gc_relocate calls not to be interleaved? Is that true? If so, can you document that and ensure that it is verified somewhere?

The function getFrameIndexReferenceForGC() is not GC specific. Please use a more appropriate name, like getFrameIndexOffsetFromSP().

Please fix Sphinx warnings:

/s/patch/docs/Statepoints.rst:10: WARNING: Title underline too short.

Status/Warning

/s/patch/docs/Statepoints.rst:60: WARNING: Title underline too short.

An Example Safepoint Sequence

/s/patch/docs/Statepoints.rst:60: WARNING: Title underline too short.

An Example Safepoint Sequence

/s/patch/docs/Statepoints.rst:158: WARNING: Title underline too short.

'''gc_relocate''' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^
/s/patch/docs/Statepoints.rst:158: WARNING: Title underline too short.

'''gc_relocate''' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^
/s/patch/docs/Statepoints.rst:234: WARNING: Title underline too short.

Polling for a Safepoint

/s/patch/docs/Statepoints.rst:234: WARNING: Title underline too short.

Polling for a Safepoint

looking for now-outdated files... none found
pickling environment... done
checking consistency... /s/patch/docs/Statepoints.rst:: WARNING: document isn't included in any toctree

docs/Statepoints.rst
108 ↗	(On Diff #15885)	You should remove the "unused" operand, or commit to it and explain what it's for. It's strange occurring between # call args and the variadic args. Shouldn't the pattern be something like: flags, #args, args..., flags, #args, args...
include/llvm/CodeGen/MachineInstr.h
91 ↗	(On Diff #15885)	Accidental change.
lib/IR/Verifier.cpp
1117–1125 ↗	(On Diff #15885)	This is incomplete because it only verifies that one use of the address is a safepoint, not all users.
lib/Target/X86/X86FrameLowering.cpp
1136–1137 ↗	(On Diff #15885)	I think this comment is accidentally wrapped.
1185–1188 ↗	(On Diff #15885)	I don't understand this comment or why the assert is disabled.

Closed by commit rL223143 (authored by @reames).

Revision Contents

Path: llvm/trunk/docs/Statepoints.rst
Size: 209 lines

Diff 16822

llvm/trunk/docs/Statepoints.rst

				=====================================
				Garbage Collection Safepoints in LLVM
				=====================================

				.. contents::
				   :local:
				   :depth: 2

				Status
				=======

				This document describes a set of experimental extensions to LLVM. Use with caution. Because the intrinsics have experimental status, compatibility across LLVM releases is not guaranteed.

				LLVM currently supports an alternate mechanism for conservative garbage collection support using the gc_root intrinsic. The mechanism described here shares little in common with the alternate implementation and it is hoped that this mechanism will eventually replace the gc_root mechanism.

				Overview
				========

				To collect dead objects, garbage collectors must be able to identify any references to objects contained within executing code, and, depending on the collector, potentially update them. The collector does not need this information at all points in code - that would make the problem much harder - but only at well defined points in the execution known as 'safepoints'. For most collectors, it is sufficient to track at least one copy of each unique pointer value. However, for a collector which wishes to relocate objects directly reachable from running code, a higher standard is required.

				One additional challenge is that the compiler may compute intermediate results ("derived pointers") which point outside of the allocation or even into the middle of another allocation. The eventual use of this intermediate value must yield an address within the bounds of the allocation, but such "exterior derived pointers" may be visible to the collector. Given this, a garbage collector can not safely rely on the runtime value of an address to indicate the object it is associated with. If the garbage collector wishes to move any object, the compiler must provide a mapping for each pointer to an indication of its allocation.

				To simplify the interaction between a collector and the compiled code, most garbage collectors are organized in terms of three abstractions: load barriers, store barriers, and safepoints.

				#. A load barrier is a bit of code executed immediately after the machine load instruction, but before any use of the value loaded. Depending on the collector, such a barrier may be needed for all loads, merely loads of a particular type (in the original source language), or none at all.
				#. Analogously, a store barrier is a code fragment that runs immediately before the machine store instruction, but after the computation of the value stored. The most common use of a store barrier is to update a 'card table' in a generational garbage collector.

				#. A safepoint is a location at which pointers visible to the compiled code (i.e. currently in registers or on the stack) are allowed to change. After the safepoint completes, the actual pointer value may differ, but the 'object' (as seen by the source language) pointed to will not.

				Note that the term 'safepoint' is somewhat overloaded. It refers to both the location at which the machine state is parsable and the coordination protocol involved in bringing application threads to a point at which the collector can safely use that information. The term "statepoint" as used in this document refers exclusively to the former.

				This document focuses on the last item - compiler support for safepoints in generated code. We will assume that an outside mechanism has decided where to place safepoints. From our perspective, all safepoints will be function calls. To support relocation of objects directly reachable from values in compiled code, the collector must be able to:

				#. identify every copy of a pointer (including copies introduced by the compiler itself) at the safepoint,
				#. identify which object each pointer relates to, and
				#. potentially update each of those copies.

				This document describes the mechanism by which an LLVM based compiler can provide this information to a language runtime/collector and ensure that all pointers can be read and updated if desired. The heart of the approach is to construct (or rewrite) the IR in a manner where the possible updates performed by the garbage collector are explicitly visible in the IR. Doing so requires that we:

				#. create a new SSA value for each potentially relocated pointer, and ensure that no use of the original (non relocated) value is reachable after the safepoint,
				#. specify the relocation in a way which is opaque to the compiler to ensure that the optimizer can not introduce new uses of an unrelocated value after a statepoint (this prevents the optimizer from performing unsound optimizations), and
				#. record a mapping of live pointers (and the allocation they're associated with) for each statepoint.

				At the most abstract level, inserting a safepoint can be thought of as replacing a call instruction with a call to a multiple return value function which calls the original target of the call, returns its result, and returns updated values for any live pointers to garbage collected objects.

				Note that the task of identifying all live pointers to garbage collected values, transforming the IR to expose a pointer giving the base object for every such live pointer, and inserting all the intrinsics correctly is explicitly out of scope for this document. The recommended approach is described in the section of Late Safepoint Placement below.

				This abstract function call is concretely represented by a sequence of intrinsic calls known as a 'statepoint sequence'.


				Let's consider a simple call in LLVM IR:
				todo

				Depending on our language we may need to allow a safepoint during the execution of the function called from this site. If so, we need to let the collector update local values in the current frame.

				Let's say we need to relocate SSA values 'a', 'b', and 'c' at this safepoint. To represent this, we would generate the statepoint sequence::
				put an example sequence here

				Ideally, this sequence would have been represented as an M argument, N return value function (where M is the number of values being relocated + the original call arguments and N is the original return value + each relocated value), but LLVM does not easily support such a representation.

				Instead, the statepoint intrinsic marks the actual site of the safepoint or statepoint. The statepoint returns a token value (which exists only at compile time). To get back the original return value of the call, we use the 'gc_result' intrinsic. To get the relocation of each pointer in turn, we use the 'gc_relocate' intrinsic with the appropriate index. Note that both the gc_relocate and gc_result are tied to the statepoint. The combination forms a "statepoint sequence" and represents the entirety of a parseable call or 'statepoint'.

				When lowered, this example would generate the following x86 assembly::
				put assembly here

				Each of the potentially relocated values has been spilled to the stack, and that location has been recorded in the StackMap section. If the garbage collector needs to update any of these pointers during the call, it knows exactly what to change.

				Intrinsics
				===========

				'''gc_statepoint''' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				      declare i32
				        @gc_statepoint(func_type <target>, i64 <#call args>,
				                       i64 <unused>, ... (call parameters),
				                       i64 <# deopt args>, ... (deopt parameters),
				                       ... (gc parameters))

				Overview:
				"""""""""

				The statepoint intrinsic represents a call which is parse-able by the runtime.

				Operands:
				"""""""""

				The 'target' operand is the function actually being called. The target can be specified as either a symbolic LLVM function, or as an arbitrary Value of appropriate function type. Note that the function type must match the signature of the callee and the types of the 'call parameters' arguments.

				The '#call args' operand is the number of arguments to the actual call. It must exactly match the number of arguments passed in the 'call parameters' variable length section.

				The 'unused' operand is unused and likely to be removed. Please do not use.

				The 'call parameters' arguments are simply the arguments which need to be passed to the call target. They will be lowered according to the specified calling convention and otherwise handled like a normal call instruction. The number of arguments must exactly match what is specified in '#call args'. The types must match the signature of 'target'.

				The 'deopt parameters' arguments contain an arbitrary list of Values which is meaningful to the runtime. The runtime may read any of these values, but is assumed not to modify them. If the garbage collector might need to modify one of these values, it must also be listed in the 'gc parameters' argument list. The '# deopt args' field indicates how many operands are to be interpreted as 'deopt parameters'.

				The 'gc parameters' arguments contain every pointer to a garbage collector object which potentially needs to be updated by the garbage collector. Note that the argument list must explicitly contain a base pointer for every derived pointer listed. The order of arguments is unimportant. Unlike the other variable length parameter sets, this list is not length prefixed.
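
As a rough illustration of the operand layout described above, the following C++ sketch computes where each variable length section begins and ends. It is not part of the patch. The names ``OperandRange``, ``StatepointLayout`` and ``computeStatepointLayout`` are hypothetical, and the sketch assumes operands are numbered from zero starting at 'target', with the two counts already extracted as plain integers.

```cpp
#include <cstddef>

// Half-open range [begin, end) of operand indices.
struct OperandRange {
  std::size_t begin;
  std::size_t end;
  std::size_t size() const { return end - begin; }
};

// Hypothetical summary of where each variable length section lives.
struct StatepointLayout {
  OperandRange callParams;
  std::size_t deoptCountIndex; // operand holding '# deopt args'
  OperandRange deoptParams;
  OperandRange gcParams;       // not length prefixed: runs to the end
};

// Assumed fixed prefix: operand 0 is 'target', 1 is '#call args',
// 2 is 'unused'. The call parameters start right after it.
const std::size_t kFirstCallParam = 3;

// Returns false if the counts do not fit in an operand list of the
// given length; otherwise fills in 'out'.
bool computeStatepointLayout(std::size_t numOperands, std::size_t numCallArgs,
                             std::size_t numDeoptArgs, StatepointLayout &out) {
  // We need the three fixed operands plus the '# deopt args' operand.
  if (numOperands < kFirstCallParam + 1)
    return false;
  std::size_t remaining = numOperands - (kFirstCallParam + 1);
  if (numCallArgs > remaining)
    return false;
  remaining -= numCallArgs;
  if (numDeoptArgs > remaining)
    return false;

  out.callParams = {kFirstCallParam, kFirstCallParam + numCallArgs};
  out.deoptCountIndex = out.callParams.end;
  out.deoptParams = {out.deoptCountIndex + 1,
                     out.deoptCountIndex + 1 + numDeoptArgs};
  // Everything after the deopt parameters is a gc parameter.
  out.gcParams = {out.deoptParams.end, numOperands};
  return true;
}
```

For a statepoint with ten operands, two call arguments and one deopt argument, this places the call parameters at indices 3 and 4, the '# deopt args' operand at index 5, the deopt parameter at index 6, and the gc parameters at indices 7 through 9.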

				Semantics:
				""""""""""

				A statepoint is assumed to read and write all memory. As a result, memory operations cannot be reordered past a statepoint. It is illegal to mark a statepoint as being either 'readonly' or 'readnone'.

				Note that legal IR cannot perform any memory operation on a 'gc parameters' argument of the statepoint in a location statically reachable from the statepoint. Instead, the explicitly relocated value (from a ''gc_relocate'') must be used.

				'''gc_result''' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare type*
				@gc_result_ptr(i32 %statepoint_token)

				declare fX
				@gc_result_float(i32 %statepoint_token)

				declare iX
				@gc_result_int(i32 %statepoint_token)

				Overview:
				"""""""""

				'''gc_result''' extracts the result of the original call instruction which was replaced by the '''gc_statepoint'''. The '''gc_result''' intrinsic is actually a family of three intrinsics due to an implementation limitation. Other than the type of the return value, the semantics are the same.

				Operands:
				"""""""""

				The first and only argument is the '''gc_statepoint''' which starts the safepoint sequence of which this '''gc_result''' is a part. Despite the typing of this as a generic i32, only the value defined by a '''gc_statepoint''' is legal here.

				Semantics:
				""""""""""

				The ''gc_result'' represents the return value of the call target of the ''statepoint''. The type of the ''gc_result'' must exactly match the return type of the target. If the call target returns void, there will be no ''gc_result''.

				A ''gc_result'' is modeled as a 'readnone' pure function. It has no side effects since it is just a projection of the return value of the previous call represented by the ''gc_statepoint''.

				'''gc_relocate''' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type> addrspace(1)*
				@gc_relocate(i32 %token, i32 %base_offset, i32 %pointer_offset)

				Overview:
				"""""""""

				A ''gc_relocate'' returns the potentially relocated value of a pointer at the safepoint.

				Operands:
				"""""""""

				The first argument is the '''gc_statepoint''' which starts the safepoint sequence of which this '''gc_relocate''' is a part. Despite the typing of this as a generic i32, only the value defined by a '''gc_statepoint''' is legal here.

				The second argument is an index into the statepoint's list of arguments which specifies the base pointer for the pointer being relocated. This index must land within the 'gc parameter' section of the statepoint's argument list.

				The third argument is an index into the statepoint's list of arguments which specifies the (potentially) derived pointer being relocated. It is legal for this index to be the same as the second argument if-and-only-if a base pointer is being relocated. This index must land within the 'gc parameter' section of the statepoint's argument list.
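
The range restriction on the two indices can be sketched as a small check. This is only an illustration, not the Verifier code from the patch. ``gcParamsBegin`` and ``relocateIndicesInRange`` are hypothetical names, and the sketch assumes both indices count from zero starting at the statepoint's 'target' operand.

```cpp
#include <cstddef>

// Assumed operand order of the statepoint:
//   target, #call args, unused, call params..., # deopt args,
//   deopt params..., gc params...
// so the gc parameters start after 3 fixed operands, the call
// parameters, the '# deopt args' operand, and the deopt parameters.
std::size_t gcParamsBegin(std::size_t numCallArgs, std::size_t numDeoptArgs) {
  return 3 + numCallArgs + 1 + numDeoptArgs;
}

// Returns true if both gc_relocate indices land within the gc parameter
// section of a statepoint with 'numOperands' operands in total.
bool relocateIndicesInRange(std::size_t numOperands, std::size_t numCallArgs,
                            std::size_t numDeoptArgs, std::size_t baseIndex,
                            std::size_t derivedIndex) {
  const std::size_t begin = gcParamsBegin(numCallArgs, numDeoptArgs);
  if (begin > numOperands)
    return false; // malformed statepoint: the counts overrun the operands
  auto inGcSection = [&](std::size_t i) {
    return i >= begin && i < numOperands;
  };
  return inGcSection(baseIndex) && inGcSection(derivedIndex);
}
```

Note that equal indices pass this check. Whether that is legal depends on whether the pointer really is a base pointer, which a purely structural check like this one cannot decide.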

				Semantics:
				""""""""""
				The return value of ''gc_relocate'' is the potentially relocated value of the pointer specified by its arguments. It is unspecified how the value of the returned pointer relates to the argument to the ''gc_statepoint'' other than that a) it points to the same source language object with the same offset, and b) the 'based-on' relationship of the newly relocated pointers is a projection of the unrelocated pointers. In particular, the integer value of the pointer returned is unspecified.

				A ''gc_relocate'' is modeled as a 'readnone' pure function. It has no side effects since it is just a way to extract information about work done during the actual call modeled by the ''gc_statepoint''.


				StackMap Format
				================

				Locations for each pointer value which may need to be read and/or updated by the runtime or collector are provided via the StackMap format specified in the PatchPoint documentation.

				.. TODO: link

				Each statepoint generates the following Locations:

				* Constant which describes the number of following deopt Locations (not operands)
				* Variable number of Locations, one for each deopt parameter listed in the IR statepoint (same number as described by the previous Constant)
				* Variable number of Location pairs, one pair for each unique pointer which needs to be relocated. The first Location in each pair describes the base pointer for the object. The second is the derived pointer actually being relocated. It is guaranteed that the base pointer must also appear explicitly as a relocation pair if used after the statepoint. There may be fewer pairs than gc parameters in the IR statepoint. Each unique pair will occur at least once; duplicates are possible.

				Note that the Locations used in each section may describe the same physical location. For example, a stack slot may appear as a deopt location, a gc base pointer, and a gc derived pointer.
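
A runtime consuming these records might split the Location list along the lines of the following sketch. The ``Location`` struct here is a simplified stand-in for the real StackMap encoding, and all of the names are hypothetical. The sketch also assumes the deopt count is emitted as a small inline constant rather than through the constant pool.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for a StackMap Location entry.
enum class LocationKind { Register, Direct, Indirect, Constant, ConstantIndex };

struct Location {
  LocationKind kind;
  std::uint16_t dwarfReg;        // register number, where applicable
  std::int32_t offsetOrConstant; // stack offset or small constant
};

struct RelocationPair {
  Location base;    // base pointer of the object
  Location derived; // pointer actually being relocated
};

struct StatepointRecord {
  std::vector<Location> deopt;
  std::vector<RelocationPair> relocations;
};

// Splits the flat Location list of one statepoint record into its deopt
// Locations and (base, derived) pairs. Returns false if the list does
// not have the expected shape.
bool splitStatepointLocations(const std::vector<Location> &locs,
                              StatepointRecord &out) {
  // The first Location is a Constant holding the number of deopt Locations.
  if (locs.empty() || locs[0].kind != LocationKind::Constant)
    return false;
  if (locs[0].offsetOrConstant < 0)
    return false;
  const std::size_t numDeopt =
      static_cast<std::size_t>(locs[0].offsetOrConstant);
  if (numDeopt > locs.size() - 1)
    return false;

  // Whatever remains after the deopt Locations must pair up evenly.
  const std::size_t firstPair = 1 + numDeopt;
  if ((locs.size() - firstPair) % 2 != 0)
    return false;

  out.deopt.assign(locs.begin() + 1, locs.begin() + firstPair);
  out.relocations.clear();
  for (std::size_t i = firstPair; i < locs.size(); i += 2)
    out.relocations.push_back({locs[i], locs[i + 1]});
  return true;
}
```

Since duplicate pairs are possible and Locations may alias, a collector built on something like this would presumably want to deduplicate by physical location before updating any slot.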

				The ID field of the 'StkMapRecord' for a statepoint is meaningless and its value is explicitly unspecified.

				The LiveOut section of the StkMapRecord will be empty for a statepoint record.

				Safepoint Semantics & Verification
				==================================

				The fundamental correctness property of the compiled code w.r.t. the garbage collector is a dynamic one. It must be the case that there is no dynamic trace such that an operation involving a potentially relocated pointer is observably-after a safepoint which could relocate it. 'observably-after' in this usage means that an outside observer could observe this sequence of events in a way which precludes the operation being performed before the safepoint.

				To understand why this 'observably-after' property is required, consider a null comparison performed on the original copy of a relocated pointer. Assuming that control flow follows the safepoint, there is no way to observe externally whether the null comparison is performed before or after the safepoint. (Remember, the original Value is unmodified by the safepoint.) The compiler is free to make either scheduling choice.

				The actual correctness property implemented is slightly stronger than this. We require that there be no static path on which a potentially relocated pointer is 'observably-after' it may have been relocated. This is slightly stronger than is strictly necessary (and thus may disallow some otherwise valid programs), but greatly simplifies reasoning about correctness of the compiled code.
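
To make the static property a little more concrete, here is a deliberately simplified sketch of a check over a single static path. It is not the verifier mentioned in this document. It assumes every value in the model is a GC pointer, treats every use as observable (so it is stricter than the property stated above, and would reject the null comparison example), and models a ''gc_relocate'' as the definition of a fresh value. All names are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <vector>

enum class OpKind {
  Define,     // defines a new pointer value (including a gc_relocate result)
  Statepoint, // may relocate every pointer defined so far
  Use         // an operation on a pointer value
};

struct Op {
  OpKind kind;
  int value; // value id for Define/Use; ignored for Statepoint
};

// Walks one static path and returns the index of the first Use of a
// value that a preceding statepoint may have relocated, or -1 if the
// path contains no such use.
long firstStaleUse(const std::vector<Op> &path) {
  std::set<int> live;  // defined since the last statepoint
  std::set<int> stale; // defined before some statepoint on this path
  for (std::size_t i = 0; i < path.size(); ++i) {
    const Op &op = path[i];
    switch (op.kind) {
    case OpKind::Define:
      live.insert(op.value);
      break;
    case OpKind::Statepoint:
      // Every previously defined pointer is now potentially relocated.
      stale.insert(live.begin(), live.end());
      live.clear();
      break;
    case OpKind::Use:
      if (stale.count(op.value))
        return static_cast<long>(i);
      break;
    }
  }
  return -1;
}
```

A path that defines value 1, hits a statepoint, defines value 2 as the relocated copy and then uses value 2 is accepted. Using value 1 after the statepoint instead is reported at the index of that use. A real checker would have to consider every path through the control flow graph, not just one.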

				By construction, this property will be upheld by the optimizer if correctly established in the source IR. This is a key invariant of the design.

				The existing IR Verifier pass has been extended to check most of the local restrictions on the intrinsics mentioned in their respective documentation. The current implementation in LLVM does not check the key relocation invariant, but there is ongoing work on developing such a verifier. Please ask on llvmdev if you're interested in experimenting with the current version.

Revision Contents

Diff 16822

llvm/trunk/docs/Statepoints.rst
