In the current implementation we spill a statepoint's GC arguments into
stack slots and then reload them at the corresponding gc.relocate
location. That works reasonably well for small functions/GC sets,
but when register pressure gets high or GC pointer sets grow
large, it produces ugly code: those spills are not understood
by the register allocator, and we end up with reloads from spill
locations which immediately get spilled into statepoint slots.
This change introduces a new form of the STATEPOINT instruction:
GC arguments are passed in VRegs and STATEPOINT itself becomes
a variadic-defs instruction, e.g.:
rel1,rel2,... = STATEPOINT ..., derived1<tied-def0>, derived2<tied-def1>, ...
The register allocator can then spill/fold them as it sees fit.
If the target runtime/GC does not support GC objects in
registers, the FixupStatepointCallerSaves pass is extended to spill
them (optionally allowing these pointers to stay in CSRs).
Due to SDNode memory management, we use a simple scheme to decide
which GC pointers are passed in VRegs: just take the first N.
For that to work, we sort GC derived pointers so that those not
needing relocation (e.g., constants) are at the end of the GC arguments
list.
N is further limited by the maximum number of tied registers a machine
instruction can have (15 at the moment).
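The selection scheme above can be sketched roughly as follows. This is a standalone illustration, not the code from the patch: `GCPtr`, `selectVRegPointers`, and `MaxTiedRegs` are hypothetical names, and capping N at the number of pointers that need relocation is an assumption about how "take the first N" interacts with the sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a GC derived pointer; not an LLVM type.
struct GCPtr {
  std::string name;
  bool needsRelocation; // false for, e.g., constants
};

// Limit quoted in the description ("15 at the moment").
constexpr std::size_t MaxTiedRegs = 15;

// Moves pointers that need relocation to the front, preserving relative
// order, and returns how many leading pointers would be passed in VRegs.
// Assumption: N never exceeds the number of pointers needing relocation.
std::size_t selectVRegPointers(std::vector<GCPtr> &ptrs,
                               std::size_t requestedN) {
  auto firstNoReloc = std::stable_partition(
      ptrs.begin(), ptrs.end(),
      [](const GCPtr &p) { return p.needsRelocation; });
  std::size_t relocatable =
      static_cast<std::size_t>(firstNoReloc - ptrs.begin());
  std::size_t n = std::min({requestedN, relocatable, MaxTiedRegs});
  assert(n <= ptrs.size());
  return n;
}
```

After the call, the first `n` entries of `ptrs` are the candidates for VRegs. The remaining entries would keep the existing spill-slot lowering.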
For more context/details about statepoints, see the documentation at:
https://llvm.org/docs/Statepoints.html
https://llvm.org/docs/GarbageCollection.html
This review includes all changes and is intended to provide the full
context of what's being done. It will be committed in smaller parts, with
a separate smaller review for each change.