This is an archive of the discontinued LLVM Phabricator instance.

[GreedyRA ORE] Separate Folder Reloads and Zero Cost Folder Reloads
ClosedPublic

Authored by skatkov on Apr 6 2021, 10:05 PM.

Details

Summary

Patchpoint instructions have operands which is actually zero cost
(or the same as register) to use the value from the stack.
In terms of statistic it makes same to separate them.

Move from computation instructions related to stack spill/reload to
number of stack slot referenced.

Diff Detail

Unit TestsFailed

Event Timeline

skatkov created this revision.Apr 6 2021, 10:05 PM
skatkov requested review of this revision.Apr 6 2021, 10:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2021, 10:05 PM
thegameg added inline comments.Apr 6 2021, 10:17 PM
llvm/lib/CodeGen/RegAllocGreedy.cpp
3171

I would extract the extra check in a separate function or lambda like isSpillSlotAccess above for readability.

reames added a comment.Apr 7 2021, 8:30 PM

I think you're computing slightly the wrong stat here. You're computing "instructions with at least one zero cost use". What I think you actually want is "number of zero cost uses folded". Key point is that a single instruction is likely to have many such uses, and, possibly, some non-zero cost uses.

I think you're computing slightly the wrong stat here. You're computing "instructions with at least one zero cost use". What I think you actually want is "number of zero cost uses folded". Key point is that a single instruction is likely to have many such uses, and, possibly, some non-zero cost uses.

Actually I tried to be consistent with other statistic. All other stat collect number of instructions, not number of stack slot used.
In this term I tried to collect zero cost instruction (all stack slot are zero cost uses) or just folded instructions.

I guess if we want move from number of instructions to number of stack slot usages we should do it for all other cases as well...

skatkov updated this revision to Diff 336066.Apr 8 2021, 5:18 AM

Handled comment. Added test.

Will move to number of spills instead of number of instructions in next update.

thegameg added inline comments.Apr 8 2021, 12:14 PM
llvm/lib/CodeGen/RegAllocGreedy.cpp
3172

I was thinking of moving the whole thing in a function to be able to write something like:

else if (TII->hasLoadFromStackSlot(MI, Accesses) &&
             llvm::any_of(Accesses, isSpillSlotAccess)) {
  if (isX(MI, TII))
   ++Stats.ZeroCostFoldedReloads
  else
   ++Stats.FoldedReloads;
reames added a comment.Apr 8 2021, 1:22 PM

Actually I tried to be consistent with other statistic. All other stat collect number of instructions, not number of stack slot used.

I think this code may simply be expecting that no instruction has one than more folded stack operand. For most architectures I'm aware of, that's true for all the actual instructions.

Matt added a subscriber: Matt.Apr 8 2021, 2:13 PM
skatkov updated this revision to Diff 336339.Apr 9 2021, 1:26 AM
skatkov edited the summary of this revision. (Show Details)

@reames, I moved from instruction computation to stack slot usages.
@thegameg, I landed NFC before to move from if-else to if-continue. Is it readable now or you still want me to separate added logic into utility function?

I've added output of register allocator to the test to be able to verify the correctness of the stat manually.

By some reason utils/update_llc_test_checks.py does not work for me, so I did it manually.

@thegameg, I landed NFC before to move from if-else to if-continue. Is it readable now or you still want me to separate added logic into utility function?

I imagined all the checking ZeroCost/Patchpoint logic to be extracted in a separate function but it looks like it's more complicated than just a simple check.

reames accepted this revision.Apr 13 2021, 12:19 PM

LGTM w/one minor fix before submission.

llvm/lib/CodeGen/RegAllocGreedy.cpp
3199

This assert does not hold when the instruction uses the same stack slot twice (which is entirely legal). Just remove it.

3206

Side question: Are you planning on extending this to zero-cost folded spills? Statepoints also have those too.

This revision is now accepted and ready to land.Apr 13 2021, 12:19 PM
skatkov added inline comments.Apr 13 2021, 7:24 PM
llvm/lib/CodeGen/RegAllocGreedy.cpp
3199

It should. FoldedReloads and ZeroCostFoldedReloads are sets to avoid duplication.
I guesses that memory operands does not contain duplicates, am I wrong?

3206

Probably I misses that... I will check the code when statepoint may contain folded spill...

skatkov added inline comments.Apr 13 2021, 7:46 PM
llvm/lib/CodeGen/RegAllocGreedy.cpp
3206

And to be honest in this case my implementation is incorrect. At this moment it considers all spill stack slot accesses in var area as folder reload. If there are folded spills then I should separate them somehow.

Also makes sense to note that original implementation as well as this one does not consider folded spills if there are folded reload. Most probably to avoid accounting then twice due to cost is probably the same...

skatkov added inline comments.Apr 13 2021, 7:49 PM
llvm/test/CodeGen/X86/statepoint-ra.ll
82

Philip, you are right:
(volatile load store 8 on %stack.0)

Not sure whether we want to separate them, let me think...

skatkov updated this revision to Diff 337351.Apr 14 2021, 12:09 AM

Assert removed.
Test is updated due to recent changes in RA.

This revision was landed with ongoing or failed builds.Apr 14 2021, 12:25 AM
This revision was automatically updated to reflect the committed changes.