Page MenuHomePhabricator

[Greedy RegAlloc] Take into account the cost of local intervals when selecting split candidate.
ClosedPublic

Authored by myatsina on Dec 26 2017, 9:47 AM.

Details

Summary

When selecting a split candidate for region splitting, the register allocator tries to predict which candidate will have the cheapest spill cost.
Global splitting may cause the creation of local intervals, and they might spill.

This patch makes RA take into account the spill cost of local split intervals in use blocks (we already take into account the spill cost in through blocks).
A flag ("-condsider-local-interval-cost") controls weather we do this advanced cost calculation (it's on by default for X86 target, off for the rest).

Diff Detail

Event Timeline

myatsina created this revision.Dec 26 2017, 9:47 AM
qcolombet accepted this revision.Jan 10 2018, 9:32 AM

Hi Marina,

Looks good to me. Couple of nitpicks below.
One question: What is the compile time impact of this new heuristic?

Cheers,
-Quentin

include/llvm/CodeGen/LiveRegMatrix.h
114

*segment

lib/CodeGen/LiveRegMatrix.cpp
211

Period at the end of comments

lib/CodeGen/RegAllocGreedy.cpp
1570

*interference

test/CodeGen/X86/bug26810.ll
28

Why is the new code sequence better?

This revision is now accepted and ready to land.Jan 10 2018, 9:32 AM

Hi Quentin,

Sorry it took me a while to get back to you.
I've measure the compile time influence on CTMark for X86 SKL with default compile flags and O1, O2, O3 optimization levels (O0 does not use Greedy RA).
The difference of this patch (and my previous patch regarding the bad evition chains) is between -0.92% and +0.8% for O1 and between -0.35% and +0.5% for O2, O3.
So compile time seems to not be seriously affected by this (or my previous) patch.

Thanks,
Marina

test/CodeGen/X86/bug26810.ll
28

If we look at the new sequence of the full loop, then it is not worse than the original one.

Here the test is only matching a small part of the loop (because this test is meant to check some other sequence).

Before patch:

loop:
MOVAPSmr
...
SUBPDrr
MOVAPSmr
MOVAPSrm
MULPDrm
ADDPDrr
ADD32ri8
...
jmp loop

After patch:

loop:
...
SUBPDrr
MOVAPSmr
MULPDrm
MOVAPSrm
ADDPDrr
MOVAPSrm
ADD32ri8
...
jmp loop

So the MOVAPSmr which was in the beginning of the loop was moved to the end of the loop.

Ah makes sense.

Thanks for the clarifications.

LGTM (again :)).

This revision was automatically updated to reflect the committed changes.
myatsina marked 3 inline comments as done.

I was looking at flags in this file and noticed there is a typo in the flag added here (condsider-local-interval-cost should be "consider-...", note the extraneous "d" in the flag). Also, there doesn't seem to be any test utilizing this flag. Should it be removed?

Herald added a project: Restricted Project. · View Herald TranscriptJun 25 2019, 9:39 AM