This is an archive of the discontinued LLVM Phabricator instance.

[RegAllocHints] Avoid compile time regression
ClosedPublic

Authored by jonpa on Mar 10 2019, 5:57 PM.

Details

Summary

As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile time building opencollada"), this patch makes sure that no phys reg is hinted more than once from getRegAllocationHints().

This handles the case were many virtual registers are assigned to the same physreg. The previous compile time fix (r343686) in weightCalcHelper() only made sure that physical / virtual registers are passed no more than once to addRegAllocationHint().

I am not sure if perhaps this check (with a HintedRegs set) is actually only needed in TargetRegisterInfo::getRegAllocationHints(), so that the one in weightCalcHelper() should be removed?

Diff Detail

Event Timeline

jonpa created this revision.Mar 10 2019, 5:57 PM
dim added a comment.Mar 11 2019, 1:43 AM

This helps quite a lot, in the sense that my original test case from PR40986 now at least compiles within a finite time. There is still a performance loss between clang 7.0.1 and clang 8.0.0 with this patch, though (tested with 10 runs each):

x clang-7.0.1-realtime.txt
+ clang-8.0.0-realtime.txt
+--------------------------------------------------------------------------------------------------+
|    x     x                                                         +                             |
|x   x  x  x    xx          x                  x  +         +++++    ++                           +|
||_________M___A_____________|                        |_________M__A___________|                   |
+--------------------------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10         37.19          40.4         37.89         38.16    0.94949109
+  10         40.56         43.91        41.525        41.732    0.86639483
Difference at 95.0% confidence
        3.572 +/- 0.853992
        9.36059% +/- 2.35454%
        (Student's t, pooled s = 0.908893)

So roughly 9% slower in real time (user time measurements give almost the same result, since this is all single-threaded of course.)

dim added a subscriber: hans.Mar 11 2019, 9:04 AM

@hans it's really late but if you're going to spin 8.0.0 rc5, it would be nice if this one gets merged too, after it's been committed. Or if not, I'll probably merge into the FreeBSD version of clang anyway, since we hit this issue first. :)

dim added a comment.Mar 11 2019, 11:10 AM

Btw @jonpa, for me it's still interesting as to why the compiler hang only occurred for builds targeting i386 (i.e 32 bit x86), while x86_64 worked just fine. Is it due to the lower number of physical registers on i386?

qcolombet accepted this revision.Mar 11 2019, 11:37 AM

LGTM.

test/CodeGen/X86/regalloc-copy-hints.mir
308

Do we need the IR part?
(I.e., does anything in the MIR reference back the IR?)

This revision is now accepted and ready to land.Mar 11 2019, 11:37 AM
In D59201#1424934, @dim wrote:

Btw @jonpa, for me it's still interesting as to why the compiler hang only occurred for builds targeting i386 (i.e 32 bit x86), while x86_64 worked just fine. Is it due to the lower number of physical registers on i386?

I really don't know, but that seems like a reasonable explanation to me.

jonpa closed this revision.Mar 11 2019, 12:04 PM

r355854

jonpa marked an inline comment as done.Mar 11 2019, 12:16 PM
jonpa added inline comments.
test/CodeGen/X86/regalloc-copy-hints.mir
308

I tried removing it but then got 'basic block "bb' is not defined in the function 'fun'"...?

hans added a comment.Mar 12 2019, 1:15 AM
In D59201#1424675, @dim wrote:

@hans it's really late but if you're going to spin 8.0.0 rc5, it would be nice if this one gets merged too, after it's been committed. Or if not, I'll probably merge into the FreeBSD version of clang anyway, since we hit this issue first. :)

I don't want to merge this so late in the process. It seems like a good candidate for 8.0.1 though, and maybe you can take it downstream anyway, as you say.