This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen][FastRegAlloc] Disable registers spilling for a naked function (PR28641)
Needs ReviewPublic

Authored by kbelochapka on Feb 20 2018, 6:22 PM.

Details

Summary

Problem:
Fast Registers Allocator can occasionally spill a register when processing a "naked" function.
It takes a input parameter register list and intersects it with a "asm" statement clobbered registers list, then it spills any register from a resulting list to a stack, which a "naked" function does not have.
Solution:
Since a "naked" function can only have nothing but "asm" statements inside, and those "asm" statements can not have input or output parameters, it is safe to just disable any registers spilling for a "naked" function.

Diff Detail

Event Timeline

kbelochapka created this revision.Feb 20 2018, 6:22 PM

Since a "naked" function can only have nothing but "asm" statements inside, and those "asm" statements can not have input or output parameters, it is safe to just disable any registers spilling for a "naked" function.

Do naked functions have vregs at all then? Could we just skip the whole register allocation process in runOnMachineFunction() instead?

Since a "naked" function can only have nothing but "asm" statements inside, and those "asm" statements can not have input or output parameters, it is safe to just disable any registers spilling for a "naked" function.

Do naked functions have vregs at all then? Could we just skip the whole register allocation process in runOnMachineFunction() instead?

Yes, they have VREGS, a "naked" function can have a list of input parameters.

I'm still a bit confused as to why we start spilling unused input vregs in the first place. Can you add a test case so we can reproduce? (The test can't get commit without a test anyway)

I've been looking into this, and we do indeed produce something like this for naked function with a parameter (this from http://llvm.org/PR28641 compiled with -O0):

bb.0 (%ir-block.1):
  liveins: $edi
  %0:gr32 = COPY $edi
  %1:gr32 = COPY killed %0
  INLINEASM &ret [sideeffect] [attdialect], $0:[clobber], implicit-def early-clobber $rdi, $1:[clobber], implicit-def early-clobber $eflags, !2
  • It would probably be a good idea to change isel to not emit copies for unused arguments but technically it is not wrong.
  • The %1 assignment is dead but not marked as such. I'm not 100% sure whether it is allowed to omit dead flags.
  • We can teach fast regalloc to recognize obviously dead cases where a vreg doesn't have a single use. That's enough to get this naked function stuff going:
diff --git a/lib/CodeGen/RegAllocFast.cpp b/lib/CodeGen/RegAllocFast.cpp
index 7a8d4225ad0..b2603c270a1 100644
--- a/lib/CodeGen/RegAllocFast.cpp
+++ b/lib/CodeGen/RegAllocFast.cpp
@@ -1044,7 +1044,7 @@ void RegAllocFast::allocateBasicBlock(MachineBasicBlock &MBB) {
       }
       LiveRegMap::iterator LRI = defineVirtReg(MI, I, Reg, CopySrcReg);
       MCPhysReg PhysReg = LRI->PhysReg;
-      if (setPhysReg(MI, I, PhysReg)) {
+      if (setPhysReg(MI, I, PhysReg) || MRI->use_nodbg_empty(Reg)) {
         VirtDead.push_back(Reg);
         CopyDstReg = 0; // cancel coalescing;
       } else

(of course to actually push this fastregalloc improvement we have to adapt a number of testcases)

The %1 assignment is dead but not marked as such. I'm not 100% sure whether it is allowed to omit dead flags.

I think it is *not* allowed to omit dead flags.
The question though is why SDISel doesn't set them.

Hi Konstantin,

Could you check why we don't set the dead flags on those?

I believe that would be the right fix.

Cheers,
-Quentin

Hi Quentin,
The dead flag on those registers is not set because we do not run "Live Variable Analysis" pass on -O0 optimization level.
We can add Live Variable Analysis pass prior of execution of Fast Register Allocator pass, but this will cause ~59 LIT tests to fail.
At the same time, in some situations the produced assembler code is more clean, does not contain unnecessary register spills.
And I am not sure can this potentially affect a code debuggability or not.
What would be your suggestion? Shall we add Live Variable Analysis pass on -O0?

Hi Konstantin,

We can add Live Variable Analysis pass prior of execution of Fast Register Allocator pass, but this will cause ~59 LIT tests to fail.

IIRC we were trying to get rid of the Live Variable Analysis pass all together. Therefore, adding uses of this pass is not recommended.
That said, for O0 I don't know what would be the alternative.

Anyway, if that pass does what we want, adding it at O0 works. Fixing LIT tests is fine.

Cheers,
-Quentin

I think fastregalloc/-O0 isn't really expecting all liveness flags to be complete (unfortunately hard to tell/not written down).

In any case I'd tend to not add an instance of LiveVariables to the O0 pipeline just for this. I'd rather vote for some workaround code in fast regalloc to detect obviously dead defs (see above).

Hi Konstantin,

We can add Live Variable Analysis pass prior of execution of Fast Register Allocator pass, but this will cause ~59 LIT tests to fail.

IIRC we were trying to get rid of the Live Variable Analysis pass all together. Therefore, adding uses of this pass is not recommended.
That said, for O0 I don't know what would be the alternative.

Anyway, if that pass does what we want, adding it at O0 works. Fixing LIT tests is fine.

Cheers,
-Quentin