This is an archive of the discontinued LLVM Phabricator instance.

findDeadCallerSavedReg needs to pay attention to calling convention
ClosedPublic

Authored by AndyAyers on Nov 20 2015, 11:45 AM.

Details

Summary

Caller saved regs differ between SysV and Win64. Use the tail call available set to scavenge from.

Refactor register info to create new helper to get at tail call GPRs. Added a new test case for windows. Fixed up a number of X64 tests since now RCX is preferred over RDX on SysV.

Diff Detail

Repository
rL LLVM

Event Timeline

AndyAyers updated this revision to Diff 40809.Nov 20 2015, 11:45 AM
AndyAyers retitled this revision from to findDeadCallerSavedReg needs to pay attention to calling convention.
AndyAyers updated this object.
AndyAyers added a reviewer: majnemer.
AndyAyers set the repository for this revision to rL LLVM.
AndyAyers added a subscriber: llvm-commits.
majnemer added inline comments.Nov 22 2015, 10:46 AM
lib/Target/X86/X86FrameLowering.cpp
160–163 ↗(On Diff #40809)

getPointerRegClass will return GR64_TC which differs from CallerSavedRegs64Bit in that it doesn't include R10 but does include RIP. I reckon that GR64_TC is wrong and should match CallerSavedRegs64Bit, what do you think?

majnemer added inline comments.Nov 22 2015, 12:24 PM
lib/Target/X86/X86FrameLowering.cpp
156 ↗(On Diff #40809)

Reference token should lean right.

160–163 ↗(On Diff #40809)

Ah, RIP was added to silence the following machine verifier check:
https://github.com/llvm-mirror/llvm/blob/5ca8076fccc1aa4860ab0d2fd741b60dd1ba269b/lib/CodeGen/MachineVerifier.cpp#L925

The following test case run with llc -verify-coalescing will cause us to trigger that verifier check...

target triple = "x86_64-pc-win32"

@fnptr = external global void ()*

define void @test() {
entry:
  %p = load void ()*, void ()** @fnptr
  tail call void %p()
  ret void
}

I'll go fix this for Win64...

186 ↗(On Diff #40809)

I think we are OK with getPointerRegClass giving back RIP if we make sure to filter it out.

AndyAyers added inline comments.Nov 22 2015, 2:58 PM
lib/Target/X86/X86FrameLowering.cpp
160–163 ↗(On Diff #40809)

I'll add R10 to the windows set, and exclude RIP.

It seems odd that R10 is not in the SysV set -- it looks like it is caller saved on both.

AndyAyers updated this revision to Diff 40890.Nov 22 2015, 4:50 PM

Updated per feedback. Use R10 on windows, screen out RIP on sysv, fixed placement of &.

AndyAyers marked 2 inline comments as done.Nov 22 2015, 4:51 PM

Updated detail comment status.

rnk accepted this revision.Nov 23 2015, 10:58 AM
rnk added a reviewer: rnk.
rnk added a subscriber: rnk.

Aside from the magic around '4', this looks great.

lib/Target/X86/X86FrameLowering.cpp
156 ↗(On Diff #40890)

I'm surprised the magic number 4 isn't an enum. I guess the intention was that it would only be used by tablegen.

Anyway, I think what we should do here is scoop the guts of 'case 4' in X86RegisterInfo::getPointerRegClass out into its own method on X86RegisterInfo that we can call directly from here. It should have a name like getGPRsForTailCall.

160–163 ↗(On Diff #40890)

According to http://www.x86-64.org/documentation/abi.pdf, R10 is used for passing the "static chain pointer". The document doesn't say what that is, but I think it's for GCC's nested functions extension. We have this code in X86CallingConv.td:

// The 'nest' parameter, if any, is passed in R10.
CCIfNest<CCIfSubtarget<"isTarget64BitILP32()", CCAssignToReg<[R10D]>>>,
CCIfNest<CCAssignToReg<[R10]>>,

I think it's impossible to write C/C++ source code that clang will compile into a normal tail call that uses the nest parameter, but I can see why we might want to be cautious and leave out R10. R11 is always a perfectly valid answer, since it's always scratch.

This revision is now accepted and ready to land.Nov 23 2015, 10:58 AM
AndyAyers added inline comments.Nov 23 2015, 11:44 AM
lib/Target/X86/X86FrameLowering.cpp
156 ↗(On Diff #40890)

Yeah, I was surprised too. I thought there might be a friendly name somewhere.

I'll refactor as you suggest -- seems like case 4 should just call this new method then?

rnk added inline comments.Nov 23 2015, 11:47 AM
lib/Target/X86/X86FrameLowering.cpp
156 ↗(On Diff #40890)

Yup.

AndyAyers updated this revision to Diff 40966.Nov 23 2015, 12:56 PM
AndyAyers updated this object.
AndyAyers edited edge metadata.

Created helper to get at the tail-call available GPRs.

rnk requested changes to this revision.Nov 23 2015, 1:31 PM
rnk edited edge metadata.
rnk added inline comments.
include/llvm/Target/TargetRegisterInfo.h
640 ↗(On Diff #40966)

This doesn't need to be virtual, X86FrameLowering already knows that TRI is an X86RegisterInfo. This whole refactoring should be internal to the X86 backend.

This revision now requires changes to proceed.Nov 23 2015, 1:31 PM
AndyAyers updated this revision to Diff 40973.Nov 23 2015, 1:57 PM
AndyAyers edited edge metadata.

Updated the RegisterInfo type passed to findDeadCallerSavedReg and removed the virtual/override.

rnk accepted this revision.Nov 23 2015, 2:15 PM
rnk edited edge metadata.

Thanks, lgtm

This revision is now accepted and ready to land.Nov 23 2015, 2:15 PM
This revision was automatically updated to reflect the committed changes.