This is an archive of the discontinued LLVM Phabricator instance.

[CFI] cfi directive insertion for callee-save-registers in CFIInstrInserter pass
ClosedPublic

Authored by wmi on Feb 9 2020, 9:57 PM.

Details

Summary

https://reviews.llvm.org/D42848 handled CFA related cfi directives but didn't handle callee-save-registers (csr) related cfi directives like .cfi_offset and .cfi_restore. This patch adds that part.

Basically it reuses the framework created in D42848. For each basicblock, the patch tracks which csr set have been saved at its CFG predecessors's exits, and compare the csr set with the set at its previous basicblock's exit (The previous block is the block laid before the current block). If the saved csr set at its previous basicblock's exit is larger, .cfi_restore will be inserted.

It fixes the cfi information for the simple case below. The case is shrinkwrap optimized by llvm:

--------------- 1.cc ----------------
void goo();
  
long foo(bool cond, long *a) {
  if (__builtin_expect(cond, 1)) {
    goo();
    return a[0] + a[1] + a[2];
  }
  return 0;
}
------------------------------------

clang_without_patch -O2 -fno-omit-frame-pointer -S 1.cc
	.cfi_startproc
# %bb.0:                                # %entry
	testb	%dil, %dil
	je	.LBB0_1
# %bb.2:                                # %if.then
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	pushq	%rbx
	pushq	%rax
	.cfi_offset %rbx, -24
	movq	%rsi, %rbx
	callq	_Z3goov
	movq	8(%rbx), %rax
	addq	(%rbx), %rax
	addq	16(%rbx), %rax
	addq	$8, %rsp
	popq	%rbx
	popq	%rbp
	.cfi_def_cfa %rsp, 8
	retq
.LBB0_1:
	xorl	%eax, %eax          
	retq
.Lfunc_end0:
	.size	_Z3foobPl, .Lfunc_end0-_Z3foobPl
	.cfi_endproc

clang_with_patch -O2 -fno-omit-frame-pointer -S 1.cc
	.cfi_startproc
# %bb.0:                                # %entry
	testb	%dil, %dil
	je	.LBB0_1
# %bb.2:                                # %if.then
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	pushq	%rbx
	pushq	%rax
	.cfi_offset %rbx, -24
	movq	%rsi, %rbx
	callq	_Z3goov
	movq	8(%rbx), %rax
	addq	(%rbx), %rax
	addq	16(%rbx), %rax
	addq	$8, %rsp
	popq	%rbx
	popq	%rbp
	.cfi_def_cfa %rsp, 8
	retq
.LBB0_1:
	.cfi_restore %rbx          <==== .cfi_restore inserted.
	.cfi_restore %rbp          <==== .cfi_restore inserted.
	xorl	%eax, %eax           <==== without .cfi_restore above, here libunwind will erroneously think the values of %rbx and %rbp in last stack frame are saved in current stack frame.                                                           
	retq
.Lfunc_end0:
	.size	_Z3foobPl, .Lfunc_end0-_Z3foobPl
	.cfi_endproc

Diff Detail

Repository
rL LLVM

Event Timeline

wmi created this revision.Feb 9 2020, 9:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 9 2020, 9:57 PM
Herald added a subscriber: hiraditya. · View Herald Transcript

Thanks for working on this, I think in general it looks good.

Wasn't there a verifier that was running with EXPENSIVE_CHECKS for the previously supported directives? Can we add the registers as well?

More comments inline.

llvm/lib/CodeGen/CFIInstrInserter.cpp
81

You can use BitVectors as sets of registers, with the supported bitwise operations to provide similar set operations as std::set.

llvm/test/CodeGen/X86/cfi-inserter-callee-save-register.mir
7

Does this test actually need the IR?

wmi marked 2 inline comments as done.Feb 22 2020, 11:37 PM

Thanks for working on this, I think in general it looks good.

Wasn't there a verifier that was running with EXPENSIVE_CHECKS for the previously supported directives? Can we add the registers as well?

More comments inline.

Thanks for the review and sorry for the late reply.

And thanks for pointing out the verifier. That is a helpful suggestion! I added the check for CSR related CFI, and it helped me to discover a problem I missed before: Previously we skip inserting .cfi_restore in epilogue in prologue/epilogue generation pass, and only insert .cfi_restore in CFIInstrInserter pass. If the epilogue block is not a return block, skipping inserting .cfi_restore in it will lead to the CSR related CFI verification error -- the return block after the epilogue block will see different OutgoingCSRSaved set from its predecessors.

The solution is: In prologue/epilogue generation pass, we need to insert .cfi_restore for epilogue if it is not a return block or other type of exit block. In this way, each block will see consistent OutgoingCSRSaved information from each of its predecessors. At the same time, for the common case that epilogue block is a return block, we still don't need to insert unnecessary .cfi_restore.

To make it clearer, here is a testcase to show it. In the assembly, %bb.1 is an epilogue bb without return inside of it. If no .cfi_restore is inserted in %bb.1, %LBB0_2 will see different OutgoingCSRSaved information from its two predecessors: %bb.0 (no reg saved) and %bb.1 (rbp saved). So we need to insert .cfi_restore in %bb.1.

------------- 1.cc --------------
void goo();

long foo(bool cond) {
  if (__builtin_expect(cond, 1)) {
    goo();
    return 0;
  }
  return 0;
}
-----------------------------

clang -O2 -fno-omit-frame-pointer  -S 1.cc
  .cfi_startproc
# %bb.0:                                # %entry
  testb %dil, %dil
  je  .LBB0_2
# %bb.1:                                # %if.then
  pushq %rbp
  .cfi_def_cfa_offset 16
  .cfi_offset %rbp, -16
  movq  %rsp, %rbp
  .cfi_def_cfa_register %rbp
  callq _Z3goov
  popq  %rbp                     
  .cfi_def_cfa %rsp, 8
.LBB0_2:                                # %return
  xorl  %eax, %eax
  retq
.Lfunc_end0:
  .size _Z3foob, .Lfunc_end0-_Z3foob
  .cfi_endproc
llvm/lib/CodeGen/CFIInstrInserter.cpp
81

Thanks. Done.

llvm/test/CodeGen/X86/cfi-inserter-callee-save-register.mir
7

The function define is needed, but its body can be further simplified.

wmi updated this revision to Diff 246099.Feb 22 2020, 11:42 PM

Address @thegameg's comment.

wmi added a comment.Mar 2 2020, 8:51 AM

Friendly ping.

Sorry for the delay.

That's right, llvm didn't insert any CFI in epilogues before this pass was added, so backends can start emitting them once this pass is ready to deal with them.

I added some more inline comments, this seems pretty good!

llvm/lib/CodeGen/CFIInstrInserter.cpp
173

Can a BitVector be used here too?

241

If CSRSaved is a BitVector, this can be just MBBInfo.OutgoingCSRSaved |= CSRSaved.

316

for (int Reg : SetDifference.set_bits())

350

set_bits() here too.

355–356

set_bits() here too.

llvm/lib/Target/X86/X86FrameLowering.cpp
482

Can we just not call this function at all on no-return blocks?

llvm/test/CodeGen/X86/cfi-epilogue-with-return.mir
10

I would also remove all the unnecessary IR here if possible.

56

Maybe you could use renamable $rbx = IMPLICIT_DEF for all the movs, for a cleaner test here and in the rest of the tests?

llvm/test/CodeGen/X86/cfi-inserter-callee-save-register.mir
7

IIRC, you can even remove the function define by removing the whole IR part. For example: llvm/test/CodeGen/X86/tailcall-pseudo.mir

wmi marked 8 inline comments as done.Mar 3 2020, 10:09 PM
wmi added inline comments.
llvm/lib/Target/X86/X86FrameLowering.cpp
482

Good point. Done.

llvm/test/CodeGen/X86/cfi-epilogue-with-return.mir
10

I remove most of them, but "define _Z3foob" is still needed. In addition, I need to mark the frame-pointer = all attribute to test the cfi related with %rbp is properly set.

56

Another good point. Done.

wmi updated this revision to Diff 248108.Mar 3 2020, 10:30 PM

Address @thegameg's comment.

thegameg accepted this revision.Mar 4 2020, 9:49 AM

LGTM, this looks great, thanks @wmi.

llvm/test/CodeGen/X86/cfi-epilogue-with-return.mir
29

You can also remove the branch probabilities in all the tests, e.g. in this test: (0x7fef9fcb) and (0x00106035).

This revision is now accepted and ready to land.Mar 4 2020, 9:49 AM
wmi marked an inline comment as done.Mar 4 2020, 10:59 AM

Thanks for the review and helpful suggestions!

This revision was automatically updated to reflect the committed changes.
wmi added a comment.Mar 24 2020, 10:56 AM

The original committed patch was reverted because a test failure in libc++ was reported.

The original patch only inserts .cfi_restore. A missing piece in the original patch is for some cases, it needs to generate .cfi_offset/.cfi_register as well. I attach a new patch which fixes the problem.

Here is a case just used to show where is the problem. In reality the test below won't expose the problem because we don't generate .cfi_restore in epilogue if it is a return block. Suppose the final machine code has the same layout as the c code below.

int foo(bool cond) {
  // prologue: .cfi_offset $rbp 16
  if (cond) {
    b = 5;
    // epilogue: .cfi_restore $rbp
    return 2;
  } 
  // currently, we don't insert anything here, but we need to insert .cfi_offset $rbp 16 because at the place of 'a = 3', $rbp has not been restored to its value in previous frame yet. 
  a = 3;
  // epilogue: .cfi_restore $rbp
  return 3;
}
wmi updated this revision to Diff 252374.Mar 24 2020, 11:00 AM

Fix an unwinding problem exposed in libc++ test.

wmi reopened this revision.Mar 24 2020, 11:27 AM

reopen it for the review of the patch to be recommitted.

This revision is now accepted and ready to land.Mar 24 2020, 11:27 AM
wmi added a comment.Mar 31 2020, 11:45 AM

Friendly ping. Because the fix for the patch contains some substantial change so I want it to be reviewed before I recommit the patch. Thanks in advance!

wmi added a comment.Apr 8 2020, 9:24 AM

Ping again.

thegameg added inline comments.Apr 17 2020, 10:59 AM
llvm/lib/CodeGen/CFIInstrInserter.cpp
92

Can we rename this to something like CSRSavedLocation?

94

0/$noreg should be considered an invalid register here, no?

94

Isn't this a per-basicblock property? I actually have a hard time coming up with a test case where this would break, so maybe this is sufficient?

201

Can we also have tests for cfi_register and cfi_rel_offset? My understanding is that we currently don't generate them.

231

it is not used here

232

Does it mean that we assert if we have:

.cfi_offset $rbx, -10
.cfi_offset $rbx, -24

I would think that this is valid and we should just overwrite it. What do you think? If we do want to support it, please add a test as well.

llvm/test/CodeGen/X86/cfi-inserter-callee-save-register-2.mir
73 ↗(On Diff #252374)

Can we add the CFI_INSTRUCTION restores for consistency here too?

wmi marked 7 inline comments as done.Apr 19 2020, 12:08 PM
wmi added inline comments.
llvm/lib/CodeGen/CFIInstrInserter.cpp
92

That is better. Done.

94

My assumption is every function will only save a CSR in one place and I havn't seen case outside of the assumption. If every basicblock has its own CSR location array, the cost will be much higher, so I choose to hold the assumption until I see a counterexample.

94

Register number used in CFI instruction is got from MCRegisterInfo::getDwarfRegNum, which is different from the physical register number. For $rax, CFI.getRegister() returns 0.

201

I saw PPC backend has code to generate .cfi_register. I add tests for them.

231

Fixed.

232

Ah, the assertion is not written in a correct way. The original intention was to be consistent with the assumption above that every function will only save the CSR in one place. If one CSR is saved in multiple different locs, the assertion could catch it.

I change the assertion and add a test for it.

llvm/test/CodeGen/X86/cfi-inserter-callee-save-register-2.mir
73 ↗(On Diff #252374)

Done.

wmi updated this revision to Diff 258614.Apr 19 2020, 12:12 PM

Address thegameg's comment.

wmi updated this revision to Diff 258619.Apr 19 2020, 12:31 PM

Fix some minor issue in the last version.

thegameg accepted this revision.Apr 23 2020, 1:03 AM

Thank you, this LGTM!

llvm/lib/CodeGen/CFIInstrInserter.cpp
94

Fair enough. I think it's a safe assumption to make, since for now we only emit the prologue more than once if we have ehfunclet-entry blocks (llvm/test/CodeGen/X86/win64-eh-empty-block-2.mir).

94

Maybe llvm::Optional<T> is cleaner in this case? (I don't feel strongly about this, though)

wmi marked an inline comment as done.Apr 27 2020, 12:28 PM
wmi added inline comments.
llvm/lib/CodeGen/CFIInstrInserter.cpp
94

Good suggestion. Optional<T> looks cleaner.

wmi updated this revision to Diff 260411.Apr 27 2020, 12:45 PM

Address thegameg@'s comment.

This revision was automatically updated to reflect the committed changes.