This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Mark setjmp/catchret MBBs address-taken
ClosedPublic

Authored by JosephTremoulet on Oct 15 2015, 8:09 AM.

Details

Summary

This ensures that BranchFolding (and similar) won't remove these blocks.

Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are
address-taken but do not have BBs that are address-taken, since otherwise
its call to getAddrLabelSymbolTableToEmit would fail an assertion on such
blocks. I audited the other callers of getAddrLabelSymbolTableToEmit
(and getAddrLabelSymbol); they all have BBs known to be address-taken
except for the call through getAddrLabelSymbol from
WinException::create32bitRef; that call is actually now unreachable, so
I've removed it and updated the signature of create32bitRef.

This fixes PR25168.

Diff Detail

Event Timeline

JosephTremoulet retitled this revision from to [CodeGen] Mark setjmp/catchret MBBs address-taken.
JosephTremoulet updated this object.
JosephTremoulet added subscribers: llvm-commits, pgavlin.
rnk edited edge metadata.Oct 16 2015, 11:17 AM

I'd actually like this transform to work, and I don't want WinEH to get entangled with BlockAddress support, which is what hasAddressTaken is for.

My idea was to create a pseudo instr during frame lowering that defs EAX/RAX and takes no operands. During X86MCInstrLowering, we would scan forwards to the CATCHRET terminator, get the destination MBB operand, and print its symbol.

I'd actually like this transform to work,

In the case that we hit in the bug, we actually get the same codegen despite the MBB being marked address-taken:

  • Tail-merge can still reduce the address-taken block to a jmp
  • We can still position the address-taken block before its succ, leaving it empty
  • We just can't remove the empty block anymore, so we use a different label for the same codegen

Of course, I agree we wouldn't always be able to get the same codegen. I believe the setjmp testcase added here left around a block with a jmp, owing I guess to the successor having multiple preds...

I don't want WinEH to get entangled with BlockAddress support, which is what hasAddressTaken is for.

I was actually trying to be careful *not* to use BlockAddress, so that nothing is address-taken at the IR level. I take it you're saying that distinction isn't helpful?

Happy to consider alternatives:

  1. BranchFolding knows how to rewrite jump tables; perhaps something patterned after that to allow it to rewrite the lea/mov for a catchret would be better? I considered that but since the lea/mov is in code and the jump tables are data it seemed like it would be a harder problem.
  2. I'm not sure I understand why we need to split out the lea/mov at PEI. Could we persist a CATCHRET with explicit target block through to X86MCInsrtLowering, and there find the appropriate place to stick the lea/mov?

My idea was to create a pseudo instr during frame lowering that defs EAX/RAX and takes no operands. During X86MCInstrLowering, we would scan forwards to the CATCHRET terminator, get the destination MBB operand, and print its symbol.

Since the same catch could have multiple catchrets targeting different successors, I'd be worried about the pseudos for different targets getting merged together somehow and then we wouldn't be able to expand the merged pseudo to something correct. Does that sound right, or am I misunderstanding?

Thanks.

rnk accepted this revision.Oct 21 2015, 3:40 PM
rnk edited edge metadata.

lgtm
I looked at the setjmp test case, and that convinced me that your approach here is better.

lib/CodeGen/AsmPrinter/AsmPrinter.cpp
2483–2487

There should be a comment here explaining that MBBs can have their address taken as part of CodeGen without having their address taken in IR.

This revision is now accepted and ready to land.Oct 21 2015, 3:40 PM
JosephTremoulet edited edge metadata.

rebase, incorporate feedback

update newly-added tests