This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Use assembler expressions to lay out the EH LSDA
Closed, Public

Authored by rprichard on Jan 30 2018, 4:05 PM.

Details

Reviewers
espindola
Summary

Rely on the assembler to finalize the layout of the DWARF/Itanium
exception-handling LSDA. Rather than calculate the exact size of each
thing in the LSDA, use assembler directives:

  • To emit the offset to the TTBase label:

      .uleb128 .Lttbase0-.Lttbaseref0
      .Lttbaseref0:

  • To emit the size of the call site table:

      .uleb128 .Lcst_end0-.Lcst_begin0
      .Lcst_begin0:
      ... call site table entries ...
      .Lcst_end0:

  • To align the type info table:

      ... action table ...
      .balign 4
      .long _ZTIi
      .long _ZTIl
      .Lttbase0:

Using assembler directives simplifies the compiler and allows switching
the encoding of offsets in the call site table from udata4 to uleb128 for
a large code size savings. (This commit does not change the encoding.)
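
To give a rough feel for the saving mentioned above, here is a minimal Python sketch (not part of the patch) that compares the size of one call site record under the two encodings. The function names are hypothetical. It assumes the usual Itanium LSDA record layout: start, length, and landing pad fields in the call site encoding, followed by an action field that is ULEB128 in both cases.

```python
UDATA4_SIZE = 4  # udata4 fields are always 4 bytes wide


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def call_site_entry_sizes(start: int, length: int, landing_pad: int, action: int):
    """Return (udata4_size, uleb128_size) in bytes for one call site record."""
    action_size = len(encode_uleb128(action))
    udata4 = 3 * UDATA4_SIZE + action_size
    uleb = sum(len(encode_uleb128(f)) for f in (start, length, landing_pad))
    return udata4, uleb + action_size


# Small offsets, as in a short function: 13 bytes versus 4 bytes.
print(call_site_entry_sizes(0x10, 0x20, 0x40, 1))  # (13, 4)
```

The saving shrinks as offsets grow, since each ULEB128 field needs one byte per 7 bits of value.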

The combination of the uleb128 followed by a balign creates an unfortunate
dependency cycle that the assembler must sometimes resolve either by
padding an LEB or by inserting zero padding before the type table. See
PR35809 or GNU as bug 4029.
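
The cycle can be illustrated with a toy model in Python. This is a sketch under simplifying assumptions, not how LLVM MC or GNU as actually resolves it: `prefix_size`, `body_size`, and the function names are hypothetical, the LSDA is assumed to start on an aligned address, and the field width is only ever allowed to grow. The width of the TTBase offset field shifts everything after it, which changes the alignment padding, which in turn changes the offset value that the field has to hold.

```python
def uleb128_size(value: int) -> int:
    """Number of bytes in the minimal ULEB128 encoding of value."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def ttbase_offset(prefix_size, leb_size, body_size, type_table_size, align=4):
    """Return (padding, offset) for a given width of the TTBase offset field.

    prefix_size: bytes before the offset field (assumed header bytes)
    body_size:   bytes between .Lttbaseref0 and the .balign directive
    """
    end_of_actions = prefix_size + leb_size + body_size
    padding = -end_of_actions % align  # zero bytes inserted by .balign
    return padding, body_size + padding + type_table_size


def layout_ttbase(prefix_size, body_size, type_table_size, align=4, max_iters=8):
    """Grow the LEB field until the offset fits; return (leb_size, padding, offset)."""
    leb_size = 1
    for _ in range(max_iters):
        padding, offset = ttbase_offset(
            prefix_size, leb_size, body_size, type_table_size, align)
        needed = uleb128_size(offset)
        if needed <= leb_size:
            # If needed < leb_size, the value must be emitted as a padded LEB.
            return leb_size, padding, offset
        leb_size = needed
    raise RuntimeError("layout did not converge")


print(ttbase_offset(2, 2, 16375, 8))  # (1, 16384): needs a 3-byte LEB
print(ttbase_offset(2, 3, 16375, 8))  # (0, 16383): fits in a 2-byte LEB
print(layout_ttbase(2, 16375, 8))     # (3, 0, 16383)
```

In this example neither a 2-byte nor a minimal 3-byte field is self-consistent, so the sketch settles on a 3-byte field holding a value that would fit in 2 bytes, i.e. a padded LEB. The other way out, adding extra zero padding before the type table, is not modeled here.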

Event Timeline

rprichard created this revision. Jan 30 2018, 4:05 PM
rprichard added a subscriber: danalbert.

I described this change on the llvm-dev mailing list a month ago: https://lists.llvm.org/pipermail/llvm-dev/2018-January/120178.html.

rprichard updated this revision to Diff 132942. Feb 5 2018, 11:26 PM

Remove the -layout-eh-table-in-assembler option; emit uleb128 code
offsets unconditionally.

Also: update LLVM tests to account for the encoding change.

Comments:

  • The SJLJ call site encoding value is still udata4, even though the two fields in each entry are encoded with uleb128. IIRC, I checked GCC, and it specified the encoding as uleb128. I suspect the SJLJ decoder ignores the encoding, but I haven't checked.
  • I removed the comments about the "16-byte bundle". I *think* it originated from Itanium documentation, where it refers to a VLIW instruction bundle.
  • check-llvm passes. I also built a stage2 toolchain and verified that its check-libcxx and check-libcxxabi pass on a 64-bit Linux machine.
rprichard updated this revision to Diff 133131. Feb 6 2018, 6:47 PM

Split this patch into two parts:

  • a patch that uses assembler directives to lay out the LSDA
  • a patch that switches the CST encoding to uleb128 and omits TTBase for empty type tables
rprichard retitled this revision from "[CodeGen] Switch non-SJLJ EH encoding to uleb128" to "[CodeGen] Use assembler expressions to lay out the EH LSDA". Feb 6 2018, 6:53 PM
rprichard edited the summary of this revision.
rprichard updated this revision to Diff 133304. Feb 7 2018, 2:20 PM
rprichard edited the summary of this revision.

Remove the AsmPrinter::EmitPaddedULEB128 and
MCStreamer::EmitPaddedULEB128IntValue functions, which are now unused.

Fix up the comments above MCStreamer::Emit[US]LEB128IntValue.
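
For readers unfamiliar with the term, a padded ULEB128 is an encoding stretched to a fixed width by keeping the continuation bit set and appending zero-payload bytes. The following Python sketch shows the general idea only; it is not the removed LLVM code, and the function names are made up for illustration.

```python
def encode_uleb128_padded(value: int, pad_to: int = 0) -> bytes:
    """Encode value as ULEB128, padded to at least pad_to bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        # Keep the continuation bit set if more payload or padding follows.
        if value or len(out) + 1 < pad_to:
            byte |= 0x80
        out.append(byte)
        if not value:
            break
    # Padding: 0x80 bytes carry no payload; the final 0x00 terminates.
    while len(out) < pad_to:
        out.append(0x80 if len(out) + 1 < pad_to else 0x00)
    return bytes(out)


def decode_uleb128(data: bytes) -> int:
    """Decode a (possibly padded) ULEB128 value from the start of data."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            break
    return result


print(encode_uleb128_padded(1, pad_to=3).hex())  # 818000
print(decode_uleb128(bytes.fromhex("818000")))   # 1
```

A standard decoder reads the padded form without any changes, which is why padding is a valid way to reserve a fixed-size slot for a value that is not yet known.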

espindola accepted this revision. Feb 7 2018, 2:34 PM

LGTM. Thanks!

This revision is now accepted and ready to land. Feb 7 2018, 2:34 PM
espindola closed this revision. Feb 9 2018, 9:03 AM

r324749

Thank you so much for cleaning this up.