This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Fill executable segments with trap instructions.
AbandonedPublic

Authored by grimar on Mar 22 2018, 3:42 AM.

Details

Summary

LLD currently fills the last page of each executable segment with
trap instructions. It seems to be not enough, see PR36853.

Patch changes logic to fill the all paddings within a segment.
So that any zero areas which exist because of alignment
paddings are also covered.

Diff Detail

Event Timeline

grimar created this revision.Mar 22 2018, 3:42 AM
ruiu added a comment.Mar 22 2018, 1:24 PM

This change is a bit worrisome from the performance point of view because it significantly increases the number of bytes we write to mmap'ed file. I believe this is still fine because writing bytes to contiguous memory is extremely fast and the size of the executable part of an executable is relatively small compared to the size of an entire executable, but we need to verify that first.

ruiu added a comment.Mar 22 2018, 2:07 PM

When I tried this patch against an internal program of size 2 GiB, the overhead almost 2% (i.e. the link time is 2% longer than before), which is I think not negligible. We need a better way of filling executable segments with trap instructions.

When I tried this patch against an internal program of size 2 GiB, the overhead almost 2% (i.e. the link time is 2% longer than before), which is I think not negligible. We need a better way of filling executable segments with trap instructions.

Thanks for the information. Indeed I did not expect 2% slowdown. I'll try to implement and benchmark a different approach.

grimar updated this revision to Diff 139588.Mar 23 2018, 6:49 AM
grimar edited the summary of this revision. (Show Details)
  • Reimplemented.
grimar updated this revision to Diff 139589.Mar 23 2018, 6:54 AM
  • Included forgotten test case.
ruiu added inline comments.Mar 23 2018, 11:20 AM
ELF/Arch/X86.cpp
445 ↗(On Diff #139589)

These instructions won't change, so let's just append 0xcc instructions at end of each array, instead of using memset.

grimar updated this revision to Diff 139710.Mar 24 2018, 7:02 AM
  • Addressed review comment.
ruiu added inline comments.Mar 26 2018, 1:43 PM
ELF/Arch/X86.cpp
446 ↗(On Diff #139710)

Please remove ; .align 48 because that doesn't really explain what this code does. (Also we want to align it not to 48 but 16).

466 ↗(On Diff #139710)

Ditto

ELF/Arch/X86_64.cpp
503–504 ↗(On Diff #139710)

Do this

0xcc, 0xcc, ... // int3
0xcc, 0xcc, ... // int3

so that the comments are aligned vertically.

552 ↗(On Diff #139710)

Ditto

grimar added inline comments.Mar 28 2018, 10:23 AM
ELF/Arch/X86.cpp
446 ↗(On Diff #139710)

Actually, we want to align to PltHeaderSize I think, which is 48 here. Because if you align 32 to 16, you'll get
32 instead of 48 and that would not work.

grimar abandoned this revision.Mar 28 2018, 10:24 AM

Abandoning in favour of D44943

ruiu added inline comments.Mar 28 2018, 10:30 AM
ELF/Arch/X86.cpp
446 ↗(On Diff #139710)

That's wrong. ".align 48" doesn't mean that the size of this thingy is 48 byte. It just says that the thin that follows this thing is aligned to 48 bytes. So that comment is wrong.

ruiu added a comment.Mar 28 2018, 2:57 PM

You shouldn't have dropped this patch because mine doesn't include the code to add 0xcc to end of each PLT entry. We still need that for PLT entries that's larger than one page.

espindola added inline comments.Mar 28 2018, 7:40 PM
ELF/Writer.cpp
2195

Why not static?

2212

I wonder if we could write this as we write the sections.

You shouldn't have dropped this patch because mine doesn't include the code to add 0xcc to end of each PLT entry. We still need that for PLT entries that's larger than one page.

I think we have D44682 for PLT entries posted earlier than this one on review and it do all necessary job for PLT entries + has the test case.

grimar added inline comments.Mar 29 2018, 1:32 AM
ELF/Arch/X86.cpp
446 ↗(On Diff #139710)

Ok, seems I overthinking the meaning of "0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, // 19: int3; .align 16" above then,
thanks for the explanation.