This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Improve jump table entry validation for split jump table
Needs ReviewPublic

Authored by nhuhuan on Jul 17 2022, 2:16 PM.

Details

Summary

Two main challenges of recovering jump tables are (a) to identify jump
table base, and (b) to find jump table bounds. For (b), we mainly rely
on the key observation: jump table entries are valid indirect intra-ref.
In more specific, keep iterating jump table entry until we find an
invalid one. Furthermore, function splitting optimization breaks a
function into a few fragments, and jump table could point to any of
these fragments. From binary perspective, each fragment is a function
on its own. The main problem is identifying a sibling fragment from
fragment in a different function.

In nonstripped binaries, fortunately, standard compilers use consistent
symbol names, e.g., foo and foo.cold.1 are siblings, and BOLT rely on
this information to support split jump table. However, there is no
regular symbol in stripped binaries, suggesting major improvements in
every step. These improvements also improve processing of non-stripped
binaries. Support of split jump table can be broken down to 4 steps:

  1. Decouple sibling analysis (based on symbol) from branch analysis
    • Avoid using incomplete sibling information for nonstripped binaries
  2. Decouple disassembly from branch analysis
    • Requirement for (3) and (4)
  3. Improve solution for problem (a)
  4. Improve solution for problem (b)

This update addresses step (4): restructure jump table entry validation
with ordered properties:

  • Bring instruction bounds check earlier (nonstripped/stripped)
  • Split function has no more than 2 fragments (stripped)
  • Non-overlapping jump table (stripped)
  • Sibling validation based on symbol name (nonstripped)

In future update:

  • Sibling validation based on LSDA (stripped)
  • Cannot target callable fragment entry (nonstripped/stripped)

Diff Detail

Event Timeline

nhuhuan created this revision.Jul 17 2022, 2:16 PM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
nhuhuan requested review of this revision.Jul 17 2022, 2:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 17 2022, 2:16 PM
nhuhuan edited the summary of this revision. (Show Details)Jul 17 2022, 2:22 PM
nhuhuan edited the summary of this revision. (Show Details)Jul 17 2022, 2:26 PM
nhuhuan edited the summary of this revision. (Show Details)
Amir added a comment.Jul 17 2022, 5:06 PM

Thank you for working on this. Can you split out the addition of BC->IsStripped, and the removal of AllowStripped (separate diffs)? This would make this diff more amenable for a review.

nhuhuan updated this revision to Diff 445377.Jul 17 2022, 7:49 PM

Separate the removal of AllowStripped into another diff.

Amir added inline comments.Jul 18 2022, 12:06 PM
bolt/lib/Core/BinaryContext.cpp
501

Let's move this addition into a separate diff. It would be dependent on BC->IsStripped diff.

bolt/test/X86/jump-table-move-pic.s
2

Please add a comment what this test is about: e.g. "Test -jump-tables=move with stripped/non-stripped binary with split function accessing the jump table, reproduces the unresolved symbol issue due to losing symbols referenced from a jump table"

nhuhuan updated this revision to Diff 445602.Jul 18 2022, 12:43 PM

Based on top of D130034 and D130036

nhuhuan updated this revision to Diff 445604.Jul 18 2022, 12:49 PM
nhuhuan marked 2 inline comments as done.

Add description for the test.

nhuhuan updated this revision to Diff 445773.Jul 19 2022, 4:06 AM

Split into several diffs. This diff is about supporting
split jump table in in stripped binaries.

nhuhuan retitled this revision from [BOLT] Support split function in stripped binaries to [BOLT] Support split jump table for stripped binaries.Jul 19 2022, 4:17 AM
nhuhuan edited the summary of this revision. (Show Details)
nhuhuan edited the summary of this revision. (Show Details)Jul 19 2022, 10:31 AM
Amir accepted this revision.Jul 20 2022, 7:26 PM

LGTM. Sorry that the review was taking so long.

This revision is now accepted and ready to land.Jul 20 2022, 7:26 PM
nhuhuan updated this revision to Diff 446400.Jul 21 2022, 2:36 AM

Fix an issue where split jump table targets are marked as
secondary entry points, but they skipped instruction bounds
check in postProcessEntryPoints().

nhuhuan edited the summary of this revision. (Show Details)Jul 21 2022, 2:44 AM
Amir added inline comments.Jul 21 2022, 10:58 AM
bolt/lib/Rewrite/RewriteInstance.cpp
2917 ↗(On Diff #446400)

Can you make a test case for that?

nhuhuan updated this revision to Diff 447096.Jul 23 2022, 12:48 PM

Revert. Move jump table fixes to another diff.

nhuhuan edited the summary of this revision. (Show Details)Jul 23 2022, 2:20 PM
maksfb requested changes to this revision.Jul 25 2022, 5:01 PM

Change the summary to a narrative format.

bolt/include/bolt/Core/BinaryContext.h
483
483

If you don't expect a pointer parameter to be a nullptr, make it a reference. If you don't expect to change it, make it a const reference.

Add a description of the function and its parameters.

This revision now requires changes to proceed.Jul 25 2022, 5:01 PM
nhuhuan updated this revision to Diff 447561.Jul 25 2022, 9:58 PM

Address Maksim's feedback.

nhuhuan edited the summary of this revision. (Show Details)Jul 25 2022, 10:06 PM
nhuhuan edited the summary of this revision. (Show Details)Jul 25 2022, 10:07 PM
nhuhuan edited the summary of this revision. (Show Details)Jul 25 2022, 10:11 PM
nhuhuan marked 2 inline comments as done.Jul 25 2022, 10:20 PM
nhuhuan updated this revision to Diff 447595.Jul 26 2022, 12:39 AM

Update comment.

Amir added a comment.Jul 26 2022, 11:47 AM

@maksfb: does the updated summary look good?
@nhuhuan: please address one typo. LGTM otherwise.

bolt/lib/Core/BinaryContext.cpp
654–655

As a note for future improvement: instead of partially reconstructing the logic of isValidJumpTableEntry, we may enumerate the checks in isValidJTE and return the check number along with its result.

659
nhuhuan updated this revision to Diff 447833.Jul 26 2022, 2:26 PM

Incorporate debugging messages into isValidJumpTableEntry.
An entry failing isValidJumpTableEntry does not mean the
jump table is invalid. Suppose we have a true jump table
base, it means we found the bounds.

nhuhuan marked 2 inline comments as done.Jul 26 2022, 2:26 PM
nhuhuan updated this revision to Diff 447930.Jul 26 2022, 9:22 PM

Refine comments. Improve implementation for isValidJumpTableEntry.

nhuhuan updated this revision to Diff 447983.Jul 27 2022, 3:30 AM

Clang-formatted.

nhuhuan retitled this revision from [BOLT] Support split jump table for stripped binaries to [BOLT] Support split jump table.Aug 5 2022, 6:15 PM
nhuhuan retitled this revision from [BOLT] Support split jump table to [BOLT] Improve jump table entry validation for split jump table.Aug 5 2022, 6:18 PM
nhuhuan updated this revision to Diff 450457.Aug 5 2022, 6:24 PM

Update parent-fragment relation when adding parent to jump table