This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Support multiple parents for split jump table
ClosedPublic

Authored by nhuhuan on Jun 23 2022, 3:41 PM.

Details

Summary

There are two assumptions regarding jump table:
(a) It is accessed by only one fragment, say, Parent
(b) All entries target instructions in Parent

For (a), BOLT stores jump table entries as relative offset to Parent.
For (b), BOLT treats jump table entries target somewhere out of Parent
as INVALID_OFFSET, including fragment of same split function.

In this update, we extend (a) and (b) to include fragment of same split
functinon. For (a), we store jump table entries in absolute offset
instead. In addition, jump table will store all fragments that access
it. A fragment uses this information to only create label for jump table
entries that target to that fragment.

For (b), using absolute offset allows jump table entries to target
fragments of same split function, i.e., extend support for split jump
table. This can be done using relocation (fragment start/size) and
fragment detection heuristics (e.g., using symbol name pattern for
non-stripped binaries).

For jump table targets that can only be reached by one fragment, we
mark them as local label; otherwise, they would be the secondary
function entry to the target fragment.

Test Plan

ninja check-bolt

Diff Detail

Event Timeline

nhuhuan created this revision.Jun 23 2022, 3:41 PM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: ayermolo. · View Herald Transcript
nhuhuan requested review of this revision.Jun 23 2022, 3:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2022, 3:41 PM
nhuhuan edited the summary of this revision. (Show Details)Jun 23 2022, 3:42 PM
rafauler added inline comments.Jun 23 2022, 6:04 PM
bolt/lib/Core/BinaryContext.cpp
645–648

debug code should be guarded under the LLVM_DEBUG macro in the same way as line 625. This line was previously unguarded by mistake.

651–660

same

Do you have a testcase?

bolt/lib/Core/BinaryContext.cpp
704–705

outdated coment

Amir added inline comments.Jun 23 2022, 9:18 PM
bolt/lib/Core/BinaryContext.cpp
629–632

I think checking if the first parent is non-simple is fine, but a future-proof solution would be to check if any of parents is non-simple.

667

Let's add curly braces in if and else bodies

nhuhuan added inline comments.Jun 23 2022, 11:07 PM
bolt/lib/Core/BinaryContext.cpp
629–632

I also thought the same. There are two reasons why I keep it this way:
(a) Not sure about why we need this check.

Possibly the assumption is non-simple fragments are not be optimized, thus we
don't have to make best effort on analyzing it. So it boils down to the meaning of
non-simple. Maksim argues non-simple functions should have complete references,
just that we cannot construct a complete CFG.

(b) The previous calls to analyzeJumpTable and getOrCreateJumpTable already mark

all parents as "HasIndirectTargetToSplitFragment = true". I mistakenly thought that
was the same with all parents being non-simple.

The bottom line is that I will update following your feedbacks.

Amir added inline comments.Jun 23 2022, 11:29 PM
bolt/lib/Core/BinaryContext.cpp
629–632

I agree with your reasoning. Let's stick to keeping the check for now as it's a safe option, and remove it later separately.

nhuhuan added inline comments.Jun 23 2022, 11:29 PM
bolt/lib/Core/BinaryContext.cpp
727

There are two main types for FragmentsToSkip: (a) split jump table targets, (b) split landing pad targets.
Note that with our current code, we would need to call skipMarkedFragments() twice for (a) and (b). This
removal is to prepare for the next update where we only invoke skipMarkedFragments() once. To do that
we need to retain all FragmentsToSkip of both types.

nhuhuan added inline comments.Jun 23 2022, 11:52 PM
bolt/lib/Core/BinaryFunction.cpp
1704

This is a side note. BOLT made two assumptions:
a. Jump table entries are always unsigned int.
b. Jump table entries are arranged moving forward from Jump table base.

It's very often in real-world binaries to use movsl/movsx instead of movzl/movzx,
which seem to suggests against (a). But it is rare to see cases violate (b) from my
experience.

Now keeping (a), it is messy when OffsetEntries point to another fragment whose
appear prior to current function. It will be integer overflow, and this is inconsistent.

nhuhuan updated this revision to Diff 439643.Jun 24 2022, 12:07 AM

Address some feedbacks

nhuhuan marked 5 inline comments as done.Jun 24 2022, 12:10 AM
nhuhuan added inline comments.
bolt/lib/Core/BinaryContext.cpp
704–705

Thanks. I removed UniqueFunctions because it is the same as FragmentsToSkip. Also updated the comment.

nhuhuan updated this revision to Diff 439649.Jun 24 2022, 12:12 AM
nhuhuan marked an inline comment as done.

Clang-formatted

Do you have a testcase?

I already created a test case for it: two parents access same jump table whose entries target both parents:
split-func-jump-table-fragment-bidirection.s

nhuhuan updated this revision to Diff 439652.Jun 24 2022, 12:20 AM

Clang-formatted

nhuhuan updated this revision to Diff 439855.Jun 24 2022, 12:26 PM

Update test case: print-cfg to show all parents of a jump table

nhuhuan updated this revision to Diff 439862.Jun 24 2022, 12:41 PM

Fix the printing message to include all parents of a jump table

Amir added inline comments.Jun 24 2022, 1:27 PM
bolt/lib/Core/BinaryContext.cpp
643

Let's move the body under !Success into

LLVM_DEBUG({
  ...
});

block, except llvm_unreachable at the end

832–833

Please remove that print

bolt/test/X86/split-func-jump-table-fragment-bidirection.s
14
Amir added inline comments.Jun 24 2022, 1:55 PM
bolt/lib/Core/BinaryContext.cpp
629–632

idk how clang-format skipped this

rafauler added inline comments.Jun 24 2022, 4:47 PM
bolt/lib/Core/BinaryContext.cpp
787–788

Warnings are printed to errs() and not outs()

BOLT-INFO is output to outs(). If this information is just reporting something on the input binary, consider converting it to INFO instead of WARNING, specially if this is supported.

Thanks Huan for all the work put in this diff. I have a few more comments below.

bolt/include/bolt/Core/BinaryContext.h
239–242

Remove dead code

bolt/lib/Core/BinaryContext.cpp
493–496

I think it's a bit weird to keep calling this "Offsets", and "OffsetEntries" in JumpTable if we are not using offsets anymore, right? We are now using the full address in each entry. Perhaps we should rename this in JumpTable class? Something like "EntriesAsAddresses" in JumpTables.h:73 - 74 (and update comments there)

571

Same renaming here (my motivation here is why are we calling this an offset if this is not an offset anymore)

rafauler added inline comments.Jun 24 2022, 5:03 PM
bolt/include/bolt/Core/BinaryContext.h
239–242

Now I see it is used in the follow-up diff

nhuhuan added inline comments.Jun 24 2022, 5:26 PM
bolt/lib/Core/BinaryContext.cpp
571

Offset is a general term.
For PIC, it is the offset to the program start. Definitely not "absolute address".
For NoPIC, it is the offset to process's memory start.

nhuhuan marked 2 inline comments as done.Jun 24 2022, 5:27 PM
nhuhuan added inline comments.
bolt/include/bolt/Core/BinaryContext.h
239–242

Right. The follow-up diff depends on this diff.

nhuhuan updated this revision to Diff 440326.Jun 27 2022, 11:23 AM
nhuhuan marked an inline comment as done.

Address the feedbacks.

nhuhuan marked 7 inline comments as done.Jun 27 2022, 11:30 AM
Amir added inline comments.Jun 27 2022, 3:57 PM
bolt/include/bolt/Core/BinaryContext.h
496
nhuhuan added inline comments.Jun 28 2022, 8:23 PM
bolt/lib/Core/BinaryFunction.cpp
1703

Using same variable name "EntryOffset" is completely wrong, and may potentially trigger memory over-use!

nhuhuan updated this revision to Diff 440839.Jun 28 2022, 8:58 PM

Update how jump table entries are generated. Allow buildCFG to operate
on non-simple fragments. Address feedbacks.

nhuhuan edited the summary of this revision. (Show Details)Jun 28 2022, 9:00 PM
nhuhuan updated this revision to Diff 440840.Jun 28 2022, 9:06 PM
nhuhuan marked an inline comment as done.

Clang-formatted.

Amir added a comment.Jun 28 2022, 11:46 PM

Looks good, with a couple of final nits.

bolt/lib/Core/BinaryContext.cpp
827

Please keep the verbosity here

bolt/lib/Core/BinaryFunction.cpp
1649
1650–1654
nhuhuan updated this revision to Diff 440875.Jun 29 2022, 12:06 AM

Address feedbacks.

nhuhuan marked 3 inline comments as done.Jun 29 2022, 12:07 AM
Amir accepted this revision.Jun 29 2022, 12:16 AM

LGTM

This revision is now accepted and ready to land.Jun 29 2022, 12:16 AM
nhuhuan updated this revision to Diff 442131.Jul 4 2022, 11:53 AM

Fix bugs:
(a) missing builtin_unreachable jump table targets
(b) incorrect jump table duplication

This revision was automatically updated to reflect the committed changes.