This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Update labels for split landing pad
Needs RevisionPublic

Authored by nhuhuan on Jul 19 2022, 3:25 AM.

Details

Summary

Landing pad is the catch block in C++ exception handling and normally
resides in same function with callsite. As a result, it makes sense
to reuse or create a local label during parsing LSDA section.

However, for function-splitting enabled binaries, callsite and landing
pad could be in different sibling fragments. The split landing pad's
label needs to be registered as a secondary entry point to the target
fragment.

Also, unless LSDA section is updated, it is necessary to link callsite
fragment to landing pad fragment to guarantee that landing pad fragment
can be moved.

Test Plan:

ninja check-bolt

Diff Detail

Event Timeline

nhuhuan created this revision.Jul 19 2022, 3:25 AM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
nhuhuan requested review of this revision.Jul 19 2022, 3:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2022, 3:25 AM
nhuhuan edited the summary of this revision. (Show Details)Jul 19 2022, 3:27 AM
nhuhuan edited the summary of this revision. (Show Details)
nhuhuan edited the summary of this revision. (Show Details)Jul 19 2022, 3:30 AM
nhuhuan edited the summary of this revision. (Show Details)Jul 19 2022, 3:37 AM
Amir accepted this revision.Jul 20 2022, 7:24 PM
Amir added a subscriber: rafaelauler.

Thanks for working on it and adding a test with a stripped binary.

The approach looks good to me. I would appreciate if @maksfb or @rafaelauler take a look.
@nhuhuan: let's wait a day before proceeding.

This revision is now accepted and ready to land.Jul 20 2022, 7:24 PM

Excellent work Huan! I have a few design suggestions below. Let me know what do you think.

bolt/lib/Core/Exceptions.cpp
102

I'm not a fan of using boolean input variables (there are several texts written about this anti-pattern, this is just one of them: https://understandlegacycode.com/blog/what-is-wrong-with-boolean-parameters/ )

We have quite a few of these in our codebase already and it makes me uneasy. Here, it just makes code harder to read and understand. This is already a reasonably complex parser, and we should at least try to make it simpler to read. I would prefer if this function was refactored to reflect the fact that it now has two modes of operation.

If we look at parseLSDA as it is in this patch, it can be refactored using a pattern of parser logic separated from callback action. this is similar to semantic actions in a compiler parser, in which we completely separate parser from the component responsible for taking actions depending on what the parser reports as the rule being currently consumed.

put all of the current parser logic into parseLSDA, make parseLSDA receive a callback as an argument that is what is going to happen to each parsed element, and then implement checkonly=true and checkonly=false logic in the callback.

something along these lines (let me know if it is not feasible to implement this pattern or if it is too cumbersome):

BinaryFunction {
private:
  parseAndWalkOverLSDA(data, Callback)
public:
  consumeLSDAForFragmentRelationships();
  consumeLSDAForEHInfo()
}

parseAndWalkOverLSDA{
  parsing logic...
  for each iteration of loop {
    parsing logic
    callback()
  }
}

consumeLSDAForFragmentRelationships{
  auto Callback = [&] {
    establish fragment relationships logic...
  }

  parseAndWalkOverLSDA(Callback);
}

consumeLSDAForEHInfo {
  auto Callback = [&] {
    modify instructions to add all EH info stuff..
  }
  parseAndWalkOverLSDA(Callback);
}
522

fillLSDAAddressFor

I'm also curious if this adds runtime overhead by parsing LSDA two times per function. Can we measure that using a large binary as input?

nhuhuan added a comment.EditedJul 26 2022, 11:51 AM

I'm also curious if this adds runtime overhead by parsing LSDA two times per function. Can we measure that using a large binary as input?

Regarding overhead, running twice should have (almost) no effect because size
of LSDA is proportional to only C++ exception handling. Very small amount of
code actually uses this feature.

I measured performance on a binary of size 741MB.

Run parseLSDA once - without checking for valid LSDA:
Testing Time: 431.06s
real 7m11.206s
user 8m43.858s
sys 13m35.488s

Run parseLSDA twice - with checking for valid LSDA:
Testing Time: 442.66s
real 7m22.803s
user 8m43.770s
sys 13m54.131s

Note that to quickly get results, I skipped the checks in "Run Once" version,
resulted in faster performance, but that is not significant.

If we keep the checks in place, running once or twice probably does not matter
much. I don't think performance is a problem.

Amir added a comment.Jul 26 2022, 11:58 AM

Please update the summary to narrative form.

nhuhuan added inline comments.Jul 26 2022, 12:44 PM
bolt/lib/Core/Exceptions.cpp
102

Thanks for your suggestions, @rafaelauler.

I generally agree that using callbacks improves maintainability. However, a few things are holding me back.

(a) Callbacks may need too many arguments
I assume that we don't want to add parsing logic remained in parseAndWalkOverLSDA
If so, there are simply too many arguments for one general callback.

(b) Avoid printing duplicate debugging messages.
If printing message in callbacks, then many more arguments.
So parseAndWalkOverLSDA need to take another boolean AllowPrintingMessage.

nhuhuan updated this revision to Diff 447947.Jul 27 2022, 1:19 AM

Split this diff into a few diffs.

nhuhuan retitled this revision from [BOLT] Support split landing pad for stripped binaries to [BOLT] Update labels for split landing pad.Jul 27 2022, 1:20 AM
nhuhuan edited the summary of this revision. (Show Details)
rafauler added inline comments.Jul 27 2022, 1:16 PM
bolt/lib/Core/BinaryFunction.cpp
1887–1892

.. if LPFunc has no CFG yet

1888

Solution: rerun recomputeLandingPads after buildCFG for all functions

bolt/lib/Core/Exceptions.cpp
202–204

I would suggest moving this before MCSymbol *LPLabel = nullptr; so the code that calculates LPLabel is closer together

208

I was thinking in maybe adding a comment "Skip creating a landing pad if LPOffset is zero" so it is clear that this is from the LSDA format

216

Treat split landing pad as the fragment's secondary entry point?

But maybe expand this comment, as it looks like this code section is handling _only_ that. Maybe:

Create/fetch landing pad label. If necessary, treat split landing pad as...
225

Is this correct for the case where LPOffset is an offset outside this function (LPLabel is set by Fragment->addEntryPointAtOffset(FragmentOffset)) ?

bolt/lib/Rewrite/RewriteInstance.cpp
2925–2929

I would reverse for clarity:

Process non-simple cases due to split jump table or split landing pad
Ignore all other non-simple cases
nhuhuan updated this revision to Diff 448177.Jul 27 2022, 3:00 PM
nhuhuan marked 7 inline comments as done.

Address feedbacks

bolt/lib/Core/Exceptions.cpp
208

I wrote an equivalent comment: "Assumption: landing pad cannot target current fragment entry"

225

You're correct. I explain this in the last paragraph in description.
Let me clarify a bit more here:

The link between callsite and landing pad fragment is Labels[LPOffset].
If LSDA is not updated, the LPOffset will be the same, while target function
may be moved. This link is to keep connection between two fragments.
I also see its benefit on merging split function task in future.

522

Addressed in another diff: D130663

nhuhuan marked an inline comment as done.Jul 27 2022, 3:01 PM
Amir requested changes to this revision.Jul 27 2022, 11:16 PM

We need to make sure that the support for split LP is sufficient/correct for also running the processed binary.
Can we turn the test into a runtime test, making sure that the EH works after BOLT?

This revision now requires changes to proceed.Jul 27 2022, 11:16 PM