This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Add pass to fix ambiguous memory references
ClosedPublic

Authored by rafauler on Sep 16 2022, 6:36 PM.

Details

Reviewers
Amir
maksfb
Group Reviewers
Restricted Project
Commits
rG4f158995b9cd: [BOLT] Add pass to fix ambiguous memory references
Summary

This adds a round of checks to memory references, looking for
incorrect references to jump table objects. Fix them by replacing the
jump table reference with another object reference + offset.

This solves bugs related to regular data references in code
accidentally being bound to a jump table, and this reference being
updated to a new (incorrect) location because we moved this jump
table.

Fixes #55004

Diff Detail

Event Timeline

rafauler created this revision.Sep 16 2022, 6:36 PM
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
rafauler requested review of this revision.Sep 16 2022, 6:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 16 2022, 6:36 PM
rafauler added a comment.EditedSep 16 2022, 6:47 PM

This pass adds ~3.5s processing time in a large binary that takes about 3 minutes to process. This pass currently does not handle the case where the ambiguous jump table reference is in the same function that owns the referenced jump table label. This requires additional analysis and runtime overhead.

yota9 added inline comments.Sep 17 2022, 11:28 AM
bolt/lib/Passes/ValidateMemRefs.cpp
23

BC might be taken from BF

88

Just curious why not to parallel this?

tschuett added inline comments.
bolt/include/bolt/Passes/ValidateMemRefs.h
16

You may use nested namespaces.

bolt/lib/Passes/ValidateMemRefs.cpp
69

You may use structured bindings.

rafauler updated this revision to Diff 461393.Sep 19 2022, 3:41 PM

Reviewer suggestions: nested namespaces, structured bindings, paralellism.

rafauler marked 4 inline comments as done.Sep 19 2022, 3:46 PM
rafauler added inline comments.
bolt/lib/Passes/ValidateMemRefs.cpp
88

I put up the basic scaffolding necessary to run this in parallel. It does run slower, though. The pass takes 8s wall time versus 3.7s sequential for a given test input. So I forced it run sequentially for now, but I left it there so we can easily change to run in parallel in case this changes.

The reason it runs slower is because we need to acquire a large lock to prevent data races from creating new symbols on line 47. Specifically, in BinaryContext::registerNameAtAddress, we hold iterators to a std::map and another thread might be changing this map.

maksfb added inline comments.Oct 7 2022, 5:23 PM
bolt/lib/Passes/ValidateMemRefs.cpp
58–59
61

Is there a reason to iterate over layout (vs CFG)? If there's no reason, we should prefer CFG.

83

Probably the condition needs to be modified as we are moving jump tables in other modes as well.

93
95

Is the complexity really quadratic?

103
rafauler updated this revision to Diff 466672.Oct 10 2022, 5:54 PM
rafauler marked an inline comment as done.

Maksim's suggestions. Thanks!

maksfb accepted this revision.Oct 12 2022, 12:07 PM

One nit and one comment. Otherwise, LGTM.

bolt/lib/Core/BinaryEmitter.cpp
818–836

Could you please add a comment under which conditions we can have more than one label?

bolt/lib/Passes/ValidateMemRefs.cpp
50
This revision is now accepted and ready to land.Oct 12 2022, 12:07 PM
rafauler updated this revision to Diff 467302.Oct 12 2022, 4:45 PM

Adding requested comment, debugging message fixed, improved test case
2 to be more clear and to test if BOLT breaks this test case.

This revision was automatically updated to reflect the committed changes.