Recovering jump tables poses two main challenges: (a) identifying the
jump table base, and (b) finding the jump table bounds. For (b), we
rely mainly on one key observation: every jump table entry is a valid
indirect intra-function reference (intra-ref). More specifically, we
keep iterating over jump table entries until we find an invalid one.
Function splitting complicates this: the optimization breaks a function
into a few fragments, and a jump table can point to any of them. From
the binary's perspective, each fragment is a function in its own right.
The main problem is telling a sibling fragment apart from a fragment
that belongs to a different function.
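
The bound search described above can be sketched roughly as follows.
This is a minimal illustration, not BOLT's actual code: AddressRange
and countValidEntries are hypothetical names, and an entry is treated
as valid simply when it lands inside one of the given fragments.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [Start, End) of one fragment.
struct AddressRange {
  uint64_t Start;
  uint64_t End;
  bool contains(uint64_t Addr) const { return Addr >= Start && Addr < End; }
};

// Counts the leading entries that land inside any allowed fragment.
// Scanning stops at the first entry that fails the check, which gives
// the assumed jump table bound.
size_t countValidEntries(const std::vector<uint64_t> &Entries,
                         const std::vector<AddressRange> &Fragments) {
  size_t Count = 0;
  for (uint64_t Target : Entries) {
    bool Valid = false;
    for (const AddressRange &Fragment : Fragments) {
      if (Fragment.contains(Target)) {
        Valid = true;
        break;
      }
    }
    if (!Valid)
      break;
    ++Count;
  }
  return Count;
}
```

Passing the ranges of all sibling fragments lets entries of a split
jump table count as valid, which is why sibling identification matters
for the bound.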
Fortunately, in nonstripped binaries, standard compilers use consistent
symbol names, e.g., foo and foo.cold.1 are siblings, and BOLT relies on
this information to support split jump tables. Stripped binaries,
however, have no regular symbols, so every step needs major
improvement. These improvements also benefit the processing of
nonstripped binaries. Support for split jump tables can be broken down
into 4 steps:
(1) Decouple sibling analysis (based on symbol) from branch analysis
    - Avoid using incomplete sibling information for nonstripped binaries
(2) Decouple disassembly from branch analysis
    - Requirement for (3) and (4)
(3) Improve solution for problem (a)
(4) Improve solution for problem (b)
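
For nonstripped binaries, the symbol-based sibling check mentioned
earlier (foo and foo.cold.1) could look roughly like the sketch below.
It assumes cold fragments are named "<parent>.cold" or
"<parent>.cold.<digits>"; parentName and areSiblings are hypothetical
helpers, not BOLT functions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the parent function name for a fragment symbol, assuming the
// naming pattern "<parent>.cold" or "<parent>.cold.<digits>".
// Names that do not match the pattern are returned unchanged.
std::string parentName(const std::string &Name) {
  const std::string Marker = ".cold";
  size_t Pos = Name.rfind(Marker);
  if (Pos == std::string::npos || Pos == 0)
    return Name;
  size_t Rest = Pos + Marker.size();
  if (Rest == Name.size())
    return Name.substr(0, Pos);
  // Anything after the marker must be ".<digits>".
  if (Name[Rest] != '.' || Rest + 1 == Name.size())
    return Name;
  for (size_t I = Rest + 1; I < Name.size(); ++I)
    if (!std::isdigit(static_cast<unsigned char>(Name[I])))
      return Name;
  return Name.substr(0, Pos);
}

// Two distinct symbols are treated as siblings when they share a parent.
bool areSiblings(const std::string &A, const std::string &B) {
  return A != B && parentName(A) == parentName(B);
}
```

A check like this only works while symbols are present, which is why
stripped binaries need a different source of sibling information.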
This update addresses step (4) by restructuring jump table entry
validation as an ordered list of properties:
- Bring instruction bounds check earlier (nonstripped/stripped)
- Split function has no more than 2 fragments (stripped)
- Non-overlapping jump table (stripped)
- Sibling validation based on symbol name (nonstripped)
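
One way to picture ordered validation is a list of named checks that
run in sequence and stop at the first failure. The sketch below is only
an illustration under that assumption: Check and firstFailure are
hypothetical names, and the predicates are supplied by the caller.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// One named check; Passes returns true when the candidate jump table
// target satisfies the property.
struct Check {
  std::string Name;
  std::function<bool(uint64_t)> Passes;
};

// Runs the checks in the given order and returns the name of the first
// failing check, or an empty string when the target passes all of them.
std::string firstFailure(uint64_t Target, const std::vector<Check> &Checks) {
  for (const Check &C : Checks)
    if (!C.Passes(Target))
      return C.Name;
  return "";
}
```

With this shape, moving a check earlier only means moving it earlier in
the vector, and later checks never run on a target that has already
been rejected.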
In a future update:
- Sibling validation based on LSDA (stripped)
- Cannot target callable fragment entry (nonstripped/stripped)