This is an archive of the discontinued LLVM Phabricator instance.

[BOLT][DWARF] Add version 5 split dwarf support
ClosedPublic

Authored by ayermolo on Apr 2 2022, 3:41 PM.

Details

Summary

Added support for DWARF5 Split Dwarf.

Diff Detail

Event Timeline

ayermolo created this revision.Apr 2 2022, 3:41 PM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: hoy, modimo, wenlei. · View Herald Transcript
ayermolo requested review of this revision.Apr 2 2022, 3:41 PM
Herald added a project: Restricted Project. · View Herald Transcript
ayermolo retitled this revision from [BOLT] DWARF5 split dwarf support to [BOLT][DWARF] DWARF5 split dwarf support.Apr 2 2022, 3:42 PM
ayermolo retitled this revision from [BOLT][DWARF] DWARF5 split dwarf support to [BOLT][DWARF] Add version 5 split dwarf support.Apr 15 2022, 5:08 PM

Probably worth changing the == 5 version checks to >= 5 since most things stay the same between DWARF versions, and the things that don't are the exceptions/things we'll need to touch on the next version (& nice to not have to touch/update code for things that stay the same)

bolt/include/bolt/Core/DebugData.h
375–377 ↗(On Diff #423161)

That seems like a somewhat problematic API - returning two very different kinds of data (the DWO_ID or the unit offset) seems quite easy to misuse this?

386 ↗(On Diff #423161)

Probably name the parameter the same as the member, rather than one character different/lower case.

bolt/lib/Rewrite/DWARFRewriter.cpp
208–209 ↗(On Diff #423161)

Probably an uppercase V since that's a separate word within the name?

306 ↗(On Diff #423161)
320 ↗(On Diff #423161)

reinterpret_cast seems a bit scary here - perhaps TempRangesSectionWriter should be a void* if it could be a pointer to two unrelated types (if it's a pointer to a base class that covers both ranges and rnglists, then a static_cast should suffice for the downcasting from base pointer to derived pointer)

327–328 ↗(On Diff #423161)

to avoid two lookups (one for count, the other for op[]) perhaps this code should use find?

Comments from @maksfb

bolt/include/bolt/Core/DebugData.h
219 ↗(On Diff #423161)

delete

230 ↗(On Diff #423161)

make reference

348 ↗(On Diff #423161)

add documentation.

386 ↗(On Diff #423161)

Pass BC by reference

399 ↗(On Diff #423161)

isFinalized

402 ↗(On Diff #423161)

releaseBuffer()

bolt/lib/Core/DebugData.cpp
210 ↗(On Diff #423161)

const

358 ↗(On Diff #423161)

const

417 ↗(On Diff #423161)

AM --> AMIter

483 ↗(On Diff #423161)

const
getCUId -->getCUID

1058 ↗(On Diff #423161)

remove method, move to constructor

1069 ↗(On Diff #423161)

const

1071 ↗(On Diff #423161)

Darf --> Dwarf

1075 ↗(On Diff #423161)

use static_cast

1093 ↗(On Diff #423161)

remove curly braces

bolt/lib/Rewrite/DWARFRewriter.cpp
166 ↗(On Diff #423161)

const Unit, AttrInfoVal

168 ↗(On Diff #423161)

const

208 ↗(On Diff #423161)

const DwarfVersion

304 ↗(On Diff #423161)

add else if for DWARF < 5 else case llvm_unreachable

322 ↗(On Diff #423161)

make finalizeSection virtual and do nothing for dwarf4.

335 ↗(On Diff #423161)

Make initSection virtual and do nothing for DWARF4

1175 ↗(On Diff #423161)
1719 ↗(On Diff #423161)

const

bolt/test/X86/dwarf4-df-dualcu.test
23 ↗(On Diff #423161)

dwarf4

ayermolo marked 5 inline comments as done.Apr 25 2022, 12:42 PM
ayermolo updated this revision to Diff 425001.Apr 25 2022, 12:43 PM

addressing comments

Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 12:43 PM
ayermolo updated this revision to Diff 425003.Apr 25 2022, 12:44 PM

oops forgot to switch branches

ayermolo added inline comments.Apr 25 2022, 12:49 PM
bolt/include/bolt/Core/DebugData.h
375–377 ↗(On Diff #423161)

The idea I had behind this APIS is for it to return unique ID representing the CU. As it applies to .debug_addr. For monolithic case is its offset with .debug_info for Split Dwarf case is its DWO ID. At least in my head viewed in that context the return data is the same. It's just something that uniquely identifies this CU, and logic is encapsulated in it.

ayermolo added inline comments.Apr 27 2022, 11:10 AM
bolt/include/bolt/Core/DebugData.h
375–377 ↗(On Diff #423161)

@dblaikie What do you think?

dblaikie added inline comments.Apr 28 2022, 4:30 PM
bolt/include/bolt/Core/DebugData.h
375–377 ↗(On Diff #423161)

Could you use the offset consistently? The debug_addr section is always in the executable - the offset of the skeleton CU would be unique?

ayermolo added inline comments.Apr 28 2022, 4:43 PM
bolt/include/bolt/Core/DebugData.h
375–377 ↗(On Diff #423161)

In theory yes, if I always use offset of the CU/Skeleton CU in the main binary. For that I still need to do a check of CU that is passed in is DWO CU, get it skeleton CU, then get it's offset. At the end what we get is offset all the time. Is it really better? Sure we always getting an offset, but intent of this API is not really get an offset but the unique ID of the CU. Are you concern that offset and DWO ID will collide?

ayermolo updated this revision to Diff 427461.May 5 2022, 2:44 PM

updated so that getCUID consistently returns CU/SkeletonCU offset.

maksfb accepted this revision.May 5 2022, 2:45 PM

Awesome!

This revision is now accepted and ready to land.May 5 2022, 2:45 PM
This revision was landed with ongoing or failed builds.May 5 2022, 2:59 PM
This revision was automatically updated to reflect the committed changes.