This is an archive of the discontinued LLVM Phabricator instance.

[BOLT][DWARF] Add version 5 split dwarf support
ClosedPublic

Authored by ayermolo on Apr 2 2022, 3:41 PM.

Details

Summary

Added support for DWARF5 Split Dwarf.

Diff Detail

Event Timeline

ayermolo created this revision.Apr 2 2022, 3:41 PM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: hoy, modimo, wenlei. · View Herald Transcript
ayermolo requested review of this revision.Apr 2 2022, 3:41 PM
Herald added a project: Restricted Project. · View Herald Transcript
ayermolo retitled this revision from [BOLT] DWARF5 split dwarf support to [BOLT][DWARF] DWARF5 split dwarf support.Apr 2 2022, 3:42 PM
ayermolo retitled this revision from [BOLT][DWARF] DWARF5 split dwarf support to [BOLT][DWARF] Add version 5 split dwarf support.Apr 15 2022, 5:08 PM

Probably worth changing the == 5 version checks to >= 5 since most things stay the same between DWARF versions, and the things that don't are the exceptions/things we'll need to touch on the next version (& nice to not have to touch/update code for things that stay the same)

bolt/include/bolt/Core/DebugData.h
375–377

That seems like a somewhat problematic API - returning two very different kinds of data (the DWO_ID or the unit offset) seems quite easy to misuse this?

386

Probably name the parameter the same as the member, rather than one character different/lower case.

bolt/lib/Rewrite/DWARFRewriter.cpp
208–210

Probably an uppercase V since that's a separate word within the name?

306
320

reinterpret_cast seems a bit scary here - perhaps TempRangesSectionWriter should be a void* if it could be a pointer to two unrelated types (if it's a pointer to a base class that covers both ranges and rnglists, then a static_cast should suffice for the downcasting from base pointer to derived pointer)

327–329

to avoid two lookups (one for count, the other for op[]) perhaps this code should use find?

Comments from @maksfb

bolt/include/bolt/Core/DebugData.h
219

delete

230

make reference

348

add documentation.

386

Pass BC by reference

399

isFinalized

402

releaseBuffer()

bolt/lib/Core/DebugData.cpp
210

const

358–359

const

417

AM --> AMIter

483–485

const
getCUId -->getCUID

1058

remove method, move to constructor

1069

const

1071

Darf --> Dwarf

1075

use static_cast

1093

remove curly braces

bolt/lib/Rewrite/DWARFRewriter.cpp
166

const Unit, AttrInfoVal

168

const

208–209

const DwarfVersion

304

add else if for DWARF < 5 else case llvm_unreachable

322

make finalizeSection virtual and do nothing for dwarf4.

335–336

Make initSection virtual and do nothing for DWARF4

1175
1719

const

bolt/test/X86/dwarf4-df-dualcu.test
24

dwarf4

ayermolo marked 5 inline comments as done.Apr 25 2022, 12:42 PM
ayermolo updated this revision to Diff 425001.Apr 25 2022, 12:43 PM

addressing comments

Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 12:43 PM
ayermolo updated this revision to Diff 425003.Apr 25 2022, 12:44 PM

oops forgot to switch branches

ayermolo added inline comments.Apr 25 2022, 12:49 PM
bolt/include/bolt/Core/DebugData.h
375–377

The idea I had behind this APIS is for it to return unique ID representing the CU. As it applies to .debug_addr. For monolithic case is its offset with .debug_info for Split Dwarf case is its DWO ID. At least in my head viewed in that context the return data is the same. It's just something that uniquely identifies this CU, and logic is encapsulated in it.

ayermolo added inline comments.Apr 27 2022, 11:10 AM
bolt/include/bolt/Core/DebugData.h
375–377

@dblaikie What do you think?

dblaikie added inline comments.Apr 28 2022, 4:30 PM
bolt/include/bolt/Core/DebugData.h
375–377

Could you use the offset consistently? The debug_addr section is always in the executable - the offset of the skeleton CU would be unique?

ayermolo added inline comments.Apr 28 2022, 4:43 PM
bolt/include/bolt/Core/DebugData.h
375–377

In theory yes, if I always use offset of the CU/Skeleton CU in the main binary. For that I still need to do a check of CU that is passed in is DWO CU, get it skeleton CU, then get it's offset. At the end what we get is offset all the time. Is it really better? Sure we always getting an offset, but intent of this API is not really get an offset but the unique ID of the CU. Are you concern that offset and DWO ID will collide?

ayermolo updated this revision to Diff 427461.May 5 2022, 2:44 PM

updated so that getCUID consistently returns CU/SkeletonCU offset.

maksfb accepted this revision.May 5 2022, 2:45 PM

Awesome!

This revision is now accepted and ready to land.May 5 2022, 2:45 PM
This revision was landed with ongoing or failed builds.May 5 2022, 2:59 PM
This revision was automatically updated to reflect the committed changes.