This is an archive of the discontinued LLVM Phabricator instance.

[BOLT][DWARF] Add version 5 split dwarf support
ClosedPublic

Authored by ayermolo on Apr 2 2022, 3:41 PM.

Details

Summary

Added support for DWARF5 Split Dwarf.

Diff Detail

Event Timeline

ayermolo created this revision.Apr 2 2022, 3:41 PM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: hoy, modimo, wenlei. · View Herald Transcript
ayermolo requested review of this revision.Apr 2 2022, 3:41 PM
Herald added a project: Restricted Project. · View Herald Transcript
ayermolo retitled this revision from [BOLT] DWARF5 split dwarf support to [BOLT][DWARF] DWARF5 split dwarf support.Apr 2 2022, 3:42 PM
ayermolo retitled this revision from [BOLT][DWARF] DWARF5 split dwarf support to [BOLT][DWARF] Add version 5 split dwarf support.Apr 15 2022, 5:08 PM

Probably worth changing the == 5 version checks to >= 5 since most things stay the same between DWARF versions, and the things that don't are the exceptions/things we'll need to touch on the next version (& nice to not have to touch/update code for things that stay the same)

bolt/include/bolt/Core/DebugData.h
377–379

That seems like a somewhat problematic API - returning two very different kinds of data (the DWO_ID or the unit offset) seems quite easy to misuse this?

388

Probably name the parameter the same as the member, rather than one character different/lower case.

bolt/lib/Rewrite/DWARFRewriter.cpp
210–212

Probably an uppercase V since that's a separate word within the name?

311
325

reinterpret_cast seems a bit scary here - perhaps TempRangesSectionWriter should be a void* if it could be a pointer to two unrelated types (if it's a pointer to a base class that covers both ranges and rnglists, then a static_cast should suffice for the downcasting from base pointer to derived pointer)

330–333

to avoid two lookups (one for count, the other for op[]) perhaps this code should use find?

Comments from @maksfb

bolt/include/bolt/Core/DebugData.h
228

delete

232

make reference

350

add documentation.

388

Pass BC by reference

401

isFinalized

404

releaseBuffer()

bolt/lib/Core/DebugData.cpp
211

const

359–360

const

418

AM --> AMIter

485–487

const
getCUId -->getCUID

1060

remove method, move to constructor

1071

const

1073

Darf --> Dwarf

1077

use static_cast

1095

remove curly braces

bolt/lib/Rewrite/DWARFRewriter.cpp
167

const Unit, AttrInfoVal

169

const

210–211

const DwarfVersion

309

add else if for DWARF < 5 else case llvm_unreachable

327

make finalizeSection virtual and do nothing for dwarf4.

337–338

Make initSection virtual and do nothing for DWARF4

1181
1725–1726

const

bolt/test/X86/dwarf4-df-dualcu.test
24

dwarf4

ayermolo marked 5 inline comments as done.Apr 25 2022, 12:42 PM
ayermolo updated this revision to Diff 425001.Apr 25 2022, 12:43 PM

addressing comments

Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 12:43 PM
ayermolo updated this revision to Diff 425003.Apr 25 2022, 12:44 PM

oops forgot to switch branches

ayermolo added inline comments.Apr 25 2022, 12:49 PM
bolt/include/bolt/Core/DebugData.h
377–379

The idea I had behind this APIS is for it to return unique ID representing the CU. As it applies to .debug_addr. For monolithic case is its offset with .debug_info for Split Dwarf case is its DWO ID. At least in my head viewed in that context the return data is the same. It's just something that uniquely identifies this CU, and logic is encapsulated in it.

ayermolo added inline comments.Apr 27 2022, 11:10 AM
bolt/include/bolt/Core/DebugData.h
377–379

@dblaikie What do you think?

dblaikie added inline comments.Apr 28 2022, 4:30 PM
bolt/include/bolt/Core/DebugData.h
377–379

Could you use the offset consistently? The debug_addr section is always in the executable - the offset of the skeleton CU would be unique?

ayermolo added inline comments.Apr 28 2022, 4:43 PM
bolt/include/bolt/Core/DebugData.h
377–379

In theory yes, if I always use offset of the CU/Skeleton CU in the main binary. For that I still need to do a check of CU that is passed in is DWO CU, get it skeleton CU, then get it's offset. At the end what we get is offset all the time. Is it really better? Sure we always getting an offset, but intent of this API is not really get an offset but the unique ID of the CU. Are you concern that offset and DWO ID will collide?

ayermolo updated this revision to Diff 427461.May 5 2022, 2:44 PM

updated so that getCUID consistently returns CU/SkeletonCU offset.

maksfb accepted this revision.May 5 2022, 2:45 PM

Awesome!

This revision is now accepted and ready to land.May 5 2022, 2:45 PM
This revision was landed with ongoing or failed builds.May 5 2022, 2:59 PM
This revision was automatically updated to reflect the committed changes.