This is an archive of the discontinued LLVM Phabricator instance.

[BOLT][DWARF] Add version 5 split dwarf support
ClosedPublic

Authored by ayermolo on Apr 2 2022, 3:41 PM.

Details

Summary

Added support for DWARF5 Split Dwarf.

Diff Detail

Event Timeline

ayermolo created this revision.Apr 2 2022, 3:41 PM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: hoy, modimo, wenlei. · View Herald Transcript
ayermolo requested review of this revision.Apr 2 2022, 3:41 PM
Herald added a project: Restricted Project. · View Herald Transcript
ayermolo retitled this revision from [BOLT] DWARF5 split dwarf support to [BOLT][DWARF] DWARF5 split dwarf support.Apr 2 2022, 3:42 PM
ayermolo retitled this revision from [BOLT][DWARF] DWARF5 split dwarf support to [BOLT][DWARF] Add version 5 split dwarf support.Apr 15 2022, 5:08 PM

Probably worth changing the == 5 version checks to >= 5 since most things stay the same between DWARF versions, and the things that don't are the exceptions/things we'll need to touch on the next version (& nice to not have to touch/update code for things that stay the same)

bolt/include/bolt/Core/DebugData.h
376–378

That seems like a somewhat problematic API - returning two very different kinds of data (the DWO_ID or the unit offset) seems quite easy to misuse this?

387

Probably name the parameter the same as the member, rather than one character different/lower case.

bolt/lib/Rewrite/DWARFRewriter.cpp
210–212

Probably an uppercase V since that's a separate word within the name?

308
322

reinterpret_cast seems a bit scary here - perhaps TempRangesSectionWriter should be a void* if it could be a pointer to two unrelated types (if it's a pointer to a base class that covers both ranges and rnglists, then a static_cast should suffice for the downcasting from base pointer to derived pointer)

329–331

to avoid two lookups (one for count, the other for op[]) perhaps this code should use find?

Comments from @maksfb

bolt/include/bolt/Core/DebugData.h
220

delete

231

make reference

348

add documentation.

387

Pass BC by reference

400

isFinalized

403

releaseBuffer()

bolt/lib/Core/DebugData.cpp
210

const

358–359

const

415–417

AM --> AMIter

484–486

const
getCUId -->getCUID

1062

remove method, move to constructor

1073

const

1075

Darf --> Dwarf

1079

use static_cast

1097

remove curly braces

bolt/lib/Rewrite/DWARFRewriter.cpp
168

const Unit, AttrInfoVal

170

const

210–211

const DwarfVersion

306

add else if for DWARF < 5 else case llvm_unreachable

324

make finalizeSection virtual and do nothing for dwarf4.

337–338

Make initSection virtual and do nothing for DWARF4

1178
1722

const

bolt/test/X86/dwarf4-df-dualcu.test
24

dwarf4

ayermolo marked 5 inline comments as done.Apr 25 2022, 12:42 PM
ayermolo updated this revision to Diff 425001.Apr 25 2022, 12:43 PM

addressing comments

Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 12:43 PM
ayermolo updated this revision to Diff 425003.Apr 25 2022, 12:44 PM

oops forgot to switch branches

ayermolo added inline comments.Apr 25 2022, 12:49 PM
bolt/include/bolt/Core/DebugData.h
376–378

The idea I had behind this APIS is for it to return unique ID representing the CU. As it applies to .debug_addr. For monolithic case is its offset with .debug_info for Split Dwarf case is its DWO ID. At least in my head viewed in that context the return data is the same. It's just something that uniquely identifies this CU, and logic is encapsulated in it.

ayermolo added inline comments.Apr 27 2022, 11:10 AM
bolt/include/bolt/Core/DebugData.h
376–378

@dblaikie What do you think?

dblaikie added inline comments.Apr 28 2022, 4:30 PM
bolt/include/bolt/Core/DebugData.h
376–378

Could you use the offset consistently? The debug_addr section is always in the executable - the offset of the skeleton CU would be unique?

ayermolo added inline comments.Apr 28 2022, 4:43 PM
bolt/include/bolt/Core/DebugData.h
376–378

In theory yes, if I always use offset of the CU/Skeleton CU in the main binary. For that I still need to do a check of CU that is passed in is DWO CU, get it skeleton CU, then get it's offset. At the end what we get is offset all the time. Is it really better? Sure we always getting an offset, but intent of this API is not really get an offset but the unique ID of the CU. Are you concern that offset and DWO ID will collide?

ayermolo updated this revision to Diff 427461.May 5 2022, 2:44 PM

updated so that getCUID consistently returns CU/SkeletonCU offset.

maksfb accepted this revision.May 5 2022, 2:45 PM

Awesome!

This revision is now accepted and ready to land.May 5 2022, 2:45 PM
This revision was landed with ongoing or failed builds.May 5 2022, 2:59 PM
This revision was automatically updated to reflect the committed changes.