This is a step1, mechanical refactor, of moving the bulk of llvm-dwp functionality in to a library. This should allow other tools, like BOLT, to re-use some of the llvm-dwp functionality.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
No dobut we've discussed this before, but could you refresh my memory on why bolt needs access to dwp rewriting?
Is it possible we could emit bolt-compatible DWARF instead (I imagine we'll want to do this for Propeller anyway) - for instance if Bolt can change the length of a code range (so DWARF's use of offsets relative to known addresses would become inaccurate) perhaps when code generating we could know that those ranges could change and instead emit addresses rather than offsets? Thoese addresses would go in the address pool and be modifiable by Bolt without having to rewrite the dwp (this will significantly increase the .o/executable size of the DWARF due to the use of more address relocations, but may still be worthwhile/the right direction, possibly).
The main motivation is to simplify the compilation pipeline, with some speed up. Right now we build pre-bolt binary with DWO sections in .dwo/.o, then we run bolt which writes out modified .dwo. Final step we run llvm-dwp which reads all of this in, then writes out again. With bolt supporting dwp as output we can short circuit running llvm-dwp and just write it out directly.
Supporting DWP as input, is mainly just to make it feature complete. It was a low hanging fruit.
I guess on thing that can simplify BOLT. In some cases skeleton CU doesn't have DW_AT_GNU_ranges_base. In general bolt converts DW_AT_low_pc/DW_AT_high_pc to ranges. If Skeleton CU doesn't have DW_AT_GNU_ranges_base we replace DW_AT_low_pc with it. Except DW_AT_GNU_ranges_base doesn't fit in 8 bits. So Abbrev section can't be patched, and I ended up re-writing it. Not sure what can be done there. Maybe llvm can always emit DW_AT_GNU_ranges_base in Skeleton CU?
Is there no chance of the compiler knowing ahead of time what the cut-points are and emitting DWARF that already accounts for that?
If it's only about ranges - yeah, we could have a mode where /everything/ is described by a range, though Split DWARF doesn't provide a way for DIEs in the dwo to references ranges in the linked binary (they reference .debug_rnglists.dwo instead) - even if those ranges used startx_endx encodings, you couldn't change the number of ranges needed to describe something (so you couldn't split a contiguous sequence into a range by only modifying the linked debug info, without touching the .dwo/dwp). So, yeah, without something more like Propeller, where the cut-points are known at compile-time, modifying dwo/dwp seems necessary.
Thanks for walking me through it again.
Perhaps this should be in a subdirectory of "DebugInfo" rather than in its own top level directory? Eh, I guess if DWARFLinker is in its own top-level directory then so could this.
Could you split the change up a bit - do some/all of the library refactoring in-place before moving directories? It'd make it easier to separate/undertsand the changes.
Not sure why reviews tags both llvm-dwp and DWPLibrary.cpp as copied from each other.
Bolt introduces new ranges by splitting hot/cold sections, etc.
Side note. Currently BOLT only supports DWARF 4. I know GCC turned on DWARF 5 by default. Are there plans on llvm side?
Haven't heard any particular discussions - Google switched to DWARFv5 by default several months ago, haven't heard Sony or Apple discussing it.
llvm/tools/llvm-dwp/CMakeLists.txt | ||
---|---|---|
9 | I'd probably call this DWP.cpp to match DWP.h? |
@dblaikie can I land or should I wait for @jdoerfert
Looking at failures, they look unrelated
nah, that's fine - you can go ahead. Please update the patch description/commit message to reflect that this patch isn't creating a library, but is the preliminary changes within the llvm-dwp binary.
I'd probably call this DWP.cpp to match DWP.h?