This is an archive of the discontinued LLVM Phabricator instance.

[WIP][ELFDumper] Refactor some of the helper functions to a header file.
Needs ReviewPublic

Authored by ayermolo on Jul 27 2021, 5:39 PM.

Details

Summary

Refactoring some of the helper functions and structs in to a header file so they can be used by other, downstream, tools. Second part will be moving it under llvm/include/*.

Diff Detail

Event Timeline

ayermolo created this revision.Jul 27 2021, 5:39 PM
ayermolo requested review of this revision.Jul 27 2021, 5:39 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2021, 5:39 PM

Could you provide some explanation as to why you need this code in your downstream tools, please? On the surface of it, I don't like this change (for a start, there's too much implementation in headers). An explanation of your use-cases may help provide an alternative approach.

And note that tools/llvm-readobj/ELFDumper.h is a public API header. llvm/include/llvm/**/*.h are public API headers.

Ah sorry was going to post on llvm-dev as RFC, but got pulled in to something else internally.
My motivation as it was previously to improve user experience of none compiler people dealing with compiler errors and reduce the work of dev teams that support them.
Right now when relocation overflow occurs it's a manual process to figure out what the layout is.

  1. Isolate linking command
  2. add a flag to dump map file
  3. run SED command. Suggested by @MaskRay in another review.
  4. look through not so user friendly output.

I understand @MaskRay doesn't want additional code in LLD in upstream because it ads to maintenance. I am hoping that upstream community will at least be OK with refactor of helper functions so that downstream changes are more self contained.
The end goal is when relocation error occurs part of error message is a nice looking layout print out in same format as llvm-readelf --sections that user can take a look at and see what is going wrong, or ask dev team. Without repeating aforementioned steps every single time.

For header layout comment. When I refactored llvm-dwp in to a library David Blaikie asked me to split it to two commits. One refactor, and second moving files in under llvm/include/* llvm/lib/*. So I am following the same pattern here.
For too much implementation in the headers. I don't really have a strong opinion about it. I was hoping to get feedback from community. In the headers it can just be inlined without using LTO, but standard approach of linked library also works. Probably doesn't matter since this is not on a critical execution path.

Sorry for not coming back to this sooner. I've been both away and busy. I recommend you try writing an RFC for the mailing list, as it will get more attention.

From what I can see, all the code you're moving around is to do with actual printing, not parsing, of objects. In my mind, parsing code can and probably should live in the Object library, and printing code should be specific to each individual tool, so that the impact of output format changes are restricted in scope. It also ensures each tool has its own specific formatting, appropriate for its individual needs - it is extremely rare that one tool's output is appropriate for other tools. If it were, why do those tools need to exist after all? In your case, would a simple script which performs the same operations as you hope to achieve with your tool suffice? This could obviously shell out to llvm-readobj for the printing. That all being said, there might be an appetite in the wider community for refactoring the llvm-readobj dumping code into a library so other tools can share the dumping functionality in the future - you'd need to ask.

More generally, a refactor like this in its current form is likely to be reversed at some future point by somebody who doesn't know the context of this change ("Why are all these functions in the header file, when they can just be defined in the .cpp?"), and so will just break your tool then. If we are going to propose sharing the llvm-readobj code wider, I'd create a dedicated Readobj or ObjectPrinting library, or similar, that can be used elsewhere. In such a case, you'd want to move ALL the llvm-readobj printing code (i.e. basically all of the *Dumper source) into the library, leaving the tool to read the command-line, load the object(s) and call the appropriate functions. Once the code is in a library, it is much easier to work with, and much less likely to get broken at some future point.