This is an initial implementation of lvm-objcopy for XCOFF32.
Currently only supports simple copying, op-passthrough to follow.
Details
- Reviewers
jhenderson hubert.reinterpretcast MaskRay jsji shchenz Higuoxing alexander-shaposhnikov rupprecht - Group Reviewers
Restricted Project - Commits
- rG61835d19a848: [llvm-objcopy] Initial XCOFF32 support.
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thanks for adding objcopy support on AIX.
Some comments from me.
I have not finished all of the code reviewing yet.
llvm/include/llvm/Object/XCOFFObjectFile.h | ||
---|---|---|
413 | I think it should be ok to call fileHeader32 directly in the Reader? | |
447 | We already have a getSectionContents API in this class, can we reuse that function? | |
llvm/tools/llvm-objcopy/XCOFF/Reader.cpp | ||
20 | use &? | |
llvm/tools/llvm-objcopy/XCOFF/XCOFFObjcopy.cpp | ||
40 | From this comments, seems for now llvm-objcopy for XCOFF does not take any options? Even for -h or -v? Maybe we need a new method to tell whether an option is specified instead of listing all of them. |
Seems the functionality of the llvm-objcopy is not right. Could you please help to check? And maybe we need to add some other tests for the file format.
1:
$ llvm-objcopy 32.o copy_nodebug.o $ dump -X32 -tv copy_nodebug.o copy_nodebug.o: ***Symbol Table Information*** [Index] m Value Scn Aux Sclass Type Name [Index] a0 Fname [Index] a1 Tagndx Lnno Size Lnoptr Endndx [Index] a2 Tagndx Fsiz Lnoptr Endndx [Index] a3 Tagndx Lnno Size Dimensions [Index] a4 CSlen PARMhsh SNhash SMtype SMclass Stab SNstab [Index] a5 SECTlen #RELent #LINnums [0] m 0x00000000 undef 1 extern errno [1] a4 0x05e80000 a1f00 8192 US ?? 48 7936 [2] m 0x00000000 undef 1 extern exit [3] a4 0x1f002000 5f80000 4 ?? PR 536872444 0 [4] m 0x00000000 undef 1 extern atexit [5] a4 0x0000000e1f002000 1544 ER PR 7936 8192 dump: copy_nodebug.o: dump: 0654-103 Cannot read from the specified file. [6] m 0x00000000 undef 1 extern **Invalid Symbol Name**
2: (with DWARF info)
$llvm-dwarfdump -a copy_debug.o Program aborted due to an unhandled Error: The end of the file was unexpectedly encountered 0. Program arguments: llvm-dwarfdump -a copy.o IOT/Abort trap (core dumped)
llvm/include/llvm/Object/XCOFFObjectFile.h | ||
---|---|---|
413 | If so, I think it's better to make the private function public. |
Good catch, thanks!
I will add the support for writing symbol name to StringTable in next update, which may fix the issue.
2: (with DWARF info)
$llvm-dwarfdump -a copy_debug.o Program aborted due to an unhandled Error: The end of the file was unexpectedly encountered 0. Program arguments: llvm-dwarfdump -a copy.o IOT/Abort trap (core dumped)
The patch does not support reading and writing debug info. Do you think I should add it to the patch?
The patch does not support reading and writing debug info. Do you think I should add it to the patch?
Should we distinguish the DWARF sections and the non DWARF sections for now? From the XCOFFReader and XCOFFWriter class, I can not tell why and how the DWARF sections are skipped.
The DWARF-related failure still exists in the latest update. I will have a look into the problem.
You need testing for your error paths, I think.
llvm/test/tools/llvm-objcopy/XCOFF/basic-copy.test | ||
---|---|---|
8 | ||
10–19 | Maybe add a comment to the final section consisting solely of "Flags" to clarify that it is a nameless and empty section? It might make more sense to put the empty section earlier in the sections too, to show the impact (or lack thereof) it has on the later section offsets. | |
12 | Do we really need this much data? Seems to me like a few arbitrary bytes (e.g. "12345678") would be sufficient. | |
llvm/tools/llvm-objcopy/XCOFF/Reader.cpp | ||
42 | Should this be a const &? | |
llvm/tools/llvm-objcopy/XCOFF/Writer.cpp | ||
20–22 | Why not as inline? | |
21 | "header" rather than "headers" maybe? Is there more than one file header after all? | |
26 | Rather than these comments, maybe it would be clearer if you split this function into multiple functions, e.g. finalizeHeader, finalizeSections etc. | |
65 | Similar to above, I'd move these blocks into separate functions. That way, if they need to increase in complexity here, it's easy to do so. | |
87 | Rel or Reloc are generally more common abbreviations for "Relocation" in other parts of the code. I'd recommend renaming here to match. | |
llvm/tools/llvm-objcopy/XCOFF/XCOFFObjcopy.h | ||
34 | Nit: add new line at EOF. |
Aside from a few nits, and the acknowledged missing error path testing, this seems reasonable to me, but requires someone with XCOFF knowledge to review too.
llvm/tools/llvm-objcopy/XCOFF/Reader.cpp | ||
---|---|---|
52 | SymbolRef is designed to be easily copyable. No need for const &. | |
llvm/tools/llvm-objcopy/XCOFF/Writer.cpp | ||
45 | ||
llvm/tools/llvm-objcopy/XCOFF/XCOFFObjcopy.h | ||
34 | Now there are too many new lines :D Delete line 33. There should be a '\n' at the end of line 32, as the last byte in the file. | |
llvm/tools/llvm-objcopy/llvm-objcopy.cpp | ||
209 | Are there likely to be any XCOFF-specific options added in the future? If not, I wonder if we could drop XCOFFConfig entirely. We'd still want the check for unsupported options, but perhaps that should be in a different function then (e.g. Config.checkXCOFFOptions()). What do you think? (If there are XCOFF-specific options that are planned, having a separate config, even though empty for now, is fine). |
- Address comment.
- Add fields of OptionalFileHeader, AuxSymbolEntries, and LineNumInfo, otherwise an error will occur when using the compiled binary as input. (OptionalFileHeader and AuxSymbolEntries are not supported yet, just copy their contents directly.)
- Add error tests except for the error path in getLineNumberInfo because yaml2obj does not support it.
- Add a binary test with LineNumInfo and AuxSymbolEntries.
llvm/tools/llvm-objcopy/llvm-objcopy.cpp | ||
---|---|---|
209 | There isn't a concrete plan yet, but I think it is quite possible to add XCOFF-specific options in the future. So I think it's fine to keep this? |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
811 ↗ | (On Diff #373181) | Be consistent in the style of your assert messages - the version above has no trailing full stop, but this one. There isn't a single style specified for asserts in LLVM, but for error messages, the agreed-upon style, where surrounding code does not differ from it, is no leading capital letter, and no trailing full stop. |
915 ↗ | (On Diff #373181) | |
llvm/test/tools/llvm-objcopy/XCOFF/binary-copy.test | ||
1–2 ↗ | (On Diff #373181) | Is there a particular need for you to get the llvm-objcopy implementation landed, before you have added the yaml2obj functionality? By adding a binary to the git repository, it will always stay in the repo history, even if it is later deleted, using up disk space on everybody's machines, slowing down clone times etc. |
llvm/test/tools/llvm-objcopy/XCOFF/invalid-section.test | ||
8 ↗ | (On Diff #373181) | Here and below, these error messages need improving to include additional context about what has gone wrong. For example, which section couldn't be read? What was its expected size and offset? Potentially event what was the file size? Take a look at some examples from other tests: |
llvm/tools/llvm-objcopy/XCOFF/Object.h | ||
33 | Add TODO: to these comments. | |
llvm/tools/llvm-objcopy/XCOFF/Reader.cpp | ||
71–74 | Do we need something to protect us from accidentally running off the end of the file, if it's been truncated somehow? | |
91–93 | As above, do we need to check to make sure we don't end up reading off the end of the file? I think it might be better to emit an error and stop, saying "optional headers are not yet supported" or something to that effect, so that we don't end up producing a potentially invalid output. Same goes with the aux symbol entries. | |
95 | Typo? | |
llvm/tools/llvm-objcopy/XCOFF/Writer.cpp | ||
66 | (but see also my above comment) |
@jhenderson Thank you so much for your comments!
I will rework this patch after implementing interfaces such as aux symbol and optional header and adding yaml2obj functionality.
- Addressed comments.
- Replaced binary inputs with YAMLs for testing.
- Updated tests for error paths.
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
702 ↗ | (On Diff #400432) | Why is Start a uintptr_t rather than either a const char * or const uint8_t * (or even a StringRef or ArrayRef<uint8_t>)? I also don't see this test being tested. |
706 ↗ | (On Diff #400432) | Rather than "the raw data" here, perhaps you should pass in a "Name" parameter which is used to describe what the raw data is, e.g. a symbol or similar, so that the error message is more meaningful: "symbol with offset 0x1234 and size 0x4321 goes past the end of the file". |
708 ↗ | (On Diff #400432) | |
llvm/test/tools/llvm-objcopy/XCOFF/invalid-read.test | ||
4 ↗ | (On Diff #400432) | I'd delete the "Error1: " bit of this comment (and similar bits in others), as it doesn't really add much value, but also has a risk of being confused with the FileCheck check prefix ERROR1. Also, there's a copy-and-paste error further down with it :) |
llvm/tools/llvm-objcopy/XCOFF/Reader.cpp | ||
71–74 | I'm probably being dumb, but where is this checked? |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
702 ↗ | (On Diff #400432) | Hmm, the function is called by objcopy when reading aux symbols. It looks like we can test this by making the StartOffset + SymbolEntrySize * NumberOfAuxEntries exceed the end of the file, where the NumberOfAuxEntries is user definable in the yaml, but I still can't construct such a case because we use zeros padding when the defined number is greater than the actual number of aux symbols in yaml2obj. |
llvm/tools/llvm-objcopy/XCOFF/Reader.cpp | ||
71–74 |
It was a comment in reading auxiliary symbols. Now I read the data by calling XCOFFObjectFile::getRawData, which has an offset check. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
702 ↗ | (On Diff #400432) | Fair enough. I'd add a TODO comment saying this path is untested. ELF yaml2obj has support for raw "Content" fields that allow overriding the contents of a section. It may be worth considering something similar for XCOFF to test this case. |
704 ↗ | (On Diff #409475) | Name should be a StringRef. |
710 ↗ | (On Diff #409475) | Rather than this awkward "go/goes", could you use "goes" only, and use singular terms for the Name parameter, e.g. "relocation data" or "symbol data" rather than "relocations" or "symbols"? |
Addressed comments.
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
702 ↗ | (On Diff #400432) | Nice suggestion, thx! |
Thanks, looks good. You should get somebody with XCOFF knowledge to take another look though too.
Oh, forgot to mention: you need to revise this patch on @avl's work to move objcopy code into a library, as it will clash.
LGTM with one nit. Thanks for adding this support.
Have you tried this patch with objects generated by other tools, like XLC?
llvm/lib/ObjCopy/XCOFF/XCOFFReader.cpp | ||
---|---|---|
45 ↗ | (On Diff #410692) | Can we use std::move(ReadSec) here to avoid unnecessary copy? |
70 ↗ | (On Diff #410692) | Can we use std::move(ReadSym) here to avoid unnecessary copy? |
I think it should be ok to call fileHeader32 directly in the Reader?