Page MenuHomePhabricator

[llvm-objcopy] Initial XCOFF32 support.
ClosedPublic

Authored by Esme on Feb 28 2021, 7:07 PM.

Details

Summary

This is an initial implementation of lvm-objcopy for XCOFF32.
Currently only supports simple copying, op-passthrough to follow.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Esme edited reviewers, added: shchenz; removed: steven.zhang.Apr 1 2021, 12:08 AM

Addressed @shchenz's comments and replace Buffe with streams based on D91028.

Esme added inline comments.Apr 1 2021, 12:10 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
430

fileHeader32() is a private member, so I added the public function getFileHeader32().

llvm/tools/llvm-objcopy/XCOFF/XCOFFObjcopy.cpp
40

I have no idea about the new method, so I chose to list all of these options, just like what other targets have done.

shchenz added a comment.EditedApr 8 2021, 1:48 AM

Seems the functionality of the llvm-objcopy is not right. Could you please help to check? And maybe we need to add some other tests for the file format.

1:

$ llvm-objcopy 32.o copy_nodebug.o
$ dump -X32 -tv copy_nodebug.o      

copy_nodebug.o:

                        ***Symbol Table Information***
[Index] m        Value       Scn   Aux  Sclass    Type            Name
[Index] a0                                                        Fname
[Index] a1      Tagndx      Lnno  Size  Lnoptr    Endndx
[Index] a2      Tagndx            Fsiz  Lnoptr    Endndx
[Index] a3      Tagndx      Lnno  Size  Dimensions
[Index] a4   CSlen     PARMhsh SNhash SMtype SMclass Stab SNstab
[Index] a5      SECTlen    #RELent    #LINnums

[0]     m   0x00000000     undef     1  extern                    errno
[1]     a4  0x05e80000   a1f00 8192     US       ??   48 7936
[2]     m   0x00000000     undef     1  extern                    exit
[3]     a4  0x1f002000 5f80000    4     ??       PR 536872444    0
[4]     m   0x00000000     undef     1  extern                    atexit
[5]     a4  0x0000000e1f002000 1544     ER       PR 7936 8192
dump: copy_nodebug.o: dump: 0654-103 Cannot read from the specified file.
[6]     m   0x00000000     undef     1  extern                    **Invalid Symbol Name**

2: (with DWARF info)

$llvm-dwarfdump -a copy_debug.o
Program aborted due to an unhandled Error:
The end of the file was unexpectedly encountered
0.      Program arguments: llvm-dwarfdump -a copy.o
IOT/Abort trap (core dumped)
llvm/include/llvm/Object/XCOFFObjectFile.h
430

If so, I think it's better to make the private function public.

Esme added a comment.Apr 8 2021, 2:12 AM

Good catch, thanks!

Seems the functionality of the llvm-objcopy is not right. Could you please help to check? And maybe we need to add some other tests for the file format.

1:

$ llvm-objcopy 32.o copy_nodebug.o
$ dump -X32 -tv copy_nodebug.o      
...
dump: copy_nodebug.o: dump: 0654-103 Cannot read from the specified file.
[6]     m   0x00000000     undef     1  extern                    **Invalid Symbol Name**

I will add the support for writing symbol name to StringTable in next update, which may fix the issue.

2: (with DWARF info)

$llvm-dwarfdump -a copy_debug.o
Program aborted due to an unhandled Error:
The end of the file was unexpectedly encountered
0.      Program arguments: llvm-dwarfdump -a copy.o
IOT/Abort trap (core dumped)

The patch does not support reading and writing debug info. Do you think I should add it to the patch?

The patch does not support reading and writing debug info. Do you think I should add it to the patch?

Should we distinguish the DWARF sections and the non DWARF sections for now? From the XCOFFReader and XCOFFWriter class, I can not tell why and how the DWARF sections are skipped.

Esme updated this revision to Diff 338003.Apr 16 2021, 12:04 AM

Added the support for writing symbol names to the string table.

Esme added a comment.Apr 16 2021, 12:08 AM

The patch does not support reading and writing debug info. Do you think I should add it to the patch?

Should we distinguish the DWARF sections and the non DWARF sections for now? From the XCOFFReader and XCOFFWriter class, I can not tell why and how the DWARF sections are skipped.

The DWARF-related failure still exists in the latest update. I will have a look into the problem.

Esme updated this revision to Diff 341768.Apr 29 2021, 9:41 PM

Rebase on D85774.

You need testing for your error paths, I think.

llvm/test/tools/llvm-objcopy/XCOFF/basic-copy.test
8
10–19

Maybe add a comment to the final section consisting solely of "Flags" to clarify that it is a nameless and empty section? It might make more sense to put the empty section earlier in the sections too, to show the impact (or lack thereof) it has on the later section offsets.

12

Do we really need this much data? Seems to me like a few arbitrary bytes (e.g. "12345678") would be sufficient.

llvm/tools/llvm-objcopy/XCOFF/Reader.cpp
42

Should this be a const &?

llvm/tools/llvm-objcopy/XCOFF/Writer.cpp
20–22

Why not as inline?

21

"header" rather than "headers" maybe? Is there more than one file header after all?

26

Rather than these comments, maybe it would be clearer if you split this function into multiple functions, e.g. finalizeHeader, finalizeSections etc.

65

Similar to above, I'd move these blocks into separate functions. That way, if they need to increase in complexity here, it's easy to do so.

87

Rel or Reloc are generally more common abbreviations for "Relocation" in other parts of the code. I'd recommend renaming here to match.

llvm/tools/llvm-objcopy/XCOFF/XCOFFObjcopy.h
33

Nit: add new line at EOF.

Esme updated this revision to Diff 372631.Sep 14 2021, 10:25 PM
Esme edited the summary of this revision. (Show Details)

Rebased the patch.
Error path test to be added...

Aside from a few nits, and the acknowledged missing error path testing, this seems reasonable to me, but requires someone with XCOFF knowledge to review too.

llvm/tools/llvm-objcopy/XCOFF/Reader.cpp
53

SymbolRef is designed to be easily copyable. No need for const &.

llvm/tools/llvm-objcopy/XCOFF/Writer.cpp
46
llvm/tools/llvm-objcopy/XCOFF/XCOFFObjcopy.h
33

Now there are too many new lines :D

Delete line 33. There should be a '\n' at the end of line 32, as the last byte in the file.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
209

Are there likely to be any XCOFF-specific options added in the future? If not, I wonder if we could drop XCOFFConfig entirely. We'd still want the check for unsupported options, but perhaps that should be in a different function then (e.g. Config.checkXCOFFOptions()). What do you think?

(If there are XCOFF-specific options that are planned, having a separate config, even though empty for now, is fine).

Esme updated this revision to Diff 373181.Sep 17 2021, 3:48 AM
  1. Address comment.
  2. Add fields of OptionalFileHeader, AuxSymbolEntries, and LineNumInfo, otherwise an error will occur when using the compiled binary as input. (OptionalFileHeader and AuxSymbolEntries are not supported yet, just copy their contents directly.)
  3. Add error tests except for the error path in getLineNumberInfo because yaml2obj does not support it.
  4. Add a binary test with LineNumInfo and AuxSymbolEntries.
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
209

There isn't a concrete plan yet, but I think it is quite possible to add XCOFF-specific options in the future. So I think it's fine to keep this?

jhenderson added inline comments.Sep 20 2021, 1:44 AM
llvm/lib/Object/XCOFFObjectFile.cpp
811

Be consistent in the style of your assert messages - the version above has no trailing full stop, but this one.

There isn't a single style specified for asserts in LLVM, but for error messages, the agreed-upon style, where surrounding code does not differ from it, is no leading capital letter, and no trailing full stop.

915
llvm/test/tools/llvm-objcopy/XCOFF/binary-copy.test
1–2

Is there a particular need for you to get the llvm-objcopy implementation landed, before you have added the yaml2obj functionality? By adding a binary to the git repository, it will always stay in the repo history, even if it is later deleted, using up disk space on everybody's machines, slowing down clone times etc.

llvm/test/tools/llvm-objcopy/XCOFF/invalid-section.test
8

Here and below, these error messages need improving to include additional context about what has gone wrong. For example, which section couldn't be read? What was its expected size and offset? Potentially event what was the file size?

Take a look at some examples from other tests:
llvm-objcopy\ELF\invalid-p_filesz-p_offset.test
llvm-readobj\ELF\addrsig.test (this one uses warnings, but the same message context is still important)

llvm/tools/llvm-objcopy/XCOFF/Object.h
32

Add TODO: to these comments.

llvm/tools/llvm-objcopy/XCOFF/Reader.cpp
70–73

Do we need something to protect us from accidentally running off the end of the file, if it's been truncated somehow?

90–92

As above, do we need to check to make sure we don't end up reading off the end of the file?

I think it might be better to emit an error and stop, saying "optional headers are not yet supported" or something to that effect, so that we don't end up producing a potentially invalid output. Same goes with the aux symbol entries.

94

Typo?

llvm/tools/llvm-objcopy/XCOFF/Writer.cpp
65

(but see also my above comment)

Esme planned changes to this revision.Sep 23 2021, 4:39 AM

@jhenderson Thank you so much for your comments!
I will rework this patch after implementing interfaces such as aux symbol and optional header and adding yaml2obj functionality.

Esme updated this revision to Diff 400432.Jan 16 2022, 9:25 PM
Esme marked 2 inline comments as done.
  1. Addressed comments.
  2. Replaced binary inputs with YAMLs for testing.
  3. Updated tests for error paths.
jhenderson added inline comments.Feb 1 2022, 1:52 AM
llvm/lib/Object/XCOFFObjectFile.cpp
589

Why is Start a uintptr_t rather than either a const char * or const uint8_t * (or even a StringRef or ArrayRef<uint8_t>)?

I also don't see this test being tested.

593

Rather than "the raw data" here, perhaps you should pass in a "Name" parameter which is used to describe what the raw data is, e.g. a symbol or similar, so that the error message is more meaningful: "symbol with offset 0x1234 and size 0x4321 goes past the end of the file".

595
llvm/test/tools/llvm-objcopy/XCOFF/invalid-read.test
4 ↗(On Diff #400432)

I'd delete the "Error1: " bit of this comment (and similar bits in others), as it doesn't really add much value, but also has a risk of being confused with the FileCheck check prefix ERROR1.

Also, there's a copy-and-paste error further down with it :)

llvm/tools/llvm-objcopy/XCOFF/Reader.cpp
70–73

I'm probably being dumb, but where is this checked?

Esme updated this revision to Diff 409475.Feb 16 2022, 8:30 PM
Esme added inline comments.
llvm/lib/Object/XCOFFObjectFile.cpp
589

Hmm, the function is called by objcopy when reading aux symbols. It looks like we can test this by making the StartOffset + SymbolEntrySize * NumberOfAuxEntries exceed the end of the file, where the NumberOfAuxEntries is user definable in the yaml, but I still can't construct such a case because we use zeros padding when the defined number is greater than the actual number of aux symbols in yaml2obj.

llvm/tools/llvm-objcopy/XCOFF/Reader.cpp
70–73

Do we need something to protect us from accidentally running off the end of the file, if it's been truncated somehow?

It was a comment in reading auxiliary symbols. Now I read the data by calling XCOFFObjectFile::getRawData, which has an offset check.

jhenderson added inline comments.Feb 17 2022, 1:20 AM
llvm/lib/Object/XCOFFObjectFile.cpp
589

Fair enough. I'd add a TODO comment saying this path is untested.

ELF yaml2obj has support for raw "Content" fields that allow overriding the contents of a section. It may be worth considering something similar for XCOFF to test this case.

591

Name should be a StringRef.

597

Rather than this awkward "go/goes", could you use "goes" only, and use singular terms for the Name parameter, e.g. "relocation data" or "symbol data" rather than "relocations" or "symbols"?

Esme updated this revision to Diff 409853.Feb 17 2022, 8:21 PM

Addressed comments.

llvm/lib/Object/XCOFFObjectFile.cpp
589

Nice suggestion, thx!

jhenderson accepted this revision.Feb 18 2022, 12:59 AM

Thanks, looks good. You should get somebody with XCOFF knowledge to take another look though too.

This revision is now accepted and ready to land.Feb 18 2022, 12:59 AM
avl added a subscriber: avl.Feb 18 2022, 3:10 AM

Oh, forgot to mention: you need to revise this patch on @avl's work to move objcopy code into a library, as it will clash.

Esme updated this revision to Diff 410692.Feb 22 2022, 7:28 PM

Revised this patch on @avl 's work, i.e. D88827, D120345, D114664 and D114429.

jhenderson accepted this revision.Feb 23 2022, 1:19 AM

Looks good still.

shchenz accepted this revision as: shchenz.Feb 23 2022, 11:11 PM

LGTM with one nit. Thanks for adding this support.

Have you tried this patch with objects generated by other tools, like XLC?

llvm/lib/ObjCopy/XCOFF/XCOFFReader.cpp
45 ↗(On Diff #410692)

Can we use std::move(ReadSec) here to avoid unnecessary copy?

70 ↗(On Diff #410692)

Can we use std::move(ReadSym) here to avoid unnecessary copy?

Esme added a comment.Feb 27 2022, 9:29 PM

LGTM with one nit. Thanks for adding this support.

Have you tried this patch with objects generated by other tools, like XLC?

Yes, I have tested that case too.

This revision was landed with ongoing or failed builds.Feb 28 2022, 2:00 AM
This revision was automatically updated to reflect the committed changes.