Page MenuHomePhabricator

[llvm-readobj][XCOFF] Add support for printing the String Table.
ClosedPublic

Authored by Esme on Jun 20 2021, 8:25 PM.

Details

Summary

This patch adds string table dumping to llvm-readobj. Currently I only implement it for XCOFF, other files can also implement the feature if needed.

Diff Detail

Event Timeline

Esme created this revision.Jun 20 2021, 8:25 PM
Esme requested review of this revision.Jun 20 2021, 8:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2021, 8:25 PM

Please add the new option to the llvm-readobj documentation.

llvm/lib/Object/XCOFFObjectFile.cpp
190

I'd make this return a map of offsets to strings. If you look at the --string-dump output of llvm-readelf, you'll see it includes the string offsets in there, and I think it would be good to use the same output format for this option.

192–193

I'd inline NumberOfSymbols into the loop initialization list, since it's not needed otherwise.

llvm/tools/llvm-readobj/ObjDumper.h
108–109

I'd keep the comment, but name it printStringTable() so that you don't need to rename the function if and when this gets implemented for other file formats - there's nothing fundamentally XCOFF-specific about this feature after all.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
462–463

It would be preferable to print the error as a warning and skip dumping, so that a user who requests to dump the string table and other data can still get the other data. This is the general pattern we follow elsewhere in llvm-readobj, at least for newer code (possibly it hasn't been adopted in XCOFF, but I'm trying to encourage it).

464

StringRef is designed to be passed by value, not reference.

llvm/tools/llvm-readobj/llvm-readobj.cpp
306–309

Let's drop the "xcoff" from the name and description. Perhaps add in the help message "Currently implemented for XCOFF only." instead. That way, when the option is adopted for other formats, we don't need multiple options or to force users to change the switch name.

613

No need for this outer-if statement, since the virtual function has a do-nothing base version.

Esme updated this revision to Diff 353321.Jun 21 2021, 3:36 AM
Esme marked 7 inline comments as done.

Addressed comments.

Esme updated this revision to Diff 353324.Jun 21 2021, 3:39 AM
Higuoxing added inline comments.Jun 21 2021, 7:33 PM
llvm/lib/Object/XCOFFObjectFile.cpp
190

Since we are modifying lib/Object/XCOFFObjectFile.cpp and this function may return an error, I think we need to write unittest for this function. What do you think, @jhenderson ?

jhenderson added inline comments.Jun 23 2021, 1:09 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
18

I don't think this has been clang-formatted. Include lists should be in alphabetical order.

llvm/lib/Object/XCOFFObjectFile.cpp
190

A gtest unit test is always preferable for libraries, in my opinion, but I don't know if the XCOFF stuff has extensive unit testing, and where there's a tool that handles this error, it's probably sufficient to test it in the tool lit tests.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
464

Always use reportUniqueWarning in preference to reportWarning. The current reportWarning implementation may mean the same warning is reported repeatedly, and we are trying to migrate to reportUniqueWarning instead.

Esme updated this revision to Diff 353889.Jun 23 2021, 1:56 AM

Addressed comments.

jhenderson added inline comments.Jun 23 2021, 2:05 AM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
463–464

You need to add a test case that will exercise this warning. You also need to return after the warning, or you'll attempt to print the table contents when you don't have a valid map!

Esme updated this revision to Diff 353905.Jun 23 2021, 3:04 AM
Esme added inline comments.
llvm/tools/llvm-readobj/XCOFFDumper.cpp
463–464

Well, I'm unable to construct a lit test to exercise this warning. The warning occurs only if the function XCOFFObjectFile::getStringTableEntry(uint32_t Offset) returns an error. I don't think this should happen unless yaml2obj writes the wrong data for symbol name offset or the string table.

jhenderson added inline comments.Jun 23 2021, 3:56 AM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
463–464

One of yaml2obj's intentions is to be able to craft broken inputs so that these things can be tested. That suggests to me you need to extend yaml2obj a little more, to allow you to force writing a broken value.

That being said, your comment made me think that the approach in getStringTable is wrong. That should be able to dump the entire string table, by just iterating over null terminated strings from whatever offset indicates the string table start, rather than dumping the individual string entries as derived from symbol information. What if, for example, the string table had strings that didn't correspond to symbol names? I'd expect this option to print them.

Esme updated this revision to Diff 354733.Sun, Jun 27, 2:39 AM
  1. Address comments.
  2. Use MapVector instead of DenseMap to make string table in order.

I just noticed that the code for --string-dump is generic, rather than ELF specific. It loops over the provided sections, and retrieves some details, but the main body of the loop should do exactly what you want to do. Perhaps it would be worth splitting out the while loop from ObjDumper::printSectionsAsString into a separate function and simply reusing that? This would ensure the XCOFF behaviour and other object behaviour is broadly equivalent.

llvm/lib/Object/XCOFFObjectFile.cpp
192–193

Is this an actual case that can happen? Under what circumstances? Perhaps those circumstances deserve a comment.

Esme updated this revision to Diff 355781.Wed, Jun 30, 11:29 PM

Address comment.

jhenderson added inline comments.Thu, Jul 1, 12:56 AM
llvm/test/tools/yaml2obj/XCOFF/basic-doc64.yaml
137 ↗(On Diff #355781)

Does this test pass? Why is there a second "." in this name, if so?

What's at bytes 0-2 in the string table?

llvm/test/tools/yaml2obj/XCOFF/long-symbol-name.yaml
29

Ditto.

Esme added a comment.Thu, Jul 1, 3:41 AM
llvm/test/tools/yaml2obj/XCOFF/basic-doc64.yaml
137 ↗(On Diff #355781)

Yes, the test does pass. The offset is 3, but the real offset is 4 in the Symbol entry. It is not a bug of yaml2obj or this patch, since the issue occurs when check the StringTable of a compiled binary object. This should be a bug during parsing the StringTable, and I will post another patch to fix it before the patch.
There are '\0' at bytes 0-2. Here is an example of dumping the string data for every bytes offset of a StringTable for a compiled binary file. Where the correct result should be [ 4] alongname

StringTable {
-every bytes
0: 
1: 
2: 
3: alongname
4: alongname
5: longname
6: ongname
7: ngname
8: gname
9: name
10: ame
11: me
12: e
13: 
-Result
  [     3]   .alongname  
}
jhenderson accepted this revision.Thu, Jul 1, 3:44 AM

LGTM, pending the bug fix (but see my inline comment - feel free to land this first if it makes testing the fix easier).

llvm/test/tools/yaml2obj/XCOFF/basic-doc64.yaml
137 ↗(On Diff #355781)

Okay, thanks. Please post the other patch, then rebase this one on it. Alternatively, I don't object to you landing this one, with FIXME notes added to the test describing the problem, so that you can use the --string-table option to show the bug fix is correct.

This revision is now accepted and ready to land.Thu, Jul 1, 3:44 AM
This revision was landed with ongoing or failed builds.Sun, Jul 4, 9:17 PM
This revision was automatically updated to reflect the committed changes.