In D104613, we left a FIXME about the error dumping for the first item of StringTable.
This is not caused by parsing, but by the fact that I didn't realize that the first 4 bytes contain the size of the StringTable.
The patch fixes the bug.
Details
- Reviewers
jhenderson shchenz Higuoxing - Group Reviewers
Restricted Project - Commits
- rG69396896fb61: [llvm-readobj][XCOFF] Fix the error dumping for the first
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/tools/llvm-readobj/ObjDumper.cpp | ||
---|---|---|
56 | Can we add a comment about why we need to set start at StrContent + StringSizeFieldLength? Also can we add a line at the beginning of the string table to show the size of the string table? | |
68 | The output is shown as hexadecimal data, but there is no "0x" prefix for the data, we should add the prefix or use decimal data? | |
llvm/tools/llvm-readobj/ObjDumper.h | ||
113 | Maybe a more meaningful name would be StringSizeFieldLength? |
llvm/tools/llvm-readobj/ObjDumper.cpp | ||
---|---|---|
56 |
Same as above, the function is not only used by dumping the string table. But yes, I can print the size at XCOFFDumper::printStringTable. | |
llvm/tools/llvm-readobj/ObjDumper.h | ||
113 | The function also called by ObjDumper::printSectionsAsString, which dumps the specified section(s), which doesn't have the field containing string content's length. So I think StringSizeFieldLength may be confused in that case? |
llvm/tools/llvm-readobj/ObjDumper.cpp | ||
---|---|---|
68 | Changing the output format will affect many tests on dumping sections. That is out of the scope of this patch. We can post another patch for it if necessary. |
llvm/tools/llvm-readobj/ObjDumper.h | ||
---|---|---|
113 | What about BeginOffset? And why it's typed as uint8_t, size_t seems better as offset/length if there's not other reasons. |
llvm/tools/llvm-readobj/ObjDumper.cpp | ||
---|---|---|
58–59 | I'd change this comment to: "Some formats contain additional metadata at the start which should not be interpreted as strings. Skip these bytes, but account for them in the string offsets." Don't describe what the code is doing when the code is already clear. Instead, describe why. I think it's important to explain why you don't just use drop_front priort to calling this function: it's because you want to keep the offsets to represent the real offsets in the string table. | |
68 | Don't add 0x. The output format MUST match that of GNU readelf -p for ELF output at least. If you really want to change XCOFF only, that's fine, but you'll need to do it for that format only. | |
llvm/tools/llvm-readobj/ObjDumper.h | ||
113 | More concretely: I'd change the signature to void printAsStringList(StringRef StringContent, size_t StringDataOffset = 0); | |
llvm/tools/llvm-readobj/XCOFFDumper.cpp | ||
462 | What will happen if you try to call this when there is no string table? XCOFF32 doesn't require one as I understand it. | |
463–465 | I'm not opposed to the length field being here, but it should be a separate patch. | |
464 | ||
465 | I assume that the length includes the length field itself? |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
193 ↗ | (On Diff #357855) | Looks like you may need a test case for a single-byte sized table! Also, I think you need the following two test cases:
The latter assumes there's a yaml2obj implementation that supports custom string table data. If it doesn't exist, consider adding it, but it's not essential if there isn't. |
Added tests.
- A single-byte sized string entry.
- Empty string table with just the length field.
llvm/test/tools/yaml2obj/XCOFF/string-table.yaml | ||
---|---|---|
1 ↗ | (On Diff #358176) | These two test cases should be part of the llvm-readobj string table printing test, not here: you're testing llvm-readobj (including the XCOFFObjectFile code, via llvm-readobj), not yaml2obj. Update the comments accordingly too. |
30–31 ↗ | (On Diff #358176) | I don't think you need any symbols here to ensure an empty string table. |
Address comments.
llvm/test/tools/yaml2obj/XCOFF/string-table.yaml | ||
---|---|---|
30–31 ↗ | (On Diff #358176) | The absence of any symbols means that there is no string table, but the case tries to display an empty string table, which has no string data but the length field. |
llvm/test/tools/llvm-readobj/XCOFF/string-table.yaml | ||
---|---|---|
1 ↗ | (On Diff #358191) | Apaprently I missed in the other patch that there wasn't an llvm-readobj test case, even though there should have been. Can you extend this test file to include a case where there is more than one symbol name, please? |
7 ↗ | (On Diff #358191) | I don't think you need this line. Same goes below. |
9 ↗ | (On Diff #358191) | Use -NEXT suffixes on your check patterns. This test in its current form will also pass with the following output: StringTable { <some garbage> [ 4] n <more garbage> } Also, delete trailing spaces. Same comments below. |
30 ↗ | (On Diff #358191) | Why does 32-bit XCOFF generate an empty string table if it isn't used? Is that a bug in the emitter, or a spec detail? |
Addressed comments.
llvm/test/tools/llvm-readobj/XCOFF/string-table.yaml | ||
---|---|---|
30 ↗ | (On Diff #358191) | Hmm...I think it's neither a spec detail, nor a bug? That's just how we implement in yaml2obj. We write the string table if the symbol table isn't empty. if (!Obj.Symbols.empty()) { if (!writeSymbols()) return false; // Write the string table. Strings.write(W.OS); } Due to the fact that only long symbol name can be written into the string table for 32-bit XCOFF, I write the test of non-empty symbol table without long symbol names. |
llvm/test/tools/llvm-readobj/XCOFF/string-table.yaml | ||
---|---|---|
55 ↗ | (On Diff #359227) | You don't need the too-short-to-be-in-the-table name, because that's testing yaml2obj's emission of the string table, not llvm-readobj's dumping of it. |
30 ↗ | (On Diff #358191) | Okay, I'd call that a yaml2obj bug (yaml2obj shouldn't be creating things that aren't needed - doing so wastes time and disk space). That being said, it would be a useful feature if you could customise the presence and/or contents of the string table directly in yaml2obj. Something like the following: ## Emit a string table with specific contents. Custom contents overwrite implicitly generated contents. --- !XCOFF FileHeader: ... StringTable: - Length: 10 # Force string table length field to be set to this value. - Strings: ["a", "b", "cd"] # Force contetns to this. Symbols: # Optional. # Names use values as-if string table were implicitly generated (might be invalid). ## Don't emit a string table. --- !XCOFF FileHeader: ... # No StringTable or Symbols element. ## Emit a string table with implicit contents, if and only if required (i.e. 32-bit long name or any 64-bit). --- !XCOFF FileHeader: ... Symbols: ... ## Suppress emission of string table. --- !XCOFF FileHeader: ... StringTable: Emit: false # Or perhaps Suppress: true Symbols: ... # Names generated with appropriate offsets as if table emission was not suppressed. This is similar in concept to how ELF yaml2obj allows changing the contents and even presence of the .strtab section (which contains symbol names), and allows for testing of more code paths (e.g. checking that a symbol name offset is valid, printing of a string table with an invalid length field etc). |
Updated test.
Thanks James!
I have post the patch D106420 to solve the issue we mentioned and will post a following patch to add the feature of customizing the string table directly in yaml2obj.
LGTM, with a couple of minor fixes.
llvm/test/tools/llvm-readobj/XCOFF/string-table.yaml | ||
---|---|---|
28 ↗ | (On Diff #360358) | I'd put this test case first, then the single-byte case (because it's testing an edge case), and the empty case last. |
llvm/tools/llvm-readobj/XCOFFDumper.cpp | ||
463–466 | Careful of off-by-one errors! Byte 0 is the first byte... |
Without looking too hard, my guess is that the problem is caused by an empty StringRef returning nullptr from bytes_begin() (or some equivalent behaviour caused further up the chain that leads to that situation implicitly). You should probably add an early out in case the string is empty. If there's still an issue after that fix, then there's something returning a non-empty StringRef starting from the nullptr, which is clearly broken, but is likely further up the stack in terms of where the problem occurs.
I've committed rG737e27f6236f after rG3df1e7e6f05e to add an early out in case the string is empty.
Maybe a more meaningful name would be StringSizeFieldLength?