This is an archive of the discontinued LLVM Phabricator instance.

[yaml2obj] Add support for writing the long symbol name.
ClosedPublic

Authored by Esme on Jun 1 2021, 6:00 AM.

Details

Summary

This patch, as a follow-up of D95505, adds support for writing the long symbol name by implementing StringTable.

Diff Detail

Event Timeline

Esme created this revision.Jun 1 2021, 6:00 AM
Esme requested review of this revision.Jun 1 2021, 6:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2021, 6:00 AM
Higuoxing added inline comments.Jun 3 2021, 7:19 PM
llvm/test/tools/yaml2obj/XCOFF/long-symbol-name.yaml
2

Nit: Usually, we use '##' for comments that are not RUN lines.

19
  1. Can you add a symbol with short symbol name to illustrate that long names are stored in the string table and short names are stored in the .debug section?
  1. I think it's hard to tell whether a symbol name is from the string table or the .debug section in your test since llvm-readobj only prints the symbol name without mentioning where it's from. Are there any options for llvm-readobj that is able to tell us where the symbol name is from?
jhenderson added inline comments.Jun 7 2021, 2:00 AM
llvm/lib/ObjectYAML/XCOFFEmitter.cpp
277–281

Am I right in thinking that we don't yet have XCOFF64 support in yaml2obj by the point of this patch? If not, we should probably just change this to W.write<int32_t>(0); and add the Is64Bit check later (optionally add an assert(!Is64Bit && "64-bit format not yet implemented") as a reminder.

Additionally, the comment suggests that the string can be in one of two places, but as far as this code is concerned, it is only written in one place (the string table), right? I think talking about both here can be confusing, without making it clear which is actually used.

llvm/test/tools/yaml2obj/XCOFF/long-symbol-name.yaml
19

+1 to @Higuoxing's comments. In particular, for testing we need to be sure that we don't accidentally start writing short names in the string table. I don't have a clear suggestion on how to do this, but one option would be to ensure we have the ability to dump the raw string table itself, and then look at the contents (similar to how you could do llvm-readobj -p=.strtab foo.elf to get a list of strings).

Esme updated this revision to Diff 350615.Jun 8 2021, 8:46 AM

Addressed comments except dumping the raw string table.
It seems llvm-readobj can't dump the raw string table for XCOFF, I have to find another tool to do this.

Esme updated this revision to Diff 350792.Jun 8 2021, 11:39 PM

Rebase.

Addressed comments except dumping the raw string table.
It seems llvm-readobj can't dump the raw string table for XCOFF, I have to find another tool to do this.

I think it would make a lot of sense to add string table dumping to llvm-readobj. It's generic enough that you could probably get away with implementing it for multiple file formats too (e.g. for ELF it would dump all SHT_STRTAB sections), but initially, you can just implement it for XCOFF. I'd use a switch name as simple as --string-table or possibly --string-tables.

llvm/test/tools/yaml2obj/XCOFF/long-symbol-name.yaml
31–34

You can remove the excessive whitespace between the key and value.

I'd also make the names just long enough/not quite long enough to need the string table, to test the boundary conditions of the name length check.

Esme updated this revision to Diff 350811.Jun 9 2021, 1:29 AM

Addressed comments except dumping the raw string table.
It seems llvm-readobj can't dump the raw string table for XCOFF, I have to find another tool to do this.

I think it would make a lot of sense to add string table dumping to llvm-readobj. It's generic enough that you could probably get away with implementing it for multiple file formats too (e.g. for ELF it would dump all SHT_STRTAB sections), but initially, you can just implement it for XCOFF. I'd use a switch name as simple as --string-table or possibly --string-tables.

Thanks, it's a good idea to implement the option for llvm-readobj. I will do this in the future.
As far as I know, there is a system tool dump under AIX (https://www.ibm.com/docs/en/aix/7.1?topic=d-dump-command), which can print the raw string table.

bash-5.0$ yaml2obj llvm/test/tools/yaml2obj/XCOFF/long-symbol-name.yaml -o test.o
bash-5.0$ dump -cv test.o 

test.o:

                        ***String Table Information***
         Offset     Name

             4   .longname

However, the test in this patch can't dump the string table yet unless llvm-readobj --string-table is implemented.

shchenz accepted this revision.Jun 14 2021, 8:55 PM

Yes, it is better to implement string table dump for XCOFF, then we should not rely on the system tool to verify this change.

I think the correctness should also be guaranteed by AIX system tool dump.

So LGTM. Thanks for doing this.

FYI. I can get the correct string table and symbol table for long/short name symbols on AIX:

$cat t.yaml 
--- !XCOFF
FileHeader:
  MagicNumber: 0x1DF
Symbols:
  - Name: .symname
  - Name: .longname
  - Name: .longname3456
  - Name: .longname345
$dump -X32 -cv test.o

test.o:

                        ***String Table Information***
         Offset     Name

             4   .longname
            14   .longname3456
            28   .longname345

$dump -X32 -t test.o

[0]     m   0x00000000         0     0    0x00         0x0000     .symname
[1]     m   0x00000000         0     0    0x00         0x0000     .longname
[2]     m   0x00000000         0     0    0x00         0x0000     .longname3456
[3]     m   0x00000000         0     0    0x00         0x0000     .longname345
llvm/lib/ObjectYAML/XCOFFEmitter.cpp
78

Maybe we need to explicitly describe that in the issue summary this is only for 32-bit XCOFF?

This revision is now accepted and ready to land.Jun 14 2021, 8:55 PM
jhenderson accepted this revision.Jun 15 2021, 12:31 AM

LGTM too. Please remember to update the test once the string table printing has landed.

llvm/test/tools/yaml2obj/XCOFF/long-symbol-name.yaml
20

Not for this change, but perhaps a future llvm-readobj enhancement. For ELF, the sh_name/st_name field printing includes the raw value as well as the interpreted name value, with the raw value representing the offset into the string table. This sometimes might be helpful for debugging, and I think could be implemented for XCOFF too.

This revision was landed with ongoing or failed builds.Jun 20 2021, 10:11 PM
This revision was automatically updated to reflect the committed changes.