This is an archive of the discontinued LLVM Phabricator instance.

[Test] Fix YAML mapping keys duplication. NFC.
ClosedPublic

Authored by asi-sc on Jan 16 2023, 6:45 AM.

Details

Summary

YAML specification does not allow keys duplication an a mapping. However, YAML
parser in LLVM does not have any check on that and uses only the last key entry.
In this change duplicated keys are merged to satisfy the spec.

Diff Detail

Event Timeline

asi-sc created this revision.Jan 16 2023, 6:45 AM
asi-sc requested review of this revision.Jan 16 2023, 6:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 6:45 AM
asi-sc updated this revision to Diff 489546.Jan 16 2023, 7:01 AM

Remove one incorreclty added test

Adding to review test creators as some changes are not trivial.

jhenderson added inline comments.Jan 31 2023, 1:02 AM
llvm/test/tools/llvm-objcopy/ELF/strip-all-gnu.test
28–29

I don't think this is a correct change, based on my memory of this test and the --strip-all-gnu behaviour. I think it's likely the second Type key is actually spurious and should be deleted.

asi-sc added inline comments.Feb 1 2023, 12:54 AM
llvm/test/tools/llvm-objcopy/ELF/strip-all-gnu.test
28–29

@jhenderson , thanks for your suggestion. However, if I remove the second Type key, llvm-objcopy reports an error:

llvm-objcopy: error: 'build-release/test/tools/llvm-objcopy/ELF/Output/strip-all-gnu.test.tmp': string table '.strtab' cannot be removed because it is referenced by the symbol table '.symtab.dyn'

Do you have any ideas what might be wrong?

@jhenderson , thanks for your suggestion. However, if I remove the second Type key, llvm-objcopy reports an error:

llvm-objcopy: error: 'build-release/test/tools/llvm-objcopy/ELF/Output/strip-all-gnu.test.tmp': string table '.strtab' cannot be removed because it is referenced by the symbol table '.symtab.dyn'

Do you have any ideas what might be wrong?

Ah, yes, that would make (unfortunate) sense. So it looks like there are multiple issues:

  1. There's a bug in the test whereby the .symtab.dyn section is being given the wrong section type. I tracked this down eventually to a change in the original patch introducing --strip-all (https://reviews.llvm.org/D39769?vs=122160&id=122524#toc), and it looks like a simple copy-paste or similar error. --strip-all was later changed to be --stip-all-gnu (with a new --strip-all taking its place), and the test (complete with bug) got migrated over. Looking at the original patch, and the relevant llvm-objcopy code, I'm 100% confident that .symtab.dyn is supposed to be of type SHT_SYMTAB.
  2. When --strip-all-gnu is specified, llvm-objcopy attempts to remove any section of type SHT_STRTAB when that section is neither marked with SHF_ALLOC nor is the section header string table. By default, yaml2obj creates an implicit section called .strtab that any section of type SHT_SYMTAB references. It also emits an error when attempting to remove a section that is referenced by another section that is not being removed. Since, with my suggested change, .symtab.dyn is now of type SHT_SYMTAB, it will reference the implicit .strtab, but llvm-objcopy will try to remove it, and therefore produce the error. This wasn't picked up before because of the first issue.

For context, this test is largely testing the if block starting at this line: https://github.com/llvm/llvm-project/blob/9e845fe44fa125c482178c2b626693e5d80851e0/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp#L389. .symtab.dyn is supposed to show that a SHF_ALLOC section that would be removed due to its type is not removed (because it is marked as SHF_ALLOC and therefore triggers the early return false in the if block). A SHT_NOBITS section is not removed due to type, regardless of the SHF_ALLOC flag, so the section doesn't add anything to the test case in the current form of SHT_NOBITS + SHF_ALLOC. To maintain (or rather introduce) this test coverage, we should make .symtab.dyn SHT_SYMTAB, like I've suggested, but also make a preserved string section that it references. There are a few ways of doing this, but I suggest the following snippet to replace the existing .symtab.dyn section:

- Name:            .symtab.dyn
  Type:            SHT_SYMTAB
  Link:            .strab.dyn
  Flags:           [ SHF_ALLOC ]
- Name:            .strtab.dyn
  Type:            SHT_STRTAB
  Flags:           [ SHF_ALLOC ]

If you make that change to the YAML, you'll also need to add .strtab.dyn to the CHECK: list of sections that are preserved below the YAML, and increase the SectionHeaderCount check from 8 to 9.

Hopefully that all makes sense. You're welcome to spin this into a separate patch, if you want, to avoid confusing the other test changes, since this is a non-trivial change in the test.

llvm/test/tools/llvm-objcopy/ELF/strip-all-gnu.test
28–29

Taking conversation out of line due to length.

28–29

@jhenderson , thanks for your suggestion. However, if I remove the second Type key, llvm-objcopy reports an error:

llvm-objcopy: error: 'build-release/test/tools/llvm-objcopy/ELF/Output/strip-all-gnu.test.tmp': string table '.strtab' cannot be removed because it is referenced by the symbol table '.symtab.dyn'

Do you have any ideas what might be wrong?

asi-sc added a comment.Feb 1 2023, 4:16 AM

@jhenderson , thanks for your investigation and really detailed explanation!

There are a few ways of doing this, but I suggest the following snippet to replace the existing .symtab.dyn section:

- Name:            .symtab.dyn
  Type:            SHT_SYMTAB
  Link:            .strab.dyn
  Flags:           [ SHF_ALLOC ]
- Name:            .strtab.dyn
  Type:            SHT_STRTAB
  Flags:           [ SHF_ALLOC ]

If you make that change to the YAML, you'll also need to add .strtab.dyn to the CHECK: list of sections that are preserved below the YAML, and increase the SectionHeaderCount check from 8 to 9.

Hopefully that all makes sense. You're welcome to spin this into a separate patch, if you want, to avoid confusing the other test changes, since this is a non-trivial change in the test.

I tried the suggested change and it seems we faced with one more issue: llvm-objcopy: error: Symbol table has link index of 4 which is not a string table
There is even one more test that suffers from it - https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/llvm-objcopy/ELF/only-keep-debug.test#L214 . So, I can repeat the temporary fix from this test and use SHT_STRTAB type for .symtab.dyn section. But maybe you could have a look once again and suggest better option?

I tried the suggested change and it seems we faced with one more issue: llvm-objcopy: error: Symbol table has link index of 4 which is not a string table

Ah, the old "llvm-objcopy treats SHF_ALLOC + SHT_STRTAB as dynamic string table internally, and won't let non-dynamic symbol tables use it" issue. We'll have to try something else.

I did some playing around and came across two separate issues in the process of trying to get a working test (see https://github.com/llvm/llvm-project/issues/60448 and https://github.com/llvm/llvm-project/issues/60451 if you're interested). In the end, I came up with the following changes to the YAML:

  1. Rename .symtab.dyn to .symtab. This ensures we don't run into the two SHT_SYMTAB issue described in the first of those tickets.
  2. Make sure it has SHT_SYMTAB type and SHF_ALLOC flags.
  3. Don't add the SHT_STRTAB section I suggested before (it's not needed).
  4. Add --allow-broken-links to the first llvm-objcopy and llvm-strip commands that do the stripping. This will allow the implicitly-generated .strtab section to be removed, whilst still being referenced by the .symtab section.
  5. Update the CHECKs to look for ".symtab" rather than ".symtab.dyn".

Hopefully that should fix the issue. The --allow-broken-links option will result in a somewhat-broken ELF object, but it doesn't matter for the test case's purpose.

There is even one more test that suffers from it - https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/llvm-objcopy/ELF/only-keep-debug.test#L214 . So, I can repeat the temporary fix from this test and use SHT_STRTAB type for .symtab.dyn section. But maybe you could have a look once again and suggest better option?

There's no need to do that, as we already have a SHT_STRTAB + SHF_ALLOC section in the test case. In fact, I'd argue that the "temporary fix" in that test is simply hiding an issue, and is not providing any useful test coverage. The real fix would be to change llvm-objcopy to not treat SHT_STRTAB + SHF_ALLOC different to regular SHT_STRTAB sections somehow. However, the problem is quite complex (we need to be careful about modifying SHF_ALLOC section sizes, so lots of operations that normally work on a SHT_STRTAB section won't work on one with the SHF_ALLOC flag), so it's non-trivial to fix without reworking how llvm-objcopy reads in the ELF object.

asi-sc updated this revision to Diff 494562.Feb 3 2023, 2:23 AM

Rebase: llvm/test/tools/llvm-objcopy/ELF/strip-all-gnu.test was fixed in D143086

Thanks @asi-sc for the llvm-objcopy fix. I don't know enough about the other tools/code to be able to review them (the remaining llvm-objcopy test looks good to me).

@yota9 , @cishida , @spowell , @steven_wu , sorry for mentioning you directly in this review, however I believe you are the right experts to consult about the content of some tests. We need to get rid of keys duplication in YAML inputs as keys duplication is not allowed by the specification. These fixes in tests blocks improvement of LLVM's YAML parser. Could you please check the comments below where I've mentioned you?

bolt/test/AArch64/Inputs/plt-gnu-ld.yaml
178

Currently, YAML parser seems to overwrite Symbol (.text -> '__libc_start_main@@GLIBC_2.17') and Type (R_AARCH64_ADD_ABS_LO12_NC -> R_AARCH64_CALL26) values at Offset 0x40052C. And the test works just because we are lucky. I think the right fix is to introduce an offset for __libc_start_main@@GLIBC_2.17 symbol, but I'm not sure in the offset's value I've chosen. @yota9 , may I ask you to take a quick look at this fix?

llvm/unittests/TextAPI/TextStubV4Tests.cpp
984

@spowell , @cishida , @steven_wu , TBDv4File file seems to contain two files incorrectly merged into one. I've merged some fields so that it doesn't contradict YAML spec, but I'm not sure in the correctness of this change. If such a merging is not possible, when we must split one file into two. However, looking at the checks, it doesn't sound as a required thing for this test. Could you please have a look?

yota9 accepted this revision.Feb 6 2023, 2:31 AM

BOLT part LGTM

bolt/test/AArch64/Inputs/plt-gnu-ld.yaml
178

Thank you for the fix, yes I think the offset was chosen right

This revision is now accepted and ready to land.Feb 6 2023, 2:31 AM
cishida accepted this revision.Feb 6 2023, 11:44 AM
cishida added inline comments.
llvm/unittests/TextAPI/TextStubV4Tests.cpp
984

That sounds right. This input was meant to represent two libraries but the test never checked that and the input was indeed invalid. Thanks for cleaning this up! LGTM.

This revision was automatically updated to reflect the committed changes.