Page MenuHomePhabricator

[Object] Fix crash caused by unhandled error.
AbandonedPublic

Authored by Higuoxing on Apr 1 2020, 11:05 PM.

Details

Summary

The return type of ELFFile<ELFT>::symbols(const Elf_Shdr*) is Expected<Elf_Sym_Range>, and we should handle the error or consume it when this operation fails.

Diff Detail

Unit TestsFailed

TimeTest
450 msLLVM.Transforms/Attributor::liveness.ll
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/build/bin/opt -attributor -attributor-manifest-internal -attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=7 -S < /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/llvm/test/Transforms/Attributor/liveness.ll | /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/llvm/test/Transforms/Attributor/liveness.ll --check-prefixes=CHECK,NOT_CGSCC_NPM,NOT_CGSCC_OPM,NOT_TUNIT_NPM,IS__TUNIT____,IS________OPM,IS__TUNIT_OPM
800 msLLVM.Transforms/Attributor::liveness.ll
Script: -- : 'RUN: at line 2'; c:\ws\workspace\amd64_windows_vs2017\llvm-project\build\bin\opt.exe -attributor -attributor-manifest-internal -attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=7 -S < C:\ws\workspace\amd64_windows_vs2017\llvm-project\llvm\test\Transforms\Attributor\liveness.ll | c:\ws\workspace\amd64_windows_vs2017\llvm-project\build\bin\filecheck.exe C:\ws\workspace\amd64_windows_vs2017\llvm-project\llvm\test\Transforms\Attributor\liveness.ll --check-prefixes=CHECK,NOT_CGSCC_NPM,NOT_CGSCC_OPM,NOT_TUNIT_NPM,IS__TUNIT____,IS________OPM,IS__TUNIT_OPM
300 msLLVM.Transforms/Attributor::noalias.ll
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/build/bin/opt -attributor -attributor-manifest-internal -attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=7 -S < /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/llvm/test/Transforms/Attributor/noalias.ll | /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/llvm/test/Transforms/Attributor/noalias.ll --check-prefixes=CHECK,NOT_CGSCC_NPM,NOT_CGSCC_OPM,NOT_TUNIT_NPM,IS__TUNIT____,IS________OPM,IS__TUNIT_OPM
450 msLLVM.Transforms/Attributor::noalias.ll
Script: -- : 'RUN: at line 2'; c:\ws\workspace\amd64_windows_vs2017\llvm-project\build\bin\opt.exe -attributor -attributor-manifest-internal -attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=7 -S < C:\ws\workspace\amd64_windows_vs2017\llvm-project\llvm\test\Transforms\Attributor\noalias.ll | c:\ws\workspace\amd64_windows_vs2017\llvm-project\build\bin\filecheck.exe C:\ws\workspace\amd64_windows_vs2017\llvm-project\llvm\test\Transforms\Attributor\noalias.ll --check-prefixes=CHECK,NOT_CGSCC_NPM,NOT_CGSCC_OPM,NOT_TUNIT_NPM,IS__TUNIT____,IS________OPM,IS__TUNIT_OPM
130 msLLVM.Transforms/Attributor/ArgumentPromotion::fp80.ll
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/build/bin/opt -attributor -attributor-manifest-internal -attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=3 -S < /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/llvm/test/Transforms/Attributor/ArgumentPromotion/fp80.ll | /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/workspace/BETA_amd64_debian_testing_clang8/llvm-project/llvm/test/Transforms/Attributor/ArgumentPromotion/fp80.ll --check-prefixes=CHECK,NOT_CGSCC_NPM,NOT_CGSCC_OPM,NOT_TUNIT_NPM,IS__TUNIT____,IS________OPM,IS__TUNIT_OPM
View Full Test Results (7 Failed)

Event Timeline

Higuoxing created this revision.Apr 1 2020, 11:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 1 2020, 11:05 PM
Higuoxing edited the summary of this revision. (Show Details)Apr 1 2020, 11:08 PM
MaskRay added inline comments.Apr 1 2020, 11:44 PM
llvm/include/llvm/Object/ELFObjectFile.h
630

Use -> since you are touching this line.

Higuoxing updated this revision to Diff 254437.Apr 2 2020, 12:01 AM

Addressed @MaskRay 's comment.

jhenderson added inline comments.Apr 2 2020, 12:55 AM
llvm/include/llvm/Object/ELFObjectFile.h
629–647

I wonder if this should be changed to not use auto? I think that would have made it less likely for this mistake to have got through unnoticed.

635

Ditto.

llvm/test/tools/llvm-nm/X86/invalid-section-size.test
1 ↗(On Diff #254437)

"when dumping a symbol table"

I wonder if this test should be moved to the Object test folder, since this is really testing the Object library, rather than llvm-nm specifically. What do you think?

3 ↗(On Diff #254437)

Perhaps make this last line a TODO in the test.

6 ↗(On Diff #254437)

It might be a good idea to add 2>&1 to the llvm-nm run and --implicit-check-not=warning to the FileCheck run so that this test starts failing if somebody implements the warning handling described, so that the test can be updated at that point, rather than it becoming redundant.

Higuoxing updated this revision to Diff 254450.Apr 2 2020, 1:44 AM

Addressed comments.

  • Use Expected<Elf_Sym_Range> instead of auto.
  • Add TODO label for future improvement.
  • Move test to Object folder.
Higuoxing updated this revision to Diff 254451.Apr 2 2020, 1:46 AM

Apply clang-format

grimar added inline comments.Apr 2 2020, 2:22 AM
llvm/include/llvm/Object/ELFObjectFile.h
648

Perhaps it worth to generalize this place a bit? Something like:

auto CheckSymbol = [&](Expected<typename ELFT::SymRange> R) {
  if (!R || ESym == R->begin()) {
    Result |= SymbolRef::SF_FormatSpecific;
    // TODO: Actually report errors helpfully.
    consumeError(R.takeError());
    return false;
  }
  return true;
};

if (!CheckSymbol(EF.symbols(DotSymtabSec)) ||
    !CheckSymbol(EF.symbols(DotDynSymSec)))
  return Result;
jhenderson added inline comments.Apr 2 2020, 2:31 AM
llvm/test/Object/nm-invalid-section-size.test
1 ↗(On Diff #254451)

You should rename this to invalid-symtab-size.test and delete the "nm" part of the title. Again, this isn't about testing llvm-nm, but rather the Object library's invalid symbol table section size handling.

3 ↗(On Diff #254451)

will -> would

Higuoxing updated this revision to Diff 254485.Apr 2 2020, 4:22 AM

Addressed comments.

  • Rename test file name to invalid-symtab-size.test
  • Generalize CheckSymbol()
grimar added inline comments.Apr 2 2020, 5:13 AM
llvm/include/llvm/Object/ELFObjectFile.h
241

You probably do not need it anymore as it is used in the single place now.

628–629

With this implementation the CheckSymbol name probably does not make much sense anymore?
It is unclear what does it check I think. It perhaps should be changed to IsFormatSpecificSymbol or something better.

In this case I'd expect to see a bit more related code inside.
E.g. seems the following piece from above:

if (ESym->getType() == ELF::STT_FILE || ESym->getType() == ELF::STT_SECTION)
  Result |= SymbolRef::SF_FormatSpecific;

and the code for ARM from below :

if (EF.getHeader()->e_machine == ELF::EM_ARM) {
    if (Expected<StringRef> NameOrErr = getSymbolName(Sym)) {
      StringRef Name = *NameOrErr;
      if (Name.startswith("$d") || Name.startswith("$t") ||
          Name.startswith("$a"))
        Result |= SymbolRef::SF_FormatSpecific;

can live there so that IsFormatSpecificSymbol would return true for all cases when we want to set the SF_FormatSpecific flag.

630

This condition needs a comment to explain what it does.

grimar added inline comments.Apr 2 2020, 5:24 AM
llvm/include/llvm/Object/ELFObjectFile.h
628–629

Let me debug this to see how well my suggestions works. I'll return with something soon.

Higuoxing marked an inline comment as done.Apr 2 2020, 5:38 AM
Higuoxing added inline comments.
llvm/include/llvm/Object/ELFObjectFile.h
628–629

What do you think of changing the name to IsFirstNULLSymbol(SomeSec) or IsFirstNULLSymbolFrom(SomeSec), and leave other checkers unchanged?

We have to call the function on .symtab and .dynsym to check whether the symbol is NULL symbol. If we want to add one another function IsFormatSpecificSymbol(), we still have to keep IsFirstNULLSymbol()

bool IsFormatSpecificSymbol() {
  if (IsFirstNULLSymbol(.symtab) || IsFirstSymbol(.dynsym))
    return true;
  else if (EF.getHeader()->e_machine == ELF::EM_ARM) {
    ...
    return true;
  } else if (ESym->getType() == STT_FILE || ESym->getType() == STT_SECTION)
    return true;
  return false;
}
grimar added inline comments.Apr 2 2020, 6:03 AM
llvm/include/llvm/Object/ELFObjectFile.h
628–629

Right. I've experimented with the code and came to the same conclusion:

auto IsNullSymbol = [this](const Elf_Sym *Sym, const Elf_Shdr *Symtab) {
  Expected<typename ELFT::SymRange> SymbolsOrErr = EF.symbols(Symtab);
  if (!SymbolsOrErr) {
    // TODO: Actually report errors helpfully.
    consumeError(SymbolsOrErr.takeError());
    return false;
  }
  return Sym == SymbolsOrErr->begin();
};

if (IsNullSymbol(ESym, DotSymtabSec) || IsNullSymbol(ESym, DotDynSymSec))
  Result |= SymbolRef::SF_FormatSpecific;

Creating a IsFormatSpecificSymbol could be a nice follow-up refactoring probably.

So just renaming your version to to IsNullSymbol LG to me.

One more thing though: your test check the broken .symtab case, but the code also checks the .dynsym.
This should also be covered by the test I think.

grimar added inline comments.Apr 2 2020, 6:08 AM
llvm/include/llvm/Object/ELFObjectFile.h
630

And probably no need to add a comment then as IsNullSymbol describes nicely what the code does.

Higuoxing updated this revision to Diff 254507.Apr 2 2020, 6:45 AM

Addressed comment.

  • Add one more test for .dynsym section. llvm-nm --dynamic's behavior will be fixed until D76081 lands.

Thinking about it, I'm reluctant to approve this as is, without a further change to actually report the error (possibly in a seperate patch). Prior to this patch, we have a failure that at least can be identified in certain build versions, whereas now the malformed-ness is not detectable at all. Perhaps this could temporarily use report_fatal_error with a follow-up patch bubbling the error further up the stack?

llvm/include/llvm/Object/ELFObjectFile.h
628–629

I think IsNullSymbol would be fine, and more in keeping with LLVM naming standards.

llvm/test/Object/invalid-symtab-size.test
13 ↗(On Diff #254507)

FIXME rather than TODO here.

grimar added a comment.Apr 3 2020, 1:16 AM

I'd also recommend to start using "Edit related revisions" on the right panel of the phab.
Then you should be able to edit related parent/child revisions, because currently it is unobvious which dependencies your patches has.

llvm/include/llvm/Object/ELFObjectFile.h
630

IsNullSymbol, not IsNULLSymbol

(if you search for "IsNULL" in LLVM's code, you'll see 0 matching lines, but "IsNull" gives a lot of.
So having NULL is at least inconsistent).

I'd also recommend to start using "Edit related revisions" on the right panel of the phab.
Then you should be able to edit related parent/child revisions, because currently it is unobvious which dependencies your patches has.

Alternatively, you can put "Depends on D<number>" in the patch description and the stack is automatically formed without you needing to manually edit the related revisions.

Higuoxing updated this revision to Diff 256045.Apr 8 2020, 9:16 AM

I'd also recommend to start using "Edit related revisions" on the right panel of the phab.
Then you should be able to edit related parent/child revisions, because currently it is unobvious which dependencies your patches has.

Alternatively, you can put "Depends on D<number>" in the patch description and the stack is automatically formed without you needing to manually edit the related revisions.

Thanks a lot! Got it.

Thinking about it, I'm reluctant to approve this as is, without a further change to actually report the error (possibly in a seperate patch). Prior to this patch, we have a failure that at least can be identified in certain build versions, whereas now the malformed-ness is not detectable at all. Perhaps this could temporarily use report_fatal_error with a follow-up patch bubbling the error further up the stack?

I've tried to change uint32_t getSymbolFlags(DataRefImpl) to Expected<uint32_t> getSymbolFlags(DataRefImpl). However, there are several tools, object file formats and functions from core library involved. I tried to update them all carefully. Could you please give me some comments?

FYI: getSymbolFlags() on ELFObjectFile is fallible while on other object files (Mach-O, COFF, XCOFF ...) are not.

sbc100 added inline comments.Apr 8 2020, 10:06 AM
llvm/include/llvm/Object/ObjectFile.h
306

Use early return here?

auto SymbolFlagsOrErr = getSymbolFlags(Symb);
if (!SymbolFlagsOrErr)
   report_fatal_error(SymbolFlagsOrErr.takeError());
assert(*SymbolFlagsOrErr & SymbolRef::SF_Common);
return getCommonSymbolSizeImpl(Symb);

This would also avoid "else-after-return".

llvm/lib/ExecutionEngine/RuntimeDyld/JITSymbol.cpp
93

Ditto, use early return and avoid else after return.

llvm/tools/llvm-size/llvm-size.cpp
213

Use early return?

Higuoxing updated this revision to Diff 256186.Apr 8 2020, 10:14 PM

Addressed comments.

  • Use early-return style.

Thanks for the change. I think the interface change makes sense (and is in keeping with other similar functions). However, I think it may deserve pulling into a separate change from the new error check. That would make it an NFC interface change. This change would then implement the behaviour change and add appropriate testing for said behaviour, which would cover some of the new code paths.

As for the other code paths I've highlighted, it's probably okay to just put a TODO saying "needs testing" or something to that effect for most of them. @grimar, what do you think?

llvm/include/llvm/Object/ObjectFile.h
301

It might be worth putting a TODO here saying that this should be pushed further up the stack. Probably also wants mentioning that this is untested currently.

llvm/lib/ExecutionEngine/Orc/Mangling.cpp
98

This new bit of code should either be tested or at least have a TODO comment saying it needs to be.

llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp
130

Test?

212

Test?

llvm/lib/ExecutionEngine/RuntimeDyld/JITSymbol.cpp
60

Test?

84

TODO for pushing error up stack + test.

llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
219

Test?

242

Test?

602

Test?

llvm/lib/Object/ArchiveWriter.cpp
269

Test/TODO to echo further up stack?

llvm/lib/Object/MachOObjectFile.cpp
1810

cantFail(FlagsOrErr.takeError())

Then you can delete the comment.

llvm/test/tools/llvm-nm/invalid-symbol-table-size.test
1

a helpful error message

Ditto in other test.

13

I don't think you need the CHECK-EMPTY for when an error is reported.

Ditto below.

llvm/tools/dsymutil/MachODebugMapParser.cpp
499

cantFail

llvm/tools/llvm-nm/llvm-nm.cpp
312

If it's impossible for this to fail, use cantFail instead. Same below and elsewhere.

380

cantFail

1059

cantFail

1069

cantFail

1103

Why is consumeError here okay? Should it be cantFail? A comment is probably needed anyway.

llvm/tools/llvm-objdump/MachODump.cpp
7565

cantFail

llvm/tools/llvm-size/llvm-size.cpp
204–205

You can probably directly call llvm-size's error reporting routine here, since this is in the core llvm-size code. You would also need to test it then.

llvm/tools/sancov/sancov.cpp
661

Test?

grimar added a comment.Apr 9 2020, 3:25 AM

However, I think it may deserve pulling into a separate change from the new error check. That would make it an NFC interface change.

+1.

As for the other code paths I've highlighted, it's probably okay to just put a TODO saying "needs testing" or something to that effect for most of them. @grimar, what do you think?

I agree. It might be hard/take time to add test cases for all places, but it should not be a stopper for us I believe.

Higuoxing abandoned this revision.Apr 20 2020, 12:33 AM