Page MenuHomePhabricator

BitStream reader: propagate errors
ClosedPublic

Authored by jfb on Jun 18 2019, 2:53 PM.

Details

Summary

The bitstream reader handles errors poorly. This has two effects:

  • Bugs in file handling (especially modules) manifest as an "unexpected end of file" crash
  • Users of clang as a library end up aborting because the code unconditionally calls report_fatal_error

The bitstream reader should be more resilient and return Expected / Error as
soon as an error is encountered, not way late like it does now. This patch
starts doing so and adopting the error handling where I think it makes sense.
There's plenty more to do: this patch propagates errors to be minimally useful,
and follow-ups will propagate them further and improve diagnostics.

https://bugs.llvm.org/show_bug.cgi?id=42311
rdar://problem/33159405

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jfb updated this revision to Diff 206298.Mon, Jun 24, 2:14 PM
  • Ignore another error for now, this one is probably a sign that there's an underlying bug somewhere.
jfb updated this revision to Diff 206299.Mon, Jun 24, 2:22 PM
  • Update error message string in tools/llvm-lto/error.ll
jfb added a comment.Mon, Jun 24, 2:41 PM

All of check-llvm now passes. I'll look at other users of this stuff in the monorepo, compile and run their tests.

jfb updated this revision to Diff 206310.Mon, Jun 24, 3:21 PM
  • Fix clang-tools-extra, check-clang-tools now passes.
jfb updated this revision to Diff 206335.Mon, Jun 24, 5:09 PM
  • Fix a few more uses of APIs in clang-tools-extra. All of the monorepo now builds.
jfb added a subscriber: lhames.Mon, Jun 24, 5:09 PM
jfb retitled this revision from WIP BitStream reader: propagate errors to BitStream reader: propagate errors.Mon, Jun 24, 9:42 PM
jfb added a comment.Mon, Jun 24, 9:44 PM

I did a build of all LLVM monorepo projects, and only LLVM / clang / clang-tools-extra are impacted. I therefore ran tests as follows, and they all pass:

rm -rf debug ; mkdir debug && (cd debug && cmake -G Ninja ../llvm -DLLVM_ENABLE_ASSERTIONS=ON -DCMAKE_BUILD_TYPE=Debug -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra" && ninja check-all )

I haven't run clang-format yet so ignore formatting. Otherwise, this patch is ready for review and getting committed.

jfb updated this revision to Diff 206483.Tue, Jun 25, 10:34 AM
  • Rebase
jfb updated this revision to Diff 206520.Tue, Jun 25, 1:14 PM
  • Remove attribute deprecated, it was there to help catch API uses that needed updating.
bruno accepted this revision.Tue, Jun 25, 1:17 PM

LGTM

This revision is now accepted and ready to land.Tue, Jun 25, 1:17 PM
jfb updated this revision to Diff 206529.Tue, Jun 25, 1:56 PM
  • clang-format all the code. If it's ugly, don't complain to me
thegameg added inline comments.Tue, Jun 25, 2:03 PM
llvm/include/llvm/Bitcode/BitstreamReader.h
441 ↗(On Diff #206520)

llvm:: seems unnecessary here.

489 ↗(On Diff #206520)

Any reason why this doesn't return Error?

llvm/lib/Bitcode/Reader/BitstreamReader.cpp
140 ↗(On Diff #206520)

return createStringError here too?

Bigcheese added inline comments.Tue, Jun 25, 2:17 PM
clang-tools-extra/clang-doc/BitcodeReader.cpp
527–529 ↗(On Diff #206520)

This is a pretty big behavior change. Is clang-doc expecting to get files with unknown blocks?

611–618 ↗(On Diff #206520)

Does consumeError return? MaybeCode.get() will crash if !MaybeCode.

I haven't had a chance to audit the whole patch yet, but in general the error suppression idioms are unsafe (though maybe no more so than the existing code?).

I would be inclined to audit all those FIXMEs and replace them with cantFails or consumeErrors. consumeError will generally match existing behavior in places where you were ignoring errors. cantFail will get you aggressive crashes, which can be handy for debugging but not so fun in release code.

Also, if this patch passes the regression tests we need more failure tests. :)

clang-tools-extra/clang-doc/BitcodeReader.cpp
527–529 ↗(On Diff #206520)

The inner test here is unsafe, as it will discard the outer error. You need:

if (llvm::Error Err = readSubBlock(BlockOrCode, I)) {
  if (llvm::Error Skipped = Stream.SkipBlock()) {
    consumeError(std::move(Err));
    return Skipped;
  }
  return Err;
}

or:

if (llvm::Error Err = readSubBlock(BlockOrCode, I)) {
  if (llvm::Error Skipped = Stream.SkipBlock()) {
    return joinErrors(std::move(Err), std::move(Skipped));
  return Err;
}
611–618 ↗(On Diff #206520)

Yes: consumeError returns. This should probably be:

if (!MaybeCode)
  return consumeError(MaybeCode.takeError()), Cursor::Badness;
clang/lib/Frontend/CompilerInstance.cpp
2051–2053 ↗(On Diff #206529)

This will crash if writeIndex ever generates an error. For that reason I would suggest writing this as:

// FIXME: Can actually fail! <fixme goes here>
cantFail(GlobalModuleIndex::writeIndex(...));

It has the same effect, but without the control flow.

2082 ↗(On Diff #206529)

Another cantFail candidate.

clang/lib/Frontend/FrontendAction.cpp
945–948 ↗(On Diff #206529)

I'd suggest cantFail here, if not for that fixme comment. Yikes.

clang/lib/Frontend/Rewrite/FrontendActions.cpp
132–133 ↗(On Diff #206529)

You need a consumeError here or you'll get a runtime crash if an error is generated.

clang/lib/Frontend/SerializedDiagnosticReader.cpp
89–91 ↗(On Diff #206529)

This needs a consumeError or it will crash if an error is generated. How about:

if (llvm::Error Err = Stream.SkipBlock())

return consumeError(std::move(Err)), SDError::MalformedTopLevelBlock;
149–151 ↗(On Diff #206529)

Needs a consumeError.

166–167 ↗(On Diff #206529)

Needs a consumeError.

194–196 ↗(On Diff #206529)

Needs a consumeError.

216–217 ↗(On Diff #206529)

Needs a consumeError.

clang/lib/Frontend/TestModuleFileExtension.cpp
53–54 ↗(On Diff #206529)

Needs a consumeError or cantFail.

Fun fact: cantFail will auto-unwrap Expecteds:

Expected<T> foo();

T v = cantFail(foo());
clang/lib/Serialization/ASTReader.cpp
1620 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

1628 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

1636 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

1642 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

1669 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

2208 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

2218 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

2250 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

2401 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

2411 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

2433 ↗(On Diff #206529)

Unsafe on error. Needs cantFail or consumeError.

jfb marked 8 inline comments as done.Tue, Jun 25, 4:18 PM
jfb added inline comments.
clang-tools-extra/clang-doc/BitcodeReader.cpp
527–529 ↗(On Diff #206520)

I don't think it expects this, and it looks (from the above code) like the intent is to handle errors when parsing (so this would be a bug that we'd want fixed).

611–618 ↗(On Diff #206520)

Good point, fixed.

llvm/include/llvm/Bitcode/BitstreamReader.h
489 ↗(On Diff #206520)

I'm not sure it's really an error: it goes to the end of the block either because it's empty, or because it pops the scope (which doesn't error either). It might be erroneously used, but I'm not sure we should make it an error right now. WDYT?

jfb updated this revision to Diff 206546.Tue, Jun 25, 4:18 PM
jfb marked an inline comment as done.
  • Address Francis' and Michael's comments.
jfb added a comment.Tue, Jun 25, 4:20 PM

I haven't had a chance to audit the whole patch yet, but in general the error suppression idioms are unsafe (though maybe no more so than the existing code?).

I would be inclined to audit all those FIXMEs and replace them with cantFails or consumeErrors. consumeError will generally match existing behavior in places where you were ignoring errors. cantFail will get you aggressive crashes, which can be handy for debugging but not so fun in release code.

I haven't addressed the comments yet, but wanted to respond: yes, it's unsafe and I was wondering what folks would rather see, so thanks for bringing it up! I indeed only consumed errors that we encountered, and in that sense the code isn't bug-compatible with what we had before. Seems like you'd want consumeError in most places, which I can certainly do.

Also, if this patch passes the regression tests we need more failure tests. :)

Indeed! That's a pretty terrifying thing... but I'm not signing up to address *that* particular issue :)

thegameg added inline comments.Tue, Jun 25, 4:41 PM
llvm/include/llvm/Bitcode/BitstreamReader.h
489 ↗(On Diff #206520)

Yeah, I'm not sure either... BitstreamCursor::advance seems to return BitstreamEntry::getError(); if this function fails after it encountered an END_BLOCK, and the other users seem to return things like SDError::InvalidDiagnostics or Cursor::BadBlock.

I guess for now, it's fine as it is.

jfb updated this revision to Diff 206551.Tue, Jun 25, 5:05 PM
jfb marked 25 inline comments as done.
  • Address Lang's comments.
  • format
clang-tools-extra/clang-doc/BitcodeReader.cpp
611–618 ↗(On Diff #206520)

You tempt me so with this comma operator... but no, I must not!

clang/lib/Frontend/CompilerInstance.cpp
2051–2053 ↗(On Diff #206529)

I think this one shouldn't crash, so I've made it consume the error instead (and it still needs to get fixed).

jfb updated this revision to Diff 206552.Tue, Jun 25, 5:09 PM
  • Improve one fatal error message.
jfb added a comment.Tue, Jun 25, 5:11 PM

I've addressed Lang's comments and the re-audited all the new FIXME instances. This patch now actually drops errors on the floor instead of implicitly erroring out unless the error path is tested (which is what I had before). This hides bugs, but I left FIXMEs everywhere and it means the patch will be less disruptive because it's bug-compatible in the areas that aren't tested.

lhames accepted this revision.Wed, Jun 26, 9:16 AM

Indeed! That's a pretty terrifying thing... but I'm not signing up to address *that* particular issue :)

Yep. I don't think you need to address that in this patch.

LGTM!

clang-tools-extra/clang-doc/BitcodeReader.cpp
611–618 ↗(On Diff #206520)

;)

jfb marked 2 inline comments as done.Wed, Jun 26, 9:31 AM
jfb added a comment.Wed, Jun 26, 10:08 AM

I think this is ready to go! I rebased and ran check-all for LLVM / clang / clang-tools-extras and everything passes.

This revision was automatically updated to reflect the committed changes.
abrachet added inline comments.
llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp
205

This and an identical switch on line 5367 cause an unused variable warning from this commit. I don't know if the build bots report on this, or the proper way to tell you about this but hopefully you will see it :)

jfb marked 2 inline comments as done.Thu, Jun 27, 9:25 AM
jfb added a subscriber: hans.
jfb added inline comments.
llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp
205

Thanks, looks like @hans fixed it in r364505. Odd that it wasn't warning locally for me.

bjope added a subscriber: bjope.Fri, Jul 5, 8:26 AM
bjope added inline comments.
clang-tools-extra/trunk/clang-doc/BitcodeReader.cpp
619

This has caused problem for our sanitizer tests the last couple of days:

FAIL: Extra Tools Unit Tests :: clang-doc/./ClangDocTests/BitcodeTest.emitMethodInfoBitcode (20077 of 48746)
******************** TEST 'Extra Tools Unit Tests :: clang-doc/./ClangDocTests/BitcodeTest.emitMethodInfoBitcode' FAILED ********************
Note: Google Test filter = BitcodeTest.emitMethodInfoBitcode
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from BitcodeTest
[ RUN      ] BitcodeTest.emitMethodInfoBitcode
/local/repo/bbiswjenk/fem023-eiffel003/workspace/llvm/llvm-master-sanitize/clang-tools-extra/clang-doc/BitcodeReader.cpp:621:13: runtime error: load of value 9, which is not a valid value for type 'llvm::bitc::FixedAbbrevIDs'

This was seen when building trunk with clang 6.0.0 and LVM_USE_SANITIZER=Undefined

cmake -G Ninja '-DLLVM_ENABLE_PROJECTS='\''clang;clang-tools-extra'\''' -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON '-DCMAKE_CXX_FLAGS='\''-stdlib=libc++'\''' -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_USE_LINKER=gold -DLLVM_USE_SANITIZER=Undefined ../.
-- The C compiler identification is Clang 6.0.0
-- The CXX compiler identification is Clang 6.0.0

Afaict we can't cast the read value to FixedAbbrevIDs as we do not know yet if it matches one of the values defined in the enum, or if we will take the default case.

A similar switch exist at cfe/trunk/lib/Frontend/SerializedDiagnosticReader.cpp:127, but it is using a slightly different pattern:

unsigned Code;
Code = Res.get();
switch ((llvm::bitc::FixedAbbrevIDs)Code)

I haven't seen any failures for SerializedDiagnosticReader. So either we lack test coverage for that function, or the sanitizer only introduce the check when using the static_cast (and declaring Code as an enum) as done here.

jfb marked 2 inline comments as done.Fri, Jul 5, 11:02 AM
jfb added inline comments.
clang-tools-extra/trunk/clang-doc/BitcodeReader.cpp
619

That sounds like a pre-existing bug. We should check that the value is in range before casting. Can you send patches to fix both code locations, and add test coverage? This code is indeed poorly tested.

Why do the sanitizers catch static_cast but not C-style casts?

jfb marked an inline comment as done.Fri, Jul 5, 11:03 AM
jfb added inline comments.
clang-tools-extra/trunk/clang-doc/BitcodeReader.cpp
619

To be clear: relying on the default case is still UB because there's a cast to the enum type before it occurs.

bjope added inline comments.Fri, Jul 5, 1:11 PM
clang-tools-extra/trunk/clang-doc/BitcodeReader.cpp
619

I made a patch here (assuming the goal would be to keep the cast to the enum, and to let the switch cover all enum values): https://reviews.llvm.org/D64262