This is an archive of the discontinued LLVM Phabricator instance.

[llvm-readobj] Fix output interleaving issue caused by using multiple streams at the same time.
ClosedPublic

Authored by ychen on Jun 10 2019, 7:05 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

ychen created this revision.Jun 10 2019, 7:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2019, 7:05 PM
ychen edited the summary of this revision. (Show Details)Jun 10 2019, 7:05 PM

The code change looks good. Thanks!

llvm/test/tools/llvm-readobj/fix-interleaved-output.test
4 ↗(On Diff #203954)

You may add a comment (starting with ## ) explaining what the test intends to check.

Do you have a better name than fix-interleaved-output.test?

jhenderson added inline comments.Jun 11 2019, 2:31 AM
llvm/tools/llvm-readobj/llvm-readobj.cpp
469 ↗(On Diff #203954)

I thought the idea of the change was to avoid the need for the explicit flush?

ychen updated this revision to Diff 204095.Jun 11 2019, 10:07 AM
  • update
ychen updated this revision to Diff 204096.Jun 11 2019, 10:12 AM
ychen marked an inline comment as done.
  • update

The code change looks good. Thanks!

Thank you for the review. Test file name updated to reflect the comment in it.

llvm/tools/llvm-readobj/llvm-readobj.cpp
469 ↗(On Diff #203954)

You are correct. It helps me identify the root cause: opts::CodeViewMergedTypes also uses its own stream.

rupprecht added inline comments.Jun 11 2019, 12:16 PM
llvm/test/tools/llvm-readobj/check-output-order.test
7–9 ↗(On Diff #204096)

There is an --all option that turns on: --file-headers, --program-headers, --section-headers, --symbols, --relocations, --dynamic-table, --notes, --version-info, --unwind, --section-groups and --elf-hash-histogram

Is that what you want here?

That said, checking the full content of --all in a test may be too verbose. Maybe just check a few lines from each section; enough to make sure they aren't interleaved.

llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

I don't understand this part of the change. W.getOStream() is a raw_ostream and can't be casted to a subclass. What this is doing is strange -- I'm not sure how it compiles.

I don't think the change to this file is actually necessary, is it?

ychen updated this revision to Diff 204161.Jun 11 2019, 1:53 PM
  • update comment
ychen marked 2 inline comments as done.Jun 11 2019, 2:10 PM

Thanks for the review, Jordan.

llvm/test/tools/llvm-readobj/check-output-order.test
7–9 ↗(On Diff #204096)

Totally agree that checking for all output in the test file is too much. Comments updated.

--all does not contain every possible piece of information. If we really want to check the order is the same as printing order (for all possible output) in the code for robustness, I think --all-debug is still needed.

llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

can't be casted to a subclass.

Why not if we know for a fact that the referenced object is indeed that subclass?

What this is doing is strange -- I'm not sure how it compiles.

I could be wrong but I think it is because there is no compiler checking for static_cast.

I don't think the change to this file is actually necessary, is it?

If not, GNUStyle<>::OS is another output stream that has different buffering state than the stream referenced by W.getOStream() here. (See formatted_raw_ostream::setStream)

rupprecht added inline comments.Jun 11 2019, 3:35 PM
llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

If not, GNUStyle<>::OS is another output stream

It looks like formatted_raw_ostream just wraps the stream passed in; it doesn't create another stream

I don't think what's happening is what appears to be happening. It looks like a cast. But I think it's creating a temporary reference that immediately leaves the stack and causes heavy UB here.

Have you run unit tests with this part of the change? There are tons of crashes for check-llvm-tools-llvm-readobj with just this part of the change.

ychen marked an inline comment as done.Jun 11 2019, 3:54 PM
ychen added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

Ran both` check-all` and check-llvm-tools-llvm-readobj, did not see any regression ... (based on master b341d305a4cef8e75f520d104cd09b0fcfdc8bba)

It looks like formatted_raw_ostream just wraps the stream passed in;

Agree. And formatted_raw_ostream is a stream by itself even without wrapping the stream passed in.

it doesn't create another stream

I'm not sure. formatted_raw_ostream is a subclass of raw_ostream. Why is it not another stream?

rupprecht added inline comments.Jun 11 2019, 5:25 PM
llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

Ran both` check-all` and check-llvm-tools-llvm-readobj, did not see any regression ... (based on master b341d305a4cef8e75f520d104cd09b0fcfdc8bba)

I only get test failures when I have just this file changed. When I apply the whole patch, the test failures go away.
A coworker pointed me to the rules here: http://eel.is/c++draft/expr.static.cast#2

The tl;dr is that this is valid if it's always true that W.getOStream() here is always a formatted_raw_ostream. If it's not, then you get UB (as evidenced by crashes when running unit tests). The rest of this patch updates all the places so that it is formatted_raw_ostream everywhere, hence why you don't see any crashes.

So in that sense, this patch is fine, but IMO it's a very fragile contract that we shouldn't rely on -- there's 4 layers of indirection between where ScopedPrinter is created (and now initialized w/ fouts() instead of outs()). Would it be possible, for instance, to instead change ScopedPrinter to take a formatted_raw_ostream instead of a raw_ostream? Or maybe insert calls to flush() at appropriate places (whenever switching between GNU vs LLVM style printing)?

ychen marked an inline comment as done.Jun 11 2019, 6:19 PM
ychen added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

The contract could be checked by adding assert(&W.getOStream() == &llvm::fouts()).

Relying on flush() and having more than one stream outputting sounds more fragile to me. It needs tracking more locations in the source code.

change ScopedPrinter to take a formatted_raw_ostream

This could serve the purpose here. But ScopedPrinter should not care about if the stream is formatted or not.

ychen updated this revision to Diff 204199.Jun 11 2019, 6:21 PM
  • add assertion

By the way, since you are touching COFF and Mach-O code, your test should probably be run on a file with each of those formats too.

llvm/test/tools/llvm-readobj/check-output-order.test
7–9 ↗(On Diff #204096)

--all can be combined with other options manually, if we are concerned with getting everything. I'm not sure how worth it it is to check every combination, but I certainly won't argue against somebody doing the work if they feel there's value.

When it comes to checking, you can just check the last line or two of the previous block, followed by the first couple of lines of the next block. If they are interleaved, one or other of the checks will fail.

llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

The OS here should definitely be a reference, not a copy since the buffering for streams is local to the instance, and not attached to the main stream itself that everything wraps.

Is there anything stopping GNUStyle having a raw_ostream & and letting virtual function handling do the work of figuring out the appropriate thing to call? That would solve all these issues.

ychen marked 2 inline comments as done.Jun 12 2019, 9:02 AM
ychen added inline comments.
llvm/test/tools/llvm-readobj/check-output-order.test
7–9 ↗(On Diff #204096)

--all can be combined with other options manually, if we are concerned with getting everything. I'm not sure how worth it it is to check every combination, but I certainly won't argue against somebody doing the work if they feel there's value.

I'm not entirely sure either. Should we keep the TODO here or you prefer not to have it?

llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

GNUStyle uses formatted_raw_ostream::PadToColumn a lot, which looks like ScopedPrinter::startLine (both for indentation). I think we could get away with not using llvm::fouts(), and make GNUStyle use ScopedPrinter like LLVMStype do, but it does not seem to be a trivial change.

ychen added a comment.Jun 12 2019, 9:16 AM

By the way, since you are touching COFF and Mach-O code, your test should probably be run on a file with each of those formats too.

The change in COFF and Mach-o dumper is covered by

tools/llvm-readobj/coff-needed-libs.test (COFFDumper.cpp)
tools/llvm-readobj/macho-needed-libs.test (MachoDumper.cpp)

By the way, since you are touching COFF and Mach-O code, your test should probably be run on a file with each of those formats too.

The change in COFF and Mach-o dumper is covered by

tools/llvm-readobj/coff-needed-libs.test (COFFDumper.cpp)
tools/llvm-readobj/macho-needed-libs.test (MachoDumper.cpp)

Oops, I was viewing via the "/new/" page. Sorry.

llvm/test/tools/llvm-readobj/check-output-order.test
7–9 ↗(On Diff #204096)

I don't think we need the TODO, but I'm okay with the idea of using --all in this test.

llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

@rupprecht, what do you think?

ychen updated this revision to Diff 204337.Jun 12 2019, 11:44 AM
  • update test
rupprecht accepted this revision.Jun 12 2019, 12:01 PM
rupprecht added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
391–392 ↗(On Diff #204096)

I'm understanding the issue a little better now... I think there's probably a better way, but for now having this assert seems good enough to get this checked in (and unblock the other patch that depends on this).

This revision is now accepted and ready to land.Jun 12 2019, 12:01 PM

Thanks a lot for the review, Jordan. Do you mind commit this for me?

This revision was automatically updated to reflect the committed changes.

Sure, committed as r363198