This is an archive of the discontinued LLVM Phabricator instance.

[Support] Add stream tie function and use it for errs()
ClosedPublic

Authored by jhenderson on Jun 4 2020, 6:21 AM.

Details

Summary

errs() is now tied to outs() so that if something prints to errs(), outs() will be flushed before the printing occurs. This avoids interleaving output between the two and is consistent with standard cout and cerr behaviour.

Diff Detail

Event Timeline

jhenderson created this revision.Jun 4 2020, 6:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 4 2020, 6:21 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
MaskRay accepted this revision.Jun 4 2020, 9:09 AM

Looks great!

llvm/include/llvm/Support/raw_ostream.h
312

There is another private specifier above. You can avoid this private.

505–509

Don’t duplicate function or class name at the beginning of the comment.

http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments

This revision is now accepted and ready to land.Jun 4 2020, 9:09 AM
MaskRay requested changes to this revision.EditedJun 4 2020, 10:26 AM

Sorry, undo LGTM. Is it worth making tied_to an opt-in feature? i.e. for command line utilities, they can run errs().tie(outs()).

We can probably even fold the TiedTo field into raw_fd_ostream. Just a thought.

This revision now requires changes to proceed.Jun 4 2020, 10:26 AM

Sorry, undo LGTM. Is it worth making tied_to an opt-in feature? i.e. for command line utilities, they can run errs().tie(outs()).

The issue with that is that all tools that want it (most of them?) will have to explicitly do the work for that. One of the motivations to change errs() is to prevent the need to require code in llvm-dwarfdump (and other tools) to be used to prevent stream interleaving. @dblaikie/@JDevlieghere etc what are your opinions? Is it acceptable to require tools to specify the one-line change here?

We can probably even fold the TiedTo field into raw_fd_ostream. Just a thought.

Folding it into raw_fd_ostream would be reasonable. We could have an Optional stream as the tied-to stream for this, and it would probably simplify the required code (no need to provide a template parameter etc to identify the base stream). tie and untie could be provided to set/unset the tied-to stream.

If we went with that approach, we could always have errs() and outs() tied as proposed in the current patch, but with the ability for tools to opt-out if they don't want the behaviour. What do you think?

labath added a comment.Jun 8 2020, 4:11 AM

Sorry, undo LGTM. Is it worth making tied_to an opt-in feature? i.e. for command line utilities, they can run errs().tie(outs()).

The issue with that is that all tools that want it (most of them?) will have to explicitly do the work for that. One of the motivations to change errs() is to prevent the need to require code in llvm-dwarfdump (and other tools) to be used to prevent stream interleaving. @dblaikie/@JDevlieghere etc what are your opinions? Is it acceptable to require tools to specify the one-line change here?

We can probably even fold the TiedTo field into raw_fd_ostream. Just a thought.

Folding it into raw_fd_ostream would be reasonable. We could have an Optional stream as the tied-to stream for this, and it would probably simplify the required code (no need to provide a template parameter etc to identify the base stream). tie and untie could be provided to set/unset the tied-to stream.

If we went with that approach, we could always have errs() and outs() tied as proposed in the current patch, but with the ability for tools to opt-out if they don't want the behaviour. What do you think?

I don't think it's necessary, but that approach makes sense to me -- though I'd go with tie(nullptr) instead of untie() for symmetry with std::basic_ios.

llvm/unittests/Support/raw_ostream_test.cpp
385

This appears to be unused.

jhenderson updated this revision to Diff 269184.Jun 8 2020, 5:03 AM
jhenderson retitled this revision from [Support] Create a tied stream class and use it for errs() to [Support] Add stream tie function and use it for errs().
jhenderson edited the summary of this revision. (Show Details)

Add tie() function to raw_ostream instead of using a new class. Also expand unit testing to cover the related areas.

jhenderson marked 2 inline comments as done.Jun 8 2020, 5:04 AM
jhenderson added inline comments.
llvm/unittests/Support/raw_ostream_test.cpp
385

Oops, thanks. I've removed it locally.

JDevlieghere accepted this revision.Jun 8 2020, 12:02 PM

Looks good to me, but I'd like Pavel to sign off as well.

MaskRay accepted this revision.Jun 8 2020, 12:56 PM

Sorry, undo LGTM. Is it worth making tied_to an opt-in feature? i.e. for command line utilities, they can run errs().tie(outs()).

The issue with that is that all tools that want it (most of them?) will have to explicitly do the work for that. One of the motivations to change errs() is to prevent the need to require code in llvm-dwarfdump (and other tools) to be used to prevent stream interleaving. @dblaikie/@JDevlieghere etc what are your opinions? Is it acceptable to require tools to specify the one-line change here?

We can probably even fold the TiedTo field into raw_fd_ostream. Just a thought.

Folding it into raw_fd_ostream would be reasonable. We could have an Optional stream as the tied-to stream for this, and it would probably simplify the required code (no need to provide a template parameter etc to identify the base stream). tie and untie could be provided to set/unset the tied-to stream.

If we went with that approach, we could always have errs() and outs() tied as proposed in the current patch, but with the ability for tools to opt-out if they don't want the behaviour. What do you think?

Let's make it opt-out for now. Hopefully there aren't any tool requiring errs().tie(nullptr)

llvm/include/llvm/Support/raw_ostream.h
70

Inconsistent comment marker. // -> ///

71

* => *

303

TieTo -> TiedTo

This revision is now accepted and ready to land.Jun 8 2020, 12:56 PM
labath accepted this revision.Jun 9 2020, 12:44 AM

Yep, this is great.

llvm/unittests/Support/raw_ostream_test.cpp
385

It looks like it's still here. :)

jhenderson marked 2 inline comments as done.Jun 9 2020, 1:15 AM
jhenderson added inline comments.
llvm/unittests/Support/raw_ostream_test.cpp
385

Yeah, I missed it in my last update, and was too lazy to upload a new diff for just that :)

I'll triple check that it's gone before committing though!

jhenderson updated this revision to Diff 269442.Jun 9 2020, 2:10 AM
jhenderson marked 4 inline comments as done.

Address review comments.

jhenderson updated this revision to Diff 269448.Jun 9 2020, 2:12 AM

Upload correct diff.

llvm/include/llvm/Support/raw_ostream.h
303

I'm not going to make this change, since this function tells the class to tie itself to TieTo. TiedTo would imply it has already been tied to it, which would be false.

labath added inline comments.Jun 9 2020, 2:42 AM
llvm/unittests/Support/raw_ostream_test.cpp
385

Sure, no worries.:)

This revision was automatically updated to reflect the committed changes.
MaskRay added inline comments.Jun 9 2020, 6:04 PM
llvm/unittests/Support/raw_ostream_test.cpp
436

I added a pairing TiedTo.tie(nullptr);

TiedStream was destroyed before TiedTo and when flushing TiedTo, TiedStream has been destroyed => asan stack-use-after-scope failure.

Fixed in ceaee253f4f3079d2786697d099e43e9f61f897a

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

Thanks for highlighting this, it's been brought up on D81525 as well, so rather than discuss it in two places, we'll look at it there, if that's okay?

As to why this was decided to be the default in the first place: a) it matches the default behaviour of std::cout/std::cerr, and b) in the typical case of printing an error message of some kind, it ensures we don't interleave error output in the middle of a line of other output, which seems like an almost universally good thing to me. But yet, we missed the thread-safety (or rather lack thereof) aspect of this change.

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

Thanks for highlighting this, it's been brought up on D81525 as well, so rather than discuss it in two places, we'll look at it there, if that's okay?

Happy to discuss there. It's a separate issue though, the discussion there is about initialization being racy,

As to why this was decided to be the default in the first place: a) it matches the default behaviour of std::cout/std::cerr, and b) in the typical case of printing an error message of some kind, it ensures we don't interleave error output in the middle of a line of other output, which seems like an almost universally good thing to me. But yet, we missed the thread-safety (or rather lack thereof) aspect of this change.

cout and cerr have been banned in every codebase I've done much IO in, so I had to look this up :-)
However they're *threadsafe*, and so synchronizing there is pretty different.
Per cppreference the specified behavior is also not the same: "synchronized C++ streams are guaranteed to be thread-safe (individual characters output from multiple threads may interleave, but no data races occur)" - this seems to apply to multiple streams too.

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

Thanks for highlighting this, it's been brought up on D81525 as well, so rather than discuss it in two places, we'll look at it there, if that's okay?

Happy to discuss there. It's a separate issue though, the discussion there is about initialization being racy,

Yeah, I realised that after posting that previous post, sorry! Still, probably best to keep the discussion in one place for now.

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

Thanks for highlighting this, it's been brought up on D81525 as well, so rather than discuss it in two places, we'll look at it there, if that's okay?

Happy to discuss there. It's a separate issue though, the discussion there is about initialization being racy,

Yeah, I realised that after posting that previous post, sorry! Still, probably best to keep the discussion in one place for now.

Did we resolve this? In an internal XLA build tsan complains about this being racy, and I'm not sure if the problem is with how we are using llvm::dbgs() or with this patch. If the latter, please rollback while we figure out a non-racy way to achieve the same thing.

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

Thanks for highlighting this, it's been brought up on D81525 as well, so rather than discuss it in two places, we'll look at it there, if that's okay?

Happy to discuss there. It's a separate issue though, the discussion there is about initialization being racy,

Yeah, I realised that after posting that previous post, sorry! Still, probably best to keep the discussion in one place for now.

Did we resolve this? In an internal XLA build tsan complains about this being racy, and I'm not sure if the problem is with how we are using llvm::dbgs() or with this patch. If the latter, please rollback while we figure out a non-racy way to achieve the same thing.

Hopefully resolved by @MaskRay 's 030897523d43e3296f69d25a71a140d9e5793c6a

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

Thanks for highlighting this, it's been brought up on D81525 as well, so rather than discuss it in two places, we'll look at it there, if that's okay?

Happy to discuss there. It's a separate issue though, the discussion there is about initialization being racy,

Yeah, I realised that after posting that previous post, sorry! Still, probably best to keep the discussion in one place for now.

Did we resolve this? In an internal XLA build tsan complains about this being racy, and I'm not sure if the problem is with how we are using llvm::dbgs() or with this patch. If the latter, please rollback while we figure out a non-racy way to achieve the same thing.

Does 030897523d43e3296f69d25a71a140d9e5793c6a fix the race?

(clangd used tie now. .debug_line parsing depends on it. Reverting is more disruptive than simply disabling tie-by-default behavior)

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

Thanks for highlighting this, it's been brought up on D81525 as well, so rather than discuss it in two places, we'll look at it there, if that's okay?

Happy to discuss there. It's a separate issue though, the discussion there is about initialization being racy,

Yeah, I realised that after posting that previous post, sorry! Still, probably best to keep the discussion in one place for now.

Did we resolve this? In an internal XLA build tsan complains about this being racy, and I'm not sure if the problem is with how we are using llvm::dbgs() or with this patch. If the latter, please rollback while we figure out a non-racy way to achieve the same thing.

Does 030897523d43e3296f69d25a71a140d9e5793c6a fix the race?

That should fix it, thanks!

(clangd used tie now. .debug_line parsing depends on it. Reverting is more disruptive than simply disabling tie-by-default behavior)

This is surprising and unsafe behaviour. Can we reconsider changing the default, at least until a more careful audit of the places errs() is used?
In particular, if errs is used from multiple threads (with a lock) and outs is used from a single thread (so no lock needed) then this introduces racy flushes to outs() concurrent with writes.

Thanks for highlighting this, it's been brought up on D81525 as well, so rather than discuss it in two places, we'll look at it there, if that's okay?

Happy to discuss there. It's a separate issue though, the discussion there is about initialization being racy,

Yeah, I realised that after posting that previous post, sorry! Still, probably best to keep the discussion in one place for now.

Did we resolve this? In an internal XLA build tsan complains about this being racy, and I'm not sure if the problem is with how we are using llvm::dbgs() or with this patch. If the latter, please rollback while we figure out a non-racy way to achieve the same thing.

Does 030897523d43e3296f69d25a71a140d9e5793c6a fix the race?

That should fix it, thanks!

(clangd used tie now. .debug_line parsing depends on it. Reverting is more disruptive than simply disabling tie-by-default behavior)

Note, printing via dbgs() << "a" or errs() << "a" concurrently was racy even (in both NDEBUG and !NDEBUG builds) before this patch.

This patch introduced additional problems that (1) calling errs() concurrently would be racy (due to S.tie(&outs())) (2) printing to errs() in one thread and printing to outs() in another thread would be racy.

Anyway, you may want to investigate whether there are existing problems due to concurrent dbgs() or errs() uses, even if TSAN did not catch it before this patch.