Page MenuHomePhabricator

[Support] Ensure errs() is constructed after outs() and don't rerun tie when errs() is called
AbandonedPublic

Authored by MaskRay on Jun 9 2020, 7:12 PM.

Details

Summary

This avoids a gotcha:

  • if errs() is constructed before outs()
  • and if errs() is changed to a buffered state
  • when destructing errs() (outs() has been destructed), the tied-to pointer to outs() is dangling.

Don't rerun tie(outs()) so that the next call to errs() will not reset errs().tie(nullptr)

Remark: after errs().tie(nullptr), the unbuffered stderr stream should be thread safe (not dependent on the thread safety of outs()).

Diff Detail

Unit TestsFailed

TimeTest
20 mslinux > MLIR.Conversion/StandardToLLVM::convert-dynamic-memref-ops.mlir
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/premerge-debian-5d5d95fcbb-cs8pv-2/llvm-project/premerge-checks/build/bin/mlir-opt -convert-std-to-llvm /mnt/disks/ssd0/agent/premerge-debian-5d5d95fcbb-cs8pv-2/llvm-project/premerge-checks/mlir/test/Conversion/StandardToLLVM/convert-dynamic-memref-ops.mlir | /mnt/disks/ssd0/agent/premerge-debian-5d5d95fcbb-cs8pv-2/llvm-project/premerge-checks/build/bin/FileCheck /mnt/disks/ssd0/agent/premerge-debian-5d5d95fcbb-cs8pv-2/llvm-project/premerge-checks/mlir/test/Conversion/StandardToLLVM/convert-dynamic-memref-ops.mlir --dump-input-on-failure
50 mswindows > MLIR.Conversion/StandardToLLVM::convert-dynamic-memref-ops.mlir
Script: -- : 'RUN: at line 1'; c:\ws\buildkite-windows-4\llvm-project\premerge-checks\build\bin\mlir-opt.exe -convert-std-to-llvm C:\ws\buildkite-windows-4\llvm-project\premerge-checks\mlir\test\Conversion\StandardToLLVM\convert-dynamic-memref-ops.mlir | c:\ws\buildkite-windows-4\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\buildkite-windows-4\llvm-project\premerge-checks\mlir\test\Conversion\StandardToLLVM\convert-dynamic-memref-ops.mlir --dump-input-on-failure

Event Timeline

MaskRay created this revision.Jun 9 2020, 7:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 9 2020, 7:12 PM
MaskRay edited the summary of this revision. (Show Details)Jun 9 2020, 7:17 PM

Adding global ctors is generally avoided - as they add startup cost (which might be very important to interactive programs, etc) to code linking in the LLVM libraries.

As for the gotchas:

if errs() is constructed before outs()

Why is that a problem? when errs() calls outs(), the static local ctor for outs() will be initialized as intended, right?

and if errs() is changed to a buffered state

Not sure I understand this issue - could you describe it in more detail?

when destructing errs() (outs() has been destructed), the tied-to pointer to outs() is dangling.

Yeah, that's potentially a problem - a wrapper that avoids running the dtor on these types (but does flush them, perhaps?) might be useful in that regard.

Adding global ctors is generally avoided - as they add startup cost (which might be very important to interactive programs, etc) to code linking in the LLVM libraries.

From a friend: "The output and error streams might be essential for logging in case of other initialization errors, so such cost is not worthless."

(It is hard to imagine an application which does not call outs() or errs() if they use raw_ostream.cpp)

As for the gotchas:

if errs() is constructed before outs()

Why is that a problem? when errs() calls outs(), the static local ctor for outs() will be initialized as intended, right?

and if errs() is changed to a buffered state

Not sure I understand this issue - could you describe it in more detail?

when destructing errs() (outs() has been destructed), the tied-to pointer to outs() is dangling.

Yeah, that's potentially a problem - a wrapper that avoids running the dtor on these types (but does flush them, perhaps?) might be useful in that regard.

The three conditions are conjunctions:/ When all three are satisfied, "the tied-to pointer to outs() is dangling" will be the result.

Adding global ctors is generally avoided - as they add startup cost (which might be very important to interactive programs, etc) to code linking in the LLVM libraries.

From a friend: "The output and error streams might be essential for logging in case of other initialization errors, so such cost is not worthless."

My understanding is that this cost is one of the reasons that raw_ostream exists in the first place, and it is *the* reason why #include <iostream> is banned in llvm.

Now, this solution does not introduce a global constructor in every compile unit like <iostream> does, but that does come with a cost -- it makes the code susceptible to static initialization order fiasco -- calling outs() before the constructor has executed will return a pointer to uninitialized memory.

So, I think it would be good to keep the current design of doing the initialization in function-local statics, at least until a wider discussion takes place. And I believe the current problem can be solved with function-local statics.

(It is hard to imagine an application which does not call outs() or errs() if they use raw_ostream.cpp)

As for the gotchas:

if errs() is constructed before outs()

Why is that a problem? when errs() calls outs(), the static local ctor for outs() will be initialized as intended, right?

and if errs() is changed to a buffered state

Not sure I understand this issue - could you describe it in more detail?

when destructing errs() (outs() has been destructed), the tied-to pointer to outs() is dangling.

Yeah, that's potentially a problem - a wrapper that avoids running the dtor on these types (but does flush them, perhaps?) might be useful in that regard.

The three conditions are conjunctions:/ When all three are satisfied, "the tied-to pointer to outs() is dangling" will be the result.

So, if it's a conjuction, it should be enough to shoot down one of these points. The first one appears to be the easiest.

if errs() is constructed before outs()

In the first version of D81156, errs() was implemented like so:

static tied_raw_ostream<raw_fd_ostream> S(outs(), STDERR_FILENO, false, true);
return S;

As outs() was called as a part of the initialization of the errs() object, this implementation guaranteed that outs() gets constructed first (and destructed last) and did not suffer from the above problem. Then the implementation was changed to:

static raw_fd_ostream S(STDERR_FILENO, false, true);
S.tie(&outs());
return S;

Here the outs() call is moved outside of the initializer for the errs() object, and so it is possible that the outs() is constructed later than errs(). Apart from making it susceptible to the above problem, this also causes another issue (which we really should have caught during review) -- the call to errs() is no longer thread safe: as the S.tie(&outs()) call is no longer guarded by the implicit static mutex, it gets executed every time that errs() is called. This introduces data races between concurrent calls to errs() -- the races are mostly bening as they are just re-setting the same piece of memory with the same value over and over again but they are still there. This also makes it impossible for someone to manually "untie" the errs stream like we discussed, as the next call to errs() will just tie it back again (and this untieing will result in more serious data races).

It sounds to me like both of these problems can be solved by moving the tieing back into the static initializer. If we want to keep the current tie interface, then the simplest way to fix that would be to have a helper static object which initializes things in the correct order:

struct InitializeAndTie {
  raw_fd_ostream S; // This could also be a base class.
  InitializeAndTie(raw_ostream &TieTo) : S(STDERR_FILENO, false, true) { S.tie(&TieTo); }
} Init(outs());
return Init.S;
This comment was removed by sammccall.

There's a second threadsafety issue I'd like to highlight, around runtime rather than initialization.

raw_ostreams are not threadsafe, including outs(). It's not necessarily safe to flush outs when errs() is accessed.
For example, in clangd errs() is the log stream, written by multiple threads with a lock. outs() is the protocol stream, written by the main thread without a lock.
After this change, all logging threads are trying to flush outs, racing with the main thread which is writing to it.
I can fix this in clangd but it has me questioning whether this is unsafe for other existing users and likely to go unnoticed for new ones.

Is it really vital that this behaviour be the default? I mentioned on D81156 that this seemed surprising and unsafe.
std::{cout,cerr} do something vaguely related (they're threadsafe and synchronize between operations), but that seems a) much safer than this and b) mostly a concession to history maybe?

Certainly, if we go back to the old behaviour (i.e. not tied by default) it would mostly solve both issues (but see question below). It would however make it up to individual tools to opt-in to the tied behaviour, which seems like it would be wanted all over the place (basically any tool from llvm-dwarfdump to clang that use errs() for reporting errors and outs() for other information in whatever form).

That being said, is there an issue even if we change the default that somebody could manually tie errs() to outs()? The user has to make sure the initialisation order is either amenable, or that they have untied things before destruction occurs.

That being said, is there an issue even if we change the default that somebody could manually tie errs() to outs()? The user has to make sure the initialisation order is either amenable, or that they have untied things before destruction occurs.

Yup, in fact IIUC C++17 guarantees that errs().tie(&outs()) executes in the "wrong" order.

Is it important to support initializing errs but not outs? The static initializer for errs could initialize outs too, in the right order. This "only" breaks silly people who tie the other way.

In any case this is subtle enough (with user-provided streams) that tie() should probably document it.

There's a second threadsafety issue I'd like to highlight, around runtime rather than initialization.

raw_ostreams are not threadsafe, including outs(). It's not necessarily safe to flush outs when errs() is accessed.
For example, in clangd errs() is the log stream, written by multiple threads with a lock. outs() is the protocol stream, written by the main thread without a lock.
After this change, all logging threads are trying to flush outs, racing with the main thread which is writing to it.
I can fix this in clangd but it has me questioning whether this is unsafe for other existing users and likely to go unnoticed for new ones.

Is it really vital that this behaviour be the default? I mentioned on D81156 that this seemed surprising and unsafe.
std::{cout,cerr} do something vaguely related (they're threadsafe and synchronize between operations), but that seems a) much safer than this and b) mostly a concession to history maybe?

I think it's important to note that
a) std::cout/cerr are guarenteed to be threadsafe only in std::ios_base::sync_with_stdio(true) mode https://en.cppreference.com/w/cpp/io/cerr. What sync_with_stdio(true) essentially does is put the standard character streams into the "unbuffered" mode.
b) raw_fd_ostream is threadsafe if it is unbuffered. I think we may even have some code relying on the fact that it can be used from a signal handler (with some care).

However, this doesn't really mean that the two streams are doing the same thing as there are (at least) two important differences:
i) std streams are thread-safe by default -- one has to call sync_with_stdio(false) explicitly, and I don't think many people do. outs() is not thread safe by default. The tieing patch makes errs() also not thread safe by default (and makes its thread-safety dependent on what the outs() stream does.
ii) although the std streams are unbuffered in the sync_with_stdio(true), it does not mean that they are writing to the OS directly. there's another layer of buffering in the C library, which means the performance impact of this is not as huge. an unbuffered raw_fd_ostream writes to the OS directly, and the penalty for using it is much higher.

Anyway, I don't think all this invalidates the original argument, but it is something to keep in mind.

That being said, is there an issue even if we change the default that somebody could manually tie errs() to outs()? The user has to make sure the initialisation order is either amenable, or that they have untied things before destruction occurs.

Yup, in fact IIUC C++17 guarantees that errs().tie(&outs()) executes in the "wrong" order.

Is it important to support initializing errs but not outs? The static initializer for errs could initialize outs too, in the right order. This "only" breaks silly people who tie the other way.

In any case this is subtle enough (with user-provided streams) that tie() should probably document it.

We could ensure that outs() is constructed before errs() even if they are not tied by default, just in case someone decides to tie them.

I think it would be nice to have them be tied by default, though it's true that there is a fair amount of potential for breaking existing use cases

We could ensure that outs() is constructed before errs() even if they are not tied by default, just in case someone decides to tie them.

Probably best thing to do if they aren't tied by default would be to provide a struct that does the tying and untying of errs()/outs() in its constructor and destructor. That way, there's no risk of things going wrong.

We could ensure that outs() is constructed before errs() even if they are not tied by default, just in case someone decides to tie them.

Probably best thing to do if they aren't tied by default would be to provide a struct that does the tying and untying of errs()/outs() in its constructor and destructor. That way, there's no risk of things going wrong.

That sounds good too.

MaskRay updated this revision to Diff 269864.Jun 10 2020, 8:43 AM
MaskRay retitled this revision from [Support] Initialize outs() errs() nulls() once to [Support] Ensure errs() is constructed after outs() and don't rerun tie when errs() is called.
MaskRay edited the summary of this revision. (Show Details)

Avoid global constructors

Add comments from labath

There's a second threadsafety issue I'd like to highlight, around runtime rather than initialization.

raw_ostreams are not threadsafe, including outs(). It's not necessarily safe to flush outs when errs() is accessed.
For example, in clangd errs() is the log stream, written by multiple threads with a lock. outs() is the protocol stream, written by the main thread without a lock.
After this change, all logging threads are trying to flush outs, racing with the main thread which is writing to it.
I can fix this in clangd but it has me questioning whether this is unsafe for other existing users and likely to go unnoticed for new ones.

+1

Is it really vital that this behaviour be the default? I mentioned on D81156 that this seemed surprising and unsafe.
std::{cout,cerr} do something vaguely related (they're threadsafe and synchronize between operations), but that seems a) much safer than this and b) mostly a concession to history maybe?

I think it's important to note that
a) std::cout/cerr are guarenteed to be threadsafe only in std::ios_base::sync_with_stdio(true) mode https://en.cppreference.com/w/cpp/io/cerr. What sync_with_stdio(true) essentially does is put the standard character streams into the "unbuffered" mode.
b) raw_fd_ostream is threadsafe if it is unbuffered. I think we may even have some code relying on the fact that it can be used from a signal handler (with some care).

However, this doesn't really mean that the two streams are doing the same thing as there are (at least) two important differences:
i) std streams are thread-safe by default -- one has to call sync_with_stdio(false) explicitly, and I don't think many people do. outs() is not thread safe by default.

When you say "not thread safe" you mean specifically for using outs from more than one thread at the same time, yes?

The tieing patch makes errs() also not thread safe by default (and makes its thread-safety dependent on what the outs() stream does.

Yeah, I think /this/ is perhaps unacceptable. Making writing to an error stream now a race with writing to the output stream seems a bit too subtle to me.

ii) although the std streams are unbuffered in the sync_with_stdio(true), it does not mean that they are writing to the OS directly. there's another layer of buffering in the C library, which means the performance impact of this is not as huge. an unbuffered raw_fd_ostream writes to the OS directly, and the penalty for using it is much higher.

Anyway, I don't think all this invalidates the original argument, but it is something to keep in mind.

That being said, is there an issue even if we change the default that somebody could manually tie errs() to outs()? The user has to make sure the initialisation order is either amenable, or that they have untied things before destruction occurs.

Yup, in fact IIUC C++17 guarantees that errs().tie(&outs()) executes in the "wrong" order.

Is it important to support initializing errs but not outs? The static initializer for errs could initialize outs too, in the right order. This "only" breaks silly people who tie the other way.

In any case this is subtle enough (with user-provided streams) that tie() should probably document it.

We could ensure that outs() is constructed before errs() even if they are not tied by default, just in case someone decides to tie them.

I think it would be nice to have them be tied by default, though it's true that there is a fair amount of potential for breaking existing use cases

If they can still at least provide "The usual thread safety guarantees" (writing to (apparently) independent objects (errs() and outs()) does not race) then I'm not significantly opposed. If this breaks that guarantee - I don't think that's a sufficiently worthwhile tradeoff.

Two nits in the comments, but frankly, I think we'd be better off dropping the tied by default approach if it's not straightforward to solve all the race conditions compared to what was before, in which case, this code needs to become something that allows tying on first use, but assumes it isn't. Maybe errs() could be given an argument (default to false) of "tie to outs()" or something.

llvm/lib/Support/raw_ostream.cpp
880

destructed -> destroyed
opt out -> opt out of

i) std streams are thread-safe by default -- one has to call sync_with_stdio(false) explicitly, and I don't think many people do. outs() is not thread safe by default.

When you say "not thread safe" you mean specifically for using outs from more than one thread at the same time, yes?

Affirmative.

I think it would be nice to have them be tied by default, though it's true that there is a fair amount of potential for breaking existing use cases

If they can still at least provide "The usual thread safety guarantees" (writing to (apparently) independent objects (errs() and outs()) does not race) then I'm not significantly opposed. If this breaks that guarantee - I don't think that's a sufficiently worthwhile tradeoff.

Unfortunately, I don't think there's a good way to provide those guarantees -- this amounts to ensuring that flush() does not race with any other stream output operation (including itself). That is probably impossible without locking, which will likely have significant implications on the stream performance.

When you say "worthwhile tradeoff", what do you mean exactly? errs() and outs() being tied by default, or the entire tieing concept in general?

I posted this on the GitHub PR, but was told the main discussion about this bug is here:
https://github.com/llvm/llvm-project/commit/1ce831912c797df1cb6d313d8e576a3f86175b6d#r39838454

Not sure a separate InitializeAndTie struct is needed when you can use an immediately-invoked lambda:

raw_fd_ostream &llvm::errs() {
  // Set standard error to be unbuffered and tied to outs() by default.
  static auto S = [] { 
    raw_fd_ostream s(STDERR_FILENO, false, true);
    s.tie(&outs());
    return s;
  }();
  return S;
}

I'm assuming 's' can be returned by value here, of course, otherwise we could make it static and return by reference from the lambda.

i) std streams are thread-safe by default -- one has to call sync_with_stdio(false) explicitly, and I don't think many people do. outs() is not thread safe by default.

When you say "not thread safe" you mean specifically for using outs from more than one thread at the same time, yes?

Affirmative.

I think it would be nice to have them be tied by default, though it's true that there is a fair amount of potential for breaking existing use cases

If they can still at least provide "The usual thread safety guarantees" (writing to (apparently) independent objects (errs() and outs()) does not race) then I'm not significantly opposed. If this breaks that guarantee - I don't think that's a sufficiently worthwhile tradeoff.

Unfortunately, I don't think there's a good way to provide those guarantees -- this amounts to ensuring that flush() does not race with any other stream output operation (including itself). That is probably impossible without locking, which will likely have significant implications on the stream performance.

When you say "worthwhile tradeoff", what do you mean exactly? errs() and outs() being tied by default, or the entire tieing concept in general?

Sorry, yeah - I mean tie-by-default. Essentially: Seems like this is isn't a good idea if it means that outs and errs become racy by default. I think that makes this an opt-in situation, where the app has to make the choice/provide the guarantee that it won't be writing to outs and errs from different threads.

Unless there is an immediate resolution (and the discussion seems to indicate that it is worth taking our time here), can we revert the patch that introduces a TSAN failure?

MaskRay added a comment.EditedJun 11 2020, 1:28 PM

I posted this on the GitHub PR, but was told the main discussion about this bug is here:
https://github.com/llvm/llvm-project/commit/1ce831912c797df1cb6d313d8e576a3f86175b6d#r39838454

Not sure a separate InitializeAndTie struct is needed when you can use an immediately-invoked lambda:

raw_fd_ostream &llvm::errs() {
  // Set standard error to be unbuffered and tied to outs() by default.
  static auto S = [] { 
    raw_fd_ostream s(STDERR_FILENO, false, true);
    s.tie(&outs());
    return s;
  }();
  return S;
}

I'm assuming 's' can be returned by value here, of course, otherwise we could make it static and return by reference from the lambda.

A lambda does not work. raw_fd_ostream does not have a copy constructor.

Unless there is an immediate resolution (and the discussion seems to indicate that it is worth taking our time here), can we revert the patch that introduces a TSAN failure?

@mehdi_amini Pushing this patch will fix the issue.

Can you be more specific about concurrent outs() or outs()+errs() calls?

MaskRay added a comment.EditedJun 11 2020, 1:41 PM

i) std streams are thread-safe by default -- one has to call sync_with_stdio(false) explicitly, and I don't think many people do. outs() is not thread safe by default.

When you say "not thread safe" you mean specifically for using outs from more than one thread at the same time, yes?

Affirmative.

I think it would be nice to have them be tied by default, though it's true that there is a fair amount of potential for breaking existing use cases

If they can still at least provide "The usual thread safety guarantees" (writing to (apparently) independent objects (errs() and outs()) does not race) then I'm not significantly opposed. If this breaks that guarantee - I don't think that's a sufficiently worthwhile tradeoff.

Unfortunately, I don't think there's a good way to provide those guarantees -- this amounts to ensuring that flush() does not race with any other stream output operation (including itself). That is probably impossible without locking, which will likely have significant implications on the stream performance.

When you say "worthwhile tradeoff", what do you mean exactly? errs() and outs() being tied by default, or the entire tieing concept in general?

Sorry, yeah - I mean tie-by-default. Essentially: Seems like this is isn't a good idea if it means that outs and errs become racy by default. I think that makes this an opt-in situation, where the app has to make the choice/provide the guarantee that it won't be writing to outs and errs from different threads.

Calling errs() << char was racy before @jhenderson's patch. raw_fd_ostream::write_impl modifies a member variable pos.

I think we can push this patch, and think about tied-by-default state later. (I originally requested an opt-in state in D81156, but did not think hard about the implications)

MaskRay marked an inline comment as done.Jun 11 2020, 1:44 PM
MaskRay added inline comments.
llvm/lib/Support/raw_ostream.cpp
880

"be destructed" is common in C++. I will not change it.

Will change opt out to opt out of

i) std streams are thread-safe by default -- one has to call sync_with_stdio(false) explicitly, and I don't think many people do. outs() is not thread safe by default.

When you say "not thread safe" you mean specifically for using outs from more than one thread at the same time, yes?

Affirmative.

I think it would be nice to have them be tied by default, though it's true that there is a fair amount of potential for breaking existing use cases

If they can still at least provide "The usual thread safety guarantees" (writing to (apparently) independent objects (errs() and outs()) does not race) then I'm not significantly opposed. If this breaks that guarantee - I don't think that's a sufficiently worthwhile tradeoff.

Unfortunately, I don't think there's a good way to provide those guarantees -- this amounts to ensuring that flush() does not race with any other stream output operation (including itself). That is probably impossible without locking, which will likely have significant implications on the stream performance.

When you say "worthwhile tradeoff", what do you mean exactly? errs() and outs() being tied by default, or the entire tieing concept in general?

Sorry, yeah - I mean tie-by-default. Essentially: Seems like this is isn't a good idea if it means that outs and errs become racy by default. I think that makes this an opt-in situation, where the app has to make the choice/provide the guarantee that it won't be writing to outs and errs from different threads.

Calling errs() << char was racy before @jhenderson's patch. raw_fd_ostream::write_impl modifies a member variable pos.

Sorry, specifically I was talking about racing between writing to outs() and writing to errs() - as far as I understand it, those did not race prior to the tying (ie: they provided the "usual thread safety guarantees" - which is that mutations of distinct objects do not race) but race once the tying was done (because a write to errs might flush outs mid-write). This is the issue @sammccall raised earlier & sounds like it actively impacts clangd.

I think we can push this patch, and think about tied-by-default state later. (I originally requested an opt-in state in D81156, but did not think hard about the implications)

Given that clangd is actively broken, I think it's important to go in a direction that fixes that underlying issue sooner rather than later - which I think amounts to "revert the tie-by-default patch".

Unless there is an immediate resolution (and the discussion seems to indicate that it is worth taking our time here), can we revert the patch that introduces a TSAN failure?

@mehdi_amini Pushing this patch will fix the issue.

This patch isn't approved, and discussion is going on: the most straightforward way of addressing the bug is to revert *first* to remove the urgency here. It is easy to reapply with your change though when it is ready.

i) std streams are thread-safe by default -- one has to call sync_with_stdio(false) explicitly, and I don't think many people do. outs() is not thread safe by default.

When you say "not thread safe" you mean specifically for using outs from more than one thread at the same time, yes?

Affirmative.

I think it would be nice to have them be tied by default, though it's true that there is a fair amount of potential for breaking existing use cases

If they can still at least provide "The usual thread safety guarantees" (writing to (apparently) independent objects (errs() and outs()) does not race) then I'm not significantly opposed. If this breaks that guarantee - I don't think that's a sufficiently worthwhile tradeoff.

Unfortunately, I don't think there's a good way to provide those guarantees -- this amounts to ensuring that flush() does not race with any other stream output operation (including itself). That is probably impossible without locking, which will likely have significant implications on the stream performance.

When you say "worthwhile tradeoff", what do you mean exactly? errs() and outs() being tied by default, or the entire tieing concept in general?

Sorry, yeah - I mean tie-by-default. Essentially: Seems like this is isn't a good idea if it means that outs and errs become racy by default. I think that makes this an opt-in situation, where the app has to make the choice/provide the guarantee that it won't be writing to outs and errs from different threads.

Calling errs() << char was racy before @jhenderson's patch. raw_fd_ostream::write_impl modifies a member variable pos.

Sorry, specifically I was talking about racing between writing to outs() and writing to errs() - as far as I understand it, those did not race prior to the tying (ie: they provided the "usual thread safety guarantees" - which is that mutations of distinct objects do not race) but race once the tying was done (because a write to errs might flush outs mid-write). This is the issue @sammccall raised earlier & sounds like it actively impacts clangd.

I don't think it is actively impacting clangd. Mitigated by D81538.

I think we can push this patch, and think about tied-by-default state later. (I originally requested an opt-in state in D81156, but did not think hard about the implications)

Given that clangd is actively broken, I think it's important to go in a direction that fixes that underlying issue sooner rather than later - which I think amounts to "revert the tie-by-default patch".

Not sure any other projects is impacted.

MaskRay added a comment.EditedJun 11 2020, 2:27 PM

Unless there is an immediate resolution (and the discussion seems to indicate that it is worth taking our time here), can we revert the patch that introduces a TSAN failure?

@mehdi_amini Pushing this patch will fix the issue.

This patch isn't approved, and discussion is going on: the most straightforward way of addressing the bug is to revert *first* to remove the urgency here. It is easy to reapply with your change though when it is ready.

There have been several commits touching this area (there are dependent .debug_line commits as well). It is not a "reverting D81156 and everything will be fine" state. You mentioned some TSAN failures so I asked what are affected. We need to figure out a good way to address the issue.

So far I believe pushing this and things will be fixed. (accessing outs()errs() concurrently is still racy - but accessing outs() concurrently & accessing errs() concurrently were already racy - I don't know what projects can actually trigger the code path)

i) std streams are thread-safe by default -- one has to call sync_with_stdio(false) explicitly, and I don't think many people do. outs() is not thread safe by default.

When you say "not thread safe" you mean specifically for using outs from more than one thread at the same time, yes?

Affirmative.

I think it would be nice to have them be tied by default, though it's true that there is a fair amount of potential for breaking existing use cases

If they can still at least provide "The usual thread safety guarantees" (writing to (apparently) independent objects (errs() and outs()) does not race) then I'm not significantly opposed. If this breaks that guarantee - I don't think that's a sufficiently worthwhile tradeoff.

Unfortunately, I don't think there's a good way to provide those guarantees -- this amounts to ensuring that flush() does not race with any other stream output operation (including itself). That is probably impossible without locking, which will likely have significant implications on the stream performance.

When you say "worthwhile tradeoff", what do you mean exactly? errs() and outs() being tied by default, or the entire tieing concept in general?

Sorry, yeah - I mean tie-by-default. Essentially: Seems like this is isn't a good idea if it means that outs and errs become racy by default. I think that makes this an opt-in situation, where the app has to make the choice/provide the guarantee that it won't be writing to outs and errs from different threads.

Calling errs() << char was racy before @jhenderson's patch. raw_fd_ostream::write_impl modifies a member variable pos.

Sorry, specifically I was talking about racing between writing to outs() and writing to errs() - as far as I understand it, those did not race prior to the tying (ie: they provided the "usual thread safety guarantees" - which is that mutations of distinct objects do not race) but race once the tying was done (because a write to errs might flush outs mid-write). This is the issue @sammccall raised earlier & sounds like it actively impacts clangd.

I don't think it is actively impacting clangd. Mitigated by D81538.

Ah, glad to hear they have a workaround - though still seems problematic that this fairly reasonable use case is broken/requires a workaround.

Unless there is an immediate resolution (and the discussion seems to indicate that it is worth taking our time here), can we revert the patch that introduces a TSAN failure?

@mehdi_amini Pushing this patch will fix the issue.

This patch isn't approved, and discussion is going on: the most straightforward way of addressing the bug is to revert *first* to remove the urgency here. It is easy to reapply with your change though when it is ready.

There have been several commits touching this area (there are dependent .debug_line commits as well). It is not a "reverting D81156 and everything will be fine" state. You mentioned some TSAN failures so I asked what are affected. We need to figure out a good way to address the issue.

But there's open debate about what that is and ongoing breakage in tree - @jhenderson please revert the original patch & continue discussion after that.

So far I believe pushing this and things will be fixed. (accessing outs()errs() concurrently is still racy - but accessing outs() concurrently & accessing errs() concurrently were already racy - I don't know what projects can actually trigger the code path)

saying "outs() isn't thread safe" isn't super surprising (a bit surprising compared to cout/cerr, admittedly) - but "outs() doesn't provide the usual thread safety guarantees" is a bit more problematic.

MaskRay abandoned this revision.Jun 11 2020, 3:22 PM

Abandon the patch for now.

I cannot revert D81156 as at least two commits depend on it. Reverted the tie-by-default behavior in 030897523d43e3296f69d25a71a140d9e5793c6a. Hope it addresses any issue people may observe.

Abandon the patch for now.

I cannot revert D81156 as at least two commits depend on it. Reverted the tie-by-default behavior in 030897523d43e3296f69d25a71a140d9e5793c6a. Hope it addresses any issue people may observe.

Awesome, thanks!

@MaskRay sorry for turning this review thread into a mess - i was directed here but probably should rather have kept the discussions separate.

Thanks for disabling the by-default behavior for the moment. If we turn it back on then the lack of the "usual thread-safety guarantees" seems like a good thing to call out in the comments.

We did indeed turn this off in clangd explicitly, I'll leave that in in case we roll the default forward.

@MaskRay sorry for turning this review thread into a mess - i was directed here but probably should rather have kept the discussions separate.

Thanks for disabling the by-default behavior for the moment. If we turn it back on then the lack of the "usual thread-safety guarantees" seems like a good thing to call out in the comments.

We did indeed turn this off in clangd explicitly, I'll leave that in in case we roll the default forward.

Might be worth adding a comment to describe the current state of things if that code is going to remain (with the hopes to clean it up if the current behavior sticks long-term (~weeks/months/etc))?