This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Remove unnecessary flushes and add additional testing
ClosedPublic

Authored by jhenderson on May 29 2020, 7:48 AM.

Details

Summary

The flushes previously existed to help ensure consistent error message output when stdout and stderr were passed to the same location. This is no longer necessary as errs() is now tied to outs().

Depends on D81156.

Diff Detail

Event Timeline

jhenderson created this revision.May 29 2020, 7:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 29 2020, 7:48 AM

Not sure I feel great about having these things synchronized - seems likely there's lots of error messages that get interspersed with stdout in llvm-dwarfdump and we aren't likely to get all the stdout flushed appropriately, so maybe we shouldn't start/try? If the test is flakey/unreliable due to this - perhaps it should FileCheck stdout and stderr separately rather than piping them together?

(but I don't feel super strongly - I'm OK with this patch/direction if other folks think it's the right direction)

Not sure I feel great about having these things synchronized - seems likely there's lots of error messages that get interspersed with stdout in llvm-dwarfdump and we aren't likely to get all the stdout flushed appropriately, so maybe we shouldn't start/try? If the test is flakey/unreliable due to this - perhaps it should FileCheck stdout and stderr separately rather than piping them together?

+1. Given that the WarningHandler is configurable I don't think we should be making assumptions of what it's doing, even if printing stderr is the default. I would even go a step further and remove the existing flushes so the test don't rely on them.

MaskRay added a comment.EditedMay 30 2020, 10:02 AM

I am in favor of synchronizing stdout/stderr, i.e. when redirected to a file, making the behavior similar to a pty (line buffered). This makes stderr output in tests more relevant, i.e. we usually do 2>&1 | FileCheck, now we can check stderr output is emitted in appropriate places. However, like other opinions, I am wary of littering flush all over the code base. Also note that flushing too frequently has a performance penalty when full buffering is otherwise used. (I once made llvm-objdump -d slow due to a change to make it line buffered)

Can we do something similar to D64165? Teach the warning/error (stderr) handler to flush stdout? If there is no stderr output, usual buffering (full buffering when the output is a file) is used. There is no performance penalty.

I am in favor of synchronizing stdout/stderr, i.e. when redirected to a file, making the behavior similar to a pty (line buffered). This makes stderr output in tests more relevant, i.e. we usually do 2>&1 | FileCheck, now we can check stderr output is emitted in appropriate places. However, like other opinions, I am wary of littering flush all over the code base. Also note that flushing too frequently has a performance penalty when full buffering is otherwise used. (I once made llvm-objdump -d slow due to a change to make it line buffered)

Can we do something similar to D64165? Teach the warning/error (stderr) handler to flush stdout? If there is no stderr output, usual buffering (full buffering when the output is a file) is used. There is no performance penalty.

If we're going to do it, yeah, D64165 (flushing outs() before the diagnostic, flushing the diagnostic stream after the diagnostic) would be the better direction. But even in that case - it seems it isn't consistently applied even within llvm-objdump.cpp (isn't done for command line warnings, nor for errors?)

I am in favor of synchronizing stdout/stderr, i.e. when redirected to a file, making the behavior similar to a pty (line buffered). This makes stderr output in tests more relevant, i.e. we usually do 2>&1 | FileCheck, now we can check stderr output is emitted in appropriate places. However, like other opinions, I am wary of littering flush all over the code base. Also note that flushing too frequently has a performance penalty when full buffering is otherwise used. (I once made llvm-objdump -d slow due to a change to make it line buffered)

Can we do something similar to D64165? Teach the warning/error (stderr) handler to flush stdout? If there is no stderr output, usual buffering (full buffering when the output is a file) is used. There is no performance penalty.

If we're going to do it, yeah, D64165 (flushing outs() before the diagnostic, flushing the diagnostic stream after the diagnostic) would be the better direction. But even in that case - it seems it isn't consistently applied even within llvm-objdump.cpp (isn't done for command line warnings, nor for errors?)

The idea is that some warning/error reporting facilities need outs().flush(). For llvm-objdump.cpp, reportWarning and reportError are the two utilities. It may be possible that some code path does not consistently flush outs(). We can fix them if someone observes problems.

I am in favor of synchronizing stdout/stderr, i.e. when redirected to a file, making the behavior similar to a pty (line buffered). This makes stderr output in tests more relevant, i.e. we usually do 2>&1 | FileCheck, now we can check stderr output is emitted in appropriate places. However, like other opinions, I am wary of littering flush all over the code base. Also note that flushing too frequently has a performance penalty when full buffering is otherwise used. (I once made llvm-objdump -d slow due to a change to make it line buffered)

Can we do something similar to D64165? Teach the warning/error (stderr) handler to flush stdout? If there is no stderr output, usual buffering (full buffering when the output is a file) is used. There is no performance penalty.

If we're going to do it, yeah, D64165 (flushing outs() before the diagnostic, flushing the diagnostic stream after the diagnostic) would be the better direction. But even in that case - it seems it isn't consistently applied even within llvm-objdump.cpp (isn't done for command line warnings, nor for errors?)

The idea is that some warning/error reporting facilities need outs().flush(). For llvm-objdump.cpp, reportWarning and reportError are the two utilities.

reportError doesn't seem to have the flushing (& also I'm not sure if it needs to use an extra raw_string_ostream - could write out to errs() directly, I think) & reportCmdLineWarning, reportCmdLineError (hmm, why are there different reportCmdLine* distinct from report*?)

It may be possible that some code path does not consistently flush outs(). We can fix them if someone observes problems.

Yeah, not the end of the world/fixing at least fewer places.

I am in favor of synchronizing stdout/stderr, i.e. when redirected to a file, making the behavior similar to a pty (line buffered). This makes stderr output in tests more relevant, i.e. we usually do 2>&1 | FileCheck, now we can check stderr output is emitted in appropriate places. However, like other opinions, I am wary of littering flush all over the code base. Also note that flushing too frequently has a performance penalty when full buffering is otherwise used. (I once made llvm-objdump -d slow due to a change to make it line buffered)

Can we do something similar to D64165? Teach the warning/error (stderr) handler to flush stdout? If there is no stderr output, usual buffering (full buffering when the output is a file) is used. There is no performance penalty.

If we're going to do it, yeah, D64165 (flushing outs() before the diagnostic, flushing the diagnostic stream after the diagnostic) would be the better direction. But even in that case - it seems it isn't consistently applied even within llvm-objdump.cpp (isn't done for command line warnings, nor for errors?)

The idea is that some warning/error reporting facilities need outs().flush(). For llvm-objdump.cpp, reportWarning and reportError are the two utilities.

reportError doesn't seem to have the flushing (& also I'm not sure if it needs to use an extra raw_string_ostream - could write out to errs() directly, I think) & reportCmdLineWarning, reportCmdLineError (hmm, why are there different reportCmdLine* distinct from report*?)

reportCmdLineWarning was introduced in D66418. It reports a warning not associated with a file. Many warnings/errors are associated with a file. error: filename: .... style output is convenient when a user greps a log file.

Good catch. If I add outs().flush() to reportError, no tests fail. We can add it when we see abnormal output. The logAllUnhandledErrors use can indeed be simplified.

It may be possible that some code path does not consistently flush outs(). We can fix them if someone observes problems.

Yeah, not the end of the world/fixing at least fewer places.

I found having the error messages in different places somewhat confusing. Worse still, it seemed like occasionally they weren't consistent between being run under FileCheck and when I ran llvm-dwarfdump directly, although that might have only been tangentially related to when things were flushed. I'll take a look at moving the flushing to the error handler. It makes sense not to penalise the normal use-case, i.e. where there are no errors, by flushing unnecessarily.

So moving the flushing to the warning handler is trickier than it first seems, because the default warning handler (which is the one used by llvm-dwarfdump) is a part of the WithColor interface now, and does not have access to the output stream used by the DWARFContext code, which may or may not be stdout. We could still go ahead and flush stdout in that function, as that will work for llvm-dwarfdump, but it's not guaranteed to achieve the intended effect in all cases. We could wrap the warning handler routine locally in the DWARFContext code to do the flushing, but it would prevent people using a non-flushing handler, which seems less than ideal.

Having investigated further, updating the defaultWarningHandler method to flush stdout does not solve the issue, because llvm-dwarfdump actually writes to a file stream, which means the thing being flushed is not the right stream. I think the right thing to do is to have llvm-dwarfdump provide its own warning and error handler to DWARFContext. That will allow it to flush the right stream at the right time.

jhenderson updated this revision to Diff 267625.Jun 1 2020, 8:24 AM
jhenderson retitled this revision from [DebugInfo] Add flush to ensure debug line errors are in right place to [llvm-dwarfdump] Flush output before warning and errors are reported.
jhenderson edited the summary of this revision. (Show Details)
jhenderson added a reviewer: labath.
jhenderson added a subscriber: labath.

Reimplemented flushing behaviour as part of the llvm-dwarfdump warning handler, as discussed. Also fixed an issue where the llvm-dwarfdump warning handler wasn't properly being copied around. I also have removed the two other existing flushes, as I believe we agree that it should be up to the handling methods to do the flushing as desired. Added @labath as a reviewer for this part, since he introduced them a few months ago.

See rGced45978a2a5b258421590214f46848e726144b8 for where the other two flushes were introduced. I think this fix is a better fix from llvm-dwarfdump's point of view, and other clients should be updated if and when necessary.

While I do prefer the current approach over the first revision of this path, I'm still not convinced that dwarfdump should be flushing. If you combine the streams you accept that there is no guaranteed order. Flushing is a band-aid and not a structural solution. This patch already shows that it doesn't scale. I don't think we should settle for "fix[ing] them if someone observes problems." I also don't see how that makes the errors more relevant in the tests. If they are not strict enough we should alter the error message instead of relying on the order.

In case it wasn't clear from my previous reply, I'm totally willing to be convinced that this is the right thing to do, I just hoping for a more compelling argument.

dblaikie accepted this revision.Jun 1 2020, 1:32 PM

Looks good - I'm OK with this direction.

I found having the error messages in different places somewhat confusing. Worse still, it seemed like occasionally they weren't consistent between being run under FileCheck and when I ran llvm-dwarfdump directly, although that might have only been tangentially related to when things were flushed.

Yeah, it's certainly likely that sort of inconsistent behavior could come from flushing issues, especially under FileCheck - I'm kind of OK with that/think a possible "right thing" would be to ensure that we don't depend on the ordering between stdout and stderr. But, yeah, this approach of flushing around diagnostics seems sustainable enough that it's an OK alternative.

This revision is now accepted and ready to land.Jun 1 2020, 1:32 PM
dblaikie requested changes to this revision.Jun 1 2020, 1:42 PM

Going to revert my approval for now as @JDevlieghere 's concerns should be addressed/discussed (didn't mean to usurp his inquiry - just a race condition on feedback)

@JDevlieghere I think this does show a fairly scalable solution - limited to only a couple of diagnostic handlers.

But, yeah, I'm with you too - the error messages should be sufficiently self descriptive that they don't need to appear in particular places interleaved with stdout.

This revision now requires changes to proceed.Jun 1 2020, 1:42 PM

I also think this approach is scalable.

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test
57

Unneeded change after your other change landed?

llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp
506 ↗(On Diff #267625)

errs().flush() can be deleted. stderr is not buffered. I requested this in the llvm-objdump patch when ychen added it, staging that there were two other instances. The two other instances seems to have gone, so I deleted errs().flush() as well yesterday.

If a user explicitly makes stderr buffered (via setvbuf), they should accept strangely interleaved streams.

labath added a comment.Jun 2 2020, 1:12 AM

I think this is definitely better than having scattered flushes -- thanks for working on a more systemic solution.

I'm just going to throw one more idea out there. I came up with it when I was adding the original flushes. At the time I chose not to pursue it as I just wanted to get things done, but if we're going to discuss this problem in depth, I think it deserves consideration. If there's consensus that this is a good direction to go in, I'll be happy to create a patch for it

While I do prefer the current approach over the first revision of this path, I'm still not convinced that dwarfdump should be flushing. If you combine the streams you accept that there is no guaranteed order.

That is not really true. std::iostreams have the concept of "tied" streams, whose (I believe) sole purpose for existence is to ensure consistent ordering between cout and cerr streams. The way it works is that whenever a stream does a read or write operation from/to the backing storage, it first flushes the "tied" stream (if it has one). Normally, cin and cerr are tied to cout, and that's why patterns like cout << "Give me a number:" /* no newline*/; cin >> n; work. So, combining stdout and stderr is an issue that a lot of people considered important, and indeed there are a lot of tools out there which get it "right".

llvm streams have no "tied stream" concept, and I would stipulate that this is the real reason for why we're running into the problems here. The simplest and least obtrusive way this concept could be retrofitted into the existing llvm streams is to have a new class -- let's call it tied_raw_fd_ostream. This class would inherit from raw_fd_ostream and it would take an additional constructor argument -- the "tied" stream. The only special thing it would do is that in its write_impl implementation, it would call TiedStream.flush() before doing the actual write. llvm::errs() would be of type tied_raw_fd_ostream, tied to outs(), everything else would remain unchanged.

Implementing this idea should be very easy. The result won't be as flexible as the std::iostream solution, but that might actually be a good thing. And as long as an application writes only/mostly to outs(), the performance impact should be zero. If an application combines stdout and stderr a lot, then it will suffer some performance degradation, but hopefully that is not the case we need to optimize for.

I'm happy to go for a tied stream approach. That would seem like a more scalable solution for clients that want to keep things in sync. In practice, not that much prints to stderr, so performance is probably not so critical, in my opinion.

I hear @JDevlieghere's concerns about messages being self-descriptive enough to not need worrying about syncing, and I agree with the principle. Indeed, I often push for more context to be included in error messages. However, my take on it is that if error messages are near related output when the two are combined, it makes it much easier to associate one with the other, regardless of how much context there is in the message. I have often found cases in build systems etc which produce huge amounts of output for whatever reason, where the error messages end up miles away from the actual related output. Often, the stdout might say something like "Build Failed" but the related stderr error messages are several hundred lines earlier, and therefore quite hard to find. Admittedly, this is as much to do with there being too much output as it is to do with flushing, but the principle still more or less applies.

In terms of the testing, I have a patch that I hope to put up for review today, to reposition the testing of the error messages in debug_line_invalid.test to be alongside their corresponding tables. The information in most (all?) of the messages is sufficient to already associate them correctly, since they contain the file offsets, but that doesn't make the test script easy to read or follow from my experience, and that's despite having written most of this test myself! Whilst I could continue to keep the errors in a single file and test that indepdendently, but with the FileCheck lines interleaved appropriately, I feel like this may get a little confusing at times.

From a purely llvm-dwarfdump perspective, I think there are likely three common usages:

  1. Both stream printed in a console window - in this case, stream interleaving isn't great, as it can mean the error/warnings don't appear near corresponding stdout, meaning a warning might be missed/thought to be related to the wrong thing.
  2. Both streams redirected to the same file - same as above.
  3. One stream redirected to a file/both redirected to different files - error messages need enough context (which I think they have already for the most part).

In the first two cases, this patch improves the issue, whilst in the third, the only thing that happens is a slight degradation in performance in the event of a warning/error being emitted. Since these should be uncommon, I think that's harmless. Testing typically will use either the first or third version (currently uses the third, but I'd like to move to the first).

I've posted D80989 which includes test changes that rely on this flushing or something equivalent. It's also related to output ordering, but I think is slightly more general than just error messages itself.

I've tried getting the tied_raw_fd_ostream approach to work, by tying errs() to outs() as suggested, but it doesn't work when it comes to testing D80989 (and possibly others) in lit. The problem is that llvm-dwarfdump uses a ToolOutputFile, which contains its own raw_fd_ostream pointing to stdout, with its own buffer independent of the version that is provided by outs(). Consequently, the flush triggered by writing to errs() flushes the empty outs() buffer, before writing the error message, rather than the real output buffer provided by the ToolOutputFile. I see three possible ways forward: 1) Go ahead with my existing approach as suggested in the current patch (my preference), 2) abandon the whole idea of keeping the two in sync (I don't like this idea for reasons already outlined), or 3) have raw_fd_ostream instances contructed with - (the stdout variant) actually use the outs() stream, rather than its own one - this will prevent potential weird situations where we have two independent streams with their own buffers both writing to stdout, and therefore potentially getting very weird interleaving, but I'm not sure of the knock-on effects of this yet.

Thoughts?

jhenderson updated this revision to Diff 267889.Jun 2 2020, 7:49 AM
jhenderson marked 2 inline comments as done.

Addressed @MaskRay's comments.

labath added a comment.Jun 2 2020, 8:10 AM

Hm... that is unfortunate. I think that option (3) is the most correct approach -- I find the current ToolOutputFile behavior fairly problematic -- but you're right that it can have some unexpected knock-on effects, and so I'm not sure if this patch should be predicated on that.

(Incidentally, I was shocked to learn that the magicness of "-" is embedded directly into the raw_os_ostream constructor -- I would have expected this to be handled somewhere higher up (like the ToolOutputFile class, for instance).)

So, combining stdout and stderr is an issue that a lot of people considered important, and indeed there are a lot of tools out there which get it "right".

I hope I didn't give the impression that I don't think it's important. I think it's really useful to have stdout and stderr for exactly the reason that James mentioned. Everything I said was weighing those benefits agains us having to maintain flushes in dwarfdump.

llvm streams have no "tied stream" concept, and I would stipulate that this is the real reason for why we're running into the problems here. The simplest and least obtrusive way this concept could be retrofitted into the existing llvm streams is to have a new class -- let's call it tied_raw_fd_ostream. This class would inherit from raw_fd_ostream and it would take an additional constructor argument -- the "tied" stream. The only special thing it would do is that in its write_impl implementation, it would call TiedStream.flush() before doing the actual write. llvm::errs() would be of type tied_raw_fd_ostream, tied to outs(), everything else would remain unchanged.

Implementing this idea should be very easy. The result won't be as flexible as the std::iostream solution, but that might actually be a good thing. And as long as an application writes only/mostly to outs(), the performance impact should be zero. If an application combines stdout and stderr a lot, then it will suffer some performance degradation, but hopefully that is not the case we need to optimize for.

I really like this idea because it takes the tools out of the business of flushing and provides a structural solution to the problem. My main issue with the original patch was that it's very tricky to cover all cases and you can't sprinkle flushes until it works. Having a tied_raw_fd_ostream would solve that issue completely.

I'm happy to go for a tied stream approach. That would seem like a more scalable solution for clients that want to keep things in sync. In practice, not that much prints to stderr, so performance is probably not so critical, in my opinion.

I hear @JDevlieghere's concerns about messages being self-descriptive enough to not need worrying about syncing, and I agree with the principle. Indeed, I often push for more context to be included in error messages. However, my take on it is that if error messages are near related output when the two are combined, it makes it much easier to associate one with the other, regardless of how much context there is in the message. I have often found cases in build systems etc which produce huge amounts of output for whatever reason, where the error messages end up miles away from the actual related output. Often, the stdout might say something like "Build Failed" but the related stderr error messages are several hundred lines earlier, and therefore quite hard to find. Admittedly, this is as much to do with there being too much output as it is to do with flushing, but the principle still more or less applies.

As I said above I completely agree with all of this!

In terms of the testing, I have a patch that I hope to put up for review today, to reposition the testing of the error messages in debug_line_invalid.test to be alongside their corresponding tables. The information in most (all?) of the messages is sufficient to already associate them correctly, since they contain the file offsets, but that doesn't make the test script easy to read or follow from my experience, and that's despite having written most of this test myself! Whilst I could continue to keep the errors in a single file and test that indepdendently, but with the FileCheck lines interleaved appropriately, I feel like this may get a little confusing at times.

I strongly believe that maintainability of the tests is very important. My point was more about making sure we don't start relying on the order. But you're right, we've been doing a great job and nothing indicates that will change, so it was more of a concern than an actual problem.

From a purely llvm-dwarfdump perspective, I think there are likely three common usages:

  1. Both stream printed in a console window - in this case, stream interleaving isn't great, as it can mean the error/warnings don't appear near corresponding stdout, meaning a warning might be missed/thought to be related to the wrong thing.
  2. Both streams redirected to the same file - same as above.
  3. One stream redirected to a file/both redirected to different files - error messages need enough context (which I think they have already for the most part).

In the first two cases, this patch improves the issue, whilst in the third, the only thing that happens is a slight degradation in performance in the event of a warning/error being emitted. Since these should be uncommon, I think that's harmless. Testing typically will use either the first or third version (currently uses the third, but I'd like to move to the first).

I've tried getting the tied_raw_fd_ostream approach to work, by tying errs() to outs() as suggested, but it doesn't work when it comes to testing D80989 (and possibly others) in lit. The problem is that llvm-dwarfdump uses a ToolOutputFile, which contains its own raw_fd_ostream pointing to stdout, with its own buffer independent of the version that is provided by outs(). Consequently, the flush triggered by writing to errs() flushes the empty outs() buffer, before writing the error message, rather than the real output buffer provided by the ToolOutputFile. I see three possible ways forward: 1) Go ahead with my existing approach as suggested in the current patch (my preference), 2) abandon the whole idea of keeping the two in sync (I don't like this idea for reasons already outlined), or 3) have raw_fd_ostream instances contructed with - (the stdout variant) actually use the outs() stream, rather than its own one - this will prevent potential weird situations where we have two independent streams with their own buffers both writing to stdout, and therefore potentially getting very weird interleaving, but I'm not sure of the knock-on effects of this yet.

Thoughts?

My vote goes to (3) as well, though like Pavel I don't think it should block this patch. I can live with (1) for now and making (3) happen when similar issues pop up down the road.

I'm going to focus on getting my other queued-up patches up for review before I start looking at option 3. If people are happy for this to go in, it will help with me landing the other patches I have planned, but probably isn't actually required and I can probably workaround things to some extent, so can wait if that's the preferred approach. Just let me know!

labath added a comment.Jun 3 2020, 3:27 AM

So, combining stdout and stderr is an issue that a lot of people considered important, and indeed there are a lot of tools out there which get it "right".

I hope I didn't give the impression that I don't think it's important. I think it's really useful to have stdout and stderr for exactly the reason that James mentioned. Everything I said was weighing those benefits agains us having to maintain flushes in dwarfdump.

Cool. Thanks for clearing that up.

I'm going to focus on getting my other queued-up patches up for review before I start looking at option 3. If people are happy for this to go in, it will help with me landing the other patches I have planned, but probably isn't actually required and I can probably workaround things to some extent, so can wait if that's the preferred approach. Just let me know!

Since I started this whole thing, I felt like I should also chip in a bit. I've created D81078 to do the ToolOutputFile refactor. The changes are fairly straight-forward, and they don't break any of the in-tree tests. I have no idea whether they will break some strange out-of-tree use case (I feel it's relatively safe though the potential for it is definitely there...), but I suppose there's only one way to find that out...

jhenderson updated this revision to Diff 268454.Jun 4 2020, 6:36 AM
jhenderson retitled this revision from [llvm-dwarfdump] Flush output before warning and errors are reported to [DebugInfo] Remove unnecessary flushes and add additional testing.
jhenderson edited the summary of this revision. (Show Details)

Rebase on top of D81156. This patch now just covers removing the flushes and a bit of testing.

MaskRay accepted this revision.Jun 4 2020, 11:06 AM

IIUC,

  1. have raw_fd_ostream instances contructed with - (the stdout variant) actually use the outs() stream, rather than its own one - this will prevent potential weird situations where we have two independent streams with their own buffers both writing to stdout, and therefore potentially getting very weird interleaving, but I'm not sure of the knock-on effects of this yet.

has been resolved by D81078

tied_raw_fd_ostream

The idea will be implemented in D81156.

This rebased patch looks good.

@dblaikie, are you happy with this patch in its current form?

dblaikie accepted this revision.Jun 8 2020, 11:50 AM

Sounds alright

This revision is now accepted and ready to land.Jun 8 2020, 11:50 AM
This revision was automatically updated to reflect the committed changes.
jhenderson updated this revision to Diff 269520.Jun 9 2020, 6:20 AM
This comment was removed by jhenderson.
jhenderson updated this revision to Diff 269522.Jun 9 2020, 6:22 AM

Restore diff (I accidentally uploaded it to the wrong patch).