Page MenuHomePhabricator

clang-tidy: Reduce number of stderr write calls
Needs ReviewPublic

Authored by Hiralo on Oct 22 2020, 10:30 PM.

Details

Summary

commit c3fa7da502d6cc77e5c0710b0e9308e6fbdd137c
Author: Hiral Oza <hiral.oza@netapp.com>
Date: Thu Oct 22 22:06:25 2020 -0700

clang-tidy: Reduce number of stderr write calls

Making single write call in place of 7 small string write calls.

Thank you in advance for your kind review.
-Hiral

Diff Detail

Event Timeline

Hiralo created this revision.Oct 22 2020, 10:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2020, 10:30 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
Hiralo requested review of this revision.Oct 22 2020, 10:30 PM

Isn't llvm::errs() buffered, negating most of the benefit here.
Also using std::string here is bad, its potentially going to allocate and reallocate memory each time it grows.
It would be better off using an llvm::SmallString and looking at what could potentially be outputted, I'd suggest you'd want a buffer size of at least 256 bytes.

Isn't llvm::errs() buffered, negating most of the benefit here.

+1 to this (the patch description doesn't explain any specific motivation either - whether it's performance (runtime? memory usage? etc?) or something else, and how the performance aspect has been quantified)

Isn't llvm::errs() buffered, negating most of the benefit here.

If it is buffered, we would expect single write call to stderr. But we are seeing 7 write calls!

Yes, please suggest how to make llvm::errs() buffered that will be easy patch :)

(the patch description doesn't explain any specific motivation either - whether it's performance (runtime? memory usage? etc?) or something else, and how the performance aspect has been quantified)

The motivation is to avoid 7 write calls which helps in large build system and easy on NFS!

Thank you.
-Hiral

Isn't llvm::errs() buffered, negating most of the benefit here.

If it is buffered, we would expect single write call to stderr. But we are seeing 7 write calls!

Yes, please suggest how to make llvm::errs() buffered that will be easy patch :)

Looks like you might be able to do something like "llvm::errs().setBuffered()" ?

(the patch description doesn't explain any specific motivation either - whether it's performance (runtime? memory usage? etc?) or something else, and how the performance aspect has been quantified)

The motivation is to avoid 7 write calls which helps in large build system and easy on NFS!

Sorry, I meant "why does any of this matter" I take it you mean "because 7 write calls are slower than 1 in <this situation> by <this much time/percentage>" - do you have rough data/description of the situation where the speed of printing error messages matters and by how much does it matter? (I think it would be good to have this data no matter the solution - be it explicit or built-in buffering)

Looks like you might be able to do something like "llvm::errs().setBuffered()" ?

Do we need to set it for each function using llvm:errs() (e.g. here printStats() )
OR can it be set once for entire clang-tidy ?

(the patch description doesn't explain any specific motivation either - whether it's performance (runtime? memory usage? etc?) or something else, and how the performance aspect has been quantified)

The motivation is to avoid 7 write calls which helps in large build system and easy on NFS!

Sorry, I meant "why does any of this matter" I take it you mean "because 7 write calls are slower than 1 in <this situation> by <this much time/percentage>" - do you have rough data/description of the situation where the speed of printing error messages matters and by how much does it matter? (I think it would be good to have this data no matter the solution - be it explicit or built-in buffering)

No worries. Thanks for your valuable reviews.

For example, consider clang-tidy running on 10,000 files... we expect to have minimal number of write calls. With the code-as-is makes 10,000 * 7 = 70,000 stderr write calls with small small chunk of strings!!!
With proposed changes OR with llvm::errs().setBuffered() set we will see only 10,000 legitimate stderr write calls :)

Thank you.
-Hiral

Tried using llvm::errs().SetBuffered() within printStats()...

<diff>
static void printStats(const ClangTidyStats &Stats) {
+ llvm::errs().SetBuffered()
</diff>

but still I see below stderr write calls...
...
write(2, "10712", 5) = 5
write(2, " warning", 8) = 8
write(2, "s", 1) = 1
write(2, " generated", 10) = 10
write(2, ".\n", 2) = 2
...
write(2, "Suppressed ", 11) = 11
write(2, "10703", 5) = 5
write(2, " warnings (", 11) = 11
write(2, "10703", 5) = 5
write(2, " in non-user code", 17) = 17
write(2, ").\n", 3) = 3
write(2, "Use -header-filter=.* to display"..., 136) = 136
...

We expect it to emit one write call only!

Looks like you might be able to do something like "llvm::errs().setBuffered()" ?

Do we need to set it for each function using llvm:errs() (e.g. here printStats() )
OR can it be set once for entire clang-tidy ?

I think it could be set for the whole process. (though there's a reason stderr isn't usually buffered - because if it's buffered and the process crashes, then you don't get all the output/maybe something important will be lost)

(the patch description doesn't explain any specific motivation either - whether it's performance (runtime? memory usage? etc?) or something else, and how the performance aspect has been quantified)

The motivation is to avoid 7 write calls which helps in large build system and easy on NFS!

Sorry, I meant "why does any of this matter" I take it you mean "because 7 write calls are slower than 1 in <this situation> by <this much time/percentage>" - do you have rough data/description of the situation where the speed of printing error messages matters and by how much does it matter? (I think it would be good to have this data no matter the solution - be it explicit or built-in buffering)

No worries. Thanks for your valuable reviews.

For example, consider clang-tidy running on 10,000 files... we expect to have minimal number of write calls. With the code-as-is makes 10,000 * 7 = 70,000 stderr write calls with small small chunk of strings!!!
With proposed changes OR with llvm::errs().setBuffered() set we will see only 10,000 legitimate stderr write calls :)

Right, but what I mean is "what observable difference do you see" - as a user you aren't counting write calls, you're looking at the wall time, for instance. What difference in user-observable behavior are you seeing/are you trying to address, and how much does this change help that user-observable behavior?

I don't know too much about clang-tidy, but the statistics output for 10,000 files doesn't seem like it'd be that useful, would it? So maybe the solution is to turn it off, rather than to make it faster? (generally error paths are thought of as not needing to be efficient - because we shouldn't be producing thousands of errors to be consumed, because readers aren't going to read thousands of errors, usually? Are your users/you reading all the output that clang-tidy is generating?)

Tried using llvm::errs().SetBuffered() within printStats()...

<diff>
static void printStats(const ClangTidyStats &Stats) {
+ llvm::errs().SetBuffered()
</diff>

but still I see below stderr write calls...
...
write(2, "10712", 5) = 5
write(2, " warning", 8) = 8
write(2, "s", 1) = 1
write(2, " generated", 10) = 10
write(2, ".\n", 2) = 2
...
write(2, "Suppressed ", 11) = 11
write(2, "10703", 5) = 5
write(2, " warnings (", 11) = 11
write(2, "10703", 5) = 5
write(2, " in non-user code", 17) = 17
write(2, ").\n", 3) = 3
write(2, "Use -header-filter=.* to display"..., 136) = 136
...

We expect it to emit one write call only!

Might be worth debugging through it and seeing what's happening, perhaps? I'm not sure exactly why that might not be buffering - perhaps it uses raw_fd_ostream::preferred_buffer_size which might be returning 0 if it's a terminal that it's outputting to? In that case you could try "SetBufferSize" that'll set a customized buffer size and, I think, enable buffering.

Looks like you might be able to do something like "llvm::errs().setBuffered()" ?

Do we need to set it for each function using llvm:errs() (e.g. here printStats() )
OR can it be set once for entire clang-tidy ?

I think it could be set for the whole process. (though there's a reason stderr isn't usually buffered - because if it's buffered and the process crashes, then you don't get all the output/maybe something important will be lost)

Sure

Might be worth debugging through it and seeing what's happening, perhaps? I'm not sure exactly why that might not be buffering - perhaps it uses raw_fd_ostream::preferred_buffer_size which might be returning 0 if it's a terminal that it's outputting to? In that case you could try "SetBufferSize" that'll set a customized buffer size and, I think, enable buffering.

I tried setting...
static void printStats(const ClangTidyStats &Stats) {
+ llvm::errs().SetBufferSize(4096);
+ llvm::errs().SetBuffered();

but still didn't work!

For example, consider clang-tidy running on 10,000 files... we expect to have minimal number of write calls. With the code-as-is makes 10,000 * 7 = 70,000 stderr write calls with small small chunk of strings!!!
With proposed changes OR with llvm::errs().setBuffered() set we will see only 10,000 legitimate stderr write calls :)

Right, but what I mean is "what observable difference do you see" - as a user you aren't counting write calls, you're looking at the wall time, for instance. What difference in user-observable behavior are you seeing/are you trying to address, and how much does this change help that user-observable behavior?

I don't know too much about clang-tidy, but the statistics output for 10,000 files doesn't seem like it'd be that useful, would it? So maybe the solution is to turn it off, rather than to make it faster? (generally error paths are thought of as not needing to be efficient - because we shouldn't be producing thousands of errors to be consumed, because readers aren't going to read thousands of errors, usually? Are your users/you reading all the output that clang-tidy is generating?)

It effectively slows down the processing of files.

Same is there for stdout too.

For stderr messages, as you suggested, we may need option to switch off certain (stderr) messages.
Can you please suggest based on which condition we can switch it off?

Looks like you might be able to do something like "llvm::errs().setBuffered()" ?

Do we need to set it for each function using llvm:errs() (e.g. here printStats() )
OR can it be set once for entire clang-tidy ?

I think it could be set for the whole process. (though there's a reason stderr isn't usually buffered - because if it's buffered and the process crashes, then you don't get all the output/maybe something important will be lost)

Sure

Might be worth debugging through it and seeing what's happening, perhaps? I'm not sure exactly why that might not be buffering - perhaps it uses raw_fd_ostream::preferred_buffer_size which might be returning 0 if it's a terminal that it's outputting to? In that case you could try "SetBufferSize" that'll set a customized buffer size and, I think, enable buffering.

I tried setting...
static void printStats(const ClangTidyStats &Stats) {
+ llvm::errs().SetBufferSize(4096);
+ llvm::errs().SetBuffered();

but still didn't work!

By the looks of the code, you may want to call SetBufferSize only (do not call SetBuffered after that - or it'll go back to the default buffer size of 0.

For example, consider clang-tidy running on 10,000 files... we expect to have minimal number of write calls. With the code-as-is makes 10,000 * 7 = 70,000 stderr write calls with small small chunk of strings!!!
With proposed changes OR with llvm::errs().setBuffered() set we will see only 10,000 legitimate stderr write calls :)

Right, but what I mean is "what observable difference do you see" - as a user you aren't counting write calls, you're looking at the wall time, for instance. What difference in user-observable behavior are you seeing/are you trying to address, and how much does this change help that user-observable behavior?

I don't know too much about clang-tidy, but the statistics output for 10,000 files doesn't seem like it'd be that useful, would it? So maybe the solution is to turn it off, rather than to make it faster? (generally error paths are thought of as not needing to be efficient - because we shouldn't be producing thousands of errors to be consumed, because readers aren't going to read thousands of errors, usually? Are your users/you reading all the output that clang-tidy is generating?)

It effectively slows down the processing of files.

Do you have any measurements of how much it slows things down/how much the patch you're proposing speeds things up?

Same is there for stdout too.

For stderr messages, as you suggested, we may need option to switch off certain (stderr) messages.
Can you please suggest based on which condition we can switch it off?

Looks like, judging by the code above on line 481/482, that --quiet would turn off this stat printing stuff - but also, these stats are only printed if there were ignored errors, by the looks of it? Are you trying to run clang-tidy over a codebase without the correct flags for your codebase, or is there some other reason clang-tidy would be printing/ignoring a lot of errors?

In any case, I'd suggest either fixing the errors or, if you've decided the errors are acceptable for some reason, using --quiet to silence the errors you aren't interested in.

By the looks of the code, you may want to call SetBufferSize only (do not call SetBuffered after that - or it'll go back to the default buffer size of 0.

Oh! I missed it!

For example, consider clang-tidy running on 10,000 files... we expect to have minimal number of write calls. With the code-as-is makes 10,000 * 7 = 70,000 stderr write calls with small small chunk of strings!!!
With proposed changes OR with llvm::errs().setBuffered() set we will see only 10,000 legitimate stderr write calls :)

Right, but what I mean is "what observable difference do you see" - as a user you aren't counting write calls, you're looking at the wall time, for instance. What difference in user-observable behavior are you seeing/are you trying to address, and how much does this change help that user-observable behavior?

I don't know too much about clang-tidy, but the statistics output for 10,000 files doesn't seem like it'd be that useful, would it? So maybe the solution is to turn it off, rather than to make it faster? (generally error paths are thought of as not needing to be efficient - because we shouldn't be producing thousands of errors to be consumed, because readers aren't going to read thousands of errors, usually? Are your users/you reading all the output that clang-tidy is generating?)

It effectively slows down the processing of files.

Do you have any measurements of how much it slows things down/how much the patch you're proposing speeds things up?

Sorry, don't have stats now.

In any case, I'd suggest either fixing the errors or, if you've decided the errors are acceptable for some reason, using --quiet to silence the errors you aren't interested in.

I think we tried with --quiet... but let me retry it...
Meantime, we can pause review of this patch.

JFYI: using --quiet avoids call to following write calls...so that is useful.
write(2, "Suppressed ", 11) = 11
write(2, "10703", 5) = 5
write(2, " warnings (", 11) = 11
write(2, "10703", 5) = 5
write(2, " in non-user code", 17) = 17
write(2, ").\n", 3) = 3
write(2, "Use -header-filter=.* to display"..., 136) = 136

But still we have following stderr write calls, not sure which code emits...
write(2, "10712", 5) = 5
write(2, " warning", 8) = 8
write(2, "s", 1) = 1
write(2, " generated", 10) = 10
write(2, ".\n", 2) = 2

Another observation w.r.t. stdout...

For example, consider following sample program:
<sample_code>
#include <iostream>
#include <string>

using namespace std;
static long long A = 0ull;

void f(const std::string& a) {

std::cout << a << std::endl;

}

int main() {
}
</sample_code>

When running clang-tidy on this... we can see about 343 stdout write calls...
e.g.
write(1, "\33[1m", 4) = 4
write(1, "/some/path/a.cc", 14) = 14
write(1, ":", 1) = 1
write(1, "4", 1) = 1
write(1, ":", 1) = 1
write(1, "1", 1) = 1
write(1, ":", 1) = 1
write(1, " ", 1) = 1
write(1, "\33[0m", 4) = 4
write(1, "\33[0;1;35m", 9) = 9
write(1, "warning", 7) = 7
write(1, ": ", 2) = 2
write(1, "\33[0m", 4) = 4
write(1, "\33[1m", 4) = 4
write(1, "do not use namespace using-direc"..., 100) = 100
write(1, "\33[0m", 4) = 4
write(1, "\n", 1) = 1
write(1, "using namespace std;", 20) = 20
write(1, "\n", 1) = 1
write(1, "\33[0;1;32m", 9) = 9
write(1, "^", 1) = 1
write(1, "\n", 1) = 1
write(1, "\33[0m", 4) = 4
write(1, "\33[1m", 4) = 4
...

Shouldn't clang-tidy stdout with reasonable buffer-size (e.g. 128 or 512 or 1024) ?

I imagine, again, the idea is that if you're generating that many warnings that the performance of printing warnings matters, perhaps you're not paying attention to those warnings? you could disable the ones you aren't interested in? Especially if they're being printed to a terminal (which, since teh output seems to be coloured/including ansi escape codes, it looks like they are going to a terminal) - if you're getting thousands of pages of warnings, hardly seems likely that you can see/act on them?

But sure, in general, probably would be fine to buffer stdout too, might be nice. Same sort of solution - enabling buffering with some reasonable buffer size. (can't be buffered on some platforms if coloured output is being used, like windows, where colouring requires API calls so it's not embedded in the stream like ansi escape codes)

Might be worth checking what clang or other LLVM tools do about this sort of thing, maybe there's an existing solution that can be ported over.