This is an archive of the discontinued LLVM Phabricator instance.

[Diagnostics] Accept newline and format diag opts on first line
Needs ReviewPublic

Authored by vangthao on Jun 15 2022, 5:16 PM.

Details

Summary

Accept newlines when the diagnostic string is being given by an outside source
instead of ignoring newlines. I believe the other processing done in
FormatDiagnostic() does not ignore newlines so we should also not ignore it
here.

Also format diagnostic options on the first line when there are multiple lines.

The motivation behind this patch is to allow optimization remarks from the
backend to be able to format its output with newlines. Prior to this review if
we use newlines with optimization remarks, they would be ignored and this makes
it dificult to do any sort of information display formatting if we want to
output various type of information at once as was attempted in
https://reviews.llvm.org/D123878

Diff Detail

Event Timeline

vangthao created this revision.Jun 15 2022, 5:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2022, 5:16 PM
Herald added a subscriber: martong. · View Herald Transcript
vangthao requested review of this revision.Jun 15 2022, 5:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2022, 5:16 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Please, consider updating the summary to clearly specify the motivation for making this change.
It describes what this patch intends to do, but I'm about the why.

dexonsmith resigned from this revision.Jun 16 2022, 8:50 AM
vangthao updated this revision to Diff 437593.Jun 16 2022, 10:19 AM

Update commit message to include the motivation behind the patch.

vangthao edited the summary of this revision. (Show Details)Jun 16 2022, 10:20 AM

Thank you for the patch! I will leave it to others with more knowledge of the history of diagnostics to comment on whether the premise is acceptable, but I have a few technical comments

clang/lib/Frontend/TextDiagnosticPrinter.cpp
119–137

Nit: I'd remove the ifs and just use the behavior of slice(npos, ...) returning the empty string

I also would reword "split the first line and feed it into printDiagnosticOptions", as it could be interpreted as meaning the first line is an input to printDiagnosticOptions

clang/lib/StaticAnalyzer/Core/TextDiagnostics.cpp
99–106

It's frustrating that std::string::insert doesn't follow the pattern of npos==end-of-string, but I'd still suggest only doing the actual insert/append once

100

I believe they are guaranteed to be equivalent, but this should be StringRef::npos or you should allocate the std::string sooner

101

I think you incur an extra copy assignment here, as the insert returns an lvalue-reference

scott.linder added inline comments.Jun 17 2022, 11:25 AM
clang/lib/StaticAnalyzer/Core/TextDiagnostics.cpp
99–106

Or to be more precise, I would try to only have one piece of code doing the insert/append

Thanks for updating the summary.

I'm not too confident around diagnostics though.
I'm going to piggyback on @scott.linder's comments. I think they are on the point.

nit: I'm not sure if size_t is a thing in c++, std::size_t is though.

clang/lib/Basic/Diagnostic.cpp
815

Should we account for '\r'-s as well? I'm not sure how it works though

clang/lib/Frontend/TextDiagnosticPrinter.cpp
119–137

Use braces for both branches or no braces at all. LLVM CodingStandards

I'm a bit uncomfortable with this as we (at least currently) want to discourage non-terse diagnostics, and allowing newlines encourages longer diagnostics. There's an RFC kicking around about making more expressive diagnostics and so this discomfort may pass with time. (The only three uses of \n in our current Clang diagnostics are all driver diagnostics and all three of them are new for the HLSL stuff, wrong, and need to be reworded. CC @beanz to look into fixing up https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Basic/DiagnosticDriverKinds.td#L674).

Is there a reason the remarks can't use individual diagnostic emissions to simulate newlines? Or is this perhaps a demonstration that the remarks should not be using the diagnostic engine at all and should be emitting their output to a user-controllable stream (or file)?

Is there a reason the remarks can't use individual diagnostic emissions to simulate newlines? Or is this perhaps a demonstration that the remarks should not be using the diagnostic engine at all and should be emitting their output to a user-controllable stream (or file)?

An issue with using multiple diagnostic emissions to simulate newlines is that there are repeated information such as diagnostic location and diagopts reprinted for each line. This creates a lot of noise and makes the output harder to read. Accepting newlines in diagnostics would allow for a lot more flexibility but I do see your point on how it would also encourage longer diagnostics.

Is there a reason the remarks can't use individual diagnostic emissions to simulate newlines? Or is this perhaps a demonstration that the remarks should not be using the diagnostic engine at all and should be emitting their output to a user-controllable stream (or file)?

An issue with using multiple diagnostic emissions to simulate newlines is that there are repeated information such as diagnostic location and diagopts reprinted for each line. This creates a lot of noise and makes the output harder to read.

I can see how that's the case if the prefix information is different with each diagnostic line, but if they're all repeating the same prefix, that noise should be lessened by quite a bit (and for those truly distracted by it, using the same prefix each time makes it easier to strip the prefix in an editor). e.g.,

remark: foo.cl:27:0: 6 instructions in function
remark: foo.cl:27:0: Kernel Name: test_kernel
remark: foo.cl:27:0: SGPRs: 24
remark: foo.cl:27:0: VGPRs: 9
remark: foo.cl:27:0: AGPRs: 43
remark: foo.cl:27:0: ScratchSize [bytes/thread]: 0
remark: foo.cl:27:0: Occupancy [waves/SIMD]: 5
remark: foo.cl:27:0: SGPRs Spill: 0
remark: foo.cl:27:0: VGPRs Spill: 0
remark: foo.cl:27:0: LDS Size [bytes/block]: 512
remark: foo.cl:27:0: ------------------------------

doesn't look that terrible to me because all the salient information is visually aligned. (Line wrapping would be an issue, but the same is true for line wrapping mixed with manual newlines.) Do you have the ability to control the location of the diagnostic so you can keep the prefix the same throughout the report (or use the same prefix within a notional "section" of the report so that blocks of related remarks are visually grouped with the same prefix)?

My objection to the original patch was including things like the "----" lines as remarks to mark a section. I don't think these should be emitted as remarks and are a display function the frontend should take care of if we want any kind of grouping