This is an archive of the discontinued LLVM Phabricator instance.

clang-format: Add a consumer to diagnostics engine
AbandonedPublic

Authored by HazardyKnusperkeks on Oct 25 2020, 8:51 AM.

Details

Summary

Otherwise problems like trying to format readonly file in-place led to crashes.

I've added reviewers by looking at git blame and other reviews to the changed file, so may have missed someone.

Diff Detail

Event Timeline

dmikis created this revision.Oct 25 2020, 8:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 25 2020, 8:51 AM
dmikis requested review of this revision.Oct 25 2020, 8:51 AM
dmikis retitled this revision from Add a consumer to diagnostics engine to clang-format: Add a consumer to diagnostics engine.Oct 25 2020, 8:56 AM

This looks reasonable.

I'm curious about DiagnosticsEngine: previously we were using the defaults for client and ShouldOwnClient ctor params (nullptr and true). If this usage leads to crashes, isn't the issue in DiagnosticsEngine itself?

If this usage leads to crashes, isn't the issue in DiagnosticsEngine itself?

I don't know design intent behind DiagnosticsEngine. As far as I can understand from source code it doesn't try to create any consumers by itself yet requires one to be provided (there're asserts that client is not nullptr). Reasonable thing to do may be to remove default for client parameter of DiagnosticsEngine constructor? I'm strugling to build whole LLVM under MSVC 19.27.29112 due to some weird problems with STL. I'll check later whether it's feasable to simply remove default.

krasimir accepted this revision.Oct 27 2020, 3:37 AM

Thank you! Your fix looks good for clang-format.

I'm also not familiar with DiagnosticsEngine. It would be nice to follow-up with folks on that on either removing the defaulted ctor params and possibly adding assert-s in the ctor itself, or else updating the bits that assert that the client is not a nullptr to something that behaves reasonably in this case. There could be a good reason why the code is as-is; it would be nice if that is documented at the ctor or class level.

This revision is now accepted and ready to land.Oct 27 2020, 3:37 AM

@krasimir What would our next steps be to get this change into LLVM source tree? JIC, I don't have commit rights :)

@krasimir What would our next steps be to get this change into LLVM source tree? JIC, I don't have commit rights :)

I'll commit this for you. If you're planning to commit regularly, and after you have authored a few patches that get committed, you can follow the instructions on obtaining commit access: https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

This revision was landed with ongoing or failed builds.Oct 29 2020, 3:29 AM
This revision was automatically updated to reflect the committed changes.
thakis reopened this revision.Oct 29 2020, 7:29 AM
thakis added a subscriber: thakis.

This doe snot lgtm. clang-format should not depend on clangFrontend, see the lengthy discussion in https://reviews.llvm.org/D68554 . Let's try to find another fix here (and include a test for whatever this fixes). I'll revert this for now.

This revision is now accepted and ready to land.Oct 29 2020, 7:29 AM
thakis requested changes to this revision.Oct 29 2020, 7:30 AM
This revision now requires changes to proceed.Oct 29 2020, 7:30 AM

This doe snot lgtm. clang-format should not depend on clangFrontend, see the lengthy discussion in https://reviews.llvm.org/D68554 . Let's try to find another fix here (and include a test for whatever this fixes). I'll revert this for now.

Thank you Niko for the revert and the pointer to the discussion! Good to know that it is undesirable to have clang-format depend on clangFrontend!

@dmikis: This constraint limits the options we have with DiagnosticsEngine here. A path forward may be to poke around DiagnosticsEngine to not crash with a null client; I'm unsure on the feasibility of that.

@thakis There's a simple option of moving TextDiagnosticPrinter into libBasic thus eliminating neccessety to depend on libFrontend. How does that sound?

Can you expand a bit on what exactly is going wrong? (As part of that, please include a test case that shows the crash this is about.)

@thakis We've stumbled upon this problem when tried to format in-place a read-only file. clang-format tries to report this problem via DiagnosticEngine, and that fails on assert. clang-format crashes. We use perforce and source files being read only is a common occurrence for us.

Is it possible to add an automated test for this in clang/test/Format/... using chmod -w in a RUN line?

dmikis added a comment.EditedOct 30 2020, 9:53 AM

@thakis I've got test working, here's the code:

// RUN: chmod +w %t.dir || true
// RUN: rm -rf %t.dir
// RUN: mkdir %t.dir
// RUN: cp %s %t.dir/read-only.cpp
// RUN: chmod -w %t.dir
// RUN: chmod -w %t.dir/read-only.cpp
// RUN: not clang-format -style=LLVM -i %t.dir/read-only.cpp
//
// CHECK: {{permission denied}}

int main() {
        return 0; }

There's a difference in how Linux and Windows behaves. Linux unlinks read-only file without any errors as long as directory files resides in is writable. That's why I run chmod on directory and file.

I can't check it on Windows. It seems tests aren't designed to run on windows at all because of it's obvious reliance on unix-like environment. Installing MinGW helped.

MyDeveloperDay added a subscriber: MyDeveloperDay.EditedOct 31 2020, 7:03 AM

@thakis didn't you and I go through this process of removing frontEnd once before? do we really want to add this back in for all the reasons you said before?

D69854: [clang-format] [RELAND] Remove the dependency on frontend

dmikis added a comment.EditedOct 31 2020, 7:28 AM

@MyDeveloperDay That change introduced a regression: clang-format crashes instead of reporting errors in certain situations. See my comment with a test case.

As I've mentioned, reasonable solution seems to be to move TextDiagnosticPrinter to libBasic. That way clang-format can use it and no libFrontend dependency would be required.

MyDeveloperDay added a comment.EditedOct 31 2020, 9:52 AM

@dmikis just for history sake I introduced the libFrontEnd when doing D68554: [clang-format] Proposal for clang-format to give compiler style warnings, D69854: [clang-format] [RELAND] Remove the dependency on frontend was an attempt to remove the need for frontEnd while doing the same thing, so its not fair to say that D69854
was the cause of a regression its actually my belief that it would have failed before (unless I'm mistaken)

Having said that I understand what you are trying to do...and agree it being moved to libBasic would help us here.

thakis added a comment.Nov 9 2020, 6:46 AM

Thanks for the test!

I think moving TextDiagnosticsPrinter to Basic is much better than making Format depend on Frontend. If you just want to fix your crash and be done with it, then doing that (and adding your test) is fine.

I think the best fix would be to change Basic to never emit diagnostics though. That would also fix this crash, and it'd completely remove clang-format's dependency on the huge array with all of clang's diag, which would significantly reduce the size of clang-format (iirc it'd cut it in half or something). This would need some reorganizing -- Basic would have to call some callback on errors instead, and in clang these callbacks would emit normal diags but in clang-format they'd do something else. If you wanted to look into that, that'd be amazing :)

@thakis , At the back of my mind is those people who always say "Measure, Measure, Measure"...

This issue still plagues us, we constantly get bugs reported that come back to this. (https://bugs.llvm.org/show_bug.cgi?id=52021,https://bugs.llvm.org/show_bug.cgi?id=42014)

So I decided I should look into the concerns with us fixing this issue.

I decided to do 3 tests:

  • a) Build current tip of main
  • b) Apply this FrontEnd fix
  • c) Move TextDiagnosticPrinter (and dependencies) into lib/Basic

I took a look and in its simplest form moving TextDiagnosticPrinter.cpp to lib/Basic but also requires DiagnosticRender.cpp and TextDiagnostic.cpp to also move to lib/Basic too,

But DiagnosticRender.cpp has a dependency on lib/Edit so clang-format now needs that as a dependency, is that better/worse than having a dependency on lib/FrontEnd? (not sure why I don't have to provide "Edit" as a dependency when including FrontEnd!)

For each test, I want to check

  • a) size of the executable produced (debug/release)
  • b) number of build dependencies (from ninja after ninja clean)

The size of the executable I assume is a concern as this impacts the runtime of clang-format as the system loads the module,
The number of build dependencies comes down to how quickly developers working on clang-format can rebuild.

I'm not concerning myself with the final "link" time of clang-format, as its relatively trivial on a modern machine, and nothing in comparison to building from clean.

Firstly I think we should recognize that this "crash" against a read-only file IS a problem that most of us will see from time to time and at best
its annoying. So I think there is enough momentum for us to have a solution of one sort of another.

Build speeds

Build speeds are important if you are a LLVM developer as you don't really want to have to keep rebuilding, I personally use ninja on Windows
having given up using vs projects generated by cmake years ago as being way too slow in the first place.

Ninja will give me the number of "targets" needed to compile, rather than timing it, I'll simply work on the assumption that "less is best"

$ ninja clang-format
[698/1298] Building CXX object lib\Object\CMakeFiles\LLVMObject.dir\SymbolicFile.cpp.obj

Before building each of the targets I perform a "ninja clean" so that hopefully the build that follows shows me how many targets I needed

  1. Using the current tip of master, the number of targets to build clang-format sits at 608
  2. If I apply this FrontEnd patch D90121: clang-format: Add a consumer to diagnostics engine it rises to 1352

3 )If I apply a patch that moves just TextDiagnosticPrinter/TextDiagnostic/DiagnosticRender into lib/Basic then the number is 1298

Executable Size

For Completeness I built both Debug and Release (mainly because I live in Debug as I work on clang-format, but the releases are Release (i assume))

Debug (clang-format)

For Debug there seems no benefit for moving the code out of FrontEnd into Basic

-rwxr-xr-x 1  17425920 Oct 14 14:48 clang-format-frontend.exe   (Debug)
-rwxr-xr-x 1  17425920 Oct 14 14:40 clang-format.basic.exe (Debug)
-rwxr-xr-x 1  17171968 Oct 14 14:44 clang-format.original.exe (Debug)

The modified debug binaries are only 1.5% larger.

Release (clang-format)

For Release the Basic version was actually larger.

-rwxr-xr-x 1   2601472 Oct 14 15:17 clang-format-release.basic.exe   (Release)
-rwxr-xr-x 1   2600960 Oct 14 15:12 clang-format-release-frontend.exe (Release)
-rwxr-xr-x 1   2041344 Oct 14 15:14 clang-format-release.orginal.exe (Release)

With both Basic and FrontEnd release binaries being ~27% larger.

Conclusion

This bug is annoying.

  1. I just don't think an approach that moves the files into Basic is the solution.
  • a) It requires multiple CMakeFile changes
  • b) We have to leave header stubs lying around (or fix all the other tools that might include them and that is ugly)
  • c) The binary is actually larger (go figure!)
  • d) The number of files to be build is 213% as many as not having this fix, even if its ~10% less than the frontend fix (thats not worth the difference)
  1. Frontend fix is the simplest solution (and has been sent for review at least 3 times that I can find)
  • a) Its slightly annoying because I keep having to defend why we don't want to fix it!
  • b) Its 222% more files during the build cycle (however we don't build clean every time when building clang-format) so for a clang-format developer the impact is likely minimal
  • c) The binaries are larger, less than the basic but still ~27% larger (which one could see as causing a slowdown on startup, although I have not measured that)

I'm kind of the opinion that we should resolve the crash before worrying about binary size aspect. Like I said I primarily use debug builds of clang-format
even during my working day so that I am testing what we are developing and whilst they are slower because they are debug, I actually don't find them intolerable and they are almost 900% larger.

I have yet to investigate if there is some other way. (like guarding the crash)

I feel we should take another look to see if we can't find a simpler alternative, but I personally don't find the build times or binaries sizes to be pervasive to starting with this as a fix.

Could we discuss it again given that it continues to be an annoyance, keeps getting reported, has a simple enough initial fix, doesn't seem to have a terrible detrimental impact.

MyDeveloperDay.

I took a look and in its simplest form moving TextDiagnosticPrinter.cpp to lib/Basic but also requires DiagnosticRender.cpp and TextDiagnostic.cpp to also move to lib/Basic too,

But DiagnosticRender.cpp has a dependency on lib/Edit so clang-format now needs that as a dependency, is that better/worse than having a dependency on lib/FrontEnd? (not sure why I don't have to provide "Edit" as a dependency when including FrontEnd!)

I did make an attempt to move those classes to lib/Basic but ultimately dependency tree turned out to be too complex for me to tackle (IIRC, it basically ended up depending upon Lexer or something similar, that wasn't appropriate to move to lib/Basic). Since we used heavily patched version of clang-format (including this patch) that problem didn't bugged us and I kinda moved on to other things :/

HazardyKnusperkeks commandeered this revision.Nov 3 2022, 5:57 AM
HazardyKnusperkeks edited reviewers, added: dmikis; removed: HazardyKnusperkeks.
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2022, 5:57 AM
HazardyKnusperkeks abandoned this revision.Nov 3 2022, 5:58 AM

Since we used heavily patched version of clang-format (including this patch) that problem didn't bugged us and I kinda moved on to other things :/

This is a bit out of left field, but these comments reminded me of something - a long time ago I was working on fixing some Clang layering to potentially use explicit modules more in our internal build (& then hopefully in the upstream/external build) and one major holdup was the layering of diagnostics - essentially it's somewhat circular that each clang library owns its diagnostics, but libBasic depends on all those lists of diagnostics. Yeah, we could just move the diagnostic files down into libBasic, but the layering is problematic and it'd be great to fix it.

The idea, which I never got around to implementing, was that perhaps each library could have its own diagnostic lists, and the diagnostic library/engine itself would have no hardcoded knowledge of them - different tools, using different subsets of libraries, would have to initialize the diagnostic engine with all the diagnostic groups that cover the libraries they were going to use/they were going to pass the diagnostics engine to. That would fix the layering and lower the overhead of using diagnostics in tools that only need a small subset of the libraries and diagnostics clang uses.

Would that be relevant to address thhe concerns here? Would anyone with a vested interest in this review be interested in pursuing that direction? I'd be happy to help/throw around ideas, etc.

The issue that caused this issue was fix by the fix in https://reviews.llvm.org/rGa92cf5a5a0cd01145f8db2ae09334a8b43a1271b , from the clang-format perspective I don't really feel we need to revisit bringing in these libraries.

I think I proved that moving everything to lib/Basic isn't always the best way to go, if keeping the dependencies low (if speed of building the tools is of concern, which it is for me at least)

Ultimately if there is a better way for this to be structured, it needs to be handled by someone who is the owner of clang/Basic and/or llvm/Support, I don't feel like our team is in the best position to drive that.