This is an archive of the discontinued LLVM Phabricator instance.

[DO NOT SUBMIT] Add -vs-diagnostics.
ClosedPublic

Authored by chrisjackson on Feb 20 2019, 5:41 PM.

Details

Summary

This patch adds -vs-diagnostics flag to lld so that the linker prints
out error messages in the format that is friendly to Microsoft Visual Studio.

Since the existing error message format of lld and MSVC are significantly
different, and in most cases we print out more information than MSVC,
it is not practical nor beneficial to make lld to print out the same error
messages as MSVC does.

It looks like the most annoying issue if you are using lld on MSVC is that
the IDE automatically makes the first word of the first line of a linker
error message a hypertext, expecting that the word is an error location.
We print out "ld.lld" there by default, so when you click that string on
the IDE, it opens the linker binary file -- which is not useful at all.
In this patch, I'm focusing on fixing that issue.

This patch adds a function to find an error location in an error message.
If -vs-diagnostics is given, lld calls that function and prints out its
return value instead of "ld.lld". That operation is purely textual; I
used regexps to recognize some error message patterns to extract an error
location.

An alternative approach would have been abstracting error message
construction so that we can construct both the existing and MSVC-style
error messages.

Doing everything over a text might seems too primitive compared to adding
an abstraction layer, but I think the former is overall a better approach
than the latter. It is much less intrusive and very easy to understand.
And if you think of it, this is similar to what IDEs do to recognize and
highlight compiler and linker error messages. They do recognize some
patterns because linkers print out somewhat structured error messages.
If IDEs can recognize it, so do we. If our error message is machine-
unfriendly, we should change the error message.

This is a counter-proposal to https://reviews.llvm.org/D58300. This is
a proof-of-concept to share the idea. I think I need to add a few more
regexes to make it usable.

Diff Detail

Event Timeline

ruiu created this revision.Feb 20 2019, 5:41 PM
ruiu edited the summary of this revision. (Show Details)Feb 20 2019, 5:46 PM

For me, the code should be, above all, reliable. You can trust it only in that case; otherwise, you never know, when and how it is going to be broken and produce an unexpected result.

I would suggest a simple test for both approaches. What if a developer adds a new diagnostic message?

With D58300 it will probably just work as expected in both modes out-of-the-box. You even don't need to check that it is properly formatted, because most of the formatting is performed in one place, uniformly.

With this patch (D58484) the developer might not be aware of the existence of the --vs-diagnostics switch. Thus, recognizing patterns will not probably be updated and the message generation will be broken for an unpredictable time until someone comes across that new message in Visual Studio.

To sum up, I consider this patch to be very fragile. It would be hard for me to welcome it.

I have also noticed the following differences in the generated messages compared to D58300:

  • The line number is not enclosed in parentheses, how it should.
  • The patch is unable to generate separate messages for each duplicate symbol, which D58300 does.
  • The patch sometimes uses unsuitable strings for the source location, for example, an offset in a section.
  • In some cases it cannot recognize the source location to be used in the front of the message.

You may see all these issues with tests from D58300. It looks like trying to fixing them will complicate this patch and make it even more fragile.

I believe that D58300 not only adds VS-compatible diagnostic messages but makes a step towards the better code quality. It separates message formatting in a distinct place, making the other code cleaner. It also adds a possibility to add other formatters in the future, not only for different IDEs but for any other tool.

ruiu added a comment.Feb 22 2019, 2:28 PM

For me, the code should be, above all, reliable. You can trust it only in that case; otherwise, you never know, when and how it is going to be broken and produce an unexpected result.

I would suggest a simple test for both approaches. What if a developer adds a new diagnostic message?

With D58300 it will probably just work as expected in both modes out-of-the-box. You even don't need to check that it is properly formatted, because most of the formatting is performed in one place, uniformly.

With this patch (D58484) the developer might not be aware of the existence of the --vs-diagnostics switch. Thus, recognizing patterns will not probably be updated and the message generation will be broken for an unpredictable time until someone comes across that new message in Visual Studio.

I think I disagree. The error messages we are printing out is under our control, so we don't have to worry too much about the hypothetical scenario that "what if someone unaware of this does <something>". We know what we are printing, and we'd also like to make it machine-readable in some degree. The number of error messages added for each release is very small (I'd guess less than ten in average), and that's pretty manageable number. In addition to that, even when something goes wrong and some error message fell through this mechanism, it is not an end-of-the-world kind of thing. It is just mildly annoying if you are using an IDE and there's a link error. But I don't think even that would happen.

To sum up, I consider this patch to be very fragile. It would be hard for me to welcome it.

I have also noticed the following differences in the generated messages compared to D58300:

  • The line number is not enclosed in parentheses, how it should.

It shouldn't be too hard to fix it.

  • The patch is unable to generate separate messages for each duplicate symbol, which D58300 does.

I'd like to first discuss what would be expected as an error message for a duplicate symbol, as IIUC, MSVC doesn't show two separate error messages for two conflicting locations.

  • The patch sometimes uses unsuitable strings for the source location, for example, an offset in a section.
  • In some cases it cannot recognize the source location to be used in the front of the message.

It shouldn't be too hard to fix them, too.

You may see all these issues with tests from D58300. It looks like trying to fixing them will complicate this patch and make it even more fragile.

I believe that D58300 not only adds VS-compatible diagnostic messages but makes a step towards the better code quality. It separates message formatting in a distinct place, making the other code cleaner. It also adds a possibility to add other formatters in the future, not only for different IDEs but for any other tool.

I'd like to emphasize the cost of reading, understanding and maintaining code. Imagine that you were a first-time reader of the lld source code and looking for a way to print out an error message. The current way of doing it is very easy. It is also extremely easy to identify which line of code corresponds to a specific error message. That readability comes from the fact that we don't do too much clever things when printing out error messages. I can see that abstracting the error message construction to match exiting IDE may seem like the "right" approach, but we should be careful not to overdesign. This is perhaps a "worse is better" situation. If something very simple-minded works just enough, we should take it instead of a more complex and theoretically comprehensive one as a first approach.

aganea added a subscriber: aganea.Feb 22 2019, 4:06 PM

I'm working on extending this to pass the existing tests from D58300. Like @ikudrin , I am not fond of this regex method, but I think we will all be in a better place to make an evaluation once this implementation can pass the tests.

chrisjackson commandeered this revision.Mar 21 2019, 8:46 AM
chrisjackson edited reviewers, added: ruiu; removed: chrisjackson.

I have expanded Rui's regex implementation to cover the existing tests for Visual Studio diagnostics compatibility patches,

Does anyone have any comments on this expanded alternative implementation of vs-diagnostics?

ruiu added a comment.Apr 26 2019, 2:57 AM

Apologies for the belated response. I'm happy with this patch.

Common/ErrorHandler.cpp
98

nit: add {}

ELF/Driver.cpp
761

The members of this class have the same names of the corresponding options, so VsDiagnostics.

ELF/Options.td
363

Please indent just like other definitions.

MaskRay added inline comments.Apr 26 2019, 4:11 AM
Common/ErrorHandler.cpp
89

How about using llvm::Regex instead of std::regex. I assume that may make the increased code size smaller.

ruiu added inline comments.May 7 2019, 2:22 AM
Common/ErrorHandler.cpp
89

I'm not sure if we still should maintain llvm::Regex as I believe it is written before C++11. Maybe we should migrate to std::regex? I'm also not worried too much about the binary size, as lld already contains entire LLVM and std::regex shouldn't be too large.

Additional regexes to match more messages containing source:line information. Also added a test.

Does anyone have any thoughts on these additions?

ruiu accepted this revision.Jul 16 2019, 3:40 AM

LGTM

Sorry for the belated response. Variable names should be now in camelBack, so please update your code before submitting.

This revision is now accepted and ready to land.Jul 16 2019, 3:40 AM
This revision was automatically updated to reflect the committed changes.
rnk added a subscriber: rnk.Jul 18 2019, 2:04 PM
rnk added inline comments.
Common/ErrorHandler.cpp
89

FYI, this broke Chromium's packaging of clang: https://crbug.com/929645#c5
I think it's just time for us to upgrade to a newer GCC, so I don't think we should do anything here. But that won't stop me from shaking my fist at folks for creating work for us. =P