# [lld] Enable Visual Studio compatible outputNeeds ReviewPublicActions

Authored by chrisjackson on May 30 2018, 9:16 AM.

# Details

Reviewers
 ruiu espindola inglorion
Summary

Adds diagnostic output functions to enable Visual Studio
compatibility with the flag --vs-diagnostics. Clang has a
flag which enables similar support, see
"-fdiagnostics-format=clang/msvc/vi" described
in https://clang.llvm.org/docs/UsersManual.html.
The existing diagnostics output format for warning and error messages is
not fully compatible with Visual Studio (VS) and can even cause VS to become
unresponsive.

The format accepted by VS
(https://msdn.microsoft.com/en-us/library/yxkt8b26.aspx):

{filename (line# [, column#]) | toolname} : [any text] {error | warning}
code####:localizable string [ any text ]

or more simply:

Origin : {warning|error} code#### : Message

The current default lld format uses the lld executable path for Origin.
Clicking the message inside VS will lead to a hang or long delay as VS
attempts to access the executable as a source code origin. Additionally, the
line and column information accepted by VS differs from the lld format.

I've included an additional lit test to ensure diagnostics output in
--vs-diagnostics mode is as expected. All existing lit tests that check
diagnostic output also help ensure that the previous lld output format
is maintained when the flag is not enabled.

# Diff Detail

### Event Timeline

chrisjackson created this revision.May 30 2018, 9:16 AM
ruiu added a comment.May 30 2018, 9:35 AM

I reviewed this patch, but I'm not convinced that we really want this. You are changing the ELF linker, but how many people are using our ELF linker on Visual Studio? Did you try to talk with Microsoft to let them make a change for you instead of making a change to us? I'm not trying to reject this, but adding a new feature and a new command line flag always needs a fairly strong justification, so please explain how much useful this is.

ELF/Driver.cpp
76–77

This should be moved to ErrorHandlers.cpp.

705–706

This is perhaps too late. This should be processed at the same timing as --color-diagnostics.

ELF/DriverUtils.cpp
135

Why do you need to add this code? If it is working now, you don't need to do this.

ELF/InputFiles.cpp
97

No else after return.

ELF/Relocations.cpp
646

You include Src in an error message and also pass Src to the function. As a result, I believe the same string is printed out more than once. That would look pretty odd.

include/lld/Common/ErrorHandler.h
116–117

I don't like a function with a default argument as it is too easy to mess up. People casually add additional parameters to existing functions, and the result would be a function that has tons of parameters. Please avoid doing this if possible.

ELF/Relocations.cpp
646

Src remains inside Msg so that the output diagnostic message text remains unchanged if --vs-diagnostics is not enabled. It is is also passed as a separate parameter so that it can be used as the origin component of the message when --vs-disagnostics is enabled. This enables the source hyperlink functionality in the VS error list.

ruiu added a comment.May 30 2018, 10:02 AM

I mean from the user's perspective, showing the same filename more than once seems just odd, no?

grimar added a subscriber: grimar.May 31 2018, 1:15 AM

Hi Rui! Thanks for looking at the patch.

I reviewed this patch, but I'm not convinced that we really want this. You are changing the ELF linker, but how many people are using our ELF linker on Visual Studio?

Surprisingly many! Wikipedia currently lists 1765 planned or released PlayStation(R) 4 games, each of which was built with our elf toolchain.

Did you try to talk with Microsoft to let them make a change for you instead of making a change to us? I'm not trying to reject this, but adding a new feature and a new command line flag always needs a fairly strong justification, so please explain how much useful this is.

Visual Studio's diagnostic message format is mature and well documented, so although we can propose a change, it seems unlikely that microsoft will implement a change in the near future.

It looks like Chris' patch could be adapted for COFF lld too, with a few changes. Do you think that would be useful?

Thanks for reviewing Rui, I've made the changes as described in my responses to your comments. Primarily:

• removed the extra set of LogName
• made clearer the extraction of the linker executable name from Args
• Made a correction to print() in ErrorHandler
• Removed unnecessary else in inputfiles.cpp
ruiu added a comment.Jun 6 2018, 2:17 PM

Could you share a few real examples of error messages with and without this patch to see how they look like?

Updated the lit test to show more examples of the the new error function, in response to a request from Ruiu. Also, corrected the setting of errorHandler.LogName in driverutils as LogName was incorrectly being set even if the --vs-diagnostics flag was not set.

chrisjackson added a comment.EditedJun 12 2018, 9:20 AM

Could you share a few real examples of error messages with and without this patch to see how they look like?

I've expanded the vs-diagnostics.s lit test with some extra examples to show the
difference in output with and without --vs-diagnostics enabled for several error
call scenarios. I've enumerated the examples so they can be checked against the output
at the end of the message. Currently the errors originate from:

1: 'unrecognised flag' which calls error without a source parameter and arguably shouldn't have this parameter i.e. error(Msg)
2: 'undefined symbol' which uses the new error(Msg, Src) interface and debug line information is available.
3: 'duplicate symbol' which uses an unmodified error(Msg) call and no debug line information is available.

In --vs-diagnostics mode, calling warn()/ error() without the source parameter
defaults the 'Origin' component of the message to the linker executable name
without the .exe extension. This lets the message indicate the originator
of the diagnostic while remaining VS diagnostic format conformant.

I've included the lld output from the lit tests below for comparison as requested. The lit
tests also check that the newline behaviour for multi-line diagnostic messages has
been preserved, by including two undefined symbols in test 1.

Test 1 without --vs-diagnostics

X:\git\upstream\lld_scripts_build\Debug\bin\ld.lld.EXE : error: undefined symbol: flump
>>> referenced by vs-diagnostics.s:1215
>>>               X:\git\upstream\lld_scripts_build\tools\lld\test\ELF\Output\vs-diagnostics.s.tmp.o:(.text+0x1)

X:\git\upstream\lld_scripts_build\Debug\bin\ld.lld.EXE : error: undefined symbol: foo
>>> referenced by vs-diagnostics.s:1066
>>>               X:\git\upstream\lld_scripts_build\tools\lld\test\ELF\Output\vs-diagnostics.s.tmp.o:(.text+0x6)

Test 1 with --vs-diagnostics

vs-diagnostics.s (1215) : error: undefined symbol: flump
>>> referenced by vs-diagnostics.s (1215)
>>>               X:\git\upstream\lld_scripts_build\tools\lld\test\ELF\Output\vs-diagnostics.s.tmp.o:(.text+0x1)

vs-diagnostics.s (1066) : error: undefined symbol: foo
>>> referenced by vs-diagnostics.s (1066)
>>>               X:\git\upstream\lld_scripts_build\tools\lld\test\ELF\Output\vs-diagnostics.s.tmp.o:(.text+0x6)

Test 2 without --vs-diagnostics

X:\git\upstream\lld_scripts_build\Debug\bin\ld.lld.EXE : error: duplicate symbol: wump
>>> defined at vs-diagnostics-multiply-defined.s
>>>            X:\git\upstream\lld_scripts_build\tools\lld\test\ELF\Output\vs-diagnostics.s.tmp2.o:(.text+0x0)
>>> defined at vs-diagnostics-multiply-defined.s
>>>            X:\git\upstream\lld_scripts_build\tools\lld\test\ELF\Output\vs-diagnostics.s.tmp2.o:(.text+0x0)

Test 2 with --vs-diagnostics

ld.lld : error: duplicate symbol: wump
>>> defined at vs-diagnostics-multiply-defined.s
>>>            X:\git\upstream\lld_scripts_build\tools\lld\test\ELF\Output\vs-diagnostics.s.tmp2.o:(.text+0x0)
>>> defined at vs-diagnostics-multiply-defined.s
>>>            X:\git\upstream\lld_scripts_build\tools\lld\test\ELF\Output\vs-diagnostics.s.tmp2.o:(.text+0x0)

Test 3 without --vs-diagnostics

X:\git\upstream\lld_scripts_build\Debug\bin\ld.lld.EXE : error: unknown argument: --nonexistantflag

Test 3 with --vs-diagnostics

ld.lld : error: unknown argument: --nonexistantflag
ELF/Driver.cpp
76–77

This additional set of LogName was unnecessary and is now solely in DriverUtils.cpp ELFOptTable::parse(). Other errorHandler properties are set here too, so I don't know if they should all be moved? I'd be happy to create a common SetupErrorHandler() function that sets the errorhandler properties in one place -- should this be a separate change?

705–706

Agreed. I've moved this.

ELF/DriverUtils.cpp
135

As we need the linker executable name for the diagnostics output, I removed the ArgsArr.slice(1) from LinkerDriver::main, as well as removing this Vec.erase. To make things clearer I've separated the linker name into a separate Stringref that is passed alongside the Argv ArrayRef.

ELF/InputFiles.cpp
97

I've removed the else.

include/lld/Common/ErrorHandler.h
116–117

I left a a default argument so that existing error() and warn() calls needn't be changed. The message output from existing single parameter calls will therefore not be altered by the addition of the VS compatibility mode. I experimented with forcing a message and src parameter but this means every warn() and error() call site must have a source parameter created.

Common/ErrorHandler.cpp
94

s/paranthesised/parenthesized/

ruiu added a comment.Jun 13 2018, 3:13 PM

I noticed that MSVC link.exe doesn't show the error message in such a format that you can jump the source location when a link error occurs. Is this a configuration error in my environment, or is it expected?

include/lld/Common/ErrorHandler.h
116–117

@ruiu, do you have an alternative proposal? A good number of calls to error() and fatal() have file information, and a good number don't. Unless we want to convert all of those to have file information (what is the file information for an unrecognized command line option?), we need to support both of these cases. There are several ways we could do this, so I'm curious what you have in mind.

ruiu added a comment.Jun 13 2018, 3:43 PM

I do not have a concrete proposal at the moment, but before fixing the error reporting interface, I want to make sure that it is as simple and less intrusive as possible. For example, a simple additional text processing to convert Unix-style error message to MSVC style might work (which is a bit hacky but the least intrusive). There are other ideas, perhaps. I'd like to explore more possibilities before setting my mind.

grep -ER '(error|fatal|warn)\(' lld/ELF

finds a lot of places where we are putting location information in diagnostics. Are you planning to convert those to use the new API/formatting, too?

grep -ER '(error|fatal|warn)\(' lld/ELF

finds a lot of places where we are putting location information in diagnostics. Are you planning to convert those to use the new API/formatting, too?

@inglorion
I think this presents a choice that could affect the implementation. If all the call sites are converted, then potentially we could make the Src parameter non-optional. However, for the cases where a diagnostic lacks file source information we would risk inconsistent output of the Origin component of the message due to the lack of the optional default. One idea in this case is providing some functionality in the errorHandler class that helps enforce consistency e.g.

error("unknown argument: " + Arg->getSpelling(), errorSrcDefault());

Alternatively, we can convert only some of the calls and retain the optional class for Src. I'm assuming that for simplicity we are only considering options that have a single interface each for error/warn/fatal.

However, the short answer to your question is yes. My preference is to convert all of the diagnostic calls to the new API.

I think this presents a choice that could affect the implementation. If all the call sites are converted, then potentially we could make the Src parameter non-optional.

Exactly. I think this is a good way to proceed. Making it mandatory to specify an origin should help both communicate that an origin is desired (so that we have a source file in the diagnostic), and also makes it easy to pass that information so that we know how to format it, and harder to do it in an ad-hoc fashion that would not be correctly formatted.

However, for the cases where a diagnostic lacks file source information we would risk inconsistent output of the Origin component of the message due to the lack of the optional default. One idea in this case is providing some functionality in the errorHandler class that helps enforce consistency e.g.

error("unknown argument: " + Arg->getSpelling(), errorSrcDefault());

I've actually always thought that having LogName in ErrorHandler is kind of awkward, and I would love for us to take this opportunity to get rid of it. By the same token, while I think you're right that it is a good idea to have a function to return the string we print, I would like that function to not live in ErrorHandler. Driver may be a good place for it (and is where the LogName is currently determined, too).

I also considered various other alternatives, such as having a more structured API (so e.g. source file and line number are available separately instead of glommed together in a string), but I think it's better to keep it simple for now. We can always get fancier if we find out we need it.

One thing I would have you consider is putting the origin before the message. This doesn't make a super big difference, but it's more consistent with how the code is currently written and the order in which the strings appear.

chrisjackson added a comment.EditedJun 15 2018, 6:59 AM

One thing I would have you consider is putting the origin before the message. This doesn't make a super big difference, but it's more consistent with how the code is currently written and the order in which the strings appear.

I agree this is the preferred ordering. The only reason message is before origin currently is the use of the optional class.

Common/ErrorHandler.cpp
94

Noted. I'll correct this in a revision some time today. Thanks.

ruiu added a comment.Jun 25 2018, 12:19 AM

I created a program that doesn't link and tried to build it on VS. Linker error messages in the VS window are not clickable -- I couldn't jump to the location by clicking a filename and a linename (e.g. foo\bar\baz.cpp:42) in the linker error message. Is this expected?

I created a program that doesn't link and tried to build it on VS. Linker error messages in the VS window are not clickable -- I couldn't jump to the location by clicking a filename and a linename (e.g. foo\bar\baz.cpp:42) in the linker error message. Is this expected?

@ruiu

Thanks for looking at this. Can you please clarify your experiment e.g. did you use lld in VS or the MSVC linker?
The format of the source "foo\bar\baz.cpp:42" does not meet compatibility for VS diagnostic hyperlinks. There should be no colon and the line and column information should be paranthesized, but you'll see that from my comments and changes.
Thanks,

Chris

chrisjackson updated this revision to Diff 152862.EditedJun 26 2018, 3:05 AM
• Corrected paranthesised/parenthesized usage in source comments
• Updated the lit test to remove usage of deprecated %T

I think we need to see the other places converted, too. Reports for duplicate symbols and invalid relocations will probably require additional changes, which may lead to better design when we see the whole picture.

ELF/InputFiles.cpp
86

You may want to update the comments for InputSectionBase::getSrcMsg().

There are several modifications to the code and lit test additions based
on the suggestions given. The default origin or source used to prefix any
diagnostics with no source code file is now stored in Driver and passed to
ErrorHandler. There are methods to set and get the private member
'DiagnosticSourceDefault'. This is a StringRef initialised to "lld", resembling
the previous behaviour of ErrorHandler.LogName.

Several changes to comments were required to account for the new output
formatting, in InputSection for example. The lit tests have been modified:

• A bad relocation type resulting in an error.
• A bad relocation type resulting in a warning.
• An updated test against a duplicate symbol error.

Added missing relocation test file in ELF/Inputs/

This comment was removed by chrisjackson.
ruiu added a comment.Jul 5 2018, 10:44 AM

This still seems a bit too complicated to me. I wonder if we can simplify this by textually tweak error messages inside error() (and other functions such as fatal()), so that error messages are converted to MSVC-ish format, without changing any API.

Since MSVC doesn't recognize multi-line error messages, only the first line of an error message matters. Currently, the first line of an error message is in the following format:

error: filename[:line#]: any text

So it seems easy to recognize this pattern to replace it with:

filename [line#]: error: any text

This might seem a bit hacky, but this is less intrusive than adding a new parameter to all the error-handling functions. What do you think?

I think that if we are to add support for Visual Studio (VS) that any
implementation should be explicit and not rely on quietly attempting to
transform any messages to the desired format. As you suggest, the pattern
matching / transform solution is quite hacky. Also, it will implicitly rely on
the diagnostic message string passed having the properties that the
implementation matches on. If we use a pattern matching implementation, have you
considered how we would handle message strings that do not conform to the
expected pattern?

Clang's Visual Studio format support is not identical to my suggestion but
the output string is constructed for the format and does not rely on a transform
e.g. llvm\tools\clang\lib\Frontend\TextDiagnostic.cpp.

Additionally, I believe that intrusion of extending the api as I've suggested is
outweighed by the structure it will bring to diagnostic messages. Currently any
string can be passed to error/warn/fail and it might be nice to suggest some
kind of consistency by separating the source and message.

ruiu added a comment.Jul 9 2018, 4:36 PM

I think that if we are to add support for Visual Studio (VS) that any
implementation should be explicit and not rely on quietly attempting to
transform any messages to the desired format. As you suggest, the pattern
matching / transform solution is quite hacky. Also, it will implicitly rely on
the diagnostic message string passed having the properties that the
implementation matches on. If we use a pattern matching implementation, have you
considered how we would handle message strings that do not conform to the
expected pattern?

I agree with you that what I suggested is hacky, but that's not immediately mean that's a bad approach. We could just pass through error messages that don't match the pattern. I'm not arguing that that's a better approach than extending the error message API, but I believe it's worth thinking.

It also feels to me that the API that takes both a source location and an error message might not nail it. Some error messages don't have the notion of the source location (e.g. command line argument) and some have multiple source locations (e.g. duplicate symbols). Adding a new API that covers only one case feels a bit odd to me.

Can we try the approach where the final style of diagnostic messages is determined in the reporting module and all other code just don't care about it?
Right now, we have a bunch of functions here and there which prepare some parts of the message in a specific style. Maybe it will be more semantically clear if there is a distinct place where the resulting message is prepared while all other parts of the linker provide only basic parts of the message (like message text, source location(s), code location(s), etc.) in a structured form. In this case, we can adjust only that distinct place to add any required message styles.

@Ik

Can we try the approach where the final style of diagnostic messages is determined in the reporting module and all other code just don't care about it?
Right now, we have a bunch of functions here and there which prepare some parts of the message in a specific style. Maybe it will be more semantically clear if there is a distinct place where the resulting message is prepared while all other parts of the linker provide only basic parts of the message (like message text, source location(s), code location(s), etc.) in a structured form. In this case, we can adjust only that distinct place to add any required message styles.

While more structured, I think this approach could be even more invasive than that suggested. Do you have an example implementation?

@ruiu
Clang has specific visual studio diagnostics format support. Do you think their approach is too involved?

While more structured, I think this approach could be even more invasive than that suggested. Do you have an example implementation?

Not yet. I can prepare a patch if no one is against the idea itself.

Please see D50560 for my variant.

aganea added a subscriber: aganea.Aug 15 2018, 11:27 AM