This is an archive of the discontinued LLVM Phabricator instance.

Get default -fms-compatibility-version from cl.exe's version
ClosedPublic

Authored by amccarth on May 10 2016, 2:39 PM.

Details

Summary

-fms-compatibility-version was defaulting to 18 (VS 2013), which is a pain if your environment is pointing to version 19 (VS 2015) libraries.

If cl.exe can be found, this patch uses its version number as the default instead. It re-uses the existing code to find the Visual Studio binaries folder and WinAPI methods to check its version. You can still explicitly specify a compatibility version on the command line. If you don't have cl.exe, this should be a no-op and you'll get the old default of 18.

This affected the tests, which assumed that if you didn't specific a version, that it would default to 18, but this won't be true for all machines. So a couple test cases had to be eliminated and a couple others had to be tweaked to allow for various outputs.

Addresses: https://llvm.org/bugs/show_bug.cgi?id=27215

Diff Detail

Event Timeline

amccarth updated this revision to Diff 56819.May 10 2016, 2:39 PM
amccarth retitled this revision from to Get default -fms-compatibility-version from cl.exe's version.
amccarth updated this object.
amccarth added a reviewer: rnk.
amccarth added a subscriber: cfe-commits.
thakis added a subscriber: thakis.May 10 2016, 2:49 PM
thakis added inline comments.
lib/Driver/MSVCToolChain.cpp
478

We already stat a bunch of directories to find the sdk include path. Can we use the result of that instead of looking at cl.exe? Then we wouldn't have to do additional stats.

majnemer added inline comments.
lib/Driver/MSVCToolChain.cpp
473

Why not use the GetFileVersionInfoSizeW variant?

amccarth added inline comments.May 10 2016, 3:07 PM
lib/Driver/MSVCToolChain.cpp
478

I'm just a simple CPM/VMS/Windows developer. Your Linux terms (stat) frighten and confuse me.

Seriously, though, this API isn't a file system check. It's digging out the version record from the file's resources.

We _could_ guess at the version from the names of the directories in the path, but that would require mapping names to versions, and if it's installed in a non-standard place it wouldn't help at all.

Also, -fms-compatibility-version is really about the version of the compiler (cl.exe), not that of the standard library nor of the SDK, so trying to check something else as a proxy for the version seems prone to obscure failures.

I share your concern about speed, especially since getting the version happens twice (once for the triple and once for the compatibility version), but invoking clang and having it choose the wrong default costs a lot of time, too.

The bug report correctly says we shouldn't spin up a process to run cl /version--that would be far more expensive. And if you put -fms-compatibility-version on the command line, then this function won't be called as it won't need to figure out the default.

amccarth added inline comments.May 10 2016, 3:09 PM
lib/Driver/MSVCToolChain.cpp
473

I started down that road, but it seemed overkill to convert the path to a wide string. I'm happy to do it if you think it worthwhile.

majnemer added inline comments.May 10 2016, 3:16 PM
lib/Driver/MSVCToolChain.cpp
473

I think it's worth it, we get bug reports whenever we break this sort of thing...

478

Seriously, though, this API isn't a file system check. It's digging out the version record from the file's resources.

Isn't the content stored as a resource in the PE? If so, that means that getting the version information requires reading bytes inside of cl.exe

With regard to -fms-compatibility-version, it shouldn't have anything to do with the platform SDK. However, it is fundamentally the case that the CRT and the compiler have the same version. Otherwise, really terrible things happen (MSVC19 assumes char16_t is a keyword which it provides but the MSVC18 STL assumes it is responsible for providing the keyword).

rnk added inline comments.May 10 2016, 3:56 PM
lib/Driver/MSVCToolChain.cpp
42

Personally, I think this is OK but I know Aaron Ballman and other people don't like using pragma comment lib. The alternative is hacking CMake goo, which is always best avoided when possible.

473

+1, you can use ConvertUTF8toWide to make this easy.

478

I think one extra file read is probably worth the convenience it buys for our users. It's easy to win back by having the user pass an explicit -fms-compatibility-version flag.

lib/Driver/Tools.cpp
3337–3338

IMO you should make this a virtual method on Toolchain that does nothing and is only overridden in MSVCToolChain. You can also cache it if you do that.

tools/driver/driver.cpp
504 ↗(On Diff #56819)

Yeah, it's a typo, but you don't have any other changes in this file, so I wouldn't touch it as part of this change.

amccarth updated this revision to Diff 56833.May 10 2016, 4:26 PM
amccarth marked an inline comment as done.

Addressed most comments: now using wide-chars for WinAPI calls, made getMSVCVersionFromExe a virtual method, removed extraneous typo-correction.

amccarth marked 4 inline comments as done.May 10 2016, 4:31 PM
amccarth added inline comments.
lib/Driver/MSVCToolChain.cpp
478

Yes, it looks in the executable (which I tried to emphasize with the method name).

I don't think this is very expensive given that Explorer often makes zillions of such calls, but I'm open to other suggestions.

I know that you can't use a library that's newer than the compiler (because it may use new language features), but I don't know if that applies in the other direction or how we would safely and reliably map directory names to library versions and therefore to compiler versions.

now using wide-chars for WinAPI calls,

Dōmo arigatō gozaimashita!

thakis added inline comments.May 11 2016, 5:29 AM
lib/Driver/MSVCToolChain.cpp
478

I agree that figuring out the right value for fmsc-version automatically somehow is definitely something we should do.

I forgot that getVisualStudioBinariesFolder already works by looking for cl.exe in PATH, so cl.exe's metadata is already warmed up in the disk cache. However, GetFileVersionInfoW() probably opens cl.exe itself and does some PE parsing to get at the version, and that probably is in cold cache territory. (https://msdn.microsoft.com/en-us/library/windows/desktop/ms647003(v=vs.85).aspx suggests that this function might open several files).

getVisualStudioBinariesFolder checks:

  1. getenv("VCINSTALLDIR")
  2. cl.exe in getenv("PATH")
  3. registry (via getVisualStudioInstallDir)

The common cases are 1 and 3. For 1, for default installs, the version number is part of the directory name (for default installs, what most people have). For 3, the version number is in the registry key we query. So in most cases we shouldn't have to look at cl.exe itself. And for the cases where we would have to look, maybe it's ok to require an explicit fmsc-version flag.

amccarth added inline comments.May 11 2016, 8:00 AM
lib/Driver/MSVCToolChain.cpp
478

The version number in the directory name and the registry is the version number of Visual Studio not of the compiler. Yes, we could do a mapping (VS 14 comes bundled with CL 19), assuming Microsoft continues to keep VS releases and compiler releases in sync, and it means this code will forever need updates to the mapping data.

The mapping would give just the major version number, which might be all that matters now, but if there's ever a CL 19.1 that has different compatibility requirements (and is maybe released out-of-band with Visual Studio), we'd be stuck.

Getting the actual version from the compiler seems the most accurate and future-proof way to check. If that's too expensive, then maybe we should abandon the idea of detecting the default for compatibility.

amccarth added inline comments.May 11 2016, 8:23 AM
lib/Driver/MSVCToolChain.cpp
478

I'll do some research to figure out the actual costs. I suspect that walking the PATH for the executable may be far more expensive, but I'll get some numbers and report back.

thakis added inline comments.May 11 2016, 8:36 AM
lib/Driver/MSVCToolChain.cpp
478

Compilers being released independently of VC versions and fractional compat numbers sounds like things we can worry about when they happen (probably not soon, right?).

We already walk PATH, so that wouldn't be an additional cost.

Be sure to measure cold disk cache perf impact (which is tricky on Windows since there's as far as I know no way to tell the OS to drop its caches). As far as I know file metadata is stored with the directory node on NTFS, so stating files doesn't warm up file content accesses.

amccarth added inline comments.May 11 2016, 8:43 AM
lib/Driver/MSVCToolChain.cpp
478

Compilers being released independently of VC versions and fractional compat numbers sounds like things we can worry about when they happen (probably not soon, right?).

It already happens. Herb Sutter talks about it in one of his blogs: "Soon after VC++11 ships we have announced we will do out-of-band releases for additional C++11 conformance which will naturally also include more C11 features that are in the C subset of C++11." In this case, it's just the build number (of major.minor.build) that's updating, but it's for increasing conformance, which is exactly a compatibility issue.

We already walk PATH, so that wouldn't be an additional cost.

I suspect we may be walking it more than once, which can be expensive even if the cache is all warmed up. This is one of the things I'm checking. If it's a problem, I'll propose a patch to cache the result from the first walk.

stating files doesn't warm up file content accesses.

That is correct.

amccarth added inline comments.May 11 2016, 2:17 PM
lib/Driver/MSVCToolChain.cpp
478

My machine in Windows 7 Enterprise, with cl.exe on a spinning HD (not SSD), and with Bit9 in "monitor" mode, which drags down all i/o operations.

First, I used procmon to make sure there were no surprising file ops happening as a result of my change. As expected, it creates the file handle, maps the file, and performs three read non-contiguous operations, loading a total of 5 pages (1 page = 4096 bytes). I'm guessing that the three reads correspond to the file header, a resource index, and the actual version resource. In the cold cache case, the file operations themselves (as reported by procmon) totaled 17 ms.

Subsequent calls from the same process did not even repeat the reads. It seems the system might cache the resource information once loaded.

Procmon also highlighted that the path search is doing twice as many checks as necessary. For each segment it checks for both cl.exe and cl.exe.exe. The path searching (according to procmon) took a little more than 1 ms per walk, and, with this change, it happens three times, for a total of almost 4 ms. This could be eliminated by caching the result in the MSVCToolChain object so subsequent calls are essentially free and/or reduced by eliminating the spurious checks for cl.exe.exe.

With release builds of Clang (with and without my change) and procmon turned off, I built a trivial C++ program multiple times. These are effectively warm-cache results, which I think are appropriate, since you're either building one-off interactively (in which case 20 ms isn't a big deal) or you're building a lot, in which case the cache will likely be warm for all but the first invocation.

W/o this change: 1.426 s average (1.421, 1.431, 1.458, 1.408, 1.411)
With this change: 1.442 s average (1.467, 1.442, 1.417, 1.433, 1.453)

So the delta is 16 ms, some of which is the extra path walking, and is on the order of the run-to-run variance.

Ok, that seems fine then. Thanks much for checking!

Are there any remaining concerns with this patch? Is everyone satisfied with the performance numbers?

aaron.ballman added inline comments.
lib/Driver/MSVCToolChain.cpp
42

Eh, I am lightening up on this sort of thing, so this is fine by me.

467

Elide braces (here and elsewhere) per usual project coding conventions.

483

Pure pedantry: uint8_t instead of char, or is this data really a textual string in practice?

amccarth marked 2 inline comments as done.May 13 2016, 8:40 AM
amccarth added inline comments.
lib/Driver/MSVCToolChain.cpp
42

I was following the pattern I saw in llvm\lib\Support\Windows\Path.inc (and elsewhere), so I thought it was the way we did things around here.

483

It's a mix. The part we're looking at is binary data, but the rest of the block is text.

I though the API wanted a pointer to char, so I chose char to avoid unnecessary casts.

But I must've misread the reference page, because just now I double-checked and I see that the API wants a void pointer, so I'll go ahead and use uint8_t, which satisfies my inner pedant as well.

amccarth updated this revision to Diff 57196.May 13 2016, 8:41 AM
amccarth marked an inline comment as done.

Addressed additional comments.

majnemer added inline comments.May 13 2016, 10:32 AM
lib/Driver/MSVCToolChain.cpp
481

It might be nicer to use a SmallVector<uint8_t, sizeof(VS_FIXEDFILEINFO)>, or whatever VersionSize typically is, here to avoid heap allocation in the common case.

amccarth added inline comments.May 13 2016, 10:39 AM
lib/Driver/MSVCToolChain.cpp
481

What's the cutoff for "small"? The version block in cl.exe is about 9KB.

majnemer added inline comments.May 13 2016, 10:50 AM
lib/Driver/MSVCToolChain.cpp
481

Using 10K is probably fine, the default stack size on Windows is a massive 1 MB and this function is not reentrant.

amccarth marked an inline comment as done.May 13 2016, 11:43 AM
amccarth added inline comments.
lib/Driver/MSVCToolChain.cpp
481

My mistake. It's a smidge over 1KB, (still more than sizeof(VS_FIXEDFILEINFO)) so I've make a SmallVector of 2KB.

rnk accepted this revision.May 13 2016, 4:12 PM
rnk edited edge metadata.

LG, sounds like people are happy with this

This revision is now accepted and ready to land.May 13 2016, 4:12 PM
This revision was automatically updated to reflect the committed changes.
amccarth marked an inline comment as done.
thakis added a comment.Jan 9 2017, 3:29 PM

One consequence from this that I just realized is that linking a binary depending on clang stuff with (morally):

c++ -o foo foo.o $($LLVMBUILD/bin/llvm-config --ldflags) -lclangFrontend -lclangDriver -lclangParse -lclangSema -lclangSerialization -lclangAnalysis -lclangAST -lclangEdit -lclangLex -lclangBasic $($LLVMBUILD/bin/llvm-config --libs) $($LLVMBUILD/bin/llvm-config --system-libs)

now fails with

clangDriver.lib(MSVCToolChain.cpp.obj) : error LNK2019: unresolved external symbol GetFileVersionInfoSizeW referenced in
 function "private: class clang::VersionTuple __cdecl clang::driver::toolchains::MSVCToolChain::getMSVCVersionFromExe(vo
id)const " (?getMSVCVersionFromExe@MSVCToolChain@toolchains@driver@clang@@AEBA?AVVersionTuple@4@XZ)
clangDriver.lib(MSVCToolChain.cpp.obj) : error LNK2019: unresolved external symbol GetFileVersionInfoW referenced in fun
ction "private: class clang::VersionTuple __cdecl clang::driver::toolchains::MSVCToolChain::getMSVCVersionFromExe(void)c
onst " (?getMSVCVersionFromExe@MSVCToolChain@toolchains@driver@clang@@AEBA?AVVersionTuple@4@XZ)
clangDriver.lib(MSVCToolChain.cpp.obj) : error LNK2019: unresolved external symbol VerQueryValueW referenced in function
 "private: class clang::VersionTuple __cdecl clang::driver::toolchains::MSVCToolChain::getMSVCVersionFromExe(void)const
" (?getMSVCVersionFromExe@MSVCToolChain@toolchains@driver@clang@@AEBA?AVVersionTuple@4@XZ)
pptest.exe : fatal error LNK1120: 3 unresolved externals

and folks have to manually add mincore.lib to their link. llvm-config --system-libs takes care of things like this for LLVM, but we don't have a clang-config as far as I know (does anybody know why not?). So folks now have to manually pass more libraries than just the clang libraries. Not a big deal, but I figured I'd point it out.

and folks have to manually add mincore.lib to their link.

I could load the library dynamically on demand and use GetProcAddress, but I suspect that would further degrade the performance. I could probably write my own code to find the version in the binary, but I doubt that crosses the reward/work threshold.

rnk added a comment.Jan 9 2017, 4:20 PM

and folks have to manually add mincore.lib to their link.

I could load the library dynamically on demand and use GetProcAddress, but I suspect that would further degrade the performance. I could probably write my own code to find the version in the binary, but I doubt that crosses the reward/work threshold.

#pragma comment(lib, "mincore"), maybe?

Aha, looks like this did have a #pragma comment(lib when it went in (for version.lib which is actually the correct one), but takumi removed it in http://llvm.org/viewvc/llvm-project?rev=269557&view=rev