The getVirtualFile method would create entries for e.g. libclang's
CXUnsavedFile but not mark them as valid. The effect is that a lookup
through getFile where the file name is not exactly matching the virtual
file (e.g. through mixing slashes and backslashes on Windows) would
result in a normal file "lookup", and re-using the file entry found
by using the UniqueID, and overwrite the file entry fields. Because the
lookup involves opening the file, and moving it into the file entry, the
file is now open. The SourceManager keys its buffers on the UniqueID
(which is still the same), so it will find an already loaded buffer.
Because only the loading a buffer from disk will close the file, the
FileEntry will hold on to an open file for as long as the FileManager
is around. As the FileManager will only get destroyed at a reparse,
you can't safe to the "leaked" and locked file on Windows.
Details
Diff Detail
Event Timeline
I found a crash when applying this patch to 3.9.1 final (source archives from http://llvm.org/releases/download.html#3.9.1) and compiling LLVM/Clang with MinGW-w64 GCC 6.2.0 or MSVC 2015. Clang-format sometimes crashes with input from stdin, output to stdout and a command line like this:
clang-format.exe -style=file -assume-filename=C:/path/to/file.hpp
I’m not familiar with the code base, so these are just my observations from a user’s point of view. From what I found the problem could be linked to the first slash/backslash in an absolute path:
Crash
- C:/path/to/file.hpp
- C:/path\to\file.hpp
- /path/to/file.hpp
- /path\to\file.hpp
Works
- Backslashes only always work.
- C:\path/to/file.hpp
- \path/to/file.hpp
- path/to/file.hpp
- path/to\file.hpp
Header file (.hpp) or source file (.cpp) does not make a difference.
A crash produces this error message:
Wrote crash dump file "C:\Users\besc\AppData\Local\Temp\clang-format.exe-e00fa3.dmp" 0x0000000000459504 (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x000000000022E300) 0x000000000047A45F (0x000000000022E3F0 0x0000000000000015 0x000000000022E5AC 0x0000000004000004) 0x000000000047174F (0x000000000022EE20 0x000000000022ED30 0x0000000000000000 0x000000000094846C) 0x0000000000403744 (0x0000000000A77F30 0x0000000077CC0A7A 0x0000000000857120 0x000000000022FDB0) 0x00000000006E9CBF (0x0000000000000002 0x000000000000004D 0x000000000085ABD0 0x0000000000000000) 0x00000000004013E8 (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000) 0x000000000040151B (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000) 0x0000000077B859CD (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000), BaseThreadInitThunk() + 0xD bytes(s) 0x0000000077CBA561 (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000), RtlUserThreadStart() + 0x21 bytes(s)
A corresponding dump file is attached:
I also tested a vanilla 3.9.1 final, built with the same compilers: works fine with all the above slash combinations.
lib/Basic/FileManager.cpp | ||
---|---|---|
218 | I'd use a larger SmallString<256>, with large projects 128 bytes are frequently not enough. |
besc: I can't reproduce any crash. I tried trunk and the release_39 branch with msvc2015.
lib/Basic/FileManager.cpp | ||
---|---|---|
218 | Well, the calling sides often use 128, so that's why I stuck with it. |
lib/Basic/FileManager.cpp | ||
---|---|---|
167–168 | I'd add a canonicalizePath function that: Then we can call that when we access the cache or write to the cache instead of sprinkling #ifdefs around. | |
lib/StaticAnalyzer/Core/BugReporterVisitors.cpp | ||
1716–1717 ↗ | (On Diff #87376) | I'd do changes like this in a separate patch. |
test/CodeGen/debug-prefix-map.c | ||
21 ↗ | (On Diff #87376) | That is super unfortunate. |
@erikjv Ready for review? Does this reliably fix https://bugreports.qt.io/browse/QTCREATORBUG-15449?
I assume this is fine but I don't really understand what's going on. A test case would be great.
Is the diff view on phab broken, or am I missing something? I only see a single line of diff now, and don't see a way to change the diff.
Added a test for the specific scenario, and added asserts for validity of UFEs returned by getVirtualFile.
The IsValid = true line coincidentally fixes a tangentially related bug -- see D20338 (in which I tried to introduce the same fix almost a year earlier, but nobody accepted the review). I guess I have to maintain the test for that case out of tree, though I could try to upstream it again (however, given that it's now passing with your change, I don't have much hope of it being accepted).
I'd add a canonicalizePath function that:
-> on unix just returns the path
-> on windows, does the ::native dance and returns that path
Then we can call that when we access the cache or write to the cache instead of sprinkling #ifdefs around.