Page MenuHomePhabricator

FileManager: mark virtual file entries as valid entries

Authored by erikjv on Dec 15 2016, 7:51 AM.



The getVirtualFile method would create entries for e.g. libclang's
CXUnsavedFile but not mark them as valid. The effect is that a lookup
through getFile where the file name is not exactly matching the virtual
file (e.g. through mixing slashes and backslashes on Windows) would
result in a normal file "lookup", and re-using the file entry found
by using the UniqueID, and overwrite the file entry fields. Because the
lookup involves opening the file, and moving it into the file entry, the
file is now open. The SourceManager keys its buffers on the UniqueID
(which is still the same), so it will find an already loaded buffer.
Because only the loading a buffer from disk will close the file, the
FileEntry will hold on to an open file for as long as the FileManager
is around. As the FileManager will only get destroyed at a reparse,
you can't safe to the "leaked" and locked file on Windows.

Diff Detail

Event Timeline

erikjv updated this revision to Diff 81582.Dec 15 2016, 7:51 AM
erikjv retitled this revision from to Normalize all filenames before searching FileManager caches.
erikjv updated this object.
erikjv added reviewers: bkramer, klimek.
erikjv added a subscriber: cfe-commits.
kfunk added a reviewer: kfunk.Dec 15 2016, 8:58 AM
besc added a subscriber: besc.Dec 29 2016, 6:41 AM

I found a crash when applying this patch to 3.9.1 final (source archives from and compiling LLVM/Clang with MinGW-w64 GCC 6.2.0 or MSVC 2015. Clang-format sometimes crashes with input from stdin, output to stdout and a command line like this:

clang-format.exe -style=file -assume-filename=C:/path/to/file.hpp

I’m not familiar with the code base, so these are just my observations from a user’s point of view. From what I found the problem could be linked to the first slash/backslash in an absolute path:


  • C:/path/to/file.hpp
  • C:/path\to\file.hpp
  • /path/to/file.hpp
  • /path\to\file.hpp


  • Backslashes only always work.
  • C:\path/to/file.hpp
  • \path/to/file.hpp
  • path/to/file.hpp
  • path/to\file.hpp

Header file (.hpp) or source file (.cpp) does not make a difference.

A crash produces this error message:

Wrote crash dump file "C:\Users\besc\AppData\Local\Temp\clang-format.exe-e00fa3.dmp"
0x0000000000459504 (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x000000000022E300)
0x000000000047A45F (0x000000000022E3F0 0x0000000000000015 0x000000000022E5AC 0x0000000004000004)
0x000000000047174F (0x000000000022EE20 0x000000000022ED30 0x0000000000000000 0x000000000094846C)
0x0000000000403744 (0x0000000000A77F30 0x0000000077CC0A7A 0x0000000000857120 0x000000000022FDB0)
0x00000000006E9CBF (0x0000000000000002 0x000000000000004D 0x000000000085ABD0 0x0000000000000000)
0x00000000004013E8 (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000)
0x000000000040151B (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000)
0x0000000077B859CD (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000), BaseThreadInitThunk() + 0xD bytes(s)
0x0000000077CBA561 (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000), RtlUserThreadStart() + 0x21 bytes(s)

A corresponding dump file is attached:

I also tested a vanilla 3.9.1 final, built with the same compilers: works fine with all the above slash combinations.

yaron.keren added inline comments.

I'd use a larger SmallString<256>, with large projects 128 bytes are frequently not enough.

besc: I can't reproduce any crash. I tried trunk and the release_39 branch with msvc2015.

erikjv added inline comments.Jan 27 2017, 2:43 AM

Well, the calling sides often use 128, so that's why I stuck with it.

erikjv updated this revision to Diff 87376.Feb 7 2017, 1:56 AM

Fixed all failing tests on Windows.

klimek added inline comments.Feb 14 2017, 6:34 AM

I'd add a canonicalizePath function that:
-> on unix just returns the path
-> on windows, does the ::native dance and returns that path

Then we can call that when we access the cache or write to the cache instead of sprinkling #ifdefs around.

1716–1717 ↗(On Diff #87376)

I'd do changes like this in a separate patch.

21 ↗(On Diff #87376)

That is super unfortunate.
If we want tests to be able to run on windows, I think we might just want to give clang a flag to always show paths in unix format. That also should be something we could just make the file manager do (I think whatever the filemanager stores will be what is visible to the user).

erikjv updated this revision to Diff 88874.Feb 17 2017, 5:24 AM
erikjv retitled this revision from Normalize all filenames before searching FileManager caches to FileManager: mark virtual file entries as valid entries.
erikjv edited the summary of this revision. (Show Details)
kfunk edited edge metadata.Feb 28 2017, 1:32 AM

@erikjv Ready for review? Does this reliably fix

erikjv added a comment.Mar 8 2017, 1:59 AM

@kfunk yes and yes, so @bkramer or @klimek : ping?

bkramer edited edge metadata.Mar 8 2017, 3:29 AM

I assume this is fine but I don't really understand what's going on. A test case would be great.

klimek edited edge metadata.Mar 8 2017, 3:30 AM

Is the diff view on phab broken, or am I missing something? I only see a single line of diff now, and don't see a way to change the diff.

erikjv added a comment.Mar 8 2017, 4:14 AM

@klimek no, it's a 1 line fix. The rest was the previous version of the patch.

bruno added a subscriber: bruno.Mar 9 2017, 1:49 PM

Please attach a testcase!

erikjv updated this revision to Diff 93221.Mar 28 2017, 2:14 AM

Added a test for the specific scenario, and added asserts for validity of UFEs returned by getVirtualFile.

This revision is now accepted and ready to land.Mar 28 2017, 2:23 AM
erikjv closed this revision.Mar 28 2017, 4:33 AM

Committed as r298905.

The IsValid = true line coincidentally fixes a tangentially related bug -- see D20338 (in which I tried to introduce the same fix almost a year earlier, but nobody accepted the review). I guess I have to maintain the test for that case out of tree, though I could try to upstream it again (however, given that it's now passing with your change, I don't have much hope of it being accepted).