This is an archive of the discontinued LLVM Phabricator instance.

FileManager: Try to compute RealPathName in getVirtualFile
Needs ReviewPublic

Authored by simark on Jul 11 2018, 10:01 AM.

Details

Summary

I noticed that when getVirtualFile is called for a file, subsequent
calls to getFile will return a FileEntry without the RealPathName
computed:

const FileEntry *VFE = Mgr->getVirtualFile("/foo/../bar", ...);
const FileEntry *FE = Mgr->getFile("/foo/../bar");

FE->tryGetRealPathName() returns an empty string.

This is because getVirtualFile creates a FileEntry without computing the
RealPathName field and stores it in SeenFileEntries. getFile then
simply retrieves this object and returns it, when asked for the same
file.

I think it's reasonable to try to compute RealPathName in getVirtualFile
too.

There might be a similar issue with the File field. If you call
getVirtualFile, then call getFile with open=true, you may get a
FileEntry with the File field NULL, which is not what you requested. I
have not addressed this issue in this patch though.

Diff Detail

Event Timeline

simark created this revision.Jul 11 2018, 10:01 AM
simark updated this revision to Diff 155027.Jul 11 2018, 10:02 AM

Update commit message.

I'm not sure who should review this patch, please add anybody you think is qualified/responsible.

Background: I'm trying to fix the cases why we receive a FileEntry without a real path computed in clangd, so we can avoid having to compute it ourselves.

ioeric added inline comments.Jul 11 2018, 11:41 AM
lib/Basic/FileManager.cpp
395

It seems that at this point, we have failed to find FileName in vfs (with getStatValue above), so getRealPath would also fail?

In general, the virtual file here can be an actual *virtual* file that doesn't exist anywhere, and I think there are users who use this to map virtual file (possibly with relative paths) into file manager (while they should really use overlay vfs!).

simark added inline comments.Jul 11 2018, 11:54 AM
lib/Basic/FileManager.cpp
395

It seems that at this point, we have failed to find FileName in vfs (with getStatValue above), so getRealPath would also fail?

From what I understood, this code will be executed if the file actually exists but it's the first time we access it (we won't take the return at line 373). If we take the return at line 373, then some previous invocation of getFile or getVirtualFile has created the file, and that invocation would have been responsible for computing the real path.

In general, the virtual file here can be an actual *virtual* file that doesn't exist anywhere, and I think there are users who use this to map virtual file (possibly with relative paths) into file manager (while they should really use overlay vfs!).

My thinking was that the worst that could happen is that getRealPath fails in that case.

Background: I'm trying to fix the cases why we receive a FileEntry without a real path computed in clangd, so we can avoid having to compute it ourselves.

I'm curious how you use getVirtualFile in your clangd tests. In general, I'd highly recommend virtual file system in favor of remapped files for clangd tests.

lib/Basic/FileManager.cpp
395

From what I understood, this code will be executed if the file actually exists but it's the first time we access it (we won't take the return at line 373). If we take the return at line 373, then some previous invocation of getFile or getVirtualFile has created the file, and that invocation would have been responsible for computing the real path.

I see. Thanks for the explanation!

My thinking was that the worst that could happen is that getRealPath fails in that case.

It might make sense to take a look at how getVirtualFile is used in clang. For example, in CompilerInstance, it's used to create virtual FileEntry for remapped files (https://github.com/llvm-mirror/clang/blob/master/lib/Frontend/CompilerInstance.cpp#L329), and contents will be overwritten in SourceManager. This usage is arguable as we have overlay file system nowadays. But in this case, if we try to getRealPath on a remapped file, it might happen that we accidentally get an absolute path on the real file system, if there happens to be a file with the same path (relative to the CWD) on disk. This can be surprising and could be hard to debug. This is not unusual as people tend to use file remapping to overwrite the content of disk files with dirty buffers.

In general, virtual files in FileManager and virtual files in vfs are different things, and mixing them up further would likely cause more confusion in the future. We should really get rid of the remapped virtual files in FileManager and implement them properly with an overlay fs...

Background: I'm trying to fix the cases why we receive a FileEntry without a real path computed in clangd, so we can avoid having to compute it ourselves.

I'm curious how you use getVirtualFile in your clangd tests. In general, I'd highly recommend virtual file system in favor of remapped files for clangd tests.

I don't use getVirtualFile directly. I just use ClangdServer and look at the paths it outputs. It just happens that for some file, clang first opened it internally using getVirtualFile before using getFile, so the real path ends up missing`. Patch D48687 adds code to compensate that, but I added some logging to know when it happens, so we can try to fix it at the root (ideally, clang could always give us the real path so we don't have to compute it ourselves):

./tools/clang/tools/extra/unittests/clangd/ClangdTests --gtest_filter='*RelPathsInCompileCommand*'
Note: Google Test filter = *RelPathsInCompileCommand*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from GoToDefinition
[ RUN      ] GoToDefinition.RelPathsInCompileCommand
Updating file /clangd-test/src/foo.cpp with command [/clangd-test/build] clang -ffreestanding ../src/foo.cpp -resource-dir=/home/emaisin/build/llvm/tools/clang/tools/extra/unittests/clangd/../lib/clang/7.0.0
Preamble for file /clangd-test/src/foo.cpp cannot be reused. Attempting to rebuild it.
Built preamble of size 171284 for file /clangd-test/src/foo.cpp
FileEntry for ../src/foo.cpp did not contain the real path.     <<<<< HERE
FileEntry for /clangd-test/src/header_in_preamble.h did not contain the real path.     <<<<< AND HERE
[       OK ] GoToDefinition.RelPathsInCompileCommand (42 ms)
[----------] 1 test from GoToDefinition (42 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (43 ms total)
[  PASSED  ] 1 test.

I'm investigating why clang failed to provide that to us in these two cases.

simark added inline comments.Jul 12 2018, 6:21 AM
lib/Basic/FileManager.cpp
395

Ok, I'm not familiar with what file remapping is.

An alternative approach would be to compute the real path in getFile when there is a cache hit but the file entry previously represented a virtual file. Eessentially, we would "upgrade" the virtual file to a standard one.