This is an archive of the discontinued LLVM Phabricator instance.

[PCH] Allow VFS to be used for tests that generate PCH files
ClosedPublic

Authored by cameron314 on Sep 5 2017, 6:51 AM.

Details

Summary

When using a virtual file-system (VFS) and a preamble file (PCH) is generated, it is generated on-disk in the real file-system instead of in the VFS (which I guess makes sense, since the VFS is read-only). However, when subsequently reading the generated PCH, the frontend passes through the VFS it has been given -- resulting in an error and a failed parse (since the VFS doesn't contain the PCH; the real filesystem does).

This patch fixes that by detecting when a VFS is being used for a parse that needs to work with a PCH file, and creating an overlay VFS that includes the real file-system underneath. Since the PCH is hard-coded to always be on the real file-system, I believe this is the cleanest approach.

This allows tests to be written which make use of both PCH files and a VFS, like the one I've included here.

Note: This was originally part of the code to test the bug fixed in D20338, but while languishing in review it has since been fixed by somebody else in D27810. However, I feel it's still important to be able to test the frontend preamble code while at the same time making use of a VFS, so I've rebased that part of the patch (and my test to go with it).

Diff Detail

Repository
rL LLVM

Event Timeline

cameron314 created this revision.Sep 5 2017, 6:51 AM
bkramer edited reviewers, added: ilya-biryukov; removed: cfe-commits.Sep 5 2017, 7:29 AM
bkramer added a subscriber: cfe-commits.
ilya-biryukov edited edge metadata.Sep 5 2017, 7:58 AM

Could we fix this in tests instead by providing an overlay over RealFS there instead of doing that in ASTUnit?

If I'm passing vfs::FileSystem, I specifically do not want any file accesses to silently go through RealFS, ignoring the vfs::FileSystem that I passed.

We do overlays in clangd and we also have a vfs::FileSystem implementation that only reads certain WhitelistedDirs from RealFS to ensure we don't read other files from RealFS.
See https://reviews.llvm.org/diffusion/L/browse/clang-tools-extra/trunk/unittests/clangd/ClangdTests.cpp;312537$34.

I suppose we could do the overlay manually in all the tests that need it; I guess it depends on what the goal of the VFS support is. To my mind, it doesn't make sense that a file is placed in the RealFS in the first place if a VFS is used, but this is quite entrenched in the current design and difficult to change. In which case, I would argue that parsing a file with a preamble and a VFS should not give a cryptic error without an explicit overlay, but rather "just work". But I agree my current patch is a little crude in that it allows access to the entire RealFS via the VFS.

I could change the automatic overlay to *only* support access to the generated PCH file, similar to how the FilteredFileSystem you've pointed out works, but for a single file only. Would that be acceptable?

I suppose we could do the overlay manually in all the tests that need it; I guess it depends on what the goal of the VFS support is. To my mind, it doesn't make sense that a file is placed in the RealFS in the first place if a VFS is used, but this is quite entrenched in the current design and difficult to change. In which case, I would argue that parsing a file with a preamble and a VFS should not give a cryptic error without an explicit overlay, but rather "just work". But I agree my current patch is a little crude in that it allows access to the entire RealFS via the VFS.

I could change the automatic overlay to *only* support access to the generated PCH file, similar to how the FilteredFileSystem you've pointed out works, but for a single file only. Would that be acceptable?

I totally agree, the current behavior is definitely confusing for clients using vfs::FileSystem without proper filesystem access on top of ASTUnit.
Having an overlay that provides only the PCH file sounds like a good middle ground.

Maybe we could even read the PCH contents using file APIs instead of vfs::FileSystem and add them into PreprocessorOption::RemappedFileBuffers? Do you think that would also work?
Since ASTUnit is responsible writes PCHs to disk, it also makes sense for it to read them from disk, but accesses to all other files would go through vfs::FileSystem.

I'll change the overlay to only allow access to the PCH.

I agree that it would be nice to only read the PCH using real filesystem APIs in order to be symmetrical, but this seems non-trivial. It also feels like a step in the wrong direction -- ideally the VFS would hold the PCH too instead of putting it on disk, in my opinion.

I'm not sure adding a remapped file buffer would work, since it's unclear to me exactly at which point the PCH is written relative to the multiple places it's read. Additionally, there's code that does directory traversal in the current VFS to find a PCH when given a directory name instead of a file name; this only works if the VFS is aware of the PCH, or if the code is rewritten to use the underlying file system.

cameron314 updated this revision to Diff 114016.Sep 6 2017, 8:51 AM

Here's an updated patch that allows only the PCH to be accessed from the real FS when a VSF is present. Tests still pass.

I'll change the overlay to only allow access to the PCH.

I agree that it would be nice to only read the PCH using real filesystem APIs in order to be symmetrical, but this seems non-trivial. It also feels like a step in the wrong direction -- ideally the VFS would hold the PCH too instead of putting it on disk, in my opinion.

Unfortunately vfs::FileSystem is a read-only API, and AFAIK this is by design. So we should not hope that would ever happen.

I'm not sure adding a remapped file buffer would work, since it's unclear to me exactly at which point the PCH is written relative to the multiple places it's read. Additionally, there's code that does directory traversal in the current VFS to find a PCH when given a directory name instead of a file name; this only works if the VFS is aware of the PCH, or if the code is rewritten to use the underlying file system.

Fair point, the directory traversal code would only work with vfs.
We can use assumptions about our input, though. ASTUnit controls PCH creation and AFAIK it in our case pch will always point to a file. Moreover, the path seems to PCH seems to always be absolute, so we should not care about working directory of vfs either.

Using remapped buffers would make the code simpler, would not require custom vfs implementations and you're also right that it would look more symmetrical since we'll use filesystem API for both writes and reads of PCH.
Maybe we could try this approach?

Looking at the way remapped buffers are handled, I just remembered that they must exist on the file system (at the very least, in a directory that exists) or the remapping is not taken into account. So that pretty much rules out the other approach, I think.

Looking at the way remapped buffers are handled, I just remembered that they must exist on the file system (at the very least, in a directory that exists) or the remapping is not taken into account. So that pretty much rules out the other approach, I think.

You are right, thanks for pointing this out. Fiddling with VFS seems like an only option.

lib/Frontend/ASTUnit.cpp
1014 ↗(On Diff #114016)

Maybe create a combination of InMemoryFileSystem and OverlayFileSystem instead of custom filtering implementation?
We really need to read only a single file given that ASTUnit never creates directory PCHs.
I bet it would make the code simpler and less error-prone.

1090 ↗(On Diff #114016)

Maybe create a PCH overlay only when !VFS->exists(/*PreamblePath...*/)?
This seems like a good enough indication that RealFS is underneath the vfs::FileSystem instance, even if pointer equality does not hold (i.e. in case of overlays over RealFS).

unittests/Frontend/CMakeLists.txt
9 ↗(On Diff #114016)

Maybe rename to PCHPreambleTest?
LLVM code generally capitalizes all letters in abbreviations.

cameron314 added inline comments.Sep 8 2017, 10:33 AM
lib/Frontend/ASTUnit.cpp
1014 ↗(On Diff #114016)

I hadn't thought of that. Yes, that makes sense and is more concise.

1090 ↗(On Diff #114016)

Makes sense.

unittests/Frontend/CMakeLists.txt
9 ↗(On Diff #114016)

Oops, overlooked this one. Will do!

The latest patch. I think this one should do the trick :-)

ilya-biryukov accepted this revision.Sep 11 2017, 1:55 AM

Looks good, just a few minor style comments.

include/clang/Frontend/PrecompiledPreamble.h
100 ↗(On Diff #114414)

Not introduced by this change, but could you also add a full stop here for constistency?

103 ↗(On Diff #114414)

NIT: comment should end with a full stop.

lib/Frontend/ASTUnit.cpp
1021 ↗(On Diff #114414)

Maybe return original VFS if PCHFilename could not be read and not create any empty overlays?

1053 ↗(On Diff #114414)

The check && RealFS is redundant and can be removed.

This revision is now accepted and ready to land.Sep 11 2017, 1:55 AM

Final diff, will commit soon. Thanks!

include/clang/Frontend/PrecompiledPreamble.h
100 ↗(On Diff #114414)

Sure.

103 ↗(On Diff #114414)

OK!

lib/Frontend/ASTUnit.cpp
1021 ↗(On Diff #114414)

Makes sense, will do.

1053 ↗(On Diff #114414)

It's not redundant, but looking at the implementation of vfs::getRealFileSystem it will always return a non-null pointer.

This revision was automatically updated to reflect the committed changes.