We don't immediately need the file to be open, and this code does not ensure
that it is closed. On at least some code paths (using headers from PCHs) the
file is never closed and we accumulate open file descriptors.
Prior to r347205, the file was not actually opened (despite shouldOpen=true)
if the FileEntry already existed. In the PCH case, the entry existed (due to
a previous call with shouldOpen=false) and so we got away with passing true.
r347205 "fixed" that behavior of FileManager: if the file was previously statted
but not opened, and then FileManager::get(shouldOpen=true) is called, then the
file is opened.
Subsequently we observe spurious "file not found" errors in projects that use
large PCHs on platforms with low open-file limits:
https://bugs.chromium.org/p/chromium/issues/detail?id=924225
The fix here has non-obvious effects on the sequence of FS operations, due to
the subtle pattern of caches and side-effects in FileManager's usage in clang.
The main observed effects (from instrumenting FileSystem) are:
- It converts most open()+fstat() pairs for headers into a stat()+open() pair. This should be a performance loss: stat() is more expensive.
- It avoids open()s of some files where we turn out not to need the contents. This should be a performance win: stat() is cheaper, and open() usually requires an additional close().
At least in simple tests without PCHs, I observe the same total for
fstat+stat+open, e.g. fstat -51, stat +73, open -22.
This deserves a comment that we don't open it because we might not need it. If we would use it the file would be closed after reading the contents.