This is an archive of the discontinued LLVM Phabricator instance.

[HeaderSearch] don't immediately request that headers are opened in getFileAndSuggestModule().
AcceptedPublic

Authored by sammccall on Jan 24 2019, 5:28 AM.

Details

Summary

We don't immediately need the file to be open, and this code does not ensure
that it is closed. On at least some code paths (using headers from PCHs) the
file is never closed and we accumulate open file descriptors.

Prior to r347205, the file was not actually opened (despite shouldOpen=true)
if the FileEntry already existed. In the PCH case, the entry existed (due to
a previous call with shouldOpen=false) and so we got away with passing true.

r347205 "fixed" that behavior of FileManager: if the file was previously statted
but not opened, and then FileManager::get(shouldOpen=true) is called, then the
file is opened.
Subsequently we observe spurious "file not found" errors in projects that use
large PCHs on platforms with low open-file limits:

https://bugs.chromium.org/p/chromium/issues/detail?id=924225

The fix here has non-obvious effects on the sequence of FS operations, due to
the subtle pattern of caches and side-effects in FileManager's usage in clang.
The main observed effects (from instrumenting FileSystem) are:

  • It converts most open()+fstat() pairs for headers into a stat()+open() pair. This should be a performance loss: stat() is more expensive.
  • It avoids open()s of some files where we turn out not to need the contents. This should be a performance win: stat() is cheaper, and open() usually requires an additional close().

At least in simple tests without PCHs, I observe the same total for
fstat+stat+open, e.g. fstat -51, stat +73, open -22.

Diff Detail

Event Timeline

sammccall created this revision.Jan 24 2019, 5:28 AM
sammccall updated this revision to Diff 183291.Jan 24 2019, 5:29 AM

Actually include the relevant code change :-\

sammccall added a subscriber: hans.Jan 24 2019, 5:35 AM

@hans: I think we're going to need either this fix or a revert of rL347205 + dependents cherrypicked for 8.0.
I don't know what the constraints/goals are for releases, let me know if you have an opinion which way we should go, (e.g. how important that trunk+release go the same way).

thakis accepted this revision.Jan 24 2019, 6:29 AM

Thanks for the fix!

Nice test too. Let's land this and see if anything breaks :-)

On the 8.0, I'd probably recommend reverting your change instead, since this feels like a subtle change that might have unintended consequences somewhere.

This revision is now accepted and ready to land.Jan 24 2019, 6:29 AM
bkramer accepted this revision.Jan 24 2019, 6:55 AM
bkramer added inline comments.
lib/Lex/HeaderSearch.cpp
313

This deserves a comment that we don't open it because we might not need it. If we would use it the file would be closed after reading the contents.

Sigh, this regresses perf by 6%ish in my testing (building HeaderSearch.cpp with -fsyntax-only and a cold disk cache).
I think we'll need a more targeted fix: probably passing openFile=false only in the using-PCH case.

Plan for now is to revert the original change, live with the things that are broken, cherrypick the revert to the llvm8 branch, then revisit.

hans added a comment.Jan 24 2019, 11:03 AM

Plan for now is to revert the original change, live with the things that are broken, cherrypick the revert to the llvm8 branch, then revisit.

Sgtm for 8.0. I see the revert landed in r352079, so I'll merge that once it's baked in trunk a little bit.