Every Clang instance uses an internal FileSystemStatCache to avoid stating the same content multiple times. However, different instances of Clang will contend for filesystem access for their initial stats during HeaderSearch or module validation. On some workloads, the time spent in the kernel in these concurrent stat calls has been measured to be over 20% of the overall compilation time. This is extremly wassteful when most of the stat calls target mostly immutable content like a SDK. This commit introduces a new tool `clang-stat-cache` able to generate an OnDiskHashmap containing the stat data for a given filesystem hierarchy. The driver part of this has been modeled after -ivfsoverlay given the similarities with what it influences. It introduces a new -ivfsstatcache driver option to instruct Clang to use a stat cache generated by `clang-stat-cache`. These stat caches are inserted at the bottom of the VFS stack (right above the real filesystem).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/tools/clang-stat-cache/clang-stat-cache.cpp | ||
---|---|---|
62 | Sorry, but you misuse anonymous namespaces. You want static instead. |
Mostly just skimmed so far, will hopefully have some time to look at this more tomorrow. Out of interest, do you have any performance numbers using this change? I assume this mainly impacts implicit modules (since I suspect we'd also be opening the file as well anyway), is that true?
clang/lib/Frontend/CompilerInvocation.cpp | ||
---|---|---|
4765–4780 | IMO VFS overlays being modeled like this is a mistake and makes reasoning/testing about them fairly difficult. I have a PR up https://reviews.llvm.org/D121423 to change OverlayFileSystem to make more sense and be a real overlay rather than what it is now. If I finish that one off, how would you feel about changing the behavior of StatCacheFileSystem to just immediately error if it doesn't contain the file, rather than proxy (and thus chain) as it does now? So for multiple files we'd then have:
Then any non-stat or exists call would return llvm::errc::no_such_file_or_directory and then the next FS would be used instead. I don't think this *really* matters for StatCacheFileSystem, so I'm fine if you'd rather not wait for me to change OverlayFileSystem. I can make the changes myself after getting my PR in. | |
clang/tools/clang-stat-cache/clang-stat-cache.cpp | ||
144 | ||
llvm/lib/Support/VirtualFileSystem.cpp | ||
2958 ↗ | (On Diff #470328) | |
2959–2960 ↗ | (On Diff #470328) | This sentence is a little confusing to me. remove_dots just leaves .. unless you pass remove_dot_dot (but it doesn't do any checking). IMO just the If the path does not contain '..' we can safely say it doesn't exist. is enough. |
2961 ↗ | (On Diff #470328) | FWIW StringRef has a contains |
Thanks for the initial feedback!
Mostly just skimmed so far, will hopefully have some time to look at this more tomorrow. Out of interest, do you have any performance numbers using this change? I assume this mainly impacts implicit modules (since I suspect we'd also be opening the file as well anyway), is that true?
You're correct that this overhead has been measured on implicit module builds. As I mentioned in the commit message this saves over 20% of the overall built time in some cases. This time is split between module validation (which could be skipped) and HeaderSearch (which cannot be skipped).
clang/lib/Frontend/CompilerInvocation.cpp | ||
---|---|---|
4765–4780 | I don't think that's really doable if you want to keep the ability to cache negative hits. If a miss always results in a query to the real filesystem, then you're not saving the stat call. A lot of the time is spent in HeaderSearch which queries a similar number of non-existing and existing files. | |
llvm/lib/Support/VirtualFileSystem.cpp | ||
2959–2960 ↗ | (On Diff #470328) | remove_dots does more than remove dots. It is confusing, but it also removes excess separators (see the Canonical unit test). This means that the cache will work for /path/to/cache/file/a as well as /path/to/cache/file///a and /path/to/cache/file/././a. There are basically infinite spellings of a path just by adding . and additional separators. |
2961 ↗ | (On Diff #470328) | But that wouldn't be correct. Here we are looking for a path component which is ... A simple text search would fire on a filename containing ... I think this search on components is the only correct way to do this. |
You're correct that this overhead has been measured on implicit module builds. As I mentioned in the commit message this saves over 20% of the overall built time in some cases. This time is split between module validation (which could be skipped) and HeaderSearch (which cannot be skipped).
Heh, I read the commit as saying we've seen a 20% overhead, but not that this completely removed that. Not sure why I didn't make that connection in my head, it's a reasonable implication. Very nice!
clang/lib/Frontend/CompilerInvocation.cpp | ||
---|---|---|
4765–4780 | That's a great point. It'd be easy enough to avoid by eg. passing down an error to use for either "this is a path I handle and it doesn't exist" or "this is not a path I handle" and handling that in OverlayFileSystem, which could default to llvm:errc::no_such_file_or_directory. Generally I just think this is easier to reason about, but it's more problem for RedirectingFileSystem then it is for this (since this isn't *really* an overlay at all). This is definitely fine as is either way though. If I find RedirectingFileSystem needs the same sort of idea then we can revisit it then. | |
llvm/lib/Support/VirtualFileSystem.cpp | ||
2959–2960 ↗ | (On Diff #470328) | I suppose it could be confusing if you didn't know what remove_dots did, but in that case I'd prefer something like:
|
2961 ↗ | (On Diff #470328) | Ah yeah, fair enough 👍 |
llvm/lib/Support/VirtualFileSystem.cpp | ||
---|---|---|
3002 ↗ | (On Diff #470644) | Few comments about stats representation.
|
llvm/lib/Support/VirtualFileSystem.cpp | ||
---|---|---|
3002 ↗ | (On Diff #470644) |
Yes, we can. I pondered doing it in the original patch but didn't see a reason this would evolve in the future. Of course, I can't predict the future, so maybe a version field is the safe approach.
The non-portability of the current cache doesn't bother me as it's meant to mirror filesystem aspects that are local to a machine. Doesn't adding a version number alleviate some of these concerns? If we wanted to add something else in the future, or make it portable, we could just bump that. I think your comment hints at a higher-level question though. If we were to add something else to the cached data, it wouldn't be a stat cache anymore. Should the feature be a more general "filesystem cache" which starts out with just stat data? |
Yes, we can. I pondered doing it in the original patch but didn't see a reason this would evolve in the future. Of course, I can't predict the future, so maybe a version field is the safe approach.
I think a version number should be good enough for now. For the portability, it is probably not too big a concern. We can add that in future versions if it is needed.
clang/include/clang/Driver/Options.td | ||
---|---|---|
3361 | "filesystemsyscalls" -> "filesystem syscalls" | |
clang/lib/Frontend/CompilerInvocation.cpp | ||
4767 | We should use BaseFS->getBufferForFile here in case the BaseFS is not the real filesystem. This matches how vfsoverlay works below and could avoid re-reading this file repeatedly during scanning where we have a caching filesystem at the base. | |
clang/tools/clang-stat-cache/clang-stat-cache.cpp | ||
177 | This could use more description of the error. | |
292 | Is this pathconf extension Darwin-only? | |
301 | Nit: move subtraction to the print or create a new "duration" variable instead of mutating endTime. Same below for writeStatCache. | |
llvm/include/llvm/Support/VirtualFileSystem.h | ||
1113 ↗ | (On Diff #472379) | s/real/underlying/ - it could be wrapping any filesystem in principle. |
1116 ↗ | (On Diff #472379) | typo: "and SDK" -> "an SDK" |
1141 ↗ | (On Diff #472379) | Nit: It's unusal to see unique_ptr && instead of just unique_ptr. Is this needed? Same for the constructor. |
1142 ↗ | (On Diff #472379) | Is CacheFilename different from CacheBuffer->getBufferIdentifier()? |
1167 ↗ | (On Diff #472379) | Currently we are not adding the SDK base directory itself to the cache, which causes status(BaseDir) to fail. I think we should add a status parameter to the constructor of StatCacheWriter to force it to be included. This case also needs a test. |
llvm/lib/Support/VirtualFileSystem.cpp | ||
2816 ↗ | (On Diff #472379) | Does this depend on other non-public code from this file, or could we split it into StatCacheFileSystem.cpp? Obviously there is a lot of code here already, but would be nice to break the trend if it's easy to do so. |
2907 ↗ | (On Diff #472379) | I think this comment is missing 'Version', which was added later. |
3050 ↗ | (On Diff #472379) | Nit: could we use pwrite here and then change this method to use the more-general raw_pwrite_stream instead of being fd_ostream-specific? |
clang/tools/clang-stat-cache/clang-stat-cache.cpp | ||
---|---|---|
292 | I don't think so. It's a BSD extension. | |
301 | I would have done so originally if there was a TimeRecord overload that allows simple subtraction. I don't think that exists. I renamed the variable to duration, maybe it make the code a little more readable. |
Re-added a few of my minor comments that I think got lost on a specific version of the diff. Otherwise LGTM.
llvm/include/llvm/Support/StatCacheFileSystem.h | ||
---|---|---|
24 | s/real/underlying/ - it could be wrapping any filesystem in principle. | |
27 | typo: "and SDK" -> "an SDK" | |
52 | Nit: It's unusal to see unique_ptr && instead of just unique_ptr. Is this needed? Same for the constructor. | |
53 | Is CacheFilename different from CacheBuffer->getBufferIdentifier()? |
Sorry @lebedev.ri I missed your earlier comment about this. What format of doc would you like to see? A more elaborate comment in the tool's source or something else?
Right now it's impossible to discover that tool unless you already know it exists (or it's mentioned in some xcode doc, i guess)
It should have it's own description in it's --help, and ideally an .rst documentation page in docs.
I have no issue adding a more detailed help text. The documentation page seems somewhat overkill, but I guess I could be convinced otherwise. You need a fairly special build environment for this to make a difference and I'm not sure it's broadly applicable. You also need good build system integration to make it practical.
... and all of that knowledge seems to be missing from the current docs, which is why i'm asking in the first place.
This broke our mac builds with errors like:
ld64.lld: error: undefined symbol: CFRunLoopRun >>> referenced by tools/clang/tools/clang-stat-cache/CMakeFiles/clang-stat-cache.dir/clang-stat-cache.cpp.o:(symbol main+0x11be)
(and many more symbols)
Additional information: those are cross-compiled, and for some reason the -framework CoreServices in the CMakeLists.txt doesn't seem to make it to the command line, which is surprising because something similar works fine for e.g. dsymutil.
I ran into similar issues, not when cross compiling, but when compiling natively. In my case, I'm building with -DLLVM_LINK_LLVM_DYLIB=ON. Without that, it builds correctly for me.
The new tool needs to be linked with -framework CoreServices. This does get set in clang/tools/clang-stat-cache/CMakeLists.txt like this:
if(APPLE) set(CLANG_STAT_CACHE_LIB_DEPS "-framework CoreServices" ) endif() clang_target_link_libraries(clang-stat-cache PRIVATE ${CLANG_STAT_CACHE_LIB_DEPS} )
However, clang_target_link_libraries ignores extra dependencies when linking against the dylib: https://github.com/llvm/llvm-project/blob/a033dbbe5c43247b60869b008e67ed86ed230eaa/clang/cmake/modules/AddClang.cmake#L209-L213
if (CLANG_LINK_CLANG_DYLIB) target_link_libraries(${target} ${type} clang-cpp) else() target_link_libraries(${target} ${type} ${ARGN}) endif()
I guess clang_target_link_libraries needs a mechanism to disambiguate between generic clang library dependencies (which need to be dropped when linking against clang-cpp instead) and other dependencies which always are needed. I wonder if there's prior art for such disambiguation in other places - e.g. llvm_add_library does have similar logic for linking against either other llvm libraries or the dylib (and handles lots of other options). (I'm a bit out of time for digging further into this right now...)
Thanks for the report, unfortunately I have no way of testing this setup. Any chance you can experiment on your side? DirectoryWatcher uses the came decency, I'm wondering why it's not an issue there. If it helps to revert the change while someone investigates, that's perfectly fine.
Trying. If that's indeed the issue (and not the cross-compile environment), I'll revert while looking at it.
Oh yes, we're using -DLLVM_LINK_LLVM_DYLIB=ON too, so that would be the main trigger.
This was indeed the main trigger. Thanks for the help narrowing this down. I reverted for now.
If it's not too much trouble, maybe you could cherry-pick and amend 8d498e08deaf6e06a578cfedb4eb259b722ac7f6 into the commit that relands this.
(If not, no worries.)
"filesystemsyscalls" -> "filesystem syscalls"