In some deployments, for example when running on FUSE or using some
network-based VFS implementation, the filesystem operations might add up
to a significant fraction of preamble build time. This change allows us
to track time spent in FS operations to better understand the problem.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Hmm just a few curious questions from the sidelines.
Why a "custom system" instead of something -ftime-trace based?
How much overhead does this introduce, esp. for normal use-cases?
What's the gain? The information is very coarse and general-purpose system profiling tools should give you much better information regarding file system perf.
Thanks, LG in general. I suppose the extra timing overhead should be negligible, considering the preamble build times but would be nice to see some numbers if you have any.
Apart from that I've raised the concern around multiple preamble builds finishing in parallel and poisoning the first_build metric, wanted to raise it here again to make sure it isn't lost in details.
clang-tools-extra/clangd/FS.cpp | ||
---|---|---|
128 ↗ | (On Diff #415467) | nit: Timer(std::move(Timer)) |
141 ↗ | (On Diff #415467) | i guess this is negligible but, timing close operations too shouldn't hurt? |
clang-tools-extra/clangd/FS.h | ||
16 ↗ | (On Diff #415467) | i don't think we need this one |
71 ↗ | (On Diff #415467) | I don't think mentioning FS here is worth it. Can we rather emphasize on this being an accumulating timer? |
71 ↗ | (On Diff #415467) | what about making all of this an implementation detail of Preamble.cpp ? we can lift it up once we have other places that we want to make use of. |
73 ↗ | (On Diff #415467) | maybe put a FIXME: |
78 ↗ | (On Diff #415467) | nit: we use /// for public comments. |
82 ↗ | (On Diff #415467) | what about returning an int with ms granularity ? |
86 ↗ | (On Diff #415467) | what about just storing double/int ? |
91 ↗ | (On Diff #415467) | i feel like this deserves the FS specific comment mentioned above. maybe something like: This will record all time spent on IO operations in \p Timer. |
91 ↗ | (On Diff #415467) | i don't think we ever want concurrent access to Timer, i.e. startTime should never be called without a matching call to stopTime first. passing it in as a shared_ptr somehow gives the feeling that it might be shared across multiple objects, which might do whatever they want with the object. |
clang-tools-extra/clangd/Preamble.cpp | ||
464 | what about only conditionally doing these when Stats is not null? | |
467 | why use an expensive llvm::Timer if all we care is walltime? | |
470–471 | it might be worth leaving some comments around this TimedFS being exposed to outside through PreambleCallbacks. We provide access to Preprocessor, which keeps the underlying FileManager alive. | |
clang-tools-extra/clangd/Preamble.h | ||
82 | nit: again /// (i see that fields of PreambleData doesn't follow the style here, but it's the outlier) | |
95 | nit: maybe make this the last parameter and default to nullptr to get rid of changes in tests. | |
clang-tools-extra/clangd/TUScheduler.cpp | ||
100 | can we move all of these into an anonymous namespace instead? | |
109 | what about just tracking the total build time here? we can get the ratio afterwards. | |
113 | nit: llvm style doesn't really use const if the parameter is being copied anyway | |
121 | this can still be called in parallel as we have multiple preamble threads. moreover this also makes "first_build" tracking somewhat tricky. do you think this metric would still be useful in the presence of such noise? e.g. a respawning clangd instance might receive addDocument requests for all the previously open files and the first preamble to build might've benefited a lot from cached file IO that was done by other preamble builds. i suppose we can have something like following to at least address multiple preamble threads accessing FirstReport issue. static bool ReportedFirstBuild = [&]{ PreambleBuildFilesystemLatency.record(Stats.FileSystemTime, "first_build"); // I think it's fine to report twice here, as we've both performed a first build for a file, and it was the first build of clangd instance. return true; }(); llvm::StringLiteral Label = IsFirstPreamble ? "first_build_for_file" : "rebuild"; PreambleBuildFilesystemLatency.record(Stats.FileSystemTime, Label); if (Stats.TotalBuildTime > 0) // Avoid division by zero. PreambleBuildFilesystemLatencyRatio.record( Stats.FileSystemTime / Stats.TotalBuildTime, Label); | |
1006 | nit: drop the const per style. | |
1012–1017 | nit: if(!LatestBuild) return; reportPreambleBuild(Stats, IsFirstPreamble); if(isReliable(LatestBuild->CompileCommand)) HeaderIncluders.update(FileName, LatestBuild->Includes.allHeaders()); |
I'm not sure if I understand. -ftime-trace is a one-off. I want to monitor these values across multiple clangd instances, over time.
How much overhead does this introduce, esp. for normal use-cases?
On my machine it takes about 3.5 milliseconds to do 100k startTimer(); stopTimer(); calls in a loop. Building preamble for SemaOverload.cpp (random file I checked) takes about ~2200 such calls. Basically the impact of this should not be noticeable.
What's the gain? The information is very coarse and general-purpose system profiling tools should give you much better information regarding file system perf.
The idea is to use this to monitor large deployments of clangd on many machines over long period of time. One question to answer is whether some sort of prefetching of header files (with parallel I/O) would be beneficial. That depends on what fraction of the time we spend waiting on the serial file reads and how warm the caches are, on average.
It will also help us catch regressions, like one we had recently (that went unnoticed for quite a while) where adding some -fmodule-map-file flags to do a layering check caused a lot of stat() calls, which can be quite expensive.
Does that make sense? Does it answer your questions?
clang-tools-extra/clangd/FS.h | ||
---|---|---|
71 ↗ | (On Diff #415467) | Before I make the change, let me clarify: are you suggesting moving the whole TimedFS into Preamble.cpp? Definitely possible. |
73 ↗ | (On Diff #415467) | IMHO FIXME should be actionable. This is not - we shouldn't do it unless something changes. It's just a note that if someone ever wonders "could I just move this to support/ and re-use it", the answer is yes. Does that make sense? |
82 ↗ | (On Diff #415467) | We could, but why? llvm::Timer returns WallTime as double so this is somewhat consistent with it. Not a strong argument, of course, since that consistency doesn't really matter, but I'm not sure what benefit using int here offers? |
86 ↗ | (On Diff #415467) | Again, why? (I'm not strictly against it, just trying to understand why you're asking for this) |
91 ↗ | (On Diff #415467) | It's not about concurrency, it's about live time. This timer needs to have the same lifetime as the entire VFS, which is also ref-counted. Alternative would be for the TimerFS to own this timer and expose it's own getTime() method. That means TimerFS must now be visible outside of FS.cpp, but if we're moving it to Preamble.cpp per your other comment that's fine. |
clang-tools-extra/clangd/Preamble.cpp | ||
470–471 | Hmm...I added a comment, but nothing really changes here, right? VFS in general is not thread safe, so if we add parallelism we need to either make sure no file access happens there or make all VFSs used here thread safe. | |
clang-tools-extra/clangd/Preamble.h | ||
95 | I'd rather not, unless you insist. Besides not having to modify tests (which I already did anyway), what's the benefit of having it be default? Do you think it's more readable? | |
clang-tools-extra/clangd/TUScheduler.cpp | ||
100 | Done. I moved them into anonymous namespace, but I'm not sure what you mean by "instead"? | |
109 | How would you get the ratio then? | |
113 | Right, and I'll never get used to that. Sorry ;-) | |
121 | That's an excellent point. Fixed. I do believe that this extra information is worth it, even if it's noisy. If the fact that we're building multiple TUs in parallel and thus, essentially, have parallel file reading, significantly improves this metric then perhaps lack of parallel reads is not such a big deal after all. | |
1006 | Hmm...is that actually documented somewhere? There's definitely many cases of "const bool" in LLVM codebase. I think the "const" improves readability. |
clang-tools-extra/clangd/FS.h | ||
---|---|---|
71 ↗ | (On Diff #415467) |
Yes, exactly. as neither the FS nor the walltimer is needed by anything else (and we probably want to be more cautious when we want to make use of them in other components). |
73 ↗ | (On Diff #415467) | SG. |
82 ↗ | (On Diff #415467) | it's mostly personal preference i suppose, it just feels reasoning about discrete durations are easier than doubles. feel free to leave as is. |
86 ↗ | (On Diff #415467) | it just feels easier to reason about builtin types, than a template alias like std::chrono::steady_clock::duration, at least on the interface level. but again, probably just a personal preference that I don't feel strongly about, so feel free to ignore. (usually with what abouts i try to signal this :D) |
91 ↗ | (On Diff #415467) |
Right, I've noticed that as I was going through the rest of the review, but forgot to delete these as it was split into two days. sorry for the churn.
yes, i think that would be a nice simplification. |
clang-tools-extra/clangd/Preamble.cpp | ||
470–471 | right. in theory we have that more loudly spelled out by having a ThreadSafeFS, but as you pointed out that wouldn't be the only thing to consider when such a shift happens. | |
474 | is this call intentional ? | |
clang-tools-extra/clangd/Preamble.h | ||
95 | right, i think it's more readable, and moreover it will reduce the need for typing that parameter more times in the future (mostly in tests again). | |
clang-tools-extra/clangd/TUScheduler.cpp | ||
100 | instead of having static in front of the void reportPreambleBuild :D (sorry the comment was definitely misplaced). | |
109 | we could just divide the distributions point by point. it isn't as accurate since we won't have the exact association, but something we've been doing in the past. | |
113 | we can drop the static now. | |
115 | doesn't this need to be at least static ? | |
1006 |
nothing that I can find either.
yes, but I think the majority is still not having "const" in front of bool. it's at least the convention in clangd. if you think there's enough value to having consts for locals for readability, i think we should figure out a way to make the codebase (at least the clangd part) consistent with consts first. |
more review comments
clang-tools-extra/clangd/FS.h | ||
---|---|---|
91 ↗ | (On Diff #415467) |
Done. |
clang-tools-extra/clangd/Preamble.cpp | ||
474 | Oops, no, it's not. | |
clang-tools-extra/clangd/Preamble.h | ||
95 | OK, I made it default to nullptr. The logic behind it not being last was that it's usual (though not a hard rule) for callbacks to be the last argument, probably to make passing lambdas look nicer. Not really important. | |
clang-tools-extra/clangd/TUScheduler.cpp | ||
1006 | OK, I'll drop const for now then. |
thanks, lgtm! sorry for the long round trip.
clang-tools-extra/clangd/Preamble.cpp | ||
---|---|---|
315 | nit: double slashes instead of triple now (as it's no longer public interface), sorry :( |
nit: again /// (i see that fields of PreambleData doesn't follow the style here, but it's the outlier)