This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Implement hot index reloading for clangd-index-server
ClosedPublic

Authored by kbobyrev on Sep 10 2020, 5:58 AM.

Details

Summary

This patch adds a mechanism to load new versions of index into
clangd-index-server using SwapIndex and FileStatus information about last
modification time without downtime.

Diff Detail

Event Timeline

kbobyrev created this revision.Sep 10 2020, 5:58 AM
kbobyrev requested review of this revision.Sep 10 2020, 5:58 AM
kadircet added inline comments.Sep 11 2020, 2:22 AM
clang-tools-extra/clangd/index/remote/server/Server.cpp
72–73

why do we have this extra indirection?

84

why not make this private? also keep using SymbolIndex. also you can make this a reference now.

221

why is this taking a double pointer? I think you can just use a reference here for Index (i.e. clangd::SwapIndex &

223

i think we should rather store the file_status from the original file in here. as we want to reload whenever files size/modification_time has changed.
it doesn't have to be moving forward let alone share the same timezone restrictions and such provided by system_clock::now.

Also it would be nice to use a VFS in here, that way we can move this logic into a helper in the future if other components need similar logic.

224

how's this loop terminated?

I would suggest factoring the loop body into a helper and then having a lambda for periodically running it, while looking for an exit signal.

257

i think hotReload thread creation should also be part of the main.

266

you either need to join a thread before its destruction or detach it from the main thread.

314–315

nit: auto Index = std::make_unique<clang::clangd::SwapIndex>(openIndex...);

kbobyrev updated this revision to Diff 291508.Sep 14 2020, 1:24 AM
kbobyrev marked 7 inline comments as done.

Address review comments.

kbobyrev added inline comments.Sep 14 2020, 1:29 AM
clang-tools-extra/clangd/index/remote/server/Server.cpp
223

Can you elaborate more on VFS usage? I'm not sure how this would improve existing and potential future code over plain llvm::sys::* calls.

kbobyrev marked an inline comment as not done.Sep 14 2020, 1:29 AM
kadircet added inline comments.Sep 14 2020, 1:54 AM
clang-tools-extra/clangd/index/remote/server/Server.cpp
223

it is mostly for unit testing the helper when there's a wider usage. also some users of clangd($Employer's internal editor) doesn't really have a real fs, they surface it through a VFS.

it also makes the code a little bit more easier to read, but definitely doesn't require immediate action, so up to you.

kbobyrev updated this revision to Diff 291521.Sep 14 2020, 2:45 AM
kbobyrev marked 2 inline comments as done.

Address remaining comments.

Thanks! Mostly looks good, just couple of nits.

clang-tools-extra/clangd/index/remote/server/Server.cpp
221

Again Index is a non-nullptr, so let's use a ref instead.

229

from previous comment:

i think we should rather store the file_status from the original file in here. as we want to reload whenever files size/modification_time has changed.
it doesn't have to be moving forward let alone share the same timezone restrictions and such provided by system_clock::now.
245

nit: maybe rename to runServerAndWait

326

instead of both asserting and bailing out, maybe just elog before bail-out?

+ I think what you rather want is grab the inital status outside this lambda. i.e:

clang::clangd::RealThreadsafeFS TFS;
auto FS = TFS...;
auto Status = ....;
if(!Status) {
  // elog and exit, file does not exist
}
auto Index = ...;

if(!Index) {
 // elog and exit, failed to load index 
}

std::thread ..([FS = TFS.view(), LastStatus = *Status, &Index]{ ... });
runServer();
joinThread();
330

nit: static constexpr auto RefreshFrequency, i.e. use constexpr and we don't have a different naming convention for constants. same for WatcherFrequency.

kbobyrev updated this revision to Diff 291861.Sep 15 2020, 3:52 AM
kbobyrev marked 5 inline comments as done.

Address a round of comments.

oops, looks like i forgot to hit submit in the morning..

clang-tools-extra/clangd/index/remote/server/Server.cpp
225

i beleive equivalent checks for inode number equivalence (at least on linux, using realfilesystem). That's not guranteed to change when you overwrite/modify a file.

Sorry if it felt like i meant this, but I was literally suggesting to use != instead of <= when comparing modifcation time, and also incorporating size into the check.

Also please let me know if there's something i am missing with the equivalent.

It would be also nice to have some tests, as we are starting to add non-trivial functionality to server but not blocking for this patch. (i suppose it would've catched the equivalent problem)

231

i think we should update LastStatus here, as we are going to fail loading the index unless the file changes. so there's no need to retry loading the index if the file hasn't changed.

233

i believe, this should be an elog.

237

nit: maybe also print the size.

318

nit: drop File

327

nit: i would first call hotReload and then sleep to get rid of one extra reload of the index when we are shutting down.

337

nit: return 0; ?

kbobyrev updated this revision to Diff 292130.Sep 16 2020, 1:07 AM
kbobyrev marked 5 inline comments as done.

Address another round of comments.

clang-tools-extra/clangd/index/remote/server/Server.cpp
231

Sorry, the comment is off, it belongs to the if statement above. Here, the index file is already different and there is an attempt to reload it. If it succeeds, new index replaces the old one and LastStatus is updated.

337

No need for that in C++.

https://en.cppreference.com/w/cpp/language/main_function

  1. The body of the main function does not need to contain the return statement: if control reaches the end of main without encountering a return statement, the effect is that of executing return 0;
kbobyrev marked 2 inline comments as not done.Sep 16 2020, 1:08 AM
kbobyrev updated this revision to Diff 292131.Sep 16 2020, 1:10 AM

Update comment.

kadircet added inline comments.Sep 16 2020, 1:36 AM
clang-tools-extra/clangd/index/remote/server/Server.cpp
231

right and what I am saying is, even if we fail to load the the index we should still update the LastStatus to prevent redundant retries for a broken index file. e.g:

  • For some reason a malformed index file is written with mod. time X and size Y.
  • Hot reload logic picks it up, the file exists and everything is fine.
  • When we try to read the index, we notice it is malformed or for whatever reason, deserialization fails.
  • Now we exit without updating the LastStatus, hence in the next update all of this will happen again even though index loading is going to fail again (as malformed index is still the same).

We could prevent that redundant loads (and failure logs) by caching the LastStatus as soon as the file exists.

Does that make sense now?

337

right, hence the nit

Harbormaster completed remote builds in B71834: Diff 292131.
kbobyrev updated this revision to Diff 292144.Sep 16 2020, 1:54 AM
kbobyrev marked 2 inline comments as done.

Save last status to prevent redundant updates.

clang-tools-extra/clangd/index/remote/server/Server.cpp
231

Ahh, I see now, makes sense. Thanks for the explanation!

337

Hm, why do you think it is better then? I'm not sure if adding no-op lines of code will increase readability/clarity.

I'm curious now, is there any reason you prefer explicit return 0;?

kbobyrev marked an inline comment as not done.Sep 16 2020, 1:54 AM
kadircet accepted this revision.Sep 16 2020, 2:02 AM

Thanks, LGTM!

clang-tools-extra/clangd/index/remote/server/Server.cpp
337

I suppose, mostly aesthetics and being used to it.

This revision is now accepted and ready to land.Sep 16 2020, 2:02 AM
This revision was landed with ongoing or failed builds.Sep 16 2020, 2:12 AM
This revision was automatically updated to reflect the committed changes.