Allow un-setting the compilation database path
AbandonedPublic

Authored by simark on Sep 6 2018, 3:28 AM.

Details

Reviewers
ilya-biryukov
Summary

It is currently possible to tell clangd where to find the
compile_commands.json file through the initializationOptions or the
didChangeConfiguration message. However, it is not possible to tell
clangd to deselect any explicit compilation database path (i.e. go back
to the default).

This patch makes it possible by sending the value null:

params: {
  "settings": {
    "compilationDatabasePath": null
  }
}

Not including the compilationDatabasePath field doesn't change the value
(which means it doesn't deselect it if one is already set). I chose to
do it this way because the other possible field,
compilationDatabaseChanges, just contains a delta. So I think it makes
sense if compilationDatabasePath works the same way.

simark created this revision.Sep 6 2018, 3:28 AM
simark updated this revision to Diff 164183.Sep 6 2018, 3:42 AM

Fix formatting.

Not sure who should review this, please feel free to add anybody who would be appropriate.

ilya-biryukov added a comment.EditedSep 12 2018, 9:12 AM

Wow, this is getting somewhat complicated.

Have you considered rerunning clangd whenever someone changes an option like that?
Would that be much more complicated on your side?

Not opposed to having an option too, just want to be aware of the costs involved on your end.

clangd/Protocol.h
368

Not a big fan or something like this, but maybe give special meaning to empty path instead of wrapping an optional into an optional?

Double optionals are a real pain to write and read.

Wow, this is getting somewhat complicated.

Have you considered rerunning clangd whenever someone changes an option like that?
Would that be much more complicated on your side?

Not opposed to having an option too, just want to be aware of the costs involved on your end.

Since we already support changing the compilation database path at runtime, I don't think it's significantly more complex to support resetting that value to the default of empty/not-used. If we can restart clangd when the users chooses to use no compilation database path, we might as well just restart it every time the user selects a new compilation database path.

I was assuming that restarting clangd might potentially be significantly more costly than just changing a setting, but maybe I'm wrong. The point of switching compilation database path is usually to point to a different build that has different flags, so all open files will get reparsed anyway... And just start clangd isn't particularly heavy.

I'll investigate how difficult it is to make our frontend (Theia) restart clangd when the user switches compilation database.

If this setting exposed directly the users in Theia or is it something that is exposed via a custom UI (e.g. choosing named build configs or something similar)?

I was assuming that restarting clangd might potentially be significantly more costly than just changing a setting, but maybe I'm wrong.

It's only cheaper if changes in the compilation database paths don't affect compile commands. If they do, everything will get rebuilt on next access, which is costly.

The point of switching compilation database path is usually to point to a different build that has different flags, so all open files will get reparsed anyway... And just start clangd isn't particularly heavy.

+1, since all compile commands change, we probably never get a performance win in practice.
One thing that (I believe) we're currently doing is using the old preamble for completion. This allow to code complete before reparse of preamble is finished.
I'm not sure this works when changing compile commands, though. Have not seen the breakages yet, but it's also neither something we've been testing nor something that pops up often in practice.

I'll investigate how difficult it is to make our frontend (Theia) restart clangd when the user switches compilation database.

If that is feasible, might allow removing some code from clangd. Though don't expect that amount to be too high.
One potential complication to restoring the state of the language server. It should be just the list of open files, but I may be missing something.

If you'll decide to go with an option to reset the path, see the comment about making empty path special. Optional<Optional<>> does not play nicely with json serialization code and hard to read (even though it does look like the right thing to do from the type system perspective).

If this setting exposed directly the users in Theia or is it something that is exposed via a custom UI (e.g. choosing named build configs or something similar)?

It is through a custom UI, that we name exactly that, build configurations. There is an example of the corresponding configuration here:

https://github.com/theia-ide/theia/blob/master/packages/cpp/README.md

I'll investigate how difficult it is to make our frontend (Theia) restart clangd when the user switches compilation database.

If that is feasible, might allow removing some code from clangd. Though don't expect that amount to be too high.
One potential complication to restoring the state of the language server. It should be just the list of open files, but I may be missing something.

I am not sure I understand correctly your last point. Of course, when restarting clangd, we need to send again a didOpen for each file the user has currently open in the editor. But that should be it?

If you'll decide to go with an option to reset the path, see the comment about making empty path special. Optional<Optional<>> does not play nicely with json serialization code and hard to read (even though it does look like the right thing to do from the type system perspective).

Yes, noted. I preferred to avoid giving a special meaning to an empty string because we could avoid it. But I wouldn't mind changing it if we go that route.

kadircet added inline comments.Sep 13 2018, 6:16 AM
clangd/Protocol.h
368

Sorry just passing by since I've seen a similar usage that already exists in codebase, wanted to point it out.

Some api like this could be easier, to make the decision of using empty path or anything else for an invalid path and making it an implementation detail.

https://github.com/llvm-mirror/clang/blob/master/include/clang/Sema/DeclSpec.h#L53

I am not sure I understand correctly your last point. Of course, when restarting clangd, we need to send again a didOpen for each file the user has currently open in the editor. But that should be it?

Yes, this and any initial configuration settings.
My personal experience is from VSCode, which does not call addDocument IIUC. This results in clangd producing errors about running actions in non-opened files.

If you'll decide to go with an option to reset the path, see the comment about making empty path special. Optional<Optional<>> does not play nicely with json serialization code and hard to read (even though it does look like the right thing to do from the type system perspective).

Yes, noted. I preferred to avoid giving a special meaning to an empty string because we could avoid it. But I wouldn't mind changing it if we go that route.

Also not a big fan of it, but Optional<Optional<>> looks terrible. Maybe I'm missing an obvious pattern to make it simpler here, though.
WDYT about deserializing null to empty string? It would mean that specifying both null and "" in json configuration yields the same results, but JSON that we send would look a bit nicer.

clangd/Protocol.h
368

It makes it less straight-forward, though.

simark added a comment.EditedThu, Oct 4, 12:36 PM

So I revisited this today. In the end, restarting clangd in our IDE works well. It should be merged anytime soon:

https://github.com/theia-ide/theia/pull/3017

But I am wondering what to do with the feature that allows clangd to change compilation database path using the didChangeConfiguration notification. In my opinion, it's a bug that it's not possible to switch back to use no explicit compilation database path, so I'd still like to get this patch merged. Or, if we decide this is really not useful, then it should be removed. I don't think we should keep the feature in the current state, either we fix it or remove it.

Even if right now it's not very expensive to restart clangd, do you foresee a situation in the future where it could become more expensive?

Also, I tried to remove the double Optional and do as you suggested (use the empty string to mean to use no explicit compilation database path), and in the end I find it more confusing, because we need special cases at both ends. When we deserialize the JSON and the value is null, and when storing the value in DirectoryBasedGlobalCompilationDatabase, where we convert an empty string to an empty Optional.

But I am wondering what to do with the feature that allows clangd to change compilation database path using the didChangeConfiguration notification. In my opinion, it's a bug that it's not possible to switch back to use no explicit compilation database path, so I'd still like to get this patch merged. Or, if we decide this is really not useful, then it should be removed. I don't think we should keep the feature in the current state, either we fix it or remove it.

Fair point. The original patch looked scary mostly because of Optional<Optional<...>>. Many thanks for clearing up your use-case, though. I thought Theia was exclusively using custom directories for compile_commands.json, I didn't know the user had an option to turn this behavior on/off and configure directories explicitly.
I would question the usefulness of the option if Theia can live without it, so I'd vouch for removing it. @sammccall, WDYT? Especially wrt to auto-indexing and it being a customizable build directory?

Even if right now it's not very expensive to restart clangd, do you foresee a situation in the future where it could become more expensive?

We should keep clangd restarts fast, I don't see any reason why they should be slow.
Changing compile_commands.json files would probably always require both reparsing all the files and reloading all the indexes, so there's little hope we can make clangd faster in those cases, unless we're willing to keep multiple versions of the AST/indexes in memory.

Also, I tried to remove the double Optional and do as you suggested (use the empty string to mean to use no explicit compilation database path), and in the end I find it more confusing, because we need special cases at both ends. When we deserialize the JSON and the value is null, and when storing the value in DirectoryBasedGlobalCompilationDatabase, where we convert an empty string to an empty Optional.

Yeah, moving this special meaning into all layers was a bad idea.
Maybe replace an empty optional with an empty string on deseraizliation? The rest of the code using Optional<Path> seems fine, it's just the ClangdConfigurationParamsChange definition that looks complicated.
The code for deserializing would look like this:

bool fromJSON(const json::Value &Params, ClangdConfigurationParamsChange &CCPC) {
  json::ObjectMapper O(Params);
  if (!O)
    return ...;

  if (const Value* CDBPath = Params->get("compilationDatabasePath")) {
    // Have to update the DB path, CCPC.compilationDatabasePath will have a value.
    llvm::Optional<std::string> NewDBPath;
    if (!fromJSON(CDBPath, NewDBPath))
      return false;
    CCPC.compilationDatabasePath = NewDBPath ? std::move(NewDBPath) : "";
  } else {
    // No need to update DB path, CCPC.compilationDatabasePath will be llvm::None.
  }
  ...
}

But I am wondering what to do with the feature that allows clangd to change compilation database path using the didChangeConfiguration notification. In my opinion, it's a bug that it's not possible to switch back to use no explicit compilation database path, so I'd still like to get this patch merged. Or, if we decide this is really not useful, then it should be removed. I don't think we should keep the feature in the current state, either we fix it or remove it.

I would question the usefulness of the option if Theia can live without it, so I'd vouch for removing it. @sammccall, WDYT? Especially wrt to auto-indexing and it being a customizable build directory?

Agree with you both here. Not being able to reset the initial state is weird, but *being able to* is complicated. (Even more so than switching between CDBs).
And since switching between CDBs invalidates almost everything, restarting clangd seems like a great simplification if it's viable for you.
Supporting a single CDB that's available on startup (either a flag, or maybe more likely set during initialize()?) would be a very useful simplification. (Even if it's not a lot of code, I find myself getting tangled in it often).

Even if right now it's not very expensive to restart clangd, do you foresee a situation in the future where it could become more expensive?

We should keep clangd restarts fast, I don't see any reason why they should be slow.

Agree, it's important that clangd starts fast for other reasons. And auto-indexing should stop/resume fine.
Shutdown is a slightly trickier case: actually doing a clean shutdown may require waiting for e.g. any current AST parse to finish. But if this is a problem, I think you can just send SIGTERM.

Changing compile_commands.json files would probably always require both reparsing all the files and reloading all the indexes, so there's little hope we can make clangd faster in those cases, unless we're willing to keep multiple versions of the AST/indexes in memory.

The only thing I think we can gain is if you change CDBs, but actually many of the commands are *completely* identical. Then we could detect it and avoid invalidating ASTs etc for those files. But this seems like a really marginal edge-case.

Thanks for your input. I have opened https://reviews.llvm.org/D53220, which removes that option.

simark abandoned this revision.Fri, Oct 12, 2:43 PM