This is an archive of the discontinued LLVM Phabricator instance.

[libTooling] Change CDB heuristic to look further for files in a given language.
AcceptedPublic

Authored by adamcz on Sep 7 2020, 11:29 AM.

Details

Reviewers
kadircet
amosnier
Summary

When looking for compile commands for file of type X, other files of
type X will be preferred over files of any other type, regardless of
score.

However, when gathering the set of candidates, we only look for files
with similar name and files close to the requested one. If it happens
that none of them are in language X, we previously (before this change)
would choose one of the existing candidates of some other type.

This patch adds all files in language X (regardless of name and
location) to the list of candidates if the initial set of candidates did
not contain any files of type X. This means we will always choose a file
of type X if one exists anywhere in the project.

Diff Detail

Event Timeline

adamcz created this revision.Sep 7 2020, 11:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 7 2020, 11:29 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
adamcz requested review of this revision.Sep 7 2020, 11:29 AM

Hey Kadir

What do you think about this change? This addresses https://github.com/clangd/clangd/issues/519, but I'm not entirely convinced this is a good idea. Let me know if you have an opinion on this.

nridge added a subscriber: nridge.Sep 7 2020, 11:37 AM
amosnier accepted this revision.Sep 7 2020, 11:58 AM
amosnier added a subscriber: amosnier.

I'm not sure I'm supposed to accept a revision, but apparently I'm authorized to do so. And I'm obviously biased since I wrote the bug report. While looking for candidates close to the target file might often sound reasonable, it does not feel particularly adapted to a whole category of files: template declarations. Why would the files that include those be anywhere close to them? Also, fetching compile commands from a file written in another language pretty much always seems like a bad idea. If I understand correctly, with this commit, an arbitrary file of the same language will be picked, every time one such file is found. It would of course be even better to pick one that includes the target file (in case of a header file), but I can easily imagine that it would be much more difficult to implement.
Based on the previous analysis, I approve this commit.
I would also like to express how impressed I am by this project's efficiency!

This revision is now accepted and ready to land.Sep 7 2020, 11:58 AM

As discussed offline, this is trading off some accuracy between getting -I correct vs -std and it is unclear whether that's beneficial or harmful. It is easy to come up with examples for both and we were split 50/50 between each.
In the end all of this is a heuristic work and it is quite likely that we might break this for more people, while trying to fix this one specific case. So I don't think it's worth it.

As for suggestions(again from offline discussion) :

  • We can implement a more sophisticated transfer logic in clangd layer, by making use of compile commands from a TU including a particular header
  • When there's a mismatch in language, we can try to merge two commands. One with filename/path match and filetype match.

As discussed offline, this is trading off some accuracy between getting -I correct vs -std and it is unclear whether that's beneficial or harmful. It is easy to come up with examples for both and we were split 50/50 between each.
In the end all of this is a heuristic work and it is quite likely that we might break this for more people, while trying to fix this one specific case. So I don't think it's worth it.

As for suggestions(again from offline discussion) :

  • We can implement a more sophisticated transfer logic in clangd layer, by making use of compile commands from a TU including a particular header
  • When there's a mismatch in language, we can try to merge two commands. One with filename/path match and filetype match.

I do not have the full picture, but I trust your judgment. I think I understand the first suggestion, which sounds very good. I would offer to help, but I know I won't have enough time. I can test. :-) How can I make sure I will receive updates on this issue (or any prolongation of it)?

How can I make sure I will receive updates on this issue (or any prolongation of it)?

By commenting on this review, you're automatically subscribed to it, so should get an email notification for any future comments or patch updates.

(You should also receive notifications for comments on the issue you filed, https://github.com/clangd/clangd/issues/519.)