This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Make git ignore index directories
Needs ReviewPublic

Authored by sums on Jul 31 2022, 5:45 PM.

Details

Reviewers
sammccall
kadircet
Group Reviewers
Restricted Project
Summary

This change makes clangd add .gitignore files to the index directories that ignore everything using an asterisk (*) value. We only add these gitignore files when creating a new directory, to allow cases where the user doesn't want this behavior (unlikely but not sure).

This behavior is similar to meson, which automatically puts a gitignore file in all build directories.

Diff Detail

Event Timeline

sums created this revision.Jul 31 2022, 5:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 31 2022, 5:45 PM
sums requested review of this revision.Jul 31 2022, 5:45 PM
sums edited the summary of this revision. (Show Details)Jul 31 2022, 5:46 PM

I am not sure if clangd is the right tool the create those .gitignore files, e.g. what if the project's VCS isn't git? I believe the right thing to do is for projects that want to make use of such tools to ignore the relevant directory explicitly (e.g. `.cache).

sums added a comment.EditedAug 1 2022, 6:25 AM

I am not sure if clangd is the right tool the create those .gitignore files, e.g. what if the project's VCS isn't git? I believe the right thing to do is for projects that want to make use of such tools to ignore the relevant directory explicitly (e.g. `.cache).

I believe clangd is the right tool, as the choice to use clangd or not is made by individual developers rather than project maintainers. Every git controlled C++ project would potentially have to include this directory to their gitignore otherwise.

I only thought of this after seeing how the meson build system behaves. A user can give any name to their build directory, so they add a gitignore to ignore it automatically. Users that don't want this behavior can simply remove the gitignore file, and it won't get recreated. The same behavior is present (and tested) with this change as well.

Lastly, this also has the benefit of only ignoring .cache/clangd/index/* instead of .cache which might be relevant to the git repo, and it'll also stay up to date if the directory path/structure changes.

My 2c, though I'm away with kids right now so will leave to @kadircet to make a call

Given that this is fully contained (logically and physically) within clangd's index directory, this seems elegant and harmless to me.

(At first i thought it was modifying project-level .gitignores, which would be scary).

I am not sure if clangd is the right tool the create those .gitignore files, e.g. what if the project's VCS isn't git?

The wrong-tool/slippery-slope argument is legitimate. I'd be pretty comfortable personally drawing the line at "git and nothing else" though.

I believe the right thing to do is for projects that want to make use of such tools to ignore the relevant directory explicitly (e.g. `.cache).

It would be great if there were some general "transient" name patterns that were implicitly ignored. Sadly not the case AFAIK

clang-tools-extra/clangd/test/background-index.test
23

Bleh, ideally this would go in clangd/.gitignore but we'd have to break a bunch of abstractions to do that. I think this is fine.

sums marked an inline comment as done.Aug 1 2022, 4:45 PM
sums added inline comments.
clang-tools-extra/clangd/test/background-index.test
23

Doing it this way also ignores the .gitignore file itself :)

I'm with Kadir on this one, but maybe I'm too attached to my .git/info/excludes. I somehow don't feel it's a big burden specific to clangd. I have to configure .gitignore for my projects and .git/info/excludes for other projects I work on because of many tools I personally use (editors, etc).

If this lands, I feel we should say * in the added .gitignore and put it into .cache/clangd.
That way the .cache folder does not appear in the git status at all and this improves the workflow for people who use git, i.e. they don't see anything related to clangd's internal storage in their VCS by default.

Otherwise, I would personally still put .cache/clangd into the global .gitignore or .git/info/exclude. IMO .cache/clangd should not be in the VCS history at all, even if only for the .gitignore files.
So landing this as is wouldn't really improve the situation for me personally.
Maybe my workflow is different from what others do, but hopefully this can be helpful.

Otherwise, I would personally still put .cache/clangd into the global .gitignore or .git/info/exclude. IMO .cache/clangd should not be in the VCS history at all, even if only for the .gitignore files.
So landing this as is wouldn't really improve the situation for me personally.

Oh, i was under the impression that git didn't track empty directories and this was recursive: if .cache/clangd/index/* are ignored then index/ doesn't exist, so nor does clangd/ or .cache/.

If this is not the case i agree the value is less... (Can't check, don't have a real computer this week)

Otherwise, I would personally still put .cache/clangd into the global .gitignore or .git/info/exclude. IMO .cache/clangd should not be in the VCS history at all, even if only for the .gitignore files.
So landing this as is wouldn't really improve the situation for me personally.

Oh, i was under the impression that git didn't track empty directories and this was recursive: if .cache/clangd/index/* are ignored then index/ doesn't exist, so nor does clangd/ or .cache/.

If this is not the case i agree the value is less... (Can't check, don't have a real computer this week)

No, you're right. I was mistaken about what the patch is doing.
Forget what I said, this change is fine. No .gitignore files are in the git status output.

I'm still think that I need to ignore files for other tools I use anyway, so not doing this for clangd is not a big improvement.
But that's just my opinion, please disregard other concerns.

sums marked an inline comment as done.Aug 3 2022, 6:28 AM

Thank you for the reviews and discussion, everyone!

TIL about .git/info/exclude. It'll let me exclude stuff without updating the checked in .gitignore. I was achieving the same effect for clangd by manually creating a .gitignore similar to this change.

The ideal scenario would be one where all tools ignore their temporary directories automatically. Will send patches whenever I encounter a case like this :)

@kadircet kindly take another look.

nridge added a subscriber: nridge.Aug 6 2022, 1:02 PM

Thank you for the reviews and discussion, everyone!

TIL about .git/info/exclude. It'll let me exclude stuff without updating the checked in .gitignore. I was achieving the same effect for clangd by manually creating a .gitignore similar to this change.

The ideal scenario would be one where all tools ignore their temporary directories automatically. Will send patches whenever I encounter a case like this :)

@kadircet kindly take another look.

thanks! sorry for forgetting about this one. so in the end it looks like there are two cases:

  • clangd is being used on a git-based project, and this is going to be "invisible" to the user
  • clangd is being used on a non-git project, then the user already needs to ignore this directory somehow and the extra .gitignore we generate there will be ignored by that rule.

So I guess this is a win overall.

Let me know if I should land this for you (and an email address for attribution of the commit).

clang-tools-extra/clangd/index/BackgroundIndexStorage.cpp
38

no need for llvm::sys::path::filename, as .gitignore is already a file name.

39

rather than calling out std::string you can do return ShardRootSS.str().str();

49

i don't think this check is really needed, we won't really lose much by overwriting it even if the directory already exists (and this is done once per compilation database we discover, so it isn't saving much IO).

95

this atomic write has the same chance of colliding with a direct write to .gitignore file by another clangd process.

can you change .tmp to .tmp.%%%%%%%% to make the collision less likely? (each % is replaced by a random hex digit).

kadircet added inline comments.Sep 6 2022, 1:10 AM
clang-tools-extra/clangd/index/BackgroundIndexStorage.cpp
58

nit:

if(llvm::Error E = addGitignore(..)) {
 elog(".."), std::move(E));
}

std::move is actually important, otherwise destruction of Error on failure state will trigger a crash.

94

we're passing in Directory but using DiskShardRoot here. can you actually make addGitignore a free-function (same as getGitignorePathForShardRoot).

@sums I think this just needs minor comments addressed and can be landed. Are you still interested in finishing it?