This is an archive of the discontinued LLVM Phabricator instance.

[clang] Fix wrong FDs are used for files with same name in Tooling
Needs RevisionPublic

Authored by OikawaKirie on Nov 26 2020, 2:12 AM.

Details

Summary

ClangTool will make FileManager mix up two header files with the same relative path in different absolute paths.

As the cache of previously opened FileEntry in FileManager is indexed by the file name, when relative paths are used as the index, wrong FileEntry may be used for the file with the same relative path. With ClangTool, as the current working directory will change when parsing multiple files, files with the same relative paths but different absolute paths will be mixed up by the FileManager.

Submit on behalf of Hao Zhang <zhanghao19@ios.ac.cn>, I will forward the reviews and his replies.

Diff Detail

Event Timeline

OikawaKirie created this revision.Nov 26 2020, 2:12 AM
OikawaKirie requested review of this revision.Nov 26 2020, 2:12 AM

Thanks for your patience; I missed this when I was on holiday, and I'm just noticing it now.

Can you tell me more about the scenario this happens? Why is the FileManager being reused after the working directory change? Do the FileEntrys need to remain valid?

I'm also wondering about an alternate solution that optimizes for the common case, such as the following:

  • Don't specifically track relative paths / absolute paths.
  • Add an API, something like: dropRelativePaths(), which drops all {File,Directory}Entry{,Ref} that are somehow based on relative paths. This can forward to a similar function the stat cache if necessary.
  • Update clients to call that if the working directory changes (you'd want to do it only if it actually changed, not always).

This would keep FileManager from having to growing complexity to track working directory changes.

  • Don't specifically track relative paths / absolute paths.
  • Add an API, something like: dropRelativePaths(), which drops all {File,Directory}Entry{,Ref} that are somehow based on relative paths.

I realize this may not be clear; what I'm suggesting is that dropRelativePaths() would visit all entries, check if they're relative, and drop them if so.

Replies on my own:

Can you tell me more about the scenario this happens?

You can first manually try the regression test case in this patch.
As the %t/b/config.h is empty, there should not be any reading operations to the file. However, as the length of %t/a/config.h is wrongly used for %t/b/config.h because of the bug, the file does be read, and some warnings about reading '\0' characters are generated during compilation.

Why is the FileManager being reused after the working directory change?

It seems to be the problem of ClangTool. All the input source files are handled one by one with the same FileManager instance from the ClangTool::Files. (See the comments in the file clang/lib/Tooling/Tooling.cpp)

I'm also wondering about an alternate solution that optimizes for the common case, such as the following:

Don't specifically track relative paths / absolute paths.
Add an API, something like: dropRelativePaths(), which drops all {File,Directory}Entry{,Ref} that are somehow based on relative paths. This can forward to a similar function the stat cache if necessary.
Update clients to call that if the working directory changes (you'd want to do it only if it actually changed, not always).
This would keep FileManager from having to growing complexity to track working directory changes.

I realize this may not be clear; what I'm suggesting is that dropRelativePaths() would visit all entries, check if they're relative, and drop them if so.

I am not quite clear with the fixes in this patch, I will forward the replies to these questions from the original author Hao Zhang <zhanghao@ios.ac.cn> to you when he replies to me.

Thanks

clang/lib/Tooling/Tooling.cpp
545

The CWD changes for each input file, but the same FileManager instance is used.

And relative paths are used to index the file entries.

When there are two files in different CWDs having the same relative path, they will be mixed up.

Replies from the original author Hao Zhang <zhanghao19@ios.ac.cn>
(Sorry for the wrong email address in the previous reply.)


Thanks for your reply.

I noticed this problem when I was using clang-check to analyze a project which had a compile_commands.json. A simplified test case is provided in the patch.

As far as I know this problem only happens in clang/lib/Tooling/Tooling.cpp. (as @OikawaKirie has explained)

However, it is more likely a bug in FileManager, since FileManager is not aware that whether the current working directory has been changed or not.

One solution could be to add an API, such as notifyCurrentWorkingDirectoryChange. Clients call if necessary. (Similar to the 3rd one in your alternative solution). Now the question is what to do in notifyCurrentWorkingDirectoryChange:

  • In my solution, I use a pretty straightforward approach, which is to have an individual cache (for FileEntry and any other related things) for each working directory, and switch caches between working directories.
  • In your alternative solution, if I standerstand it correctly, dropRelativePaths is called in notifyCurrentWorkingDirectoryChange. All those FileEntry which are based on relative paths are dropped from the cache. I think this is a good idea, especially for keeping FileManager simple. However I'm not sure how many files are based on relative paths. If dropRelativePaths drops too many files and the working directory is switched back to a previous one (this could happen when analyzing with a compile_commands.json), it might result in many cache misses, thus involving more system calls to open files.

There was an another solution I have ever tried, without things like notifyCurrentWorkingDirectoryChange. The solution is still straightforward, which is to use absolute path (instead of filename) to query/store a FileEntry from/to the cache. Actually I have tried this before but it ended up with lots of test cases failed. I don't know where I did wrong. If you think this approach is okay, I will continue working on this, and it might take some time.

I'm still not sure which solution is more suitable, or if there is a better one. Any suggestions are welcome!

OikawaKirie edited the summary of this revision. (Show Details)Dec 9 2020, 10:56 PM
  • In my solution, I use a pretty straightforward approach, which is to have an individual cache (for FileEntry and any other related things) for each working directory, and switch caches between working directories.

IIUC, this means the absolute paths won't be cached between changing working directories. Is that correct?

  • In your alternative solution, if I standerstand it correctly, dropRelativePaths is called in notifyCurrentWorkingDirectoryChange. All those FileEntry which are based on relative paths are dropped from the cache. I think this is a good idea, especially for keeping FileManager simple. However I'm not sure how many files are based on relative paths. If dropRelativePaths drops too many files and the working directory is switched back to a previous one (this could happen when analyzing with a compile_commands.json), it might result in many cache misses, thus involving more system calls to open files.

Yes, I think that's the tradeoff. My sense is that FileManager is usually/often fed absolute paths, but it likely depends on the build system and environment.

There was an another solution I have ever tried, without things like notifyCurrentWorkingDirectoryChange. The solution is still straightforward, which is to use absolute path (instead of filename) to query/store a FileEntry from/to the cache. Actually I have tried this before but it ended up with lots of test cases failed. I don't know where I did wrong. If you think this approach is okay, I will continue working on this, and it might take some time.

I'm not surprised some tests failed with this but it might be worth looking deeper to understand current expectations. It might be valuable to look through Git history to find why we're not always making the paths absolute right now. We do it sometimes (see calls to FixupRelativePath and makeAbsolutePath) but not always. Why is that the case? What invariants do we need to keep functioning?

Another point is that FileManager takes the working directory at construction time via FileSystemOptions. There's a FIXME to remove that argument, but (as you've seen) the assumption is baked in that the CWD doesn't change. I think the current patch may not completely fix it, since you can't modify FileSystemOptions.

One idea I have (not fully fleshed out) is to drop FileSystemOpts as a constructor argument (instead grab / store the CWD from the VFS), and, inspired by your approach, split the data structures between relative and absolute paths. The existing data structures would only store absolute paths, but there are new ones for relative paths that can be cached and swapped out. Something like this:

/// Current working directory. Grabbed from the (new) VFS whenever it's
/// changed, and updated if the file manager is notified that the VFS's
/// CWD changes.
std::string CWD;

struct RelativePaths {
  StringMap<FileEntryRef::MapValue> Files;
  StringMap<llvm::ErrorOr<DirectoryEntry &>> Dirs;
};

/// Relative entries for the current CWD.
RelativePaths RelativeEntries;

/// Cached relative entries for other CWDs.
Optional<StringMap<RelativePaths>> CachedRelativeEntries;

The idea is that the existing data structures would now be restricted to absolute paths. Any query that uses a relative path checks in RelativeEntries; if nothing is there, the path is fixed up to an absolute path, inserted / looked up in the main data structures, and then an alias is added to RelativeEntries. If the CWD changes (atypical case, but possible), we do a similar cache dance to the one in your current patch.

@arphaman @Bigcheese, can you share whether we need to solve similar issues in clang-scan-deps? What approach is taken there? Do we just require absolute paths? (I know currently we don't reuse a FileManager, but my recollection is we had a prototype that mostly worked...)

Another thing to look at: is this a problem for implicit modules builds (the CompilerInstance that builds the module will share a FileManager, but use a different CWD)? Why or why not? (Again, maybe we just require absolute paths when doing implicit modules...)

/// Current working directory. Grabbed from the (new) VFS whenever it's
/// changed, and updated if the file manager is notified that the VFS's
/// CWD changes.
std::string CWD;

struct RelativePaths {
  StringMap<FileEntryRef::MapValue> Files;
  StringMap<llvm::ErrorOr<DirectoryEntry &>> Dirs;
};

/// Relative entries for the current CWD.
RelativePaths RelativeEntries;

/// Cached relative entries for other CWDs.
Optional<StringMap<RelativePaths>> CachedRelativeEntries;

(If we ended up doing something like this, I would structure it as at least two patches: first refactoring the current semantics to use RelativeEntries and CWD, likely all NFC / no functionality change, and then a final patch to add CachedRelativeEntries and add support for changing directories.)

Another option here is to assume the VFS never changes directories (no need to store the CWD), and cache relative entries like this:

Optional<DenseMap<IntrusiveRefCntPtr<FileSystem>, RelativePaths>>
    CachedRelativeEntries;

Then if ClangTool wants to change directories, it would be expected to maintain an appropriate set of VFS layers (changing directories would be a different call to getPhysicalFileSystem).

....but this highlights another concern, which is that even the absolute paths depend on the VFS that's in place, since options like -ivfsoverlay can change for different compilation commands. It's not clear to me how well this currently works with a shared FileManager.

Replies from the original author Hao Zhang <zhanghao19@ios.ac.cn>


...split the data structures between relative and absolute paths. The existing data structures would only store absolute paths, but there are new ones for relative paths that can be cached and swapped out.

I like the idea that using one cache for absolute paths and multiple caches for relative paths. Better than my approach.

Another point is that FileManager takes the working directory at construction time via FileSystemOptions. There's a FIXME to remove that argument, but (as you've seen) the assumption is baked in that the CWD doesn't change. I think the current patch may not completely fix it, since you can't modify FileSystemOptions.

It's the point. It is the assumption that the CWD won't change leads to the problem. I agree with you that FileSystemOptions should be removed and grab the CWD directly from the VFS.

Another option here is to assume the VFS never changes directories (no need to store the CWD) ... Then if ClangTool wants to change directories, it would be expected to maintain an appropriate set of VFS layers.

IIUC, does this mean that clients should create new Filesystems when they change directories? It looks to me that it might conflict with the design of Filesystem since it already provides an API to change the CWD.

Another option (which I mentioned in the previous post) is to use absolute paths to query the cache, something like this:

llvm::ErrorOr<const FileEntry *>
FileManager::getFile(StringRef Filename, bool openFile, bool CacheFailure) {
  // Froce using the absolute path to query the cache
  llvm::SmallString<128> AbsPath(Filename);
  makeAbsolutePath(AbsPath);
  auto Result = getFileRef(AbsPath.str(), openFile, CacheFailure);
  if (Result)
    return &Result->getFileEntry();
  return llvm::errorToErrorCode(Result.takeError());
}

It is more likely a defensive workaround to this problem. It might not be a good solution to the real problem (the assumption that the CWD won't change). The above code resulted in failures of some test cases, I haven't looked closely into why they failed. In my point of view, Filename should only be used as the key to query the cache. Logically, if I change it into the absolute path, those test cases should pass as usual.

Anyway, your approach (remove FileSystemOptions, one cache for absolute paths, multiple caches for relative paths) looks good to me.

Replies from the original author Hao Zhang <zhanghao19@ios.ac.cn>

llvm::ErrorOr<const FileEntry *>
FileManager::getFile(StringRef Filename, bool openFile, bool CacheFailure) {
  // Froce using the absolute path to query the cache
  llvm::SmallString<128> AbsPath(Filename);
  makeAbsolutePath(AbsPath);
  auto Result = getFileRef(AbsPath.str(), openFile, CacheFailure);
  if (Result)
    return &Result->getFileEntry();
  return llvm::errorToErrorCode(Result.takeError());
}

(A quibble here is that getFile is just a wrapper around getFileRef, and might eventually be removed; certainly FileEntry::getName is on its way out; you'd want this logic at the top of getFileRef, getVirtualFileRef, etc.)

It is more likely a defensive workaround to this problem. It might not be a good solution to the real problem (the assumption that the CWD won't change). The above code resulted in failures of some test cases, I haven't looked closely into why they failed. In my point of view, Filename should only be used as the key to query the cache. Logically, if I change it into the absolute path, those test cases should pass as usual.

I think it would be good to understand why the tests failed.

Are some clients relying on the name matching the query? If so, why?

My guess, but I'm really not sure and it would be good to confirm, is that we need this to support reproducible builds, where you don't want the CWD to leak into your output. In this case, you'd run all build commands from the same directory, but invoke clang using all relative paths, perhaps something like this for your example:

% clang -g -o a/a.o -Isrc/a src/a/a.c
% clang -g -o b/b.o -Isrc/b src/b/b.c

Perhaps (I'm not sure) there are places in clang that rely on FileManager returning relative paths to support this kind of thing. If anything relied on it, my bet would be modules-related code.

But it's possible we don't need this. If it's safe for us to update the tests and make FileManager::getFileRef always canonicalize to an absolute path, that would definitely be cleaner. FileManager::makeAbsolute can use whatever the FS's CWD is at the time of query... nice and clean.

Anyway, your approach (remove FileSystemOptions, one cache for absolute paths, multiple caches for relative paths) looks good to me.

(But as you mentioned, not as clean as canonicalizing to absolute paths, if we can do it.)

Replies from the original author Hao Zhang <zhanghao19@ios.ac.cn>


Sorry for replying late.

I think it would be good to understand why the tests failed. Are some clients relying on the name matching the query? If so, why?

I'm currently working on this. Maybe try making FileManager::getFileRef always canonicalize to an absolute path, as you mentioned.

Replies from the original author Hao Zhang <zhanghao19@ios.ac.cn>


Sorry for replying late.

I think it would be good to understand why the tests failed. Are some clients relying on the name matching the query? If so, why?

I'm currently working on this. Maybe try making FileManager::getFileRef always canonicalize to an absolute path, as you mentioned.

Sounds great; let me know how it goes.

dexonsmith requested changes to this revision.Jan 6 2021, 12:51 PM

(Marking as "Request Changes" to drop this from my queue; feel free to reach out if the direction I suggested isn't working well...)

This revision now requires changes to proceed.Jan 6 2021, 12:51 PM
Kale added a subscriber: Kale.May 2 2022, 4:51 AM

But it's possible we don't need this. If it's safe for us to update the tests and make FileManager::getFileRef always canonicalize to an absolute path, that would definitely be cleaner. FileManager::makeAbsolute can use whatever the FS's CWD is at the time of query... nice and clean.

I've tried to fix this bug through this way but found that "making FileManager::getFileRef always canonicalize to an absolute path" was pretty hard, because there are many test cases using states stored in FileManager and assuming that they are all in relative form. Besides, the assumption that "the CWD won't change" indeed is correct by design and works fine for single compiler execution. We might not change that unless for a strong necessity.

So I personally believed that this bug is caused by libtooling's incorrect use of FileManger. I plan to fix it by implementing a class like LibtoolingFileManager, or AbsoluteFileManager, which extends the original FileManager and using abs path as key to store the status of seen files, but only used for libtooling.

I will try to upload a patch later to verify the possibility.

Herald added a project: Restricted Project. · View Herald TranscriptMay 2 2022, 4:51 AM
Herald added a subscriber: StephenFan. · View Herald Transcript
dexonsmith added subscribers: bnbarham, benlangmuir.

Adding @bnbarham, @jansvoboda11, and @benlangmuir, who have been looking at various FileManager issues recently. If you post a follow up I suggest keeping them in the loop!

But it's possible we don't need this. If it's safe for us to update the tests and make FileManager::getFileRef always canonicalize to an absolute path, that would definitely be cleaner. FileManager::makeAbsolute can use whatever the FS's CWD is at the time of query... nice and clean.

I've tried to fix this bug through this way but found that "making FileManager::getFileRef always canonicalize to an absolute path" was pretty hard, because there are many test cases using states stored in FileManager and assuming that they are all in relative form. Besides, the assumption that "the CWD won't change" indeed is correct by design and works fine for single compiler execution. We might not change that unless for a strong necessity.

So I personally believed that this bug is caused by libtooling's incorrect use of FileManger. I plan to fix it by implementing a class like LibtoolingFileManager, or AbsoluteFileManager, which extends the original FileManager and using abs path as key to store the status of seen files, but only used for libtooling.

I will try to upload a patch later to verify the possibility.

Thanks a lot for taking over this bug. Please remember to close this patch if your new patch can solve this problem.
I will forward all the comments from the original author (Hao Zhang) if he has any suggestions for your new patch.