This is an archive of the discontinued LLVM Phabricator instance.

[clang][deps] Ensure deterministic filename case
ClosedPublic

Authored by jansvoboda11 on Apr 6 2022, 8:52 AM.

Details

Summary

The dependency scanner can reuse single FileManager instance across multiple translation units. This may lead to non-deterministic output depending on which TU gets processed first.

One of the problems is that Clang uses DirectoryEntry::getName in the header search algorithm. This function returns the path that was first used to construct the (shared) entry in FileManager. Using DirectoryEntryRef::getName instead preserves the case as it was spelled out for the current "get directory entry" request.

rdar://90647508

Diff Detail

Event Timeline

jansvoboda11 created this revision.Apr 6 2022, 8:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2022, 8:52 AM
Herald added a subscriber: mgrang. · View Herald Transcript
jansvoboda11 requested review of this revision.Apr 6 2022, 8:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2022, 8:52 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Finish test description.

dexonsmith accepted this revision.Apr 6 2022, 10:53 AM
dexonsmith added a reviewer: benlangmuir.
dexonsmith added a subscriber: benlangmuir.

LGTM.

Once we've removed the last use of DirectoryEntry::getName(), we should drop DirectoryEntry::getName() to avoid picking up other users. Perhaps DirectoryEntry itself could be made private to the FileManager.

The root cause is the fact that Clang is using DirectoryEntry::getName in the header search algorithm. This function returns the path that was first used to construct the (shared) entry in FileManager. Using DirectoryEntryRef::getName instead preserves the case as it was spelled out for the current "get directory entry" request.

For clarity, this is just one of various root causes of FileManager being unsound to reuse. Might be good to reword the commit message to avoid implying that it's the only one.

The obvious thing is that FileEntry is still used in a few places, especially in modules.

The non-obvious thing is that the "redirection hack" in FileManager::getFileRef(), which deals with a VFS remapping an external name, can cause future lookups to succeed that would previously have failed. @benlangmuir hit this recently when he was experimenting with NOT sharing the FileManager during implicit modules scanning; he can share the sequence of steps (I tried just now to remember them but I couldn't work it out).

IMO, we should turn off sharing the FileManager across multiple TUs (except as an experimental option) until we actually believe that it's sound. I'm not convinced there's a lot of benefit anyway:

  • Stat failures need to be cleared between TUs
  • Stat successes are cached by the dependency scanning filesystem
  • The contents are (mostly? entirely?) bump-ptr allocated since https://reviews.llvm.org/D123144
This revision is now accepted and ready to land.Apr 6 2022, 10:53 AM

Windows path separators...

jansvoboda11 retitled this revision from [clang][deps] Ensure deterministic file names on case-insensitive filesystems to [clang][deps] Ensure deterministic filename case.Apr 7 2022, 8:09 AM
jansvoboda11 edited the summary of this revision. (Show Details)
This revision was landed with ongoing or failed builds.Apr 8 2022, 12:18 AM
This revision was automatically updated to reflect the committed changes.