This is an archive of the discontinued LLVM Phabricator instance.

[clang] Use `{File,Directory}EntryRef` in modular header search (part 2/2)
ClosedPublic

Authored by jansvoboda11 on May 31 2023, 4:32 PM.

Details

Summary

This patch removes some deprecated uses of {File,Directory}Entry::getName(). No functional change intended.

Depends on D151854.

Diff Detail

Event Timeline

jansvoboda11 created this revision.May 31 2023, 4:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2023, 4:32 PM
Herald added a subscriber: ributzka. · View Herald Transcript
jansvoboda11 requested review of this revision.May 31 2023, 4:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2023, 4:32 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
benlangmuir accepted this revision.Jun 1 2023, 9:33 AM
This revision is now accepted and ready to land.Jun 1 2023, 9:33 AM

Rebase, reformat

I noticed the Modules/crash-vfs-umbrella-frameworks.m test fails. I'll need to look into this some more.

jansvoboda11 edited the summary of this revision. (Show Details)Jun 1 2023, 11:07 AM

Fix failing test

jansvoboda11 requested review of this revision.Jun 7 2023, 10:56 AM

Regarding the failing Modules/crash-vfs-umbrella-frameworks.m test, this is my understanding after debugging it.

Whenever we find that A.framework/Frameworks/B.framework is a symlink, we treat B as a separate top-level framework, not as a subframework of A. We detect this in ::getTopFrameworkDir() (in HeaderSearch.cpp) by calling FileManager::getCanonicalPath(), which resolves the symlink. We use the returned directory when calling the module map parser. Before this change, the DirectoryEntry::getName() ended up being the path Clang first accessed: A.framework/Frameworks/B.framework. After this change, it's the as-requested canonical path instead: B.framework. Since this canonicalization takes place and doesn't have any side-effect (I believe & hope), I think it should be fine to allow dropping the A.framework/Frameworks/B.framework directory from the reproducer VFS.
Interestingly, before this change, the reproducer generated PCMs only for A and I, which makes me believe B ended up being treated as a subframework somehow (maybe the VFS interferes with the symlink canonicalization?). With this patch, the reproducer generates PCMs for all A, B and I, which matches the original (crashing) invocation on line 13. For the record, the test was introduced in D20194.

@benlangmuir, does that sound like a reasonable thing to do?

I think it should be fine to allow dropping the A.framework/Frameworks/B.framework directory from the reproducer VFS

I think technically this is wrong, since if you're missing the symlink, then A might not build -- e.g. it could be doing a relative include that needs the symlink. But am I understanding correctly that the reproducer was already broken in this case? If so I'm fine with this.

The right thing to do would be to capture both the framework and the symlink. I'm not sure how practical that is with the current architecture. @JDevlieghere any thoughts? Longer term, our CAS work could ultimately end up solving this in a better way by fully capturing all the inputs.

jansvoboda11 added a comment.EditedJun 7 2023, 11:15 AM

I think it should be fine to allow dropping the A.framework/Frameworks/B.framework directory from the reproducer VFS

I think technically this is wrong, since if you're missing the symlink, then A might not build -- e.g. it could be doing a relative include that needs the symlink. But am I understanding correctly that the reproducer was already broken in this case? If so I'm fine with this.

You're right, this would break relative includes, which I believe the current test case would handle correctly. (It only fails to canonicalize the paths to B.framework and make it into a top-level framework.) I'll try to verify this, just to be sure.

The right thing to do would be to capture both the framework and the symlink. I'm not sure how practical that is with the current architecture.

Yeah, I think currently we only record the (as-requested) file names while parsing a module map. I'm not sure how we could record both the canonical and subframework-like paths with the current more precise FileEntryRefs. (Long-term, we should probably unify this with the "affecting files and module maps" logic in clang-scan-deps.)

I think it should be fine to allow dropping the A.framework/Frameworks/B.framework directory from the reproducer VFS

I think technically this is wrong, since if you're missing the symlink, then A might not build -- e.g. it could be doing a relative include that needs the symlink. But am I understanding correctly that the reproducer was already broken in this case? If so I'm fine with this.

You're right, this would break relative includes, which I believe the current test case would handle correctly. (It only fails to canonicalize the paths to B.framework and make it into a top-level framework.) I'll try to verify this, just to be sure.

I tried changing the #include <B/B.h> to relative #include "../Frameworks/B.framework/Headers/B.h" in A.h. The original Clang invocation compiles both of these fine with and without this patch, promoting B to a top-level framework.

The relative include fails to compile with the reproducer VFS, both with and without this patch. This boils down to Clang calling FileSystem::getRealPath("<snip>/Frameworks/A.framework/Frameworks/B.framework") when deciding whether to promote B to top-level framework. Here the RedirectingFileSystem ends up just canonicalizing the path (making it absolute, removing ., etc.) and forwarding to the underlying (real) FS. This doesn't have the symlink, so it returns non-zero error-code, and FileManager::getCanonicalName() ends up returning the virtual path under A.framework, preventing promotion of B to a top-level framework. This ends up registering B as a sub-framework of A in one place, but then Clang sees <snip>/Frameworks/B.framework and attempts to create B again with the same DirectoryEntry umbrella directory, resulting in the error below:

<build>/tools/clang/test/Modules/Output/crash-vfs-umbrella-frameworks.m.tmp/i/Frameworks/A.framework/Frameworks/B.framework/Modules/module.modulemap:2:19: error: umbrella for module 'A.B' already covers this directory
    2 |   umbrella header "B.h"
      |                   ^
<build>/tools/clang/test/Modules/Output/crash-vfs-umbrella-frameworks.m.tmp/i/Frameworks/A.framework/Frameworks/B.framework/Modules/module.modulemap:4:10: error: inferred submodules require a module with an umbrella
    4 |   module * { export * }
      |          ^
While building module 'I' imported from <source>/clang/test/Modules/crash-vfs-umbrella-frameworks.m:42:
In file included from <module-includes>:1:
<build>/tools/clang/test/Modules/Output/crash-vfs-umbrella-frameworks.m.tmp/i/Frameworks/I.framework/Headers/I.h:2:9: fatal error: could not build module 'A'
    2 | #import <A/A.h>
      |  ~~~~~~~^
<source>/clang/test/Modules/crash-vfs-umbrella-frameworks.m:42:9: fatal error: could not build module 'I'
   42 | @import I;
      |  ~~~~~~~^
4 errors generated.

(Note that the A.framework/Frameworks/B.framework/Headers/B.h and A.framework/Frameworks/B.framework/Modules/module.modulemap files do make it into the reproducer VFS with this patch, since they both get touched by the original Clang invocation. For some reason, we end up with duplicate entries for these in the overlay file without this patch.)

That said, this patch doesn't regress the relative include reproducer, since it's already broken. I'm not exactly sure what the fix for that particular issue is, since the interactions are pretty subtle. I don't think it's worth blocking this patch on it, though. What do you think?

benlangmuir accepted this revision.Jun 14 2023, 9:15 AM

I don't think it's worth blocking this patch on it, though. What do you think?

Agreed, I don't think this is making it worse.

This revision is now accepted and ready to land.Jun 14 2023, 9:15 AM
This revision was landed with ongoing or failed builds.Jun 15 2023, 2:22 AM
This revision was automatically updated to reflect the committed changes.