This is an archive of the discontinued LLVM Phabricator instance.

[VFS] Add a new getVFSFromYAMLs API
Changes PlannedPublic

Authored by bnbarham on Mar 10 2022, 3:37 PM.

Details

Summary

Implements redirecting-with using an OverlayFileSystem rather than
nesting RedirectingFileSystem -

  • "fallthrough": adds the RedirectingFileSystem to OverlayFileSystem
  • "fallback": adds the RedirectingFileSystem *before* the previously added one. This is really only useful when its the initial overlay to place it before the RealFileSystem
  • "redirect-only": removes any previously added filesystems. Mostly intended to remove the RealFileSystem, but works in the general case as well.

Updates getVFSFromYAML to use this API, as well as any existing uses
of getVFSFromYAML since it's simpler anyway (ie. no need to read the
underlying buffers first). This does come with a slight change in
semantics though - it is no longer possible to "chain" VFS mappings.
Given an overlay with a mapping of A -> B and another with B -> C, A can
no longer map to C. An A ->C mapping must be specified as its own
mapping if this is desired.

The intention of this change is to allow for simplifying
RedirectingFileSystem by removing all the redirecting-with handling
from it. This will allow reverting it back to its original
implementation where it simply performed the mapping and failed if that
mapping did not exist in ExternalFS. That will be completed in a later
commit to allow for an easier migration path if any downstream user
requires the ability to chain mappings.

Depends on D121423

Diff Detail

Event Timeline

bnbarham created this revision.Mar 10 2022, 3:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2022, 3:37 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
bnbarham requested review of this revision.Mar 10 2022, 3:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2022, 3:37 PM
bnbarham updated this revision to Diff 414703.Mar 11 2022, 10:58 AM

Fixed up the getVFSFromYAML implementation to actually pass the buffer. Note that I'm happy to remove this, I only left it around to make it easier on downstream users.

bnbarham updated this revision to Diff 414741.Mar 11 2022, 1:47 PM

Add a check for CWD when using redirect-only

bnbarham updated this revision to Diff 414750.Mar 11 2022, 2:20 PM
bnbarham edited the summary of this revision. (Show Details)

Update to single review dependency

bnbarham updated this revision to Diff 415615.Mar 15 2022, 3:44 PM
bnbarham edited the summary of this revision. (Show Details)

Re-order to be before D121424 and merge with D121426

Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptMar 15 2022, 3:44 PM
bnbarham added a comment.EditedMar 15 2022, 4:09 PM

There's two failing tests with this change:

  • VFSFromYAMLTest.ReturnsExternalPathVFSHit
  • VFSFromYAMLTest.ReturnsInternalPathVFSHit

Apparently we allow relative paths in external-contents *without* specifying overlay-relative: true. In this case the relative paths are resolved based on the CWD at the time of the operation. Do we really want that? I believe this wouldn't have worked prior to https://reviews.llvm.org/D109128 as there was no canonicalising before then as well. @keith was that change intentional? If we want it then it will *require* setting CWD on each FS in OverlayFS, which I was really hoping to avoid.

To give the concrete case...

Prior to all these changes:

RedirectingFileSystem
  /a/foo -> bar
ExternalFS
  RealFileSystem
    /a/bar
    /b/bar

cd /a
cat foo # /a/bar

cd /b # fail since it doesn't exist in RedirectingFS

After:

OverlayFileSystem
  RedirectingFileSystem
    /a/foo -> bar

  RealFileSystem
    /a/bar
    /b/bar

cd /a
cat foo # fails since OverlayFS::setCWD doesn't set CWD on the underlying FS

cd /b # succeed
cat foo # fail since there's no /b/foo

In D121424 it will fail for a slightly different reason, but still fail - there's no longer any canonicalise after mapping the path in RedirectingFS (I assumed this was unintentional and missed the test case).

If we want to allow this then I think we'll need to go with the previous plan for OverlayFS, ie. keep track of filesystems that setCWD failed on and don't call them at all until they succeed again. That then doesn't allow certain cases that the current method allows.

There's two failing tests with this change:

  • VFSFromYAMLTest.ReturnsExternalPathVFSHit
  • VFSFromYAMLTest.ReturnsInternalPathVFSHit

Apparently we allow relative paths in external-contents *without* specifying overlay-relative: true. In this case the relative paths are resolved based on the CWD at the time of the operation. Do we really want that? I believe this wouldn't have worked prior to https://reviews.llvm.org/D109128 as there was no canonicalising before then as well. @keith was that change intentional? If we want it then it will *require* setting CWD on each FS in OverlayFS, which I was really hoping to avoid.

I spoke to Keith offline. This has always worked - it previously worked by RedirectingFileSystem setting CWD on ExternalFS when setCWD was called on it. It's also important to keep supporting as it's used in Bazel (https://github.com/bazelbuild/rules_swift/blob/c1d7d1df6969c2675c7826ecf1202d78016b1753/swift/internal/vfsoverlay.bzl#L41-L55).

I'm hoping I can fix this by resolving the paths when the overlay is created. I'll see if that works (it'll depend on when -working-directory is actually used by the frontend).

There's two failing tests with this change:

  • VFSFromYAMLTest.ReturnsExternalPathVFSHit
  • VFSFromYAMLTest.ReturnsInternalPathVFSHit

Apparently we allow relative paths in external-contents *without* specifying overlay-relative: true. In this case the relative paths are resolved based on the CWD at the time of the operation. Do we really want that? I believe this wouldn't have worked prior to https://reviews.llvm.org/D109128 as there was no canonicalising before then as well. @keith was that change intentional? If we want it then it will *require* setting CWD on each FS in OverlayFS, which I was really hoping to avoid.

I'm not sure how external-contents factors into this (shouldn't that just affect the observed file path, not whether the lookup succeeds?).

Regardless, here's the testcase again:

OverlayFileSystem
  RedirectingFileSystem
    /a/foo -> bar

  RealFileSystem
    /a/bar
    /b/bar

I think that should behave like:

InMemoryFileSystem
  /a/bar
  /a/foo -> bar (link, with "magic" to sometimes leak path)
  /b/bar

cd /a # succeeds
cat foo # succeeds (get content of /a/bar; report path of /a/bar or /a/foo)

cd /b # succeeds
cat foo # fails

Is the model correct?

If so, I think these are the things we need to get right (not just for that test case):

  1. Return error on operations that should fail.
  2. Get the right content what cat succeeds.
  3. Reporting the "right" paths for files that are opened (depends on external contents, etc.).
  4. Keeping the "view" / state consistent when the underlying filesystems changes state.

Does that breakdown make sense?

If so, can you clarify which part is failing? (Not sure if it's (3) or (4)... or maybe some (5).)

Or another way in: I don't fully understand why this fails:

cd /a
cat foo # fails since OverlayFS::setCWD doesn't set CWD on the underlying FS

Can you be more detailed about the overall state at the time of cat, which causes it to fail?

There's two failing tests with this change:

  • VFSFromYAMLTest.ReturnsExternalPathVFSHit
  • VFSFromYAMLTest.ReturnsInternalPathVFSHit

Apparently we allow relative paths in external-contents *without* specifying overlay-relative: true. In this case the relative paths are resolved based on the CWD at the time of the operation. Do we really want that? I believe this wouldn't have worked prior to https://reviews.llvm.org/D109128 as there was no canonicalising before then as well. @keith was that change intentional? If we want it then it will *require* setting CWD on each FS in OverlayFS, which I was really hoping to avoid.

I spoke to Keith offline. This has always worked - it previously worked by RedirectingFileSystem setting CWD on ExternalFS when setCWD was called on it. It's also important to keep supporting as it's used in Bazel (https://github.com/bazelbuild/rules_swift/blob/c1d7d1df6969c2675c7826ecf1202d78016b1753/swift/internal/vfsoverlay.bzl#L41-L55).

I'm hoping I can fix this by resolving the paths when the overlay is created. I'll see if that works (it'll depend on when -working-directory is actually used by the frontend).

(Would this mean resolving everything in the .yaml file eagerly at launch? That sounds a bit scary...)

bnbarham added a comment.EditedMar 15 2022, 8:41 PM

Can you be more detailed about the overall state at the time of cat, which causes it to fail?

Sure. I think there's also some confusion with names here but I'm generally trying to use the names as they are used in the overlay YAML or the name of the members in RedirectingFS.

"external-contents" is just the path that we're mapping to, so in my example above (/a/foo -> bar), "bar" is what I was referring to by "external-contents". So let's use that again -

OverlayFileSystem
  RedirectingFileSystem
    /a/foo -> bar

  RealFileSystem
    /a/bar
    /b/bar

After D121423, cd /a will result in the following -

  • OverlayFS: CWD set to /a
  • RedirectingFS and RealFS: CWD unchanged, so let's assume they were / at the time of creation and that's what they are now

cat foo -

  • OverlayFS canonicalises to /a/foo and passes that to openFileForRead in RedirectingFS
  • RedirectingFS has a /a/foo -> bar mapping, so it now runs openFileForRead on its ExternalFS (which is the RealFS)
  • RealFS is at / and there is no /bar and thus fails
  • RedirectingFS also fails
  • Back in OverlayFS now, and because we failed we now try RealFS with /a/foo
  • RealFS has no /a/foo, it fails
  • Back in OverlayFS, all FS have now failed, return a failure

How this used to work really depends on when we're talking about 😆. Prior to D121423 the FS would have looked like:

RedirectingFileSystem
  /a/foo -> bar
ExternalFS
  RealFileSystem
    /a/bar
    /b/bar

Prior to Jonas' change, setCWD on RedirectingFS would run setCWD on ExternalFS. So cd /a would set CWD to /a for both RedirectingFS and RealFileSystem. Then cat /a/foo would give bar and openFileForRead on ExternalFS (RealFS) (which has /a CWD) would open /a/bar.

After Jonas' *and* Keith's change, setCWD no longer runs on ExternalFS but instead the path was canonicalised before sending it along. So /a/foo maps to bar as before, but then it would be canonicalised using RedirectingFS CWD and thus do an open for /a/bar on ExternalFS.

(Would this mean resolving everything in the .yaml file eagerly at launch? That sounds a bit scary...)

Yes, depending on what you mean by "resolving". All I mean is "prefix relative paths with CWD". The main issue with doing this is that it prohibits canonicalising paths to a virtual CWD, since you wouldn't be able to set CWD to a virtual path before making the overlay. Perhaps that's what you mean by "a bit scary" though.

We currently prefix relative paths with ExternalContentsPrefixDir if overlay-relative: true is set. That path is a combination of CWD + the directory to the overlay. This is to handle overlays that have absolute paths that we want to remap (specifically for the crash dump use case I believe). But if overlay-relative: false is set then we just canonicalise the path on open/etc.

I just ran a few tests with the frontend though:

  • By default, paths come in relative and will become absolute from the process-wide CWD
  • If -working-directory *is* set then paths come in absolute from FileManager - setCWD is never run

I'm not sure if there's a good reason it's done this way or it's just that CWD was only added to FileSystem relatively recently, but it does mean that that we're actually canonicalising the paths with the process-wide CWD regardless at the moment (at least from the frontend). Possibly unexpected since you could pass in -working-directory/foo but the relative mappings would still get canonicalised using the process CWD.

bnbarham planned changes to this revision.Mar 17 2022, 7:02 PM

Blocked on the dependent

ormris removed a subscriber: ormris.May 16 2022, 11:20 AM