This is an archive of the discontinued LLVM Phabricator instance.

[Reproducers] Add file provider
ClosedPublic

Authored by JDevlieghere on Nov 15 2018, 10:22 PM.

Details

Summary

This patch adds the file provider which is responsible for capturing files used by LLDB.

When capturing a reproducer, we use a file collector that is very similar to the one used in clang. For every file that we touch, we add an entry with a mapping from its virtual to its real path. When we decide to generate a reproducer we copy over the files and their permission into to reproducer folder.

When replaying a reproducer, we load the VFS mapping and instantiate a RedirectingFileSystem. The latter will transparently use the files available in the reproducer.

I've tested this on two macOS machines with an artificial example. Still, it is very likely that I missed some places where we (still) use native file system calls. I'm hoping to flesh those out while testing with more advanced examples. However, I will fix those things in separate patches.

Diff Detail

Event Timeline

JDevlieghere created this revision.Nov 15 2018, 10:22 PM
  • Initialization of the FS needs to happen in the driver.
  • All file access has to go through the new FileSystem.
  • Needs tests.

I am confused by differing scopes of various objects that are interacting here. It seems you are implementing capture/replay as something that can be flicked on/off at any point during a debug session (Debugger lifetime). However, you are modifying global state (the filesystem singleton, at least), which is shared by all debug sessions. If multiple debug sessions try to enable capture/replay (or even access the filesystem concurrently, as none of this is synchronized in any way), things will break horribly. This seems like a bad starting point in implementing a feature, as I don't see any way to fix incrementally in the future.

Maybe you meant that by Initialization of the FS needs to happen in the driver.? In any case, for me the initialization part is the interesting/hard part of this problem. The details of how to collect the files are trivial compared to that and possibly depend on the result of the decisions made there.

I am confused by differing scopes of various objects that are interacting here. It seems you are implementing capture/replay as something that can be flicked on/off at any point during a debug session (Debugger lifetime). However, you are modifying global state (the filesystem singleton, at least), which is shared by all debug sessions. If multiple debug sessions try to enable capture/replay (or even access the filesystem concurrently, as none of this is synchronized in any way), things will break horribly. This seems like a bad starting point in implementing a feature, as I don't see any way to fix incrementally in the future.

Maybe I should've added a comment (I didn't want to litter the code with design goals) but the goal of the commands is not really to enable/disable the reproducer feature. I'm currently abusing the commands to do that, but the long term goal is that you can enable/disable part of the reproducer. For example, you could say only capture GDB packets or capture only files. It makes testing a lot easier if you can isolate a single source of information. Since there's currently only GDB packets I didn't split it up (yet).

Also I piggy-backed off this to enable/disable the feature as a whole because I needed to go through the debugger. See my reply below on why and how that'll be fixed by the next patch :-)

Maybe you meant that by Initialization of the FS needs to happen in the driver.? In any case, for me the initialization part is the interesting/hard part of this problem. The details of how to collect the files are trivial compared to that and possibly depend on the result of the decisions made there.

Yup, as per your suggestion in another review. I have an (unfinished) patch that splits off parsing of the command line arguments in the driver so we know the reproducer path before we initialize anything. That's pretty much the whole reason that we currently go through the debugger right now.

I am confused by differing scopes of various objects that are interacting here. It seems you are implementing capture/replay as something that can be flicked on/off at any point during a debug session (Debugger lifetime). However, you are modifying global state (the filesystem singleton, at least), which is shared by all debug sessions. If multiple debug sessions try to enable capture/replay (or even access the filesystem concurrently, as none of this is synchronized in any way), things will break horribly. This seems like a bad starting point in implementing a feature, as I don't see any way to fix incrementally in the future.

Maybe I should've added a comment (I didn't want to litter the code with design goals) but the goal of the commands is not really to enable/disable the reproducer feature. I'm currently abusing the commands to do that, but the long term goal is that you can enable/disable part of the reproducer. For example, you could say only capture GDB packets or capture only files. It makes testing a lot easier if you can isolate a single source of information. Since there's currently only GDB packets I didn't split it up (yet).

Also I piggy-backed off this to enable/disable the feature as a whole because I needed to go through the debugger. See my reply below on why and how that'll be fixed by the next patch :-)

Ok, that makes sense, thanks for the explanation. I guess that means that filesystem capturing (due to it being global) will then be one of the things that cannot be enabled/disabled at runtime ?

source/Host/common/FileSystem.cpp
79–82

Do you want to propagate this error up the stack?

source/Utility/FileCollector.cpp
32–33

upper_dest = path.upper(); ?

83–88

Is there anything which remove_leading_dotslash does, which wouldn't be done by the latter remove_dots call?

  • Get the external path with modifying the VFS in LLVM.
  • Integrate with the new integration logic.
  • Add a test. (that doesn't work yet, still WIP)
  • Rebase
  • Copy over permissions
  • Fix path when launching.
JDevlieghere retitled this revision from [wip][Reproducers] Add file provider to [Reproducers] Add file provider.Dec 10 2018, 5:16 PM
JDevlieghere edited the summary of this revision. (Show Details)
JDevlieghere set the repository for this revision to rLLDB LLDB.
JDevlieghere added a project: Restricted Project.

With the LLVM changes now checked-in this is ready for review.

Overall, I think the patch is pretty good. I made a bunch of inline suggestions/questions/comments, but they're all fairly minor. From a high-level view the two comments I have are:

  • I am slightly concerned about the non-temporality of this approach. Lldb does a fair amount of filesystem write operations, and this may be hard to capture in a static filesystem image. It seems you already had to work around some of these issues when you skip copying deleted files. I think that's a bridge we can cross when we reach it, but I'm just saying I see some potential trouble ahead...
  • I think it would be good nice to have a more granular test for this functionality. The existing test is sort of a sledgehammer (and apparently only runs on darwin). It is not obvious from it for instance, whether it actually tests any of the special symlink-handling code you have. (I guess it does, because you needed to write that code, but it's not clear whether that will still be true one year from now, or on someone else's machine). It sounds like it shouldn't be too hard to test the finer details of this implementation via unit tests, either by going through the FileSystem, or by going straight to the FileCollector class. What do you think?
include/lldb/Host/FileSystem.h
47

If this pointer should always be non-null, maybe this could be a reference?

lit/Reproducer/TestFileRepro.test
1

Would this test be expected to work on other targets using gdb-remote process plugin too?

source/Host/common/FileSystem.cpp
100–101

Maybe do this in the Initialize function, so you can return errors from there?

443

What are the chances of making this slightly more official by making getVFSFromYAML return a IntrusiveRefCntPtr<RedirectingFileSystem> instead of a generic pointer?

source/Initialization/SystemInitializerCommon.cpp
82

Would it be possible to avoid magic strings here, and use something like GetProviderInfo<FileProvider>() instead?

87

When can this happen? Should this be an error or maybe even an assert?

JDevlieghere marked 6 inline comments as done.

Address Pavel's feedback.

JDevlieghere planned changes to this revision.Jan 16 2019, 10:05 AM

I'll add a more specific test for the collector.

source/Host/common/FileSystem.cpp
443

I'm not sure I understand what the benefit would be, we'd still store it as a generic FileSystem in the member variable?

source/Initialization/SystemInitializerCommon.cpp
82

My idea was to decouple the provider (used only for capture) and the info needed for replay, but I think we can improve things by merging them into one. I don't think I've run into a situation yet where we don't have access to the provider during replay.

87

Yes, reproducers should still work with "some" functionality disabled. The infrastructure is not there yet, but it would be nice to test pieces individually.

labath added inline comments.Jan 16 2019, 12:22 PM
source/Host/common/FileSystem.cpp
47

Would it be possible to have the overloads which always succeed keep returning void? Here, you've had to add failure checks in SystemInitializerCommon, which are essentially dead code, because the error path will never be taken.

443

It's not a big thing, but the advantage would be that it would be possible to locally verify that the up/down-casting is correct just by looking at the FileSystem class. This way you have to peek into the implementation of getVFSFromYAML to see that it indeed always returns the right type. Also, if anyone later gets the idea of changing the type returned by this function, he'll be more likely to thing about possible breakages down the line.

source/Initialization/SystemInitializerCommon.cpp
82

Ok, I see what you mean. That should still be possible to achieve even with stricter type checking. You'd just need to have a separate type for the "info", which can be shared by the replay and capture classes.

87

Ok, makes sense.

JDevlieghere marked 4 inline comments as done.

Feedback + Rebase on D56814

labath added inline comments.Jan 17 2019, 12:20 AM
include/lldb/Utility/Reproducer.h
97–98

Are you sure this can be in a header? I would expect this to give you multiply-defined symbols errors.

JDevlieghere marked an inline comment as done.Jan 17 2019, 10:29 AM
JDevlieghere added inline comments.
include/lldb/Utility/Reproducer.h
97–98

You are correct, should be in the implementation.

  • Fix issue in lit test.
  • Add unit test for the file collector.

Some comments. Thanks!

include/lldb/Utility/FileCollector.h
6–8

You might want to update the license.

51–55

Can we add comments to these member variables? Some are not entirely obvious to me, e.g. what's m_root?
What's m_mutex is supposed to protect . I think we should document these invariants.

source/Utility/FileCollector.cpp
28–34

should this be a function in FS to check the case-sensitiveness?

85–88

Ditto (I thought there was one to canonicalize the source path?)

JDevlieghere marked 9 inline comments as done.

Code review feedback Davide.

source/Utility/FileCollector.cpp
28–34

As Host depends on Utility, we cannot depend on the filesystem class here.

85–88

See previous answer.

davide accepted this revision.Jan 25 2019, 4:31 PM

I don't have other comments, this is good to go for me. Pavel?

This revision is now accepted and ready to land.Jan 25 2019, 4:31 PM

I don't have any major issues with this patch. The thing I'd like to see though is one test which uses a path with .. components and symlinks. If nothing else, it would serve as a good documentation for how these are supposed to be handled.

include/lldb/Utility/Reproducer.h
97–98

It looks like these are still in the header. :)

source/Utility/FileCollector.cpp
28–34

FWIW, I think it makes sense for FileCollector to *not* use the FileSystem class, as FileCollector is pretty much an implementation detail of the FileSystem.

unittests/Utility/FileCollectorTest.cpp
31–67 ↗(On Diff #183641)

I have some doubts about whether this will correctly run on windows (mixing of posix and native paths sounds worrying). It might be better to just have this test always work with host paths (as these class should always deal with files on the host system anyway).

71–72 ↗(On Diff #183641)

For testing, you can theoretically pull in other modules if they are needed (but I don't think that's the case here).

78–80 ↗(On Diff #183641)

If you made these into ScopedFile classes like the VFS tests do then besides making this shorter, you'd also make this test self-cleaning in most of the cases (I fear this can start choking the buildbots after a couple of thousand runs...)

JDevlieghere marked 4 inline comments as done.
  • Feedback pavel
labath accepted this revision.Jan 29 2019, 2:54 AM

Looks good, thanks for bearing with me.

I don't want to hold this up further, but ideally, I'd change that last test to use positive assertions instead of negative ones. E.g., right now this doesn't really fulfill my goal of documenting how symlinks are handled. Something like EXPECT_THAT(mapping, testing::Contains(YAMLVFSEntry(file_root + "/bar", reproducer_root + "/foo"))); would (as well as giving better error messages).

PS: I believe the symlink test will have to be #ifndef WINDOWS.

This revision was automatically updated to reflect the committed changes.