This is an archive of the discontinued LLVM Phabricator instance.


Differential D88666

DirectoryWatcher: add an implementation for Windows
ClosedPublic

Authored by compnerd on Oct 1 2020, 8:51 AM.


Details

Reviewers

amccarth

Commits

rG5d74c4351175: DirectoryWatcher: add an implementation for Windows

Summary

This implements the directory watcher on Windows. It does the most
naive thing for simplicity. ReadDirectoryChangesW is used to monitor
the changes. However, in order to support interruption, we must use
overlapped IO, which allows us to use the blocking, synchronous
mechanism. We create a thread to post the notification to the consumer
to allow the monitoring to continue. The two threads communicate via a
locked queue.
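The "locked queue" between the two threads can be sketched in portable C++ as below. This is only an illustration of the pattern described in the summary, not the code from the patch: `Event`, `LockedQueue`, `runPipeline`, and the negative-`Kind` stop sentinel are all hypothetical names made up for this sketch.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for DirectoryWatcher::Event; not the real clang type.
struct Event {
  int Kind;
  std::string Path;
};

// A mutex-protected queue: producers push, a consumer blocks until an item
// is available.
class LockedQueue {
  std::mutex M;
  std::condition_variable CV;
  std::queue<Event> Q;

public:
  void push(Event E) {
    {
      std::lock_guard<std::mutex> L(M);
      Q.push(std::move(E));
    }
    CV.notify_one(); // Wake the consumer after releasing the lock.
  }

  Event pop() {
    std::unique_lock<std::mutex> L(M);
    CV.wait(L, [this] { return !Q.empty(); }); // Tolerates spurious wakeups.
    Event E = std::move(Q.front());
    Q.pop();
    return E;
  }
};

// Runs a producer ("watcher") and a consumer ("notifier") and returns the
// paths in the order the consumer delivered them. Kind < 0 is a hypothetical
// sentinel meaning "stop".
std::vector<std::string> runPipeline(const std::vector<std::string> &Paths) {
  LockedQueue Queue;
  std::vector<std::string> Delivered;

  std::thread Notifier([&] {
    while (true) {
      Event E = Queue.pop();
      if (E.Kind < 0)
        return;
      Delivered.push_back(E.Path);
    }
  });
  std::thread Watcher([&] {
    for (const std::string &P : Paths)
      Queue.push({0, P});
    Queue.push({-1, ""});
  });

  Watcher.join();
  Notifier.join();
  return Delivered;
}
```

Because the producer never waits on the consumer, monitoring can continue while a slow callback is still running, which is the motivation given in the summary.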

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

compnerd created this revision. Oct 1 2020, 8:51 AM

Herald added a project: Restricted Project. Oct 1 2020, 8:51 AM

compnerd requested review of this revision. Oct 1 2020, 8:51 AM

Harbormaster completed remote builds in B73663: Diff 295586. Oct 1 2020, 9:02 AM

Overall looks good.

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
95	Can this be an assert with some message or some explanation of the hEvent return value? What happens if hEvent is non-zero on Release builds?

compnerd added a subscriber: jkorous. Oct 1 2020, 9:42 AM

I wonder if we should unit test this functionality by having some tests that create and remove files that are watched. I'm not 100% convinced that is a great idea, but not having test coverage for the change is also not really a great idea either. Thoughts welcome.

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
21	You should include `llvm/Support/Windows/WindowsSupport.h` not `Windows.h` directly.
69–76	`hDirectory` -> `Directory`
80	`hDirectory` -> `Directory`
86	You should strip the Hungarian notation prefixes and ensure all the identifiers meet our usual naming rules, I'll stop bringing them up.
87	Is a smart pointer required here or could you use `std::vector<WCHAR>` and reserve the space that way?
101	You can drop the top-level `const` on value types.
170	A newline above this line would be helpful for visual distinction.
264	I think we should assert that -- calling this on a file isn't likely to behave in a good way.
270	More top-level `const`s

compnerd added inline comments. Oct 1 2020, 10:04 AM

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
95	I debated this myself which is why I added the `assert`. Honestly, if this fails, there is very little that can be done. This is creating an anonymous (unnamed) event. The failure here would be caused by out-of-memory conditions (you're dead anyways) or system is completely out of resources (you're dead anyways). I don't know of any recovery in that situation. I really would prefer to avoid the two-phase construction which is going to be required if we want to handle that error scenario. The event only makes sense after we have the directory handle, so I suppose that we could setup the event prior to the construction of the watcher itself, but that is just as much two-phase construction as adding an `initialize` method.

compnerd added inline comments. Oct 1 2020, 10:05 AM

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
87	Sure, I can convert this to a `std::vector<WCHAR>` instead.

@aaron.ballman - I completely agree with you about the testing. The interfaces are tested via https://reviews.llvm.org/source/llvm-github/browse/master/clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp, which now that I look at again, seems to need an additional case for system name.

I'm sorry. I haven't had time to review the entire change yet, but I thought I'd share some early feedback now and take more of a look on Monday.

The high level issues on my mind:

I'm wondering whether this has been overcomplicated with the overlapped IO. If the monitoring thread used FindFirstChangeNotificationW to get a waitable handle and then waited on it, you'd be able to call ReadDirectoryChangesW synchronously. To allow the parent thread to signal it to quit, you'd just need an Event, and the monitor thread would use WaitForMultipleObjects to wait for either handle to become signaled. Maybe I'm overlooking something, but it might be worth a few minutes of consideration.

We'll also have to think about how to test this.

The lower level issues that I've spotted are inlined.

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
31	The `ovlIO` name isn't consistent with LLVM style.
34	If it were me, I'd probably make this a `std::vector`. If an off-by-one bug causes an overrun of one WCHAR, you could trash a crucial member variable. On the heap, the damage is less likely to be catastrophic. You wouldn't need `alignas`. I don't think these are created in a tight loop, so the overhead doesn't concern me. Also, I'd probably go with a slightly more descriptive name, like `Notifications` rather than `Buffer`.
82	There's a lot going on in this constructor. Is this how the other implementations are arranged? Would it make sense to just initialize the object, and save most of the actual work to a `Watch` method?
87	I guess it's fine to use the array form of `std::unique_ptr` (but then you should `#include <memory>`). If it were me, I'd probably just use a `std::wstring` or `std::vector<WCHAR>`. `dwLength` already includes the size of the null terminator. Your first `GetFinalPathNameByHandleW` call "fails" because the buffer is too small. The docs say that, if it fails because the buffer is too small, then the return value is the required size _including_ the null terminator. (In the success case, it's the size w/o the terminator.) I know this is the Windows-specific implementation, but it might be best to just use the Support API `realPathFromHandle`, which does this and has tests.
88	I don't think you want to ignore the return value, since it'll tell you exactly how many characters you actually got back (or whether there was an error). Again, I recommend using `realPathFromHandle` from Support.
94	No real difference here, but, for consistency, please make this `CreateEventW` with the explicit -W suffix.
189	I think this leaks the `ovlIO.hEvent`. After you've joined both threads, make sure to call `::CloseHandle()`.

There already is testing coverage for this - I just missed the CMake changes.

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
34	The `alignas` is because the documentation states that the buffer should be DWORD aligned. It is more for pedantic reasons rather than anything else. I think that making it a catastrophic failure is a good thing though - it would catch the error :) You are correct about the allocation - it is once per watch. I'll rename it at least.
82	Largely the same. However, the majority of the "work" is actually the thread proc for the two threads.
87	I didn't know about `realPathFromHandle` - I prefer that actually.

compnerd added inline comments. Oct 4 2020, 8:05 PM

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
87	Actually, `realPathFromHandle` is private to `Path.cpp` :-(

Address feedback

Herald added a subscriber: mgorny. Oct 4 2020, 8:11 PM

Harbormaster completed remote builds in B73940: Diff 296092. Oct 4 2020, 8:34 PM

Some of this is nitpicky/opinionated, but the race condition is real. We need a reliable way to signal the watcher thread when it's time to exit. Options I see are:

1. Use the FindFirstChange function to get a handle to wait on for the directory change and create a separate event to signal when it's time to exit. The watcher thread would use WaitForMultipleObjects. If it's a directory change, then it can make a synchronous call to ReadDirectoryChangesW, knowing that there's info available. (It could possibly even do the callback at that point, without the need for a separate handler thread.)

2. Continue to use ReadDirectoryChangesW with overlapped IO, but, instead of waiting in GetOverlappedResult, first use WaitForMultipleObjects on the event in the overlapped IO and a distinct event used to tell the thread to exit (as in #1).
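Both options rest on the same idea: the watcher waits once, but on two distinct signals, so an exit request cannot be confused with an I/O completion. The sketch below is a portable C++ analogy of that idea and deliberately does not use the Win32 calls named above. `WatcherSignals`, `runWatcher`, and the choice to drain pending changes before honoring the stop request are assumptions made for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical portable stand-in for "wait on two handles": one condition
// variable, but a separate piece of state per reason to wake up.
class WatcherSignals {
  std::mutex M;
  std::condition_variable CV;
  int PendingChanges = 0;     // Analogue of the directory-change signal.
  bool StopRequested = false; // Analogue of the dedicated exit event.

public:
  enum class Wake { Change, Stop };

  void notifyChange() {
    {
      std::lock_guard<std::mutex> L(M);
      ++PendingChanges;
    }
    CV.notify_all();
  }

  void requestStop() {
    {
      std::lock_guard<std::mutex> L(M);
      StopRequested = true;
    }
    CV.notify_all();
  }

  // Blocks until either reason is signaled. The stop flag is sticky: the
  // waiter never resets it, so a request made between two waits is not lost.
  // Pending changes are reported before the stop request (a design choice of
  // this sketch).
  Wake wait() {
    std::unique_lock<std::mutex> L(M);
    CV.wait(L, [this] { return StopRequested || PendingChanges > 0; });
    if (PendingChanges > 0) {
      --PendingChanges;
      return Wake::Change;
    }
    return Wake::Stop;
  }
};

// Runs a watcher loop and returns how many changes it processed before it
// observed the stop request.
int runWatcher(int Changes) {
  WatcherSignals S;
  int Seen = 0;
  std::thread Watcher([&] {
    while (S.wait() == WatcherSignals::Wake::Change)
      ++Seen;
  });
  for (int I = 0; I < Changes; ++I)
    S.notifyChange();
  S.requestStop();
  Watcher.join();
  return Seen;
}
```

The property that matters is that the stop state lives in its own variable. In the Win32 version that would correspond to a second event handle rather than reusing the event inside the OVERLAPPED structure.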

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
34	But it's still an arbitrarily-sized buffer in the middle of a class definition. If you change your mind about how big to make it, it changes the definition of the class. The buffer is going to be accessed using pointer arithmetic, which is generally dangerous. Moving the buffer out of the class avoids both of those problems. The alignas here is _not_ pedantic, it's essential. Without it, you could easily have an alignment problem. But if you used a vector, you'd know that it would always be suitably aligned.
82	Let me put it another way. Constructors cannot return errors and LLVM does not use exceptions, so things that can fail generally shouldn't be in the constructor. This code is accessing the file system, creating an event, and spawning two threads. Any of those things can fail, but you've got no way to let the caller know whether something went wrong. If the lambdas were really short, then it would be easy to see that they're thread procs. But they're not, so they're hard to find and understand. If they were private member functions with descriptive names, the code would be easier to understand.
92	Using `Buffer.data()` when you've only reserved space is undefined behavior. You should use `resize` instead of `reserve` and then pass the `size` rather than the `capacity`. Be aware that, while unlikely, this could still fail. The directory could have been removed or renamed between calls or the caller could have passed a bad handle to begin with.
195	I don't think this is a reliable way to get the watcher thread to exit. You've overloaded the meaning of the event object and are racing the i/o system, which plans to use the event for its own purposes. Suppose the watcher thread is waiting in GetOverlappedResult. If you set the event, then the other fields of the OVERLAPPED struct are in an indeterminate state. I don't see how the watcher thread can distinguish your exit signal from a normal completion. In another case, suppose you set the event when the watcher thread has just completed a GetOverlappedResult but hasn't yet started the next ReadDirectoryChangesW. The watcher thread won't notice the signal. It'll just loop around and the next ReadDirectoryChangesW will reset it. This will hang. This is one reason why I suggested FindFirstChange and WaitForMultipleObjects. It lets you have distinct event objects for distinct events. It also has the benefit that you wouldn't need the overlapped IO and you possibly could combine the watcher and handler functions into a single thread.
clang/unittests/DirectoryWatcher/CMakeLists.txt
1	I'm not a CMake expert, but I'm curious why `MATCHES "Linux"` but `STREQUAL Windows`.
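On the `reserve` versus `resize` point raised for line 92, together with the earlier remarks about the return value of `GetFinalPathNameByHandleW`, here is a portable sketch of the two-call pattern. `fakeQueryPath` is a hypothetical stand-in that only mimics the return convention described in this review (required size including the terminator when the buffer is too small, length excluding the terminator on success, zero on other failures). It is not the Windows API, and `readPath` is likewise made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in mimicking the return convention discussed above:
//  - buffer too small: returns the required size INCLUDING the terminator;
//  - success: returns the length written EXCLUDING the terminator;
//  - any other failure (modelled here as an empty path): returns 0.
std::size_t fakeQueryPath(const std::wstring &Real, wchar_t *Buf,
                          std::size_t Size) {
  if (Real.empty())
    return 0;
  if (Size < Real.size() + 1)
    return Real.size() + 1;
  Real.copy(Buf, Real.size());
  Buf[Real.size()] = L'\0';
  return Real.size();
}

// Two-call pattern: ask for the size, resize (not reserve) so the elements
// actually exist, fill the buffer, then validate the second return value.
bool readPath(const std::wstring &Real, std::wstring &Out) {
  std::size_t Needed = fakeQueryPath(Real, nullptr, 0);
  if (Needed == 0)
    return false;

  std::vector<wchar_t> Buffer;
  Buffer.resize(Needed); // size() == Needed, so data()[0..Needed) is writable.

  std::size_t Written = fakeQueryPath(Real, Buffer.data(), Buffer.size());
  if (Written == 0 || Written >= Buffer.size())
    return false; // Failed, or the path grew between the two calls.

  Out.assign(Buffer.data(), Written);
  return true;
}
```

With `reserve`, `size()` would still be zero and writing through `data()` would touch elements that do not exist yet, which is the undefined behavior the reviewer points out.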

This revision now requires changes to proceed. Oct 6 2020, 2:10 PM

The reason that I added the overlapped event was specifically for the cancellation and overlooked the fact that it will be used by the kernel side as well, thanks for catching that!

clang/unittests/DirectoryWatcher/CMakeLists.txt
1	`MATCHES` is a regular expression, and overly expensive. The `STREQUAL` is a string comparison (ala `strcmp`).

compnerd marked 2 inline comments as done. Oct 8 2020, 3:14 PM

compnerd added inline comments.

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
82	Sure, moving the thread procs into a member function is reasonable. Making that request would've been more clear :)

compnerd marked 3 inline comments as done. Oct 8 2020, 4:42 PM

compnerd added inline comments.

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
34	I think that the class size point is what is more convincing to me for switching to a `vector`.
92	Didn't know that about the `reserve`! Right, that is why most of this is ignoring failures. The watch will fail and terminate everything.

address feedback

LGTM. Thanks for extending this functionality to Windows!

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
14	I don't see a reason to include `<atomic>` here.
76	I like the name change from HandlerThread to NotifierThread. Thanks!

This revision is now accepted and ready to land. Oct 9 2020, 8:57 AM

compnerd marked an inline comment as done. Oct 9 2020, 1:26 PM

compnerd added inline comments.

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp
14	Left-overs :-(

Remove unnecessary include, fix an incorrect wait (verified that unit tests now pass!)

Herald added a subscriber: jfb. Oct 9 2020, 1:28 PM

This revision was landed with ongoing or failed builds. Oct 9 2020, 1:56 PM

Closed by commit rG5d74c4351175: DirectoryWatcher: add an implementation for Windows (authored by compnerd).

This revision was automatically updated to reflect the committed changes.

compnerd added a commit: rG5d74c4351175: DirectoryWatcher: add an implementation for Windows.

Harbormaster completed remote builds in B74653: Diff 297324. Oct 9 2020, 2:08 PM

hans mentioned this in rGbddef54c5028: Raise the timeout in DirectoryWatcherTest to 10 s. Oct 13 2020, 5:25 AM

We're seeing the tests for this fail in Chromium's Clang builds: https://bugs.chromium.org/p/chromium/issues/detail?id=1137737

I'll try increasing the test's timeout for now and see if that helps.

The longer timeout didn't help :(

I'm not sure what's different about the machine where this is failing. Maybe it's some filesystem issue due to being a VM?

Any ideas for good printfs or similar that could be added to figure out exactly what part is failing?

If I had to guess, my money would be on a deadlock. To unblock, I'd
propose reverting this patch until we can figure it out.

During the review, a deadlock was fixed related to the watcher thread, but
perhaps we missed one for the notifier thread.

I've also been seeing some failures on phab reviews, e.g. https://reviews.llvm.org/D89188.

rnk added a reverting change: rG0ec1cf13f2a4: Revert "DirectoryWatcher: add an implementation for Windows". Oct 13 2020, 12:35 PM

This patch was reverted a while back because a couple DirectoryWatcher tests routinely timed out on build bots even though they work when run in isolation.

I believe the problem is that, on a machine busy doing a build, the startup of the watcher thread is delayed (either because of the scheduler algorithm or the thread pool policy). Thus the "initial scan" on the test thread can complete _before_ the watcher thread has called ReadDirectoryChangesW. This leaves a window of time where the test can change a file that the watcher thread will miss.

To test this hypothesis, I took this patch and created one more event called WatcherThreadReady. I have the watcher thread set this event after successfully calling ReadDirectoryChangesW. In the constructor, I changed:

if (WaitForInitialSync)
  InitialScan();

to:

if (WaitForInitialSync) {
  ::WaitForSingleObject(WatcherThreadReady, 10000);
  InitialScan();
}

This is crude, but it seems to be effective. The tests pass reliably for me when my machine is fully loaded. I didn't use an INFINITE timeout because possibly missing a file change seems less bad than hanging forever. I didn't even bother to check the result from the wait because there's nothing sane to do besides "best effort" if something goes wrong. I used a Windows event because those are most familiar to me and it's Windows-only code. But it certainly could be done with another type of synchronization object.

There may be more elegant ways to solve this, but something like this directly addresses the root cause with fairly minimal changes.

I wonder if the Linux and Mac implementations might suffer from a similar window but the bug is rare because of differences in thread scheduler.

I was able to play around with this further yesterday evening. You are correct - the issue is the load preventing the watcher thread from spinning up. I was able to reproduce this issue and resolve it by adding a synchronization point (boolean + mutex + condition variable) before returning the directory watcher, to ensure that the ReadDirectoryChangesW call had been set up. I looked through all the previous failure cases as well as the one that I was finally able to reproduce - it is always the initial notification that we missed (because the thread took too long to come up). In the process I made a few minor alterations as well. I would like to get additional testing over the weekend on the bots so people aren't impacted if there turns out to be some other subtle threading issue. However, running this in a tight loop locally seemed to be pretty stable (switching the builds to an SSD locally did help uncover the flakiness), so I am confident that this should be safe. I'm happy to address any additional comments in post-commit review.
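The synchronization point described here (boolean + mutex + condition variable) can be sketched in portable C++ roughly as follows. The follow-up patch itself is not shown in this thread, so this is an assumption about its general shape. `ReadyLatch`, `startWatcher`, the simulated startup delay, and the 10 second timeout (borrowed from the earlier `WaitForSingleObject` experiment) are all illustrative.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// One-shot "ready" latch built from a boolean, a mutex, and a condition
// variable.
class ReadyLatch {
  std::mutex M;
  std::condition_variable CV;
  bool Ready = false;

public:
  void signal() {
    {
      std::lock_guard<std::mutex> L(M);
      Ready = true;
    }
    CV.notify_all();
  }

  // Returns true if signaled within Timeout, false if the wait timed out.
  bool waitFor(std::chrono::milliseconds Timeout) {
    std::unique_lock<std::mutex> L(M);
    return CV.wait_for(L, Timeout, [this] { return Ready; });
  }
};

// Hypothetical model of the startup sequence: start the watcher thread, wait
// until it reports that watching is armed, and only then run the initial
// scan. Returns true if the scan ran after the watcher was armed.
bool startWatcher(std::chrono::milliseconds StartupDelay) {
  ReadyLatch WatcherReady;
  bool Armed = false; // Written by the watcher thread before signal().
  bool ScanSawArmed = false;

  std::thread Watcher([&] {
    std::this_thread::sleep_for(StartupDelay); // Simulates a loaded machine.
    Armed = true; // Stand-in for a successful ReadDirectoryChangesW call.
    WatcherReady.signal();
  });

  if (WatcherReady.waitFor(std::chrono::seconds(10)))
    ScanSawArmed = Armed; // Stand-in for InitialScan().

  Watcher.join();
  return ScanSawArmed;
}
```

The bounded wait reflects the trade-off mentioned earlier in the thread: missing a change is treated as less harmful than hanging forever.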

We still occasionally (every couple of runs) see these tests hang on Windows in both Debug and Release. Unfortunately, I don't have access to the machines running the tests to debug the tests while they are hanging and I haven't had a chance to try to reproduce locally.

Interesting, are the logs from the runs available? I have run the test ~10000 times locally and it's been stable. Perhaps the logs can show what is going on.

In D88666#2825300, @compnerd wrote:

Interesting, are the logs from the runs available? I have run the test ~10000 times locally and it's been stable. Perhaps the logs can show what is going on.

Unfortunately, no logs :(. I can try to repro locally later today if I have time.

This is the best I can do from the online builds. I'll try and repro locally as well:

FAIL: Clang-Unit :: DirectoryWatcher/./DirectoryWatcherTests.exe/DirectoryWatcherTest.InitialScanAsync (75980 of 75980)
******************** TEST 'Clang-Unit :: DirectoryWatcher/./DirectoryWatcherTests.exe/DirectoryWatcherTest.InitialScanAsync' FAILED ********************
Script:
--
D:\a\_work\1\b\llvm\Debug\tools\clang\unittests\DirectoryWatcher\.\DirectoryWatcherTests.exe --gtest_filter=DirectoryWatcherTest.InitialScanAsync
--
Note: Google Test filter = DirectoryWatcherTest.InitialScanAsync

[==========] Running 1 test from 1 test suite.

[----------] Global test environment set-up.

[----------] 1 test from DirectoryWatcherTest

[ RUN      ] DirectoryWatcherTest.InitialScanAsync


********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. Terminate batch job (Y/N)? 
  interrupted by user, skipping remaining tests

One thing we've run into in the past is that running some of these tests on a drive other than the OS drive can cause weird failures. I am not sure if that may be the case here as well.

In D88666#2825557, @stella.stamenova wrote:

One thing we've run into in the past is that running some of these tests on a drive other than the OS drive can cause weird failures. I am not sure if that may be the case here as well.

As long as it's a local drive, that _shouldn't_ be a problem. I always run tests on a different drive than the OS system drive.

If it's a network drive, then, yeah, that would likely be a problem. If I recall correctly, ReadDirectoryChangesW has substantial limitations when pointed at a remote drive. The implementation should probably check that and signal an "unsupported" error.

Also note that Stella's sample log looks slightly different than the failures we were reproducing. It's almost as if the initial scan never finished. I haven't looked at that code, but I wonder if the file iteration is stuck in some kind of loop due to links or mount points or something.

I wasn't able to reproduce this locally by running *just* the DirectoryWatcher tests. I'm now running all of the clang tests repeatedly to see if I can get a repro that way.

The online tests appear to always hang either in the InitialScanAsync or InvalidatedWatcherAsync based on the log from the failures. We are using a non-OS drive for the tests, but it is not a network drive. The hang is also very consistent in our online testing - every couple of runs, sometimes more often. I suspect one of the reasons I cannot reproduce it locally with ease is that the test machines are able to run more tests in parallel and faster.

[==========] Running 1 test from 1 test suite.

[----------] Global test environment set-up.

[----------] 1 test from DirectoryWatcherTest

[ RUN      ] DirectoryWatcherTest.InvalidatedWatcherAsync


********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. Terminate batch job (Y/N)? 
  interrupted by user, skipping remaining tests

thakis mentioned this in rGfb32de9e97af: Re-Revert "DirectoryWatcher: add an implementation for Windows". Jun 18 2021, 3:52 PM

We also see check-all timeout recently (fairly consistently), see https://bugs.chromium.org/p/chromium/issues/detail?id=1221702

Since Stella reported problems with this too, I speculatively reverted it (and follow-ups) in fb32de9e97af0921242a021e30020ffacf7aa6e2 for now.

In D88666#2828306, @thakis wrote:

We also see check-all timeout recently (fairly consistently), see https://bugs.chromium.org/p/chromium/issues/detail?id=1221702

Since Stella reported problems with this too, I speculatively reverted it (and follow-ups) in fb32de9e97af0921242a021e30020ffacf7aa6e2 for now.

Thanks!

I'm still trying to reproduce locally by running bigger and bigger sets of tests. check-clang doesn't appear to reproduce the issue for me; all of our online tests run check-all, so I am trying that next.

FWIW https://lab.llvm.org/buildbot/#/builders/123 has been red for several days after this landed too (eg https://lab.llvm.org/buildbot/#/builders/123/builds/4545)

Revision Contents

Path                                                              Size
clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp   278 lines
clang/unittests/DirectoryWatcher/CMakeLists.txt                   2 lines

Diff 297333

clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp

	//===- DirectoryWatcher-windows.cpp - Windows-platform directory watching -===//			//===- DirectoryWatcher-windows.cpp - Windows-platform directory watching -===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	// TODO: This is not yet an implementation, but it will make it so Windows
	// builds don't fail.

	#include "DirectoryScanner.h"			#include "DirectoryScanner.h"
	#include "clang/DirectoryWatcher/DirectoryWatcher.h"			#include "clang/DirectoryWatcher/DirectoryWatcher.h"

	#include "llvm/ADT/STLExtras.h"			#include "llvm/ADT/STLExtras.h"
	#include "llvm/ADT/ScopeExit.h"			#include "llvm/Support/ConvertUTF.h"
	#include "llvm/Support/AlignOf.h"
	#include "llvm/Support/Errno.h"
	#include "llvm/Support/Mutex.h"
	#include "llvm/Support/Path.h"			#include "llvm/Support/Path.h"
	#include <atomic>			#include "llvm/Support/Windows/WindowsSupport.h"
				amccarthUnsubmitted Done Reply Inline Actions I don't see a reason to include `<atomic>` here. amccarth: I don't see a reason to include `<atomic>` here.
				compnerdAuthorUnsubmitted Done Reply Inline Actions Left-overs :-( compnerd: Left-overs :-(
	#include <condition_variable>			#include <condition_variable>
	#include <mutex>			#include <mutex>
	#include <queue>			#include <queue>
	#include <string>			#include <string>
	#include <thread>			#include <thread>
	#include <vector>			#include <vector>

				aaron.ballmanUnsubmitted Done Reply Inline Actions You should include `llvm/Support/Windows/WindowsSupport.h` not `Windows.h` directly. aaron.ballman: You should include `llvm/Support/Windows/WindowsSupport.h` not `Windows.h` directly.
	namespace {			namespace {

				using DirectoryWatcherCallback =
				std::function<void(llvm::ArrayRef<clang::DirectoryWatcher::Event>, bool)>;

	using namespace llvm;			using namespace llvm;
	using namespace clang;			using namespace clang;

	class DirectoryWatcherWindows : public clang::DirectoryWatcher {			class DirectoryWatcherWindows : public clang::DirectoryWatcher {
				OVERLAPPED Overlapped;
				amccarthUnsubmitted Done Reply Inline Actions The `ovlIO` name isn't consistent with LLVM style. amccarth: The `ovlIO` name isn't consistent with LLVM style.

				std::vector<DWORD> Notifications;

				amccarthUnsubmitted Not Done Reply Inline Actions If it were me, I'd probably make this a `std::vector`. If an off-by-one bug causes an overrun of one WCHAR, you could trash a crucial member variable. On the heap, the damage is less likely to be catastrophic. You wouldn't need `alignas`. I don't think these are created in a tight loop, so the overhead doesn't concern me. Also, I'd probably go with a slightly more descriptive name, like `Notifications` rather than `Buffer`. amccarth: If it were me, I'd probably make this a `std::vector`. * If an off-by-one bug causes an…
				compnerdAuthorUnsubmitted Done Reply Inline Actions The `alignas` is because the documentation states that the buffer should be DWORD aligned. It is more for pedantic reasons rather than anything else. I think that making it a catastrophic failure is a good thing though - it would catch the error :) You are correct about the allocation - it is once per watch. I'll rename it at least. compnerd: The `alignas` is because the documentation states that the buffer should be DWORD aligned. It…
				amccarthUnsubmitted Not Done Reply Inline Actions But it's still an arbitrarily-sized buffer in the middle of a class definition. If you change your mind about how big to make it, it changes the definition of the class. The buffer is going to be accessed using pointer arithmetic, which is generally dangerous. Moving the buffer out of the class avoids both of those problems. The alignas here is _not_ pedantic, it's essential. Without it, you could easily have an alignment problem. But if you used a vector, you'd know that it would always be suitably aligned. amccarth: But it's still an arbitrarily-sized buffer in the middle of a class definition. If you change…
				compnerdAuthorUnsubmitted Done Reply Inline Actions I think that the class size point is what is more convincing to me for switching to a `vector`. compnerd: I think that the class size point is what is more convincing to me for switching to a `vector`.
				std::thread WatcherThread;
				std::thread HandlerThread;
				std::function<void(ArrayRef<DirectoryWatcher::Event>, bool)> Callback;
				SmallString<MAX_PATH> Path;
				HANDLE Terminate;

				class EventQueue {
				std::mutex M;
				std::queue<DirectoryWatcher::Event> Q;
				std::condition_variable CV;

				public:
				void emplace(DirectoryWatcher::Event::EventKind Kind, StringRef Path) {
				{
				std::unique_lock<std::mutex> L(M);
				Q.emplace(Kind, Path);
				}
				CV.notify_one();
				}

				DirectoryWatcher::Event pop_front() {
				std::unique_lock<std::mutex> L(M);
				while (true) {
				if (!Q.empty()) {
				DirectoryWatcher::Event E = Q.front();
				Q.pop();
				return E;
				}
				CV.wait(L, [this]() { return !Q.empty(); });
				}
				}
				} Q;

	public:			public:
	~DirectoryWatcherWindows() override { }			DirectoryWatcherWindows(HANDLE DirectoryHandle, bool WaitForInitialSync,
	void InitialScan() { }			DirectoryWatcherCallback Receiver);
	void EventReceivingLoop() { }
	void StopWork() { }			~DirectoryWatcherWindows() override;

				void InitialScan();
				void WatcherThreadProc(HANDLE DirectoryHandle);
				void NotifierThreadProc(bool WaitForInitialSync);
				aaron.ballmanUnsubmitted Done Reply Inline Actions `hDirectory` -> `Directory` aaron.ballman: `hDirectory` -> `Directory`
				amccarthUnsubmitted Done Reply Inline Actions I like the name change from HandlerThread to NotifierThread. Thanks! amccarth: I like the name change from HandlerThread to NotifierThread. Thanks!
	};			};

				DirectoryWatcherWindows::DirectoryWatcherWindows(
				HANDLE DirectoryHandle, bool WaitForInitialSync,
				aaron.ballmanUnsubmitted Done Reply Inline Actions `hDirectory` -> `Directory` aaron.ballman: `hDirectory` -> `Directory`
				DirectoryWatcherCallback Receiver)
				: Callback(Receiver), Terminate(INVALID_HANDLE_VALUE) {
				amccarthUnsubmitted Not Done Reply Inline Actions There's a lot going on in this constructor. Is this how the other implementations are arranged? Would it make sense to just initialize the object, and save most of the actual work to a `Watch` method? amccarth: There's a lot going on in this constructor. Is this how the other implementations are arranged?
				compnerdAuthorUnsubmitted Done Reply Inline Actions Largely the same. However, the majority of the "work" is actually the thread proc for the two threads. compnerd: Largely the same. However, the majority of the "work" is actually the thread proc for the two…
				amccarthUnsubmitted Done Reply Inline Actions Let me put it another way. Constructors cannot return errors and LLVM does use exceptions, so things that can fail generally shouldn't be in the constructor. This code is accessing the file system, creating and event, and spawning two threads. Any of those things can fail, but you've got no way to let the caller know whether something went wrong. If the lambdas were really short, then it would be easy to see that they're thread procs. But they're not, so they're hard to find and understand. If they were private member functions with descriptive names, the code would be easier to understand. amccarth: Let me put it another way. Constructors cannot return errors and LLVM does use exceptions, so…
				compnerdAuthorUnsubmitted Done Reply Inline Actions Sure, moving the thread procs into a member function is reasonable. Making that request would've been more clear :) compnerd: Sure, moving the thread procs into a member function is reasonable. Making that request…
				// Pre-compute the real location as we will be handing over the directory
				// handle to the watcher and performing synchronous operations.
				{
				DWORD Length = GetFinalPathNameByHandleW(DirectoryHandle, NULL, 0, 0);
				aaron.ballmanUnsubmitted Done Reply Inline Actions You should strip the Hungarian notation prefixes and ensure all the identifiers meet our usual naming rules, I'll stop bringing them up. aaron.ballman: You should strip the Hungarian notation prefixes and ensure all the identifiers meet our usual…

				aaron.ballman: Is a smart pointer required here or could you use `std::vector<WCHAR>` and reserve the space that way?
				compnerd: Sure, I can convert this to a `std::vector<WCHAR>` instead.
				amccarth:
				* I guess it's fine to use the array form of `std::unique_ptr` (but then you should `#include <memory>`). If it were me, I'd probably just use a `std::wstring` or `std::vector<WCHAR>`.
				* `dwLength` already includes the size of the null terminator. Your first `GetFinalPathNameByHandleW` call "fails" because the buffer is too small. The docs say that, if it fails because the buffer is too small, then the return value is the required size _including_ the null terminator. (In the success case, it's the size without the terminator.)
				* I know this is the Windows-specific implementation, but it might be best to just use the Support API `realPathFromHandle`, which does this and has tests.
				compnerd: I didn't know about `realPathFromHandle` - I prefer that actually.
				compnerd: Actually, `realPathFromHandle` is private to `Path.cpp` :-(
				std::vector<WCHAR> Buffer;
				amccarth: I don't think you want to ignore the return value, since it'll tell you exactly how many characters you actually got back (or whether there was an error). Again, I recommend using `realPathFromHandle` from Support.
				Buffer.resize(Length);

				Length = GetFinalPathNameByHandleW(DirectoryHandle, Buffer.data(),
				Buffer.size(), 0);
				amccarth: Using `Buffer.data()` when you've only reserved space is undefined behavior. You should use `resize` instead of `reserve` and then pass the `size` rather than the `capacity`. Be aware that, while unlikely, this could still fail. The directory could have been removed or renamed between calls, or the caller could have passed a bad handle to begin with.
				compnerd: Didn't know that about the `reserve`! Right, that is why most of this is ignoring failures. The watch will fail and terminate everything.
				Buffer.resize(Length);

				amccarth: No real difference here, but, for consistency, please make this `CreateEventW` with the explicit -W suffix.
				llvm::sys::windows::UTF16ToUTF8(Buffer.data(), Buffer.size(), Path);
				plotfi: Can this be an assert with some message or some explanation of the hEvent return value? What happens if hEvent is non-zero on Release builds?
				compnerd: I debated this myself, which is why I added the `assert`. Honestly, if this fails, there is very little that can be done. This is creating an anonymous (unnamed) event. The failure here would be caused by out-of-memory conditions (you're dead anyways) or the system being completely out of resources (you're dead anyways). I don't know of any recovery in that situation. I really would prefer to avoid the two-phase construction which is going to be required if we want to handle that error scenario. The event only makes sense after we have the directory handle, so I suppose that we could set up the event prior to the construction of the watcher itself, but that is just as much two-phase construction as adding an `initialize` method.
				}

				Notifications.resize(4 * (sizeof(FILE_NOTIFY_INFORMATION) +
				MAX_PATH * sizeof(WCHAR)));

				memset(&Overlapped, 0, sizeof(Overlapped));
				aaron.ballman: You can drop the top-level `const` on value types.
				Overlapped.hEvent =
				CreateEventW(NULL, /*bManualReset=*/TRUE, /*bInitialState=*/FALSE, NULL);
				assert(Overlapped.hEvent && "unable to create event");

				Terminate = CreateEventW(NULL, /*bManualReset=*/TRUE,
				/*bInitialState=*/FALSE, NULL);

				WatcherThread = std::thread([this, DirectoryHandle]() {
				this->WatcherThreadProc(DirectoryHandle);
				});

				if (WaitForInitialSync)
				InitialScan();

				HandlerThread = std::thread([this, WaitForInitialSync]() {
				this->NotifierThreadProc(WaitForInitialSync);
				});
				}
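
The sizing discussion above (query the required length, `resize` rather than `reserve`, and check the second return value) can be sketched portably. This is only an illustration: `queryPath` is a hypothetical stand-in that mimics the return-value convention amccarth describes for `GetFinalPathNameByHandleW` (required size including the terminator when the buffer is too small, length without the terminator on success). It is not the Win32 call, and it does not model the real API's other failure modes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a "fill my buffer" API. Assumption: it follows the
// convention described in the review: too small => required size *including*
// the terminator; success => characters written *excluding* the terminator.
inline std::size_t queryPath(wchar_t *Out, std::size_t Capacity) {
  static const wchar_t Text[] = L"C:\\tmp";
  const std::size_t Len = sizeof(Text) / sizeof(Text[0]) - 1;
  if (Capacity < Len + 1)
    return Len + 1; // too small: report the required size, terminator included
  std::memcpy(Out, Text, (Len + 1) * sizeof(wchar_t));
  return Len; // success: length without the terminator
}

inline std::vector<wchar_t> readPath() {
  // First call only asks for the required size.
  std::size_t Required = queryPath(nullptr, 0);

  std::vector<wchar_t> Buffer;
  // resize() creates the elements, so data()/size() describe usable storage.
  // reserve() would only change capacity() and leave size() at zero.
  Buffer.resize(Required);

  // Second call: do not ignore the result. The answer may have changed
  // between the two calls, in which case the buffer is too small again.
  std::size_t Written = queryPath(Buffer.data(), Buffer.size());
  if (Written >= Buffer.size())
    return {};

  Buffer.resize(Written); // drop the terminator and any slack
  return Buffer;
}
```

With the stand-in above, `readPath()` returns the six characters of `C:\tmp` with no trailing terminator, which is the shape a UTF-16 to UTF-8 conversion taking an explicit length would want.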

				DirectoryWatcherWindows::~DirectoryWatcherWindows() {
				// Signal the Watcher to exit.
				SetEvent(Terminate);
				HandlerThread.join();
				WatcherThread.join();
				CloseHandle(Terminate);
				CloseHandle(Overlapped.hEvent);
				}
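
The review thread about shutdown boils down to keeping the "please exit" signal separate from the "I/O completed" signal, which is what the dedicated `Terminate` event does in this revision. A portable sketch of that idea follows, using a condition variable in place of Win32 events. The class name `TwoSignals` and its methods are hypothetical and exist only for illustration. The terminate check comes first to mirror how `WaitForMultipleObjects` reports the lowest signaled index, and the terminate flag stays set in the way a manual-reset event would.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

enum class Wake { Terminate, Work };

// Hypothetical helper: two distinct signals guarded by one mutex, so a
// shutdown request can never be mistaken for a completed operation.
class TwoSignals {
  std::mutex Mtx;
  std::condition_variable CV;
  bool TerminateRequested = false; // behaves like a manual-reset event
  bool WorkReady = false;          // consumed by the waiter

public:
  void signalTerminate() {
    {
      std::lock_guard<std::mutex> Lock(Mtx);
      TerminateRequested = true;
    }
    CV.notify_all();
  }

  void signalWork() {
    {
      std::lock_guard<std::mutex> Lock(Mtx);
      WorkReady = true;
    }
    CV.notify_all();
  }

  // Blocks until either signal is set. Terminate wins when both are set.
  Wake wait() {
    std::unique_lock<std::mutex> Lock(Mtx);
    CV.wait(Lock, [this] { return TerminateRequested || WorkReady; });
    if (TerminateRequested)
      return Wake::Terminate;
    WorkReady = false;
    return Wake::Work;
  }
};
```

Because the two flags are independent, a terminate request that arrives between two waits is still observed on the next `wait()`, which is the hang scenario amccarth raised against reusing the overlapped event.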

				void DirectoryWatcherWindows::InitialScan() {
				Callback(getAsFileEvents(scanDirectory(Path.data())), /*IsInitial=*/true);
				}

				void DirectoryWatcherWindows::WatcherThreadProc(HANDLE DirectoryHandle) {
				while (true) {
				// We do not guarantee subdirectories, but macOS already provides
				// subdirectories, might as well as ...
				BOOL WatchSubtree = TRUE;
				DWORD NotifyFilter = FILE_NOTIFY_CHANGE_FILE_NAME
				| FILE_NOTIFY_CHANGE_DIR_NAME
				| FILE_NOTIFY_CHANGE_SIZE
				| FILE_NOTIFY_CHANGE_LAST_ACCESS
				| FILE_NOTIFY_CHANGE_LAST_WRITE
				| FILE_NOTIFY_CHANGE_CREATION;

				DWORD BytesTransferred;
				if (!ReadDirectoryChangesW(DirectoryHandle, Notifications.data(),
				Notifications.size(), WatchSubtree,
				NotifyFilter, &BytesTransferred, &Overlapped,
				NULL)) {
				Q.emplace(DirectoryWatcher::Event::EventKind::WatcherGotInvalidated,
				"");
				break;
				}

				HANDLE Handles[2] = { Terminate, Overlapped.hEvent };
				switch (WaitForMultipleObjects(2, Handles, FALSE, INFINITE)) {
				case WAIT_OBJECT_0: // Terminate Request
				case WAIT_FAILED: // Failure
				Q.emplace(DirectoryWatcher::Event::EventKind::WatcherGotInvalidated,
				"");
				(void)CloseHandle(DirectoryHandle);
				return;
				case WAIT_TIMEOUT: // Spurious wakeup?
				continue;
				case WAIT_OBJECT_0 + 1: // Directory change
				break;
				}

				if (!GetOverlappedResult(DirectoryHandle, &Overlapped, &BytesTransferred,
				aaron.ballman: A newline above this line would be helpful for visual distinction.
				FALSE)) {
				Q.emplace(DirectoryWatcher::Event::EventKind::WatchedDirRemoved,
				"");
				Q.emplace(DirectoryWatcher::Event::EventKind::WatcherGotInvalidated,
				"");
				break;
				}

				// There was a buffer underrun on the kernel side. We may have lost
				// events, please re-synchronize.
				if (BytesTransferred == 0) {
				Q.emplace(DirectoryWatcher::Event::EventKind::WatcherGotInvalidated,
				"");
				break;
				}

				for (FILE_NOTIFY_INFORMATION *I =
				(FILE_NOTIFY_INFORMATION *)Notifications.data();
				I;
				amccarth: I think this leaks the `ovlIO.hEvent`. After you've joined both threads, make sure to call `::CloseHandle()`.
				I = I->NextEntryOffset
				? (FILE_NOTIFY_INFORMATION *)((CHAR *)I + I->NextEntryOffset)
				: NULL) {
				DirectoryWatcher::Event::EventKind Kind =
				DirectoryWatcher::Event::EventKind::WatcherGotInvalidated;
				switch (I->Action) {
				amccarth: I don't think this is a reliable way to get the watcher thread to exit. You've overloaded the meaning of the event object and are racing the I/O system, which plans to use the event for its own purposes. Suppose the watcher thread is waiting in GetOverlappedResult. If you set the event, then the other fields of the OVERLAPPED struct are in an indeterminate state. I don't see how the watcher thread can distinguish your exit signal from a normal completion. In another case, suppose you set the event when the watcher thread has just completed a GetOverlappedResult but hasn't yet started the next ReadDirectoryChangesW. The watcher thread won't notice the signal. It'll just loop around and the next ReadDirectoryChangesW will reset it. This will hang. This is one reason why I suggested FindFirstChange and WaitForMultipleObjects. It lets you have distinct event objects for distinct events. It also has the benefit that you wouldn't need the overlapped I/O and you possibly could combine the watcher and handler functions into a single thread.
				case FILE_ACTION_MODIFIED:
				Kind = DirectoryWatcher::Event::EventKind::Modified;
				break;
				case FILE_ACTION_ADDED:
				Kind = DirectoryWatcher::Event::EventKind::Modified;
				break;
				case FILE_ACTION_REMOVED:
				Kind = DirectoryWatcher::Event::EventKind::Removed;
				break;
				case FILE_ACTION_RENAMED_OLD_NAME:
				Kind = DirectoryWatcher::Event::EventKind::Removed;
				break;
				case FILE_ACTION_RENAMED_NEW_NAME:
				Kind = DirectoryWatcher::Event::EventKind::Modified;
				break;
				}

				SmallString<MAX_PATH> filename;
				sys::windows::UTF16ToUTF8(I->FileName, I->FileNameLength / 2,
				filename);
				Q.emplace(Kind, filename);
				}
				}

				(void)CloseHandle(DirectoryHandle);
				}
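
The `for` loop in `WatcherThreadProc` walks a chain of variable-length records, where each record stores the byte offset to the next one and an offset of zero marks the last. The following portable sketch shows that traversal on a byte buffer. `RecordHeader`, `packRecords`, and `walkRecords` are hypothetical names that only imitate the general shape of `FILE_NOTIFY_INFORMATION` (narrow names instead of UTF-16, for brevity). Headers are copied out with `memcpy` so the sketch does not depend on alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical header; the name bytes follow it directly in the buffer.
struct RecordHeader {
  std::uint32_t NextEntryOffset; // bytes to the next record, 0 for the last
  std::uint32_t Action;
  std::uint32_t NameLength; // in bytes
};

// Builds a buffer of chained records, each padded to a 4-byte boundary.
inline std::vector<unsigned char>
packRecords(const std::vector<std::pair<std::uint32_t, std::string>> &Items) {
  std::vector<unsigned char> Buf;
  for (std::size_t I = 0; I < Items.size(); ++I) {
    RecordHeader H;
    H.Action = Items[I].first;
    H.NameLength = static_cast<std::uint32_t>(Items[I].second.size());
    std::size_t Size = (sizeof(H) + H.NameLength + 3) & ~std::size_t(3);
    H.NextEntryOffset =
        (I + 1 == Items.size()) ? 0 : static_cast<std::uint32_t>(Size);
    std::size_t Start = Buf.size();
    Buf.resize(Start + Size); // zero-fills the padding
    std::memcpy(Buf.data() + Start, &H, sizeof(H));
    std::memcpy(Buf.data() + Start + sizeof(H), Items[I].second.data(),
                H.NameLength);
  }
  return Buf;
}

// Follows NextEntryOffset until it reads 0, collecting each record's name.
inline std::vector<std::string>
walkRecords(const std::vector<unsigned char> &Buf) {
  std::vector<std::string> Names;
  if (Buf.empty())
    return Names;
  std::size_t Offset = 0;
  while (true) {
    assert(Offset + sizeof(RecordHeader) <= Buf.size());
    RecordHeader H;
    std::memcpy(&H, Buf.data() + Offset, sizeof(H));
    Names.emplace_back(
        reinterpret_cast<const char *>(Buf.data() + Offset + sizeof(H)),
        H.NameLength);
    if (H.NextEntryOffset == 0)
      break;
    Offset += H.NextEntryOffset;
  }
  return Names;
}
```

Note that the name is taken with an explicit length rather than as a terminated string, which matches how the diff passes `I->FileNameLength / 2` to the conversion routine.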

				void DirectoryWatcherWindows::NotifierThreadProc(bool WaitForInitialSync) {
				// If we did not wait for the initial sync, then we should perform the
				// scan when we enter the thread.
				if (!WaitForInitialSync)
				this->InitialScan();

				while (true) {
				DirectoryWatcher::Event E = Q.pop_front();
				Callback(E, /*IsInitial=*/false);
				if (E.Kind == DirectoryWatcher::Event::EventKind::WatcherGotInvalidated)
				break;
				}
				}
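
The summary says the watcher and notifier threads "communicate via a locked queue", and `NotifierThreadProc` relies on `Q.pop_front()` blocking until an event is available. The queue type itself is outside this part of the diff, so the following is only a minimal sketch of what such a queue might look like in standard C++. `EventQueue`, `Event`, `Kind`, and `runProducerConsumer` are hypothetical names, not the ones in the patch. Unlike the patch, the consumer here stops before recording the sentinel event.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

enum class Kind { Modified, Removed, Invalidated };

struct Event {
  Kind K;
  std::string Name;
};

// Hypothetical blocking queue: emplace() never blocks, pop_front() waits.
class EventQueue {
  std::mutex Mtx;
  std::condition_variable CV;
  std::queue<Event> Events;

public:
  void emplace(Kind K, std::string Name) {
    {
      std::lock_guard<std::mutex> Lock(Mtx);
      Events.push(Event{K, std::move(Name)});
    }
    CV.notify_one();
  }

  Event pop_front() {
    std::unique_lock<std::mutex> Lock(Mtx);
    CV.wait(Lock, [this] { return !Events.empty(); });
    Event E = std::move(Events.front());
    Events.pop();
    return E;
  }
};

// Producer posts two events and then an invalidation sentinel; the consumer
// drains the queue in order and exits when it sees the sentinel.
inline std::vector<std::string> runProducerConsumer() {
  EventQueue Q;
  std::vector<std::string> Seen;
  std::thread Consumer([&] {
    while (true) {
      Event E = Q.pop_front();
      if (E.K == Kind::Invalidated)
        break;
      Seen.push_back(E.Name);
    }
  });
  Q.emplace(Kind::Modified, "a.txt");
  Q.emplace(Kind::Removed, "b.txt");
  Q.emplace(Kind::Invalidated, "");
  Consumer.join();
  return Seen;
}
```

Using an in-band sentinel to end the consumer keeps shutdown ordered with respect to the events already queued, which is the same role `WatcherGotInvalidated` plays in the notifier loop above.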

				auto error(DWORD ErrorCode) {
				DWORD Flags = FORMAT_MESSAGE_ALLOCATE_BUFFER
				| FORMAT_MESSAGE_FROM_SYSTEM
				| FORMAT_MESSAGE_IGNORE_INSERTS;

				LPSTR Buffer;
				if (!FormatMessageA(Flags, NULL, ErrorCode,
				MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPSTR)&Buffer,
				0, NULL)) {
				return make_error<llvm::StringError>("error " + utostr(ErrorCode),
				inconvertibleErrorCode());
				}
				std::string Message{Buffer};
				LocalFree(Buffer);
				return make_error<llvm::StringError>(Message, inconvertibleErrorCode());
				}

	} // namespace			} // namespace

	llvm::Expected<std::unique_ptr<DirectoryWatcher>>			llvm::Expected<std::unique_ptr<DirectoryWatcher>>
	clang::DirectoryWatcher::create(			clang::DirectoryWatcher::create(StringRef Path,
	StringRef Path,			DirectoryWatcherCallback Receiver,
	std::function<void(llvm::ArrayRef<DirectoryWatcher::Event>, bool)> Receiver,
	bool WaitForInitialSync) {			bool WaitForInitialSync) {
	return llvm::Expected<std::unique_ptr<DirectoryWatcher>>(			if (Path.empty())
	llvm::errorCodeToError(std::make_error_code(std::errc::not_supported)));			llvm::report_fatal_error(
				"DirectoryWatcher::create can not accept an empty Path.");

				if (!sys::fs::is_directory(Path))
				aaron.ballman: I think we should assert that -- calling this on a file isn't likely to behave in a good way.
				llvm::report_fatal_error(
				"DirectoryWatcher::create can not accept a filepath.");

				SmallVector<wchar_t, MAX_PATH> WidePath;
				if (sys::windows::UTF8ToUTF16(Path, WidePath))
				return llvm::make_error<llvm::StringError>(
				aaron.ballman: More top-level `const`s
				"unable to convert path to UTF-16", llvm::inconvertibleErrorCode());

				DWORD DesiredAccess = FILE_LIST_DIRECTORY;
				DWORD ShareMode = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE;
				DWORD CreationDisposition = OPEN_EXISTING;
				DWORD FlagsAndAttributes = FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED;

				HANDLE DirectoryHandle =
				CreateFileW(WidePath.data(), DesiredAccess, ShareMode,
				/*lpSecurityAttributes=*/NULL, CreationDisposition,
				FlagsAndAttributes, NULL);
				if (DirectoryHandle == INVALID_HANDLE_VALUE)
				return error(GetLastError());

				// NOTE: We use the watcher instance as a RAII object to discard the handles
				// for the directory and the IOCP in case of an error. Hence, this is early
				// allocated, with the state being written directly to the watcher.
				return std::make_unique<DirectoryWatcherWindows>(
				DirectoryHandle, WaitForInitialSync, Receiver);
	}			}

clang/unittests/DirectoryWatcher/CMakeLists.txt

	if(APPLE OR CMAKE_SYSTEM_NAME MATCHES "Linux")			if(APPLE OR CMAKE_SYSTEM_NAME MATCHES "Linux" OR CMAKE_SYSTEM_NAME STREQUAL Windows)
				amccarth: I'm not a CMake expert, but I'm curious why `MATCHES "Linux"` but `STREQUAL Windows`.
				compnerd: `MATCHES` is a regular expression, and overly expensive. The `STREQUAL` is a string comparison (ala `strcmp`).

	set(LLVM_LINK_COMPONENTS			set(LLVM_LINK_COMPONENTS
	Support			Support
	)			)

	add_clang_unittest(DirectoryWatcherTests			add_clang_unittest(DirectoryWatcherTests
	DirectoryWatcherTest.cpp			DirectoryWatcherTest.cpp
	)			)

Diff 297333