Page MenuHomePhabricator

[clang][DirectoryWatcher] Upstream DirectoryWatcher

Authored by jkorous on Feb 19 2019, 5:13 PM.



This patch contains implementation of DirectoryWatcher from

We are starting new push to upstream the index-while-building feature in clang and this is one of the dependencies.

Original author is David Farler, other contributors are Argyrios Kyrtzidis, Thomas Roughton and Alex Lorenz.

Part of this implementation was included in the review below so I am adding @tschuett and @yvvan in case they are interested.

Diff Detail


Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
gribozavr added inline comments.May 20 2019, 7:51 AM
42 ↗(On Diff #200123)


357 ↗(On Diff #200123)

I would strongly prefer if you used the gmock matchers (like Contains); as written, when the test fails, the only error we would get would be like "expected: true, actual: false".

jkorous marked 2 inline comments as done.May 20 2019, 12:56 PM

Thanks for taking a look @gribozavr!

I briefly scanned the rest of your comments and I agree with you (or don't really have a strong opinion) in all cases. I'll work on that today.

136 ↗(On Diff #200123)

The whole idea is that we can't block on read() if we ever want to stop watching the directory, release resources (file descriptors, threads) and correctly destruct the DirectoryWatcher instance either

  • because of a bug in some other thread in the implementation
  • or asynchronous client action (e. g. destructor being called) in main application thread

The timeout adds latency in those scenarios.

131 ↗(On Diff #200123)

That was exactly my thinking! Honestly, I don't know - I wasn't able to come up with any reasonably simple, deterministic approach even on a single platform :(
I eventually picked 0.1s as a tradeoff between slowing the test for everyone and getting less false positives.

The problem as I understand it is that we're making changes and monitoring them asynchronously with no guarantee from the kernel API (true for FSEvents, not 100% about inotify) about when (if) we receive notifications.

If you have any idea for robust testing approach I'd be totally happy to use it.

jkorous marked 36 inline comments as done.EditedMay 20 2019, 7:51 PM

I addressed most of the comments.

What is left:

  • rewrite tests with gmock matchers
  • write test for metadata of a file in watched dir being changed
95 ↗(On Diff #200123)

I reworded the comment.

123 ↗(On Diff #200123)

You're right - seems like static factory methods are the LLVM way.

46 ↗(On Diff #188808)

I used WatchedDirRemoved.

62 ↗(On Diff #188808)

I personally didn't like the C-like std::string& error in the interface. But I feel like having a polymorphic class (which doesn't really fit into ErrorOr) is lesser evil than the original pimpl design.

16 ↗(On Diff #200123)

Actually this is wrong but for a different reason - these are link dependencies of the implementation. Tests (or any other client) shouldn't care about it.

jkorous updated this revision to Diff 200389.May 20 2019, 7:52 PM
jkorous marked 4 inline comments as done.

Addressed comments.
Changed semantics of one of std::atomic<bool> in linux implementation.

gribozavr added inline comments.May 21 2019, 9:24 AM
111 ↗(On Diff #200389)

"alignof()" expression is standard C++ since C++11. (No need for underscores.)

114 ↗(On Diff #200389)

use 'auto' to store the return value of make_unique?

136 ↗(On Diff #200123)

Waking up 1000 times a second is not great -- it will lead to battery drain on laptops etc.

Please see for non-busy-wait options.

kadircet added inline comments.May 22 2019, 10:09 AM
48 ↗(On Diff #200389)

nit: get rid of the else

156 ↗(On Diff #200389)

naming, it should be capital P

211 ↗(On Diff #200389)

while (!Stop) ?

211 ↗(On Diff #200389)

why the loop always tries to empty the queue? what would be broken if we simply stopped if Stop was set?

Also if that is important current implementation doesn't seem to be achiving that, since some events can be pushed to the queue after while loop exits.

223 ↗(On Diff #200389)

maybe add a condition variable for queue state and waint on it?

244 ↗(On Diff #200389)

why not use two condition variables for stop and finishedinitscan?

jkorous marked 9 inline comments as done.May 22 2019, 12:50 PM

Thanks for taking a look Kadir!
After yesterday's discussion with Dmitri I removed all those busy waits. Seems like the code is not much more complex now. I am going to update the diff and off to fixing the tests.

211 ↗(On Diff #200389)

I changed the design - stopping condition for this loop is now finding WatcherGotInvalidated in the queue itself.

jkorous updated this revision to Diff 200810.May 22 2019, 12:50 PM
jkorous marked an inline comment as done.

Remove busy waits.

jkorous updated this revision to Diff 200811.May 22 2019, 12:52 PM
jkorous updated this revision to Diff 201114.May 23 2019, 6:43 PM

Reimplemented tests with std::futures which allowed to use more generous timeout while not slowing down the happy paths.

jkorous marked 6 inline comments as done.May 23 2019, 6:49 PM
jkorous added inline comments.
131 ↗(On Diff #200123)

I found a way how to use more generous timeout without slowing down the test for every run - inspired by Argyrios' idea about using different thread + semaphore.

I am using 3 seconds for now. If that's not enough, just let me know.

357 ↗(On Diff #200123)

Since the tests are using are based on something like "eventual correctness" instead of one-time check I didn't use gmock matchers but implemented some custom diagnostics.

Example of the failed test:

/Users/jankorous/src/llvm-monorepo/llvm-project/clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp:185: Failure
Value of: *TestConsumer.Result()
  Actual: false
Expected: true
Expected but not seen non-initial events:
Removed a
Unexpected non-initial events seen:
Added a
jkorous marked an inline comment as done.May 24 2019, 1:01 PM
jkorous updated this revision to Diff 201321.May 24 2019, 1:25 PM

Specify what "file modified" means and add a test for metadata change

jkorous updated this revision to Diff 201338.May 24 2019, 2:37 PM
  • simplify link libraries in cmake
  • fix Release build (messed-up asserts)
jkorous updated this revision to Diff 201744.May 28 2019, 12:44 PM

Remove DirectoryWatcher::Event::EventKind::Added

One more thing.

On macOS FSEvents are coalesced more or less at will and it became quite apparent when I was creating automatic tests - I was for example receiving coalesced events Added & Modified & Removed. We had a discussion about how to deal with this and it turned out that the existing client doesn't actually care whether a file was created or whether it was modified. I believe that by removing the EventKind::Added we still keep the interface sane while removing the ambiguity.

Very nice testing approach!

20 ↗(On Diff #201744)

Looks like triple slashes on empty lines got removed, splitting the doc comment.

40 ↗(On Diff #201744)

Three slashes for doc comments?

40 ↗(On Diff #201744)

Don't write what something is used for currently, such comments go out of date quickly. (Just delete the last sentence.)

41 ↗(On Diff #201744)


42 ↗(On Diff #201744)


Three slashes for doc comments.

49 ↗(On Diff #201744)

Extra semicolon.

52 ↗(On Diff #201744)

Since it closes the file descriptors in the destructor, I feel like it should also be responsible for calling pipe in the constructor.

58 ↗(On Diff #201744)

Three slashes for doc comments.

120 ↗(On Diff #201744)

"consumes", "pushes"

109 ↗(On Diff #201744)

Please add a period at the end of the comment (everywhere in the patch).

153 ↗(On Diff #201744)

Please name functions consistently -- there's both consume() that starts with lowercase, and Result() that starts with uppercase.

Please refer to the current naming rules in the style guide and apply everywhere in the patch.

210 ↗(On Diff #201744)

"If the following assertions fail, it is a sign that ..."

Also you can stream the message into the EXPECT_TRUE, it will be printed if the assertion fails.

EXPECT_TRUE(...) << "whatever";

224 ↗(On Diff #201744)

Test names start with an uppercase letter (InitialScanSync). Please apply everywhere in the patch.

243 ↗(On Diff #201744)

Add /*waitForInitialSync=*/ ? (everywhere in the patch)

304 ↗(On Diff #201744)

Delete the comma and wrap onto one line.

419 ↗(On Diff #201744)

80 columns.

jkorous marked 20 inline comments as done.May 31 2019, 12:19 PM
jkorous added inline comments.
40 ↗(On Diff #201744)

Sure, no problem.

I naturally tend to err on the side of over-commenting as I think I'd appreciate it myself if I had to understand the code without prior knowledge - not saying it's intentional or that I have a strong reason to do that though. You seem to have a strong preference for not having out-of-date comments with low-ish information value. Just out of curiosity - is it based on any particular reason or experience?

52 ↗(On Diff #201744)

I know what you mean - my "oop feel" was telling me it's wrong too. It's not just the pipe in this class but also the inotify descriptor in the watcher class.

The problem is that the only reasonable way how to communicate failures from constructors that I am aware of are exceptions which we don't use in llvm. That's why I moved most of the stuff that can fail even before any work is actually started to the factory method (with the exception of epoll file descriptor as that felt like making its scope unnecessarily larger).

Thinking about it now, I am starting to doubt that it makes life any easier for client code as it still has to cope with failure communicated as WatcherGotInvalidated event via receiver.

What do you think?

153 ↗(On Diff #201744)

Sorry about that. This is definitely my weak point.

jkorous updated this revision to Diff 202467.May 31 2019, 12:21 PM
jkorous marked 2 inline comments as done.

Addressed comments.

gribozavr accepted this revision.May 31 2019, 12:56 PM
gribozavr added inline comments.
40 ↗(On Diff #201744)

is it based on any particular reason or experience?

Yes, primarily working on the Swift compiler and the standard library, when everything was changing very quickly. My current work also confirms the same -- a lot of time when I see a comment about the "current usage" or other incidental information that is not the contract of the API, it tends to be outdated. Approximately nobody will change such a comment when another user is added ("I'm just reusing the code..."), even when the quoted user already became non-representative of the usage pattern, or the usage pattern has changed. However, when changing the API contract people typically do change the comment.

It also makes sense to me in abstract: reading that X happens to be used for Y does not necessarily help understand X better -- it is only a cross-reference that I could find myself with an IDE command; I still need to understand the design of Y and the interaction with X, and then using my past experience infer what X was intended to be.

Saying that X is intended to be used only by Y is a different story of course, that's design documentation.

Providing an example of usage is also fine, but it should be phrased as an example that can't become stale.

52 ↗(On Diff #201744)

You could add a factory function to SemaphorePipe, but... I feel like trying to recover from a failure in the pipe() call is a bit like trying to recover from a memory allocation failure. The process is probably hitting a file descriptor limit or something like that, and is likely going to fail anyway. I'd probably trigger a fatal error if pipe() fails.

This class is a lot like pthread_mutex_init -- it can fail "gracefully", but there's no way for the caller to recover -- the caller needs a mutex to proceed.

This revision is now accepted and ready to land.May 31 2019, 12:56 PM
gribozavr added inline comments.May 31 2019, 12:58 PM
317 ↗(On Diff #202467)

Use pipe2() with O_CLOEXEC, to avoid leaking the file descriptors to child processes?

jkorous marked 3 inline comments as done.Jun 3 2019, 5:56 PM

I fixed the rest.

There are still some questions you raised that I just responded to and kept them as not Done. Feel free to take a look. If nothing comes up I'll commit this on Wednesday.

40 ↗(On Diff #201744)

Thank you.

52 ↗(On Diff #201744)

I see what you mean and mostly agree. Anyways, let's allow clients to handle such funny moments themselves as much as they can.

I'll factor out the pipe call to the factory method.

317 ↗(On Diff #202467)

You're right, I didn't account for this. Added the flag also to inotify_init1() and epoll_create1() calls.

jkorous updated this revision to Diff 202833.Jun 3 2019, 5:58 PM

linux implementation

  • factory method for SemaphorePipe
  • *_CLOEXEC flags
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 9 2019, 3:44 PM
rnk added a subscriber: rnk.Jul 9 2019, 4:21 PM
rnk added inline comments.

This fails to compile on Windows because file_t is not int there:
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\tools\clang\unittests\DirectoryWatcher\DirectoryWatcherTest.cpp(415): error C2440: 'initializing': cannot convert from 'void *' to 'int'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\tools\clang\unittests\DirectoryWatcher\DirectoryWatcherTest.cpp(415): note: There is no context in which this conversion is possible

I have been working on migrating some code over to native file handles to make this type of error less likely in the future, but it is not done yet.

rnk added a comment.Jul 9 2019, 4:21 PM

Reverted in rC365581.

Thanks for the revert.

There's actually one more problem - seems like ninja doesn't like the generated file on Linux.

ninja: error: bad $-escape (literal $ must be written as $$)

ormris added a subscriber: ormris.Jul 9 2019, 5:11 PM

Thanks for the revert.

There's actually one more problem - seems like ninja doesn't like the generated file on Linux.

ninja: error: bad $-escape (literal $ must be written as $$)

We ran into this as well, and I was curious about what was going on, so I dug into it.

When you add a private dependency to static library target, as you're doing in this change with ${DIRECTORY_WATCHER_LINK_LIBS}, CMake adds the dependency to the target's INTERFACE_LINK_LIBRARIES property using the $<LINK_ONLY:...> generator expression. The code for clang-shlib iterates through all the clang libraries and uses generator expressions to gather their INTERFACE_LINK_LIBRARIES, but if those generator expressions themselves evaluate to generator expressions ($<LINK_ONLY:...> in this case), the second level of generator expressions won't get evaluated and will just end up in the ninja file directly, hence the complaint about the dollar. The clang-shlib code in question is

Here's a simple repro (where empty.c is literally an empty file). Running cmake -G Ninja on this and then running ninja should demonstrate the issue.

cmake_minimum_required(VERSION 3.4.3)
project(dollartest C)

add_library(a STATIC empty.c)
add_library(b_obj OBJECT empty.c)
add_library(b STATIC empty.c)
target_link_libraries(b PRIVATE a)

add_library(c SHARED empty.c)
target_link_libraries(c PRIVATE

@beanz, thoughts on how best to handle this?

plotfi added a subscriber: plotfi.Jul 31 2019, 2:21 PM

@jkorous DirectoryWatcherTests causes ninja check-clang to hang on my Ubuntu 18.04 computer. check-clang will not finish and I am forced to killall -9 DirectoryWatcherTests. My system has 40 threads and this repros on ext4 and btrfs.

jkorous added a comment.EditedJul 31 2019, 2:42 PM

@jkorous DirectoryWatcherTests causes ninja check-clang to hang on my Ubuntu 18.04 computer. check-clang will not finish and I am forced to killall -9 DirectoryWatcherTests. My system has 40 threads and this repros on ext4 and btrfs.

Hi @plotfi,

Could you please add more details?

The tests seem fine on CentOS 6.9 with Linux 4.19.34-77 on ext4 and also Ubuntu build bots (Ubuntu 18.04.1 LTS) seem fine.

Unfortunately I don't have any Ubuntu system at hand.



Hi Puyan,

I failed to reproduce with (5faa533e47b0e54b04166b0257c5ebb48e6ffcaa) on Ubuntu 18.04.1 LTS both in debug and release build.

Since it sounds like you can reproduce "reliably" - can you please share more info how to reproduce?