This is an archive of the discontinued LLVM Phabricator instance.

[clang][DirectoryWatcher] Upstream DirectoryWatcher
ClosedPublic

Authored by jkorous on Feb 19 2019, 5:13 PM.

Details

Summary

This patch contains implementation of DirectoryWatcher from github.com/apple/swift-clang

We are starting new push to upstream the index-while-building feature in clang and this is one of the dependencies.

Original author is David Farler, other contributors are Argyrios Kyrtzidis, Thomas Roughton and Alex Lorenz.

Part of this implementation was included in the review below so I am adding @tschuett and @yvvan in case they are interested.
https://reviews.llvm.org/D41407

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
  • fix the linux implementation
  • clang-format
jkorous updated this revision to Diff 188808.Feb 28 2019, 4:13 PM

Adding clangd folks in case they want to take a look.

ormris removed a subscriber: ormris.Mar 5 2019, 9:09 AM

Sorry for the delay. I will finish reviewing tomorrow.

clang/include/clang/DirectoryWatcher/DirectoryWatcher.h
10

This comment only repeats the doc comment for the class, I'd suggest to remove it here.

jkorous updated this revision to Diff 190744.Mar 14 2019, 3:36 PM

Remove duplicate comment

jkorous marked an inline comment as done.Mar 14 2019, 3:36 PM
thakis added a subscriber: thakis.Mar 14 2019, 3:56 PM

Why is this needed for index-while-building? My mental model for index-while-building is that that streams out build index metadata as part of the regular compile. Why does that require watching directories?

Why is this needed for index-while-building? My mental model for index-while-building is that that streams out build index metadata as part of the regular compile. Why does that require watching directories?

You're right that this isn't necessary for the indexing phase. But we also provide API so clients can consume the index. This functionality is used for getting notifications about index data changes.

You can see it for example here:
https://github.com/apple/swift-clang/blob/stable/lib/IndexDataStore/IndexDataStore.cpp#L111

Why is this needed for index-while-building? My mental model for index-while-building is that that streams out build index metadata as part of the regular compile. Why does that require watching directories?

You're right that this isn't necessary for the indexing phase. But we also provide API so clients can consume the index. This functionality is used for getting notifications about index data changes.

You can see it for example here:
https://github.com/apple/swift-clang/blob/stable/lib/IndexDataStore/IndexDataStore.cpp#L111

Is that code going to live in clang? This seems more like a tool built on top of the compiler rather than something core to the compiler itself (like the actual index-while-building feature). Maybe this could be in clang-tools-extra?

gribozavr requested changes to this revision.Mar 15 2019, 6:25 AM
gribozavr added inline comments.
clang/include/clang/DirectoryWatcher/DirectoryWatcher.h
28

I feel like since there's nothing Clang-specific about it, it should go into libSupport in LLVM, what do you think?

34

"A file being moved into ... and replacing ... "

44

"File content was modified." ? In other words, metadata was not modified?

47

DirectoryRemoved (for consistency with Removed)

Also, how about TopDirectoryRemoved? Otherwise it sounds like some random directory was removed.

52

Please document if it is going to be absolute or relative path, or just the file name.

56

Users are not going to use this typedef in their code, so I feel like it is obfuscating code -- could you expand it in places where it is used?

62

WaitForInitialSync (everywhere in the patch)

63

Why not return llvm::ErrorOr<std::unique_ptr<DirectoryWatcher>>?

I also don't understand why we need unique_ptr. If you change Implementation &Impl; to std::unique_ptr<Implementation> Impl;, DirectoryWatcher becomes a move-only type, and create can return llvm::ErrorOr<DirectoryWatcher>.

clang/lib/DirectoryWatcher/DirectoryWatcher-linux.inc.h
93 ↗(On Diff #190744)

Please use AlignedCharArray from llvm/include/llvm/Support/AlignOf.h.

93 ↗(On Diff #190744)

NAME_MAX is 255, add some for inotify_event, and multiply by 30... we get almost 8 Kb on the stack. Should we allocate on the heap instead?

137 ↗(On Diff #190744)

Return llvm::Error?

162 ↗(On Diff #190744)

Use a lambda instead of bind to improve readability? https://clang.llvm.org/extra/clang-tidy/checks/modernize-avoid-bind.html

175 ↗(On Diff #190744)

Same comments as for macOS: we need to buffer events from inotify until the initial scan completes.

1 ↗(On Diff #188808)

Please drop the .h from the name, for consistency with the rest of LLVM (there are no .inc.h files, only .inc).

clang/lib/DirectoryWatcher/DirectoryWatcher-mac.inc.h
47 ↗(On Diff #190744)

This function does not handle the kFSEventStreamEventFlagMustScanSubDirs flag. I think it should, otherwise the client would miss silently events. WDYT?

https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/UsingtheFSEventsFramework/UsingtheFSEventsFramework.html

If an event in a directory occurs at about the same time as one or more events in a subdirectory of that directory, the events may be coalesced into a single event. In this case, you will receive an event with the kFSEventStreamEventFlagMustScanSubDirs flag set. When you receive such an event, you must recursively rescan the path listed in the event. The additional changes are not necessarily in an immediate child of the listed path.

Also from the docs:

A path might be "/" if ether of these flags is set for the event: kFSEventStreamEventFlagUserDropped, kFSEventStreamEventFlagKernelDropped.

91 ↗(On Diff #190744)

I don't think it is guaranteed that any duplicate events will be received in the first call to this callback. This is another reason to buffer events during the initial scan.

99 ↗(On Diff #190744)

Events.push_back(DirectoryWatcher::Event{K, path});

148 ↗(On Diff #190744)

/*latency=*/0.0, and remove the latency variable.

155 ↗(On Diff #190744)

Please check the return code from FSEventStreamStart.

1 ↗(On Diff #188808)

Please drop the .h from the name, for consistency with the rest of LLVM (there are no .inc.h files, only .inc).

14 ↗(On Diff #188808)

No semicolon after "}".

19 ↗(On Diff #188808)

"Path", "Receiver" etc. (throughout the patch for all variables, please)

112 ↗(On Diff #188808)

No need to encode the type name in variable names:

initialScanPtr => InitialScan

118 ↗(On Diff #188808)

"cfPath" (no need to repeat that it is a string, it is in the type)

130 ↗(On Diff #188808)

Please don't repeat code in comments.

135 ↗(On Diff #188808)

Move this block into a separate function?

std::string Realpath(StringRef Path) { ... }
141 ↗(On Diff #188808)
context.info = new EventStreamContextData(
      std::move(realPath), std::move(receiver), std::move(initialScanPtr));

and remove the ctxData variable.

168 ↗(On Diff #188808)

Why not return llvm::Error?

179 ↗(On Diff #188808)

"StrongPath"? "OwnedPath"? We don't care that it is copied, we care that the variable owns the value (AKA has a strong reference to it).

185 ↗(On Diff #188808)

I think setupFSEventsSema can be eliminated if the call to setupFSEventStream() was moved above this dispatch_async. WDYT?

187 ↗(On Diff #188808)

The fs event stream is already started, and it might have sent some events. And now we're sending the initial scan. So, for example, the client could see a file remove event (an fs event) before they see the file add event (from the initial scan).

I think this race can be fixed by buffering the fs event stream until the initial scan is complete. What do you think?

clang/lib/DirectoryWatcher/DirectoryWatcher.cpp
9 ↗(On Diff #188808)

Please don't duplicate doc comments between headers and implementation files -- they will only become stale and confusing.

31 ↗(On Diff #188808)

Please don't repeat code in comments.

32 ↗(On Diff #188808)

This specialization should go into llvm/include/llvm/Support/FileSystem.h, otherwise we are going to run into ODR violations if another file in LLVM defines the same specialization.

59 ↗(On Diff #188808)

"Note: this class is not thread-safe." should be enough. However, all classes are not thread-safe by default, do we really need to highlight this fact?

60 ↗(On Diff #188808)

InitialDirectoryScanner for a more clear name, and then you don't need to explain in the comment what this struct is used for.

61 ↗(On Diff #188808)

FoundFileIDs

62 ↗(On Diff #188808)

Tuple is not readable. One has to read the implementation to understand what is stored in it. Please use a struct and name the fields.

62 ↗(On Diff #188808)

FoundFiles

68 ↗(On Diff #188808)

Since this enumeration is not recursive, I would assume that DirectoryWatcher is not recursive either? That would be very important to mention in its doc comments -- my default expectation is that such notifications should be recursive. Or even in the name, NonrecursiveDirectoryWatcher.

83 ↗(On Diff #188808)

I find it weird to say that a file that was already existing was added to the directory (according to the doc comment on EventKind::Added).

99 ↗(On Diff #188808)

The usual way to do this is not through CMake, but by using the predefined preprocessor macros -- __APPLE__, __linux__ etc.

#if defined(__APPLE__)
#include "DirectoryWatcher-mac.inc"
#endif

#if defined(__linux__)
#include "DirectoryWatcher-linux.inc"
#endif

See llvm/lib/Support/Unix/Path.inc for example.

I don't see an advantage of doing it through CMake. CMakeLists.txt does exactly the same OS check, and then checks if the include files exist. However, there's no real implementation selection happening. It is not like we would ever see CoreServices missing on macOS. And even if it is missing, there's no other implementation suitable for macOS. Same for Linux.

111 ↗(On Diff #188808)

unique_ptr for Implementation should work, as long as you keep the constructor and the destructor of DirectoryWatcher defined out of line in the .cpp file.

141 ↗(On Diff #188808)

Use llvm::make_unique.

clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp
44

Are these parallel arrays? I think it would be nicer if we had an array of structs instead.

46

"stats" is only used in the assertion, intentional?

112

Why not constructor?

263

Accidental return?

304

Google Test supports custom assertions. For example, see PrintedStmtMatches in clang/unittests/AST/StmtPrinterTest.cpp.

This revision now requires changes to proceed.Mar 15 2019, 6:25 AM

Why is this needed for index-while-building? My mental model for index-while-building is that that streams out build index metadata as part of the regular compile. Why does that require watching directories?

You're right that this isn't necessary for the indexing phase. But we also provide API so clients can consume the index. This functionality is used for getting notifications about index data changes.

You can see it for example here:
https://github.com/apple/swift-clang/blob/stable/lib/IndexDataStore/IndexDataStore.cpp#L111

Is that code going to live in clang? This seems more like a tool built on top of the compiler rather than something core to the compiler itself (like the actual index-while-building feature). Maybe this could be in clang-tools-extra?

It actually is part of the feature as the serialized format of the index isn't meant as a stable interface, that's what the API is for. DirectoryWatcher isn't a tool, it's just part of implementation of the IndexStore API.

Why is this needed for index-while-building? My mental model for index-while-building is that that streams out build index metadata as part of the regular compile. Why does that require watching directories?

You're right that this isn't necessary for the indexing phase. But we also provide API so clients can consume the index. This functionality is used for getting notifications about index data changes.

You can see it for example here:
https://github.com/apple/swift-clang/blob/stable/lib/IndexDataStore/IndexDataStore.cpp#L111

Is that code going to live in clang? This seems more like a tool built on top of the compiler rather than something core to the compiler itself (like the actual index-while-building feature). Maybe this could be in clang-tools-extra?

It actually is part of the feature as the serialized format of the index isn't meant as a stable interface, that's what the API is for. DirectoryWatcher isn't a tool, it's just part of implementation of the IndexStore API.

Maybe I misunderstand what the client of the IndexStore API is. That's not code that will be in the clang binary, right?

jkorous marked an inline comment as done.Mar 21 2019, 4:07 PM

Why is this needed for index-while-building? My mental model for index-while-building is that that streams out build index metadata as part of the regular compile. Why does that require watching directories?

You're right that this isn't necessary for the indexing phase. But we also provide API so clients can consume the index. This functionality is used for getting notifications about index data changes.

You can see it for example here:
https://github.com/apple/swift-clang/blob/stable/lib/IndexDataStore/IndexDataStore.cpp#L111

Is that code going to live in clang? This seems more like a tool built on top of the compiler rather than something core to the compiler itself (like the actual index-while-building feature). Maybe this could be in clang-tools-extra?

It actually is part of the feature as the serialized format of the index isn't meant as a stable interface, that's what the API is for. DirectoryWatcher isn't a tool, it's just part of implementation of the IndexStore API.

Maybe I misunderstand what the client of the IndexStore API is. That's not code that will be in the clang binary, right?

No, it won't.

Currently the client using this API is our indexing service.
https://github.com/apple/sourcekit-lsp
In the future clangd might become another client.

In theory we could have the index producing part in clang and the index consuming part (IndexStore) somewhere else (clang-tools-extra?) and use functionality that also lives somewhere else (llvm?) but reasons I'd rather not do it *NOW* are:

  1. We'd have to expose the interface between them (the binary file format) which has been just an implementation detail so far without any intention to keep it stable.
  2. From the general perspective - although I am upstreaming a fully developed feature (roughly 10kLOC) it is apparent that I am going to rewrite a significant part of the code based on the feedback from reviews. This patch is #2 out of approximately 10-15 patches total. Since it's probable that the design will change in upcoming reviews I'd prefer to discuss this kind of questions after a significant part of the whole design has been through the review.
  3. For any code that we would move up the tree (to llvm repo) I'd like to have a clear use-case other than index-while-building first. Designing generic APIs is hard/impossible without known specific use-cases (I think the recommended minimum is 3).

The most important word is "now". I am totally happy to discuss this and move parts somewhere else if it seems reasonable in the future.

Does that make sense?

clang/include/clang/DirectoryWatcher/DirectoryWatcher.h
28

This has been brought up before. I prefer to leave it here for now since it's not used anywhere else. I'd only move it to llvm/Support once we have another use-case as that would mean specific requirements for the interface.

jkorous updated this revision to Diff 199712.EditedMay 15 2019, 6:59 PM

A major clean-up.

changelog

general

  • specification, documentation, tests
  • returning only filenames instead of paths (or empty string when the event is related to the watched dir itself)
  • removed unsound event deduplication during/right after the initial scan
  • simplified how is OS-specific implementation selected

linux

  • properly terminating threads
  • added pthreads to fix shared libs build
  • handle IN_DELETE_SELF, IN_IGNORED
  • IN_EXCL_UNLINK

macos

  • simplified synchronization by removing semaphores
  • workarounds in FSEvents use

I am not entirely happy about my tests - since we're handling notifications from kernel asynchronously it's hard to write a deterministic test. I just use some 100 ms sleeps as a workaround. Happy to use something more robust if anyone has some ideas. Tests seem fine with msan & tsan on linux and tsan on macos but given their scope it doesn't mean that much.

jkorous updated this revision to Diff 200123.May 17 2019, 5:53 PM

fix link libraries in cmake

gribozavr added inline comments.May 20 2019, 7:51 AM
clang/include/clang/DirectoryWatcher/DirectoryWatcher.h
79

"its"

79

I'd say "unspecified". "Undefined behavior" has a specific meaning in C++, and I don't believe we have that.

Everywhere in the patch.

80

Please add blank lines between paragraphs (everywhere in the patch).

96

Don't repeat field names in comments.

96

"a relative path" -- relative to what?

96

Is it really a path to the directory?

104

"IsInitial"

104

Users are not going to use this typedef in their code, so I feel like it is obfuscating code -- could you expand it in places where it is used?

124

Make it a static function in the DirectoryWatcher?

clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
87

Don't write what something is used for (it can change), write what something is.

"If true, all async actions are requested to stop."

107

I think this is too much code in the initializer list. Could you move the body of the lambda into a method, and call it from this lambda?

111

NAME_MAX is 255, add some for inotify_event, and multiply by 30... we get almost 8 Kb on the stack. Should we allocate on the heap instead?

113

Please use "//" comments.

119

Please use AlignedCharArray from llvm/include/llvm/Support/AlignOf.h

137

What is the role of the timeout and why does it need to be so small?

211

I think this is too much code in the initializer list. Could you move the body of the lambda into a method, and call it from this lambda?

226

No need for the semicolon.

272

Use llvm::make_unique (and let unique_ptr do an implicit conversion to base).

clang/unittests/DirectoryWatcher/CMakeLists.txt
17

No need to check. macOS will always have this file. If it is not there, it is a big issue anyway.

21

... LINK_FLAGS "-framework CoreServices"

No need for an intermediate variable.

clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp
74

"ContainsInitialEvents"

81

"ContainsNonInitialEvents"

132

I'm certain this sleep will be flaky on heavily-loaded CI machines. If you are going to leave it as a sleep, please make it 1s. But is there really no better way?

gribozavr added inline comments.May 20 2019, 7:51 AM
clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp
43

ContainsEvents

358

I would strongly prefer if you used the gmock matchers (like Contains); as written, when the test fails, the only error we would get would be like "expected: true, actual: false".

jkorous marked 2 inline comments as done.May 20 2019, 12:56 PM

Thanks for taking a look @gribozavr!

I briefly scanned the rest of your comments and I agree with you (or don't really have a strong opinion) in all cases. I'll work on that today.

clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
137

The whole idea is that we can't block on read() if we ever want to stop watching the directory, release resources (file descriptors, threads) and correctly destruct the DirectoryWatcher instance either

  • because of a bug in some other thread in the implementation
  • or asynchronous client action (e. g. destructor being called) in main application thread

The timeout adds latency in those scenarios.

clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp
132

That was exactly my thinking! Honestly, I don't know - I wasn't able to come up with any reasonably simple, deterministic approach even on a single platform :(
I eventually picked 0.1s as a tradeoff between slowing the test for everyone and getting less false positives.

The problem as I understand it is that we're making changes and monitoring them asynchronously with no guarantee from the kernel API (true for FSEvents, not 100% about inotify) about when (if) we receive notifications.

If you have any idea for robust testing approach I'd be totally happy to use it.

jkorous marked 36 inline comments as done.EditedMay 20 2019, 7:51 PM

I addressed most of the comments.

What is left:

  • rewrite tests with gmock matchers
  • write test for metadata of a file in watched dir being changed
clang/include/clang/DirectoryWatcher/DirectoryWatcher.h
47

I used WatchedDirRemoved.

63

I personally didn't like the C-like std::string& error in the interface. But I feel like having a polymorphic class (which doesn't really fit into ErrorOr) is lesser evil than the original pimpl design.
WDYT?

96

I reworded the comment.

124

You're right - seems like static factory methods are the LLVM way.

clang/unittests/DirectoryWatcher/CMakeLists.txt
17

Actually this is wrong but for a different reason - these are link dependencies of the implementation. Tests (or any other client) shouldn't care about it.

jkorous updated this revision to Diff 200389.May 20 2019, 7:52 PM
jkorous marked 4 inline comments as done.

Addressed comments.
Changed semantics of one of std::atomic<bool> in linux implementation.

gribozavr added inline comments.May 21 2019, 9:24 AM
clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
112

"alignof()" expression is standard C++ since C++11. (No need for underscores.)

115

use 'auto' to store the return value of make_unique?

137

Waking up 1000 times a second is not great -- it will lead to battery drain on laptops etc.

Please see https://stackoverflow.com/questions/8593004/waiting-on-a-condition-pthread-cond-wait-and-a-socket-change-select-simultan for non-busy-wait options.

kadircet added inline comments.May 22 2019, 10:09 AM
clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
49

nit: get rid of the else

157

naming, it should be capital P

212

while (!Stop) ?

212

why the loop always tries to empty the queue? what would be broken if we simply stopped if Stop was set?

Also if that is important current implementation doesn't seem to be achiving that, since some events can be pushed to the queue after while loop exits.

224

maybe add a condition variable for queue state and waint on it?

245

why not use two condition variables for stop and finishedinitscan?

jkorous marked 9 inline comments as done.May 22 2019, 12:50 PM

Thanks for taking a look Kadir!
After yesterday's discussion with Dmitri I removed all those busy waits. Seems like the code is not much more complex now. I am going to update the diff and off to fixing the tests.

clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
212

I changed the design - stopping condition for this loop is now finding WatcherGotInvalidated in the queue itself.

jkorous updated this revision to Diff 200810.May 22 2019, 12:50 PM
jkorous marked an inline comment as done.

Remove busy waits.

jkorous updated this revision to Diff 200811.May 22 2019, 12:52 PM
jkorous updated this revision to Diff 201114.May 23 2019, 6:43 PM

Reimplemented tests with std::futures which allowed to use more generous timeout while not slowing down the happy paths.

jkorous marked 6 inline comments as done.May 23 2019, 6:49 PM
jkorous added inline comments.
clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp
132

I found a way how to use more generous timeout without slowing down the test for every run - inspired by Argyrios' idea about using different thread + semaphore.

I am using 3 seconds for now. If that's not enough, just let me know.

358

Since the tests are using are based on something like "eventual correctness" instead of one-time check I didn't use gmock matchers but implemented some custom diagnostics.

Example of the failed test:

/Users/jankorous/src/llvm-monorepo/llvm-project/clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp:185: Failure
Value of: *TestConsumer.Result()
  Actual: false
Expected: true
Expected but not seen non-initial events:
Removed a
Unexpected non-initial events seen:
Added a
jkorous marked an inline comment as done.May 24 2019, 1:01 PM
jkorous updated this revision to Diff 201321.May 24 2019, 1:25 PM

Specify what "file modified" means and add a test for metadata change

jkorous updated this revision to Diff 201338.May 24 2019, 2:37 PM
  • simplify link libraries in cmake
  • fix Release build (messed-up asserts)
jkorous updated this revision to Diff 201744.May 28 2019, 12:44 PM

Remove DirectoryWatcher::Event::EventKind::Added

One more thing.

On macOS FSEvents are coalesced more or less at will and it became quite apparent when I was creating automatic tests - I was for example receiving coalesced events Added & Modified & Removed. We had a discussion about how to deal with this and it turned out that the existing client doesn't actually care whether a file was created or whether it was modified. I believe that by removing the EventKind::Added we still keep the interface sane while removing the ambiguity.

Very nice testing approach!

clang/include/clang/DirectoryWatcher/DirectoryWatcher.h
21

Looks like triple slashes on empty lines got removed, splitting the doc comment.

clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
41

Three slashes for doc comments?

41

Don't write what something is used for currently, such comments go out of date quickly. (Just delete the last sentence.)

42

Semaphor*e*

43

"Expects"

Three slashes for doc comments.

50

Extra semicolon.

53

Since it closes the file descriptors in the destructor, I feel like it should also be responsible for calling pipe in the constructor.

59

Three slashes for doc comments.

121

"consumes", "pushes"

clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp
110

Please add a period at the end of the comment (everywhere in the patch).

154

Please name functions consistently -- there's both consume() that starts with lowercase, and Result() that starts with uppercase.

Please refer to the current naming rules in the style guide and apply everywhere in the patch.

211

"If the following assertions fail, it is a sign that ..."

Also you can stream the message into the EXPECT_TRUE, it will be printed if the assertion fails.

EXPECT_TRUE(...) << "whatever";

225

Test names start with an uppercase letter (InitialScanSync). Please apply everywhere in the patch.

244

Add /*waitForInitialSync=*/ ? (everywhere in the patch)

305

Delete the comma and wrap onto one line.

420

80 columns.

jkorous marked 20 inline comments as done.May 31 2019, 12:19 PM
jkorous added inline comments.
clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
41

Sure, no problem.

I naturally tend to err on the side of over-commenting as I think I'd appreciate it myself if I had to understand the code without prior knowledge - not saying it's intentional or that I have a strong reason to do that though. You seem to have a strong preference for not having out-of-date comments with low-ish information value. Just out of curiosity - is it based on any particular reason or experience?

53

I know what you mean - my "oop feel" was telling me it's wrong too. It's not just the pipe in this class but also the inotify descriptor in the watcher class.

The problem is that the only reasonable way how to communicate failures from constructors that I am aware of are exceptions which we don't use in llvm. That's why I moved most of the stuff that can fail even before any work is actually started to the factory method (with the exception of epoll file descriptor as that felt like making its scope unnecessarily larger).

Thinking about it now, I am starting to doubt that it makes life any easier for client code as it still has to cope with failure communicated as WatcherGotInvalidated event via receiver.

What do you think?

clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp
154

Sorry about that. This is definitely my weak point.

jkorous updated this revision to Diff 202467.May 31 2019, 12:21 PM
jkorous marked 2 inline comments as done.

Addressed comments.

gribozavr accepted this revision.May 31 2019, 12:56 PM
gribozavr added inline comments.
clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
41

is it based on any particular reason or experience?

Yes, primarily working on the Swift compiler and the standard library, when everything was changing very quickly. My current work also confirms the same -- a lot of time when I see a comment about the "current usage" or other incidental information that is not the contract of the API, it tends to be outdated. Approximately nobody will change such a comment when another user is added ("I'm just reusing the code..."), even when the quoted user already became non-representative of the usage pattern, or the usage pattern has changed. However, when changing the API contract people typically do change the comment.

It also makes sense to me in abstract: reading that X happens to be used for Y does not necessarily help understand X better -- it is only a cross-reference that I could find myself with an IDE command; I still need to understand the design of Y and the interaction with X, and then using my past experience infer what X was intended to be.

Saying that X is intended to be used only by Y is a different story of course, that's design documentation.

Providing an example of usage is also fine, but it should be phrased as an example that can't become stale.

53

You could add a factory function to SemaphorePipe, but... I feel like trying to recover from a failure in the pipe() call is a bit like trying to recover from a memory allocation failure. The process is probably hitting a file descriptor limit or something like that, and is likely going to fail anyway. I'd probably trigger a fatal error if pipe() fails.

This class is a lot like pthread_mutex_init -- it can fail "gracefully", but there's no way for the caller to recover -- the caller needs a mutex to proceed.

This revision is now accepted and ready to land.May 31 2019, 12:56 PM
gribozavr added inline comments.May 31 2019, 12:58 PM
clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
318

Use pipe2() with O_CLOEXEC, to avoid leaking the file descriptors to child processes?

jkorous marked 3 inline comments as done.Jun 3 2019, 5:56 PM

I fixed the rest.

There are still some questions you raised that I just responded to and kept them as not Done. Feel free to take a look. If nothing comes up I'll commit this on Wednesday.

clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
41

Thank you.

53

I see what you mean and mostly agree. Anyways, let's allow clients to handle such funny moments themselves as much as they can.

I'll factor out the pipe call to the factory method.

318

You're right, I didn't account for this. Added the flag also to inotify_init1() and epoll_create1() calls.

jkorous updated this revision to Diff 202833.Jun 3 2019, 5:58 PM

linux implementation

  • factory method for SemaphorePipe
  • *_CLOEXEC flags
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 9 2019, 3:44 PM
rnk added a subscriber: rnk.Jul 9 2019, 4:21 PM
rnk added inline comments.
cfe/trunk/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp
420 ↗(On Diff #208833)

This fails to compile on Windows because file_t is not int there:
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\tools\clang\unittests\DirectoryWatcher\DirectoryWatcherTest.cpp(415): error C2440: 'initializing': cannot convert from 'void *' to 'int'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\tools\clang\unittests\DirectoryWatcher\DirectoryWatcherTest.cpp(415): note: There is no context in which this conversion is possible

I have been working on migrating some code over to native file handles to make this type of error less likely in the future, but it is not done yet.

rnk added a comment.Jul 9 2019, 4:21 PM

Reverted in rC365581.

Thanks for the revert.

There's actually one more problem - seems like ninja doesn't like the generated build.ninja file on Linux.

ninja: error: build.ninja:52390: bad $-escape (literal $ must be written as $$)

http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/13617/steps/build-stage1-compiler/logs/stdio

ormris added a subscriber: ormris.Jul 9 2019, 5:11 PM

Thanks for the revert.

There's actually one more problem - seems like ninja doesn't like the generated build.ninja file on Linux.

ninja: error: build.ninja:52390: bad $-escape (literal $ must be written as $$)

http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/13617/steps/build-stage1-compiler/logs/stdio

We ran into this as well, and I was curious about what was going on, so I dug into it.

When you add a private dependency to static library target, as you're doing in this change with ${DIRECTORY_WATCHER_LINK_LIBS}, CMake adds the dependency to the target's INTERFACE_LINK_LIBRARIES property using the $<LINK_ONLY:...> generator expression. The code for clang-shlib iterates through all the clang libraries and uses generator expressions to gather their INTERFACE_LINK_LIBRARIES, but if those generator expressions themselves evaluate to generator expressions ($<LINK_ONLY:...> in this case), the second level of generator expressions won't get evaluated and will just end up in the ninja file directly, hence the complaint about the dollar. The clang-shlib code in question is https://github.com/llvm/llvm-project/blob/3837f4273fcc40cc519035479aefe78e5cbd3055/clang/tools/clang-shlib/CMakeLists.txt#L10.

Here's a simple repro (where empty.c is literally an empty file). Running cmake -G Ninja on this and then running ninja should demonstrate the issue.

cmake_minimum_required(VERSION 3.4.3)
project(dollartest C)

add_library(a STATIC empty.c)
add_library(b_obj OBJECT empty.c)
add_library(b STATIC empty.c)
target_link_libraries(b PRIVATE a)

add_library(c SHARED empty.c)
target_link_libraries(c PRIVATE
  b_object
  $<TARGET_PROPERTY:b,INTERFACE_LINK_LIBRARIES>
  $<TARGET_PROPERTY:b,LINK_LIBRARIES>
  )

@beanz, thoughts on how best to handle this?

plotfi added a subscriber: plotfi.Jul 31 2019, 2:21 PM

@jkorous DirectoryWatcherTests causes ninja check-clang to hang on my Ubuntu 18.04 computer. check-clang will not finish and I am forced to killall -9 DirectoryWatcherTests. My system has 40 threads and this repros on ext4 and btrfs.

jkorous added a comment.EditedJul 31 2019, 2:42 PM

@jkorous DirectoryWatcherTests causes ninja check-clang to hang on my Ubuntu 18.04 computer. check-clang will not finish and I am forced to killall -9 DirectoryWatcherTests. My system has 40 threads and this repros on ext4 and btrfs.

Hi @plotfi,

Could you please add more details?

The tests seem fine on CentOS 6.9 with Linux 4.19.34-77 on ext4 and also Ubuntu build bots (Ubuntu 18.04.1 LTS) seem fine.
http://lab.llvm.org:8011/buildslaves/ps4-buildslave1a

Unfortunately I don't have any Ubuntu system at hand.

Thanks.

Jan

Hi Puyan,

I failed to reproduce with llvm.org/master (5faa533e47b0e54b04166b0257c5ebb48e6ffcaa) on Ubuntu 18.04.1 LTS both in debug and release build.

Since it sounds like you can reproduce "reliably" - can you please share more info how to reproduce?