This change define RAII class, that facilitate using file locks.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | Perhaps this type should be boolean testable (for the error), or using a factory function to create one that returns llvm::Error (wrapped around the error_code) on failure: { Expected<FileLocker> F = TryLock(FD, Timeout); if (!F) { } ... } Hmm - if unlocking can fail, and that failure needs to be handled, this doesn't seem suitable for an RAII type device... |
This is a replacement of D78897.
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 |
The class has method locked, which can be used to check if lock was successfully acquired, and method error for those applications that need more information than just the fact of failure.
If the failure to unlock needs to be handled, the user would use method unlock provided by this class. In most expected use cases however this failure need not be handled. User defines RAII object and do things that require exclusive access. Upon destruction lock is released and if something goes wrong at that moment, this fact is just ignored. |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 |
Yeah, I think it'd be better if this was only accessible via a free function that returns Expected<FileLocker> so that the Error must be checked for/can't be accidentally ignored.
Fair enough/that sounds OK then. But it should return llvm::Error to make sure errors are handled/not ignored |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | Added function tryLock for using Expected<FileLocker> |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1194–1197 | This implicitly deletes the copy constructor, but it still leaves this class with a default copy-assignment operator. Defining a move assignment would make the class safer and more consistent. | |
1233 | I don't think there should be a default value for the timeout for two reasons:
| |
1234–1236 | If the intention is that people always use this function to lock files, then it would be better to remove the relevant FileLocker constructor and do the locking here. That would also simplify the class as it could maintain the invariant that it is created only if locking was successful and it wouldn't need to carry the error_code member. |
Updated patch
- Lock logic is moved to tryLock function,
- Changed implementation of FileLocker accordingly,
- Removed default agrument of tryLock.
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1194–1197 | Removed copy operations explicitly, added missing operator. | |
1233 | You are right, removed default argument. Timeout is a form of protection against deadlocks. For the particular task of writing to log file infinite wait time is also acceptable, but in other use cases (such as unit tests) timeouts may be convenient. | |
1234–1236 | Makes sense. Moved lock logic to to function tryLock. |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | Looking at the unit test I realized that fs::tryLockFile and fs::tryLock do quite different things, but that this difference is not obvious from the function names. The way I would deal with that is by making tryLock a static member function fs::FileLocker::tryLock (and then make the FileLocker constructor private), but I see that David has requested a free function. David, what are your thoughts on that? | |
1233 | I don't think unit tests are a good motivating example. There are plenty of ways a test can not terminate without locking files (or anything else). The solution to that is to have a (fairly high) timeout at the level of the test harness. That we already have. So if the unit tests are the main motivation for introducing these timeouts, I think we should remove them, as they add a fair amount of complexity (and flakyness) to the equation. Even for the clang use case, whether one prefers a deadlock over the program silently doing the wrong thing (perhaps when the system is under heavy load, like it is when running the test suite), depends on what exactly is he trying to achieve. But that's something I don't feel qualified to judge... | |
llvm/unittests/Support/Path.cpp | ||
42–44 | I guess these leaked in from the dependant patch somehow... | |
2196 | A cleaner way to write this would be: Expected<fs::FileLocker> L = fs::tryLock(...); ASSERT_THAT_EXPECTED(L, Succeeded()); // do the other checks.. | |
2216–2233 | L = fs::tryLock(FD); ASSERT_THAT_EXPECTED(L, Succeeded()); EXPECT_THAT_EXPECTED(fs::tryLock(Stream.getFD()), Failed()); ASSERT_NO_ERROR(L->unlock()); EXPECT_THAT_EXPECTED(fs::tryLock(Stream.getFD()), Succeeded()); | |
2240–2251 | This could be rewritten in a similar manner, though it's not clear to me what is it testing that wasn't already covered by the tests above. |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | I renamed the function into getLock. It looks not so categorical as lockFile and is unlikely to be confused with tryLockFile. Using static member functions leads to long names, it decreases readability. | |
1233 | Unit tests are not a motivation for using timeouts, but they demonstrates the benefits. If users can specify timeouts, they have possibility to handle deadlocks. It is up to user to handle such case or not, but the possibility exists. Without timeouts user must behave as if deadlocks never happen. It limits usability of the solution and it is the motivation. | |
llvm/unittests/Support/Path.cpp | ||
42–44 | Yes, rebased the patch properly. | |
2196 | Thank you for the idea! | |
2216–2233 | Reworked this place but with small change: temporary values are left in variables. It is convenient in tests, as if something goes wrong, it is easier to find reason of the problem in debugger. | |
2240–2251 | Removed this test. Previously it checked double unlock, but in the new implementation of Locker it is not possible to diagnose such case. |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | Having to guess what a short name does also hurts readability. So does inconsistency with other similar classes. getLock is less likely to be confused with tryLockFile, but its connection to the FileLocker class is still fairly vague for my taste, and it has lost the information that it only "tries" to get the lock, which I do find important. As for consistency, I think this class is most similar to TempFile class in this very file. That class also uses a static factory function TempFile::create to return an Expected<TempFile>. Other uses of static factory functions in the support library that I could find by a quick search are GlobPattern::create, MemoryBuffer::getFile***, SpecialCaseList::create, Error::success. There are also uses of free functions to create an object, but these generally have a different purpose -- to enable automatic template argument deduction (e.g. make_range) and not to provide error handling in a -fno-exceptions world. | |
1233 | Well.. it's not that the user _can_ specify timeouts. In the current setup, the user _must_ specify a timeout -- there is no blocking equivalent of this function. Which is fine, if the only use case we have for it right now wants/needs/etc. a timeout (we are not writing a general purpose library here). However, I remain unconvinced that a timeout is called for in this situation (but you don't have to convince me here -- that's another review). | |
llvm/unittests/Support/Path.cpp | ||
2196 | If you want to keep the if statement, that's fine but you should still change the ADD_FAILURE thingy to an EXPECT_THAT_EXPECTED, because the latter has a much better error message. | |
2216–2233 | I don't feel strongly about that, but I do want to point out that this pattern: |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | The problem is not in long names only. The classes you mention (TempFile, MemoryBuffer and so on) represent some program entities. In contrast, FileLocker is a helper class, it does not designate any entity in a program, it exists only to execute some action at scope exit. Ideally user does not need to know about it. We can reach that by using auto specifier: if (auto L = fs::getLock(FD)) { // ... do action that require file to be locked. } else { // ... handle lock error. }; That is why a free function is more convenient than a method. | |
1233 | Collection of build statistics usually is a side effect of build. If something goes wrong and lock cannot be released, it can cause build failure. It is often more acceptable to get incomplete statistics than to break a build. Timeouts can work as watchdog, excluding log machinery from culprits. However in some cases blocking operations may be more preferable. I think we can solve this problem by adding blocking function as well. It is done in this version. | |
llvm/unittests/Support/Path.cpp | ||
2196 | Ok, changed. | |
2216–2233 | Got it. Reworked as you recommended. |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | I'll admit that there are some differences between this class and the others, but I don't believe they are sufficiently convincing to justify deviating from the existing practice. I think that the notion of a "program entity" is very vague. I could certainly think of a lock on a file as a "thing" that can be owned, passed around (it has move constructors), etc. And the existence of the unlock function means that the user may need to be aware of the class. scope_exit is a class whose only function is to run an action at scope exit, and it is indeed normally used as auto x = make_scope_exit(...), but I don't think that's because we don't want users to be aware of the classes existance. It's because it uses templates for efficiency, and spelling the template name out would be difficult or impossible (lambdas have no names). Given llvm's fairly cautious stance towards auto, if there was a way to achieve the same thing without templates, I'm pretty sure we would be spelling out that class everywhere. | |
1233 | (I think adding the blocking is good because it may prevent needless proliferation of calls to the timeouted versions. I just want to clarify that I did not consider that as a requirement for this patch.) |
(please address the clang-format lint code review feedback in changed/added lines to remove osme of the noise from this review)
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | Sorry for the delay getting back to this review!
Sorry, the main point was "not a public constructor" - some kind of named constructor so that the result is "Expected<FileLocker>" to enforce testing of failure to lock. I don't mind if it's a static member function of FileLocker, or perhaps a non-static member function of raw_ostream, instead than exposing the FD from the raw_ostream. +1 to everything else @labath has said about this being perfectly reasonable to name & scope guards being a fairly special case of unnameable types and I wouldn't apply it in this case. | |
1233 | +1 please don't include timeout support if there's no immediate planned use case. |
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 |
The existing practice is to declare RAII object in a scope as a local variable. It was the initial variant, which was rejected because it didn't force a user to check if lock was successful, such check was supported but not enforced.
You are definitely right. A thing that is an auxiliary class can become a full-fledged entity in other design. A lock, which would allow ownership transfer, makes sense as a member of more elaborated class hierarchy. Such lock could support more flexible lock mechanism and provide checking facility. In this case it is an entity, not merely a RAII object. The class FileLocker on the other hand is a simple class, that supports only the simplest kind of file locking. It has move constructor only because it is necessary to use it with Expected. The only function it has is lock release. Hardly it need a factory method, as this function is already implemented by getLock.
This is common practice, when RAII class has possibility to free resources early, for example ContextRAII::pop or ScopeRAII::destroy.
Sure, there is no intention to hide the class from user. However if user can be unaware about existence of a class, the class is auxiliary. IIUC the problem now is whether to create RAII object with a free function or a method of FileLocker? What is the advantage of this solution? | |
1233 | The time out support is motivated by the following consideration. At present compiler jobs executed in parallel build are independent. Each compiler process has exclusive access to any file it writes to. Allowing writes to the same log file creates dependency between compiler processes, which may affect build process. For example, the following hypothetical scenario. If some process having lock to the shared log file is suspended, other processes may be suspended waiting the lock. The memory is occupied with waiting processes, and the build can fail due to resource shortage. Using timeouts can alleviate this issue at expense of log completeness. |
@labath - what do you think, for retrieving an Expected<FileLocker> - static member function of FileLocker, or a non-static member function of raw_ostream?
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1233 | That's a nice hypothetical scenario - but please add it when there's a concrete need in the LLVM project for that functionality, not before. |
(I'm moving this discussion out-of-line as it's getting rather long.
I'll admit that there are some differences between this class and the others, but I don't believe they are sufficiently convincing to justify deviating from the existing practice.The existing practice is to declare RAII object in a scope as a local variable. It was the initial variant, which was rejected because it didn't force a user to check if lock was successful, such check was supported but not enforced.
It's not the existing practice when the creation of an object (RAII or not) can fail for reasons which are outside of the programmers control. Locking a mutex can not fail, (except for timeouts, if you use the timeout constructor). I'd be willing to go with that idea (to emulate std::unique_lock), if we can agree to declare all other locking problems to be "programmer errors", and assert they don't happen.
linux documents the following flock errors:
- EBADF fd is not an open file descriptor. -- definitely a programmer error
- EINTR While waiting to acquire a lock, the call was interrupted by delivery of a signal caught by a handler -- can/should be handled by retrying
- EINVAL operation is invalid. -- sounds like a programmer error?
- ENOLCK The kernel ran out of memory for allocating lock records. -- not really programmer error, but llvm is not robust against resource exhaustion anyway
- EWOULDBLOCK The file is locked and the LOCK_NB flag was selected. -- timeout
I think I would be fine with asserting on all of these (except EWOULDBLOCK). The main question i have is when can EINVAL be returned. If it's just a garbage operation argument then I'm fine. But if it's possible that we get this if the underlying fs does not support locking, then this would not be ok, since the programmer cannot know whether that is supported -- other OSs document an EOPNOTSUPP result, but they still leave it fairly vague about when does that apply (e.g. can it happen for regular files too, or only for the more "exotic" file descriptors).
You are definitely right. A thing that is an auxiliary class can become a full-fledged entity in other design. A lock, which would allow ownership transfer, makes sense as a member of more elaborated class hierarchy. Such lock could support more flexible lock mechanism and provide checking facility. In this case it is an entity, not merely a RAII object.
I'm sorry, but I still find the entity-vs-auxiliary line too fine to walk on.
The class FileLocker on the other hand is a simple class, that supports only the simplest kind of file locking. It has move constructor only because it is necessary to use it with Expected. The only function it has is lock release. Hardly it need a factory method, as this function is already implemented by getLock.
That goes both ways -- with a factory method, there's no need for the getLock function. And regardless of the reason, the class does have move operations, so it does support ownership transfer.
And the existence of the unlock function means that the user may need to be aware of the class.This is common practice, when RAII class has possibility to free resources early, for example ContextRAII::pop or ScopeRAII::destroy.
True, but those classes are not constructed with auto (hence one can't say the users aren't aware of their existence), and they construction cannot fail (so they avoid the entire free-vs-static debate).
IIUC the problem now is whether to create RAII object with a free function or a method of FileLocker? What is the advantage of this solution?
Yes, I think you understand correctly. As for the advantage, the killer argument for me is consistency. I also find that the getLock vs lockFile does not emphasize enough the difference in semantics between the two functions. This could be alleviated by changing the name to something like getFileLocker, but I don't see a need to argue the merits of that given the first point. I also don't expect this class to be used so widely to be worth optimizing the couple of characters of typing.
Both seem fine to me. The thing which, for me, would really speak in favor of a raw_(fd_)ostream method is if we wanted to make the locker class smarter and have it automatically flush the stream before unlocking. As it stands now, the user still has to manually call flush (or remember to use unbuffered I/O), otherwise the entire locking may be for nothing.
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1183 | My reply is out-of-line, as this thread is getting rather long. | |
1233 | I believe that is not a hypothetical scenario, but *the* scenario because of which we're having this entire debate in the first place (see the dependent patch) :) |
Updated patch
- Free function getLock was replaced by raw_fd_stream::tryToLock,
- Removed variant of lock with timeout.
This looks good to me, modulo a couple of inline comments. But let's give a chance for @dblaikie to look this over too...
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1186 | If this stays as public, I think it should be documented very explicitly that this takes an already-locked file descriptor. Alternatively, we could still introduce the static factory function here (:P) to do the locking, but have the raw_ostream function as the "recommended" way of locking a raw_fd_ostream (actually, without exposing the file descriptor, it will essentially be the _only_ way of doing it). | |
llvm/lib/Support/Error.cpp | ||
85–90 ↗ | (On Diff #264277) | I'd move this next to the definitions of other ErrorInfoBase members (line 53--54). |
llvm/unittests/Support/Path.cpp | ||
2203 | This line here is not going anything because the EXPECT macro already "consumes" the error. The double consumes for Errors are mostly harmless in practice, but I already got complaints from some clang-tidy checks about this, when I accidentally introduced them. So... consider yourself warned. :) I know you want to demonstrate the "recommended" usage. That is commendable, but I don't think it really works out here... |
Yep, looks about right to me. Thanks!
The two changes to Error.h (message/cantFail) look like they're unrelated to this patch & should be removed/not committed?
I think they're needed because this introduces a circular dependency between raw_ostream and Error classes. Does not seem unreasonable to me, but if we don't want that, then we need to revisit this idea...
Exactly. I got compilation errors because of Expected used in declaration of raw_ostream::tryToLock. But now I tried removing these changes and compilation was successful. Probably the recent Visual Studio update solved this problem. So I removed these changes. If however someone using old MS compiler will see compilation fails, we know what to do.
New method names, fixed unit test
Review of D78903 demonstrated that using the name tryToLock for
blocking operation is misleading. So the method name was changed
to lock. Attribute LLVM_NODISCARD should prevent from ignoring
return value.
Also due to the change of implementation of lock operation on POSIX
systems, existing unit test is no more valid as it uses locks per
handle inside the same process. This test however still can be run
on Windows, so it was adapted for changed interface.
Perhaps this type should be boolean testable (for the error), or using a factory function to create one that returns llvm::Error (wrapped around the error_code) on failure:
Hmm - if unlocking can fail, and that failure needs to be handled, this doesn't seem suitable for an RAII type device...