The new methods lockForUpdate and lockTimeout put an instance of
raw_fd_ostream into a mode in which it locks the underlying file
before each write and unlocks it immediately after. This is a convenience:
it allows transparent operations on log files in parallel builds,
where several processes write to the same file.
Details
- Reviewers
ruiu labath csigg jhenderson dblaikie
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
While I don't really feel qualified to set the direction here, I gotta say that this patch (and the built-in timeout in particular) looks worrying to me.
Overall I feel that this sort of automatic locking will very rarely be the "right tool for the job". I think your follow-up patch sort of demonstrates that, because you've needed to format the output to a temporary string stream in order for it to do what you need. I think it might be more reasonable to just leave this out of the raw_ostream class, and do a manual lockFile + write + unlockFile combo where needed. In fact, if we do that, I'm wondering if locking is really needed, as a write to an O_APPEND file is more-or-less guaranteed to be atomic (I think the only problem comes from signals, and I'm not sure if those really happen on local files):
O_APPEND The file is opened in append mode. Before each write(2), the file offset is positioned at the end of the file, as if with lseek(2). The modification of the file offset and the write operation are performed as a single atomic step.
I believe a similar effect on windows can be achieved with FILE_APPEND_DATA and/or writing to offset 0xffffffffffffffff
FILE_APPEND_DATA: For a file object, the right to append data to the file. (For local files, write operations will not overwrite existing data if this flag is specified without FILE_WRITE_DATA.)
WriteFile: To write to the end of file, specify both the Offset and OffsetHigh members of the OVERLAPPED structure as 0xFFFFFFFF. This is functionally equivalent to previously calling the CreateFile function to open hFile using FILE_APPEND_DATA access.
Of course, this is a matter of convenience. Automatic locking, as implemented in this patch, is not flexible enough and must be improved. But manual operations might be error-prone.
In fact, if we do that, I'm wondering if locking is really needed, as a write to a O_APPEND file is more-or-less guaranteed to be atomic (I think the only problem comes from signals, and I'm not sure if those really happen on local files):
O_APPEND The file is opened in append mode. Before each write(2), the file offset is positioned at the end of the file, as if with lseek(2). The modification of the file offset and the write operation are performed as a single atomic step.
Here is a discussion about the atomicity of the append operation: https://stackoverflow.com/questions/1154446/is-file-append-atomic-in-unix. In short, atomicity of seek+write is actually guaranteed only for data of small size. This agrees with our experience: initially we had an implementation that relied on the atomicity of the append operation, and log files were regularly corrupted. Using file locks solved this problem.
Well.. the thing is, I am not sure automatic locking *can* be improved to be useful. I find this situation very reminiscent of various "thread-safe" containers. While in theory we could have an std::thread_safe_map, I think there's a (good) reason that the standard library does not provide it -- it usually is not that useful, because one needs to have more control over the scope of the "critical sections". A typical use case for a thread-safe map is to compute something once and then cache it, but there one needs the whole test-and-set block to be atomic, not just individual operations.
I think this is fairly similar to what happens with raw_ostream, where one typically wants the locking granularity to be different from what happens to fall out of the stream implementation. And in that case, I am not sure the implementation needs to be inside the stream class. Maybe an RAII object is in order?
{
  stream_locker Lock(OS); // not sure what to do about error handling
  OS << whatever;
} // Lock destructor calls OS.flush(), and unlocks ?
In fact, if we do that, I'm wondering if locking is really needed, as a write to a O_APPEND file is more-or-less guaranteed to be atomic (I think the only problem comes from signals, and I'm not sure if those really happen on local files):
O_APPEND The file is opened in append mode. Before each write(2), the file offset is positioned at the end of the file, as if with lseek(2). The modification of the file offset and the write operation are performed as a single atomic step.
Here is a discussion about the atomicity of the append operation: https://stackoverflow.com/questions/1154446/is-file-append-atomic-in-unix. In short, atomicity of seek+write is actually guaranteed only for data of small size. This agrees with our experience: initially we had an implementation that relied on the atomicity of the append operation, and log files were regularly corrupted. Using file locks solved this problem.
Yeah, I had a feeling PIPE_BUF would come into play sooner or later. If that is not enough for your use case, then yes, I guess we will need to have some sort of locking primitives.