Page MenuHomePhabricator

Fix an occasional failure in fs::rename() if another process uses fs::openFileForRead() on the same file
ClosedPublic

Authored by gbedwell on Oct 12 2015, 3:40 AM.

Details

Summary

Hi,

While looking at the effectiveness of Clang C++ modules on Windows, we encountered some occasional but seemingly random failures. Upon investigation we found that it is actually a general issue affecting the Windows implementation of llvm::sys::fs::rename() and llvm::sys::fs::openFileForRead().

Within this function there is a retry loop that spins for a maximum of around 2000ms. In this loop, the Windows MoveFileExW() function is used to rename the file. If the move fails with ERROR_ACCESS_DENIED, it goes around the loop and tries again. A comment in the code suggests the reason for permitting this error condition is to allow for scanners (indexers, antivirus, etc) that open the source file immediately after it is written to complete their work. However, the ERROR_ACCESS_DENIED error code also appears if the destination file is opened for reading by another process using llvm::sys::fs::openFileForRead(). It is usually the case that the reading completes in less than 2000ms, so the issue doesn't often result in a failure that escapes the retry loop, however we've seen cases where during a build the disk is being hit so hard that disk response times can be longer than this causing llvm::sys::fs::rename() to fail.

The attached patch contains a unit test (ReplaceFileTest.cpp) that demonstrates the issue and fix. In the test we create a directory containg two files: 'source' and 'target'. 'target' is opened for reading with openFileForRead() and then replaced with 'source' using rename(source, target). Using the existing descriptor it verifies that the old data can still be read, but with the descriptor from a subsequent call to openFileForRead() the new data is read instead. Without the fixes to Path.inc, the call to rename() in the test will fail.

The fix is to instead use the Windows ReplaceFileW() function documented at https://msdn.microsoft.com/en-us/library/windows/desktop/aa365512(v=vs.85).aspx within the retry loop in rename() and to add the FILE_SHARE_DELETE flag to the share mode flags in the call to CreateFileW() within openFileForRead(). We still need to keep the old call to MoveFileExW() for the case where the target of the rename does not already exist, as ReplaceFileW() will fail with ERROR_FILE_NOT_FOUND in this situation. This case is also covered by the unit test for completeness.

This patch is based on an investigation and an initial version of the patch by my colleague Edd Dawson.

Thanks for taking a look.

Thanks,

-Greg

Diff Detail

Repository
rL LLVM

Event Timeline

gbedwell updated this revision to Diff 37089.Oct 12 2015, 3:40 AM
gbedwell retitled this revision from to Fix an occasional failure in fs::rename() if another process uses fs::openFileForRead() on the same file.
gbedwell updated this object.
gbedwell added reviewers: Bigcheese, chapuni.
gbedwell added a subscriber: llvm-commits.
aaron.ballman accepted this revision.Oct 12 2015, 5:47 AM
aaron.ballman edited edge metadata.

Thank you for looking into this! LGTM

~Aaron

This revision is now accepted and ready to land.Oct 12 2015, 5:47 AM
This revision was automatically updated to reflect the committed changes.