This is an archive of the discontinued LLVM Phabricator instance.

Remove existing file in a separate thread asynchronously.
ClosedPublic

Authored by ruiu on Nov 30 2016, 10:35 PM.

Details

Summary

On Linux (and probably on other Unix-like systems), unlink(2) is
noticeably slow. It takes 250 milliseconds to remove a 1 GB file
on ext4 filesystem on my machine, whether the file is on SSD or
on a spinning disk.

To create a new result file, we remove existing file first. So, if
you repeatedly link a 1 GB program in a regular compile-link-debug
cycle, every cycle wastes 250 milliseconds only to remove a file.

Since LLD can link a 1 GB in about 5 seconds, that actually counts.

This patch defines unlinkAsync function. The function spawns a
background thread to call unlink. The calling thread returns
almost immediately.

Diff Detail

Repository
rL LLVM

Event Timeline

ruiu updated this revision to Diff 79868.Nov 30 2016, 10:35 PM
ruiu retitled this revision from to Remove existing file in a separate thread asynchronously..
ruiu updated this object.
ruiu added a reviewer: rafael.
ruiu added a subscriber: llvm-commits.
ruiu updated this revision to Diff 79869.Nov 30 2016, 10:40 PM
  • Use createUniqueFile instead of createTemporaryFile to create a temporary file name in the same directory as a result file.
ruiu updated this revision to Diff 79871.Nov 30 2016, 10:50 PM
  • Do not create a thread if !Config->Threads.
silvas added a subscriber: silvas.

Why are we unlinking at all? Hopefully we can just resize the file and ideally reuse most of the pages the kernel already has allocated. Doing so loses the atomicity of the replacement, but so does asynchronously unlinking like you do in this patch (although this patch would manifest the non-atomicity in a more benign way).

If we slightly overallocate the file (no problem since ELF doesn't mind having some extra zeros at the end), then most of the time re-links will avoid doing any kind of memory allocation from the kernel whatsoever. (edit-relink cycle may cause more or less bytes due to local code modifications from the edits, but the changes are unlikely to be very large).

I'm not sure if users expect atomic replacement of the destination file as the behavior from the linker. If they do, then there's not much we can do to speed this up unfortunately. Joerg, Ed? Any ideas on whether we need atomic replacement?

joerg edited edge metadata.Dec 1 2016, 2:34 AM

The reason for unlinking is to get consistent behavior in the error case as well as consistent access permissions.

That said, I can't reproduce the numbers on my laptop. For Ext4 on my laptop, I see around 0.1s for rm to delete a 1GB test file, tmpfs is half of that.

ruiu added a comment.Dec 1 2016, 9:28 AM

What is your kernel and how did you create your file? Removing an 1.6 GB link result always takes about 2.5 seconds for me.

$ /ssd/rel/bin/ld.lld @response -o /ssd/out

$ ls -l /ssd/out
-rwxr-x--- 1 ruiu eng 1624892488 Dec  1 09:25 /ssd/out

$ strace -c rm /ssd/out
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.96    0.260674      260674         1           unlinkat
  0.01    0.000022           3         8           mmap
  0.01    0.000015           5         3           open
  0.01    0.000014           4         4           mprotect
  0.00    0.000011           2         6           close
  0.00    0.000011           4         3         3 access
  0.00    0.000010          10         1           execve
  0.00    0.000006           6         1           munmap
  0.00    0.000005           3         2           newfstatat
  0.00    0.000004           4         1         1 lseek
  0.00    0.000003           1         3           fstat
  0.00    0.000003           1         3           brk
  0.00    0.000003           3         1           faccessat
  0.00    0.000002           2         1           read
  0.00    0.000002           2         1           ioctl
  0.00    0.000001           1         1           geteuid
  0.00    0.000001           1         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.260787                    41         4 total
ruiu added a comment.Dec 1 2016, 9:34 AM

As to writing to an existing file as opposed to atomic renaming, I believe the reason why we are doing that is to not leave a corrupted file in case of crash. If you do atomic renaming, you'll get a temporary file in the worst case instead of a broken, partically-written result.

However, that can be resolved by renaming an existing file different temporary name, write to the temporary file, and then rename it back. Maybe we should do that?

joerg added a comment.Dec 1 2016, 9:54 AM
dd if=/dev/zero of=testfile bs=1024k count=1024
sync
time rm testfile
ruiu added a comment.Dec 1 2016, 10:48 AM

dd, sync and rm are too different from real use cases.

I like the idea of writing to a temporary file, then renaming that. That way, we keep the original file in place until we have a complete file to replace it with, and the replace operation should be fast and atomic on most reasonable filesystems.

Having said that, there are a few potential downsides to this approach. For one, it requires disk space to be sufficient to hold both the old and the new file. For another, some scripts or tools may depend on the output file being absent if the link did not succeed.

If we do end up implementing this "rename, then remove in a separate thread", I would suggest moving it into LLVM. It is a technique I've often found useful, particularly when deleting directories.

silvas accepted this revision.Dec 2 2016, 12:09 AM
silvas added a reviewer: silvas.

Overall, I do like the idea of doing this and this LGTM (although let others chime in). One thing to think about is that if it takes this long to delete a file (even on tmpfs which Joerg mentioned), then how long does it take to allocate the memory for a new file? (let alone write it to disk)

So there is probably a similar cost hidden somewhere else that we could maybe get rid of if we arranged things right (not in this patch though).

This revision is now accepted and ready to land.Dec 2 2016, 12:09 AM
ruiu updated this revision to Diff 80223.Dec 4 2016, 6:25 PM
ruiu edited edge metadata.
  • Call unlinkAsync just before creating a new output file, instead of at the beginning of writeResult().

I'm going to submit this patch tomorrow if there's no objections.

This revision was automatically updated to reflect the committed changes.