This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Implement Directory Entry Caching -- Sort of.
ClosedPublic

Authored by EricWF on Jul 18 2018, 9:14 PM.

Details

Summary

This patch implements directory_entry caching *almost* as specified in P0317r1. However, I explicitly chose to deviate from the standard as I'll explain below.

The approach I decided to take is a fully caching one. When refresh() is called, the cache is populated by calls to stat and lstat as needed.
During directory iteration the cache is only populated with the file_type as reported by readdir.
The cache can be in the following states:

  • _Empty: There is nothing in the cache (likely due to an error)
  • _IterSymlink: Created by directory iteration when we walk onto a symlink only the symlink file type is known.
  • _IterNonSymlink: Created by directory iteration when we walk onto a non-symlink. Both the regular file type and symlink file type are known.
  • _RefreshSymlink and _RefreshNonSymlink: A full cache created by refresh(). This case includes dead symlinks.
  • _RefreshSymlinkUnresolved: A partial cache created by refresh when we fail to resolve the file pointed to by a symlink (likely due to permissions). Symlink attributes are cached, but attributes about the linked entity are not.

As mentioned, this implementation purposefully deviates from the standard. According to some readings of the specification, and the Windows filesystem implementation, the constructors and modifiers which don't pass an error_code must throw when the directory_entry points to a entity which doesn't exist. or when attribute resolution fails for another reason.

@BillyONeal has proposed a more reasonable set of requirements, where modifiers other than refresh ignore errors. This is the behavior libc++ currently implements, with the expectation some form of the new language will be accepted into the standard.

Some additional semantics which differ from the Windows implementation:

  1. refresh will not throw when the entry doesn't exist. In this case we can still meet the functions specification, so we don't treat it as an error.
  2. We don't clear the path name when a constructor fails via refresh (this will hopefully be changed in the standard as well).

It should be noted that libstdc++'s current implementation has the same behavior as libc++, except for point (2).

If the changes to the specification don't get accepted, we'll be able to make the changes later.

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0317r1.html

Diff Detail

Event Timeline

EricWF created this revision.Jul 18 2018, 9:14 PM
EricWF updated this revision to Diff 156413.Jul 19 2018, 6:24 PM
EricWF edited the summary of this revision. (Show Details)
EricWF added a subscriber: BillyONeal.
  • Do a bunch of cleanup.
  • Make the tests more STL agnostic and get most tests passing against libstdc++ (libstdc++ bugs not withstanding).
  • Improve the windows placehold stub for __do_refresh().
EricWF updated this revision to Diff 156414.Jul 19 2018, 6:26 PM

I'm going to commit this w/o review. Comments are always still welcome post commit :-D

This revision was not accepted when it landed; it landed in state Needs Review.Jul 19 2018, 6:27 PM
This revision was automatically updated to reflect the committed changes.

@BillyONeal has proposed a more reasonable set of requirements

I believe that was you. I wanted the throws/fails behavior.

test/std/experimental/filesystem/class.directory_entry/directory_entry.cons/path.pass.cpp