This is an archive of the discontinued LLVM Phabricator instance.

[llvm][clang][vfs] NFC: Simplify directory iteration
Needs ReviewPublic

Authored by jansvoboda11 on Jan 5 2022, 5:50 AM.

Details

Summary

The iteration over directory entries of a VFS is a bit unweildy, since it requires using a pair of llvm::vfs::directory_iterator, calling llvm::vfs::directory_iterator::increment(std::error_code &) and checking std::error_code. Currently, there are 15 instances of this pattern in the Clang codebase.

This patch simplifies iteration over directory entries by introducing new type of iterator and adding FileSystem::dir_range() function.

Diff Detail

Event Timeline

jansvoboda11 created this revision.Jan 5 2022, 5:50 AM
jansvoboda11 requested review of this revision.Jan 5 2022, 5:50 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJan 5 2022, 5:50 AM

I think th eAPI is designed to ensure users don't forget to check the error code.

What about a new API/iterator type that wraps the existing one and has a "normal" operator++?

  • Take an error as an out-parameter on construction
  • Make it an llvm::Error which guarantees a crash if not checked
  • This would wrap the other type and store a reference to the Error. If there's an error during iteration, advances to end and sets the error.

Here's what I'm thinking:

Optional<llvm::Error> DirError;
for (vfs::error_directory_iterator File = D.getVFS().dir_begin(Cand.Path, DirError),
                                   FileEnd;
     File != FileEnd; ++File) {
  // ...
}
if (DirError)
  return errorToErrorCode(std::move(*IterationEC));

At that point, it'd be easy to add a dir_range wrapper that creates an iterator range:

for (auto &File : dir_range(Path, DirError)) {
  // ...
}

Replace macro with range of new iterators

jansvoboda11 retitled this revision from [llvm][clang][vfs] NFC: Extract directory iteration boilerplate into macro to [llvm][clang][vfs] NFC: Simplify directory iteration.Jan 12 2022, 2:44 AM
jansvoboda11 edited the summary of this revision. (Show Details)

I agree that introducing new iterator and implementing iterator_range<...> FileSystem::dir_range() is better solution than a macro.

I'm not sure the dir_range function needs to take an std::error_code out-param though. The error code is only used to stop the iteration, clients don't use it for any other purpose. I think the new iterator could handle error codes completely internally (by advancing to the end), providing better ergonomics. WDYT?

I might create a follow-up patches for recursive VFS-based iteration and also enable range-based for loops in code using llvm::sys::fs instead of the VFS.

I agree that introducing new iterator and implementing iterator_range<...> FileSystem::dir_range() is better solution than a macro.

I'm not sure the dir_range function needs to take an std::error_code out-param though. The error code is only used to stop the iteration, clients don't use it for any other purpose. I think the new iterator could handle error codes completely internally (by advancing to the end), providing better ergonomics. WDYT?

I might create a follow-up patches for recursive VFS-based iteration and also enable range-based for loops in code using llvm::sys::fs instead of the VFS.

I don't know, I'm a bit skeptical we want to make it so easy to ignore errors so easily. I'd rather require clients to explicitly ignore the error.

@benlangmuir, any thoughts on this?

I don't know, I'm a bit skeptical we want to make it so easy to ignore errors so easily. I'd rather require clients to explicitly ignore the error.

That's my gut feeling too. I suspect a bunch of this code should be explicitly ignoring ENOENT, but not ignoring e.g. EACCESS, EMFILE , etc. It seems like a bug that we're ignoring all errors right now.

If we really want to ignore all the iteration errors, we should do that consistently in llvm::sys::fs not just the wrapper in the VFS layer.

I don't know, I'm a bit skeptical we want to make it so easy to ignore errors so easily. I'd rather require clients to explicitly ignore the error.

That's my gut feeling too. I suspect a bunch of this code should be explicitly ignoring ENOENT, but not ignoring e.g. EACCESS, EMFILE , etc. It seems like a bug that we're ignoring all errors right now.

If we really want to ignore all the iteration errors, we should do that consistently in llvm::sys::fs not just the wrapper in the VFS layer.

If the main goal is to enable range-based-for, a simpler option than I suggested previously might be to have an iterator that dereferences to ErrorOr<FileEntry>. Could be an iterator adaptor that takes any increment(EC)-style iterator (either llvm::sys::fs or llvm::vfs). Maybe dir_range could return ErrorOr<iterator_range<...>> to expose the error-on-construction explicitly as well.