This is an archive of the discontinued LLVM Phabricator instance.

[8/N] [libcxx] Fix the preexisting directory_iterator code for windows
ClosedPublic

Authored by mstorsjo on Nov 10 2020, 1:41 AM.

Details

Summary

The directory_iterator.cpp file did contain an incomplete, non-working implementation for windows.

Change it to use the wchar version of the APIs.

Don't set the windows specific errors from GetLastError() as code in the generic category; remap the errors to the std::errc values.

Error out cleanly on empty paths.

Invoke FindFirstFile on <directoryname>/* to actually list the entries of the directory.

If the first entry retured by FindFirstFile is to be skipped (e.g. being "." or ".."), call advance() (which calls FindNextFile and loops) which doesn't return until a valid entry is found (or the end is reached).

Diff Detail

Event Timeline

mstorsjo created this revision.Nov 10 2020, 1:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 10 2020, 1:41 AM
Herald added 1 blocking reviewer(s): Restricted Project. · View Herald Transcript
rnk resigned from this revision.Nov 10 2020, 12:25 PM
rnk added a subscriber: amccarth.

Hi, I'm pretty busy right now, and I'm falling behind on other code reviews. Can you send this libc++ patch series to @amccarth? He's on my team, and is familiar with the Win32 API, but is unfamiliar with mingw conventions. Hopefully he has enough experience to help implement these parts of the standard library.

mstorsjo added a subscriber: rnk.Nov 10 2020, 1:11 PM
In D91140#2386807, @rnk wrote:

Hi, I'm pretty busy right now, and I'm falling behind on other code reviews. Can you send this libc++ patch series to @amccarth? He's on my team, and is familiar with the Win32 API, but is unfamiliar with mingw conventions. Hopefully he has enough experience to help implement these parts of the standard library.

Fair enough, thanks! Luckily, essentially nothing in this patchset is mingw specific (I've tried to steer away from anything that differs, that can be messy). Maybe the only real difference lies in being able to use the __int128 type, see D91139.

amccarth requested changes to this revision.Nov 12 2020, 10:15 AM

This is pretty close.

libcxx/src/filesystem/directory_iterator.cpp
86–87

I know you didn't change this line, but it looks wrong to me.

MAXDWORD is a macro for 0xffffffff, which I believe the compiler will interpret as unsigned int (or maybe unsigned long, which is only 32 bits in 64-bit Windows). Adding 1 will cause it to wrap to 0. Multiplying by 0 effectively ignores the higher bits, so we end up with just the low 32 bits of the size, which will be extended to a uintmax_t upon return.

libcxx/src/filesystem/filesystem_common.h
23

It's a real shame we have to include <windows.h> in a header. Even with WIN32_LEAN_AND_MEAN and NOMINMAX, it really pollutes the macro and global namespaces.

Does code that includes <filesystem> transitively get this pulled into their code? If so, would it be possible to move the definitions to a .cpp file and only include <windows> there?

130

constexpr?

184

Needs inline for ODR, right?

191

inline

This revision now requires changes to proceed.Nov 12 2020, 10:15 AM

Thanks for having a look!

libcxx/src/filesystem/directory_iterator.cpp
86–87

Good catch, I can try to fix that while fixing up the preexisting code here.

libcxx/src/filesystem/filesystem_common.h
23

No, luckily this header is only used by the implementation itself, it doesn't end up polluting user code. This header is essentially only used by src/filesystem/operations.cpp and src/filesystem/directory_iterator.cpp for stuff shared by both of them.

130

Will do

184

Hmm, I'm not entirely sure how this works, when this is an anonymous namespace (which essentially works as static in C afaik?), as that isn't supposed to be visible outside of the current translation unit (any similarity to similar named things in other translation units can be coincidental), so I'm not sure if such things can be made ODR... But this is the mechanism used for the header as it is right now.

If we can't get these shared between the two translation units in an anonymous namespace, I guess it'd be better to give at least the table a non-anonymous symbol so that it can be shared properly.

mstorsjo marked 2 inline comments as done.Nov 13 2020, 3:31 AM
mstorsjo added inline comments.
libcxx/src/filesystem/filesystem_common.h
23

I removed the need for windows.h in the shared header anyway, without excessive jumping through hoops.

184

Moved the table to a non-anonymous namespace to make one single instanance that can be shared between the two translation units. I didn't add inline as it doesn't have any effect on ODR in an anonymous namespace, so it matches the existing functions in the header.

mstorsjo updated this revision to Diff 305077.Nov 13 2020, 3:38 AM
mstorsjo edited the summary of this revision. (Show Details)

Fixed the file size calculation by casting to uint64_t (and doing shifting instead of multiplying by MAXDWORD+1), moved the error mapping table to operations.cpp, removed inclusions of windows.h in the filesystem_common.h header.

amccarth accepted this revision.Nov 18 2020, 9:35 AM

I'm not a libcxx owner, so I'm not sure what other hoops you should jump through for them, but this looks good to me. Thanks for the improvements!

Minor comment only. It would be cool to have the CI triggered...

libcxx/src/filesystem/operations.cpp
328

You could use an inline variable instead of putting it into an anonymous namespace. Personally I find it cleaner.

curdeius set the repository for this revision to rG LLVM Github Monorepo.Dec 10 2020, 2:11 PM
mstorsjo updated this revision to Diff 311023.Dec 10 2020, 2:21 PM
mstorsjo marked an inline comment as done.

Moved the error mapping table into an inline variable to avoid one extra anonymous namespace.

mstorsjo updated this revision to Diff 311024.Dec 10 2020, 2:22 PM
mstorsjo set the repository for this revision to rG LLVM Github Monorepo.

Try to re-trigger CI...

mstorsjo updated this revision to Diff 311025.Dec 10 2020, 2:24 PM
mstorsjo set the repository for this revision to rG LLVM Github Monorepo.

Third time's the charm?

ldionne requested changes to this revision.Dec 14 2020, 3:35 PM
ldionne added a subscriber: ldionne.
ldionne added inline comments.
libcxx/src/filesystem/directory_iterator.cpp
75–76

I think adding a comment here to signify that this is the Windows branch would add readability.

122

Isn't that a use-after-free? (root / "*").c_str() returns a pointer to the internal string held in the temporary object created by (root / "*"), unless I'm missing a subtlety here. I don't think the lifetime of objects in an expression used as a function argument is extended automatically.

libcxx/src/filesystem/filesystem_common.h
19–20

Please indent #includes in between.

This revision now requires changes to proceed.Dec 14 2020, 3:35 PM
mstorsjo added inline comments.Dec 14 2020, 11:37 PM
libcxx/src/filesystem/directory_iterator.cpp
75–76

Sure, I can add that.

122

I don't know the spec bits exactly that allow this, but I'm fairly sure it's supposed to work. An empirical test: https://godbolt.org/z/za8K46

I'm not too familiar with the C++ spec itself to reference it myself, but https://stackoverflow.com/a/584835/3115956 claims that "12.2 Temporary objects" in the C++ standard covers this, saying that the destructors for such temporaries aren't called until at the end of the full-expression.

libcxx/src/filesystem/filesystem_common.h
19–20

Ok, will do that for this block I'm touching here. There's more unindented blocks in the same file, but I'm keeping them untouched here.

mstorsjo updated this revision to Diff 311805.Dec 14 2020, 11:38 PM

Indented includes in a new ifdef block, added a comment after an #else to improve readability.

mstorsjo updated this revision to Diff 311806.Dec 14 2020, 11:39 PM
mstorsjo set the repository for this revision to rG LLVM Github Monorepo.

Reupload to readd the repo, to trigger CI

mstorsjo updated this revision to Diff 312442.Dec 17 2020, 3:54 AM
mstorsjo removed rG LLVM Github Monorepo as the repository for this revision.
mstorsjo set the repository for this revision to rG LLVM Github Monorepo.

Rebased, rerunning CI

ldionne accepted this revision.Dec 17 2020, 2:17 PM
ldionne added inline comments.
libcxx/src/filesystem/directory_iterator.cpp
122

Uh, you're right, I guess that's an enormous brain fart on my end. Like they say in the SO answer, expression templates are the obvious use case that make use of this property.

See http://eel.is/c++draft/class.temporary#4 for details.

This revision is now accepted and ready to land.Dec 17 2020, 2:17 PM