This is an archive of the discontinued LLVM Phabricator instance.

libcxx: Provide overloads for basic_filebuf::open() et al that take wchar_t* filenames on Windows.
ClosedPublic

Authored by pcc on Jan 17 2018, 7:02 PM.

Details

Summary

This is an MSVC standard library extension. It seems like a reasonable
enough extension to me because wchar_t* is the native format for
filenames on that platform.

If others agree, I will write the remaining tests.

Diff Detail

Repository
rCXX libc++

Event Timeline

pcc created this revision.Jan 17 2018, 7:02 PM

Can we avoid the _WIN32 usage please? We spent some effort to avoid it, and have _LIBCPP_WIN32API to indicate that we want the Win32 API. I know that Marshall had some strong opinions on avoiding the _WIN32 usage, but, beyond that, I think that this is a completely reasonable thing to provide. I'm still torn if we should enable this on not _LIBCPP_ABI_MICROSOFT. One other option would be to have a _LIBCPP_MS_EXTENSIONS flag to control whether they are provided. But, all of that is merely details on how to control access to the interfaces, not the direction itself.

libcxx/include/fstream
560 ↗(On Diff #130346)

Should this also be marked inline?

Can we avoid the _WIN32 usage please? We spent some effort to avoid it, and have _LIBCPP_WIN32API to indicate that we want the Win32 API. I know that Marshall had some strong opinions on avoiding the _WIN32 usage, but, beyond that, I think that this is a completely reasonable thing to provide.

I agree. _WIN32 is really a big hammer. I'd much rather some more focused and descriptive guard macro. Maybe _LIBCPP_HAS_OPEN_WITH_WCHAR.

Also, if the tests are for a libc++ extension, then they belong in the test/libcxx hierarchy.

pcc updated this revision to Diff 130747.Jan 19 2018, 8:36 PM
  • Add tests, address review comments
pcc added a comment.Jan 19 2018, 8:36 PM

I'm still torn if we should enable this on not _LIBCPP_ABI_MICROSOFT.

It looks like mingw32 provides a declaration of _wfopen, so at least from a technical perspective it seems fine.

Can we avoid the _WIN32 usage please? We spent some effort to avoid it, and have _LIBCPP_WIN32API to indicate that we want the Win32 API. I know that Marshall had some strong opinions on avoiding the _WIN32 usage, but, beyond that, I think that this is a completely reasonable thing to provide.

I agree. _WIN32 is really a big hammer. I'd much rather some more focused and descriptive guard macro. Maybe _LIBCPP_HAS_OPEN_WITH_WCHAR.

Also, if the tests are for a libc++ extension, then they belong in the test/libcxx hierarchy.

Done.

libcxx/include/fstream
560 ↗(On Diff #130346)

No strong opinions, I was just being consistent with the const char* overload which isn't marked inline.

compnerd accepted this revision.Jan 22 2018, 5:41 PM

LGTM

libcxx/include/__config
250 ↗(On Diff #130747)

Thanks, this is great.

libcxx/include/fstream
614 ↗(On Diff #130747)

It would be nice if we could do something like:

template <typename _CharT>
_CharT *ModeStringFromMode(const ios_base::openmode __mode);
This revision is now accepted and ready to land.Jan 22 2018, 5:41 PM
pcc added inline comments.Jan 22 2018, 6:08 PM
libcxx/include/fstream
614 ↗(On Diff #130747)

As discussed on IRC, I added a comment at the top of this function.

This revision was automatically updated to reflect the committed changes.

@pcc @mstorsjo Are we aware of anyone using these extensions?

I would like to suggest that we either remove this extension if it's not useful, or make it unconditional (not only on Windows) if we really think it's that useful (but I'd like that to come with at least a paper proposing to add them to the standard). Carrying around an extension that's only enabled on one platform (and not the most widely used platform for libc++ at that) is kind of awkward.

WDYT?

I believe that Chromium uses it (or at least it did at the time that I added this, and Chromium has since switched to using libc++ on Windows).

I don't work on Chromium much anymore but perhaps @thakis or @thomasanderson can confirm.

@pcc @mstorsjo Are we aware of anyone using these extensions?

I would like to suggest that we either remove this extension if it's not useful, or make it unconditional (not only on Windows) if we really think it's that useful (but I'd like that to come with at least a paper proposing to add them to the standard). Carrying around an extension that's only enabled on one platform (and not the most widely used platform for libc++ at that) is kind of awkward.

This extension is fairly essential - without it, you can't interact with files that have names outside of the 8 bit charset on Windows (and exactly what the 8 bit charset is, can vary from system to system). I can't point to a specific user of it, but I'd expect there to be numerous out there.

Making it universally available sounds like a sensible strategy forward - although I don't think I have the bandwidth to take on making it a standards proposal. Maybe someone from Microsoft (who invented this extension) can collaborate on it?

@pcc @mstorsjo Are we aware of anyone using these extensions?

I would like to suggest that we either remove this extension if it's not useful, or make it unconditional (not only on Windows) if we really think it's that useful (but I'd like that to come with at least a paper proposing to add them to the standard). Carrying around an extension that's only enabled on one platform (and not the most widely used platform for libc++ at that) is kind of awkward.

This extension is fairly essential - without it, you can't interact with files that have names outside of the 8 bit charset on Windows (and exactly what the 8 bit charset is, can vary from system to system). I can't point to a specific user of it, but I'd expect there to be numerous out there.

Making it universally available sounds like a sensible strategy forward - although I don't think I have the bandwidth to take on making it a standards proposal. Maybe someone from Microsoft (who invented this extension) can collaborate on it?

@CaseyCarter Any appetite for that?

@pcc @mstorsjo Are we aware of anyone using these extensions?

I would like to suggest that we either remove this extension if it's not useful, or make it unconditional (not only on Windows) if we really think it's that useful (but I'd like that to come with at least a paper proposing to add them to the standard). Carrying around an extension that's only enabled on one platform (and not the most widely used platform for libc++ at that) is kind of awkward.

This extension is fairly essential - without it, you can't interact with files that have names outside of the 8 bit charset on Windows (and exactly what the 8 bit charset is, can vary from system to system). I can't point to a specific user of it, but I'd expect there to be numerous out there.

Making it universally available sounds like a sensible strategy forward - although I don't think I have the bandwidth to take on making it a standards proposal. Maybe someone from Microsoft (who invented this extension) can collaborate on it?

@CaseyCarter Any appetite for that?

This isn't an extension for us. The Standard mandates overloads of basic_filebuf::open, basic_ifstream::open, basic_ofstream::open, and basic_fstream::open, as well as constructors for basic_ifstream, basic_ofstream, and basic_fstream that take const filesystem::path::value_type* which are "only be [sic] provided on systems where filesystem::path::value_type is not char." ([fstream.syn]/3) Unsurprisingly, filesystem::path::value_type for us is wchar_t. If you're supporting our mess of narrow character encodings, you may want to consider wchar_t for filesystem::path on Win32 as well. If you are only supporting UTF-8, I suppose you could instead transcode filenames to UTF-16 and use _wfopen?

@pcc @mstorsjo Are we aware of anyone using these extensions?

I would like to suggest that we either remove this extension if it's not useful, or make it unconditional (not only on Windows) if we really think it's that useful (but I'd like that to come with at least a paper proposing to add them to the standard). Carrying around an extension that's only enabled on one platform (and not the most widely used platform for libc++ at that) is kind of awkward.

This extension is fairly essential - without it, you can't interact with files that have names outside of the 8 bit charset on Windows (and exactly what the 8 bit charset is, can vary from system to system). I can't point to a specific user of it, but I'd expect there to be numerous out there.

Making it universally available sounds like a sensible strategy forward - although I don't think I have the bandwidth to take on making it a standards proposal. Maybe someone from Microsoft (who invented this extension) can collaborate on it?

@CaseyCarter Any appetite for that?

This isn't an extension for us. The Standard mandates overloads of basic_filebuf::open, basic_ifstream::open, basic_ofstream::open, and basic_fstream::open, as well as constructors for basic_ifstream, basic_ofstream, and basic_fstream that take const filesystem::path::value_type* which are "only be [sic] provided on systems where filesystem::path::value_type is not char." ([fstream.syn]/3) Unsurprisingly, filesystem::path::value_type for us is wchar_t. If you're supporting our mess of narrow character encodings, you may want to consider wchar_t for filesystem::path on Win32 as well. If you are only supporting UTF-8, I suppose you could instead transcode filenames to UTF-16 and use _wfopen?

We also use wchar_t for filesystem::path::value_type on Windows, essentially following the MS STL when it comes to those details. I think also libstdc++ might have added these wchar_t overloads these days.