Check the path length limit against the length of the UTF-16 version of
the input rather than the UTF-8 equivalent, as the UTF-16 length may be
shorter. Move widenPath from the llvm::sys::path namespace in Path.h to
the llvm::sys::windows namespace in WindowsSupport.h. Only use the
reduced path length limit for create directory. Canonicalize using
sys::path::remove_dots().
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/include/llvm/Support/FileSystem.h | ||
---|---|---|
1161 ↗ | (On Diff #247311) | Let's move this somewhere more obscure so clients like llvm-ar are less likely to find it and try to use it. I'd suggest WindowsSupport.h. Ideally we would make it local to Support, but we don't have any existing internal support headers. |
llvm/lib/Support/Windows/Path.inc | ||
84 | Why not always use the lower limit? What is the harm in using the UNC prefixes slightly more often? | |
llvm/unittests/Support/Path.cpp | ||
1888 | Is this the first introduction of a real UTF-8 source character? Are you confident that this will work portably? Do you need to write u8"pi" to make this work where utf-8 is not the local codepage? |
I'm not sure I agree with moving widen from sys::path to sys::fs. Is there consensus on the layering here? This adds dependencies for sys::fs on sys::path, but I would expect the dependencies to be in the other direction--the path style you use would depend on the underlying file system, not the other way around, right? The long path prefix on an absolute local path depends on the underlying file system being NTFS.
And, in my recent experience sys::path::is_absolute(..., sys::path::style::windows) seems buggy with UNC paths, so I'm skeptical if it works right with the long path prefix.
llvm/lib/Support/Windows/Path.inc | ||
---|---|---|
84 | The long path prefix isn't a UNC prefix. The long path prefix is an escape that tells many system APIs not to try to interpret the string as a file system path, but simply to pass it on to the underlying file system as an opaque string. That's great if you underlying file system is NTFS, that's great. I don't know whether something like a FAT32 flash drive handles that. If not, then there could be some harm to prematurely appending the long path prefix. That's probably pretty low risk. I think the ForCreateDir parameter makes this API a leaky abstraction and the risk is low. If you insist on having a way to enforce different limits, then I'd make the parameter be the path limit you want, default it to MAX_PATH, and let the special cases (like code that calls CreateDirectoryW) explain why they're setting it lower. | |
135 | Would it be appropriate to check (or at least assert) that RootName.size() <= Path8Str.size()? If somehow the entire input string could be considered the root name, then this reference would be out of bounds. | |
llvm/unittests/Support/Path.cpp | ||
675 | Does this patch, which allows longer paths on Windows, invalidate this test which checks that PATH_MAX is enforced? Presumably, createUniqueFile could use the new widen. Or is the issue here that PATH_MAX also applied to each component of a path? | |
1881 | const? | |
1904 | This seems like a better place to define Expected. |
llvm/lib/Support/Windows/Path.inc | ||
---|---|---|
84 | I think anything to help avoid having to mangle the input is beneficial. The need for this type of code is already bad enough. The real purpose of this function is to best prepare the path for the Win32 file APIs. It's not meant to be any kind of validation. So the Win32 file APIs could still fail for various reasons, such as other file system constraints. I like the idea to change ForCreateDir to MaxPathLen, thanks. | |
135 | Thanks for spotting this one! | |
llvm/unittests/Support/Path.cpp | ||
675 | As you've said, I believe that there is also a length limit to each component of a path, although I'm not sure what that exact limit is. | |
1881 | My usual style is to liberally use const where appropriate. But I've had objections to that in the past. As you've mentioned it here, I'll go back to my usual style and see if anyone objects. | |
1888 | Good point. It should be OK for all supported Win32 compilers, but to be on the safe side, I'll change it to hex representation to be sure. |
LGTM
I like Adrian's suggestion of having the caller pass in the maximum path length, thanks!
Why not always use the lower limit? What is the harm in using the UNC prefixes slightly more often?