This seems to be a "documented" quirk in libc++'s implementation of
weakly_canonical (in a fixme comment in the weakly_canonical test).
Together with a difference between windows and posix regarding whether
paths can go through nonexistent dirs, this results in a difference in
a trailing slash. Just document this as expected.
Details
- Reviewers
Mordante curdeius - Group Reviewers
Restricted Project - Commits
- rG7a154c32301d: [libcxx] [test] Account for differences in a trailing slash in weakly_canonical
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Wouldn't it be judicious to try to fix the underlying problem (as noted in the fixme comment)?
It seems to be of a similar nature as http://wg21.link/LWG3096, i.e. the fact whether (a part of) the path exists or not shouldn't matter.
WDYT?
Well, maybe, but I'm afraid that might open a potentially big can of worms... Also, for reference, both MS STL and libstdc++ seem to have the same quirk regarding weakly_canonical sometimes returning paths with a trailing slash and sometimes not.
In that case I think it would be better to remove the FIXME and document the quirk including the fact that this matches the behaviour of MS STL and libstdc++.
WDYT?
LGTM.
I must admit that I went down the rabbit hole a bit... But that's a different story and one might argue that the current behaviour is OK.
TL;DR:
The problem is in part in __weakly_canonical that should check whether the input path has (and so the output needs) a trailing separator.
The other part is in __canonical that calls ::realpath and this function, in turn, removes the trailing separator.
http://eel.is/c++draft/fs.op.weakly.canonical#2 says:
Postconditions: The returned path is in normal form ([fs.path.generic]).
http://eel.is/c++draft/fs.op.weakly.canonical#5:
Remarks: Implementations should avoid unnecessary normalization such as when canonical has already been called on the entirety of p.
So, I imply that canonical() should return the path in normal form.
But then, http://eel.is/c++draft/fs.op.canonical#1 says:
Effects: Converts p to an absolute path that has no symbolic link, dot, or dot-dot elements in its pathname in the generic format.
First of all, this phrase is ambiguous to me because it can be understood as either of:
- "in the generic format" concerns elements in its pathname: There is no such element of the path p that would be in the generic format.
- "in the generic format" concerns an absolute path: Converts p to an absolute path (...) in the generic format. So the result is in the generic format.
To be consistent with weakly_canonical, the interpretation should be 1.
Also, the normalization described in http://eel.is/c++draft/fs.path.generic#6 (as done by lexically_normal) does keep trailing separators (apart from the case where the path ends with "../" where it gets removed, cf. http://eel.is/c++draft/fs.path.generic#6.7).
Hence I would expect that weakly_canonical (and canonical) return normalized paths and keep trailing separators in the same way as lexically_normal does.
I see no reason for a distinction between existing and non-existing paths through the trailing separator.
So, the current tests for weakly_canonical seem wrong because they remove the trailing separator, specifically:
{static_env.SymlinkToDir / "dir2/.", static_env.Dir / "dir2"}, /// FIXME.... {static_env.SymlinkToDir / "dir2/./", static_env.Dir / "dir2"},
should be
{static_env.SymlinkToDir / "dir2/.", static_env.Dir / "dir2/"}, {static_env.SymlinkToDir / "dir2/./", static_env.Dir / "dir2/"},
I.e. having "path/." or "path/./" should not remove the separator after path.
The problem is in part in __weakly_canonical that should check whether the input path has (and so the output needs) a trailing separator.
The other part is in __canonical that calls ::realpath and this function, in turn, removes the trailing separator.
Thanks for doing the rabbit hole digging! That reinforces my initial feeling that I wouldn't want to go messing around in that unless strictly needed...