Page MenuHomePhabricator

[Support/Path] Make handling of paths like "///" consistent
ClosedPublic

Authored by labath on Apr 23 2018, 2:52 AM.

Details

Summary

Various path functions were not treating paths consisting of slashes
alone consistently. For example, the iterator-based accessors decomposed the
path "///" into two elements: "/" and ".". This is not too bad, but it
is different from the behavior specified by posix:

A pathname that contains ***at least one non-slash character*** and that
ends with one or more trailing slashes shall be resolved as if a single
dot character ( '.' ) were appended to the pathname.

More importantly, this was different from how we treated the same path
in the filename+parent_path functions, which decomposed this path into
"." and "". This was completely wrong as it lost the information that
this was an absolute path which referred to the root directory.

This patch fixes this behavior by making sure all functions treat paths
consisting of (back)slashes alone the same way as "/". I.e., the
iterator-based functions will just report one component ("/"), and the
filename+parent_path will decompose them into "/" and "".

A slightly controversial topic here may be the treatment of "". Posix
says that paths beginning with "
" may have special meaning and indeed
we have code which parses paths like "net/foo/bar" specially. However,
as we were already not being consistent in parsing the "
" string
alone, and any special parsing for it would complicate the code further,
I chose to treat it the same way as longer sequences of slashes (which
are guaranteed to be the same as "/").

Another slight change of behavior is in the parsing of paths like
"net". Previously the last component of this path was ".". However,
as in our parsing the "net" part in this path was the same as the
"drive" part in "c:\" and the next slash was the "root directory", it
made sense to treat "
net" the same way as "net/" (i.e., not to add
the extra "." component at the end).

Diff Detail

Repository
rL LLVM

Event Timeline

labath created this revision.Apr 23 2018, 2:52 AM
zturner added inline comments.Apr 23 2018, 9:19 AM
unittests/Support/Path.cpp
186–190 ↗(On Diff #143518)

A comment here maybe in order. Should this behavior be true for both path styles?

espindola edited reviewers, added: Bigcheese; removed: espindola.Apr 23 2018, 9:12 PM
labath added inline comments.Apr 24 2018, 3:36 AM
unittests/Support/Path.cpp
186–190 ↗(On Diff #143518)

I am not sure which aspect of this do you want to me comment on, as there several potentially controversial things going on here:

  • decomposition of //: posix states that paths beginning with // are magic (I'm not sure if windows has a spec, but they are magic there as well), so treating it the same way as "/" is not entirely correct. It might be better to decompose this to ""+"//", but that is not completely consistent either, as elsewhere we treat "//net" as a single component, not as // followed by "net". (this holds for both path styles)
  • treatment of \\\ on windows. It seems windows just refuses to resolve paths like this (though it is fine with multiple backslashes elsewhere in the path), so treating this the same way as \ is not entirely correct. However, I think it is somewhat better than treating it as ""+".", which we were doing before. (this is a windows speciality as posix will happily accept paths like this).

In general, I'm not trying to be overly-prescriptive about the *right* behavior here. I am just trying to make sure we do something vaguely reasonable, and I am totally open to changing there is a need for it, now or in the future. (Maybe I should put that in the comment?) It's just that right now this was the easiest to implement.

labath updated this revision to Diff 144733.May 1 2018, 9:01 AM

Added an explanatory comment to the tests for "//", etc.

zturner accepted this revision.May 8 2018, 11:41 AM
This revision is now accepted and ready to land.May 8 2018, 11:41 AM
labath added a comment.May 9 2018, 6:21 AM

Thank you for the review.

This revision was automatically updated to reflect the committed changes.