This is an archive of the discontinued LLVM Phabricator instance.

FileSpec objects that resolve to "." should have "." in m_filename and m_directory empty.
ClosedPublic

Authored by clayborg on May 11 2018, 3:56 PM.

Details

Summary

After switching to LLVM normalization, if we init FileSpec with "." we would end up with m_directory being NULL and m_filename being "".

This patch fixes this by allowing the path to be normalized and if it normalized to nothing, set it to m_filename.

Diff Detail

Event Timeline

clayborg created this revision.May 11 2018, 3:56 PM

This patch is needed for some PathMappingList fixes I have coming up.

Preserving the dot if it is the only path component is perfectly reasonable -- "" is not a valid path and we shouldn't convert a valid path into an invalid one. However, I think this should be done on the llvm side, in the remote_dots function and not as a workaround on top of it.

I agree, but we should still guard against it and test in in LLDB to ensure this doesn't regress. I'll be happy to make a LLVM patch. Many people have branches that link against older LLVMs and the check is cheap.

Adding a test for that in lldb is fine. If we're relying on some behavior we want to make sure it doesn't change without us noticing. However, I don't think we should make any special allowments for people using lldb with older versions of llvm. It sends the wrong message. Making this work in this case is easy enough, but this is just the tip of the iceberg. We have made many changes in the past where we add/modify/extend an llvm interface to make it work for our needs, and then almost immediately start using it in lldb (the last example of that is the switch to llvm path handling functions in FileSpec, if anyone tries use that with a older version of llvm, he'll get a bunch of errors). We're a single project with a single release schedule (I know a lot don't follow the official release schedule, but then they should at least cut from the same trunk revision; all the people that I know about do just that) and we don't support mixing revisions. If anyone tries to do things differently, he's on his own. If he's lucky, he'll get a build error. If not, he gets to track down subtle behavior differences.

So the function in llvm is called llvm::sys::path::remove_dots(...) and it is removing the dots. Not sure it is correct to be changing a function that says "remove_dots" to not remove dots and actually return something with a . in it... Seems like this should be taken care of in LLDB. Thoughts?

Preserving the dot if it is the only path component is perfectly reasonable -- "" is not a valid path and we shouldn't convert a valid path into an invalid one. However, I think this should be done on the llvm side, in the remote_dots function and not as a workaround on top of it.

so the issue isn't just keeping a single ".", but "./" and "foo/.." and many others that result in the "." as being the result. And I don't think changing a function that is supposed to remove dots to not actually remove them is a good idea.

So the function in llvm is called llvm::sys::path::remove_dots(...) and it is removing the dots. Not sure it is correct to be changing a function that says "remove_dots" to not remove dots and actually return something with a . in it... Seems like this should be taken care of in LLDB. Thoughts?

That's kinda true, but I think this is a corner case that the author simply didn't think of when writing that function. Even though it is not mentioned anywhere, I think that a part of the contract of that function is that it returns a *valid* path, which is *equivalent* to the input path. "" is neither valid nor equivalent to ".", so I think this should trump the desire to remove all dots. I don't think any user would want the current behavior.

In fact when you look at the function description, it speaks about removing "./" components. "." does not have a "./" component.

So yes, I still believe the place for this is in the remove_dots function. Feel free to add me to the llvm patch, and I'll help you argue the case. If it turns out that present behavior is actually desired/depended on by somebody, then we can do it on our side, but I expect this is just a bug/corner case that nobody cared for until now.

(Also, the function already does not remove leading .. components, so there is precedent already for not removing stuff when it causes the path to become not equivalent.)

clayborg added a comment.EditedMay 15 2018, 8:50 AM

So I made a fix to LLVM and there are tests that are testing for the empty string:

[ RUN      ] Support.RemoveDots
/Users/gclayton/Documents/src/llvm/clean/llvm/unittests/Support/Path.cpp:1149: Failure
      Expected: ""
To be equal to: remove_dots(".\\\\\\\\\\", false, path::Style::windows)
      Which is: "."
/Users/gclayton/Documents/src/llvm/clean/llvm/unittests/Support/Path.cpp:1166: Failure
      Expected: ""
To be equal to: remove_dots("./////", false, path::Style::posix)
      Which is: "."

Are we not concerned with this?

If those are the only tests that fail, then I'd still go for it, as I still firmly believe this is the correct behavior for such a function. Such a low level test does not have to mean that someone really cares about this, it could be more of a case of documenting existing behavior (the svn history is not really clear on this). If there are other llvm/clang tests failing as a result of changing that, then we can reconsider.

clayborg abandoned this revision.May 16 2018, 11:41 AM

Fixed LLDB test botes with:

Sending unittests/Utility/FileSpecTest.cpp
Transmitting file data .done
Committing transaction...
Committed revision 332511.

clayborg reclaimed this revision.May 16 2018, 4:46 PM

Reviving this patch so I can get the changes into LLDB. Clang is expecting an empty path in many locations and I don't feel comfortable changing remove_dots in clang after trying it. So I would like to use this patch to fix things in LLDB.

labath accepted this revision.May 17 2018, 3:47 AM

It would still be interesting to align remove_dots with the "standard practice", but it looks like that would end up being a big project. I think we can live with tweak on our side in this case. If someone ever comes around to changing remote_dots behavior, at least it we will be ready.

Thank you for trying though.

This revision is now accepted and ready to land.May 17 2018, 3:47 AM
This revision was automatically updated to reflect the committed changes.

This has broken the unit tests. Looks like a bad merge that did not take
into account the refactoring in r332088.

clayborg updated this revision to Diff 147345.May 17 2018, 10:13 AM

Updated to what was committed.

I did run unit tests and they all passed here?

Sorry, I was wrong about the cause, but the bots are red nonetheless. My bet is it's the last {"", "."}, test, which is not working because of an early return in SetFile. TBH, I am not sure we would want that to work anyway, as we too treat an empty filespec specially in a lot of places.

Fixed with:

svn commit unittests/Utility/FileSpecTest.cpp
Sending unittests/Utility/FileSpecTest.cpp
Transmitting file data .done
Committing transaction...
Committed revision 332633.

Fixed with
svn commit unittests/Utility/FileSpecTest.cpp
Sending unittests/Utility/FileSpecTest.cpp
Transmitting file data .done
Committing transaction...
Committed revision 332633.