This is an archive of the discontinued LLVM Phabricator instance.

Return a canonical path from getClangResourcePath()
AbandonedPublic

Authored by aaron.ballman on Jun 1 2017, 10:30 AM.

Details

Summary

on POSIX systems, CIndexer::getClangResourcesPath() uses dladdr() to get the path of the shared object. It seems that on some systems (in our case, OS X 10.6.8), dladdr() does not return a canonicalized path. We're getting a path like PATH/TO/CLANG/build/bin/../lib/clang/4.0.0. This resource directory path is then used to calculate a hash used by CompilerInvocation::getModuleHash(). This, in turn, is causing Index/pch-from-libclang.c to fail for us because the module cache paths have different names -- the first path is calculated with PATH/TO/CLANG/build/lib/clang/4.0.0 and the second path uses PATH/TO/CLANG/build/bin/../lib/clang/4.0.0.

Fix this bug by returning a canonicalized path.

Diff Detail

Event Timeline

aaron.ballman created this revision.Jun 1 2017, 10:30 AM
bruno added a reviewer: bruno.Jun 1 2017, 6:35 PM
bruno added a subscriber: bruno.

Hi Aaron,

Nice catch! Any chance you can add a testcase to this?

Hi Aaron,

Nice catch! Any chance you can add a testcase to this?

There's already a test case that covers this: Index/pch-from-libclang.c -- it was failing on OS X 10.6 for us. If you have a better test case in mind, I can certainly add it.

akyrtzi edited edge metadata.Jun 2 2017, 9:36 AM

Getting the real path is notoriously slow (at least it's horrible in OSX, not sure about linux). Since this is about dropping the '/../' part, could we do some simple canonicalization of removing the dots ? Not sure if there is an existing function that does that.

Getting the real path is notoriously slow (at least it's horrible in OSX, not sure about linux). Since this is about dropping the '/../' part, could we do some simple canonicalization of removing the dots ? Not sure if there is an existing function that does that.

Given the fact that this path is cached, so you only need to do the slow path one time, are you sure that's a practical concern? I'm unaware of any other API that canonicalizes the path, which is what users of this API are going to expect.

I retract my comment, I see that getMainExecutable on OSX calls realpath as well, so it's good to use realpath in this code to make sure they are in-sync with how the compiler will determine the path.

bruno edited edge metadata.Jun 2 2017, 11:29 AM

I'm unaware of any other API that canonicalizes the path, which is what users of this API are going to expect.

You can also call sys::path::remove_dots(Path, /*remove_dot_dot=*/true);

There's already a test case that covers this: Index/pch-from-libclang.c -- it was failing on OS X 10.6 for us. If you have a better test case in mind, I can certainly add it.

Works for me.

I'm unaware of any other API that canonicalizes the path, which is what users of this API are going to expect.

You can also call sys::path::remove_dots(Path, /*remove_dot_dot=*/true);

Yes, but that does not canonicalize the path. For instance, it won't handle doubled-up slashes or symlinks.

Ultimately, the value returned from this API gets passed to POSIX functions which expect a canonical path. I don't think remove_dots() is sufficient. It should do the same thing as getMainExecutable(), I believe.

There's already a test case that covers this: Index/pch-from-libclang.c -- it was failing on OS X 10.6 for us. If you have a better test case in mind, I can certainly add it.

Works for me.

bruno accepted this revision.Jun 2 2017, 11:57 AM

Yes, but that does not canonicalize the path. For instance, it won't handle doubled-up slashes or symlinks.

Right.

Ultimately, the value returned from this API gets passed to POSIX functions which expect a canonical path. I don't think remove_dots() is sufficient. It should do the same thing as getMainExecutable(), I believe.

I see, LGTM then

This revision is now accepted and ready to land.Jun 2 2017, 11:57 AM
aaron.ballman abandoned this revision.Nov 26 2017, 6:30 AM