This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Remove dots from getFilenameByIndex return value
AcceptedPublic

Authored by phosek on Sep 14 2020, 5:51 PM.

Details

Summary

When concatenating directory with filename in getFilenameByIndex, we
might end up with a path that contains extra dots. For example, if the
input is /path and ./example, we would return /path/./example. Run
sys::path::remove_dots on the output to eliminate unnecessary dots.

Diff Detail

Event Timeline

phosek created this revision.Sep 14 2020, 5:51 PM
phosek requested review of this revision.Sep 14 2020, 5:51 PM
dblaikie accepted this revision.Sep 14 2020, 5:59 PM

Got any particular examples of this from LLVM's DWARF output? I suspect this sort of thing might come from/lead to path non-canonicalization which would make the line table bigger than it needs to be, so might be worth fixing there too.

This revision is now accepted and ready to land.Sep 14 2020, 5:59 PM

Got any particular examples of this from LLVM's DWARF output? I suspect this sort of thing might come from/lead to path non-canonicalization which would make the line table bigger than it needs to be, so might be worth fixing there too.

Good question, I ran into this while testing D87656 in Fuchsia, for example I saw the following cases in one of our host tools:

/b/s/w/ir/k/staging/llvm_build/tools/clang/stage2-bins/./lib/clang/12.0.0/include/stddef.h
./../../prebuilt/third_party/clang/linux-x64/bin/../include/c++/v1/vector
../staging/llvm_build/tools/clang/stage2-bins/runtimes/runtimes-x86_64-unknown-linux-gnu-bins/libcxxabi/../libcxx/include/utility

It's not limited to just host targets, I saw the following in our kernel:

/usr/local/google/home/phosek/fuchsia/third_party/boringssl/src/crypto/fipsmodule/sha/../digest/../../internal.h
./../../zircon/kernel/lib/libc/include/../../../../../third_party/boringssl/src/include/openssl/sha.h
./../../prebuilt/third_party/clang/linux-x64/bin/../include/c++/v1/variant

We use -fdebug-prefix-map and -fdebug-compilation-dir in our build which may be contributing to this issue. I'll try to come up with a minimal reproducer.

So a trivial example is:

#include <string>
int main() {}

When compiled as:

clang++ test.cc -g3 -S -o test.S

I see paths like these in the output:

"/usr/local/google/home/phosek/fuchsia" "prebuilt/third_party/clang/linux-x64/bin/../include/c++/v1/cstddef"

We could go over driver code and make sure we always invoke sys::path::remove_dots inside methods like AddIncludePaths or addPathIfExists (see for example https://github.com/llvm/llvm-project/blob/b3afad046301d8bb1f4471aceaad704b87de3a69/clang/lib/Driver/ToolChains/Gnu.cpp#L2902 which where the .. in libc++ header path comes from), but maybe it'd be better to do the normalization uniformly when emitting debuginfo metadata?

So a trivial example is:

#include <string>
int main() {}

When compiled as:

clang++ test.cc -g3 -S -o test.S

I see paths like these in the output:

"/usr/local/google/home/phosek/fuchsia" "prebuilt/third_party/clang/linux-x64/bin/../include/c++/v1/cstddef"

We could go over driver code and make sure we always invoke sys::path::remove_dots inside methods like AddIncludePaths or addPathIfExists (see for example https://github.com/llvm/llvm-project/blob/b3afad046301d8bb1f4471aceaad704b87de3a69/clang/lib/Driver/ToolChains/Gnu.cpp#L2902 which where the .. in libc++ header path comes from), but maybe it'd be better to do the normalization uniformly when emitting debuginfo metadata?

Probably when emitting the LLVM IR Debug info, rather than at the backend of LLVM, so that the IR itself doesn't have problematic redundancies. CGDebugInfo::remapDIPath might be the place for this.

This revision was landed with ongoing or failed builds.Sep 14 2020, 8:31 PM
This revision was automatically updated to reflect the committed changes.

We should probably have a bit more direct testing for this behaviour, as I agree it's useful and we don't want to accidentally lose the behaviour in a future change. Maybe a unit test, I don't really mind.

Hi Petr Hosek,

It looks like some test failures are caused by the commit (042c23506869b4ae9a49d2c4bc5ea6e6baeabe78) on Windows target, please see the build bot http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/18411 for the details.

Could you please fix these test failures?

Thanks,
Maggie

This revision is now accepted and ready to land.Sep 15 2020, 5:11 PM
rnk added a subscriber: rnk.Sep 25 2020, 3:35 PM
rnk added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
1394

I don't think we can remove .. components, that wouldn't respect symlinks. Consider:

$ mkdir -p foo/bar/baz
$ ln -s .. foo/bar/foo
$ ls foo/bar/
baz  foo
$ ls foo/bar/foo/..
# contents of CWD

It would seem better to let the debugger sort it out.

Does this address @rnk 's point about symlinks? ( https://reviews.llvm.org/D87657#2296028 )

Does this address @rnk 's point about symlinks? ( https://reviews.llvm.org/D87657#2296028 )

Oops, wrong review - nevermind that.