This is an archive of the discontinued LLVM Phabricator instance.

[PDB] Remove dots and canonicalize slashes when using /PDBSOURCEPATH
ClosedPublic

Authored by zturner on Feb 5 2019, 10:31 AM.

Details

Summary
In a previous patch, I made changes so that PDBs which were
generated on non-Windows platforms contained sensical paths
for the host.  While this is an esoteric use case, we need
it to be supported for certain cross compilation scenarios
especially with LLDB, which can debug things on non-Windows
platforms.

However, this regressed a case where you specify /PDBSOURCEPATH
and use a windows-style path.  Previously, we would still remove
dots and canonicalize slashes to backslashes, but since my
change intentionally tried to support non-backslash paths, this
was broken.

This patch fixes the situation by trying to guess which path
style the user is specifying when /PDBSOURCEPATH is passed.
It is intentionally conservative, erring on the side of a
Windows path style unless absolutely certain.  All dots are
removed and slashes canonicalized to whatever the deduced
path style is after appending the file path to the /PDBSOURCEPATH
argument.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

zturner created this revision.Feb 5 2019, 10:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2019, 10:31 AM

Do you understand why D53149 changed behavior on a Windows host?

I thought we had 2 issues:

  1. Your patch wasn't supposed to change behavior on Windows hosts, yet it did. Understand why and fix. This shouldn't need a change like this.
  1. In cross builds, it'd be nice if there was a way to get the Windows behavior, e.g. by looking for ':' (what this patch does).

Oh, I see http://reviews.llvm.org/rL344377 is different from https://reviews.llvm.org/D53149 . New in the non-reviewed bit is:

// Only apply native and dot removal to the relative file path.  We want to
// leave the path the user specified untouched since we assume they specified
// it for a reason.

Is that necessary for anything? That doesn't look like the behavior we want -- we do want /pdbsourcepath:c:\foo\bar and a file with a relative path of ..\..\blah to become c:\blah. Can we undo that bit to fix (1)?

// Only apply native and dot removal to the relative file path.  We want to
// leave the path the user specified untouched since we assume they specified
// it for a reason.

Is that necessary for anything? That doesn't look like the behavior we want -- we do want /pdbsourcepath:c:\foo\bar and a file with a relative path of ..\..\blah to become c:\blah. Can we undo that bit to fix (1)?

It's possible I'm misunderstanding your comment, but I actually did remove that in this patch?

Ah, I see.

It feels a bit weird to use the native path functions only to then manually replace slashes. What do you think about:

sys::path::Style guessed_style = Config->PDBSourcePath.startswith("/")
                                    ? sys::path::Style::posix
                                    : sys::path::Style::windows;

after the first return and then pass that to all the sys::path function calls in this function instead?

Ah, I see.

It feels a bit weird to use the native path functions only to then manually replace slashes. What do you think about:

sys::path::Style guessed_style = Config->PDBSourcePath.startswith("/")
                                    ? sys::path::Style::posix
                                    : sys::path::Style::windows;

after the first return and then pass that to all the sys::path function calls in this function instead?

Well, the second return (which gets entered if user did not specify /PDBSOURCEPATH) handles the case where it actually *must be* on the local file system, so I think those calls should still use the native path syntax. But it would probably be fine to use the sys::path functions everywhere else. The reason I didn't do it is because they deal with std::string, and it seemed silly to have to marshal this SmallString<> through a separate type just to satisfy the API. I could also have added an overload of convert_to_slash() that just dealt with SmallString, but ultimately they just boil down to a std::replace() anyway, so I just inlined it.

thakis accepted this revision.Feb 5 2019, 1:01 PM

lgtm either way, but below seems simpler to me.

(Also, isn't it GuessedStyle instead of guessed_style in llvm style?

lld/COFF/PDB.cpp
300 ↗(On Diff #185345)

It's already in a SmallString here though. Wouldn't it just be

sys::path::append(AbsolutFileName, guessed_style, FileName);
sys::path::native(AbsoluteFileName, guessed_style);
sys::path::remove_dots(AbsoluteFileName, /*remove_dot_dots=*/true,
                                        guessed_style);

?

(and put the sys::path::native(FileName) call above into the PDBSourcePath.empyt() if)

This revision is now accepted and ready to land.Feb 5 2019, 1:01 PM
This revision was automatically updated to reflect the committed changes.