Page MenuHomePhabricator

DWARF: Add some support for non-native directory separators
ClosedPublic

Authored by labath on Jan 10 2019, 6:07 AM.

Details

Summary

If we opened a file which was produced on system with different path
syntax, we would parse the paths from the debug info incorrectly.

The reason for that is that we would parse the paths as they were
native. For example this meant that on linux we would treat the entire
windows path as a single file name with no directory component, and then
we would concatenate that with the single directory component from the
DW_AT_comp_dir attribute. When parsing posix paths on windows, we would
at least get the directory separators right, but we still would treat
the posix paths as relative, and concatenate them where we shouldn't.

This patch attempts to remedy this by guessing the path syntax used in
each compile unit. (Unfortunately, there is no info in DWARF which would
give the definitive path style used by the produces, so guessing is all
we can do.) Currently, this guessing is based on the DW_AT_comp_dir
attribute of the compile unit, but this can be refined later if needed
(for example, the DW_AT_name of the compile unit may also contain some
useful info). This style is then used when parsing the line table of
that compile unit.

This patch is sufficient to make the line tables come out right, and
enable breakpoint setting by file name work correctly. Setting a
breakpoint by full path still has some kinks (specifically, using a
windows-style full path will not work on linux because the path will be
parsed as a linux path), but this will require larger changes in how
breakpoint setting works.

Diff Detail

Repository
rLLDB LLDB

Event Timeline

labath created this revision.Jan 10 2019, 6:07 AM
aprantl added inline comments.Jan 10 2019, 9:06 AM
source/Plugins/SymbolFile/DWARF/DWARFUnit.h
256

I would probably call this DetectPathStyle() since it's more a heuristic?

zturner added inline comments.Jan 10 2019, 9:53 AM
source/Plugins/SymbolFile/DWARF/DWARFUnit.h
256

Maybe even Guess since Compute implies absolute certainty.

clayborg requested changes to this revision.Jan 10 2019, 10:25 AM

DW_AT_comp_dir isn't enough. See inline suggestions.

source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
775–787

I would suggest adding a "FileSpec DWARFUnit::GetFileSpec()" and moving code from SymbolFileDWARF::ParseCompileUnit into this function:

FileSpec cu_file_spec(cu_die.GetName());
if (cu_file_spec) {
  // If we have a full path to the compile unit, we don't need to
  // resolve the file.  This can be expensive e.g. when the source
  // files are
  // NFS mounted.
  if (cu_file_spec.IsRelative()) {
    const char *cu_comp_dir{
        cu_die.GetAttributeValueAsString(DW_AT_comp_dir, nullptr)};
    cu_file_spec.PrependPathComponent(resolveCompDir(cu_comp_dir));
  }

We might even cache the FileSpec object inside DWARFUnit so it doesn't have to be recomputed?

Then use this resolved path. DW_AT_comp_dir isn't always there, sometimes the DW_AT_name is a full path only. So we should centralize this in DWARFUnit. We might want to make a function like:

void DWARFUnit::ResolveCompileUnitDirectoryRelativeFile(FileSpec &spec);

And then have "FileSpec DWARFUnit::GetFileSpec()" call it. I was looking at the other uses of DW_AT_comp_dir in the source, and the line tables tend to need to resolve relative paths. A few other locations. Might be nice to get the DWARFUnit and resolve the path using that? We can add "FileSpec DWARFDie::GetPath()" accessor, could even add a "void DWARFDie::ResolveRelativePath(FileSpec &)"

This revision now requires changes to proceed.Jan 10 2019, 10:25 AM
labath updated this revision to Diff 181261.Jan 11 2019, 6:41 AM
  • move the comp_dir resolution logic from SymbolFileDWARF to DWARFUnit
  • this required the addition on an accessor to the "comp_dir_symlinks" setting, which is used in the full comp_dir resolution
  • add FileSpec::MakeAbsolute() to simplify bits of code
  • determine the path style from the DW_AT_name field if DW_AT_comp_dir is missing (I recall that it is possible to get a CU without a comp_dir in some circumstances, but I wasn't able to get my compiler to do that).
labath marked 4 inline comments as done.Jan 11 2019, 6:50 AM
labath added inline comments.
source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
775–787

I think this version mostly implements what you had in mind. I added a DWARFUnit::GetCompilationDirectory() which returns the DW_AT_comp_dir attribute (I would expect that DWARFUnit::GetFileSpec() would return DW_AT_name, which is not useful for relative path resolution, except as a hint for the path style).

I've also added FileSpec::MakeAbsolute to shorten the if(!spec.relative()) spec.prepend(dir)) pattern we had in some places. I didn't add any special FileSpec accessor to DWARFDie, as it didn't seem particularly useful (at this point). The only place where that's used now is for the computation of the name of the compile unit, and that's pretty concise now anyway.

source/Plugins/SymbolFile/DWARF/DWARFUnit.h
256

With all the other changes, this is now called ComputeCompDirAndGuessPathStyle. Note that this is just a private helper function and the public accessor is still called GetPathStyle()

clayborg added inline comments.Jan 15 2019, 8:19 AM
source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
769–774

Remove?

777

Just move contents of ComputeCompDirAndGuessPathStyle() to this function now that we don't have GetPathStyle?

818

Move contents to DWARFUnit::GetCompilationDirectory?

source/Plugins/SymbolFile/DWARF/DWARFUnit.h
173

Is GetPathStyle() needed? Seems like we should be able to grab the style from:

auto style = unit->GetCompilationDirectory().GetPathStyle();
256

If we remove GetPathStyle above, should this be removed and just put into the code for GetCompilationDirectory() now that we don't have two accessors?

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
757

Not sure how we would resolve this one. What if dwarf_cu doesn't have a DW_AT_comp_dir? Maybe we add a FileSpec constructor that takes a string and a FileSpec &relative_dir so this would would just become:

FileSpec cu_fs(cu_die.GetName(), dwarf_cu->GetCompilationDirectory());

The proto would be something like:

explicit FileSpec::FileSpec(llvm::StringRef path, const FileSpec &relative_root);

Then the if statement below goes away.

source/Utility/FileSpec.cpp
560

Should this be "bool FileSpec::ResolveRelativePath(const FileSpec &dir)"? Returns true if this path was relative and it was resolved, and false otherwise?

labath marked 5 inline comments as done.Jan 15 2019, 11:41 AM
labath added inline comments.
source/Plugins/SymbolFile/DWARF/DWARFUnit.h
173

I'd like to avoid that for two reasons:

  • I think this documents the intent better. Logically, the "path style" is a property of the compile unit and the fact that the compilation directory (and everything else) uses this path style is a consequence of that. The fact that we actually get the path style from the compilation directory is an implementation detail that users should not care about and may in fact change in the future e.g. if DWARF develops an ability to explicitly signal this information.
  • I'm not thrilled by the fact that we can have an empty (invalid) FileSpec, whose path style still has some significance (this happens if the CU does not have a comp_dir attribute, and we guess the style from the. In fact I even considered getting rid of that and having an explicit path style member). In the end I decided it's not too bad if it's internal to DWARFCompileUnit, but I would not want to expose this quirk to the outside world. So if anything, I'd actually go for making the path style a separate member.
source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
757

That would work if we say that we do support having empty/invalid FileSpec objects with certain PathStyle. However, as I said above, I am not entirely thrilled by that. FWIW, I don't think having this if is bad, as it makes it clear that how you're intending to handle the case when the CU has no name.

source/Utility/FileSpec.cpp
560

I called this MakeAbsolute because that's how the equivalent llvm function is called (llvm::sys::fs::make_absolute). I think it would be good to stay consistent here.

clayborg accepted this revision.Jan 15 2019, 1:21 PM

Ok, thanks for explaining. Your reasoning makes sense.

This revision is now accepted and ready to land.Jan 15 2019, 1:21 PM

Might be good to have a test where we have a relative DW_AT_name and not DW_AT_comp_dir? Just in case?

labath updated this revision to Diff 182010.Jan 16 2019, 4:33 AM
  • add a test for the no-DW_AT_comp_dir + relative-DW_AT_name case
This revision was automatically updated to reflect the committed changes.

The Windows Buildbot is failing because of one of the tests (SymbolFile/DWARF/dir-separator-no-comp-dir-relative-name.s) added in this change. You can see the failure here: http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/666. It looks like it is checking for the wrong path on Windows:

# CHECK: Line table for foo/a.c

         ^

<stdin>:11:1: note: scanning from here

Line table for foo\a.c in `dir-separator-no-comp-dir-relative-name.s.tmp

^


error: command failed with exit status: 1