
[Driver] Allow setting the DWO name DWARF attribute separately
Changes Planned · Public

Authored by aaronpuchert on Mar 21 2019, 3:21 PM.

Details

Reviewers
dblaikie
echristo
Summary

With Split DWARF the resulting object file (then called skeleton CU)
contains the file name of another ("DWO") file with the debug info.
This can be a problem for remote compilation, as it will contain the
name of the file on the compilation server, not on the client.

To use Split DWARF with remote compilation, one needs to either

  • make sure only relative paths are used, and mirror the build directory structure of the client on the server, or
  • inject the desired file name on the client directly.

Here we provide an option for the latter solution:

-fsplit-dwarf-dwo-name-attr=<file>

sets DW_AT_[GNU_]dwo_name without changing the DWO output file name.
For now we keep this as a CC1 option, but the idea is to promote it
eventually, if it catches on.

Based on a patch by Antonio Di Monaco.

Fixes PR40276.

Event Timeline

aaronpuchert created this revision. Mar 21 2019, 3:21 PM

Please include mention of the bug (PR40276) in the commit message & clarify that while this is useful for some remote compilation models, it's not strictly necessary or the only way to do it (a remote compilation model that keeps relative paths and uses compilation-dir instead can work without this attribute).

Use llvm-dwarfdump rather than llvm-objdump to dump the contents of the debug_info section and test the dwo_name there (rather than dumping hex) & probably the objdump part of the test isn't needed? (& I guess there's an LLVM patch to add the rest of this functionality?)

aaronpuchert edited the summary of this revision. Mar 21 2019, 5:40 PM

Use llvm-dwarfdump rather than llvm-objdump to dump the contents of the debug_info section and test the dwo_name there (rather than dumping hex)

I didn't know about llvm-dwarfdump, I wondered why llvm-objdump wouldn't pretty-print the debug info as objdump does. That makes a lot of sense then.

probably the objdump part of the test isn't needed?

Actually I only need to check that the DWO file is there (under the proper file name). Any ideas?

(& I guess there's an LLVM patch to add the rest of this functionality?)

What do we have to do there? Should we add the option to llc as well?

aaronpuchert edited the summary of this revision. Mar 21 2019, 6:06 PM

Use llvm-dwarfdump to inspect debug info, remove unneeded flags.

Use llvm-dwarfdump rather than llvm-objdump to dump the contents of the debug_info section and test the dwo_name there (rather than dumping hex)

I didn't know about llvm-dwarfdump, I wondered why llvm-objdump wouldn't pretty-print the debug info as objdump does. That makes a lot of sense then.

probably the objdump part of the test isn't needed?

Actually I only need to check that the DWO file is there (under the proper file name). Any ideas?

Ah, fair. You could actually test the dwo_name is accurate in the .dwo file (I added the dwo_name to the .dwo file so that multi-level dwp error messages could be more informative)

(& I guess there's an LLVM patch to add the rest of this functionality?)

What do we have to do there? Should we add the option to llc as well?

Yep - pretty much anything in MCOptions should be surfaced through llc for llvm-level testing.

Ah, fair. You could actually test the dwo_name is accurate in the .dwo file (I added the dwo_name to the .dwo file so that multi-level dwp error messages could be more informative)

Ok, I'll just check the dwo_name for both files then.
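
A check like this could be scripted roughly as follows. This is only a sketch: it assumes the non-verbose llvm-dwarfdump text format, in which the attribute appears as DW_AT_GNU_dwo_name (or DW_AT_dwo_name) followed by a quoted string in parentheses. The helper name dwo_names and the sample text are made up for illustration.

```python
import re
from typing import List

# Matches DW_AT_dwo_name and DW_AT_GNU_dwo_name followed by ("...").
# Assumption: non-verbose llvm-dwarfdump output; verbose output also prints
# the form and string offset and would need a different pattern.
_DWO_NAME_RE = re.compile(r'DW_AT_(?:GNU_)?dwo_name\s*\("([^"]*)"\)')


def dwo_names(dump_text: str) -> List[str]:
    """Return every dwo_name value found in llvm-dwarfdump text output."""
    return _DWO_NAME_RE.findall(dump_text)


# Illustrative sample, not captured from a real run.
sample = (
    "0x0000000b: DW_TAG_compile_unit\n"
    '              DW_AT_GNU_dwo_name\t("b.dwo")\n'
)
print(dwo_names(sample))
```

Running the same helper over the dumps of both the object file and the .dwo file would let a test compare the two names.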

(& I guess there's an LLVM patch to add the rest of this functionality?)

What do we have to do there? Should we add the option to llc as well?

Yep - pretty much anything in MCOptions should be surfaced through llc for llvm-level testing.

I was about to add this when I discovered that there is such an option already. The command line clang -cc1 -split-dwarf-file a -fsplit-dwarf-dwo-name-attr b corresponds to llc -split-dwarf-output a -split-dwarf-file b. That seems a bit unfortunate, especially the different meanings of -split-dwarf-file. Should/can we strive to emulate the behavior of llc in Clang instead? That would perhaps require changes to the behavior of -split-dwarf-file. Here's how the options behave with llc:

Option                | (not given) | -split-dwarf-file a
(not given)           | No split    | Split DI in same object, set DW_AT_GNU_dwo_name = a, no extra file
-split-dwarf-output b | No split    | Split DI, set DW_AT_GNU_dwo_name = a, separate file b
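
The table above can be restated as a small decision function. This is a hypothetical model of the described behavior, not llc's actual code; the names SplitDwarfResult and llc_split_dwarf are invented for the sketch.

```python
from typing import NamedTuple, Optional


class SplitDwarfResult(NamedTuple):
    split: bool               # is the debug info split at all?
    dwo_name: Optional[str]   # value of DW_AT_GNU_dwo_name, if any
    dwo_file: Optional[str]   # separate file written, if any


def llc_split_dwarf(split_dwarf_file: Optional[str] = None,
                    split_dwarf_output: Optional[str] = None) -> SplitDwarfResult:
    """Hypothetical model of the llc option table described in this thread."""
    if split_dwarf_file is None:
        # Without -split-dwarf-file nothing is split, whether or not
        # -split-dwarf-output is given.
        return SplitDwarfResult(False, None, None)
    # -split-dwarf-file sets the attribute; -split-dwarf-output, if given,
    # names the separate file, otherwise the split DI stays in the object.
    return SplitDwarfResult(True, split_dwarf_file, split_dwarf_output)
```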

Currently, Clang's -enable-split-dwarf[=split] -split-dwarf-file a does the same as -split-dwarf-output a -split-dwarf-file a in llc, and -enable-split-dwarf=single -split-dwarf-file a does the same as -split-dwarf-file a in llc.
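
To make the correspondence concrete, here is a sketch that maps the current clang -cc1 options to the llc options they are described as equivalent to. The function name cc1_to_llc is hypothetical; only the option spellings come from this thread.

```python
from typing import List


def cc1_to_llc(split_dwarf_file: str, mode: str = "split") -> List[str]:
    """Hypothetical helper: llc options matching
    clang -cc1 -enable-split-dwarf=<mode> -split-dwarf-file <file>."""
    if mode == "split":
        # The same name is used for the output file and the attribute.
        return ["-split-dwarf-output", split_dwarf_file,
                "-split-dwarf-file", split_dwarf_file]
    if mode == "single":
        # Only the attribute is set; no separate file is written.
        return ["-split-dwarf-file", split_dwarf_file]
    raise ValueError(f"unknown split DWARF mode: {mode!r}")
```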

Ok, here is an idea. We introduce -split-dwarf-output in Clang instead of -fsplit-dwarf-dwo-name-attr. If given, it overrides the output file name for the Split DWARF file, which we otherwise take from -split-dwarf-file. The option is obviously incompatible with -enable-split-dwarf=single, so we will disallow that. This should be backwards-compatible, and bring the behavior of llc and clang -cc1 closer together. What do you think?
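
As a sketch of the proposed semantics (an illustration of the idea only, not Clang code; the function name and return shape are assumptions):

```python
from typing import Optional, Tuple


def proposed_cc1_split_dwarf(split_dwarf_file: str,
                             split_dwarf_output: Optional[str] = None,
                             mode: str = "split") -> Tuple[str, Optional[str]]:
    """Return (dwo_name attribute, separate output file or None)."""
    if mode == "single":
        if split_dwarf_output is not None:
            # The proposal disallows this combination.
            raise ValueError(
                "-split-dwarf-output is incompatible with "
                "-enable-split-dwarf=single")
        return split_dwarf_file, None
    # -split-dwarf-output overrides the output file name; otherwise it is
    # taken from -split-dwarf-file, as before.
    output = (split_dwarf_output if split_dwarf_output is not None
              else split_dwarf_file)
    return split_dwarf_file, output
```

Omitting -split-dwarf-output gives the same result as today, which is the backwards-compatibility argument made above.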

Ok, here is an idea. We introduce -split-dwarf-output in Clang instead of -fsplit-dwarf-dwo-name-attr. If given, it overrides the output file name for the Split DWARF file, which we otherwise take from -split-dwarf-file. The option is obviously incompatible with -enable-split-dwarf=single, so we will disallow that. This should be backwards-compatible, and bring the behavior of llc and clang -cc1 closer together. What do you think?

Sure, I think the naming's a bit weird (but hard to come up with good names for any of this) - but consistency seems like a good first pass at least, given we're halfway there anyway.

aaronpuchert planned changes to this revision. Sun, Apr 14, 6:29 PM

Sure, I think the naming's a bit weird (but hard to come up with good names for any of this)

Agreed, -split-dwarf-output is pretty clear, but -split-dwarf-file could be both the actual filename or the attribute.

There is test/CodeGen/split-debug-filename.c checking that we emit !DICompileUnit({{.*}}, splitDebugFilename: "foo.dwo" into the IR under some circumstances, but I'm not sure why. It seems to be ignored by llc. What's the intended use for this? Your commit message in rC301063 suggests that this is needed for implicit modules. How does this work? I'm asking of course because I'm not sure whether we might need to emit both file names there.

Sure, I think the naming's a bit weird (but hard to come up with good names for any of this)

Agreed, -split-dwarf-output is pretty clear, but -split-dwarf-file could be both the actual filename or the attribute.

There is test/CodeGen/split-debug-filename.c checking that we emit !DICompileUnit({{.*}}, splitDebugFilename: "foo.dwo" into the IR under some circumstances, but I'm not sure why. It seems to be ignored by llc.

For implicit modular debug info I think it probably isn't ignored. See DwarfDebug.cpp:~630.

What's the intended use for this? Your commit message in rC301063 suggests that this is needed for implicit modules. How does this work? I'm asking of course because I'm not sure whether we might need to emit both file names there.

So implicit modules can use a sort of pseudo-split DWARF. The .pcm (module file) itself is an object file in this mode - with the split DWARF and a section containing the AST bitcode. Then in the normal source files that use the .pcm, they get the usual non-split DWARF, plus skeleton CUs that reference the .pcm file (& contain some other attributes for regenerating it in case it's been cleaned up from the module cache). This is why the file name can be (& has to be) passed down through the IR. It /can/ be, because this compilation isn't choosing the file name, so there's no issue with the .dwo file name changing between IR generation and object generation (as there is with LTO, where the compilation doesn't know the final object name, only the linker knows that). And it /must/ be done this way due to multiple skeleton CUs if the main CU references multiple .pcm debug info.