This is an archive of the discontinued LLVM Phabricator instance.

[Clang] - Add '-gsplit-dwarf[=split,=single]' version for '-gsplit-dwarf' option.
ClosedPublic

Authored by grimar on Sep 20 2018, 3:45 AM.

Details

Summary

The DWARF5 specification says(Appendix F.1):

"The sections that do not require relocation, however, can be written to the relocatable object (.o) file but ignored by the
linker
or they can be written to a separate DWARF object (.dwo) file that need not be accessed by the linker."

The first part describes a single file split DWARF feature and there is no way to trigger this behavior atm.
Fortunately, no many changes are required to keep *.dwo sections in a .o, the patch does that.

Diff Detail

Event Timeline

grimar created this revision.Sep 20 2018, 3:45 AM

Do you/what's your particular use case for this scenario? I guess this looks a bit like Apple's format (where debug info stays in the object file and isn't linked into the final binary), but don't expect they'd be moving to this any time soon.

Do we generate the .dwo file directly these days? If not, I can imagine wanting to avoid the overhead of the objcopy hack; as long as the linker is smart enough not to bother with the .debug_*.dwo sections this seems like a build-time win.

We want to take the benefit that debug fission provides, but keep the debug information in the objects.
That is a bit less intrusive way than using regular split dwarf flow.

For a regular build and debug cycle that allows the linker to do minimal processing of the .debug_str.
Since .debug_str.dwo is ignored, that allows to significantly reduce the link time,
as strings processing is slow and string sections are large by themselves to handle them.

And at the same time the build system still does not see anything unusual/strange
as it sees only .o files like it is just a regular non-split-dwarf flow.

To summarise:

  • It shares most of the benefits with the .dwo solution.
  • Unlike .dwo, there is no need to split the file and it is friendlier to other tools (ccache, distcc, etc)
  • But unlike .dwo a distributed build system has to copy larger .o files.

Do we generate the .dwo file directly these days? If not, I can imagine wanting to avoid the overhead of the objcopy hack; as long as the linker is smart enough not to bother with the .debug_*.dwo sections this seems like a build-time win.

We do generate them generically with no objcopy hack.

As far as the standard text here, IMO it was just there in case people didn't have an objcopy around or don't want to split it. I'm not sure why we would want the ability. That said, if we do I'd rather have it as dwarf5 without split-dwarf as an option rather than a -gsingle-file-split-dwarf option.

Do we generate the .dwo file directly these days? If not, I can imagine wanting to avoid the overhead of the objcopy hack; as long as the linker is smart enough not to bother with the .debug_*.dwo sections this seems like a build-time win.

We do generate them generically with no objcopy hack.

As far as the standard text here, IMO it was just there in case people didn't have an objcopy around or don't want to split it. I'm not sure why we would want the ability. That said, if we do I'd rather have it as dwarf5 without split-dwarf as an option rather than a -gsingle-file-split-dwarf option.

Do we generate the .dwo file directly these days? If not, I can imagine wanting to avoid the overhead of the objcopy hack; as long as the linker is smart enough not to bother with the .debug_*.dwo sections this seems like a build-time win.

We do generate them generically with no objcopy hack.

Eric, could you elaborate for me your position, please

As far as the standard text here, IMO it was just there in case people didn't have an objcopy around or don't want to split it.

Yeah, we do not want to split it and I see no other way to say clang to keep them in a .o files rather than introducing the new flag.
Am I missing something?

I'm not sure why we would want the ability.
That said, if we do I'd rather have it as dwarf5 without split-dwarf as an option rather than a -gsingle-file-split-dwarf option.

What do you mean as "dwarf5 without split-dwarf as an option" here? Do you mean to do split-dwarf by default? It is orthogonal to what this patch does.

@grimar, this is an interesting observation which I've had on my mind for quite some time as well; a couple of things which I have not double-checked yet - just in case - do both gold and lld completely ignore dwo-related sections ? (did you check that ?), and another small question - just wondering if the debuggers (GDB and LLDB) are okay with it / or smth needs to be adjusted or fixed on their side. I guess everything should be fine, but asking just in case.

In D52296#1258677, @alexshap wrote:

@grimar, this is an interesting observation which I've had on my mind for quite some time as well; a couple of things which I have not double-checked yet - just in case - do both gold and lld completely ignore dwo-related sections ? (did you check that ?),

LLVM emits them with the SHF_EXCLUDE flag since the rL342800 "[lib/MC] - Set SHF_EXCLUDE flag for .dwo sections.". So if linker supports SHF_EXCLUDE flag properly, it will ignore it properly.
That is true for bfd/gold/LLD atm.

LLD does not ignore ".dwo" sections by name and I do not expect other linkers do (and that is fine, we do not want to ignore sections by name generally), so having this flag is an important and clean thing for things to work.

and another small question - just wondering if the debuggers (GDB and LLDB) are okay with it / or smth needs to be adjusted or fixed on their side. I guess everything should be fine, but asking just in case.

The patch for LLDB is ready to be landed: D52403. It waits for this one, since its test case mentions/uses -gsingle-file-split-dwarf option.
I am thinking about rewriting the comment and landing it independently.

I did not check the GDB yet.

I see, many thanks. I've cherry-picked this patch locally and played with GDB - it appears to work fine with it.
I'm also interested and support this change since this would simplify the adoption of Fission by some build systems.
@dblaikie, @echristo - are there any particular concerns with moving this forward ?

I talked to David @dblaikie offline about this diff yesterday, if I understood correctly this change is reasonable in general, @echristo Eric, would you mind having a look at this diff ?

@echristo

As far as the standard text here, IMO it was just there in case people didn't have an objcopy around or don't want to split it. I'm not sure why we would want the ability.

I think others have mentioned - but with distributed build it might be easier to keep it as a single file because your build system already knows how to ship around those files.

That said, if we do I'd rather have it as dwarf5 without split-dwarf as an option rather than a -gsingle-file-split-dwarf option.

Any suggestions on the name?

Maybe -gno-dwo? So we would write -genable-split-dwarf -gno-dwo?

Maybe -gno-dwo? So we would write -genable-split-dwarf -gno-dwo?

I'm not sure how everyone else feels about "-g" flags that modify debug behavior (like "-gsplit-dwarf") versus "-f" flags (eg: "-fdebug-types-section"). I kind of personally feel like "-g" flags are a bit odd & would think -f flags to modify debug info behavior, leaving "-g" mostly to turn on/off debug info (so, -g, -gN, -gmlt, etc - sort of like -O, -ON, -Os, etc).

by that logic -funsplit-split-dwarf ? Mostly goofy suggestion, vaguely trolling @echristo to see if he's got other/better ideas, or clearer feelings/understanding of history around the flag naming conventions (-f, -g, broad naming in general, etc).

grimar updated this revision to Diff 171480.Oct 29 2018, 3:57 AM

Ping.

  • Rebased.

Do we generate the .dwo file directly these days? If not, I can imagine wanting to avoid the overhead of the objcopy hack; as long as the linker is smart enough not to bother with the .debug_*.dwo sections this seems like a build-time win.

We do generate them generically with no objcopy hack.

Eric, could you elaborate for me your position, please

We don't use objcopy on linux or fuschia to generate .dwo files.

As far as the standard text here, IMO it was just there in case people didn't have an objcopy around or don't want to split it.

Yeah, we do not want to split it and I see no other way to say clang to keep them in a .o files rather than introducing the new flag.
Am I missing something?

I'm not sure why we would want the ability.
That said, if we do I'd rather have it as dwarf5 without split-dwarf as an option rather than a -gsingle-file-split-dwarf option.

What do you mean as "dwarf5 without split-dwarf as an option" here? Do you mean to do split-dwarf by default? It is orthogonal to what this patch does.

So, what are you trying to do here is, I guess, my other question.

Are you trying to take advantage of dwarf5 features, or are you just trying to have everything in one file for later splitting? Or something else? If you want to generate split dwarf (as opposed to a lot of dwarf5 features some of which are split dwarf) I'll bike shed that we need to come up with a better set of option naming here - I'd probably repurpose/alias -gsplit-dwarf and use -f options for whether or not to use specific features etc.

Can you elaborate on your motivations and what you're trying to do?

Thanks!

Can you elaborate on your motivations and what you're trying to do?

Thanks!

We want to see:

  • No extra files. The compiler produces just a .o.
  • The linker leaves most debug info in the .o files.

That makes the build friendly to existing tools and avoids most of the static linker work.

Can you elaborate on your motivations and what you're trying to do?

Thanks!

We want to see:

  • No extra files. The compiler produces just a .o.
  • The linker leaves most debug info in the .o files.

That makes the build friendly to existing tools and avoids most of the static linker work.

I guess in that case your distributed build system would have a constraint that it always ships all the object files back to the primary machine (where you'd be running the debugger)? (perhaps it just always runs the link locally - even though it distributes the compilations - I guess that's probably how things like distcc work, where they only inject themselves into the compilation command, not the linking command maybe?)

Also, may require some work/more flags to handle the possible file renaming (or relative/absolute adjustment) when object files are built on a remote system in one location, but then moved back to the local system and placed in another location?

I guess in that case your distributed build system would have a constraint that it always ships all the object files back to the primary machine (where you'd be running the debugger)? (perhaps it just always runs the link locally - even though it distributes the compilations - I guess that's probably how things like distcc work, where they only inject themselves into the compilation command, not the linking command maybe?)

AFAIK this is how distcc, icecc, SN-DBS all work. Compilations are distributed, the .o files come back, links are local.

Also, may require some work/more flags to handle the possible file renaming (or relative/absolute adjustment) when object files are built on a remote system in one location, but then moved back to the local system and placed in another location?

Any distributed build has to make this work, so the paths in the line table are usable. Not clear what you're thinking might be the problem with the split-in-.o case.

I guess in that case your distributed build system would have a constraint that it always ships all the object files back to the primary machine (where you'd be running the debugger)? (perhaps it just always runs the link locally - even though it distributes the compilations - I guess that's probably how things like distcc work, where they only inject themselves into the compilation command, not the linking command maybe?)

AFAIK this is how distcc, icecc, SN-DBS all work. Compilations are distributed, the .o files come back, links are local.

Thanks! I've never used them, so was just guessing/estimating.

Also, may require some work/more flags to handle the possible file renaming (or relative/absolute adjustment) when object files are built on a remote system in one location, but then moved back to the local system and placed in another location?

Any distributed build has to make this work, so the paths in the line table are usable. Not clear what you're thinking might be the problem with the split-in-.o case.

Fair point - same sort of problem, but might need distinct handling (ie: might not work "for free" with existing tools, but need more support), etc, depending on how it's solved?

Mostly just wondering whether Clang would need extra flags to support this - to get a sense of the total/overall solution/surface area of the feature, etc.

Any distributed build has to make this work, so the paths in the line table are usable. Not clear what you're thinking might be the problem with the split-in-.o case.

Fair point - same sort of problem, but might need distinct handling (ie: might not work "for free" with existing tools, but need more support), etc, depending on how it's solved?

Mostly just wondering whether Clang would need extra flags to support this - to get a sense of the total/overall solution/surface area of the feature, etc.

Shouldn't. We have licensees using distributed builds all day every day, but we haven't introduced anything into the compiler itself to make that work. On Windows, I know SN-DBS will intercept system calls to patch up filespecs. On Linux I'd expect the remote servers to set up chroot environments so the path names will Just Work, although I admit I've never actually looked. Never had to; it Just Works.
In this patch, where the .o normally points to the .dwo it would instead just point to itself, right? The linker doesn't need to fix anything up, it just ignores the .dwo sections.

Any distributed build has to make this work, so the paths in the line table are usable. Not clear what you're thinking might be the problem with the split-in-.o case.

Fair point - same sort of problem, but might need distinct handling (ie: might not work "for free" with existing tools, but need more support), etc, depending on how it's solved?

Mostly just wondering whether Clang would need extra flags to support this - to get a sense of the total/overall solution/surface area of the feature, etc.

Shouldn't. We have licensees using distributed builds all day every day, but we haven't introduced anything into the compiler itself to make that work. On Windows, I know SN-DBS will intercept system calls to patch up filespecs. On Linux I'd expect the remote servers to set up chroot environments so the path names will Just Work, although I admit I've never actually looked. Never had to; it Just Works.

Neat!

In this patch, where the .o normally points to the .dwo it would instead just point to itself, right?

Right - that's an absolute path, like the line table (between dir+name, if you don't override the base dir to something else, etc). So, yeah, if the solutions for that are implemented at a sufficiently low level as you've described above, sounds like there's a good chance it'd Just Work for this part too.

The linker doesn't need to fix anything up, it just ignores the .dwo sections.

*nod* Not suggesting the linker could/would be able to fix anything up here - but that the compiler might need extra command line features to help ensure the values were written in a way that was useful.

For example, I believe the Chromium build system may use -fdebug-compilation-dir and -fdebug-prefix-map to make cached distributed builds work (where it'd be impossible to bake in any particular path - because two users may be building Chromium from different locations, but they should be able to share build artefacts for maximal caching). But if that's not the usual problem/solution (or there are sufficiently many users/use cases that don't have that problem) - that's cool & it's nice if this just works for those users - and if, for whatever reason, that a Chromium-like situation comes up & wants non-split-split-DWARF & needs to address the problem, we can cross that bridge at that time (either by using the existing properties (like -fdebug-compilation-dir and -fdebug-prefix-map) if they provide enough information, or by adding other flags to that family to allow the user to do so)

grimar added a comment.Nov 1 2018, 4:38 AM

Nice :)
So seems the last unresolved question left is the naming of the new option?
If we do not want to go with -gsingle-file-split-dwarf then what it should be?

What about -fdebug-fission as an alias for -gsplit-dwarf.
And -fsingle-file-debug-fission for the new option?

Or, may be:

-fdebug-fission for the new option and then
-fdebug-fission -fdebug-split-dwo would work together as -gsplit-dwarf?

Nice :)
So seems the last unresolved question left is the naming of the new option?
If we do not want to go with -gsingle-file-split-dwarf then what it should be?

What about -fdebug-fission as an alias for -gsplit-dwarf.
And -fsingle-file-debug-fission for the new option?

Or, may be:

-fdebug-fission for the new option and then
-fdebug-fission -fdebug-split-dwo would work together as -gsplit-dwarf?

Only DWARF supports this, correct? So I would suggest: -fdwarf-fission[={split,single}] where the parameter defaults to split. Is there any particular value in having two separate options?

grimar added a comment.Nov 2 2018, 4:12 AM

Nice :)
So seems the last unresolved question left is the naming of the new option?
If we do not want to go with -gsingle-file-split-dwarf then what it should be?

What about -fdebug-fission as an alias for -gsplit-dwarf.
And -fsingle-file-debug-fission for the new option?

Or, may be:

-fdebug-fission for the new option and then
-fdebug-fission -fdebug-split-dwo would work together as -gsplit-dwarf?

Only DWARF supports this, correct?

I am not aware of any kind of debug information splitting except DWARF splitting.

So I would suggest: -fdwarf-fission[={split,single}] where the parameter defaults to split. Is there any particular value in having two separate options?

Probably not. -fdwarf-fission[={split,single}] sounds good for me.

Nice :)
So seems the last unresolved question left is the naming of the new option?
If we do not want to go with -gsingle-file-split-dwarf then what it should be?

What about -fdebug-fission as an alias for -gsplit-dwarf.
And -fsingle-file-debug-fission for the new option?

Or, may be:

-fdebug-fission for the new option and then
-fdebug-fission -fdebug-split-dwo would work together as -gsplit-dwarf?

Only DWARF supports this, correct?

I am not aware of any kind of debug information splitting except DWARF splitting.

So I would suggest: -fdwarf-fission[={split,single}] where the parameter defaults to split. Is there any particular value in having two separate options?

Probably not. -fdwarf-fission[={split,single}] sounds good for me.

Still not sure I understand the point, but this option and the alias sounds good to me.

Thanks.

-eric

grimar updated this revision to Diff 172953.Nov 7 2018, 7:16 AM
grimar retitled this revision from [Clang] - Add -gsingle-file-split-dwarf option. to [Clang] - Add -fdwarf-fission=split,single option..

Reimplemented option as -fdwarf-fission[=split,single].

Thanks! - though on reflection I'm going to invoke @echristo again about the naming. It's unfortunately a bit backwards that the pre-standard flag is -gsplit-dwarf and what we're proposing as the standard flag is -fdwarf-fission, when the DWARF standard doesn't use the word "Fission" at all (only "split DWARF").

Maybe it'd be easy enough/better to add the "=single" suffix to the existing "-gsplit-dwarf"? (as much as I'm still a bit confused by which debug options use a '-g' and which ones use a '-f', etc).

Really sorry to go back/forth/around about on this, George.

lib/Driver/ToolChains/Clang.cpp
2999–3001

Rather than creating a string in the case where there's no need for string matching/parsing, perhaps that could be avoided?

if (!A->getOption().matches(options::OPT_fdwarf_fission_EQ))
  return DwarfFissionKind::Split;

StringRef Value = A->getValue();
if (Value == "split")
...
3010–3011

I'd probably either accept this as a precondition (assert(!ArgIndex || ArgIndex == 0)) or do this bit at the start of the function rather than the end here, to make it simpler/clearer that all paths assign some value to ArgIndex before exit of the function (even though that does mean checking/assigning ArgIndex unnecessarily in the case where the argument is given, etc)

5854–5855

Could having more than one call to getDebugFissionKind lead to the error message (for -fdwarf-fission=garbage) being printed multiple times? Or is this call on a distinct/mutually exclusive codepath to the other?

Thanks! - though on reflection I'm going to invoke @echristo again about the naming. It's unfortunately a bit backwards that the pre-standard flag is -gsplit-dwarf and what we're proposing as the standard flag is -fdwarf-fission, when the DWARF standard doesn't use the word "Fission" at all (only "split DWARF").

Maybe it'd be easy enough/better to add the "=single" suffix to the existing "-gsplit-dwarf"? (as much as I'm still a bit confused by which debug options use a '-g' and which ones use a '-f', etc).

Really sorry to go back/forth/around about on this, George.

No problems. The naming of the new flags is an important thing, no need to rush here I believe.

grimar updated this revision to Diff 173135.Nov 8 2018, 2:38 AM
grimar marked an inline comment as done.
  • Addressed review comments.
lib/Driver/ToolChains/Clang.cpp
2999–3001

Done.

3010–3011

It turns out that A->getIndex() returned can be different from index in Args array.
So it is not correct to use something like size_t Index = A->getIndex(); ....; Arg* A = Args.getArgs()[Index].

I rewrote this method.

5854–5855

I think it should not be possible. Currently, there are 2 inclusions of the getDebugFissionKind . One is used for construction clang job,
and one is here, it is used to construct an assembler tool job. I do not see a way how it can be printed multiple times.

For example, the following invocation will print it only once:

clang main.c -o out.s -S -fdwarf-fission=foo
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
probinson added inline comments.Nov 8 2018, 11:42 AM
lib/Driver/ToolChains/Clang.cpp
5854–5855

The example you give here will not create an assembler job. Try
clang main.c -fdwarf-fission=foo -fno-integrated-as which I think will exercise the path David is asking about.

grimar added inline comments.Nov 9 2018, 1:39 AM
lib/Driver/ToolChains/Clang.cpp
5854–5855

Oh, I did not know.

Seems the important option to test is -save-temps. Seems I have the same result with -fno-integrated-as/-fintegrated-as,
but with/without -save-temps I got:

clang main.cpp -fdwarf-fission=foo -o 1.o
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='

clang main.cpp -fdwarf-fission=foo -o 1.o -save-temps
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='

Will fix, thanks!

(The behavior seems weird to me. I would not expect tool will be creating any jobs should happen when the command line is invalid from the start. I am not very familiar with the clang code/logic though to say it wrong).

grimar added inline comments.Nov 9 2018, 1:42 AM
lib/Driver/ToolChains/Clang.cpp
5854–5855

"I would not expect tool will be creating any jobs should happen" -> "I would not expect tool will be creating any jobs when ..."

grimar added a comment.Nov 9 2018, 3:16 AM

Looks like this behavior is what clang already have atm.
Messages for the options that use D.Diag to report invalid values can be printed multiple times sometimes.

The example is below:

clang main.cpp -fdwarf-fission=foo -o 1.o -mthread-model bar -fcf-runtime-abi=zed
clang-8: error: invalid thread model 'bar' in '-mthread-model bar' for this target
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: invalid CoreFoundation Runtime ABI 'zed'; must be one of 'objc', 'standalone', 'swift', 'swift-5.0', 'swift-4.2', 'swift-4.1'
clang main.cpp -fdwarf-fission=foo -o 1.o -mthread-model bar -fcf-runtime-abi=zed -save-temps
clang-8: error: invalid thread model 'bar' in '-mthread-model bar' for this target
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: invalid CoreFoundation Runtime ABI 'zed'; must be one of 'objc', 'standalone', 'swift', 'swift-5.0', 'swift-4.2', 'swift-4.1'
clang-8: error: invalid thread model 'bar' in '-mthread-model bar' for this target
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: invalid CoreFoundation Runtime ABI 'zed'; must be one of 'objc', 'standalone', 'swift', 'swift-5.0', 'swift-4.2', 'swift-4.1'
clang-8: error: invalid thread model 'bar' in '-mthread-model bar' for this target
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: invalid CoreFoundation Runtime ABI 'zed'; must be one of 'objc', 'standalone', 'swift', 'swift-5.0', 'swift-4.2', 'swift-4.1'
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='

So it seems to me this patch just follows the behavior and nothing should be fixed at this point right now?

grimar added inline comments.Nov 9 2018, 4:12 AM
lib/Driver/ToolChains/Clang.cpp
5854–5855

And returning to this my example, indeed it should not have to create assembler job.
My mistake, sorry, just my head was overloaded with tons of new things.

The correct example should be:

clang library.s -fdwarf-fission=foo -o 1.o
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='

So in this case clang creates the assembler job and prints the message only once.

If we add some *.cpp then we will see multiple messages (one for each job):

clang library.s main.cpp -fdwarf-fission=foo -o 1.o
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='

But this is consistent with what clang already do. For example -masm= will trigger the same behavior:

~/LLVM/build_lldb/bin/clang library.s main.cpp -fdwarf-fission=foo -o 1.o -masm=foo
clang-8: error: unsupported argument 'foo' to option 'masm='
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission='
clang-8: error: unsupported argument 'foo' to option 'masm='
clang-8: error: unsupported argument 'foo' to option 'fdwarf-fission=

i.e. for options that are used both in main clang job and "secondary" jobs, like as that is possible already.

OK - thanks for that.

I'm going to make an executive decision on the naming. Let's go with -gsplit-dwarf[=single] (or explicitly -gsplit-dwarf=split, which is the default value when -gsplit-dwarf is specified). Saves adding a new name/flag, avoids the use of the word Fission which isn't a standard term & so best avoided as a way of talking about configuring DWARF standardized functionality.

Could you make that change? & I'll give it another look over after that, but should be fine.

grimar updated this revision to Diff 173643.Nov 12 2018, 3:25 AM
grimar retitled this revision from [Clang] - Add -fdwarf-fission=split,single option. to [Clang] - Add '-gsplit-dwarf[=split,=single]' version for '-gsplit-dwarf' option..
grimar edited the summary of this revision. (Show Details)

Thanks, David!

Changed:

  • Introduced a -gsplit-dwarf[=split,=single] version for '-gsplit-dwarf' option
dblaikie added inline comments.Nov 12 2018, 2:26 PM
lib/Driver/ToolChains/CommonArgs.cpp
779–796

If this function now takes the output file name - could it be simplified to just always use that, rather than the conditional code that follows and tests whether -o is specified and builds up something like the output file name & uses the dwo suffix?

test/CodeGen/split-debug-single-file.c
13

This looks like an end-to-end test, which we don't usually do in Clang (or in the LLVM project in general).

For example, the previous testing for split-dwarf had a driver test (that tested only that the driver produced the right cc1 invocation) and a CodeGen test (that tested that the right IR was produced), but I don't see any test like this (that tested the resulting object file)?

I know there's a narrow gap in IR testing - things that don't go in the IR, but instead go through MCOptions & that the SplitDwarfFile is one of those.

So, yeah, in this case it's a bit of a test hole - if you're going to keep this test, perhaps it could test assembly, rather than the object file? Since it doesn't need complex parsing, etc, perhaps?

grimar added inline comments.Nov 13 2018, 5:27 AM
lib/Driver/ToolChains/CommonArgs.cpp
779–796

I am investigating this.

test/CodeGen/split-debug-single-file.c
13

So, yeah, in this case it's a bit of a test hole - if you're going to keep this test, perhaps it could test assembly, rather than the object file? Since it doesn't need complex parsing, etc, perhaps?

I am not sure assembly can help here. For example
clang main.c -S -g -gsplit-dwarf would create single asm file output anyways
and what I tried to check here is that we can either place .dwo sections into the
same output or into a different one depending on new option.

For example, the previous testing for split-dwarf had a driver test (that tested only that the driver produced the right cc1 invocation) and a CodeGen test (that tested that the right IR was produced), but I don't see any test like this (that tested the resulting object file)?

I think split-debug-filename.c do that?
See: https://github.com/llvm-mirror/clang/blob/master/test/CodeGen/split-debug-filename.c#L18

Also, relax.c: https://github.com/llvm-mirror/clang/blob/master/test/CodeGen/relax.c
And mips-inline-asm-abi.c: https://github.com/llvm-mirror/clang/blob/master/test/CodeGen/mips-inline-asm-abi.c

Seems it is not common, but still acceptable?

grimar added inline comments.Nov 13 2018, 7:29 AM
lib/Driver/ToolChains/CommonArgs.cpp
779–796

So what I see now is that when the function becomes:

const char *tools::SplitDebugName(const ArgList &Args, const InputInfo &Input,
                                  const InputInfo &Output) {
  SmallString<128> T(Output.getFilename());
  llvm::sys::path::replace_extension(T, "dwo");
  return Args.MakeArgString(T);
}

Then no clang tests (check-clang) fail. This does not seem normal to me.
I would expect that such a change should break some -fdebug-compilation-dir relative
logic at least and I suspect we just have no test(s) atm.

What about if I add test case(s) and post a follow-up refactor patch for this place?
(I started work on it, but it will take some time).

dblaikie accepted this revision.Nov 13 2018, 10:01 AM
dblaikie added inline comments.
lib/Driver/ToolChains/CommonArgs.cpp
779–796

Sounds good - thanks!

test/CodeGen/split-debug-single-file.c
13

Ah, I see/good point, split-debug-filename.c tests both the IR & then also tests the object headers.

Sounds good

This revision is now accepted and ready to land.Nov 13 2018, 10:01 AM
This revision was automatically updated to reflect the committed changes.

Only DWARF supports this, correct?

I am not aware of any kind of debug information splitting except DWARF splitting.

Just for the record, msvc does something very similar to this -gsplit-dwarf=single option (using a linker option /DEBUG:FASTLINK instead of this SHF_EXCLUDE approach).

Some questions:

  • How is this handled by dwp/llvm-dwp? I guess not yet since neither supports DWARF5 atm.
  • How does it interact with -gdwarf-5? Should it?
  • There is some deeply buried information about how gdb requires .debug_gnu_pubnames in the case of split dwarf: https://reviews.llvm.org/D54497?id=173943#1299461. Is this still true for this single split case? I'm asking cause these sections are huge and I wonder if they defeat the split dwarf link time gains.

Only DWARF supports this, correct?

I am not aware of any kind of debug information splitting except DWARF splitting.

Just for the record, msvc does something very similar to this -gsplit-dwarf=single option (using a linker option /DEBUG:FASTLINK instead of this SHF_EXCLUDE approach).

Some questions:

  • How is this handled by dwp/llvm-dwp? I guess not yet since neither supports DWARF5 atm.
  • How does it interact with -gdwarf-5? Should it?

This functionality should be pretty orthogonal to -gdwarf-5 and dwp - though it may not be currently, I haven't looked/checked. But seems to me this sort of functionality should work with pre-standard/DWARFv4 Split DWARF as well as it does with DWARFv5 Split DWARF, unless I'm missing something.

And dwp tools should be made to ignore the non-.dwo-suffixed sections, but they may not be made that way currently, I haven't checked.

  • There is some deeply buried information about how gdb requires .debug_gnu_pubnames in the case of split dwarf: https://reviews.llvm.org/D54497?id=173943#1299461. Is this still true for this single split case? I'm asking cause these sections are huge and I wonder if they defeat the split dwarf link time gains.

I believe it still is the case, yes. Though GDB startup times are much better with an index with or without Split DWARF - so if you want an index anyway, the split (single split, or split split) might still be a good thing independent of that. If you have to tradeoff one or the other, yeah, that may be more complicated (lld, at least, last I heard, uses more memory to make gdb-index than gold does, so while lld's memory usage is less than gold's (& certainly less than bfd's) usually, with gdb-index it's a bit worse - so the tradeoffs might be more complicated depending on which linker you're using, etc)