This is an archive of the discontinued LLVM Phabricator instance.

[Driver] Don't make -gsplit-dwarf imply -g2
ClosedPublic

Authored by MaskRay on May 21 2020, 11:13 AM.

Details

Summary

RFC: http://lists.llvm.org/pipermail/cfe-dev/2020-May/065430.html
Agreement from GCC: https://sourceware.org/pipermail/gcc-patches/2020-May/545688.html

g_flags_Group options generally don't affect the amount of debugging
information. -gsplit-dwarf is an exception. Its order dependency with
other gN_Group options make it inconvenient in a build system:

  • -g0 -gsplit-dwarf -> level 2 -gsplit-dwarf "upgrades" the amount of debugging information despite the previous intention (-g0) to drop debugging information
  • -g1 -gsplit-dwarf -> level 2 -gsplit-dwarf "upgrades" the amount of debugging information.
  • If we have a higher-level -gN, -gN -gsplit-dwarf will supposedly decrease the amount of debugging information. This happens with GCC -g3.

The non-orthogonality has confused many users. GCC 11 will change the semantics
(-gsplit-dwarf no longer implies -g2) despite the backwards compatibility break.
This patch matches its behavior.

New semantics:

  • If there is a g_Group, allow split DWARF if useful: (not one of the following: -g0, -gline-directives-only, -g1 -fno-split-dwarf-inlining)
  • Otherwise, no-op.

To restore the original behavior, replace -gsplit-dwarf with -gsplit-dwarf -g.

Diff Detail

Event Timeline

MaskRay created this revision.May 21 2020, 11:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 21 2020, 11:13 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I think it's probably best to keep this subject on the cfe-dev thread until there's agreement there.

Received Cary's response and I am authorized to share this

In retrospect, I regret not naming the option -fsplit-dwarf, which clearly
would not have implied -g, and would have fit in with a few other
dwarf-related -f options. (I don't know whether Richard's objection to it is
because there is already -gsplit-dwarf, or if he would have objected to it as
an alternative-universe spelling.)

At the time, I thought it was fairly common for all/most -g options (except
-g0) to imply -g. Perhaps that wasn't true then or is no longer true now. If
the rest of the community is OK with changing -gsplit-dwarf to not imply -g,
and no one has said it would cause them any hardship, I'm OK with your
proposed change.

My remark here: GCC folks think introducing another -fsplit-dwarf will be more confusing at this point.

I did design it so that you could get the equivalent by simply writing
"-gsplit-dwarf -g0" at the front of the compiler options (e.g., via an
environment variable), so that a subsequent -g would not only turn on debug
but would also enable split-dwarf. We used that fairly regularly at Google.

Regarding how the build system can discover whether or not split dwarf is in
effect, without parsing all the options presented to gcc, and without looking
for the side effects (the .dwo files), we dodged that in the Google build
system by having a higher-level build flag, --fission, which would tell the
build system to pass -gsplit-dwarf to gcc AND look for the .dwo files produced
on the side. We simply disallowed having the user pass -gsplit-dwarf directly
to the compiler.

Feel free to share this.

My take of "-gsplit-dwarf implies -g" on cfe-dev is that people don't have strong opinions on updated semantics but they do notice that the implied -g2 is confusing.

@dblaikie I think we can proceed with this patch. I'll follow up on GCC side. Clang 11 will be consistent with GCC 11.

Ping...

Was hoping to get some more discussion on the general debug flag naming thread on the GCC list, but that hasn't happened - so I've replied/poked it a bit: https://gcc.gnu.org/pipermail/gcc/2020-July/233282.html (@echristo - sorry, thought you were on that thread, guess you got dropped somewhere along the way (I saw an "Eric" in the reply line and figured it was you) - wouldn't mind if you took a look/weighed in, if you've got anything to share on the topic)

dblaikie accepted this revision.Dec 7 2020, 12:41 PM

Looks like GCC has changed the semantics here despite the backwards compatibility break - as much as I'd rather not break backwards compatibility it's probably best on balance to maintain GCC compatibility instead: https://godbolt.org/z/zGanxE

Could you either provide in this patch or a follow-up, a -gno-split-dwarf flag? (probabyl better in a follow-up)

Could you check that this is compatible with the cmake split DWARF support in the LLVM build system, and with Google's internal use, before committing? (not sure there are any other users of this feature we can effectively update - or that I specifically know of)

clang/lib/Driver/ToolChains/Clang.cpp
3779

Rather than undoing the DwarfFission uinitialization, instead could the DwarfFission initialization be rolled into the "if (gN_Group)" condition?

This revision is now accepted and ready to land.Dec 7 2020, 12:41 PM
MaskRay added a comment.EditedDec 7 2020, 3:49 PM

Looks like GCC has changed the semantics here despite the backwards compatibility break - as much as I'd rather not break backwards compatibility it's probably best on balance to maintain GCC compatibility instead: https://godbolt.org/z/zGanxE

Could you either provide in this patch or a follow-up, a -gno-split-dwarf flag? (probabyl better in a follow-up)

Sounds good. Will add -gno-split-dwarf

Could you check that this is compatible with the cmake split DWARF support in the LLVM build system, and with Google's internal use, before committing? (not sure there are any other users of this feature we can effectively update - or that I specifically know of)

Will do! I can test whether ("-gsplit-dwarf" -> "-gsplit-dwarf -g") exposes any issue. The new -gsplit-dwarf -g should be equivalent to the old -gsplit-dwarf

Looks like GCC has changed the semantics here despite the backwards compatibility break - as much as I'd rather not break backwards compatibility it's probably best on balance to maintain GCC compatibility instead: https://godbolt.org/z/zGanxE

Could you either provide in this patch or a follow-up, a -gno-split-dwarf flag? (probabyl better in a follow-up)

Sounds good. Will add -gno-split-dwarf

Great, thanks!

Could you check that this is compatible with the cmake split DWARF support in the LLVM build system, and with Google's internal use, before committing? (not sure there are any other users of this feature we can effectively update - or that I specifically know of)

Will do! I can test whether ("-gsplit-dwarf" -> "-gsplit-dwarf -g") exposes any issue. The new -gsplit-dwarf should be equivalent to the old -gsplit-dwarf -g

Yeah, likely what would be suitable would be to add the -g to both those use cases before committing the change in behavior.

MaskRay added inline comments.Dec 7 2020, 5:16 PM
clang/lib/Driver/ToolChains/Clang.cpp
3779

I could add the OPT_g_Group check to getDebugFissionKind, but that would make the effects of -g0 and "no -g at all" into two places. Having the code here makes it closer to "how DebugInfoKind == codegenoptions::NoDebugInfo affects fission" above.

dblaikie added inline comments.Dec 7 2020, 5:41 PM
clang/lib/Driver/ToolChains/Clang.cpp
3779

Sorry, I'm not following what you're suggesting/discussing. I wasn't meaning to suggest duplicating the OPT_g_Group check.

Code something like:

if (const Arg *A = Args.getLastArg(options::OPT_g_Group)) {
  DebugInfoKind = codegenoptions::LimitedDebugInfo;

  Arg* SplitDWARFArg;
  DwarfFission = getDebugFissionKind(D, Args, SplitDWARFArg);

  if (DwarfFission != DwarfFissionKind::None &&
      !checkDebugInfoOption(SplitDWARFArg, Args, D, TC)) {
    DwarfFission = DwarfFissionKind::None;
    SplitDWARFInlining = false;
  }

  // If the last option explicitly specified a debug-info level, use it.
  if (checkDebugInfoOption(A, Args, D, TC) &&
      A->getOption().matches(options::OPT_gN_Group)) {
    DebugInfoKind = DebugLevelToInfoKind(*A);
    // For -g0 or -gline-tables-only, drop -gsplit-dwarf. This gets a bit more
    // complicated if you've disabled inline info in the skeleton CUs
    // (SplitDWARFInlining) - then there's value in composing split-dwarf and
    // line-tables-only, so let those compose naturally in that case.
    if (DebugInfoKind == codegenoptions::NoDebugInfo ||
        DebugInfoKind == codegenoptions::DebugDirectivesOnly ||
        (DebugInfoKind == codegenoptions::DebugLineTablesOnly &&
         SplitDWARFInlining))
      DwarfFission = DwarfFissionKind::None;
  }
}

This way, hopefully, there wouldn't be a period where DwarfFission is incorrectly set and has to be reverted to "None", making it less likely the incorrect value could be used accidentally

MaskRay updated this revision to Diff 310073.Dec 7 2020, 6:55 PM
MaskRay edited the summary of this revision. (Show Details)

Comment

MaskRay marked 2 inline comments as done.Dec 7 2020, 6:55 PM

Looks like GCC has changed the semantics here despite the backwards compatibility break - as much as I'd rather not break backwards compatibility it's probably best on balance to maintain GCC compatibility instead: https://godbolt.org/z/zGanxE

Could you either provide in this patch or a follow-up, a -gno-split-dwarf flag? (probabyl better in a follow-up)

Sounds good. Will add -gno-split-dwarf

D92809

Could you check that this is compatible with the cmake split DWARF support in the LLVM build system, and with Google's internal use, before committing? (not sure there are any other users of this feature we can effectively update - or that I specifically know of)

Will do! I can test whether ("-gsplit-dwarf" -> "-gsplit-dwarf -g") exposes any issue. The new -gsplit-dwarf -g should be equivalent to the old -gsplit-dwarf

Performed an internal test -> passed.

For llvm/cmake/modules/HandleLLVMOptions.cmake, LLVM_USE_SPLIT_DWARF only works with DEBUG/RELWITHDEBINFO, so there should be no difference.

I have also notified ChromeOS/Android NDK/Bazel maintainers since they use -gsplit-dwarf (though might not be affected).

This revision was landed with ongoing or failed builds.Dec 8 2020, 1:14 PM
This revision was automatically updated to reflect the committed changes.

Do we emit a warning if we see -gsplit-dwarf alone? Should we?

thakis added inline comments.Jan 13 2021, 12:50 PM
clang/docs/ReleaseNotes.rst
120

This apparently bit us (chromium) in thinlto configs. Could this here be more explicit that -gsplit-dwarf now needs an explicit -g2 in cflags, and if using thinlto, in ldflags as well?

MaskRay added a comment.EditedJan 13 2021, 1:04 PM

Do we emit a warning if we see -gsplit-dwarf alone? Should we?

{gcc,clang} -gcolumn-info -fdebug-types-section -c -Wall a.c does not warn about unused -gcolumn-info. Not having a warning makes such toggle flags more independent. Users can add them as default without checking about -g levels.

This apparently bit us (chromium) in thinlto configs. Could this here be more explicit that -gsplit-dwarf now needs an explicit -g2 in cflags, and if using thinlto, in ldflags as well?

(I CCed you in a message half a year ago. My apologies, I probably should notified you again)

dblaikie added inline comments.Jan 13 2021, 1:05 PM
clang/docs/ReleaseNotes.rst
120

Arguably I think maybe it should be fixed/made to work for (thin)lto with -g2 in cflags and -gsplit-dwarf in ldflags, without the need to specify -g2 in ldflags (if that's what you're saying is currently required). Since the -gN level would hopefully at best be ignored (because the IR carries the N in -gN with it, and it'd be unfortunate it enabling split dwarf meant you had to ignore all your per-file -gN levels, or that the -gN is ignored except for allowing -gsplit-dwarf to be respected).

@MaskRay ?

MaskRay added inline comments.Jan 13 2021, 1:12 PM
clang/docs/ReleaseNotes.rst
120

LLD LTO does not have debug related options. ld -g is an ignored compatibility option. LLD somewhat inherited that from GNU ld. I don't think -gsplit-dwarf or -g can be separately controlled in ldflags.

@thakis Can you share the crbug.com discussion thread?

dblaikie added inline comments.Jan 13 2021, 1:16 PM
clang/docs/ReleaseNotes.rst
120

Hmm, I guess the thing that bit Chromium would've been the need to add -g2 to cflags where previously it might've been just -gsplit-dwarf for cflags and ldflags.

But it doesn't look like -g2 is needed in ldflags, only passing it to the compilation seems sufficient, then the -gsplit-dwarf passed to the link step is respected:

$ clang++-tot -flto=thin test.cpp -g -c && clang++-tot -flto=thin -fuse-ld=lld -gsplit-dwarf test.o && llvm-dwarfdump-tot a.out
a.out:  file format elf64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x0000002c, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000030)

0x0000000b: DW_TAG_compile_unit
              DW_AT_stmt_list   (0x00000000)
              DW_AT_comp_dir    ("/usr/local/google/home/blaikie/dev/scratch")
              DW_AT_GNU_dwo_name        ("a.out_dwo/1.dwo")
              DW_AT_GNU_dwo_id  (0xf29563d7c812deae)
              DW_AT_low_pc      (0x0000000000201730)
              DW_AT_high_pc     (0x0000000000201738)
              DW_AT_GNU_addr_base       (0x00000000)
$ clang++-tot -flto=thin test.cpp -g -c && clang++-tot -flto=thin -fuse-ld=lld  test.o && llvm-dwarfdump-tot a.out
a.out:  file format elf64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000047, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x0000004b)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer    ("clang version 12.0.0 (git@github.com:llvm/llvm-project.git 0d88d7d82bc44b211a8187650a06c6cd3492186a)")
              DW_AT_language    (DW_LANG_C_plus_plus_14)
              DW_AT_name        ("test.cpp")
              DW_AT_stmt_list   (0x00000000)
              DW_AT_comp_dir    ("/usr/local/google/home/blaikie/dev/scratch")
              DW_AT_low_pc      (0x0000000000201730)
              DW_AT_high_pc     (0x0000000000201738)

0x0000002a:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000201730)
                DW_AT_high_pc   (0x0000000000201738)
                DW_AT_frame_base        (DW_OP_reg6 RBP)
                DW_AT_name      ("main")
                DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/test.cpp")
                DW_AT_decl_line (3)
                DW_AT_type      (0x00000043 "int")
                DW_AT_external  (true)

0x00000043:   DW_TAG_base_type
                DW_AT_name      ("int")
                DW_AT_encoding  (DW_ATE_signed)
                DW_AT_byte_size (0x04)

0x0000004a:   NULL

For Chrome on Chrome OS, this is https://crbug.com/1158215

Here, we saw our links fail with "output file too large". Investigation revealed that debug info was being included in the binary, instead of in .dwo files as expected. That turned out to be caused by this change.

It seems that there are cases where -gsplit-dwarf is passed, but it does not take effect because there isn't also a -g (or -g<level>) option:

clang -c -fthinlto-index=foo.o.thinlto.bc -gsplit-dwarf foo.o -o foo.bin.o

Shows no .dwo file and debug info in foo.bin.o, whereas

clang -c -fthinlto-index=foo.o.thinlto.bc -gsplit-dwarf -g foo.o -o foo.bin.o

shows a .dwo file which is referenced from foo.bin.o

The only thing that is different here is that the latter command has "-g" whereas the former doesn't.

In other words, the new implementation still doesn't make -g and -gsplit-dwarf fully orthogonal; instead of -gsplit-dwarf always turning on debug fission, it *only* turns it on when there is also a -g or -g<level> option - at least in distributed ThinLTO code generation.

It seems that orthogonality was the motivation for this change, and that -gsplit-dwarf without -g is supposed to work in the linker (per MaskRay's comment about -g being ignored by the linker).

Can we make it so that -gsplit-dwarf turns on debug fission and we then use that even if no -g options were passed?

For Chrome on Chrome OS, this is https://crbug.com/1158215

Here, we saw our links fail with "output file too large". Investigation revealed that debug info was being included in the binary, instead of in .dwo files as expected. That turned out to be caused by this change.

It seems that there are cases where -gsplit-dwarf is passed, but it does not take effect because there isn't also a -g (or -g<level>) option:

clang -c -fthinlto-index=foo.o.thinlto.bc -gsplit-dwarf foo.o -o foo.bin.o

Shows no .dwo file and debug info in foo.bin.o, whereas

clang -c -fthinlto-index=foo.o.thinlto.bc -gsplit-dwarf -g foo.o -o foo.bin.o

shows a .dwo file which is referenced from foo.bin.o

The only thing that is different here is that the latter command has "-g" whereas the former doesn't.

In other words, the new implementation still doesn't make -g and -gsplit-dwarf fully orthogonal; instead of -gsplit-dwarf always turning on debug fission, it *only* turns it on when there is also a -g or -g<level> option - at least in distributed ThinLTO code generation.

It seems that orthogonality was the motivation for this change, and that -gsplit-dwarf without -g is supposed to work in the linker (per MaskRay's comment about -g being ignored by the linker).

Can we make it so that -gsplit-dwarf turns on debug fission and we then use that even if no -g options were passed?

Ah, sorry - I assumed you were using implicit thinlto rather than explicit thinlto.

Yes, I think you've demonstrated a bug here - hopefully @MaskRay can look into/fix this so that only -gN is required when the IR is generated and only -gsplit-dwarf is required when real object files are generated - without needing both files even when only one of those operations (IR generation or object generation) is being performed.

For Chrome on Chrome OS, this is https://crbug.com/1158215

Here, we saw our links fail with "output file too large". Investigation revealed that debug info was being included in the binary, instead of in .dwo files as expected. That turned out to be caused by this change.

It seems that there are cases where -gsplit-dwarf is passed, but it does not take effect because there isn't also a -g (or -g<level>) option:

clang -c -fthinlto-index=foo.o.thinlto.bc -gsplit-dwarf foo.o -o foo.bin.o

Shows no .dwo file and debug info in foo.bin.o, whereas

clang -c -fthinlto-index=foo.o.thinlto.bc -gsplit-dwarf -g foo.o -o foo.bin.o

shows a .dwo file which is referenced from foo.bin.o

The only thing that is different here is that the latter command has "-g" whereas the former doesn't.

In other words, the new implementation still doesn't make -g and -gsplit-dwarf fully orthogonal; instead of -gsplit-dwarf always turning on debug fission, it *only* turns it on when there is also a -g or -g<level> option - at least in distributed ThinLTO code generation.

It seems that orthogonality was the motivation for this change, and that -gsplit-dwarf without -g is supposed to work in the linker (per MaskRay's comment about -g being ignored by the linker).

Can we make it so that -gsplit-dwarf turns on debug fission and we then use that even if no -g options were passed?

I am still confused what the problem is. I added some comments to the following distributed ThinLTO example.
Both -gsplit-dwarf and -g are needed for the ThinLTO backend compile step (-fthinlto-index=):

cat > ./a.c <<eof
int foo();
int main() { return foo();}
eof
echo 'int foo() {return 42; }' > ./b.c
cat > ./Makefile <<eof
clang := /tmp/RelA/bin/clang
all: final_link

compile a.o b.o a.indexing.o b.indexing.o: a.c b.c
# -gsplit-dwarf in the initial compile is a no-op.
        $(clang) -gsplit-dwarf -g -O2 -c -flto=thin -fthin-link-bitcode=a.indexing.o a.c
        $(clang) -gsplit-dwarf -g -O2 -c -flto=thin -fthin-link-bitcode=b.indexing.o b.c

thin_link lto/a.o.thinlto.bc lto/b.o.thinlto.bc a.rsp: a.indexing.o b.indexing.o
        $(clang) -fuse-ld=lld -Wl,--thinlto-index-only=a.rsp -Wl,--thinlto-prefix-replace=';lto/' -Wl,--thinlto-object-suffix-replace='.indexing.o;.o' a.indexing.o b.indexing.o

thinlto_backend lto/a.o lto/b.o: a.o b.o lto/a.o.thinlto.bc lto/b.o.thinlto.bc
# This does not produce lto/a.dwo
# Both -gsplit-dwarf and -g are needed to get lto/a.dwo
        $(clang) -g -O2 -c -fthinlto-index=lto/a.o.thinlto.bc a.o -o lto/a.o
        $(clang) -g -O2 -c -fthinlto-index=lto/b.o.thinlto.bc b.o -o lto/b.o

final_link exe: lto/a.o lto/b.o a.rsp
        $(clang) -fuse-ld=lld @a.rsp -o exe

clean:
        $(RM) -f *.o lto/*.o lto/*.bc lto/*.dwo
eof
mkdir -p ./lto
MaskRay added a comment.EditedJan 13 2021, 6:41 PM

From @dblaikie's email reply:

^ this I would consider to be a bug. -g should be needed to generate debug info into the IR, and -gsplit-dwarf should be needed to choose how debug info already in the IR should be placed into object files or dwo files.

Usually (non-LTO) these two steps are in the same compile, so -g and -gsplit-dwarf go together. But for LTO cases, I think it's suitable for -g to be needed for IR generation, and -gsplit-dwarf to be needed for object code generation, without needing -g as well in that latter step (the -g was already specified at IR generation time, and that shoul be enough).

We can see that behavior I'm describing as desirable in implicit ThinLTO I showed earlier in the thread.

I suggest this should be the desired behavior, because -g generally has no effect on IR->object code generation, so it seems strange/unnecessary to me to gate -gsplit-dwarf behind -g in that case. The -g was already specified and used during IR generation (& the -gN level, -gmlt behavior-etc has all been handled in IR generation (or embedded in the IR for use later) - so to me, specifying -g again during object code generation would be/is strange, because any special values would be ignored (-g1/g2/g3/etc would not be respected - the N value would only be respected when passed to the IR generation step))

Got it. -fthinlto-index= (thinlto backend compile) operates on bitcode files and does not IR generation. It isn't suitable for -g to be needed to use.
-gsplit-dwarf is both an IR generation option and an object file generation option. It is needed for thinlto backend compiles.

In Bazel, I think a typical configuration's LTO Backend Compile includes almost every option in the initial compile, so I did not notice the issue. I am working on a patch.

From @dblaikie's email reply:

^ this I would consider to be a bug. -g should be needed to generate debug info into the IR, and -gsplit-dwarf should be needed to choose how debug info already in the IR should be placed into object files or dwo files.

Usually (non-LTO) these two steps are in the same compile, so -g and -gsplit-dwarf go together. But for LTO cases, I think it's suitable for -g to be needed for IR generation, and -gsplit-dwarf to be needed for object code generation, without needing -g as well in that latter step (the -g was already specified at IR generation time, and that shoul be enough).

We can see that behavior I'm describing as desirable in implicit ThinLTO I showed earlier in the thread.

I suggest this should be the desired behavior, because -g generally has no effect on IR->object code generation, so it seems strange/unnecessary to me to gate -gsplit-dwarf behind -g in that case. The -g was already specified and used during IR generation (& the -gN level, -gmlt behavior-etc has all been handled in IR generation (or embedded in the IR for use later) - so to me, specifying -g again during object code generation would be/is strange, because any special values would be ignored (-g1/g2/g3/etc would not be respected - the N value would only be respected when passed to the IR generation step))

Got it. -fthinlto-index= (thinlto backend compile) operates on bitcode files and does not IR generation. It isn't suitable for -g to be needed to use.
-gsplit-dwarf is both an IR generation option and an object file generation option. It is needed for thinlto backend compiles.

(sounds like we're mostly on the same page now - minor note though: I don't think -gsplit-dwarf is an IR generation option. I don't think it has any effect on IR generation - I believe the IR is exactly the same whether it ends up being used with split or unsplit DWARF generation)

In Bazel, I think a typical configuration's LTO Backend Compile includes almost every option in the initial compile, so I did not notice the issue. I am working on a patch.

Thanks!