This is an archive of the discontinued LLVM Phabricator instance.

[COFF] Don't treat DWARF sections as GC roots
ClosedPublic

Authored by rnk on Mar 27 2020, 9:19 AM.

Details

Summary

DWARF sections are typically live and not COMDAT, so they would be
treated as GC roots. Enabling DWARF would essentially keep all code with
debug info alive, preventing any section GC.

Fixes PR45273

Diff Detail

Event Timeline

rnk created this revision.Mar 27 2020, 9:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 27 2020, 9:20 AM
Herald added a subscriber: aprantl. · View Herald Transcript
mstorsjo accepted this revision.Mar 27 2020, 10:20 AM

Looks good I think. I'm not very familiar with the GC parts of LLD, but the explanations sound sensible to me, and I believe this solution has been vetted by at least two others already (bug reporter and @rnk).

This revision is now accepted and ready to land.Mar 27 2020, 10:20 AM
MaskRay added a comment.EditedMar 27 2020, 10:53 AM

The code change looks good, but I've got some test suggestions. I have some thoughts if COFF/MarkLive.cpp is ever made more sophisticated. In ELF, gc is for sections which are part of the load image (SHF_ALLOC) and non-SHF_ALLOC sections are generally not discarded. I don't know what is similar to SHF_ALLOC in COFF, though.

lld/test/COFF/gc-dwarf.s
6

This can be --input-file=%t.map to avoid a cat.

9

-NEXT:

11

It will be better to prepend # MAP: Symbol to make clear it is the MAP column.

In not-so-rare circumstances the absolute path of %t.obj may contain main.
main{{$}} can prevent this pattern from matching a path component.

23

Optional: in ELF we tend to omit unneeded details like xorl %eax, %eax; retq.
But we will keep retq because an empty section and a non-empty section make a difference for GNU linkers.

rnk marked 4 inline comments as done.Mar 27 2020, 12:24 PM

The code change looks good, but I've got some test suggestions. I have some thoughts if COFF/MarkLive.cpp is ever made more sophisticated. In ELF, gc is for sections which are part of the load image (SHF_ALLOC) and non-SHF_ALLOC sections are generally not discarded. I don't know what is similar to SHF_ALLOC in COFF, though.

I guess the analogous flag would be IMAGE_SCN_MEM_DISCARDABLE. It is phrased in the opposite direction: sections lacking the flag are loaded, sections with the flag are not loaded into memory, and are generally sorted to the end of the image.

Strangely, LLVM MC suppresses the D assembler flag when printing textual assembly, and infers it from the section name:
https://github.com/llvm/llvm-project/blob/a55daa146166353236aa528546397226bee9363b/llvm/lib/MC/MCParser/COFFAsmParser.cpp#L258-L260

I realized, though, that there is one major difference in behavior. isDWARF() returns true for .eh_frame sections, but they are not discardable. They must be loaded into memory for unwinding to work. So, in this situation, GC will not work:

void foo();
int main() { foo(); } 
void unused1() { foo(); } 
void unused2() { foo(); }

Compiled like so:
$ clang -O1 --target=x86_64-windows-gnu -c t.c -fdwarf-exceptions -ffunction-sections

The unwind data is not separated into nice little comdat .pdata/.xdata sections, it is a single monolithic .eh_frame section:

$ llvm-objdump -h t.o

t.o:    file format coff-x86-64

Sections:
Idx Name          Size     VMA              Type
  0 .text         00000000 0000000000000000 TEXT
  1 .data         00000000 0000000000000000 DATA
  2 .bss          00000000 0000000000000000 BSS
  3 .text$main    0000001a 0000000000000000 TEXT
  4 .text$unused1 0000000e 0000000000000000 TEXT
  5 .text$unused2 0000000e 0000000000000000 TEXT
  6 .eh_frame     00000080 0000000000000000 DATA
  7 .llvm_addrsig 00000000 0000000000000000

I checked, and it has relocations against all the functions. :( Blech. I think we have to do the isDWARF check.

lld/test/COFF/gc-dwarf.s
11

Yeah, unfortunately there is no other leading text on the line to match against to try to prevent that, so I left it as was. It's a bit redundant to check which symbols were included anyway. But I can make it tighter.

23

Sure, I based this assembly on normal compiler output. I prefer it when the assembly appears to do something semi-plausible.

rnk updated this revision to Diff 253193.Mar 27 2020, 12:29 PM
rnk marked an inline comment as done.
  • Test .eh_frame refs
rnk added a comment.Mar 27 2020, 12:34 PM

The code change looks good, but I've got some test suggestions. I have some thoughts if COFF/MarkLive.cpp is ever made more sophisticated.

Also, I hope we don't have to make it sophisticated. Simplicity is a good thing. :)

Anyway, I didn't make any code changes, so I will land this with the updated .eh_frame test. Thanks!

MaskRay accepted this revision.Mar 27 2020, 12:45 PM

LGTM.

This revision was automatically updated to reflect the committed changes.