This is an archive of the discontinued LLVM Phabricator instance.

[LLD] [COFF] Treat .xdata/.pdata$<sym> as implicitly associative to <sym> for MinGW
ClosedPublic

Authored by mstorsjo on Jul 23 2018, 3:00 PM.

Details

Summary

MinGW configurations don't use associative comdats, as GNU ld doesn't support that. Instead they produce normal comdats named .text$sym, .xdata$sym and .pdata$sym.

GNU ld doesn't discard any comdats starting with .xdata or .pdata, even if --gc-sections is used (while it does discard other unreferenced comdats), regardless of what symbol name is used after the $ separator.

For LLD, treat any such comdat as implicitly associative to the base symbol. This requires maintaining a map from symbol name to section number, but that is only maintained when the MinGW flag has been enabled.

Diff Detail

Repository
rL LLVM

Event Timeline

mstorsjo created this revision.Jul 23 2018, 3:00 PM
mstorsjo retitled this revision from [COFF] Treat .xdata/.pdata$<sym> as implicitly associative to <sym> for MinGW to [LLD] [COFF] Treat .xdata/.pdata$<sym> as implicitly associative to <sym> for MinGW.
rnk added inline comments.Jul 23 2018, 4:25 PM
COFF/InputFiles.cpp
263 ↗(On Diff #156889)

Is there some way we can hash fewer symbols? It's unfortunate that we need a string hash table at all to parse object files. I wonder if there's a way we can defer this work, since for C++, a lot of comdat things end up getting discarded.

If we do need a string hash table, it should probably be a DenseMap<StringRef, uint32_t>, since this is performance critical.

Oh, here's an idea: maybe createDefined should return Prevailing somehow. Only add the symbol to the table if it's a prevailing comdat symbol in an executable section.

test/COFF/associative-comdat-mingw.s
40–43 ↗(On Diff #156889)

I wonder if we could keep using associative comdats for .xdata and .pdata, but use the GCC-style section names. I wonder if ld.bfd would do the right thing with them. That would allow us to avoid the string hash table and save us a fair amount of link speed.

mstorsjo added inline comments.Jul 24 2018, 1:06 AM
COFF/InputFiles.cpp
263 ↗(On Diff #156889)

Hmm, I'll try, unless the other suggestion turns out to be enough.

test/COFF/associative-comdat-mingw.s
40–43 ↗(On Diff #156889)

Good idea, I guess it probably would work, but I'll check.

That won't help with object files from GCC though, but GCC+LLD isn't a priority to anyone afaik, and I there's probably other bigger issues in that setup anyway.

mstorsjo added inline comments.Jul 24 2018, 2:00 PM
COFF/InputFiles.cpp
263 ↗(On Diff #156889)

I tried implementing this to reduce the amount of hashing, in case we still want to try to do this (for GCC object compatibility or if we don't want to do the other patch anyway).

test/COFF/associative-comdat-mingw.s
40–43 ↗(On Diff #156889)

I tested this, and this compromise does seem to work - I uploaded a patch for this version at https://reviews.llvm.org/D49757.

mstorsjo updated this revision to Diff 157123.Jul 24 2018, 2:01 PM

Using a DenseMap and limiting the hashed symbols to the absolute minimum.

ruiu added a comment.Jul 24 2018, 2:35 PM

initializeSymbols and createDefined were originally much simpler than they are now. They are becoming complicated gradually over time. Due to the length of these functions (they are a bit too long to my taste) and the lack of comments (for example, what does Prevailing means?), they are probably hard to read for newcomers. Can you simplify the code and add an explanation as a comment before adding more code there? I believe the code can be improved.

initializeSymbols and createDefined were originally much simpler than they are now. They are becoming complicated gradually over time. Due to the length of these functions (they are a bit too long to my taste) and the lack of comments (for example, what does Prevailing means?), they are probably hard to read for newcomers. Can you simplify the code and add an explanation as a comment before adding more code there? I believe the code can be improved.

I'm not sure if I'm up to trying to simplify the existing code (if possible) or add comments on how it works (I have no idea about most if it, I hadn't seen it until two days ago), and that sounds like a too loosely defined task, that easily ends up as impossible to properly complete, blocking this patch indefinitely.

But I could try to add more verbose comments to the new code I'm adding at least.

ruiu added a comment.Jul 24 2018, 2:47 PM

Yeah, I understand your situation. I wanted to share my concern with you (or anyone who add new code there) because even to me they are now hard to understand. Let me quickly take a look at this to see if we can actually improve it.

Yeah, I understand your situation. I wanted to share my concern with you (or anyone who add new code there) because even to me they are now hard to understand. Let me quickly take a look at this to see if we can actually improve it.

I understand your concern as well, for the evergrowing complexity, and especially when adding things that aren't quite according to any spec or so (i.e. hacks). Luckily this project at least has pretty decent tests, compared to many other codebases...

In any case; if we don't care too much for compatibility with object files built by GCC, and if @rnk thinks D49757 is fine, we could also drop this one.

ruiu added a comment.Jul 24 2018, 2:54 PM

Sorry, I didn't follow the discussion as to how much important it is to be compatible with gcc, but I guess that's important for you? I don't think that the new code you are adding is not complicated, so as long as we can keep the overall complexity of these functions low, adding new code should be fine.

Sorry, I didn't follow the discussion as to how much important it is to be compatible with gcc, but I guess that's important for you? I don't think that the new code you are adding is not complicated, so as long as we can keep the overall complexity of these functions low, adding new code should be fine.

No, compatibility with gcc produced object files isn't very important to me here.

There's currently three (more or less) commonly used mingw configurations:

  • gcc + ld.bfd (the normal default one)
  • clang + ld.bfd (@rnk fixed an issue for this case in llvm, which triggered all of this)
  • clang + lld (the one I mostly work on)

The fourth plausible combo would be gcc + lld, but I don't know of anybody using that, and I'm pretty sure there are other issues that one would run into in that case.

ruiu added a comment.Jul 24 2018, 3:14 PM

Ah, understood. Honestly I don't think there is a strong reason for people to choose lld over bfd when they chose gcc over clang.

rnk added a comment.Jul 26 2018, 1:12 PM
  • clang + ld.bfd (@rnk fixed an issue for this case in llvm, which triggered all of this)

As long as this works, D49757 is fine with me. I'm going to check.

rnk added a comment.Jul 26 2018, 1:30 PM
In D49700#1177256, @rnk wrote:
  • clang + ld.bfd (@rnk fixed an issue for this case in llvm, which triggered all of this)

As long as this works, D49757 is fine with me. I'm going to check.

Unfortunately, it doesn't work. ld.bfd will not discard the duplicate associated .pdata and .xdata sections. Here was my program:

$ cat t.h
void bar(int*);
inline int foo() {
  volatile int x = 42;
  ++x;
  try {
  bar((int*)&x);
  } catch(...) {}
  return x;
}

$ cat a.cpp
#include "t.h"
void baz();
void bar(int *) {}
int main() {
  foo();
  baz();
}

$ cat b.cpp
#include "t.h"
void baz() {
  foo();
}

$ clang -c a.cpp b.cpp --target=x86_64-windows-gnu && g++ a.o b.o -o t.exe

$ llvm-objdump -u -d t.exe | less

There's only one _Z3foov copy in .text, so that's good:

...
_Z3foov:
  402c60:       48 83 ec 48     subq    $72, %rsp
  402c64:       c7 44 24 44 2a 00 00 00         movl    $42, 68(%rsp)
  402c6c:       8b 44 24 44     movl    68(%rsp), %eax
  402c70:       83 c0 01        addl    $1, %eax
  402c73:       89 44 24 44     movl    %eax, 68(%rsp)
  402c77:       48 8d 4c 24 44  leaq    68(%rsp), %rcx
...

But there are two pdata / xdata sets for it in the unwind info:

...
Function Table:
  Start Address: 0x2c60
  End Address: 0x2cb3
  Unwind Info Address: 0x608c
    Version: 1
    Flags: 3 UNW_ExceptionHandler UNW_TerminateHandler
    Size of prolog: 4
    Number of Codes: 1
    No frame pointer used
    Unwind Codes:
      0x04: UOP_AllocSmall 72

Function Table:
  Start Address: 0x2c60
  End Address: 0x2cb3
  Unwind Info Address: 0x60b8
    Version: 1
    Flags: 3 UNW_ExceptionHandler UNW_TerminateHandler
    Size of prolog: 4
    Number of Codes: 1
    No frame pointer used
    Unwind Codes:
      0x04: UOP_AllocSmall 72

For any medium sized C++ application, that's going to add up.

In D49700#1177295, @rnk wrote:
In D49700#1177256, @rnk wrote:
  • clang + ld.bfd (@rnk fixed an issue for this case in llvm, which triggered all of this)

As long as this works, D49757 is fine with me. I'm going to check.

Unfortunately, it doesn't work. ld.bfd will not discard the duplicate associated .pdata and .xdata sections. Here was my program:

$ cat t.h
void bar(int*);
inline int foo() {
  volatile int x = 42;
  ++x;
  try {
  bar((int*)&x);
  } catch(...) {}
  return x;
}

$ cat a.cpp
#include "t.h"
void baz();
void bar(int *) {}
int main() {
  foo();
  baz();
}

$ cat b.cpp
#include "t.h"
void baz() {
  foo();
}

$ clang -c a.cpp b.cpp --target=x86_64-windows-gnu && g++ a.o b.o -o t.exe

$ llvm-objdump -u -d t.exe | less

There's only one _Z3foov copy in .text, so that's good:

...
_Z3foov:
  402c60:       48 83 ec 48     subq    $72, %rsp
  402c64:       c7 44 24 44 2a 00 00 00         movl    $42, 68(%rsp)
  402c6c:       8b 44 24 44     movl    68(%rsp), %eax
  402c70:       83 c0 01        addl    $1, %eax
  402c73:       89 44 24 44     movl    %eax, 68(%rsp)
  402c77:       48 8d 4c 24 44  leaq    68(%rsp), %rcx
...

But there are two pdata / xdata sets for it in the unwind info:

Oh, indeed - sorry for not testing it thoroughly enough myself to realize this, and kudos to you for being thorough.

I just went with your initial testcase from https://github.com/Alexpux/MINGW-packages/issues/1677 and tested that it didn't break...

It's surprising that this didn't work, but I'm not interested enough in the specifics to dive in the binutils source to figure out why it doesn't.

I guess that some form of this patch is what we need to go with then?

rnk added a comment.Jul 26 2018, 2:53 PM

Oh, indeed - sorry for not testing it thoroughly enough myself to realize this, and kudos to you for being thorough.

Thanks!

It's surprising that this didn't work, but I'm not interested enough in the specifics to dive in the binutils source to figure out why it doesn't.

I gave it a try, but I didn't succeed.

I guess that some form of this patch is what we need to go with then?

I think we might, because anyone using libstdc++ compiled with gcc on mingw is going to be linking gcc-produced object files. That fourth case that we wanted to rule out might be more important than we think.

In D49700#1177387, @rnk wrote:

It's surprising that this didn't work, but I'm not interested enough in the specifics to dive in the binutils source to figure out why it doesn't.

I gave it a try, but I didn't succeed.

That sounds like the usual case with binutils sources...

I guess that some form of this patch is what we need to go with then?

I think we might, because anyone using libstdc++ compiled with gcc on mingw is going to be linking gcc-produced object files. That fourth case that we wanted to rule out might be more important than we think.

Well, it's one of the cases where I'm sure nobody is doing it right now at least; there are a bunch of other incompatibilities that'd block it (lld can't handle gnu import libraries, D38513; ld.bfd provides extra symbols for __CTORS_*__, which forces the mingw crt startup files to be different depending on what linker you'll use; some aspects of dll autoimporting of data). All things that we might want to work on unifying at some point, but it's not all done yet. For a pure clang+llvm-dlltool+lld setup, things are pretty ok though.

That said, I'm fine with moving forward with this patch; it makes the gcc compat issue list one item shorter at least. Are there any other issues with it, or other suggestions? Rui maybe wanted more comments?

rnk added a comment.Jul 26 2018, 3:47 PM

That said, I'm fine with moving forward with this patch; it makes the gcc compat issue list one item shorter at least. Are there any other issues with it, or other suggestions? Rui maybe wanted more comments?

We should try to make it faster. We don't want to hash every symbol we ever add, only the prevailing comdat ones. This will probably save a large number of hash table insertions, because oftentimes most comdat function definitions will be redundant and get discarded.

rnk added a comment.Jul 26 2018, 3:56 PM
In D49700#1177490, @rnk wrote:

We should try to make it faster. We don't want to hash every symbol we ever add, only the prevailing comdat ones. This will probably save a large number of hash table insertions, because oftentimes most comdat function definitions will be redundant and get discarded.

Uh, I guess ignore that. I see your changes to do that now.

In D49700#1177509, @rnk wrote:
In D49700#1177490, @rnk wrote:

We should try to make it faster. We don't want to hash every symbol we ever add, only the prevailing comdat ones. This will probably save a large number of hash table insertions, because oftentimes most comdat function definitions will be redundant and get discarded.

Uh, I guess ignore that. I see your changes to do that now.

Yes, I tried implementing all those suggestions. Do you think the performance hit (limited to the mingw case) is tolerable now?

mstorsjo updated this revision to Diff 157641.Jul 26 2018, 9:42 PM

Added a bit more comments, rebased past some changes nearby.

In D49700#1177509, @rnk wrote:
In D49700#1177490, @rnk wrote:

We should try to make it faster. We don't want to hash every symbol we ever add, only the prevailing comdat ones. This will probably save a large number of hash table insertions, because oftentimes most comdat function definitions will be redundant and get discarded.

Uh, I guess ignore that. I see your changes to do that now.

Ping @rnk @ruiu - what are the next steps for this patch now?

@ruiu, could you please comment on if the current state of this is ok with you, or if you want still more comments somewhere - afaik @rnk didn't have much more to say, and I'd like to get this merged into 7.0 (because currently, there's a regression in the clang+lld combo for mingw targets with SEH).

rnk added a comment.Aug 6 2018, 1:34 PM

I've made some suggestions to improve readability. Hopefully I'm channeling @ruiu well enough to land this after implementing them.

COFF/InputFiles.cpp
258 ↗(On Diff #157641)

Let's name this PrevailingSectionMap or something like that. Basically, it's a map from prevailing comdat symbols to their section number.

271–272 ↗(On Diff #157641)

As future work, I think createDefined is doing too much work. Ideally, we should find a way to avoid the Optional<> return value.

274–285 ↗(On Diff #157641)

Can you pull this out into a method like recordPrevailingSymbolForMingw?

308–316 ↗(On Diff #157641)

And pull this out into maybeAssociateSEHForMingw?

392 ↗(On Diff #157641)

IMO it would be simpler to eliminate this and use PrevailingComdat in std::tie below. Maybe name the parameter Prevailing to match.

In D49700#1189936, @rnk wrote:

I've made some suggestions to improve readability. Hopefully I'm channeling @ruiu well enough to land this after implementing them.

Thanks, it sure looks a lot more palatable after these changes!

COFF/InputFiles.cpp
258 ↗(On Diff #157641)

Done

274–285 ↗(On Diff #157641)

Done

308–316 ↗(On Diff #157641)

Done

392 ↗(On Diff #157641)

Done

mstorsjo updated this revision to Diff 159385.Aug 6 2018, 1:51 PM

Applied refactorings suggested by @rnk

rnk accepted this revision.Aug 6 2018, 2:19 PM

Let's land it. I think if there are any objections we can resolve them later.

This revision is now accepted and ready to land.Aug 6 2018, 2:19 PM
This revision was automatically updated to reflect the committed changes.
mstorsjo added a subscriber: hans.Aug 7 2018, 12:25 AM

@hans, can you merge this one? This fixes a regression in using clang together with lld for SEH exception handling in mingw setups.

hans added a comment.Aug 7 2018, 12:47 AM

@hans, can you merge this one? This fixes a regression in using clang together with lld for SEH exception handling in mingw setups.

Merged in r339108. Thanks!