This is an archive of the discontinued LLVM Phabricator instance.

[LLD] Remove global state in lld/COFF
ClosedPublic

Authored by akhuang on Sep 24 2021, 3:47 PM.

Details

Summary

Remove globals from the lldCOFF library, by moving globals into a context class.
This patch mostly moves the config object into COFFLinkerContext.

See https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html for
context about removing globals from LLD.

Diff Detail

Event Timeline

akhuang created this revision.Sep 24 2021, 3:47 PM
akhuang requested review of this revision.Sep 24 2021, 3:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 24 2021, 3:47 PM
aganea added inline comments.Sep 27 2021, 10:37 AM
lld/COFF/DLL.cpp
107

I'm seeing that a lot of the *Chunk classes have this extra context now. Have you tried to measure peak RAM usage when linking large executables? Also in scenarios when /Gy is used, for example MinGW seems to use that.

My point being, if we can pass the context by argument that would be better maybe instead of storing it into these tiny objects. But I realize that isn't always practical, thus the global access proposed in D108850 - COFFLinkerContext would simply inherit from CommonLinkingContext and then could be accessed from anywhere by a call to context().

akhuang added inline comments.Sep 28 2021, 3:05 PM
lld/COFF/DLL.cpp
107

Yeah, it feels non-ideal to store the reference in a bunch of Chunks, I'll try to see whether it's feasible to pass it by argument.

Memory usage doesn't seem to increase too much, though -- peak memory usage for linking chrome (measured with https://github.com/sgraham/tim/blob/master/tim.exe) goes up by 4mb (3972mb -> 3976mb). I think chrome builds with /Gy.

(If ELF ever needs a context, I may just use Context instead of ELFLinkerContext)

Thanks for doing this @akhuang! Looks good generally, a few comments:

lld/COFF/COFFLinkerContext.h
61

I think this field isn't needed anymore, since after your patch, DefinedAbsolute and applySecIdx() have direct access to ctx.outputSections.size(), which this field duplicates. The only thing that is needed is probably an assert() (see other comment in Chunks.cpp)

lld/COFF/CallGraphSort.cpp
76–77

Could be ctx see other comment.

lld/COFF/Chunks.cpp
97–103

I would pass a unsigned numOutputSections here and assert within the function that the value fits on 16-bits.

lld/COFF/DLL.cpp
107

Sounds good.

lld/COFF/InputFiles.cpp
1023

Shouldn't we keep this in ctx? Otherwise we'll be creating over and over these fake sections for each bitcode file?

lld/COFF/PDB.cpp
1721

Could also be const uint32_t secrelReloc...

lld/COFF/SymbolTable.h
50
lld/COFF/Symbols.cpp
108

Just a tiny detail, it is a bit confusing that the diff show a totally unrelated addition here, ie.before we had numOutputSection and now we have ::getRVA(). Just my preference, but I usually insert new additions in places where nothing was changed, as compared with the previous revision.

lld/COFF/Symbols.h
24–25

Please remove commented out code.

lld/COFF/Writer.cpp
721

Please capture only &ctx

1946–1947

Please remove, see the other comment.

ormris removed a subscriber: ormris.Jan 18 2022, 9:49 AM
aganea added a subscriber: rnk.Jan 20 2022, 11:48 AM

Hello @akhuang! Are you able to work on this before the 14.0 branch out? If no, can I take it over and land it? Barring any comments - @MaskRay would you have a bit of time to take a look please ?

@aganea Ah, right, I can take a look at the comments today

akhuang updated this revision to Diff 401820.Jan 20 2022, 5:21 PM
akhuang marked 6 inline comments as done.

Address comments

aganea accepted this revision.EditedJan 21 2022, 5:17 AM

Thanks again Amy! LGTM with a note:

lld/COFF/Driver.cpp
1465

As an option, in this kind of long function, one could probably also create an alias at the top, if you want to avoid those many changes:

Config &config = ctx.config;
This revision is now accepted and ready to land.Jan 21 2022, 5:17 AM
akhuang updated this revision to Diff 402070.Jan 21 2022, 12:27 PM

cleanup and fix stuff that got messed during rebase

akhuang updated this revision to Diff 402147.Jan 21 2022, 5:46 PM

further cleanup

aganea added inline comments.Jan 21 2022, 6:52 PM
lld/COFF/Driver.cpp
1406

Sorry I actually meant Configuration *config = &ctx.config; just so that you don’t have all these changes below when you do a git blame.

akhuang added inline comments.Jan 21 2022, 7:16 PM
lld/COFF/Driver.cpp
1406

haha right, makes sense

akhuang updated this revision to Diff 402159.Jan 21 2022, 7:16 PM

config. to config->

aganea added inline comments.Jan 24 2022, 7:37 AM
lld/COFF/Driver.cpp
1406

Up to your discretion, I think you can apply the same "recipe" in a few other places (see other comments).

lld/COFF/DriverUtils.cpp
99

And here?

lld/COFF/Writer.cpp
1022–1023

And here?

1433

And here.

1777

Maybe here too.

1962

You could probably do the same thing here: Configuration *config = ctx.config;

akhuang updated this revision to Diff 403324.Jan 26 2022, 10:10 AM
akhuang marked an inline comment as done.

some more ctx.config -> config->

aganea accepted this revision.EditedJan 26 2022, 10:26 AM

Still looks good. Perhaps if you can, before commit to ensure there are no regressions, check that the contents of Chromium's out/ dir is the same before and after this patch? (or any other real-world example)

(If ELF is ever able to do this, we may use a shorter name than const COFFLinkerContext &ctx ...)

If other ports want to follow suit, I think storing const COFFLinkerContext &ctx; in classes like *File are fine since the instances are not too many.

Placing the member in *Section is definitely too wasteful. For classes like ImportThunkChunk, I think it'd be better to avoid storing the member as well.

aganea added a comment.EditedJan 26 2022, 11:17 AM

If other ports want to follow suit, I think storing const COFFLinkerContext &ctx; in classes like *File are fine since the instances are not too many.

Placing the member in *Section is definitely too wasteful. For classes like ImportThunkChunk, I think it'd be better to avoid storing the member as well.

@akhuang mentioned that this patch only increases by 4 MB the peak mem when linking a Chromium .dll with /Gy, see above https://reviews.llvm.org/D110450#3028764
Since the COFF driver would still not be thread-safe after this patch, if you prefer we can also use lld::context<COFFLinkerContext>() (instead of storing ctx in *ImportThunkChunk). And then as a second step, introduce thread_local contexts, for cases like this where passing ctx through the callstack is infeasible/requires too many changes. See https://reviews.llvm.org/D108850?id=370678, lld/Common/Globals.cpp, L44.

MaskRay added a comment.EditedJan 26 2022, 11:23 AM

If other ports want to follow suit, I think storing const COFFLinkerContext &ctx; in classes like *File are fine since the instances are not too many.

Placing the member in *Section is definitely too wasteful. For classes like ImportThunkChunk, I think it'd be better to avoid storing the member as well.

@akhuang mentioned that this patch only increases by 4 MB the peak mem when linking a Chromium .dll with /Gy, see above https://reviews.llvm.org/D110450#3028764
Since the COFF driver would still not be thread-safe after this patch, if you prefer we can also use lld::context<COFFLinkerContext>() (instead of storing ctx in *ImportThunkChunk). And then as a second step, introduce thread_local contexts, for cases like this where passing ctx through the callstack is infeasible/requires too many changes. See https://reviews.llvm.org/D108850?id=370678, lld/Common/Globals.cpp, L44.

If it is not trivial to avoid the context variable in classes like COFFLinkerContext, I think it is fine.
4MB increase seems acceptable.

thread_local performance is fine on aarch64 but has a higher cost (hidden visibility doesn't help) on other architectures (especially x86-64 which many people are concerned) because llvm-project compiles lld files with -fPIC which uses traditional GD/LD TLS models which are slow:(

(This remind me that at some point I should implement TLSDESC for x86-64 (now that ld.lld support is fully correct after D116900)).

thread_local performance is fine on aarch64 but has a higher cost (hidden visibility doesn't help) on other architectures (especially x86-64 which many people are concerned) because llvm-project compiles lld files with -fPIC which uses traditional GD/LD TLS models which are slow:(

It isn't particularly cheap on the Windows ABI either: https://godbolt.org/z/frEe87zcq - however I/O dominates in LLD right now, so using thread_local for ctx doesn't really change the runtime performance of LLD, at least on Windows/x64: https://reviews.llvm.org/D108850?id=370678#2975472

Hello @akhuang! Do you think I can land this on your behalf? I've already rebased & tested locally.

Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2022, 3:19 PM
Herald added a subscriber: StephenFan. · View Herald Transcript

Hello @akhuang! Do you think I can land this on your behalf? I've already rebased & tested locally.

Yep, go ahead! Sorry I've been fairly unresponsive on this patch.

This revision was landed with ongoing or failed builds.Jan 8 2023, 3:57 PM
Closed by commit rG7370ff624d21: [LLD] Remove global state in lld/COFF (authored by akhuang, committed by aganea). · Explain Why
This revision was automatically updated to reflect the committed changes.

This patch broke one case for me - it's reproducible by improving one of the existing testcases like this:

diff --git a/lld/test/COFF/export-all-lto.ll b/lld/test/COFF/export-all-lto.ll
index 13620e11d9b9..8d54b981a104 100644
--- a/lld/test/COFF/export-all-lto.ll
+++ b/lld/test/COFF/export-all-lto.ll
@@ -2,9 +2,10 @@
 
 ; RUN: llvm-as %s -o %t.bc
 
-; RUN: lld-link -lldmingw -dll -out:%t.dll %t.bc -noentry -output-def:%t.def
+; RUN: lld-link -lldmingw -dll -out:%t.dll %t.bc -noentry -output-def:%t.def -implib:%t.lib
 ; RUN: llvm-readobj --coff-exports %t.dll | grep Name: | FileCheck %s
 ; RUN: cat %t.def | FileCheck --check-prefix=IMPLIB %s
+; RUN: llvm-nm %t.lib | FileCheck --check-prefix=IMPLIB-SYMS %s
 
 ; CHECK: Name: MyComdatFunc
 ; CHECK: Name: MyExtData
@@ -14,6 +15,12 @@
 ; IMPLIB: MyExtData @2 DATA
 ; IMPLIB: MyLibFunc @3{{$}}
 
+; IMPLIB-SYMS: 00000000 T MyComdatFunc
+; IMPLIB-SYMS: 00000000 T __imp_MyComdatFunc
+; IMPLIB-SYMS: 00000000 D __imp_MyExtData
+; IMPLIB-SYMS: 00000000 T MyLibFunc
+; IMPLIB-SYMS: 00000000 T __imp_MyLibFunc
+
 target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-w64-windows-gnu"

@aganea How do you want to go about fixing this; should I revert the patch and land this test improvement separately, so that this can be relanded with the test in place?

aganea added a comment.Jan 9 2023, 4:49 AM

@mstorsjo Yes please revert. Thanks in advance!

@mstorsjo Yes please revert. Thanks in advance!

Ok, done now!

aganea added inline comments.Jan 9 2023, 8:42 PM
lld/COFF/COFFLinkerContext.cpp
41

@mstorsjo These two structs were keeping a dangling pointer, as the FakeSections above went out of scope.

thakis added a subscriber: thakis.Jan 10 2023, 6:03 AM

This broke tests on Windows: http://45.33.8.238/win/73025/step_10.txt

Please take a look and revert for now if it takes a while to fix.

andrewng added a subscriber: andrewng.EditedJan 10 2023, 6:25 AM

This broke tests on Windows: http://45.33.8.238/win/73025/step_10.txt

Please take a look and revert for now if it takes a while to fix.

I've already committed a fix: https://reviews.llvm.org/rG85a2f29fd485d29e6a93de6f0c2bfdc009d06a56.

This broke tests on Windows: http://45.33.8.238/win/73025/step_10.txt

Please take a look and revert for now if it takes a while to fix.

Looking now!

This broke tests on Windows: http://45.33.8.238/win/73025/step_10.txt

Please take a look and revert for now if it takes a while to fix.

I've already committed a fix: https://reviews.llvm.org/rG85a2f29fd485d29e6a93de6f0c2bfdc009d06a56.

Thank you!