This is an archive of the discontinued LLVM Phabricator instance.

Differential D54802

[LLD][COFF] Generate import modules in PDB
ClosedPublic

Authored by aganea on Nov 21 2018, 8:05 AM.

Tokens

"Like" token, awarded by santagada.

Details

Reviewers

zturner
pcc
rnk
mstorsjo
compnerd

Commits

rG09cca5b243d0: [LLD][COFF] Generate import modules & COFF groups in PDB
rLLD357308: [LLD][COFF] Generate import modules & COFF groups in PDB
rL357308: [LLD][COFF] Generate import modules & COFF groups in PDB

Summary

This patch makes LLD generate a PDB with import modules and COFF groups.
This is suitable for hot patching tools like Recode and Live++, which need to extract this information from the PDB. We have been using this patch on several productions for a few months now.

We generate one debug symbol stream for each imported dll:

                          Streams
============================================================
[...]
  Stream 10 ( 256 bytes): [Module "Import:pdb-publics-import.test.tmp1.dll"]
             Blocks: [9]

We generate two modules for each imported DLL.
(Currently the first module is empty because llvm-lib does not generate the additional symbol stream, and some descriptors are not imported; this remains a TODO.)

                        Module Stats
============================================================
[...]
Mod 0001 | `pdb-publics-import.test.tmp1.dll`:
  Mod 1 (debug info not present): [pdb-publics-import.test.tmp1.dll]
Mod 0002 | `Import:pdb-publics-import.test.tmp1.dll`:
  Stream 10, 256 bytes

    Symbols
                                       Total:       8 entries (     248 bytes)
    --------------------------------------------------------------------------
                                   S_THUNK32:       2 entries (      72 bytes)
                                       S_END:       2 entries (       8 bytes)
                                   S_OBJNAME:       2 entries (      88 bytes)
                                  S_COMPILE3:       2 entries (      80 bytes)

    Chunks
                                       Total:       0 entries (       0 bytes)
    --------------------------------------------------------------------------

We ensure the modules' first contrib section is properly set (65535 values):

                          Modules
============================================================
[...]
Mod 0001 | `pdb-publics-import.test.tmp1.dll`:
SC[???]  | mod = 65535, 65535:0000, size = -1, data crc = 0, reloc crc = 0
        none
Obj: `F:\svn\build2\tools\lld\test\COFF\Output\pdb-publics-import.test.tmp1.lib`:
debug stream: 65535, # files: 0, has ec info: false
pdb file ni: 0 ``, src file ni: 0 ``
Mod 0002 | `Import:pdb-publics-import.test.tmp1.dll`:
SC[.text]  | mod = 2, 0001:0032, size = 6, data crc = 0, reloc crc = 0
        IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ
Obj: `F:\svn\build2\tools\lld\test\COFF\Output\pdb-publics-import.test.tmp1.lib`:
debug stream: 10, # files: 0, has ec info: false
pdb file ni: 0 ``, src file ni: 0 ``

We generate proper symbol streams for each import module, in the same way as link.exe:

                          Symbols
============================================================
[...]
Mod 0002 | `Import:pdb-publics-import.test.tmp1.dll`:
     4 | S_OBJNAME [size = 44] sig=0, `pdb-publics-import.test.tmp1.dll`
    48 | S_COMPILE3 [size = 40]
         machine = intel x86-x64, Ver = LLVM Linker, language = link
         frontend = 0.0.0.0, backend = 14.10.25019.0
         flags = none
    88 | S_THUNK32 [size = 36] `exportfn1`
         parent = 0, end = 124, next = 0
         kind = thunk, size = 6, addr = 0001:0016
   124 | S_END [size = 4]
   128 | S_OBJNAME [size = 44] sig=0, `pdb-publics-import.test.tmp1.dll`
   172 | S_COMPILE3 [size = 40]
         machine = intel x86-x64, Ver = LLVM Linker, language = link
         frontend = 0.0.0.0, backend = 14.10.25019.0
         flags = none
   212 | S_THUNK32 [size = 36] `exportfn2`
         parent = 0, end = 248, next = 0
         kind = thunk, size = 6, addr = 0001:0032
   248 | S_END [size = 4]
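
The offsets in the dump above follow directly from the record sizes: the first record sits at offset 4, each following record starts where the previous one ends, and the `end` field of each S_THUNK32 holds the offset of its matching S_END. A minimal sketch of that bookkeeping, assuming hypothetical `Rec`, `Placed`, `layout` and `importThunkRecords` names (none of them come from the patch) and sizes copied from the dump:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record description: kind name and size in bytes, as printed
// in the dump above (sizes are copied from the listing, not computed here).
struct Rec {
  std::string Kind;
  uint32_t Size;
};

struct Placed {
  std::string Kind;
  uint32_t Offset; // offset of the record within the module symbol stream
  uint32_t End;    // for S_THUNK32: offset of the matching S_END, else 0
};

// Assign consecutive offsets starting at FirstOffset, and back-patch the End
// field of the innermost open S_THUNK32 scope when an S_END is reached.
std::vector<Placed> layout(const std::vector<Rec> &Recs, uint32_t FirstOffset) {
  std::vector<Placed> Out;
  std::vector<size_t> OpenScopes; // indices into Out of unclosed thunks
  uint32_t Offset = FirstOffset;
  for (const Rec &R : Recs) {
    Out.push_back({R.Kind, Offset, 0});
    if (R.Kind == "S_THUNK32") {
      OpenScopes.push_back(Out.size() - 1);
    } else if (R.Kind == "S_END" && !OpenScopes.empty()) {
      Out[OpenScopes.back()].End = Offset;
      OpenScopes.pop_back();
    }
    Offset += R.Size;
  }
  return Out;
}

// The four records shown per imported function in the dump above.
std::vector<Rec> importThunkRecords() {
  return {{"S_OBJNAME", 44}, {"S_COMPILE3", 40}, {"S_THUNK32", 36}, {"S_END", 4}};
}
```

Laying out two such groups from offset 4 reproduces the offsets 4, 48, 88, 124, 128, 172, 212 and 248 seen above, with thunk `end` values of 124 and 248.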

We also generate COFF groups in the * Linker * module, derived from PartialSections (that is, unmerged sections).
(There is a subtlety here: the .idata$XXX COFF groups have the "write" flag set, whereas the corresponding .idata section does not; this matches what link.exe does.)

Mod 0003 | `* Linker *`:
[...]
   588 | S_SECTION [size = 28] `.text`
         length = 38, alignment = 12, rva = 4096, section # = 1
         characteristics =
           code
           execute permissions
           read permissions
   616 | S_COFFGROUP [size = 24] `.text`
         length = 8, addr = 0001:0000
         characteristics =
           code
           execute permissions
           read permissions
   640 | S_SECTION [size = 28] `.rdata`
         length = 61, alignment = 12, rva = 8192, section # = 2
         characteristics =
           initialized data
           read permissions
   668 | S_SECTION [size = 28] `.idata`
         length = 145, alignment = 12, rva = 12288, section # = 3
         characteristics =
           initialized data
           read permissions
   696 | S_COFFGROUP [size = 28] `.idata$2`
         length = 40, addr = 0003:0000
         characteristics =
           initialized data
           read permissions
           write permissions
   724 | S_COFFGROUP [size = 28] `.idata$4`
         length = 24, addr = 0003:0040
         characteristics =
           initialized data
           read permissions
           write permissions
   752 | S_COFFGROUP [size = 28] `.idata$5`
         length = 24, addr = 0003:0064
         characteristics =
           initialized data
           read permissions
           write permissions
   780 | S_COFFGROUP [size = 28] `.idata$6`
         length = 24, addr = 0003:0088
         characteristics =
           initialized data
           read permissions
           write permissions
   808 | S_COFFGROUP [size = 28] `.idata$7`
         length = 33, addr = 0003:0112
         characteristics =
           initialized data
           read permissions
           write permissions
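
To make the flag difference concrete, here is a small sketch of how the characteristics reported for a COFF group could be derived from those of its section. The helper `coffGroupCharacteristics` and the constant names are hypothetical and are not taken from the patch; only the flag values (those of IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ and IMAGE_SCN_MEM_WRITE) come from the PE/COFF specification.

```cpp
#include <cstdint>
#include <string_view>

// Section characteristic flags, with values from the PE/COFF specification.
constexpr uint32_t ScnInitializedData = 0x00000040; // IMAGE_SCN_CNT_INITIALIZED_DATA
constexpr uint32_t ScnMemRead = 0x40000000;         // IMAGE_SCN_MEM_READ
constexpr uint32_t ScnMemWrite = 0x80000000;        // IMAGE_SCN_MEM_WRITE

// Hypothetical helper, not the patch's code: groups named ".idata$N" get the
// write flag in addition to their section's flags, matching the dump above.
// All other groups keep the section's characteristics unchanged.
constexpr uint32_t coffGroupCharacteristics(std::string_view GroupName,
                                            uint32_t SectionChars) {
  constexpr std::string_view Prefix = ".idata$";
  bool IsIdataGroup = GroupName.substr(0, Prefix.size()) == Prefix;
  return IsIdataGroup ? (SectionChars | ScnMemWrite) : SectionChars;
}
```

For example, `.idata$5` inside a read-only initialized-data section would be reported with read and write permissions, while `.text` would be left as is.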

Diff Detail

Repository: rL LLVM

Event Timeline

aganea created this revision. Nov 21 2018, 8:05 AM

Herald added subscribers: javed.absar, aprantl. Nov 21 2018, 8:05 AM

Please note this work was started from Stefan Reinalter's D49230 and D49231. We agreed offline with Stefan that I will finish those patches.

aganea added a subscriber: stefan_reinalter. Nov 21 2018, 8:38 AM

The file below shows the remaining differences between link and lld-link (with this patch). I've modified lld\test\COFF\Output\pdb-publics-import.test to use link.exe instead of lld-link.exe.

pdb-publics-import-compare.zip (16 KB)

Do you happen to know what these are used for? I'm ok with supporting them just for the sake of compatibility, but it's always nice to know "if these aren't present, X won't work"

mstorsjo added a subscriber: llvm-commits. Nov 21 2018, 9:42 AM

In D54802#1305448, @zturner wrote:

Do you happen to know what these are used for? I'm ok with supporting them just for the sake of compatibility, but it's always nice to know "if these aren't present, X won't work"

Some edit-and-continue tools rely on this: Recode and Live++. These tools need to extract and reproduce the link with the EXE/PDB alone. They need this information to recompile TUs that change and to hot-patch the process in memory.

This all boils down to iteration speed. The tools I've mentioned above allow for really quick iterations when you need to change small bits of code, for example when you're iterating over some gameplay or UI code.
Without these tools, for testing any small change, you need to: compile the TU, link the whole EXE, start it, load the game map, and do a walkthrough until you're at the right position in the game world. This can take more than 10 minutes.
With these edit-and-continue tools, you don't shut down the game. The player remains at the same position in the game. You simply change the code, the tool applies the changes, and that's it. You can go for hundreds of iterations like this. Live++ even allows the game to save/reload data structures, which means that you can change pretty much anything in the code, including class definitions.

lantictac added a subscriber: lantictac. Nov 21 2018, 3:49 PM

dblaikie added a subscriber: dblaikie. Nov 26 2018, 10:49 AM

Cool. :)

lld/trunk/COFF/Driver.cpp
1294	Do you want to submit this separately? It seems to cause wide ranging mechanical changes to the tests. Feel free to submit it without review.
lld/trunk/COFF/Writer.cpp
844–846	This seems like it's really just replicating `Map` into some longer lived data structure, and then creating a backreference from the output section to the concatenated input section. There are three other helper functions that take a reference to `Map`, and that type is ridiculously long. I think we could clean this code up significantly if we planned some NFC changes first to make `Map` a member of `Writer` and then give it a real key and value type. WDYT? I'm imagining that the type would become: `std::map<NameAndChars, InputSection>` That would have the side effect of cleaning up a lot of unreadable `Pair.second` and `Pair.first.first` accesses in this code. Honestly I started looking at this at first just because we copy the `Chunks` vector again, and I was wondering if that could be avoided, especially for a no-PDB build...

pcc added inline comments. Nov 26 2018, 6:50 PM

lld/trunk/COFF/Driver.cpp
1294	Are you sure that this is what link.exe does? I'm pretty sure I added this here after noticing that this was what link.exe did, see D45737

After further testing, in some scenarios this patch doesn't seem to be enough for Recode to work. I'll pause this patch and follow up with Will (@lantictac) to ensure everything is covered.

lld/trunk/COFF/Driver.cpp
1294	import-tests.zip (46 KB)
	@pcc: Please see the zip file above for comparing the output from different linkers. I figured out a few more things with `link.exe` along the way:
	- `S_TRAMPOLINE` records are used for jump thunks when incremental linking is active. Of course this makes the produced EXE execute slower because there are missed function inlining opportunities.
	- The `* Linker *` module has a section contribution in `.idata` only when incremental linking is specified. Otherwise, there's no contrib section, only a debug stream.
	- There's a lot more `.text` padding (CCCCCC) around functions when incremental linking is active.
	- More importantly, `link.exe` folds `.idata` sections into `.rdata` *only* when `/INCREMENTAL:NO` is specified. Given that `/INCREMENTAL` is the default, I thought we should do the same in LLD. I would need to disable that behavior when providing `/INCREMENTAL:NO` (not in this patch yet).
	Doc and defaults for /INCREMENTAL are here.
lld/trunk/COFF/Writer.cpp
844–846	Agreed. Will do that.

Actually, the question about folding .idata into .rdata remains open. Should we follow the same behavior as link.exe or not, to reduce variability in the output?

In D54802#1309701, @aganea wrote:

Actually, the question about folding .idata into .rdata remains open. Should we follow the same behavior as link.exe or not, to reduce variability in the output?

I don't think we intend to support incremental linking, so I'd just fold idata into rdata by default as we do now.
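
As a rough illustration of how a default rule such as `.idata=.rdata` can coexist with user-supplied `/merge` rules, here is a simplified sketch in which the first rule recorded for a section wins and a conflicting later rule is rejected (where a real linker would warn). `MergeMap` and `addMergeRule` are hypothetical names; this is not LLD's `parseMerge`.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Simplified stand-in for a linker's merge table: maps a source section name
// to the section it is folded into.
using MergeMap = std::map<std::string, std::string>;

// Parse a rule of the form "from=to". The first rule recorded for a given
// source section wins. Returns false if the rule is malformed or conflicts
// with an earlier rule for the same source section.
bool addMergeRule(MergeMap &Merge, std::string_view Rule) {
  size_t Eq = Rule.find('=');
  if (Eq == std::string_view::npos || Eq == 0 || Eq + 1 == Rule.size())
    return false;
  std::string From(Rule.substr(0, Eq));
  std::string To(Rule.substr(Eq + 1));
  auto [It, Inserted] = Merge.emplace(From, To);
  return Inserted || It->second == To;
}
```

Adding user rules first and the defaults afterwards gives user rules precedence: a user rule `.idata=.mydata` survives a later default `.idata=.rdata`.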

rnk mentioned this in D54554: [PDB] Add symbol records in bulk. Nov 27 2018, 10:57 AM

In D54802#1309791, @rnk wrote:

I don't think we intend to support incremental linking, so I'd just fold idata into rdata by default as we do now.

I'd say incremental linking is the only major hurdle for widespread LLD adoption on industrial sites. Like we discussed at the LLVM conf., I agree that EXEs/DLLs shouldn't be incrementally linked. But what about incremental PDB generation?

For our typical gameplay development loop, for a DLL target, this is what we currently see:

  Input File Reading:          1847 ms (  3.7%)
  Code Layout:                  579 ms (  1.2%)
  PDB Emission (Cumulative):  43468 ms ( 87.8%)
    Add Objects:              33074 ms ( 66.8%)
      Type Merging:           28276 ms ( 57.1%)
      Symbol Merging:          4748 ms (  9.6%)
    TPI Stream Layout:         1487 ms (  3.0%)
    Globals Stream Layout:     1789 ms (  3.6%)
    Commit to Disk:            5981 ms ( 12.1%)
  Commit Output File:          2837 ms (  5.7%)
-------------------------------------------------
Total Link Time:              49494 ms (100.0%)

That is for the biggest DLL in the game, and inherently the one that programmers iterate the most on:

-- DLL: 182 MB
-- PDB: 930 MB
-- Merged OBJs: 179 (no GHASH, OBJs are from cl.exe; these are unity/blob .cpp files)
-- Inserted type records: 91,573,186
-- Final unique type records: 3,578,658 (hash table size: 8,388,608)
-- TypeTable.RecordStorage: 1,639,715,884 bytes allocated

For the same thing, link.exe /INCREMENTAL takes about 3 sec to relink any change. If we could at least do the Type/Symbol passes incrementally, that would be a major win.
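
For a sense of scale, the figures above imply roughly 25.6 inserted type records per unique record and a hash table that ends up about 43% full. A small sketch of that arithmetic; the constant and function names are illustrative, and the numbers are copied from the statistics above:

```cpp
#include <cstdint>

// Figures copied from the statistics above.
constexpr uint64_t InsertedTypeRecords = 91'573'186;
constexpr uint64_t UniqueTypeRecords = 3'578'658;
constexpr uint64_t HashTableSize = 8'388'608;

// The reported table size is a power of two (2^23).
static_assert(HashTableSize == (uint64_t{1} << 23), "table size is 2^23");

// Average number of inserted records per unique record.
constexpr double duplicationFactor() {
  return static_cast<double>(InsertedTypeRecords) / UniqueTypeRecords;
}

// Fraction of hash table slots occupied once merging is done.
constexpr double loadFactor() {
  return static_cast<double>(UniqueTypeRecords) / HashTableSize;
}
```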

Sorry, those were the wrong timings; they were for an MSVC-built LLD. With a Clang 7.0 build, things are a bit better in reality:

  Input File Reading:          1658 ms (  4.7%)
  Code Layout:                  621 ms (  1.8%)
  PDB Emission (Cumulative):  30380 ms ( 86.7%)
    Add Objects:              22615 ms ( 64.6%)
      Type Merging:           19205 ms ( 54.8%)
      Symbol Merging:          3385 ms (  9.7%)
    TPI Stream Layout:          897 ms (  2.6%)
    Globals Stream Layout:     1418 ms (  4.1%)
    Commit to Disk:            4559 ms ( 13.0%)
  Commit Output File:          1717 ms (  4.9%)
-------------------------------------------------
Total Link Time:              35021 ms (100.0%)

I noticed by the way that the default cmake config for Clang/LLVM/LLD comes with /INCREMENTAL and /MD (dll CRT) active. This causes a lot of thunking. I changed that to /INCREMENTAL:NO and /MT and things are a lot better - I'll send another review for that.

In D54802#1311401, @aganea wrote:

  Input File Reading:          1658 ms (  4.7%)
  Code Layout:                  621 ms (  1.8%)
  PDB Emission (Cumulative):  30380 ms ( 86.7%)
    Add Objects:              22615 ms ( 64.6%)
      Type Merging:           19205 ms ( 54.8%)
      Symbol Merging:          3385 ms (  9.7%)
    TPI Stream Layout:          897 ms (  2.6%)
    Globals Stream Layout:     1418 ms (  4.1%)
    Commit to Disk:            4559 ms ( 13.0%)
  Commit Output File:          1717 ms (  4.9%)
-------------------------------------------------
Total Link Time:              35021 ms (100.0%)

I think we can optimize copying symbols using the same techniques we've used for optimizing LLD. Symbol records are pretty straightforward: You figure out which .debug$S sections are live, relocate them, process them a bit, and copy the bytes to the PDB in some order. In D54554 I did some work, and that cuts down the "Commit to Disk" (13%) and "Symbol Merging" times, but there may be more things to do.

For types, I think this is one of the classic problems where you make a hash table and say "it's O(1) insertion" but really it's order "length of key", and probably with a high constant factor. I think if we looked at the distribution of type record sizes, we'd see a few categories like this:

- Small records, < 20 bytes, i.e. less than a SHA-1, like pointer types, qualified (const/volatile) types, array types, function types, etc. Merging these in the linker should be cheap, ghash or no ghash.
- Records with names, like LF_STRUCTURE and LF_PROC_ID. C++ mangled names can get long, but they probably aren't 64KB (the max CV record length) long. Think ~4KB.
- Long field lists. LF_FIELDLIST is a bear. I expect on average most field lists are 32KB plus, since they include the name of every field, method, and typedef. Templates tend to have lots of members, many of which are unused in most instantiations.

The other thing to keep in mind is that type hash table "hits" are common. Most types are duplicates. Given some 64KB field list record, you can expect it will be deduplicated O(#objects) times. Without the content hash, this means you have to memcmp the entire 64KB for every duplicate. With the hash, you do O(length of hash) work, so an 8- or 20-byte memcmp. I think that's where the real ghash gains are coming from. I'm not sure what to do with that information, but it seems like a useful insight.
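
A minimal sketch of the hash-keyed lookup described above, assuming a precomputed 8-byte content hash is supplied with each record. `HashedTypeTable` is a hypothetical class, not LLD's type merger; it ignores hash collisions and does not model how the hash itself is produced.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified type table keyed on a precomputed 8-byte content
// hash. It assumes distinct records never share a hash; collisions are not
// handled in this sketch.
class HashedTypeTable {
public:
  // Returns the index of the record, inserting it if its hash is new.
  // On a hit only the 8-byte hash is compared; the record bytes are not read.
  uint32_t insert(uint64_t ContentHash, const std::vector<uint8_t> &Record) {
    auto [It, Inserted] =
        Index.emplace(ContentHash, static_cast<uint32_t>(Records.size()));
    if (Inserted)
      Records.push_back(Record);
    return It->second;
  }

  size_t uniqueCount() const { return Records.size(); }

private:
  std::unordered_map<uint64_t, uint32_t> Index;
  std::vector<std::vector<uint8_t>> Records;
};
```

Inserting the same 64KB field list a second time costs one 8-byte key comparison here, whereas a table keyed on the record contents would have to compare all 64KB to confirm the duplicate.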

One thing we talked about a while ago was trying to parallelize type merging. I think the challenge there is it requires a good concurrent hash map implementation, which we'd have to find or implement ourselves.

thakis added a subscriber: thakis. Nov 28 2018, 3:35 PM

thakis added inline comments.

lld/trunk/COFF/PDB.cpp
1586	This regresses PDBs being independent of the build directory, cf r344061

In D54802#1311943, @rnk wrote:

I think the challenge there is it requires a good concurrent hash map implementation, which we'd have to find or implement ourselves.

I can provide that, we have a good lockless, insert-only, DenseHash implementation that we have been using for many years.

However, the issue at stake here is the high memory bandwidth, due to the large volume of data to process in a tight loop. As you point out, GHash uses a smaller key size (8 bytes), which packs more entries per cache line than the {hash, ArrayRef<U8>} key (24 bytes) used when GHash isn't enabled. I'm not entirely sure multithreading will help a lot, but we can give it a try. Packing the data more tightly in the DenseHash and avoiding indirections at all costs would probably benefit more in the short term.
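
To put numbers on the packing argument: assuming a 64-byte cache line and a 64-bit target (both assumptions, not stated in the thread), eight 8-byte keys fit in one line against two 24-byte keys. The struct names below are illustrative stand-ins, not LLD types.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache line size; 64 bytes is typical on current x86-64 parts.
constexpr size_t CacheLineBytes = 64;

// Key when a global hash is available: just the 8-byte hash.
struct GHashKey {
  uint64_t Hash;
};

// Illustrative stand-in for the {hash, ArrayRef<uint8_t>} key: a hash plus a
// pointer/length pair referring to the record bytes (24 bytes on a 64-bit
// target).
struct HashAndBytesKey {
  uint64_t Hash;
  const uint8_t *Data;
  size_t Length;
};

// Number of whole keys of the given type that fit in one cache line.
template <typename Key> constexpr size_t keysPerCacheLine() {
  return CacheLineBytes / sizeof(Key);
}

static_assert(sizeof(GHashKey) == 8, "the ghash key is 8 bytes");
```

The larger key also points at the record bytes, so a comparison may touch further cache lines beyond the one holding the key.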

lld/trunk/COFF/PDB.cpp
1586	Thanks for pointing out!

aganea mentioned this in D55293: [LLD][COFF] Partial sections. Dec 4 2018, 1:08 PM

aganea mentioned this in rLLD352336: [LLD][COFF] Partial sections. Jan 27 2019, 5:46 PM

aganea mentioned this in rL352336: [LLD][COFF] Partial sections.

aganea mentioned this in D57666: [LLD] [COFF] Avoid O(n^2) accesses into PartialSections. Feb 4 2019, 7:07 AM

mstorsjo added inline comments. Feb 4 2019, 12:36 PM

lld/trunk/test/COFF/hello32.test
60 ↗	(On Diff #174923)	I didn't read the whole patch in detail, but can you give a TL;DR about what's added to the output files in the case when no PDB output is enabled, that shuffles some sections forward now?

I still need to work on this patch. I'll mark it as pending in the meanwhile.

lld/trunk/test/COFF/hello32.test
60 ↗	(On Diff #174923)	Please see lld/trunk/COFF/Driver.cpp, L1226, I initially removed `parseMerge(".idata=.rdata");` And the comment above: "More importantly, link.exe folds .idata sections into .rdata only when /INCREMENTAL:NO is specified. Given that /INCREMENTAL is the default, I thought we should do the same in LLD. I would need to disable that behavior when providing /INCREMENTAL:NO (not in this patch yet)." However, it was decided that we'll keep the current .idata merging behavior for now.

mstorsjo added inline comments. Feb 4 2019, 12:50 PM

lld/trunk/test/COFF/hello32.test
60 ↗	(On Diff #174923)	Ok, I see. Thanks!

santagada added a subscriber: santagada. Mar 6 2019, 7:08 AM

Herald added a project: Restricted Project. Mar 6 2019, 7:08 AM

Herald added a subscriber: jdoerfert.

aganea mentioned this in D49231: Add import libraries to list of modules in PDB. Mar 14 2019, 11:52 AM

aganea mentioned this in D49230: Add support for COFF groups in * Linker * compiland in PDB.

Simplified & addressed comments.

I took the liberty to remove COFF groups from the MinGW target. This was adding a significant footprint to the PDB, given that each function is in its own .text section.
As an additional side effect, MinGW PDBs are now much smaller, because we don't emit empty debug streams anymore.

As an example, using Martin's test package, I generated libqt_plugin.pdb:

trunk                                                21 MB
                coff groups      import modules      27.1 MB
(this patch)    no coff groups   import modules      13.5 MB
                no coff groups   no import modules   13.3 MB

The impact is less significant for non-MinGW targets: on a 1 GB PDB this patch adds 2 MB, and the increase scales with the size (it actually depends on the number of imported libraries).

aganea edited the summary of this revision. Mar 15 2019, 1:35 PM

aganea edited the summary of this revision. Mar 15 2019, 1:40 PM

rnk added inline comments. Mar 15 2019, 2:10 PM

lld/trunk/COFF/PDB.cpp
1109–1110	Is this relevant to this change?
1577	DllModSet, maybe? DllToModuleDbi?
1595–1597	Usually we use Twine to avoid the cost of concatenating strings used for debugging purposes, like names for instructions that aren't used in release mode, or log statements. In this case, I think you can use plain old std::string, something like `addModuleInfo(std::string("Import:") + File->DLLName)`. With that simplification, I think the lambda is unnecessary.
llvm/trunk/lib/DebugInfo/PDB/Native/ModuleDebugStream.cpp
40–50 ↗	(On Diff #190888)	I don't see the change to add this method to the class in this patch. Was this change intended to be part of this review? It seems unnecessary for import files.
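
The std::string suggestion made for lines 1595–1597 above could look roughly like the following. `importModuleName` is a hypothetical helper, and `std::string_view` stands in for LLVM's `StringRef` so that the sketch stays self-contained.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: build the name of the PDB module that holds the
// import thunks for a DLL, using plain std::string concatenation instead of
// a Twine captured in a lambda.
std::string importModuleName(std::string_view DLLName) {
  return std::string("Import:") + std::string(DLLName);
}
```

The returned string owns its storage, so it can be handed to whatever stores module names without lifetime concerns.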

aganea marked 3 inline comments as done. Mar 15 2019, 2:21 PM

aganea added inline comments.

lld/trunk/COFF/PDB.cpp
1109–1110	Yes, I think we can remove that comment now, unless @zturner says otherwise.
llvm/trunk/lib/DebugInfo/PDB/Native/ModuleDebugStream.cpp
40–50 ↗	(On Diff #190888)	I indeed forgot an include file from the diff. But you're right, this is for supporting empty streams, I should maybe do that in another patch.

santagada awarded a token. Mar 20 2019, 6:10 AM

Rebased, reduced & addressed comments.

Anything missing for this to be accepted? It seems like this patch is ready.

lgtm, thanks!

In D54802#1440864, @santagada wrote:

Anything missing for this to be accepted? It seems like this patch is ready.

I just missed the update, sorry.

COFF/Writer.cpp
307 ↗	(On Diff #191490)	Needs wrapping

This revision is now accepted and ready to land. Mar 26 2019, 12:56 PM

Closed by commit rL357308: [LLD][COFF] Generate import modules & COFF groups in PDB (authored by aganea). Mar 29 2019, 1:26 PM

This revision was automatically updated to reflect the committed changes.

aganea marked an inline comment as done.

Revision Contents

Path

Size

lld/

trunk/

COFF/

4 lines

2 lines

148 lines

16 lines

18 lines

test/

COFF/

pdb-publics-import.test

155 lines

Diff 192898

lld/trunk/COFF/Driver.cpp

@@ 198 lines hidden @@ else
       Driver->addBuffer(std::move(MBOrErr.first), WholeArchive);
   });
 }

 void LinkerDriver::addArchiveBuffer(MemoryBufferRef MB, StringRef SymName,
                                     StringRef ParentName) {
   file_magic Magic = identify_magic(MB.getBuffer());
   if (Magic == file_magic::coff_import_library) {
-    Symtab->addFile(make<ImportFile>(MB));
+    InputFile *Imp = make<ImportFile>(MB);
+    Imp->ParentName = ParentName;
+    Symtab->addFile(Imp);
     return;
   }

   InputFile *Obj;
   if (Magic == file_magic::coff_object) {
     Obj = make<ObjFile>(MB);
   } else if (Magic == file_magic::bitcode) {
     Obj = make<BitcodeFile>(MB);
@@ 1,070 lines hidden @@ for (auto *Arg : Args.filtered(OPT_failifmismatch))
     checkFailIfMismatch(Arg->getValue(), nullptr);

   // Handle /merge
   for (auto *Arg : Args.filtered(OPT_merge))
     parseMerge(Arg->getValue());

   // Add default section merging rules after user rules. User rules take
   // precedence, but we will emit a warning if there is a conflict.
   parseMerge(".idata=.rdata");
   parseMerge(".didat=.rdata");
   parseMerge(".edata=.rdata");
   parseMerge(".xdata=.rdata");
   parseMerge(".bss=.data");

   if (Config->MinGW) {
     parseMerge(".ctors=.rdata");
     parseMerge(".dtors=.rdata");
@@ 432 lines hidden @@

lld/trunk/COFF/InputFiles.cpp

@@ 772 lines hidden @@
 static StringRef getBasename(StringRef Path) {
   return sys::path::filename(Path, sys::path::Style::windows);
 }

 // Returns a string in the format of "foo.obj" or "foo.obj(bar.lib)".
 std::string lld::toString(const coff::InputFile *File) {
   if (!File)
     return "<internal>";
-  if (File->ParentName.empty())
+  if (File->ParentName.empty() || File->kind() == coff::InputFile::ImportKind)
     return File->getName();

   return (getBasename(File->ParentName) + "(" + getBasename(File->getName()) +
           ")")
       .str();
 }
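
The effect of the InputFiles.cpp change can be sketched outside of lld as follows. `InputFileInfo`, `FileKind`, `baseName` and `describe` are simplified, hypothetical stand-ins for the real classes and helpers. The point illustrated is that an import file is now printed by its own name even though Driver.cpp records a ParentName for it.

```cpp
#include <cstddef>
#include <string>

// Simplified stand-ins for lld's input file classes; only the fields needed
// to illustrate the naming rule are modeled.
enum class FileKind { Object, Import };

struct InputFileInfo {
  std::string Name;       // e.g. "foo.obj", or the name of an import file
  std::string ParentName; // archive the file came from, empty if none
  FileKind Kind;
};

// Last path component, accepting both '\\' and '/' as separators.
std::string baseName(const std::string &Path) {
  size_t Pos = Path.find_last_of("\\/");
  return Pos == std::string::npos ? Path : Path.substr(Pos + 1);
}

// Mirrors the control flow of the patched function: files without a parent
// archive, and all import files, are described by their own name; other
// archive members are described as "<archive>(<member>)".
std::string describe(const InputFileInfo *File) {
  if (!File)
    return "<internal>";
  if (File->ParentName.empty() || File->Kind == FileKind::Import)
    return File->Name;
  return baseName(File->ParentName) + "(" + baseName(File->Name) + ")";
}
```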

lld/trunk/COFF/PDB.cpp

@@ 103 lines hidden @@ public:
   void initialize(llvm::codeview::DebugInfo *BuildId);

   /// Add natvis files specified on the command line.
   void addNatvisFiles();

   /// Link CodeView from each object file in the symbol table into the PDB.
   void addObjectsToPDB();

+  /// Link info for each import file in the symbol table into the PDB.
+  void addImportFilesToPDB(ArrayRef<OutputSection *> OutputSections);
+
   /// Link CodeView from a single object file into the target (output) PDB.
   /// When a precompiled headers object is linked, its TPI map might be provided
   /// externally.
   void addObjFile(ObjFile *File, CVIndexMap *ExternIndexMap = nullptr);

   /// Produce a mapping from the type and item indices used in the object
   /// file to those in the destination PDB.
   ///
@@ 753 lines hidden @@ static void scopeStackOpen(SmallVectorImpl<SymbolScope> &Stack,
   S.ScopeOffset = CurOffset;
   S.OpeningRecord = const_cast<ScopeRecord *>(
       reinterpret_cast<const ScopeRecord *>(Sym.content().data()));
   S.OpeningRecord->PtrParent = Stack.empty() ? 0 : Stack.back().ScopeOffset;
   Stack.push_back(S);
 }

 static void scopeStackClose(SmallVectorImpl<SymbolScope> &Stack,
-                            uint32_t CurOffset, ObjFile *File) {
+                            uint32_t CurOffset, InputFile *File) {
   if (Stack.empty()) {
     warn("symbol scopes are not balanced in " + File->getName());
     return;
   }
   SymbolScope S = Stack.pop_back_val();
   S.OpeningRecord->PtrEnd = CurOffset;
 }

@@ 208 lines hidden @@ if (auto *SecChunk = dyn_cast_or_null<SectionChunk>(C)) {
     SC.Imod = SecChunk->File->ModuleDBI->getModuleIndex();
     ArrayRef<uint8_t> Contents = SecChunk->getContents();
     JamCRC CRC(0);
     ArrayRef<char> CharContents = makeArrayRef(
         reinterpret_cast<const char *>(Contents.data()), Contents.size());
     CRC.update(CharContents);
     SC.DataCrc = CRC.getCRC();
   } else {
     SC.Characteristics = OS ? OS->Header.Characteristics : 0;
-    // FIXME: When we start creating DBI for import libraries, use those here.
     SC.Imod = Modi;
   }
   SC.RelocCrc = 0; // FIXME

   return SC;
 }

 static uint32_t
 translateStringTableIndex(uint32_t ObjIndex,
@@ 332 lines hidden @@ if (HasQ) {
       R.append(A);
     }
     if (HasWS || HasQ)
       R.push_back('"');
   }
   return R;
 }

-static void addCommonLinkerModuleSymbols(StringRef Path,
-                                         pdb::DbiModuleDescriptorBuilder &Mod,
-                                         BumpPtrAllocator &Allocator) {
-  ObjNameSym ONS(SymbolRecordKind::ObjNameSym);
-  Compile3Sym CS(SymbolRecordKind::Compile3Sym);
-  EnvBlockSym EBS(SymbolRecordKind::EnvBlockSym);
-
-  ONS.Name = "* Linker *";
-  ONS.Signature = 0;
-
+static void fillLinkerVerRecord(Compile3Sym &CS) {
   CS.Machine = toCodeViewMachine(Config->Machine);
   // Interestingly, if we set the string to 0.0.0.0, then when trying to view
   // local variables WinDbg emits an error that private symbols are not present.
   // By setting this to a valid MSVC linker version string, local variables are
   // displayed properly. As such, even though it is not representative of
   // LLVM's version information, we need this for compatibility.
   CS.Flags = CompileSym3Flags::None;
   CS.VersionBackendBuild = 25019;
   CS.VersionBackendMajor = 14;
   CS.VersionBackendMinor = 10;
   CS.VersionBackendQFE = 0;

   // MSVC also sets the frontend to 0.0.0.0 since this is specifically for the
   // linker module (which is by definition a backend), so we don't need to do
   // anything here. Also, it seems we can use "LLVM Linker" for the linker name
   // without any problems. Only the backend version has to be hardcoded to a
// magic number.		// magic number.
CS.VersionFrontendBuild = 0;		CS.VersionFrontendBuild = 0;
CS.VersionFrontendMajor = 0;		CS.VersionFrontendMajor = 0;
CS.VersionFrontendMinor = 0;		CS.VersionFrontendMinor = 0;
CS.VersionFrontendQFE = 0;		CS.VersionFrontendQFE = 0;
CS.Version = "LLVM Linker";		CS.Version = "LLVM Linker";
CS.setLanguage(SourceLanguage::Link);		CS.setLanguage(SourceLanguage::Link);
		}

		static void addCommonLinkerModuleSymbols(StringRef Path,
		pdb::DbiModuleDescriptorBuilder &Mod,
		BumpPtrAllocator &Allocator) {
		ObjNameSym ONS(SymbolRecordKind::ObjNameSym);
		EnvBlockSym EBS(SymbolRecordKind::EnvBlockSym);
		Compile3Sym CS(SymbolRecordKind::Compile3Sym);
		fillLinkerVerRecord(CS);

		ONS.Name = "* Linker *";
		ONS.Signature = 0;

ArrayRef<StringRef> Args = makeArrayRef(Config->Argv).drop_front();		ArrayRef<StringRef> Args = makeArrayRef(Config->Argv).drop_front();
std::string ArgStr = quote(Args);		std::string ArgStr = quote(Args);
EBS.Fields.push_back("cwd");		EBS.Fields.push_back("cwd");
SmallString<64> cwd;		SmallString<64> cwd;
if (Config->PDBSourcePath.empty())		if (Config->PDBSourcePath.empty())
sys::fs::current_path(cwd);		sys::fs::current_path(cwd);
else		else
[... 10 lines not shown ...]	static void addCommonLinkerModuleSymbols(StringRef Path,
Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(		Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(
ONS, Allocator, CodeViewContainer::Pdb));		ONS, Allocator, CodeViewContainer::Pdb));
Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(		Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(
CS, Allocator, CodeViewContainer::Pdb));		CS, Allocator, CodeViewContainer::Pdb));
Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(		Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(
EBS, Allocator, CodeViewContainer::Pdb));		EBS, Allocator, CodeViewContainer::Pdb));
}		}

		static void addLinkerModuleCoffGroup(PartialSection *Sec,
		pdb::DbiModuleDescriptorBuilder &Mod,
		OutputSection &OS,
		BumpPtrAllocator &Allocator) {
		// If there's a section, there's at least one chunk
		assert(!Sec->Chunks.empty());
		const Chunk *firstChunk = *Sec->Chunks.begin();
		const Chunk *lastChunk = *Sec->Chunks.rbegin();

		// Emit COFF group
		CoffGroupSym CGS(SymbolRecordKind::CoffGroupSym);
		CGS.Name = Sec->Name;
		CGS.Segment = OS.SectionIndex;
		CGS.Offset = firstChunk->getRVA() - OS.getRVA();
		CGS.Size = lastChunk->getRVA() + lastChunk->getSize() - firstChunk->getRVA();
		CGS.Characteristics = Sec->Characteristics;

		// Somehow .idata sections & sections groups in the debug symbol stream have
		// the "write" flag set. However the section header for the corresponding
		// .idata section doesn't have it.
		if (CGS.Name.startswith(".idata"))
		CGS.Characteristics |= llvm::COFF::IMAGE_SCN_MEM_WRITE;

		Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(
		CGS, Allocator, CodeViewContainer::Pdb));
		}

static void addLinkerModuleSectionSymbol(pdb::DbiModuleDescriptorBuilder &Mod,		static void addLinkerModuleSectionSymbol(pdb::DbiModuleDescriptorBuilder &Mod,
OutputSection &OS,		OutputSection &OS,
BumpPtrAllocator &Allocator) {		BumpPtrAllocator &Allocator) {
SectionSym Sym(SymbolRecordKind::SectionSym);		SectionSym Sym(SymbolRecordKind::SectionSym);
Sym.Alignment = 12; // 2^12 = 4KB		Sym.Alignment = 12; // 2^12 = 4KB
Sym.Characteristics = OS.Header.Characteristics;		Sym.Characteristics = OS.Header.Characteristics;
Sym.Length = OS.getVirtualSize();		Sym.Length = OS.getVirtualSize();
Sym.Name = OS.Name;		Sym.Name = OS.Name;
Sym.Rva = OS.getRVA();		Sym.Rva = OS.getRVA();
Sym.SectionNumber = OS.SectionIndex;		Sym.SectionNumber = OS.SectionIndex;
Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(		Mod.addSymbol(codeview::SymbolSerializer::writeOneSymbol(
Sym, Allocator, CodeViewContainer::Pdb));		Sym, Allocator, CodeViewContainer::Pdb));

		// Skip COFF groups in MinGW because it adds a significant footprint to the
		// PDB, due to each function being in its own section
		if (Config->MinGW)
		return;

		// Output COFF groups for individual chunks of this section.
		for (PartialSection *Sec : OS.ContribSections) {
		addLinkerModuleCoffGroup(Sec, Mod, OS, Allocator);
		}
		}

		// Add all import files as modules to the PDB.
		void PDBLinker::addImportFilesToPDB(ArrayRef<OutputSection *> OutputSections) {
		if (ImportFile::Instances.empty())
		return;

		std::map<std::string, llvm::pdb::DbiModuleDescriptorBuilder *> DllToModuleDbi;
		rnk: DllModSet, maybe? DllToModuleDbi?

		for (ImportFile *File : ImportFile::Instances) {
		if (!File->Live)
		continue;

		if (!File->ThunkSym)
		continue;

		std::string DLL = StringRef(File->DLLName).lower();
		thakis: This regresses PDBs being independent of the build directory, cf r344061
		aganea: Thanks for pointing out!
		llvm::pdb::DbiModuleDescriptorBuilder *&Mod = DllToModuleDbi[DLL];
		if (!Mod) {
		pdb::DbiStreamBuilder &DbiBuilder = Builder.getDbiBuilder();
		SmallString<128> LibPath = File->ParentName;
		pdbMakeAbsolute(LibPath);
		sys::path::native(LibPath);

		// Name modules similar to MSVC's link.exe.
		// The first module is the simple dll filename
		llvm::pdb::DbiModuleDescriptorBuilder &FirstMod =
		ExitOnErr(DbiBuilder.addModuleInfo(File->DLLName));
		rnk: Usually we use Twine to avoid the cost of concatenating strings used for debugging purposes, like names for instructions that aren't used in release mode, or log statements. In this case, I think you can use plain old std::string, something like `addModuleInfo(std::string("Import:") + File->DLLName)`. With that simplification, I think the lambda is unnecessary.
		FirstMod.setObjFileName(LibPath);
		pdb::SectionContrib SC =
		createSectionContrib(nullptr, llvm::pdb::kInvalidStreamIndex);
		FirstMod.setFirstSectionContrib(SC);

		// The second module is where the import stream goes.
		Mod = &ExitOnErr(DbiBuilder.addModuleInfo("Import:" + File->DLLName));
		Mod->setObjFileName(LibPath);
		}

		DefinedImportThunk *Thunk = cast<DefinedImportThunk>(File->ThunkSym);

		ObjNameSym ONS(SymbolRecordKind::ObjNameSym);
		Compile3Sym CS(SymbolRecordKind::Compile3Sym);
		Thunk32Sym TS(SymbolRecordKind::Thunk32Sym);
		ScopeEndSym ES(SymbolRecordKind::ScopeEndSym);

		ONS.Name = File->DLLName;
		ONS.Signature = 0;

		fillLinkerVerRecord(CS);

		TS.Name = Thunk->getName();
		TS.Parent = 0;
		TS.End = 0;
		TS.Next = 0;
		TS.Thunk = ThunkOrdinal::Standard;
		TS.Length = Thunk->getChunk()->getSize();
		TS.Segment = Thunk->getChunk()->getOutputSection()->SectionIndex;
		TS.Offset = Thunk->getChunk()->OutputSectionOff;

		Mod->addSymbol(codeview::SymbolSerializer::writeOneSymbol(
		ONS, Alloc, CodeViewContainer::Pdb));
		Mod->addSymbol(codeview::SymbolSerializer::writeOneSymbol(
		CS, Alloc, CodeViewContainer::Pdb));

		SmallVector<SymbolScope, 4> Scopes;
		CVSymbol NewSym = codeview::SymbolSerializer::writeOneSymbol(
		TS, Alloc, CodeViewContainer::Pdb);
		scopeStackOpen(Scopes, Mod->getNextSymbolOffset(), NewSym);

		Mod->addSymbol(NewSym);

		NewSym = codeview::SymbolSerializer::writeOneSymbol(ES, Alloc,
		CodeViewContainer::Pdb);
		scopeStackClose(Scopes, Mod->getNextSymbolOffset(), File);

		Mod->addSymbol(NewSym);

		pdb::SectionContrib SC =
		createSectionContrib(Thunk->getChunk(), Mod->getModuleIndex());
		Mod->setFirstSectionContrib(SC);
		}
}		}
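The per-DLL bookkeeping in addImportFilesToPDB above can be sketched without the PDB builder. This is a minimal sketch, assuming a hypothetical `ModuleList` that only records module names in creation order; `getImportModule` and `toLower` are illustrative helpers, not LLD functions. It shows the lookup keyed on the lowercased DLL name, the two modules created the first time a DLL is seen, and the plain `std::string` concatenation for the `Import:` name suggested in review.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the DBI builder: records module names in order.
struct ModuleList {
  std::vector<std::string> Names;
  std::size_t add(const std::string &Name) {
    Names.push_back(Name);
    return Names.size() - 1;
  }
};

// Lowercase copy, used only as the map key so lookups ignore case.
static std::string toLower(std::string S) {
  std::transform(S.begin(), S.end(), S.begin(), [](unsigned char C) {
    return static_cast<char>(std::tolower(C));
  });
  return S;
}

// Returns the index of the "Import:<dll>" module for DLLName. The first time
// a DLL is seen, two modules are created: one named after the DLL itself and
// one prefixed with "Import:" that receives the thunk symbols.
std::size_t getImportModule(ModuleList &Mods,
                            std::map<std::string, std::size_t> &DllToModule,
                            const std::string &DLLName) {
  std::string Key = toLower(DLLName);
  auto It = DllToModule.find(Key);
  if (It != DllToModule.end())
    return It->second;
  Mods.add(DLLName);
  std::size_t Idx = Mods.add(std::string("Import:") + DLLName);
  DllToModule[Key] = Idx;
  return Idx;
}
```

Calling `getImportModule` for `Foo.dll` and then `foo.DLL` returns the same index, so every thunk imported from one DLL lands in a single `Import:` module.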

// Creates a PDB file.		// Creates a PDB file.
void coff::createPDB(SymbolTable *Symtab,		void coff::createPDB(SymbolTable *Symtab,
ArrayRef<OutputSection *> OutputSections,		ArrayRef<OutputSection *> OutputSections,
ArrayRef<uint8_t> SectionTable,		ArrayRef<uint8_t> SectionTable,
llvm::codeview::DebugInfo *BuildId) {		llvm::codeview::DebugInfo *BuildId) {
ScopedTimer T1(TotalPdbLinkTimer);		ScopedTimer T1(TotalPdbLinkTimer);
PDBLinker PDB(Symtab);		PDBLinker PDB(Symtab);

PDB.initialize(BuildId);		PDB.initialize(BuildId);
PDB.addObjectsToPDB();		PDB.addObjectsToPDB();
		PDB.addImportFilesToPDB(OutputSections);
PDB.addSections(OutputSections, SectionTable);		PDB.addSections(OutputSections, SectionTable);
PDB.addNatvisFiles();		PDB.addNatvisFiles();

ScopedTimer T2(DiskCommitTimer);		ScopedTimer T2(DiskCommitTimer);
codeview::GUID Guid;		codeview::GUID Guid;
PDB.commit(&Guid);		PDB.commit(&Guid);
memcpy(&BuildId->PDB70.Signature, &Guid, 16);		memcpy(&BuildId->PDB70.Signature, &Guid, 16);

[... 227 lines not shown ...]

lld/trunk/COFF/Writer.h

	[... 16 lines not shown ...]
	#include <vector>			#include <vector>

	namespace lld {			namespace lld {
	namespace coff {			namespace coff {
	static const int PageSize = 4096;			static const int PageSize = 4096;

	void writeResult();			void writeResult();

				class PartialSection {
				public:
				PartialSection(StringRef N, uint32_t Chars)
				: Name(N), Characteristics(Chars) {}
				StringRef Name;
				unsigned Characteristics;
				std::vector<Chunk *> Chunks;
				};

	// OutputSection represents a section in an output file. It's a			// OutputSection represents a section in an output file. It's a
	// container of chunks. OutputSection and Chunk are 1:N relationship.			// container of chunks. OutputSection and Chunk are 1:N relationship.
	// Chunks cannot belong to more than one OutputSections. The writer			// Chunks cannot belong to more than one OutputSections. The writer
	// creates multiple OutputSections and assign them unique,			// creates multiple OutputSections and assign them unique,
	// non-overlapping file offsets and RVAs.			// non-overlapping file offsets and RVAs.
	class OutputSection {			class OutputSection {
	public:			public:
	OutputSection(llvm::StringRef N, uint32_t Chars) : Name(N) {			OutputSection(llvm::StringRef N, uint32_t Chars) : Name(N) {
	Header.Characteristics = Chars;			Header.Characteristics = Chars;
	}			}
	void addChunk(Chunk *C);			void addChunk(Chunk *C);
	void insertChunkAtStart(Chunk *C);			void insertChunkAtStart(Chunk *C);
	void merge(OutputSection *Other);			void merge(OutputSection *Other);
	void setPermissions(uint32_t C);			void setPermissions(uint32_t C);
	uint64_t getRVA() { return Header.VirtualAddress; }			uint64_t getRVA() { return Header.VirtualAddress; }
	uint64_t getFileOff() { return Header.PointerToRawData; }			uint64_t getFileOff() { return Header.PointerToRawData; }
	void writeHeaderTo(uint8_t *Buf);			void writeHeaderTo(uint8_t *Buf);
				void addContributingPartialSection(PartialSection *Sec);

	// Returns the size of this section in an executable memory image.			// Returns the size of this section in an executable memory image.
	// This may be smaller than the raw size (the raw size is multiple			// This may be smaller than the raw size (the raw size is multiple
	// of disk sector size, so there may be padding at end), or may be			// of disk sector size, so there may be padding at end), or may be
	// larger (if that's the case, the loader reserves spaces after end			// larger (if that's the case, the loader reserves spaces after end
	// of raw data).			// of raw data).
	uint64_t getVirtualSize() { return Header.VirtualSize; }			uint64_t getVirtualSize() { return Header.VirtualSize; }

	// Returns the size of the section in the output file.			// Returns the size of the section in the output file.
	uint64_t getRawSize() { return Header.SizeOfRawData; }			uint64_t getRawSize() { return Header.SizeOfRawData; }

	// Set offset into the string table storing this section name.			// Set offset into the string table storing this section name.
	// Used only when the name is longer than 8 bytes.			// Used only when the name is longer than 8 bytes.
	void setStringTableOff(uint32_t V) { StringTableOff = V; }			void setStringTableOff(uint32_t V) { StringTableOff = V; }

	// N.B. The section index is one based.			// N.B. The section index is one based.
	uint32_t SectionIndex = 0;			uint32_t SectionIndex = 0;

	llvm::StringRef Name;			llvm::StringRef Name;
	llvm::object::coff_section Header = {};			llvm::object::coff_section Header = {};

	std::vector<Chunk *> Chunks;			std::vector<Chunk *> Chunks;
	std::vector<Chunk *> OrigChunks;			std::vector<Chunk *> OrigChunks;

				std::vector<PartialSection *> ContribSections;

	private:			private:
	uint32_t StringTableOff = 0;			uint32_t StringTableOff = 0;
	};			};

	}			} // namespace coff
	}			} // namespace lld

	#endif			#endif

lld/trunk/COFF/Writer.cpp

[... 169 lines not shown ...]	bool operator<(const PartialSectionKey &Other) const {
if (C == 1)		if (C == 1)
return false;		return false;
if (C == 0)		if (C == 0)
return Characteristics < Other.Characteristics;		return Characteristics < Other.Characteristics;
return true;		return true;
}		}
};		};

class PartialSection {
public:
PartialSection(StringRef N, uint32_t Chars)
: Name(N), Characteristics(Chars) {}
StringRef Name;
unsigned Characteristics;
std::vector<Chunk *> Chunks;
};

// The writer writes a SymbolTable result to a file.		// The writer writes a SymbolTable result to a file.
class Writer {		class Writer {
public:		public:
Writer() : Buffer(errorHandler().OutputBuffer) {}		Writer() : Buffer(errorHandler().OutputBuffer) {}
void run();		void run();

private:		private:
void createSections();		void createSections();
[... 112 lines not shown ...]	void OutputSection::setPermissions(uint32_t C) {
Header.Characteristics |= C;		Header.Characteristics |= C;
}		}

void OutputSection::merge(OutputSection *Other) {		void OutputSection::merge(OutputSection *Other) {
for (Chunk *C : Other->Chunks)		for (Chunk *C : Other->Chunks)
C->setOutputSection(this);		C->setOutputSection(this);
Chunks.insert(Chunks.end(), Other->Chunks.begin(), Other->Chunks.end());		Chunks.insert(Chunks.end(), Other->Chunks.begin(), Other->Chunks.end());
Other->Chunks.clear();		Other->Chunks.clear();
		ContribSections.insert(ContribSections.end(), Other->ContribSections.begin(),
		Other->ContribSections.end());
		Other->ContribSections.clear();
}		}

// Write the section header to a given buffer.		// Write the section header to a given buffer.
void OutputSection::writeHeaderTo(uint8_t *Buf) {		void OutputSection::writeHeaderTo(uint8_t *Buf) {
auto Hdr = reinterpret_cast<coff_section *>(Buf);		auto Hdr = reinterpret_cast<coff_section *>(Buf);
*Hdr = Header;		*Hdr = Header;
if (StringTableOff) {		if (StringTableOff) {
// If name is too long, write offset into the string table as a name.		// If name is too long, write offset into the string table as a name.
sprintf(Hdr->Name, "/%d", StringTableOff);		sprintf(Hdr->Name, "/%d", StringTableOff);
} else {		} else {
assert(!Config->Debug || Name.size() <= COFF::NameSize ||		assert(!Config->Debug || Name.size() <= COFF::NameSize ||
(Hdr->Characteristics & IMAGE_SCN_MEM_DISCARDABLE) == 0);		(Hdr->Characteristics & IMAGE_SCN_MEM_DISCARDABLE) == 0);
strncpy(Hdr->Name, Name.data(),		strncpy(Hdr->Name, Name.data(),
std::min(Name.size(), (size_t)COFF::NameSize));		std::min(Name.size(), (size_t)COFF::NameSize));
}		}
}		}

		void OutputSection::addContributingPartialSection(PartialSection *Sec) {
		ContribSections.push_back(Sec);
		}

} // namespace coff		} // namespace coff
} // namespace lld		} // namespace lld

// Check whether the target address S is in range from a relocation		// Check whether the target address S is in range from a relocation
// of type RelType at address P.		// of type RelType at address P.
static bool isInRange(uint16_t RelType, uint64_t S, uint64_t P, int Margin) {		static bool isInRange(uint16_t RelType, uint64_t S, uint64_t P, int Margin) {
if (Config->Machine == ARMNT) {		if (Config->Machine == ARMNT) {
int64_t Diff = AbsoluteDifference(S, P + 4) + Margin;		int64_t Diff = AbsoluteDifference(S, P + 4) + Margin;
[... 497 lines not shown ...]	if (Name == ".CRT") {
log("Processing section " + PSec->Name + " -> " + Name);		log("Processing section " + PSec->Name + " -> " + Name);

sortCRTSectionChunks(PSec->Chunks);		sortCRTSectionChunks(PSec->Chunks);
}		}

OutputSection *Sec = CreateSection(Name, OutChars);		OutputSection *Sec = CreateSection(Name, OutChars);
for (Chunk *C : PSec->Chunks)		for (Chunk *C : PSec->Chunks)
Sec->addChunk(C);		Sec->addChunk(C);

		Sec->addContributingPartialSection(PSec);
}		}

		rnk: This seems like it's really just replicating `Map` into some longer lived data structure, and then creating a backreference from the output section to the concatenated input section. There are three other helper functions that take a reference to `Map`, and that type is ridiculously long.
		I think we could clean this code up significantly if we planned some NFC changes first to make `Map` a member of `Writer` and then give it a real key and value type. WDYT? I'm imagining that the type would become: `std::map<NameAndChars, InputSection>`. That would have the side effect of cleaning up a lot of unreadable `Pair.second` and `Pair.first.first` accesses in this code.
		Honestly I started looking at this at first just because we copy the `Chunks` vector again, and I was wondering if that could be avoided, especially for a no-PDB build...
		aganea: Agreed. Will do that.
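The refactoring proposed in this thread can be sketched as follows. This is a rough illustration only, not the change that was eventually committed: `NameAndChars` is the name floated in the comment, while its fields, the simplified `PartialSection` (chunk ids instead of `Chunk *`), `PartialSectionMap` and `addChunk` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Key with named fields instead of a nested pair, so call sites read
// Key.Name / Key.Characteristics rather than Pair.first.first.
struct NameAndChars {
  std::string Name;
  uint32_t Characteristics;

  // Order by name first, then by characteristics.
  bool operator<(const NameAndChars &Other) const {
    return std::tie(Name, Characteristics) <
           std::tie(Other.Name, Other.Characteristics);
  }
};

// Simplified value type: chunk ids stand in for Chunk pointers.
struct PartialSection {
  std::vector<int> Chunks;
};

using PartialSectionMap = std::map<NameAndChars, PartialSection>;

// Append a chunk to the partial section with this name and characteristics,
// creating the partial section on first use.
void addChunk(PartialSectionMap &Map, const std::string &Name, uint32_t Chars,
              int ChunkId) {
  Map[NameAndChars{Name, Chars}].Chunks.push_back(ChunkId);
}
```

Two input sections with the same name but different characteristics end up in separate map entries, which is what the existing `PartialSectionKey` comparison in Writer.cpp also distinguishes.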
// Finally, move some output sections to the end.		// Finally, move some output sections to the end.
auto SectionOrder = [&](OutputSection *S) {		auto SectionOrder = [&](OutputSection *S) {
// Move DISCARDABLE (or non-memory-mapped) sections to the end of file because		// Move DISCARDABLE (or non-memory-mapped) sections to the end of file because
// the loader cannot handle holes. Stripping can remove other discardable ones		// the loader cannot handle holes. Stripping can remove other discardable ones
// than .reloc, which is first of them (created early).		// than .reloc, which is first of them (created early).
if (S->Header.Characteristics & IMAGE_SCN_MEM_DISCARDABLE)		if (S->Header.Characteristics & IMAGE_SCN_MEM_DISCARDABLE)
return 2;		return 2;
// .rsrc should come at the end of the non-discardable sections because its		// .rsrc should come at the end of the non-discardable sections because its
[... 1,005 lines not shown ...]

lld/trunk/test/COFF/pdb-publics-import.test

	Make a DLL that exports a few functions, then make a DLL with PDBs that imports			Make a DLL that exports a few functions, then make a DLL with PDBs that imports
	them. Check that the __imp_ pointer and the generated thunks appear in the			them. Check that the __imp_ pointer and the generated thunks appear in the
	publics stream.			publics stream.

	RUN: yaml2obj < %p/Inputs/export.yaml > %t1.obj			RUN: yaml2obj < %p/Inputs/export.yaml > %t1.obj
	RUN: lld-link /out:%t1.dll /dll %t1.obj /implib:%t1.lib \			RUN: lld-link /out:%t1.dll /dll %t1.obj /implib:%t1.lib \
	RUN: /export:exportfn1 /export:exportfn2			RUN: /export:exportfn1 /export:exportfn2
	RUN: yaml2obj < %p/Inputs/import.yaml > %t2.obj			RUN: yaml2obj < %p/Inputs/import.yaml > %t2.obj
	RUN: lld-link /out:%t2.exe /pdb:%t2.pdb /pdbaltpath:test.pdb \			RUN: lld-link /out:%t2.exe /pdb:%t2.pdb /pdbaltpath:test.pdb \
	RUN: /debug /entry:main %t2.obj %t1.lib			RUN: /debug /entry:main %t2.obj %t1.lib
	RUN: llvm-pdbutil dump %t2.pdb -publics -section-contribs | FileCheck %s			RUN: llvm-pdbutil dump %t2.pdb -all | FileCheck %s

				CHECK: Streams
				CHECK-NEXT: ============================================================
				CHECK-LABEL: Stream 10 ( 256 bytes): [Module "Import:pdb-publics-import.test.tmp1.dll"]

				CHECK: Module Stats
				CHECK-NEXT: ============================================================
				CHECK-NEXT: Mod 0000 | `{{.*}}pdb-publics-import.test.tmp2.obj`:
				CHECK-NEXT: Mod 0 (debug info not present): [{{.*}}pdb-publics-import.test.tmp2.obj]
				CHECK-NEXT: Mod 0001 | `pdb-publics-import.test.tmp1.dll`:
				CHECK-NEXT: Mod 1 (debug info not present): [pdb-publics-import.test.tmp1.dll]
				CHECK-NEXT: Mod 0002 | `Import:pdb-publics-import.test.tmp1.dll`:
				CHECK-NEXT: Stream 10, 256 bytes

				CHECK: Modules
				CHECK-NEXT: ============================================================
				CHECK-NEXT: Mod 0000 | `{{.*}}pdb-publics-import.test.tmp2.obj`:
				CHECK-NEXT: SC[.text] | mod = 0, 0001:0000, size = 8, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_CODE | IMAGE_SCN_ALIGN_4BYTES | IMAGE_SCN_MEM_EXECUTE |
				CHECK-NEXT: IMAGE_SCN_MEM_READ
				CHECK-NEXT: Obj: `{{.*}}pdb-publics-import.test.tmp2.obj`:
				CHECK-NEXT: debug stream: 65535, # files: 0, has ec info: false
				CHECK-NEXT: pdb file ni: 0 ``, src file ni: 0 ``
				CHECK-NEXT: Mod 0001 | `pdb-publics-import.test.tmp1.dll`:
				CHECK-NEXT: SC[???] | mod = 65535, 65535:0000, size = -1, data crc = 0, reloc crc = 0
				CHECK-NEXT: none
				CHECK-NEXT: Obj: `{{.*}}pdb-publics-import.test.tmp1.lib`:
				CHECK-NEXT: debug stream: 65535, # files: 0, has ec info: false
				CHECK-NEXT: pdb file ni: 0 ``, src file ni: 0 ``
				CHECK-NEXT: Mod 0002 | `Import:pdb-publics-import.test.tmp1.dll`:
				CHECK-NEXT: SC[.text] | mod = 2, 0001:0032, size = 6, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ
				CHECK-NEXT: Obj: `{{.*}}pdb-publics-import.test.tmp1.lib`:
				CHECK-NEXT: debug stream: 10, # files: 0, has ec info: false
				CHECK-NEXT: pdb file ni: 0 ``, src file ni: 0 ``
				CHECK-NEXT: Mod 0003 | `* Linker *`:
				CHECK-NEXT: SC[???] | mod = 65535, 65535:0000, size = -1, data crc = 0, reloc crc = 0
				CHECK-NEXT: none
				CHECK-NEXT: Obj: ``:
				CHECK-NEXT: debug stream: 11, # files: 0, has ec info: false
				CHECK-NEXT: pdb file ni: 1 `{{.*}}pdb-publics-import.test.tmp2.pdb`, src file ni: 0 ``

	CHECK: Public Symbols			CHECK: Public Symbols
	CHECK-NEXT: ============================================================			CHECK-NEXT: ============================================================
	CHECK-NEXT: Records			CHECK-NEXT: Records
	CHECK-NEXT: 112 | S_PUB32 [size = 20] `main`			CHECK-NEXT: 112 | S_PUB32 [size = 20] `main`
	CHECK-NEXT: flags = function, addr = 0001:0000			CHECK-NEXT: flags = function, addr = 0001:0000
	CHECK-NEXT: 64 | S_PUB32 [size = 24] `exportfn1`			CHECK-NEXT: 64 | S_PUB32 [size = 24] `exportfn1`
	CHECK-NEXT: flags = function, addr = 0001:0016			CHECK-NEXT: flags = function, addr = 0001:0016
	CHECK-NEXT: 88 | S_PUB32 [size = 24] `exportfn2`			CHECK-NEXT: 88 | S_PUB32 [size = 24] `exportfn2`
	CHECK-NEXT: flags = function, addr = 0001:0032			CHECK-NEXT: flags = function, addr = 0001:0032
	CHECK-NEXT: 32 | S_PUB32 [size = 32] `__imp_exportfn2`			CHECK-NEXT: 32 | S_PUB32 [size = 32] `__imp_exportfn2`
	CHECK-NEXT: flags = none, addr = 0002:0136			CHECK-NEXT: flags = none, addr = 0002:0136
	CHECK-NEXT: 0 | S_PUB32 [size = 32] `__imp_exportfn1`			CHECK-NEXT: 0 | S_PUB32 [size = 32] `__imp_exportfn1`
	CHECK-NEXT: flags = none, addr = 0002:0128			CHECK-NEXT: flags = none, addr = 0002:0128

				CHECK: Symbols
				CHECK-NEXT: ============================================================
				CHECK-NEXT: Mod 0000 | `{{.*}}pdb-publics-import.test.tmp2.obj`:
				CHECK-NEXT: Error loading module stream 0. The specified stream could not be loaded. Module stream not present
				CHECK-NEXT: Mod 0001 | `pdb-publics-import.test.tmp1.dll`:
				CHECK-NEXT: Error loading module stream 1. The specified stream could not be loaded. Module stream not present
				CHECK-NEXT: Mod 0002 | `Import:pdb-publics-import.test.tmp1.dll`:
				CHECK-NEXT: 4 | S_OBJNAME [size = 44] sig=0, `pdb-publics-import.test.tmp1.dll`
				CHECK-NEXT: 48 | S_COMPILE3 [size = 40]
				CHECK-NEXT: machine = intel x86-x64, Ver = LLVM Linker, language = link
				CHECK-NEXT: frontend = 0.0.0.0, backend = 14.10.25019.0
				CHECK-NEXT: flags = none
				CHECK-NEXT: 88 | S_THUNK32 [size = 36] `exportfn1`
				CHECK-NEXT: parent = 0, end = 124, next = 0
				CHECK-NEXT: kind = thunk, size = 6, addr = 0001:0016
				CHECK-NEXT: 124 | S_END [size = 4]
				CHECK-NEXT: 128 | S_OBJNAME [size = 44] sig=0, `pdb-publics-import.test.tmp1.dll`
				CHECK-NEXT: 172 | S_COMPILE3 [size = 40]
				CHECK-NEXT: machine = intel x86-x64, Ver = LLVM Linker, language = link
				CHECK-NEXT: frontend = 0.0.0.0, backend = 14.10.25019.0
				CHECK-NEXT: flags = none
				CHECK-NEXT: 212 | S_THUNK32 [size = 36] `exportfn2`
				CHECK-NEXT: parent = 0, end = 248, next = 0
				CHECK-NEXT: kind = thunk, size = 6, addr = 0001:0032
				CHECK-NEXT: 248 | S_END [size = 4]
				CHECK-NEXT: Mod 0003 | `* Linker *`:
				CHECK-NEXT: 4 | S_OBJNAME [size = 20] sig=0, `* Linker *`
				CHECK-NEXT: 24 | S_COMPILE3 [size = 40]
				CHECK-NEXT: machine = intel x86-x64, Ver = LLVM Linker, language = link
				CHECK-NEXT: frontend = 0.0.0.0, backend = 14.10.25019.0
				CHECK-NEXT: flags = none
				CHECK-NEXT: 64 | S_ENVBLOCK [size = {{[0-9]+}}]
				CHECK: {{[0-9]+}} | S_SECTION [size = 28] `.text`
				CHECK-NEXT: length = 38, alignment = 12, rva = 4096, section # = 1
				CHECK-NEXT: characteristics =
				CHECK-NEXT: code
				CHECK-NEXT: execute permissions
				CHECK-NEXT: read permissions
				CHECK-NEXT: {{[0-9]+}} | S_COFFGROUP [size = 24] `.text`
				CHECK-NEXT: length = 8, addr = 0001:0000
				CHECK-NEXT: characteristics =
				CHECK-NEXT: code
				CHECK-NEXT: execute permissions
				CHECK-NEXT: read permissions
				CHECK-NEXT: {{[0-9]+}} | S_SECTION [size = 28] `.rdata`
				CHECK-NEXT: length = 209, alignment = 12, rva = 8192, section # = 2
				CHECK-NEXT: characteristics =
				CHECK-NEXT: initialized data
				CHECK-NEXT: read permissions
				CHECK-NEXT: {{[0-9]+}} | S_COFFGROUP [size = 28] `.idata$2`
				CHECK-NEXT: length = 40, addr = 0002:0061
				CHECK-NEXT: characteristics =
				CHECK-NEXT: initialized data
				CHECK-NEXT: read permissions
				CHECK-NEXT: write permissions
				CHECK-NEXT: {{[0-9]+}} | S_COFFGROUP [size = 28] `.idata$4`
				CHECK-NEXT: length = 24, addr = 0002:0104
				CHECK-NEXT: characteristics =
				CHECK-NEXT: initialized data
				CHECK-NEXT: read permissions
				CHECK-NEXT: write permissions
				CHECK-NEXT: {{[0-9]+}} | S_COFFGROUP [size = 28] `.idata$5`
				CHECK-NEXT: length = 24, addr = 0002:0128
				CHECK-NEXT: characteristics =
				CHECK-NEXT: initialized data
				CHECK-NEXT: read permissions
				CHECK-NEXT: write permissions
				CHECK-NEXT: {{[0-9]+}} | S_COFFGROUP [size = 28] `.idata$6`
				CHECK-NEXT: length = 24, addr = 0002:0152
				CHECK-NEXT: characteristics =
				CHECK-NEXT: initialized data
				CHECK-NEXT: read permissions
				CHECK-NEXT: write permissions
				CHECK-NEXT: {{[0-9]+}} | S_COFFGROUP [size = 28] `.idata$7`
				CHECK-NEXT: length = 33, addr = 0002:0176
				CHECK-NEXT: characteristics =
				CHECK-NEXT: initialized data
				CHECK-NEXT: read permissions
				CHECK-NEXT: write permissions

	CHECK: Section Contributions			CHECK: Section Contributions
	CHECK-NEXT: ============================================================			CHECK-NEXT: ============================================================
	main			main
	CHECK-NEXT: SC[.text] | mod = 0, 0001:0000, size = 8, data crc = 0, reloc crc = 0			CHECK-NEXT: SC[.text] | mod = 0, 0001:0000, size = 8, data crc = 0, reloc crc = 0
	CHECK-NEXT: IMAGE_SCN_CNT_CODE | IMAGE_SCN_ALIGN_4BYTES | IMAGE_SCN_MEM_EXECUTE |			CHECK-NEXT: IMAGE_SCN_CNT_CODE | IMAGE_SCN_ALIGN_4BYTES | IMAGE_SCN_MEM_EXECUTE |
	CHECK-NEXT: IMAGE_SCN_MEM_READ			CHECK-NEXT: IMAGE_SCN_MEM_READ
	exportfn1 thunk			exportfn1 thunk
	CHECK-NEXT: SC[.text] | mod = 1, 0001:0016, size = 6, data crc = 0, reloc crc = 0			CHECK-NEXT: SC[.text] | mod = 3, 0001:0016, size = 6, data crc = 0, reloc crc = 0
	CHECK-NEXT: IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ			CHECK-NEXT: IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ
	exportfn2 thunk			exportfn2 thunk
	CHECK-NEXT: SC[.text] | mod = 1, 0001:0032, size = 6, data crc = 0, reloc crc = 0			CHECK-NEXT: SC[.text] | mod = 3, 0001:0032, size = 6, data crc = 0, reloc crc = 0
	CHECK-NEXT: IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ			CHECK-NEXT: IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ
	.rdata debug directory data chunks			.rdata debug directory data chunks
	CHECK-NEXT: SC[.rdata] | mod = 1, 0002:0000, size = 28, data crc = 0, reloc crc = 0			CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0000, size = 28, data crc = 0, reloc crc = 0
	CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
	CHECK-NEXT: SC[.rdata] | mod = 1, 0002:0028, size = 33, data crc = 0, reloc crc = 0
	CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ			CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0028, size = 33, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0061, size = 20, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0081, size = 20, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0104, size = 8, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0112, size = 8, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0120, size = 8, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0128, size = 8, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0136, size = 8, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0144, size = 8, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0152, size = 12, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0164, size = 12, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
				CHECK-NEXT: SC[.rdata] | mod = 3, 0002:0176, size = 33, data crc = 0, reloc crc = 0
				CHECK-NEXT: IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ
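
The S_COFFGROUP offsets and lengths checked above follow from the arithmetic in addLinkerModuleCoffGroup: the group starts at the first chunk and ends where the last chunk ends. A minimal sketch of that calculation, assuming hypothetical `ChunkExtent` and `GroupExtent` structs in place of LLD's `Chunk` and `CoffGroupSym`, and assuming the two 20-byte contributions at 0002:0061 and 0002:0081 are the chunks of `.idata$2`:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins: a chunk's placement, and the resulting group extent.
struct ChunkExtent {
  uint32_t RVA;
  uint32_t Size;
};

struct GroupExtent {
  uint32_t Offset; // offset of the group within its output section
  uint32_t Size;   // bytes from the first chunk's start to the last chunk's end
};

// Mirrors the Offset/Size computation in the patch. Chunks must be non-empty
// and ordered by RVA, as they are within a partial section.
GroupExtent coffGroupExtent(const std::vector<ChunkExtent> &Chunks,
                            uint32_t SectionRVA) {
  assert(!Chunks.empty());
  const ChunkExtent &First = Chunks.front();
  const ChunkExtent &Last = Chunks.back();
  return {First.RVA - SectionRVA, Last.RVA + Last.Size - First.RVA};
}
```

With `.rdata` at RVA 8192, chunks at RVAs 8253 and 8273 of 20 bytes each give offset 61 and length 40, which matches the `.idata$2` expectation; the three 8-byte chunks starting at offset 104 give length 24, which matches `.idata$4`.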