This is an archive of the discontinued LLVM Phabricator instance.

[LLD][COFF] Partial sections
ClosedPublic

Authored by aganea on Dec 4 2018, 1:08 PM.

Details

Summary

As suggested by @rnk here, I persisted the Map variable, previously in lld::coff::Writer::createSections(), now as a member in class lld::coff::Writer and converted its data into a PartialSection.

PartialSections are simply sections that contribute to an OutputSection.
This is needed later by D54802 to generate COFF groups in the * Linker * module.

I used a std::set below instead of a DenseSet because we need to retain a sorted order when enumerating.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

aganea created this revision.Dec 4 2018, 1:08 PM
rnk added a subscriber: mstorsjo.Dec 12 2018, 3:09 PM

I used a std::map below instead of a DenseHash because we need to retain a sorted order when enumerating.

Yep, I looked at removing this before. It might actually be worth it for mingw, because GCC outputs sections named like ".text$_Z3foov", so you can imagine that this map is going to have one entry per symbol, so one extra memory allocation per symbol.

@mstorsjo, I assume that ld.bfd doesn't actually sort the .text (and .pdata and .xdata) section contributions alphabetically. Should we drop the "$_Z3foov" portion of the object file section name before making the unmerged section? It might matter for performance.

COFF/Writer.cpp
335

LLVM generally tries to keep anon namespaces small and use static instead:
https://llvm.org/docs/CodingStandards.html#anonymous-namespaces

619–620

What is NC supposed to reference here and below?

703

Eh, this is small, since it's only the real output sections of the PE. I guess it doesn't matter if we use std::map for simplicity.

rnk added reviewers: pcc, ruiu.Dec 12 2018, 3:09 PM

I just hit send before saying that, at a high level, looks good, but we should check with @ruiu and @pcc about the terminology.

In D55293#1328930, @rnk wrote:

Yep, I looked at removing this before. It might actually be worth it for mingw, because GCC outputs sections named like ".text$_Z3foov", so you can imagine that this map is going to have one entry per symbol, so one extra memory allocation per symbol.

@mstorsjo, I assume that ld.bfd doesn't actually sort the .text (and .pdata and .xdata) section contributions alphabetically. Should we drop the "$_Z3foov" portion of the object file section name before making the unmerged section? It might matter for performance.

ld.bfd does sort the section contributions alphabetically. I tested with a test snippet like this:

$ cat test.cpp
inline int foo1() {
    return 1;
}
inline int foo3() {
    return 3;
}
inline int foo2() {
    return 2;
}
int main(int argc, char* argv[]) {
    return foo1() + foo3() + foo2();
}
$ x86_64-w64-mingw32-g++ test.cpp -c
$ x86_64-w64-mingw32-g++ test.o -o test.exe -Wl,-Map,map.txt
$ cat map.txt
...
 .text          0x0000000000402da0        0x0 /Users/martin/mingw64/lib/gcc/x86_64-w64-mingw32/5.4.0/crtend.o
 *(SORT(.text$*))
 .text$_Z4foo1v
                0x0000000000402da0       0x10 test.o
                0x0000000000402da0                foo1()
 .text$_Z4foo2v
                0x0000000000402db0       0x10 test.o
                0x0000000000402db0                foo2()
 .text$_Z4foo3v
                0x0000000000402dc0       0x10 test.o
                0x0000000000402dc0                foo3()

So it does seem to be sorted, intentionally.

COFF/Writer.cpp
222

I think the parameter is unnecessary when converting this from a static function to a method, as the class already has got the same IdataContents Idata member.

674–687

I'd maybe name it ImportTableSec or so - everything in .idata is relating to imports so ImportSec is a bit generic as it would go for all of them. This one specifically refers to the sections that ImportTableStart/ImportTableSize refers to.

COFF/Writer.h
82 ↗(On Diff #176699)

Maybe leave out the ContribSections part from this patch - as far as I see it's not actually needed/used at this stage yet, and introducing it here makes the patch a bit more confusing.

ruiu added inline comments.Dec 17 2018, 4:50 PM
COFF/Writer.h
82 ↗(On Diff #176699)

I don't think I understand the point of this change. Should the fact that some section is not assigned to some output section be a C++ class? It seems to me that that's a bit weird concept to be a class. Also, OutputSection has a list of unmerged sections, which seems odd to me too because once they are added, they are no longer unmerged.

I guess that first of all, I don't understand the intention of this change. Are you changing the code so that you can handle sections in input files and synthetic sections in the same way?

rnk added a comment.Dec 17 2018, 5:06 PM

@ruiu, the motivation is to have some abstraction over the collection of chunks before "merging" section names. So, this would allow us to enumerate all chunks contributing to the .CRT$XCU section before it is concatenated with all the other .CRT$* sections and then merged into the .rdata section by our builtin /merge: rule. We accumulate those sections in a plain vector currently, and I felt it would be useful to have some name for that grouping.

ruiu added a comment.Dec 17 2018, 5:14 PM

Is .CRT$XCU just an example, or do you want to do something special for .CRT$XCU? If latter is the case, have you considered just directly implementing what you want do?

COFF/Writer.h
82 ↗(On Diff #176699)

It seems you are not using this member.

rnk added a comment.Dec 17 2018, 5:20 PM

Is .CRT$XCU just an example, or do you want to do something special for .CRT$XCU? If latter is the case, have you considered just directly implementing what you want do?

It is just an example. I believe the PDB has this information for all unmerged sections.

aganea marked 2 inline comments as done.Dec 18 2018, 6:00 AM
aganea added inline comments.
COFF/Writer.cpp
703

This is essentially due to the new NameAndChars structure. If we want to use a (Small)DenseMap here, it needs a bit of boilerplate for DenseMapInfo<NameAndChars>, only to be used here. I don't think space or perf. will be an issue here.

COFF/Writer.h
82 ↗(On Diff #176699)

You are right, this member shouldn't be part of the current patch, I will remove it.
@ruiu The overall intention of this patch is to support Coff Groups, as proposed in D54802.

Mod 0003 | `* Linker *`:
[...]
   588 | S_SECTION [size = 28] `.text`
         length = 38, alignment = 12, rva = 4096, section # = 1
         characteristics =
           code
           execute permissions
           read permissions
   616 | S_COFFGROUP [size = 24] `.text`
         length = 8, addr = 0001:0000
         characteristics =
           code
           execute permissions
           read permissions
   640 | S_SECTION [size = 28] `.rdata`
         length = 61, alignment = 12, rva = 8192, section # = 2
         characteristics =
           initialized data
           read permissions
   668 | S_SECTION [size = 28] `.idata`
         length = 145, alignment = 12, rva = 12288, section # = 3
         characteristics =
           initialized data
           read permissions
   696 | S_COFFGROUP [size = 28] `.idata$2`
         length = 40, addr = 0003:0000
         characteristics =
           initialized data
           read permissions
           write permissions
   724 | S_COFFGROUP [size = 28] `.idata$4`
         length = 24, addr = 0003:0040
         characteristics =
           initialized data
           read permissions
           write permissions
   752 | S_COFFGROUP [size = 28] `.idata$5`
         length = 24, addr = 0003:0064
         characteristics =
           initialized data
           read permissions
           write permissions
   780 | S_COFFGROUP [size = 28] `.idata$6`
         length = 24, addr = 0003:0088
         characteristics =
           initialized data
           read permissions
           write permissions
   808 | S_COFFGROUP [size = 28] `.idata$7`
         length = 33, addr = 0003:0112
         characteristics =
           initialized data
           read permissions
           write permissions

Without this, the unmerged sections are otherwise lost in Writer::mergeSections(), and are not available later in PDBLinker::addSections().

aganea updated this revision to Diff 178652.Dec 18 2018, 6:11 AM
aganea marked 8 inline comments as done.
aganea added inline comments.
COFF/Writer.cpp
619–620

It's a std::pair<NameAndChars, UnmergedSection *>. I changed it back to auto &Pair like it was before.

674–687

I changed the .idata$2 reference to ImportDirs , and .idata$5 reference to ImportAddresses. Does that sound better?

mstorsjo marked an inline comment as done.Dec 18 2018, 6:14 AM
mstorsjo added inline comments.
COFF/Writer.cpp
674–687

Sounds good to me.

Gentle ping. Any further comments? Thank you in advance for taking the time to review this.

aganea added a comment.Jan 4 2019, 1:54 PM

Hi @ruiu ! This is waiting for your final review / approval. Could you possibly take a quick look please? Thank you!

rnk added a comment.Jan 15 2019, 11:33 AM

Hi @ruiu ! This is waiting for your final review / approval. Could you possibly take a quick look please? Thank you!

@ruiu, ping

ruiu added inline comments.Jan 15 2019, 1:11 PM
COFF/Writer.cpp
161

Format.

165

NC -> Other by convention.

169

nit: I'd stick with < when defining operator<. You can just return the result of the comparison of characteristics.

if (C == 0)
 return Characteristics < NC.Characteritics.
611

80 cols?

I recommend you use clang-format.

724–725

Why did you have to rename the function?

COFF/Writer.h
28–38 ↗(On Diff #178652)

Since no one except one .cpp file uses this class declaration, you don't need to write it in a header. Please move to the .cpp file.

30 ↗(On Diff #178652)

It feels that UnmergedSection is still a bit weird name. You don't remove members from "Unmerged" once they are "merged" right?

37 ↗(On Diff #178652)

You have an unused member variable.

aganea updated this revision to Diff 182074.Jan 16 2019, 9:48 AM
aganea marked 10 inline comments as done.
aganea retitled this revision from [LLD][COFF] Unmerged sections to [LLD][COFF] Partial sections.
aganea edited the summary of this revision. (Show Details)

Updated following @ruiu 's comments - thank you!

I wasn't happy with the UnmergedSection name either - I changed that into PartialSection. If could also be PartialOutSection if we want to better designate the destination.

I also removed the struct NameAndChars because the concept was overlapping with data in PartialSection; that changed the std::map into a std::set and a custom comparer which uses part of PartialSection 's data as the "key".

COFF/Writer.cpp
724–725

Reverted.

COFF/Writer.h
30 ↗(On Diff #178652)

What about PartialSection?

rnk accepted this revision.Jan 23 2019, 11:13 AM

lgtm I like the new name. I think this is an overall readability improvement.

This revision is now accepted and ready to land.Jan 23 2019, 11:13 AM
ruiu added inline comments.Jan 23 2019, 11:19 AM
COFF/Writer.cpp
166

Now you can refer this class as just StringRef instead of llvm::StringRef.

182

Can this null condition happen?

aganea marked an inline comment as done.Jan 23 2019, 12:43 PM
aganea added inline comments.
COFF/Writer.cpp
182

No, it cannot. I can transform the null checks into an assert, if you’re fine with that? The intent will be better expressed.

ruiu added inline comments.Jan 23 2019, 1:16 PM
COFF/Writer.cpp
182

I don't think you even have to have a null check. Dereferencing a pointer clearly delivers the intent that the pointer cannot be null.

This revision was automatically updated to reflect the committed changes.