This is an archive of the discontinued LLVM Phabricator instance.

ELF: Implement ICF.
ClosedPublic

Authored by ruiu on Feb 22 2016, 5:11 PM.

Details

Reviewers
rafael
Summary

This patch is in my queue for a long period of time, so I'd like
to flush this although it is not still compatible with most C++
programs.

This patch implements the same algorithm as LLD/COFF's ICF. I'm
not going to repeat the same description about how it works, so you
want to read the comment in ICF.cpp in this patch. This algorithm
should be more powerful than GNU gold ICF algorithm. It can even
merge mutually-recursive functions (which is harder than one might
think).

The lacking feature is "safe" version of ICF. This merges all
identical sections. That is not compatible with a C/C++ language
requirement that two distinct functions must not have the address.

But as long as your program do not rely on pointer equality,
your program should work with this patch. LLD works fine for
example.

GNU gold implements so-called "safe ICF" that identifies functions
that are safe to merge by heuristics -- for example, gold thinks
that constructors are safe to merge because there is no way to
take an address of a constructor in C++. We have a different idea
which David Majnemer suggested, which is to add NOPs at beginning
of merged functions so that two or more pointers can have distinct
values. We can do whichever we want, but this patch does not
include neither.

Diff Detail

Event Timeline

ruiu updated this revision to Diff 48761.Feb 22 2016, 5:11 PM
ruiu retitled this revision from to ELF: Implement ICF..
ruiu updated this object.
ruiu added a reviewer: rafael.
ruiu added a subscriber: llvm-commits.
ruiu updated this revision to Diff 48762.Feb 22 2016, 5:15 PM
  • corrected a few typos.
silvas added a subscriber: silvas.Feb 22 2016, 5:32 PM

The lacking feature is "safe" version of ICF. This merges all
identical sections. That is not compatible with C/C++ language
requirement that two distinct functions must not have the address.
I think violation this rule breaks many C++ ABIs.

How does the COFF ICF avoid this problem?

We have a different idea
which David Majnemer suggested, which is to add NOPs at beginning
of merged functions so that two or more pointers can have distinct
values.

If we do this, we need to be careful to not misalign the function as emitted by the compiler. E.g. even if we only need to insert a single nop before the current section contents, we still want to grow the section by the section's alignment.

ELF/ICF.cpp
16

Don't you use a partition-refinement approach instead of a graph approach? Why mention graphs?

ELF/InputSection.h
182

I know we aren't currently optimizing the sizeof of InputSection, but we should be aware of it since we potentially allocate many.
Out of curiosity, have you measured if the extra memory pressure of this GroupId causes any performance change?
It it has an adverse effect, we can always merge it with OutSecOff, since I don't think we need both simultaneously. (or we can make sure that Repl always points to a canonical group member).

silvas added inline comments.Feb 22 2016, 6:48 PM
ELF/ICF.cpp
285

Why only the first element of RelocSections?

grimar added a subscriber: grimar.Feb 22 2016, 10:54 PM
rafael added inline comments.Feb 23 2016, 2:37 PM
ELF/ICF.cpp
10

On ELF it is possible to merge non COMDAT sections. We can keep the acronym by doing what gold does and calling this Identical *Code* Folding.

rafael added inline comments.Feb 23 2016, 3:24 PM
ELF/ICF.cpp
118

Assert that there only one? Return 0 if there is none?

120

It might be cleaner to just say RelSec->sh_size/RelSec->sh_entsize;

128

Not exactly true since the number of relocations is included.

Since this is the only use of numRels and it is only used for hashing, how about just using RelSec->sh_size?

150

Why only PROGBITS? OK if it is just to keep it simple for now.

168

Return true if a new group was created.

This could also return void and have the caller check NextId.

232

I don't think this is possible.

It is all SHT_RELA or SHT_REL.

247

It is possible to merge when the alignment is different, we just have to keep the larger one. A fixme for now is fine.

262

The same 'if' is in equalsContant. Do you need it here?

268

If both are null they could still be equal, no?

FIXME for now is fine.

272

This is also conservatively correct, right?

Two relocations pointing to the string "foobar" in a SHF_MERGE section are equivalent.

311

Do we need stable_sort to make this deterministic?

ELF/InputSection.cpp
31

When is Config null?

75

Can you commit the const fixes first?

ELF/OutputSections.cpp
1151

Can you commit the s/isLife/Life/ change first?

ELF/Symbols.h
236

Not sure I follow. Why is this not redundant with the Repl pointer in the section?

silvas added inline comments.Feb 23 2016, 4:18 PM
ELF/ICF.cpp
169

The naming of this function is confusing. The name 'partition' makes it sound like it is algorithmically similar to std::partition, but it doesn't do the same thing. I would maybe call it 'segregate'.

173

Please add a comment about the loop invariant. Currently it seems to be

  • No element of [I,End) compares equal via Eq to any element in [Begin,I).
  • If two elements of [Begin,I) compare equal via Eq, then they are part of a contiguous run of elements that all pairwise compare equal via Eq.

Also, this function is quadratic in the number of distinct elements (as determined by Eq). It is probably worth a comment explaining this and why pathological cases are not expected to happen. Without a stronger structure than Eq (e.g. a < ordering) I don't think it is actually algorithmically possible to make this non-quadratic though (since when Eq returns false you don't gain any information that can be transitively used.).

Note that in ICF<ELFT>::run the patch does a *sort* based on the hash (i.e. uses a stronger < structure). I believe the overall algorithm is actually correct without the initial hash assignment/sorting; it will just be quadratic in the size of the input instead of the maximum equivalence class size after the initial sort.

205

relocationsEq is probably a better name. Otherwise we have a function equalsConstant calling a function called constantEq which is confusing.

ruiu marked 3 inline comments as done.Feb 23 2016, 4:27 PM
ruiu added inline comments.
ELF/ICF.cpp
10

Done.

118

Done.

120

Done.

128

Done.

150

It doesn't have to be PROGBITS only. Removed.

168

Done.

232

Removed.

247

Done.

262

Removed.

268

Yes, and in that case the above if (SA== SB) should be true, so it should already be covered.

272

I believe the above if (SA == SB) should cover most cases already.

311

Done.

ELF/InputSection.cpp
31

Discarded section is initialized statically, and when it is initialized, Config is still null.

75

Done.

ELF/OutputSections.cpp
1151

Done.

ELF/Symbols.h
236

In many places we write Sym->Section to get an input section for the symbol. We either update all such occurrences with Sym->Section->Repl, or make Section a reference like this.

ruiu updated this revision to Diff 48858.Feb 23 2016, 4:44 PM
  • Address review comments.
ruiu updated this revision to Diff 48864.Feb 23 2016, 5:17 PM
  • Address Sean's comments.
ELF/ICF.cpp
171

Done.

175

Done.

207

Done.

ruiu updated this revision to Diff 48865.Feb 23 2016, 5:28 PM
  • Make forEachGroup and segregate return void.

Some tiny wordsmithing nits.

ELF/ICF.cpp
169

I would say "rearranges" instead of "sorts", since "sort" implies an absolute order.

170

I think it is worth clarifying that it is quadratic in the number of *distinct* sections.

ruiu updated this revision to Diff 48958.Feb 24 2016, 9:53 AM
  • Update comment.
ELF/ICF.cpp
169

Done.

170

Done.

rafael added inline comments.Feb 24 2016, 11:04 AM
ELF/ICF.cpp
121

The comment is out of date.

124

Range loop?

256

Two absolute relocations to different values would show up as equal in here, no?

Same issue for two relocations pointing to different strings.

ruiu updated this revision to Diff 48991.Feb 24 2016, 4:51 PM
  • Address review comments.
  • Fixed a mislinking bug. Now it is able to link lld and clang with ICF enabled. All tests has passed except Clang's crash-report tests (it seems to have messed up something in stacktrace.)
ruiu updated this revision to Diff 48992.Feb 24 2016, 4:53 PM
  • Added a test for .init and .fini.
ruiu updated this object.Feb 24 2016, 4:54 PM
ruiu updated this revision to Diff 48993.Feb 24 2016, 5:06 PM
  • Removed dead code.
grimar added inline comments.Feb 25 2016, 12:06 AM
ELF/ICF.cpp
347

Knowing that you like to reduce the code:

while (++Cnt) {
...
365

What about to separate verbose() method to error.cpp file ?
Something alike:

void verbose(Twine Msg) {
  if (Config->Verbose)
    llvm::outs() << Msg;
}

Then it can be used from another code.

ELF/InputSection.h
49

s/instnace/instance

52

ditto

182

This comment is very tight. It has no value as anyone who look at this variable can quick search and find that it is used somewhere in ICF.
I suggest to expand or to remove it.

rafael edited edge metadata.Feb 25 2016, 6:20 AM

Can you add a test showing that we merge sections with different alignments and keep the largest?

LGTM with that and George's suggestions addressed.

Thanks a lot!

rafael accepted this revision.Feb 25 2016, 6:20 AM
rafael edited edge metadata.
This revision is now accepted and ready to land.Feb 25 2016, 6:20 AM
ruiu added inline comments.Feb 25 2016, 9:14 AM
ELF/ICF.cpp
121

Done.

124

Done.

347

I love short and readable code but I don't think it's readable. Cnt is not a loop terminating condition and it just happened to be always true (so as a result it happens to work.) Your code is semantically equivalent to

for (;;) {
  if (++Cnt)
    break;

which is rather confusing.

365

That may be a good idea. I'm not going to do that in this patch, but it worths trying.

ELF/InputSection.h
182

I think the comments are valuable for first-time readers.

ruiu closed this revision.Feb 25 2016, 10:48 AM

Committed as r261912.

grimar added inline comments.Feb 25 2016, 11:33 PM
ELF/InputSection.h
182

Of course. What I meant - for me as for first time reader of this, it is not clear anyways what GroupId do from this comment.
It says it is used in ICF, that what I can find in few seconds, but question is "for what", right ? Comments should explain the details and not the obvious things.

I am observing multiple crashes in tests for this atm. All ICF tests are affected, using the latest code of lld. Going to debug it right now.
Example:

23> ****
23> FAIL: lld :: ELF/icf1.s (215 of 1104)
23> **** TEST 'lld :: ELF/icf1.s' FAILED ****
23> Script:
23> --
23> C:/access_softek/c_make_build_dir/Debug/bin\llvm-mc.EXE -filetype=obj -triple=x86_64-unknown-linux C:\access_softek\llvm\tools\lld\test\ELF\icf1.s -o C:\access_softek\c_make_build_dir\tools\lld\test\ELF\Output\icf1.s.tmp
23> C:/access_softek/c_make_build_dir/Debug/bin\ld.lld.EXE C:\access_softek\c_make_build_dir\tools\lld\test\ELF\Output\icf1.s.tmp -o C:\access_softek\c_make_build_dir\tools\lld\test\ELF\Output\icf1.s.tmp2 --icf=all --verbose | C:/access_softek/c_make_build_dir/Debug/bin\FileCheck.EXE C:\access_softek\llvm\tools\lld\test\ELF\icf1.s
23> --
23> Exit Code: 2
23>
23> Command Output (stdout):
23> --
23> Command 0: "C:/access_softek/c_make_build_dir/Debug/bin\llvm-mc.EXE" "-filetype=obj" "-triple=x86_64-unknown-linux" "C:\access_softek\llvm\tools\lld\test\ELF\icf1.s" "-o" "C:\access_softek\c_make_build_dir\tools\lld\test\ELF\Output\icf1.s.tmp"
23> Command 0 Result: 0
23> Command 0 Output:
23>
23>
23> Command 0 Stderr:
23>
23>
23> Command 1: "C:/access_softek/c_make_build_dir/Debug/bin\ld.lld.EXE" "C:\access_softek\c_make_build_dir\tools\lld\test\ELF\Output\icf1.s.tmp" "-o" "C:\access_softek\c_make_build_dir\tools\lld\test\ELF\Output\icf1.s.tmp2" "--icf=all" "--verbose"
23> Command 1 Result: -2147483645
23> Command 1 Output:
23>
23>
23> Command 1 Stderr:
23> 0x54322AA6 (0x03427A60 0x03427788 0x00000048 0x002B7887), ?_Debug_message@std@@YAXPB_W0I@Z() + 0x26 bytes(s)
23>
23> 0x029DF23E (0x0522F3B0 0x0522F3F8 0x029E88C6 0x044A1DE4), std::_Vector_const_iterator<std::_Vector_val<std::_Simple_types<lld::elf2::InputSection<llvm::object::ELFType<1,1> > *> > >::operator*() + 0x5E bytes(s), c:\program files (x86)\microsoft visual studio 14.0\vc\include\vector, line 72 + 0x14 byte(s)
23>
23> 0x029DF676 (0x044A1DE4 0x002D6C14 0x0522F3B0 0x00000000), std::_Vector_iterator<std::_Vector_val<std::_Simple_types<lld::elf2::InputSection<llvm::object::ELFType<1,1> > *> > >::operator*() + 0x16 bytes(s), c:\program files (x86)\microsoft visual studio 14.0\vc\include\vector, line 326
23>
23> 0x029E88C6 (0x0522F55C 0x044A1DE4 0x002D6C14 0x029EB753), lld::elf2::ICF<llvm::object::ELFType<1,1> >::forEachGroup() + 0xA6 bytes(s), c:\access_softek\llvm\tools\lld\elf\icf.cpp, line 209 + 0x16 byte(s)
23>
23> 0x029EB771 (0x0522F6CC 0x00000001 0x00000000 0x0522F7D0), lld::elf2::ICF<llvm::object::ELFType<1,1> >::run() + 0xF1 bytes(s), c:\access_softek\llvm\tools\lld\elf\icf.cpp, line 344
23>
23> 0x029D6C2F (0x0522F6CC 0x0522F8AC 0x04493668 0xCCCCCCCC), lld::elf2::doIcf<llvm::object::ELFType<1,1> >() + 0x2F bytes(s), c:\access_softek\llvm\tools\lld\elf\icf.cpp, line 374
23>
23> 0x0294362A (0x0522F7F4 0x0522FA80 0x00000002 0xCCCCCCCC), lld::elf2::LinkerDriver::link<llvm::object::ELFType<1,1> >() + 0x51A bytes(s), c:\access_softek\llvm\tools\lld\elf\driver.cpp, line 380 + 0xC byte(s)
23>
23> 0x029355A1 (0x054836F4 0x00000005 0x0522FC58 0xCCCCCCCC), lld::elf2::LinkerDriver::main() + 0x111 bytes(s), c:\access_softek\llvm\tools\lld\elf\driver.cpp, line 182
23>
23> 0x029353CC (0x054836F0 0x00000006 0x046694C8 0x0522FCA4), lld::elf2::link() + 0x9C bytes(s), c:\access_softek\llvm\tools\lld\elf\driver.cpp, line 45
23>
23> 0x00446D98 (0x054836F0 0x00000006 0x046694C8 0x002B7887), lld::UniversalDriver::link() + 0x2B8 bytes(s), c:\access_softek\llvm\tools\lld\lib\driver\universaldriver.cpp, line 208 + 0x29 byte(s)
23>
23> 0x0044686A (0x00000006 0x054836F0 0x0547F590 0x0522FD10), main() + 0x5A bytes(s), c:\access_softek\llvm\tools\lld\tools\lld\lld.cpp, line 35 + 0x22 byte(s)
23>
23> 0x0333EFDE (0x2E988591 0x002B7887 0x002B7887 0x7F034000), invoke_main() + 0x1E bytes(s), f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, line 74 + 0x1B byte(s)
23>
23> 0x0333EE2A (0x0522FD20 0x0333EFF8 0x0522FD34 0x77897C04), scrt_common_main_seh() + 0x15A bytes(s), f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, line 264 + 0x5 byte(s)
23>
23> 0x0333ECBD (0x0522FD34 0x77897C04 0x7F034000 0x77897BE0),
scrt_common_main() + 0xD bytes(s), f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, line 309
23>
23> 0x0333EFF8 (0x7F034000 0x77897BE0 0x5D66FCE8 0x0522FD7C), mainCRTStartup() + 0x8 bytes(s), f:\dd\vctools\crt\vcstartup\src\startup\exe_main.cpp, line 17
23>
23> 0x77897C04 (0x7F034000 0x5D45F739 0x00000000 0x00000000), BaseThreadInitThunk() + 0x24 bytes(s)
23>
23> 0x77B0AD1F (0xFFFFFFFF 0x77AF024A 0x00000000 0x00000000), RtlInitializeExceptionChain() + 0x8F bytes(s)
23>
23> 0x77B0ACEA (0x002B7887 0x7F034000 0x00000000 0x00000000), RtlInitializeExceptionChain() + 0x5A bytes(s)

Problem is in

template <class ELFT>
void ICF<ELFT>::forEachGroup(std::vector<InputSection<ELFT> *> &V,
                             Comparator Eq) {
  for (auto I = V.begin(), E = V.end(); I != E;) {
    InputSection<ELFT> *Head = *I;
    auto Bound = std::find_if(I + 1, E, [&](InputSection<ELFT> *S) {
      return S->GroupId != Head->GroupId;
    });
    segregate(&*I, &*Bound, Eq);
    I = Bound;
  }
}

Bound == V.end() and that crashes at

segregate(&*I, &*Bound, Eq);

Have next values for V:
+ [0] 0x0096f698 {RelocSections={Val={Val={Value=0x00000000 } } } OutSecOff=0x0000000000000000 GroupId=0x8000000055626995 }
+ [1] 0x0096f6c8 {RelocSections={Val={Val={Value=0x00000000 } } } OutSecOff=0x0000000000000000 GroupId=0x8000000055626995 }
+ [2] 0x0096f668 {RelocSections={Val={Val={Value=0x00000000 } } } OutSecOff=0x0000000000000000 GroupId=0x80000000af442220 }
Head->GroupId == 0x8000000055626995

at first iteration it sets I to V[2]. Then on next one finds nothing, because I + 1 is already == E.
If I add next after "auto Bound ="

if (Bound == V.end())
  return;

all tests pass for me.

ruiu added a subscriber: ruiu.Feb 26 2016, 7:12 AM

Thank you for your investigation. On it.

ruiu added a comment.Feb 26 2016, 7:18 AM

Should be fixed in r262022.

In D17529#362806, @ruiu wrote:

Should be fixed in r262022.

Yep, fixed for me. Thanks !

ruiu added inline comments.Feb 26 2016, 7:53 AM
ELF/InputSection.h
182

That is explained in ICF.cpp so we don't want to repeat that in the other file. The point is that you can just ignore this field unless you are working on ICF because it is used (only) by ICF.