This is an archive of the discontinued LLVM Phabricator instance.

Add support for SHF_MERGE sections
ClosedPublic

Authored by rafael on Oct 16 2015, 9:17 AM.

Details

Reviewers
ruiu
Summary

For now, only non-strings are supported. I have also made no effort to optimize this.

I am doing a fresh clang build and will put the numbers in here.

Diff Detail

Event Timeline

rafael updated this revision to Diff 37601.Oct 16 2015, 9:17 AM
rafael retitled this revision from to Add support for SHF_MERGE sections.
rafael updated this object.
rafael added a reviewer: ruiu.
rafael added a project: lld.
rafael added a subscriber: llvm-commits.
emaste added a subscriber: emaste.Oct 16 2015, 9:34 AM

Gold takes 1.132828663s and produces a 77868808 bytes binary.
trunk takes 0.479872948 seconds and produces a 87477544 bytes binary.
patch takes 0.495242910 seconds and produces a 87453096 bytes binary.

It is a pretty small saving. I think a lot more in is SHF_STRINGS, which this is a stepping stone for.

shankare added inline comments.
ELF/InputFiles.cpp
160

An appropriate error message might be needed here.

164–167

Linker can merge across readonly merge sections. You might want to check that section flags do not have write permission.

ELF/InputSection.cpp
81–82

You can also have a relocation to a local symbol in a merge section. For the local symbol usecase in a merge string section you would need to get the value of the local symbol too.

174

Error message.

ELF/OutputSections.cpp
472

error message.

553

Can we change the variable names and error messages to be proper ?

556

error message.

ruiu added inline comments.Oct 16 2015, 2:14 PM
ELF/InputSection.cpp
69–70

Needs a brief comment to describe what this function is for.

As I understand, this function is to fix addend if it is referring a SHF_MERGE'ed section so that the addend becomes an offset from start of the entry, but I needed to read this code carefully.

85

Don't we have isa_or_null?

Oh we don't. Nevermind.

168–177

This function is called indirectly for each relocation, and this basically does string search from a large vector. That's probably too slow.

You may want to keep a vector of offsets for each element of MergeInputSection, so that you don't have to do string search.

ELF/InputSection.h
68

s/merge/SHF_MERGE/ to make it clear that it's for ELF merge sections.

ELF/OutputSections.cpp
522

unsinged -> uint64_t or uintX_t to support cross-linking huge sections on 32 bit platform.

(In reality, LLD will always run out of virtual memory space if we have such huge chunks on 32-bit platform, but still unsigned doesn't feel correct. If it is size_t, it may make sense.)

673–682

You may want to separate this as a function.

ELF/OutputSections.h
208

This member needs a comment because of its importance.

ELF/Writer.cpp
421–422

Can you run two loops, one for regular sections and the other for merged sections, to improve readability? I don't like large loops.

rafael updated this revision to Diff 37656.Oct 16 2015, 4:35 PM

This is still in the prototype stage. Uploading a new version since I am closing for the day.

I will go over the suggestions as soon as I get back to work.

rafael updated this revision to Diff 37753.Oct 19 2015, 7:33 AM

I think this implements every review request except:

Can you run two loops, one for regular sections and the other for merged sections, to improve readability? I don't like large loops.

There is substantial overlap on what the bodies would be.

And I have also not implemented the requested optimization. I will start coding it now. It can be committed with this patch or as an immediate followup. I would have a small preference of committing in as a followup.

rafael updated this revision to Diff 37762.Oct 19 2015, 8:45 AM
ruiu edited edge metadata.Oct 19 2015, 10:39 AM

I don't think splitting the loop would produce substantially duplicated code. Even if code in total gets larger, it's a good thing to do as long as each loop becomes more clear. I didn't try that myself though. You can leave it as is for now if you want. I'll take a look at that later.

ELF/InputSection.h
83–87

Mix of Base and llvm::object::ElfFile<ELFT> looks a bit odd. I'd always use llvm::object::ElfFile<ELFT> because that is one less indirection.

ELF/OutputSections.cpp
467–468

Is this correct? Offset is a rounded value, and the return value does not seem to include the information you discarded using the % operator.

ELF/OutputSections.h
208

entry (typo)

rafael updated this revision to Diff 37786.Oct 19 2015, 1:35 PM
rafael updated this object.
rafael edited edge metadata.

Correctly handle addends when creating dynamic relocations.

ruiu accepted this revision.Oct 19 2015, 1:45 PM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Oct 19 2015, 1:45 PM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in r250737.