This is an archive of the discontinued LLVM Phabricator instance.

Discard uncompressed buffer after creating .gdb_index contents.
Closed, Public

Authored by ruiu on Sep 14 2018, 3:37 PM.

Details

Summary

Once we create .gdb_index contents, .zdebug_gnu_pub{names,types}
are useless, so there's no need to keep their uncompressed data
in memory.

I observed that for a test case in which lld creates a 3GB .gdb_index
section, the maximum resident set size dropped from 43GB to 29GB with
this patch.
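
As a rough illustration, here is a minimal, self-contained sketch of the idea; InputSection, uncompressedBuf, and readPubnames are simplified stand-ins for the real LLD types and functions, not the actual patch (which does the reads inside parallelForEachN):

  #include <memory>
  #include <vector>

  // Simplified stand-in for lld's input section type.
  struct InputSection {
    std::unique_ptr<char[]> uncompressedBuf; // filled when a .zdebug_* section is decompressed
  };

  void readPubnames(InputSection &sec) { /* feed .gdb_index creation */ }

  void createGdbIndex(std::vector<InputSection> &sections) {
    for (InputSection &sec : sections)
      readPubnames(sec);           // last use of the decompressed bytes
    for (InputSection &sec : sections)
      sec.uncompressedBuf.reset(); // discard them; peak memory drops
  }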

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

ruiu created this revision. Sep 14 2018, 3:37 PM
MaskRay accepted this revision. Sep 14 2018, 3:44 PM

Great finding! Just a question: did you move the code because the new place fits well?

This revision is now accepted and ready to land. Sep 14 2018, 3:44 PM
MaskRay added inline comments. Sep 14 2018, 3:46 PM
lld/ELF/SyntheticSections.cpp:2516 (On Diff #165603)

I don't know if, for std::unique_ptr, reset() is the more conventional spelling.
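
For reference, the two equivalent spellings under discussion (plain C++, nothing LLD-specific; both free the owned memory immediately):

  #include <memory>

  int main() {
    std::unique_ptr<char[]> buf(new char[1024]);
    buf.reset();     // the arguably more conventional spelling
    // buf = nullptr; // has the same effect: the array is freed here
  }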

ruiu added a comment. Sep 14 2018, 3:48 PM

> Great finding! Just a question: did you move the code because the new place fits well?

In the parallelForEachN above, we read the contents of .debug_gnu_pub{names,types}, so the previous position doesn't work.

ruiu updated this revision to Diff 165605. Sep 14 2018, 3:51 PM
  • use reset()
This revision was automatically updated to reflect the committed changes.

Out of curiosity: would it be possible/useful to avoid keeping all the pubnames sections uncompressed at the same time? Could they be processed one at a time (uncompress one, process it, delete it, uncompress the second, process it, delete it, etc.)?

ruiu added a comment. Sep 17 2018, 4:08 PM

It is doable, and perhaps we should do that. Currently we decompress all compressed sections before doing anything else, so that such sections are handled as if they weren't compressed at all, but sometimes that wastes both time and memory.
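
A sketch of the one-at-a-time scheme suggested above; Section, decompress(), and process() are hypothetical stand-ins for the real routines:

  #include <memory>
  #include <vector>

  struct Section { /* compressed bytes, sizes, ... */ };

  // Hypothetical helpers standing in for the real decompression and
  // .gdb_index-building code.
  std::unique_ptr<char[]> decompress(const Section &) { return nullptr; }
  void process(const char *) {}

  void processPubnames(std::vector<Section> &sections) {
    for (Section &sec : sections) {
      std::unique_ptr<char[]> buf = decompress(sec); // uncompress one section
      process(buf.get());                            // consume it
      // buf is destroyed at the end of each iteration, so at most one
      // decompressed pubnames section is live at a time.
    }
  }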

ruiu added a comment. Sep 17 2018, 4:11 PM

But one thing we need to keep in mind (and that's what I'm currently working on) is that if we discard a decompressed section buffer, we can't have StringRefs pointing into that section. That naturally affects the design, because we usually create a lot of StringRefs pointing into input sections to avoid the cost of copying. I don't yet have a good idea of how to write code that works well for both uncompressed and compressed sections.
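
To make the hazard concrete, a small sketch (the buffer and function names are made up; llvm::StringRef itself never owns memory):

  #include "llvm/ADT/StringRef.h"
  #include <memory>

  void example() {
    std::unique_ptr<char[]> uncompressedBuf(new char[64]{});
    // StringRef just points into the buffer; it does not copy or own it.
    llvm::StringRef name(uncompressedBuf.get(), 8);
    uncompressedBuf.reset(); // discard the decompressed section...
    // ...and 'name' now dangles: reading name[0] is a use-after-free.
    (void)name;
  }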

*nod* It might be possible (maybe too low-level; I'm not sure zlib exposes this, or that the format even allows it to be answered efficiently) to retrieve the size of a compressed section without decompressing it. At least in the zlib-gnu format, I think the uncompressed size is written before the compressed data, so it's easy there. That way more things wouldn't need to care about whether a section was compressed or not: it could be decompressed lazily, and then the other half would be to deallocate promptly, as soon as those bytes were finished with/written out to the output and no longer needed.
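
For the zlib-gnu case this is indeed cheap: a .zdebug_* section starts with a 4-byte "ZLIB" magic followed by the uncompressed size as a 64-bit big-endian integer. A small sketch of reading it (the function name is made up):

  #include <cstddef>
  #include <cstdint>
  #include <cstring>

  // Returns true and sets 'out' if 'data' looks like a zlib-gnu
  // compressed section; the payload after the 12-byte header is the
  // raw zlib stream.
  bool getUncompressedSize(const uint8_t *data, size_t size, uint64_t &out) {
    if (size < 12 || memcmp(data, "ZLIB", 4) != 0)
      return false;
    out = 0;
    for (int i = 0; i < 8; ++i)
      out = (out << 8) | data[4 + i]; // decode big-endian size field
    return true;
  }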

At some point it'd even be great to use streaming compression in and out. (I guess you could probably even use streaming decompression for the pubnames, so even a whole object file's pubnames wouldn't need to be decompressed simultaneously: just ask for the next chunk of decompressed data, process it, then overwrite it with the next chunk, etc.)
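
That is essentially zlib's inflate API used in streaming mode. A sketch, assuming the input is one complete raw zlib stream (i.e. the 12-byte zlib-gnu header has already been skipped) and with a hypothetical processChunk() consumer:

  #include <zlib.h>
  #include <cstddef>

  void processChunk(const unsigned char *data, size_t len) { /* consume bytes */ }

  bool inflateInChunks(const unsigned char *in, size_t inSize) {
    z_stream strm = {}; // zero-init: zalloc/zfree/opaque are Z_NULL
    if (inflateInit(&strm) != Z_OK)
      return false;
    strm.next_in = const_cast<unsigned char *>(in);
    strm.avail_in = static_cast<uInt>(inSize);
    unsigned char chunk[4096]; // only this much decompressed data is live at once
    int ret;
    do {
      strm.next_out = chunk;
      strm.avail_out = sizeof(chunk);
      ret = inflate(&strm, Z_NO_FLUSH);
      if (ret != Z_OK && ret != Z_STREAM_END) {
        inflateEnd(&strm);
        return false;
      }
      processChunk(chunk, sizeof(chunk) - strm.avail_out);
    } while (ret != Z_STREAM_END);
    inflateEnd(&strm);
    return true;
  }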

ruiu added a comment. Sep 17 2018, 4:21 PM

This might be a silly question, but why do we compress only debug sections? If we really want to compress object files for valid reasons (e.g. reducing the amount of network traffic in a distributed build), we could simply compress entire object files instead of compressing only the debug sections. Then we could stream-uncompress the object files to disk and run the linker on them.

Fair question. For Google, stream-uncompressing to disk wouldn't help matters: disk is a ramfs, so it's the same as uncompressing the whole thing to a buffer in memory, which hurts a bit (due to memory limits).

For somewhat more "normal" users (I assume compressed debug info was probably implemented before Google's needs, but I could be wrong there; maybe Google folks implemented it in gold/gcc before the LLVM switch), especially pre-Fission, debug info was the big culprit. Though that doesn't mean it was better to compress just it rather than compressing everything. Perhaps keeping the object file as the outer container meant more things continued to "just work": objdump, etc., things that relied only on the object file headers and not the section contents.