This is an archive of the discontinued LLVM Phabricator instance.

I think this removes the additional optimization we have here. Unfortunately, I do not remember
how much this piece improved on simple code like the one in your patch,
but this code was originally implemented by Rafael in r284594, whose commit message says:

Add a faster binary search.

Even with the hash table cache, binary search was still pretty
hot. This can be made even faster with prefetching.

Idea from http://cglab.ca/~morin/misc/arraylayout-v2/

I will suggest moving this to llvm.

I do not know whether that was ever moved to llvm (or, if not, whether that is something we want at the moment).

The optimization is that a simple conditional assignment lets the compiler generate a conditional move (the branch-less binary search the paper describes):

// either form below compiles to a conditional move
if (Pieces[Idx + H].InputOff <= Offset)
  Idx += H;

Idx = Pieces[Idx + H].InputOff <= Offset ? Idx + H : Idx;

// but std::upper_bound does not

The (negligible) disadvantage of this approach is that each iteration reduces the range size from n to ceil(n/2), rather than to floor(n/2) as std::upper_bound does (which emits a not; add instruction sequence to compute size -= H+1).

But note that OffsetMap.find(Offset) also takes some time, so the improvement here looks negligible to me.
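To make the comparison above concrete, here is a minimal sketch of both searches side by side. `Piece` is a hypothetical stand-in for lld's `SectionPiece` (only the `InputOff` field is modelled), and `findBranchless` / `findUpperBound` are illustrative names, not functions from the patch. Whether a compiler actually emits a conditional move for the first loop depends on the target and optimization level.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for lld's SectionPiece; only the field used here.
struct Piece {
  uint64_t InputOff;
};

// Halving loop in the style of the old code: returns the index of the last
// piece whose InputOff is <= Offset. Assumes Pieces is non-empty, sorted by
// InputOff, and that Pieces[0].InputOff <= Offset.
size_t findBranchless(const std::vector<Piece> &Pieces, uint64_t Offset) {
  assert(!Pieces.empty() && Pieces[0].InputOff <= Offset);
  size_t Size = Pieces.size();
  size_t Idx = 0;
  while (Size != 1) {
    size_t H = Size / 2;
    Size -= H; // the range shrinks from n to ceil(n/2)
    // Plain conditional assignment; a compiler may lower this to a cmov.
    Idx = Pieces[Idx + H].InputOff <= Offset ? Idx + H : Idx;
  }
  return Idx;
}

// The same lookup expressed with std::upper_bound: find the first piece that
// starts after Offset, then step back by one.
size_t findUpperBound(const std::vector<Piece> &Pieces, uint64_t Offset) {
  auto It = std::upper_bound(
      Pieces.begin(), Pieces.end(), Offset,
      [](uint64_t Off, const Piece &P) { return Off < P.InputOff; });
  assert(It != Pieces.begin());
  return static_cast<size_t>(It - Pieces.begin()) - 1;
}
```

Both functions should return the same index for any offset at or past the first piece; only the generated code differs.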

A static -DCMAKE_BUILD_TYPE=Debug build of bin/clang-8:

perf stat -r 10 ~/llvm/Release/bin/ld.lld @response.txt -o /dev/null => The times vary from 4.5435 to 4.7403 with or without this optimization; the differences are pure noise. I cannot see the (very minor) optimization (branch instruction -> conditional move) improving the link time.

Yeah, I think with an optimizing compiler you cannot see any difference between the old and the new code.

ELF/InputSection.cpp
1226–1230	Can you add `llvm::` to `upper_bound` so that it looks obviously different from `std::upper_bound`?
1226–1231	Calculating a distance between two iterators to use it as an array access index seems a bit awkward.
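The earlier revision of the patch that this comment refers to is not shown on this page, so the following is only a sketch of the contrast being drawn. It assumes a hypothetical `Piece` type and uses `std::upper_bound` in place of the `llvm::upper_bound` range wrapper. `viaIndex` converts the iterator into an index and back; `viaIterator` steps back from the iterator directly, as the committed code does with `&It2[-1]`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for lld's SectionPiece.
struct Piece {
  uint64_t InputOff;
};

using PieceIter = std::vector<Piece>::const_iterator;

// First piece that starts strictly after Offset (Pieces sorted by InputOff).
static PieceIter firstPieceAfter(const std::vector<Piece> &Pieces,
                                 uint64_t Offset) {
  return std::upper_bound(
      Pieces.begin(), Pieces.end(), Offset,
      [](uint64_t Off, const Piece &P) { return Off < P.InputOff; });
}

// Index form: compute an iterator distance, then index the vector again.
const Piece *viaIndex(const std::vector<Piece> &Pieces, uint64_t Offset) {
  PieceIter It = firstPieceAfter(Pieces, Offset);
  assert(It != Pieces.begin());
  size_t I = static_cast<size_t>(It - Pieces.begin());
  return &Pieces[I - 1];
}

// Iterator form: step back one element from the iterator itself.
const Piece *viaIterator(const std::vector<Piece> &Pieces, uint64_t Offset) {
  PieceIter It = firstPieceAfter(Pieces, Offset);
  assert(It != Pieces.begin());
  return &It[-1]; // equivalent to &*std::prev(It)
}
```

Both forms yield the same pointer; the iterator form simply avoids the round trip through an index.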

Add llvm::

Harbormaster completed remote builds in B25691: Diff 176709. Dec 4 2018, 1:54 PM

LGTM

ELF/InputSection.cpp
1230	You can remove this `assert` as it doesn't make much sense anymore.

This revision is now accepted and ready to land. Dec 4 2018, 2:26 PM

Closed by commit rLLD348311: [ELF] Simplify getSectionPiece (authored by MaskRay). Dec 4 2018, 2:28 PM

This revision was automatically updated to reflect the committed changes.

MaskRay mentioned this in D55234: Do not use a hash table to uniquify mergeable strings. Dec 4 2018, 4:23 PM

Revision Contents

Path: ELF/InputSection.cpp
Size: 17 lines

Diff 176711

ELF/InputSection.cpp

(1,217 lines not shown)

 SectionPiece *MergeInputSection::getSectionPiece(uint64_t Offset) {
   // Find a piece starting at a given offset.
   auto It = OffsetMap.find(Offset);
   if (It != OffsetMap.end())
     return &Pieces[It->second];

   // If Offset is not at beginning of a section piece, it is not in the map.
   // In that case we need to do a binary search of the original section piece vector.
-  size_t Size = Pieces.size();
-  size_t Idx = 0;
-
-  while (Size != 1) {
-    size_t H = Size / 2;
-    Size -= H;
-    if (Pieces[Idx + H].InputOff <= Offset)
-      Idx += H;
-  }
-  if (Offset < Pieces[Idx].InputOff)
-    --Idx;
-  return &Pieces[Idx];
+  auto It2 =
+      llvm::upper_bound(Pieces, Offset, [](uint64_t Offset, SectionPiece P) {
+        return Offset < P.InputOff;
+      });
+  return &It2[-1];
 }

 // Returns the offset in an output section for a given input offset.
 // Because contents of a mergeable section is not contiguous in output,
 // it is not just an addition to a base output offset.
 uint64_t MergeInputSection::getParentOffset(uint64_t Offset) const {
   // If Offset is not at beginning of a section piece, it is not in the map.
   // In that case we need to search from the original section piece vector.
   const SectionPiece &Piece =

(46 lines not shown)
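As a rough illustration of the two-step lookup in the diff above (hash-map hit for offsets at a piece boundary, binary search otherwise), here is a self-contained sketch. It is not lld's code: `MergeSectionSketch` and `Piece` are hypothetical names, `std::unordered_map` stands in for whatever map type `OffsetMap` really is, and `std::upper_bound` replaces the `llvm::upper_bound` range wrapper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for lld's SectionPiece.
struct Piece {
  uint64_t InputOff;
};

// Hypothetical miniature of a mergeable input section.
struct MergeSectionSketch {
  std::vector<Piece> Pieces;                      // sorted by InputOff
  std::unordered_map<uint64_t, size_t> OffsetMap; // piece start -> index

  explicit MergeSectionSketch(std::vector<Piece> P) : Pieces(std::move(P)) {
    for (size_t I = 0; I < Pieces.size(); ++I)
      OffsetMap[Pieces[I].InputOff] = I;
  }

  const Piece *getSectionPiece(uint64_t Offset) const {
    // Fast path: Offset is exactly the start of a piece.
    auto It = OffsetMap.find(Offset);
    if (It != OffsetMap.end())
      return &Pieces[It->second];

    // Slow path: Offset points into the middle of a piece, so find the first
    // piece that starts after it and step back by one.
    auto It2 = std::upper_bound(
        Pieces.begin(), Pieces.end(), Offset,
        [](uint64_t Off, const Piece &P) { return Off < P.InputOff; });
    assert(It2 != Pieces.begin() && "Offset precedes the first piece");
    return &It2[-1];
  }
};
```

The sketch asserts that the offset does not precede the first piece; how the real function guards that case is not visible in this diff.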

[ELF] Simplify getSectionPiece (Closed, Public)
