This is an archive of the discontinued LLVM Phabricator instance.

[ELF] .gdb_index: use lower_bound to compute relative CU index in an object file
ClosedPublic

Authored by MaskRay on Nov 13 2018, 11:24 AM.

Event Timeline

MaskRay created this revision.Nov 13 2018, 11:24 AM
dblaikie added inline comments.Nov 13 2018, 11:35 AM
ELF/SyntheticSections.cpp
2430–2435

I get where @ruiu is coming from in this being a bit harder to read - and if he'd prefer it to be a linear search (N^2, ultimately where N is the number of pub tables in the file - which would usually be small, except for LTO situations), that's OK.

I'd prefer a binary search - though perhaps there are more legible ways to write it, maybe having a nicer general algorithm, something like:

auto I = llvm::BinarySearch(CUs, [](GDBIndexSection::CuEntry CU) {
  return CU.CuOffset == Offset ? 0 : CU.CuOffset < Offset ? -1 : 1;
});
if (I == CUs.end()) {
  /* invalid debug info, a pubnames seciton with a CU offset that's not the offset of any CU in the input */
}
... *I ...

Might be nice - open to ideas on naming "BinarySearch" so that it's not confusing with other similar APIs, etc. Though I don't think it could be misused terribly easily - if someone mistook it for std::binary_search it'd only successfully compile if the object being searched for also happened to have an operator() that matched this usage, et.c

test/ELF/gdb-index-multiple-cu.s
62

Probably worth explaining why they're swapped - what this is intending to test for (that it's testing the case where the pubnames are not in the same order as the CUs), something like, perhaps:

  1. Swap the pubname sections to test the case where pubnames are in a different order than the CUs they refer to
MaskRay updated this revision to Diff 173920.Nov 13 2018, 12:56 PM
MaskRay retitled this revision from [ELF] Use lower_bound to compute relative CU index to [ELF] .gdb_index: use lower_bound to compute relative CU index in an object file.

Rebase because of the Kind: -> Attributes: change

MaskRay updated this revision to Diff 173924.Nov 13 2018, 1:11 PM

Elaborate the rationale behind swapping as dblaikie suggested

MaskRay marked an inline comment as done.Nov 13 2018, 1:21 PM
MaskRay added inline comments.
ELF/SyntheticSections.cpp
2430–2435

Yes, the layout clang-format uses is not pretty. I don't know if additional error checking is worth the few more lines of code, as gold only handles the first set of .debug_gnu_pubnames. I'm not sure if the implementer has noticed that, but I won't be surprised if they don't as debug symbol accelerating in the partially linked object files does not seem very noticeable.

As to llvm::BinarySearch, I'm unclear if the end()-returning behavior when there is no element equal to the needle is useful. stdlib.h bsearch and std::binary_search look mostly useless to me.

It is important to get CuIndex of subsequent object files correct (that was the main merit of D54361)

dblaikie added inline comments.Nov 13 2018, 1:35 PM
ELF/SyntheticSections.cpp
2430–2435

Not sure I follow a few things

  • Error checking seems necessary to me (but I know lld takes a different approach to error handling than I would - not sure how far that extends, whether only to invalid object files, or invalid data (like this DWARF) inside a valid object file)
  • There are several things I meant to imply by the llvm::BinarySerach example - one, that it wouldn't take the needle parameter and pass it back into the comparator (this seemed to be Rui's cited complaint - and it is a bit awkward/unnecessary), returning end rather than the next element (seems more likely what most users want when they're searching for something in a list), and container (rather than begin/end) support
  • stdlib.h/std::binary_search - yeah, neither look particularly helpful for this problem/making this code more readable
  • Not sure I follow why any of the solutions discussed would cause/risk the CuIndex of subsequent object files to be incorrect
MaskRay added inline comments.Nov 13 2018, 1:49 PM
ELF/SyntheticSections.cpp
2430–2435

Sorry I didn't notice that llvm::BinarySearch does not take the needle parameter. Yeah it indeed looks better than std::binary_search!

(this seemed to be Rui's cited complaint - and it is a bit awkward/unnecessary)

std::binary_search has reserved the third parameter as the needle with a generic type. As it cannot overload on that parameter to express a comparator, when both the needle and the comparator appear, it does look a bit redundant when we pass a lambda as a comparator.

Not sure I follow why any of the solutions discussed would cause/risk the CuIndex of subsequent object files to be incorrect

That bug has been fixed and is no longer relevant. These revisions are just polishing the interface and fixing some corner cases that people may not notice. They are good from a quality of implementation view, though.

MaskRay updated this revision to Diff 173935.Nov 13 2018, 1:52 PM

Change std::lower_bound to llvm::lower_bound

but i'm still unclear if we should handle nonexistence case and if so, what error message we should use. It is difficult as LLDDwarfObj does not know the object name.

ruiu added inline comments.Nov 13 2018, 9:35 PM
ELF/SyntheticSections.cpp
2433

Hmm, not sure if I follow. This new code still assume that CUs is sorted by CuOffset, no?

dblaikie added inline comments.Nov 13 2018, 11:25 PM
ELF/SyntheticSections.cpp
2433

It's not that CUs could be unordered wrt CuOffset - they're read from the section linearly, so they're necessarily ordered from smallest to largest CuOffset.

But the order of references from pubnames contributions isn't ordered relative to them - so the first pubnames contribution might reference the last CU, for instance.

The currently committed solution assumes that the pubnames contribution are in the same order as the CU contributions (for each pubnames contribution, it walks from the last CU referenced from a pubnames contribution (or from the start if it's the first) expecting to find the next one - never doubling back).

Thanks that dblaikie gave his point. Should we move forward with this revision? This patch makes the code correct in this corner case that gold would fail at the cost of one more line. The local lower_bound usage is indeed a bit uglier but it shouldn't be a huge problem. When more needs emerge we can add a utility to STLExtras.h and improve the binary search call sites like this place.

ruiu accepted this revision.Nov 28 2018, 2:47 PM

LGTM

This revision is now accepted and ready to land.Nov 28 2018, 2:47 PM
This revision was automatically updated to reflect the committed changes.