This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Fix symbol name returned from InputSection::getLocation
ClosedPublic

Authored by BertalanD on Jun 13 2022, 10:15 AM.

Details

Summary

This commit fixes the issue that getLocation always printed the name of
the first symbol in the section.

For clarity, upper_bound is used instead of a linear search for finding
the closest symbol name. Note that this change does not affect
performance: this function is only called when printing errors and
symbols typically contains a single symbol because of
.subsections_via_symbols.

Please commit this diff with the following author info:
Daniel Bertalan <dani@danielbertalan.dev>

Diff Detail

Event Timeline

BertalanD created this revision.Jun 13 2022, 10:15 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 13 2022, 10:15 AM
BertalanD requested review of this revision.Jun 13 2022, 10:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 13 2022, 10:15 AM

Thanks for the patch!

As a general guideline, I think it helps if every patch is either a behavior-preserving refactoring (these patches can be larger), or something that changes behavior (these patches are ideally smaller). That makes it easier to reason about invariants, and makes the patch easier to review.

In this case, the patch is tiny either way, and it doesn't matter all that much. But it does two separate things: 1. Change the search algorithm implementation (behavior-preserving) 2. Change the diag (not behavior preserving). So abstractly, it seems kind of nice to split them, or at least to talk about both parts separately.

For the refactor patch: There are usually few symbols per section, so a linear scan is likely faster in practice. Also, this function is only called when emitting errors, so it's not performance-critical anyways, probably. So I guess the motivation for that part is clarity?

For the behavior change: Nice catch! Patches that change behavior ideally come with tests – do you think you could include one? In practice, we'd only handle this in files without .subsections_via_symbols, right?

BertalanD retitled this revision from [lld-macho] Use binary search in InputSection::getLocation to [lld-macho] Fix symbol name returned from InputSection::getLocation.
BertalanD edited the summary of this revision. (Show Details)

Clarified that the use of upper_bound is purely a code style change, and added a test.

thakis accepted this revision.Jun 13 2022, 12:47 PM

Thanks!

This revision is now accepted and ready to land.Jun 13 2022, 12:47 PM
This revision was landed with ongoing or failed builds.Jun 13 2022, 12:52 PM
This revision was automatically updated to reflect the committed changes.
int3 added a comment.Jun 14 2022, 2:53 AM

For the refactor patch: There are usually few symbols per section, so a linear scan is likely faster in practice. Also, this function is only called when emitting errors, so it's not performance-critical anyways, probably. So I guess the motivation for that part is clarity?

Yep, those were my reasons. But yeah no harm changing it to a binary search either since it's not perf-critical