Page MenuHomePhabricator

[LLD][ELF] Optimize linker script filename glob pattern matching NFC

Authored by andrewng on Sep 10 2020, 10:20 AM.



Optimize the filename glob pattern matching in
LinkerScript::computeInputSections() and LinkerScript::shouldKeep().

Add InputFile::getNameForScript() which gets and if required caches the
Inputfile's name used for linker script matching. This avoids the
overhead of name creation that was in getFilename() in LinkerScript.cpp.

Add InputSectionDescription::matchesFile() and
SectionPattern::excludesFile() which perform the glob pattern matching
for an InputFile and make use of a cache of the previous result. As both
computeInputSections() and shouldKeep() process sections in order and
the sections of the same InputFile are contiguous, these single entry
caches can significantly speed up performance for more complex glob

These changes have been seen to reduce link time with --gc-sections by
up to ~40% with linker scripts that contain KEEP filename glob patterns
such as "*crtbegin*.o".

Diff Detail

Event Timeline

andrewng created this revision.Sep 10 2020, 10:20 AM
andrewng requested review of this revision.Sep 10 2020, 10:20 AM
MaskRay added inline comments.Sep 10 2020, 10:51 AM

Is the string allocation the bottleneck?

andrewng added inline comments.Sep 10 2020, 11:01 AM

Yes, that is a small part of the overhead. For inputs from archives there's also the string concatenation.

The main overhead is the repeated glob pattern matching for the same input file, i.e. any input file that contains a large number of sections.

I think this is fine approach in general. I wonder if it should/could be splitted into 2 patches though:
one for InputSectionDescription and one for SectionPattern?


It feels that getFilename can instead be just inlined now, though?


You can just call emplace. (And below).


I'd add a comment.


Should we make InputSectionDescription to be class (because it has a member function now)
and make filePat private?


I'd add a comment here too probably.

andrewng updated this revision to Diff 291220.Sep 11 2020, 7:56 AM

Update to address review comments and suggestions.

andrewng marked 4 inline comments as done.Sep 11 2020, 8:01 AM

I think this is fine approach in general. I wonder if it should/could be splitted into 2 patches though:
one for InputSectionDescription and one for SectionPattern?

I did consider this, but the changes for both are very similar, somewhat overlap and have the same justification, so that's why I've put them together. However, if the general consensus is that it should be split up, then I'm happy to do this.


I've inlined the function.

grimar accepted this revision.Sep 14 2020, 2:10 AM

LGTM. Please wait for @MaskRay opinion.


I've meant it is a single line function and you can probably just remove it and inline its logic.

if (!matchesFileCache || matchesFileCache->first != file)
      file, filePat.match(getFilename(file ? file->getNameForScript() : "")));

Up to you.

A bit unrelated to this diff: thinking more I wonder if we want to assume an empty file name at all for matching.
We perhaps can be more explicit. Something like:

bool InputSectionDescription::matchesFile(const InputFile *file) const {
  if (filePat.isTrivialMatchAll())
    return true;

  if (!file)
    return false;
bool SectionPattern::excludesFile(const InputFile *file) const {
  if (!file || excludedFilePat.empty())
    return false;
This revision is now accepted and ready to land.Sep 14 2020, 2:10 AM
MaskRay added inline comments.Sep 14 2020, 5:19 PM



Delete the two lines.


... and the argument.


... and the argument.

MaskRay added inline comments.Sep 14 2020, 5:24 PM

getFilename is called twice. Having it separate looks good to me.

if (filePat.isTrivialMatchAll())

Avoiding the single-element cache with just (filePat.isTrivialMatchAll())? I think that will not speed up KEEP (*crtbegin.o(.ctors)) as in GNU ld's internal linker scripts. The idea is, in computeInputSections, when testing sections. The sections of the same file are contiguous. So a single-element cache can speed up.

The description should be modified to mention this point.

This change requires D87468.

You can click Edit Related Revisions... to make this clearer in Phab.

andrewng updated this revision to Diff 291883.Sep 15 2020, 6:12 AM
andrewng marked an inline comment as done.
andrewng edited the summary of this revision. (Show Details)

Update to address review comments.

andrewng marked 7 inline comments as done.Sep 15 2020, 6:30 AM
andrewng added inline comments.

@grimar, the null InputFile for matchesFile() isn't quite so straightforward because isTrivialMatchAll() only currently identifies "*" and not "**" (or any other sequence of '*'). But yes, something like you suggest could be done as another change if desired.

@MaskRay, the isTrivialMatchAll() is there to catch the common case of the file glob pattern being "*". This was more important before I added getNameForScript() but it is still beneficial which is why I have left it in. I have updated the description regarding the speed up due to the caching of the glob pattern match.

MaskRay accepted this revision.Sep 15 2020, 8:57 AM


Herald added a project: Restricted Project. · View Herald TranscriptSep 16 2020, 2:27 AM