This is an archive of the discontinued LLVM Phabricator instance.

Optimize lld::elf::ScriptLexer::getLineNumber by avoiding repeated work
ClosedPublic

Authored by ccross on Jun 11 2021, 11:16 AM.

Details

Summary

getLineNumber() was counting the number of line feeds from the start
of the buffer to the current token. For large linker scripts this
became a performance bottleneck. For one 4MB linker script over 4
minutes was spent in getLineNumber's StringRef::count.

Store the line number from the last token, and only count the additional
line feeds since the last token.

Diff Detail

Event Timeline

ccross created this revision.Jun 11 2021, 11:16 AM
ccross requested review of this revision.Jun 11 2021, 11:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 11 2021, 11:16 AM
MaskRay added inline comments.Jun 13 2021, 10:48 AM
lld/ELF/ScriptLexer.cpp
65

the prevailing style is compact. You can move tokOffset immediately below StringRef tok and delete the extra empty lines

// For the first token, or when going backwards, start from the beginning of
// the buffer. If this token is after the previous token start from the previous token.
size_t line = 1;
size_t start = 0;
if (lastLineNumberOffset > 0 && tokOffset >= lastLineNumberOffset) {
MaskRay accepted this revision.Jun 13 2021, 10:48 AM

Looks great!

This revision is now accepted and ready to land.Jun 13 2021, 10:48 AM
ccross updated this revision to Diff 352009.Jun 14 2021, 3:22 PM

Reorder variable definitions to reduce line count

MaskRay accepted this revision.Jun 14 2021, 6:18 PM

I can commit this on your behalf:)

For one 4MB linker script over 4 minutes was spent in getLineNumber's StringRef::count.

Hmm... A 4MB linker script is weird.

@ccross How did you end up with a 4MB linker script? Something like .text : { input0(.text) input1(.text) input2(.text) } to reorder sections?

@ccross How did you end up with a 4MB linker script? Something like .text : { input0(.text) input1(.text) input2(.text) } to reorder sections?

I was experimenting with embedding a binary into another binary by converting the loadable sections into blocks of .WORD(0x....) in a linker script. I ended up needing an extra .s file to set the writable/executable flags on the resulting sections, so I moved the data back into the .s file and no longer have a 4MB linker script. If I could set the flags on sections created from a linker script I could handle the embedding entirely in the linker script.