This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Always match stale entry blocks
ClosedPublic

Authored by spupyrev on Sep 8 2023, 11:35 AM.

Details

Summary

Two (minor) improvements for stale matching:

  • always match entry blocks to each other, even if there is a hash mismatch;
  • ignore nops in (loose) hash computation.

I record a small improvement in inference quality on my benchmarks. Tests are not affected.
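
To make the two changes concrete, here is a minimal sketch, not the code from this diff. The `Instr` and `Block` types, the FNV-1a style hash, and the `looseHash` and `matchBlock` helpers are all hypothetical stand-ins invented for illustration; BOLT's real data structures and hash differ. The sketch only shows the idea: nops are skipped when hashing, and an entry block in the stale profile is paired with the entry block in the current binary regardless of hash.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are not BOLT's actual types.
struct Instr {
  std::string Opcode;
  bool IsNop;
};

struct Block {
  std::vector<Instr> Instrs;
  bool IsEntry;
};

// Loose hash over opcodes only. Nops are skipped, so padding differences
// between the profiled binary and the current binary do not change the hash.
uint64_t looseHash(const Block &B) {
  uint64_t H = 1469598103934665603ULL; // FNV-1a offset basis
  for (const Instr &I : B.Instrs) {
    if (I.IsNop)
      continue;
    for (unsigned char C : I.Opcode) {
      H ^= C;
      H *= 1099511628211ULL; // FNV-1a prime
    }
    H ^= 0xff; // separator between consecutive opcodes
    H *= 1099511628211ULL;
  }
  return H;
}

// Returns the index of the block in Current matched to Stale, or -1 if none.
int matchBlock(const Block &Stale, const std::vector<Block> &Current) {
  // Entry blocks are matched to each other even if their hashes differ.
  if (Stale.IsEntry)
    for (size_t I = 0; I < Current.size(); ++I)
      if (Current[I].IsEntry)
        return static_cast<int>(I);

  // Otherwise fall back to the first block with an equal loose hash.
  const uint64_t H = looseHash(Stale);
  for (size_t I = 0; I < Current.size(); ++I)
    if (looseHash(Current[I]) == H)
      return static_cast<int>(I);
  return -1;
}
```

In this sketch a stale entry block containing `push, mov` still maps to a current entry block containing `sub, nop`, while a non-entry `push, mov` block maps to a current `push, nop, mov` block because the nop does not contribute to the hash.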

Event Timeline

spupyrev created this revision. Sep 8 2023, 11:35 AM
Herald added a reviewer: Amir.
Herald added a reviewer: maksfb.
Herald added a project: Restricted Project.
spupyrev published this revision for review. Sep 8 2023, 11:37 AM
spupyrev edited the summary of this revision.
Herald added a project: Restricted Project. Sep 8 2023, 11:38 AM
Amir accepted this revision. Sep 8 2023, 2:51 PM

Thanks. We recently discussed a case where stale matching was unable to match any block in a function and so function exec count was not set. We thought it's still beneficial to set exec count in this case for function reordering. I assume the entry block match will also make stale matching set function exec count in this scenario?

This revision is now accepted and ready to land. Sep 8 2023, 2:51 PM
Amir added a comment. Sep 8 2023, 2:55 PM

Can you please retitle as imperative statement before landing?

spupyrev retitled this revision from [BOLT] matching stale entry blocks to [BOLT] Always match stale entry blocks. Sep 8 2023, 3:38 PM

> Thanks. We recently discussed a case where stale matching was unable to match any block in a function and so function exec count was not set. We thought it's still beneficial to set exec count in this case for function reordering. I assume the entry block match will also make stale matching set function exec count in this scenario?

Once we get into the block-matching code path, the function is guaranteed to be marked as having a valid profile, and thus it will be considered for all optimizations. There are a few earlier exceptions (the checks in canApplyInference()) that may leave some functions without a profile, and this diff does not change them. To address your scenario, we'd need something else.
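
The control flow described above can be sketched roughly as follows. This is an illustration under assumptions, not BOLT's implementation: `FunctionSketch`, `canApplyInferenceSketch`, and `inferStaleProfileSketch` are hypothetical names, and the real checks behind canApplyInference() are not reproduced here.

```cpp
// Hypothetical, simplified model of a function's profile state.
struct FunctionSketch {
  bool PassesEarlyChecks; // stands in for whatever the early gate tests
  bool HasValidProfile;
};

// Hypothetical stand-in for the early gate; the real conditions are not shown.
bool canApplyInferenceSketch(const FunctionSketch &F) {
  return F.PassesEarlyChecks;
}

// Returns true if the function ends up with a valid profile.
bool inferStaleProfileSketch(FunctionSketch &F) {
  // Functions rejected here never reach block matching, so always matching
  // entry blocks cannot give them a profile.
  if (!canApplyInferenceSketch(F))
    return false;

  // Block matching would run here. Once this point is reached, the function
  // is marked as having a valid profile.
  F.HasValidProfile = true;
  return true;
}
```

Under this model, the scenario raised in the question corresponds to functions that fail the early gate, which is why a separate change would be needed for them.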

This revision was automatically updated to reflect the committed changes.