This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Reintroduce CC exception for non-inlined functions in Promote Alloca limits
ClosedPublic

Authored by Pierre-vh on May 15 2023, 2:24 AM.

Details

Summary

This is basically a partial revert of https://reviews.llvm.org/D145586 (fd1d60873fdc).

D145586 was originally introduced to help with SWDEV-363662, and it did, but
it also caused a 25% drop in performance in
some MIOpen benchmarks where, it seems,
functions are inlined more conservatively.

This patch restores the pre-D145586 behavior
for PromoteAlloca: functions with a non-entry calling convention (CC)
have a 32-VGPR threshold, but only if the function
is not marked with "alwaysinline".
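
To make the rule concrete, here is a minimal standalone sketch of the threshold selection. It is not the code from the patch: `FunctionModel`, `promoteAllocaVGPRBudget`, and `kNonEntryVGPRLimit` are hypothetical names, and clamping with `std::min` against a subtarget-provided maximum is an assumption about how the 32-VGPR threshold combines with the default budget.

```cpp
#include <algorithm>

// Hypothetical stand-in for the two function properties the rule looks at.
struct FunctionModel {
  bool isEntryCC;       // true for kernel/entry calling conventions
  bool hasAlwaysInline; // true if the function carries "alwaysinline"
};

// The 32-VGPR threshold described in the summary.
constexpr unsigned kNonEntryVGPRLimit = 32;

// Returns the VGPR budget PromoteAlloca would work with in this model.
// subtargetMaxVGPRs is whatever budget would apply without the exception
// (assumed to be supplied by the caller).
unsigned promoteAllocaVGPRBudget(const FunctionModel &F,
                                 unsigned subtargetMaxVGPRs) {
  // Only non-entry functions that are not guaranteed to be inlined
  // get the lower threshold.
  if (!F.isEntryCC && !F.hasAlwaysInline)
    return std::min(subtargetMaxVGPRs, kNonEntryVGPRLimit);
  return subtargetMaxVGPRs;
}
```

In this model an "alwaysinline" callee keeps the full budget, on the reasoning that it will end up inside an entry function anyway, while a callee that stays out of line is held to 32 VGPRs.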

A good amount of AMDGPU code makes use of
the AMDGPUAlwaysInline pass anyway, so in our
backend "alwaysinline" seems very common.

This change does not affect SWDEV-363662 (the motivating issue for introducing D145586).

Fixes SWDEV-399519

Event Timeline

Pierre-vh created this revision. May 15 2023, 2:24 AM
Herald added a project: Restricted Project. May 15 2023, 2:24 AM
Pierre-vh requested review of this revision. May 15 2023, 2:24 AM
Herald added a project: Restricted Project. May 15 2023, 2:24 AM
Pierre-vh added a reviewer: Restricted Project. May 17 2023, 2:00 AM
rampitec accepted this revision. May 22 2023, 9:54 AM
This revision is now accepted and ready to land. May 22 2023, 9:54 AM