This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] prepare more dq form for P10 pair load/store
ClosedPublic

Authored by shchenz on Dec 1 2020, 7:27 AM.

Details

Summary

P10 has DQ-form pair load/store instructions. This patch lets the PPCLoopInstrFormPrep pass prepare more DQ-form chains.
This matters when a loop has many IV users: many of the current search-space narrowing heuristics target register count, but on PowerPC the search space should be narrowed with instruction count as the target.

I tried to narrow down the test case, but it seems the issue only happens when there is a certain number of IV users, which prevents LSR from finding the best formula set. The issue can be reproduced with 5 chains; I used 7 chains to track the code generation in some internal testing. Please focus on the code change in loop .LBB0_4: # %_loop_2_do_

With this patch, we get more dform pair load/store instructions for P10 in some internal testing, and slight gains (about 1%) on some cpu2017 benchmarks on P9.
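
As a rough illustration of why chain preparation matters for DQ-form: DQ-form memory instructions encode a displacement that must be a multiple of 16, so a group of accesses off the same induction variable can only use them if the base is chosen to leave 16-byte-aligned displacements. The sketch below is not code from PPCLoopInstrFormPrep; `pickDQFormBase`, `BaseChoice`, and `floorMod` are hypothetical names, and it only models the idea of picking the base adjustment that makes the most constant offsets DQ-form legal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Illustrative only: these names are hypothetical and do not exist in LLVM.
// DQ-form displacements must be multiples of 16 bytes.
constexpr int64_t kDQFormAlign = 16;

// Mathematical modulo that stays non-negative for negative offsets.
int64_t floorMod(int64_t value, int64_t modulus) {
  assert(modulus > 0);
  int64_t r = value % modulus;
  return r < 0 ? r + modulus : r;
}

struct BaseChoice {
  int64_t residue;     // byte adjustment folded into the base pointer
  std::size_t covered; // accesses whose remaining displacement is a multiple of 16
};

// Given constant byte offsets of accesses that share one induction variable,
// pick the residue (mod 16) that, once folded into the base, leaves the
// largest number of displacements DQ-form legal. Ties go to the smallest residue.
BaseChoice pickDQFormBase(const std::vector<int64_t>& offsets) {
  std::map<int64_t, std::size_t> counts;
  for (int64_t off : offsets)
    ++counts[floorMod(off, kDQFormAlign)];

  BaseChoice best{0, 0};
  for (const auto& entry : counts)
    if (entry.second > best.covered)
      best = BaseChoice{entry.first, entry.second};
  return best;
}
```

For offsets 8, 24, 40, 56 and 4, folding 8 bytes into the base leaves displacements 0, 16, 32 and 48 for four of the five accesses; the remaining access would need another form or its own chain.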

Event Timeline

shchenz created this revision. · Dec 1 2020, 7:27 AM
Herald added a project: Restricted Project. · Dec 1 2020, 7:27 AM
shchenz requested review of this revision. · Dec 1 2020, 7:27 AM
shchenz edited the summary of this revision. · Dec 1 2020, 7:35 AM
shchenz edited the summary of this revision. · Dec 1 2020, 7:43 AM
steven.zhang added inline comments. · Dec 6 2020, 11:02 PM
llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
83–84

Comments update?

shchenz updated this revision to Diff 309815. · Dec 6 2020, 11:35 PM
shchenz edited the summary of this revision.

1: fix comments

shchenz edited the summary of this revision. · Dec 6 2020, 11:36 PM
steven.zhang accepted this revision. · Dec 7 2020, 5:11 PM

LGTM, as long as we see the perf improvement.

This revision is now accepted and ready to land. · Dec 7 2020, 5:11 PM
This revision was landed with ongoing or failed builds. · Dec 8 2020, 6:09 PM
This revision was automatically updated to reflect the committed changes.