This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] don't check CTR clobber in hardware loop insertion pass
ClosedPublic

Authored by shchenz on Oct 12 2022, 8:37 PM.

Details

Summary

We added a new post-isel CTRLoop pass in D122125. That pass will expand the hardware loop related intrinsic to CTR loop or normal loop based on the loop context.

So we don't need to conservatively check the CTR clobber now on the IR level.

As expected, now, more CTR loops are generated.

Diff Detail

Event Timeline

shchenz created this revision.Oct 12 2022, 8:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 12 2022, 8:37 PM
shchenz requested review of this revision.Oct 12 2022, 8:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 12 2022, 8:37 PM

gentle ping

gentle ping.

I am going to commit this first in next week. This patch does not introduce any new logic, it just deletes some legacy codes and the test change seems good to me. (we may can delete more legacy codes, I'll confirm this later.) I also found no regression for this patch on ppc with some required end-to-end tests.

lkail added inline comments.Nov 6 2022, 6:13 PM
llvm/test/CodeGen/PowerPC/ctrloop-fp128.ll
0–1

Could you please re-generate this test case with complete assembly output, so that we can see why CTR loop is viable on POWER9 now?

shchenz marked an inline comment as done.Nov 6 2022, 8:24 PM
shchenz added inline comments.
llvm/test/CodeGen/PowerPC/ctrloop-fp128.ll
0–1

Power9 will not call library function while Power8 will call library function. I will use script to regenerate this file.

shchenz updated this revision to Diff 473574.Nov 6 2022, 11:35 PM
shchenz marked an inline comment as done.

test case update

lkail added inline comments.Nov 7 2022, 8:02 PM
llvm/test/CodeGen/PowerPC/pr36292.ll
22

Do you have any idea of this regression? Looks we have additional addi and b in the loop after the change.

llvm/test/CodeGen/PowerPC/pr48519.ll
31

nit: Maybe we should change undef to concrete value to avoid flag flip.

llvm/test/CodeGen/PowerPC/pr55463.ll
71

nit: undef leads to flag flip.

llvm/test/CodeGen/PowerPC/spe.ll
1798–1800

This regression looks the same pattern as llvm/test/CodeGen/PowerPC/pr36292.ll.

shchenz added inline comments.Nov 7 2022, 11:17 PM
llvm/test/CodeGen/PowerPC/pr36292.ll
22

Now in the hardware insertion pass, we treat this loop as an hardware loop without checking the loop body. So the hardware insertion pass will try to calculate the loop count in the loop preheader and then decrease the loop count in the loop latch. This is different with the left one. The left one is not a CTR loop form, i.e. the loop count is not calculated in the loop preheader.

This adds more instructions(addi and bc, please note that the last branch is changed to uncontional branch) in the loop body. but for other case, if the body has no CTR clobber, it is a valid CTR loop which can benefit other optimizations.

llvm/test/CodeGen/PowerPC/pr55463.ll
71

Can you be more specific? Do you mean we should not use addi -1? The addi -1 is hard-coded. See https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/PowerPC/PPCCTRLoops.cpp#L263-L266 .

shchenz added inline comments.Nov 13 2022, 11:28 PM
llvm/test/CodeGen/PowerPC/pr55463.ll
71

The loop count for this loop is undef, so ideally this loop can be totally optimized out, I tried to fix this in D137758, but the reviewer suggested this should not be an issue as long as opt is able to remove this loop.

lkail accepted this revision as: lkail.Nov 14 2022, 11:41 PM

LGTM. I prefer using MIR based pass to perform CTR loop conversion instead of maintaining a large list of functions that might clobber CTR. Please hold on couple of days in case other reviewers have comments.

This revision is now accepted and ready to land.Nov 14 2022, 11:41 PM
shchenz updated this revision to Diff 476348.Nov 17 2022, 11:16 PM
This revision was landed with ongoing or failed builds.Dec 4 2022, 5:54 PM
This revision was automatically updated to reflect the committed changes.