Page MenuHomePhabricator

PowerPC: Don't hoist float multiply + add to fused operation on SPE
Needs ReviewPublic

Authored by jhibbits on Apr 6 2020, 8:12 AM.

Details

Reviewers
shchenz
Group Reviewers
Restricted Project
Summary

SPE doesn't have a fmadd instruction, so don't bother hoisting a
multiply and add sequence to this, as it'd become just a library call.
Hoisting happens too late for the CTR usability test to veto using the CTR
in a loop, and results in an assert "Invalid PPC CTR loop!".

A test is forthcoming. This may be the wrong approach, and a better approach
may be instead to use a feature flag, or something else, for the CPU. However,
since the "traditional" FPU and all its derivatives have fmadd, I didn't see it
necessary for solving this problem. There may be additional CTR related
problems on SPE, though, that I haven't found yet.

Diff Detail

Event Timeline

jhibbits created this revision.Apr 6 2020, 8:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2020, 8:12 AM

This happens too late for the CTR usability test to veto using the CTR in a loop, and results in an assert "Invalid PPC CTR loop!".

This means after not hoisting fmul makes a case have "Invalid PPC CTR loop!" assertion? This is a little surprised for me. Hoist or not hoist fmul should not impact CTR register.
Looking forward to your case.

This happens too late for the CTR usability test to veto using the CTR in a loop, and results in an assert "Invalid PPC CTR loop!".

This means after not hoisting fmul makes a case have "Invalid PPC CTR loop!" assertion? This is a little surprised for me. Hoist or not hoist fmul should not impact CTR register.
Looking forward to your case.

Sorry, on a further reading of the summary I can see it can be a little confusing. *hoisting* fmul + fadd to fma results in a library call, because SPE doesn't have a fma instruction. Unfortunately, this transform is performed long after a loop is transformed to a CTR with bdnz loop, so it can't be caught at the loop transform time, and block the loop. In addition to triggering that assert, hoisting two instructions into a function call is quite a pessimization anyway :)

I'll update the summary to clarify this.

jhibbits edited the summary of this revision. (Show Details)Apr 7 2020, 7:06 AM