This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Use Power9 test data class instruction to lower IS_FPCLASS
ClosedPublic

Authored by qiucf on Dec 20 2022, 1:59 AM.

Details

Summary

Power ISA 3.0 introduced new 'test data class' instructions (xststdc(s|d|q)p and xvtstdc(s|d)p), which accept flags for: NaN, +Infinity, -Infinity, +Zero, -Zero, +Denormal, -Denormal.

This instruction can be used to implement custom lowering for llvm.is.fpclass intrinsic, but some extra bits provided by the intrinsic are missing: +Normal, -Normal, QNaN, SNaN ('normal' means a number is not inf/nan/zero/denormal). For those, this patch uses a two-way or three-way combination to implement correct behavior:

  • For flags including both +Normal/-Normal, try reverse flags to avoid them.
  • For flags including either +Normal/-Normal, try the rest flags first, and use method above with sign check.
  • For flags including either QNaN/SNaN, try the rest flags first, and test 'whether it's NaN' with quiet bit set/unset.

I tested all flags (2**10=1024) against one value from every class with float/double/fp128 type (3*10), and compare result with Power8, all 30720 cases pass on ppc64le. And all 20480 cases (fp128 is not supported) pass on ppc64/ppc32.

Nearly all 3072 cases see improvement, except some shown in changes of is_fpclass.ll.

Diff Detail

Event Timeline

qiucf created this revision.Dec 20 2022, 1:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2022, 1:59 AM
qiucf requested review of this revision.Dec 20 2022, 1:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2022, 1:59 AM
qiucf added inline comments.Dec 20 2022, 2:19 AM
llvm/test/CodeGen/PowerPC/is_fpclass.ll
577

When return type is zeroext i1, this should be:

xscvdpspn 0, 1
mffprwz	3, 0
rlwinm 3, 3, 9, 23, 31
subfic 3, 3, 254
rldicl 3, 3, 1, 63

1 less instruction than new codegen.

qiucf added a comment.Mar 8 2023, 11:59 PM

Gentle ping... Any comments?

sepavloff added inline comments.Mar 14 2023, 9:11 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10958

It is better to use FPClassTest instead of unsigned.

10988

If Mask were FPClassTest, & fcAllFlags could be omitted.

10995

Should it be Mask & ~fcNormal?

11065

Should it be (Mask & fcNan) == fcNan)?

qiucf updated this revision to Diff 505386.Mar 15 2023, 12:36 AM
qiucf marked 4 inline comments as done.
sepavloff added inline comments.Mar 15 2023, 3:48 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11186

According to the LLVM coding stype (https://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor) constructor with parentheses should be used. It also applies to some initializations below.

11205

It seems that if Mask is fcNormal (which is fcPosNormal | fcNegNormal), the OR node is not needed.

11215

Mask & ~fcNan should be enough.

qiucf updated this revision to Diff 506458.Mar 19 2023, 8:32 PM
qiucf marked 2 inline comments as done.
qiucf added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11205

It's already been handled by previous condition (Mask & fcNormal) == fcNormal.

sepavloff accepted this revision.Mar 21 2023, 11:01 AM

LGTM.

I am not proficient in Power ISA and rely on the results of testing mentioned in the patch summary.

This revision is now accepted and ready to land.Mar 21 2023, 11:01 AM

Hi @nemanjai @shchenz do you have any comments? I verified the behavior of new codegen again, all okay on ppc64le/ppc64/ppc32

shchenz added inline comments.Mar 29 2023, 12:14 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11151

Should we check isKnownNeverNaN() or isKnownNeverZeroFloat() first before we really compute the class? (we may need to add isKnownNeverInfinity() and isKnownNeverSubnormal()). With these functions, we can handle some compile-time class. For example, the Op is a fp constant.

11207

Do we need the OR if Mask == fcPosNormal || Mask == fcNegNormal ?

11244

Will we generate an illegal type if useCRBits() is set to false?

11248

Same as above, we may not need the OR is we only test SNAN or QNAN?

qiucf updated this revision to Diff 509246.Mar 29 2023, 1:25 AM
qiucf marked 3 inline comments as done.
qiucf added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11151

I think they were already handled in constant folding/earlycse before ISel. Following code still produces folded result after this patch on pwr9:

define zeroext i1 @fpclass() {
entry:
  %x = call i1 @llvm.is.fpclass.f64(double 0.0, i32 64)
  ret i1 %x
}
declare i1 @llvm.is.fpclass.f64(double, i32)

SystemZ implementation does also not include constant folding.

shchenz accepted this revision as: shchenz.Mar 29 2023, 2:04 AM

Looks reasonable to me too. Thanks for making this improvement.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11151

OK. Let's improve this later if we find a pattern that is created during ISEL. I agree if the pattern is created before, we should handle it in inst combine phase.

This revision was landed with ongoing or failed builds.Apr 2 2023, 8:44 PM
This revision was automatically updated to reflect the committed changes.