This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Exploit the vector min/max instructions
ClosedPublic

Authored by nemanjai on May 24 2018, 9:19 AM.

Details

Summary

These instructions are significantly faster than the corresponding compare/select code.

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai created this revision.May 24 2018, 9:19 AM
jedilyn added inline comments.Jul 18 2018, 6:12 AM
lib/Target/PowerPC/PPCInstrAltivec.td
931 ↗(On Diff #148430)

Hi Nemanja, I have one concern on whether these two hardware instructions for vector float point can be perfectly mapped to these two ISDNode.
As the description of fmaxnum/fminnum "in the case where a single input is NaN, the non-NaN input is returned.",
while the description for the vmaxfp/vminfp in ISA like "The maximum of +0 and -0 is +0. The maximum of any value and a NaN is a QNaN."
It looks more suitable for the fmaxnan/fminnan?

lib/Target/PowerPC/PPCInstrVSX.td
1507 ↗(On Diff #148430)

I'm not sure why not use similar patterns like VMAXSW in PPCInstrAltivec.td but being located in HasP8Vector scope? Is there some special reasons with COPY_TO_REGCLASS?

test/CodeGen/PowerPC/vec-min-max.ll
53 ↗(On Diff #148430)

Some more cases cover sge/sle seems trivial but the coverage is better?

nemanjai planned changes to this revision.Sep 18 2018, 7:01 AM

@jedilyn Hi Ke Wen, thanks for your comments. This needs some cleanup with regard to which instructions match the semantics of F{MIN|MAX}NUM vs. F{MIN|MAX}NAN. I'll clean that up and re-post this for your review. Thanks.

lib/Target/PowerPC/PPCInstrAltivec.td
931 ↗(On Diff #148430)

I am really sorry about such a long delay in responding to this. You are absolutely right. The correct nodes are FMINNAN and FMAXNAN. I think what I meant to use here are XVMAXDP, XVMINDP, XVMAXSP, XVMINSP. Those have the mentioned semantics (i.e. comparing a value and a NaN returns the value). And I don't think we need to worry about signalling NaNs at this time.

lib/Target/PowerPC/PPCInstrVSX.td
1507 ↗(On Diff #148430)

These instructions are new in ISA 2.07 so they have to be in the P8Vector block. Also, the COPY_TO_REGCLASS is needed because VRRC registers cannot contain v2i64 values.

This covers PR39130 right?

Maybe worth adding the new vec-min-max.ll test file to trunk with current codegen so this patch shows the improved codegen diff?

test/CodeGen/PowerPC/vec-min-max.ll
5 ↗(On Diff #148430)

You might be better using common prefixes to share (and reduce) checks.

--check-prefixes=CHECK,P8VEC
--check-prefixes=CHECK,NOP8VEC

nemanjai marked an inline comment as done.Dec 29 2018, 4:37 PM

This covers PR39130 right?

Maybe worth adding the new vec-min-max.ll test file to trunk with current codegen so this patch shows the improved codegen diff?

Yes, this will fix the PR for vector types. Scalar types will come later.

test/CodeGen/PowerPC/vec-min-max.ll
5 ↗(On Diff #148430)

Oh cool. I was not aware of this functionality. Thank you.

nemanjai updated this revision to Diff 179727.Dec 30 2018, 5:25 AM

Updated to remove the patterns for the Altivec versions of vector min/max as they have IEEE semantics wrt. handling NaN. A subsequent patch will legalize the _IEEE versions of the nodes for single precision and provide patterns to match them to vmaxfp/vminfp.

Is anything happening with this? We've hit PPC issues with ISD::ABS on https://reviews.llvm.org/D49837 and I noticed that PPCTargetLowering::LowerABS has a reference to this patch.

Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2019, 4:22 AM
Herald added a subscriber: jdoerfert. · View Herald Transcript

Is anything happening with this? We've hit PPC issues with ISD::ABS on https://reviews.llvm.org/D49837 and I noticed that PPCTargetLowering::LowerABS has a reference to this patch.

Thanks for pointing this out - I forgot about this patch. It should be ready to commit, perhaps @hfinkel, @echristo or @jsji can go through this and give their opinion on whether we can go ahead with this.

jedilyn accepted this revision.Mar 12 2019, 8:44 PM

LGTM. Thanks for your time!

This revision is now accepted and ready to land.Mar 12 2019, 8:44 PM
This revision was automatically updated to reflect the committed changes.