This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Fold vector fmul/fdiv by pow2 into floating-point/fixed-point conversions.
ClosedPublic

Authored by mcrosier on Oct 5 2015, 11:22 AM.

Details

Summary

Adds DAG combines for:

  1. Folding a floating-point multiply by power of two into floating-point to fixed-point conversion and
  2. Folding a floating-point multiply by power of two into fixed-point to floating-point conversion.

This is essentially a port of a combine from the ARM backend (r133813).

Thanks,

Chad

Diff Detail

Repository
rL LLVM

Event Timeline

mcrosier updated this revision to Diff 36534.Oct 5 2015, 11:22 AM
mcrosier retitled this revision from to [AArch64] Fold vector fmul by pow2 into floating-point to fixed-point conversion..
mcrosier updated this object.
mcrosier added reviewers: t.p.northover, jmolloy, Gerolf.
mcrosier set the repository for this revision to rL LLVM.
mcrosier added subscribers: llvm-commits, mssimpso, haicheng and 3 others.
mcrosier added a reviewer: ab.Oct 5 2015, 11:26 AM
t.p.northover edited edge metadata.Oct 5 2015, 11:45 AM

Hi Chad,

I think the bulk of the logic's fine, just one or two cleanups...

lib/Target/AArch64/AArch64ISelLowering.cpp
7537–7538

I think this function could be made simpler (and more generic, e.g. handling undefs too) by calling BuildVectorSDNode::isConstantSplat.

test/CodeGen/AArch64/fcvt_combine.ll
5

This style doesn't check that the fixed-point shift is correctly translated.

mcrosier updated this revision to Diff 36631.Oct 6 2015, 9:07 AM
mcrosier retitled this revision from [AArch64] Fold vector fmul by pow2 into floating-point to fixed-point conversion. to [AArch64] Fold vector fmul/fdiv by pow2 into floating-point/fixed-point conversions..
mcrosier updated this object.
mcrosier edited edge metadata.
mcrosier removed rL LLVM as the repository for this revision.

Updated the helper function, isConstVecPow2, per Tim's comments.

Added support for folding sitofp/uitofp followed by fdiv by power of two.

mcrosier marked 2 inline comments as done.Oct 6 2015, 9:07 AM

Thanks for updating the patch, Chad. Still got a few more issues...

lib/Target/AArch64/AArch64ISelLowering.cpp
7537

I've just noticed this started life in ARMISelLowering.cpp; since we're improving it here, can we fix that one too? Or even share the implementation somewhere (BuildVectorSDNode, or perhaps SelectionDAG if you anticipate handling more cases in the future)

7542

Why are we bailing on undef here?

7548

I don't think this can handle 2^64. Really, we might as well convert to an APSInt and use exactLogBase2 or something.

Thanks, Tim. I'll post a new revision in a bit.

lib/Target/AArch64/AArch64ISelLowering.cpp
7537

I was planning on updating the version in ARMISelLowering as well, but I agree we can just make this common to BuildVectorSDNode or elsewhere and use the new version in both ARM and AArch64.

7542

No particular reason other than I was trying to be conservative. I'll remove the !UndefElements.none() check.

7548

Thanks for the suggestion, Tim. I'm not overly familiar with all of the APFloat/APInt APIs, hence the less than ideal solution. I'll start digging to see if I can't implement a more robust solution.

mcrosier updated this revision to Diff 36749.Oct 7 2015, 8:26 AM

This should take care of all of your suggested improvements, Tim.

  1. Accommodated Tim's request for a generic interface for both ARM and AArch64 to compute const FP splat of power of 2.
  2. Allow matching against constant splats with undefs.

    Chad
mcrosier marked 6 inline comments as done.Oct 7 2015, 8:27 AM
t.p.northover accepted this revision.Oct 7 2015, 9:19 AM
t.p.northover edited edge metadata.

Thanks Chad. I think this looks good now.

Tim.

This revision is now accepted and ready to land.Oct 7 2015, 9:19 AM
mcrosier closed this revision.Oct 7 2015, 10:54 AM

Thanks, Tim! Committed in r249572, r249576, and r249579.