This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Compute the Newton series for reciprocals natively
ClosedPublic

Authored by evandro on Nov 10 2016, 12:48 PM.

Details

Summary

Implement the Newton series for square root, its reciprocal and reciprocal natively using the helper instructions in AArch64.

Diff Detail

Repository
rL LLVM

Event Timeline

evandro updated this revision to Diff 77536.Nov 10 2016, 12:48 PM
evandro retitled this revision from to [AArch64] Compute the Newton series for reciprocals natively.
evandro updated this object.
evandro added reviewers: spatel, t.p.northover, rengolin.
evandro set the repository for this revision to rL LLVM.
jmolloy requested changes to this revision.Nov 11 2016, 12:44 AM
jmolloy added a reviewer: jmolloy.
jmolloy added a subscriber: jmolloy.

Hi Evandro,

Some comments (purely technical this time, thank goodness!) :)

James

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4658 ↗(On Diff #77536)

This seems a bit of a hack. Can we at least get a comment saying this is working around the model somehow?

Can Sanjay comment on how the model was meant to be used where targets have native step support?

4681 ↗(On Diff #77536)

Why do you need this? Is this to preserve sign-of-zero?

You don't need this if the original operation was marked nsz or fast.

This revision now requires changes to proceed.Nov 11 2016, 12:44 AM
spatel added inline comments.Nov 11 2016, 7:10 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4658 ↗(On Diff #77536)

There might be a better API, but this is more or less what I imagined if you want to disable the generic refinement:

/// A target may choose to implement its own refinement within this function.
/// If that's true, then return '0' as the number of RefinementSteps to avoid
/// any further refinement of the estimate.
evandro added inline comments.Nov 11 2016, 8:37 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4681 ↗(On Diff #77536)

I'm not sure what you're pointing out here, so allow me to be verbose.

When taking sqrt(0.0), the instruction sequence above ends up with a NaN. Therefore, 0.0 is a special case when the result is hamfisted for correctness.

Instead of materializing 0.0, which requires extra instructions, since the argument is known to be 0.0, I chose to use it instead. Yes, as a result its sign is preserved, but this is not the reason why I consider this to be a more sensible choice.

jmolloy accepted this revision.Nov 11 2016, 9:01 AM
jmolloy edited edge metadata.

All concerns addressed from my side.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4681 ↗(On Diff #77536)

Ah, I understand now, thanks!

This revision is now accepted and ready to land.Nov 11 2016, 9:01 AM
This revision was automatically updated to reflect the committed changes.
evandro marked 5 inline comments as done.