This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Fix the number of iterations for the Newton series
ClosedPublic

Authored by evandro on Nov 1 2017, 1:33 PM.

Details

Summary

The number of iterations was incorrectly determined for DP FP vector types and the tests were insufficient to flag this issue.

Probably the releases 4 and 5 of LLVM should also be fixed.

Diff Detail

Event Timeline

evandro created this revision.Nov 1 2017, 1:33 PM

Full context is missing (seems you forgot diff .... -U9999)

fhahn accepted this revision.Nov 2 2017, 2:58 AM
fhahn added a subscriber: fhahn.

LGTM. This is a change I wanted to make soon myself, but never got around to it yet. It would be great if you could upload a full-context diff and wait a bit with committing, to give other people a chance to raise concerns.

This revision is now accepted and ready to land.Nov 2 2017, 2:58 AM
fhahn added a subscriber: huntergr.Nov 2 2017, 2:59 AM
evandro updated this revision to Diff 121330.Nov 2 2017, 9:28 AM

Sorry about that.

fhahn added a comment.Nov 3 2017, 2:43 AM

Thanks Evandro. Out of curiosity, were there any tests/benchmarks, where this caused failures/wrong results?

evandro added a comment.EditedNov 3 2017, 8:31 AM

Thanks Evandro. Out of curiosity, were there any tests/benchmarks, where this caused failures/wrong results?

Yes, with the series expansion enabled, neither CPU2000 nor CPU2006 would validate. With this fix, expanding sqrt() with the series both validate, but not when expanding division.

I suspect that the initial estimate for division is insufficient for DP, failing to converge even with additional iterations.

This revision was automatically updated to reflect the committed changes.