Implement the Newton series for square root, its reciprocal and reciprocal natively using the helper instructions in AArch64.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Hi Evandro,
Some comments (purely technical this time, thank goodness!) :)
James
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4658 | This seems a bit of a hack. Can we at least get a comment saying this is working around the model somehow? Can Sanjay comment on how the model was meant to be used where targets have native step support? | |
4681 | Why do you need this? Is this to preserve sign-of-zero? You don't need this if the original operation was marked nsz or fast. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4658 | There might be a better API, but this is more or less what I imagined if you want to disable the generic refinement: /// A target may choose to implement its own refinement within this function. /// If that's true, then return '0' as the number of RefinementSteps to avoid /// any further refinement of the estimate. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4681 | I'm not sure what you're pointing out here, so allow me to be verbose. When taking sqrt(0.0), the instruction sequence above ends up with a NaN. Therefore, 0.0 is a special case when the result is hamfisted for correctness. Instead of materializing 0.0, which requires extra instructions, since the argument is known to be 0.0, I chose to use it instead. Yes, as a result its sign is preserved, but this is not the reason why I consider this to be a more sensible choice. |
All concerns addressed from my side.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4681 | Ah, I understand now, thanks! |
This seems a bit of a hack. Can we at least get a comment saying this is working around the model somehow?
Can Sanjay comment on how the model was meant to be used where targets have native step support?