This is an archive of the discontinued LLVM Phabricator instance.

[NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero
ClosedPublic

Authored by steven.zhang on Jan 12 2021, 2:12 AM.

Details

Summary

For now, we correct the result for sqrt if iteration > 0. This doesn't make sense as they are not strict relative. Besides, some target(i.e. AArch64 and PowerPC downstream) want to have custom sqrt soft expansion instead of the default OneConstantNR/TwoConstantNR and set the iteration as zero. We need to correct the result for them also. As AArch64 don't want to test the input with IEEE, so, I move the test implementation into the hook.

Diff Detail

Event Timeline

steven.zhang created this revision.Jan 12 2021, 2:12 AM
steven.zhang requested review of this revision.Jan 12 2021, 2:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2021, 2:12 AM
steven.zhang edited the summary of this revision. (Show Details)Jan 12 2021, 2:27 AM
RKSimon added inline comments.Jan 19 2021, 1:55 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5860

(style) drop the else

  if (Mode.Input == DenormalMode::IEEE) {
  ...
  }

  // Test = X == 0.0
  return DAG.getSetCC(DL, CCVT, Op, FPZero, ISD::SETEQ);
}

Address comments.

spatel added inline comments.Jan 21 2021, 8:07 AM
llvm/test/CodeGen/AArch64/sqrt-fastmath.ll
75–77

The bif/bic difference seems logically fine and probably better overall since we're using less registers, but someone who knows AArch should comment.

Also, it would be interesting to know why the code changed because we are seemingly producing the same set of SDNode ops? There might be another combine/lowering opportunity.

steven.zhang retitled this revision from [DAGCombine] Correct the result for sqrt even the iteration is zero to [NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero.
steven.zhang edited the summary of this revision. (Show Details)

The bif/bic difference seems logically fine and probably better overall since we're using less registers, but someone who knows AArch should comment.
Also, it would be interesting to know why the code changed because we are seemingly producing the same set of SDNode ops? There might be another combine/lowering opportunity.

This is a good example to show how important the code review is and thank you for pointing out this. I should have realized it as it should be a NFC patch. The reason why you see the test change is that, AArch64 don't want the zero if it is denormal input while the default implementation return zero if it is denormal input. I miss to override it.

dmgreen accepted this revision.Jan 22 2021, 1:25 AM

but someone who knows AArch should comment.

Sorry, I had no idea what this was and what it was doing. I see now, and this does indeed look like an NFC.

It turns out that on AArch64 it is controlled by a subtarget feature that is never enabled, so won't be used without being forced. Not handling IEEE denomal inputs might be a mistake, I'm not sure. This patch is an NFC in that regards though.

LGTM

This revision is now accepted and ready to land.Jan 22 2021, 1:25 AM