This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Fix fptoi.sat scalable vector lowering
ClosedPublic

Authored by dmgreen on Jul 18 2022, 10:24 AM.

Details

Summary

Vector fptosi_sat and fptoui_sat were being expanded by unrolling the vector operation. This doesn't work for scalable vector, so this patch adds a call to TLI.expandFP_TO_INT_SAT if the vector is scalable.

Scalable tests are added for AArch64 and RISCV. Some of the AArch64 fptoi_sat operations should be legal, but that will be handled in another patch.

Diff Detail

Event Timeline

dmgreen created this revision.Jul 18 2022, 10:24 AM
dmgreen requested review of this revision.Jul 18 2022, 10:24 AM

This also needed an adjustment to create SetCC+VSelect as opposed to SelectCC for vectors. These should either be combined into SelectCC or handled as needed by the target. (I went back and forth as to whether this should be used for scalar too, or if the selectcc should be handled differently. This way was at least simpler to get codegen not crashing).

This seems a bit weird to me... a couple questions:

  1. For SelectCC with a vector result, are the compare operands supposed to be scalar or vector? Can we add an appropriate assertion to SelectionDAG::getNode()?
  2. Is legalization not correctly handling the case where SELECT_CC is marked "expand"?
llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll
1069

This isn't an improvement.

This also needed an adjustment to create SetCC+VSelect as opposed to SelectCC for vectors. These should either be combined into SelectCC or handled as needed by the target. (I went back and forth as to whether this should be used for scalar too, or if the selectcc should be handled differently. This way was at least simpler to get codegen not crashing).

This seems a bit weird to me... a couple questions:

  1. For SelectCC with a vector result, are the compare operands supposed to be scalar or vector? Can we add an appropriate assertion to SelectionDAG::getNode()?

I guess I had presumed a vector, as that seems to be what is used in practice. I hadn't really considered it though, just called DAG.getSetCC and DAG.getSelect with the existing operands. With getSetCCResultType from the result type, which would be vector.

  1. Is legalization not correctly handling the case where SELECT_CC is marked "expand"?

From what I remember - A vector SELECT_CC will Expand by unrolling. There is no specific expansion code. When I added some (to turn it into a getSetcc+getSelect) the code for some tests was worse than unrolling. (There is also some MVE errors about the condition types of float vectors without ). I could add expansion code that is only for scalable vectors.

The other place I've seen vector select_cc created is the expansion of fmin/fmax reductions via TargetLowering::createSelectForFMINNUM_FMAXNUM.

llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll
1069

Yeah these are the i50 tests I mentioned in the summary. It doesn't help the optics that the return <4 x i50> type is passed in scalars.
There was quite a lot of changes originally from the fallout of adding the expansion of fp_to_si_sat. I can add more to the conditions of the TLI.expandFP_TO_INT_SAT call, but then we are just trading smaller code in more common cases for some regressions in more obscure. It already felt like I was adding too many conditions. This patch was intended as a bugfix though, so perhaps it is better to keep it simpler.

Suggestions welcome.

I guess I had presumed a vector, as that seems to be what is used in practice. I hadn't really considered it though, just called DAG.getSetCC and DAG.getSelect with the existing operands. With getSetCCResultType from the result type, which would be vector.

Okay.

From what I remember - A vector SELECT_CC will Expand by unrolling. There is no specific expansion code. When I added some (to turn it into a getSetcc+getSelect) the code for some tests was worse than unrolling. (There is also some MVE errors about the condition types of float vectors without ). I could add expansion code that is only for scalable vectors.

Okay.

I think I'd like to have the expansion, just so it isn't a trap for anyone deciding using SELECT_CC in the future... we can just turn it on for scalable vectors, sure.

llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll
1069

Oh, hmm, the scalarized return is making it look worse than it actually is.

Maybe it's fine for now, then.

dmgreen updated this revision to Diff 445716.Jul 19 2022, 1:18 AM
dmgreen edited the summary of this revision. (Show Details)

This now adds SELECT_CC into VectorLegalizer::Expand for scalable vectors, expanding into a SETCC and a DAG.getSelect. The code in TargetLowering::expandFP_TO_INT_SAT is now unchanged, and an assert is added for checking the type of a vector select_cc condition is a vector.

I can change the expansion of FP_TO_INT_SAT to scalable vectors only if that is preferred to keep this patch a scalable-vector fix. Let me know.

I think I'd prefer to analyze the changes for non-scalable vectors more closely in a separate review; there are significant missed optimizations either way. In particular, it looks like we're saturating multiple times in some cases.

dmgreen updated this revision to Diff 446059.Jul 20 2022, 12:37 AM

OK - now only scalable vectors.

dmgreen edited the summary of this revision. (Show Details)Jul 20 2022, 12:38 AM
This revision is now accepted and ready to land.Jul 20 2022, 11:17 AM
This revision was landed with ongoing or failed builds.Jul 21 2022, 12:00 AM
This revision was automatically updated to reflect the committed changes.