This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE][Fixed length] Fix div miscompile
ClosedPublic

Authored by peterwaller-arm on Dec 8 2022, 3:34 AM.

Details

Summary

The prior code worked before SVE DIV was enabled 128 bit vectors.
With 128 bit vectors, when run on a 256 bit machine, it would split and
do a signed unpack, but this resulted in one full vector and one empty
vector with a half-sized predicate. The effect was that only half the
elements were treated correctly.

The fix is to bisect the vector, sign extend, do the division, truncate
and then concat.

Fixes #59357.

Diff Detail

Event Timeline

peterwaller-arm created this revision.Dec 8 2022, 3:34 AM
Herald added a project: Restricted Project. · View Herald Transcript
peterwaller-arm requested review of this revision.Dec 8 2022, 3:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 8 2022, 3:34 AM
sdesmalen added inline comments.Dec 8 2022, 6:31 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22640–22650

nit: This comment doesn't directly relate to BisecExtendVector

22641

nit: ExtractLoHi ?

peterwaller-arm marked an inline comment as done.Dec 8 2022, 6:50 AM
peterwaller-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22640–22650

If this comment were moved up one line and was followed by a blank, would that fix the nit?

22641

I'm using "BisectExtend" as a compound verb to add more information about the intent of the operation (halving, extending). The knowledge it's extracting lo/hi is already available from the other operations occurring within the function.

Matt added a subscriber: Matt.Dec 8 2022, 7:00 AM
sdesmalen added inline comments.Dec 8 2022, 7:26 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22640–22650

Maybe better to move it to the block below where it uses the lambda, because that's where it will perform this sequence of operations.

22641

Specifically the 'Bisect' part of the name confused me more than it helped me. Perhaps ExtractAndExtendLoHi is more clear?

Tested this on the bot (which is A64FX) and it fixes the scal-to-vec1.c GCC torture test failure for all optimisation levels.

peterwaller-arm marked an inline comment as done.
  • Address review comments: Move comment down and rename BisectExtendVector => HalveAndExtendVector.
paulwalker-arm accepted this revision.Dec 12 2022, 2:06 AM
This revision is now accepted and ready to land.Dec 12 2022, 2:06 AM
This revision was landed with ongoing or failed builds.Dec 12 2022, 3:33 AM
This revision was automatically updated to reflect the committed changes.