This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Allow to lower WHILEop operations with constant operands to PTRUE
ClosedPublic

Authored by dtemirbulatov on Nov 30 2022, 6:58 PM.

Details

Summary

This change allows to lower WHILEop operations with constant operands to PTRUE instructions, it also fixes issue with unsigned integer overflow for WHILELO support already commited in https://reviews.llvm.org/D137547.

Diff Detail

Event Timeline

dtemirbulatov created this revision.Nov 30 2022, 6:58 PM
dtemirbulatov requested review of this revision.Nov 30 2022, 6:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2022, 6:58 PM

Unified handling of WHILEop.

paulwalker-arm added inline comments.Dec 5 2022, 10:31 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4817–4828

This isn't quite what I had in mind when suggesting you can increase commonality. I had in mind a helper function that takes parameters like IsSigned and IsLess or perhaps even makes use of the existing AArch64CC::CondCode enum?

Addressed comments, added signed overflow handling while subtraction/addintion.

sdesmalen added inline comments.Dec 6 2022, 9:34 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4327

nit: Move ElementSize closer to its use.

4329

nit: Can you move these out into separate variables, e.g. X and Y?

4339

Should this Overflow test be moved below the if (IsOpEqualOrSame) { ... } condition?

4811–4950

Is it worth just moving all the behaviour into that function, and passing IsSigned, IsLess and IsEqual as parameters, such that you get:

case Intrinsic::aarch64_sve_whilelo:
  return optimizeWhile(Op.getOperand(1), Op.getOperand(2), /*IsSigned=*/false, /*IsLess=*/true, /*IsEqual=*/false);
case Intrinsic::aarch64_sve_whilels:
  return optimizeWhile(Op.getOperand(1), Op.getOperand(2), /*IsSigned=*/false, /*IsLess=*/true, /*IsEqual=*/true);
case Intrinsic::aarch64_sve_whilelt:
  return optimizeWhile(Op.getOperand(1), Op.getOperand(2), /*IsSigned=*/true, /*IsLess=*/true, /*IsEqual=*/false);
case Intrinsic::aarch64_sve_whilele:
  return optimizeWhile(Op.getOperand(1), Op.getOperand(2), /*IsSigned=*/true, /*IsLess=*/true, /*IsEqual=*/true);
...

Addressed remarks, avoided to check overflow while adding one to the range for equal or same conditions.

Thanks that's looking a lot better!

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4602

nit: add a newline before the if?

4605

I guess this can still overflow right? I think you had the right code for that before, it only needed moving the if (Overflow) return SDValue(); after the increment.

It would also be good to add a test for the overflow case (both signed and unsigned, and both 'less' and 'less or equal')

Matt added a subscriber: Matt.Dec 7 2022, 6:13 PM

Added tests for signed/unsigned overflow.

dtemirbulatov marked 2 inline comments as done.Dec 8 2022, 4:05 AM
This comment was removed by dtemirbulatov.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4605

With the increment case, I don't think we have to worry about overflow/underflow. it either world be 0 or too large/too small number that are not representable with PTRUE VL instruction.

dtemirbulatov marked an inline comment as done.Dec 8 2022, 4:06 AM
dtemirbulatov added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4605

With the increment case, I don't think we have to worry about overflow/underflow. it either world be 0 or too large/too small number that are not representable with PTRUE VL instruction.

Added overflow check for the increment.

sdesmalen added inline comments.Dec 13 2022, 1:25 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4605

nit: add newline above.

4609

You didn't add any tests for this case, so it's currently untested.

sdesmalen added inline comments.Dec 13 2022, 1:43 AM
llvm/test/CodeGen/AArch64/sve-intrinsics-while.ll
87

I guess this doesn't match:

ptrue p0.d, vl4

because NumActiveElems.getZExtValue() <= (MinSVEVectorSize / ElementSize) evaluates to false.

Can you clarify the reason these intrinsics don't fold in the name of these tests (e.g. whilele_d_ii_dont_fold_to_ptrue_larger_than_minvec) and/or add a comment to explain it?

101

Same question here: rename this to whilele_b_ii_dont_fold_to_ptrue_nonexistent_vl9 or something?

121

same question about renaming this test, e.g. whilele_b_ii_dont_fold_to_ptrue_overflow ?

257

For whilelo (unsigned), this is the same as -1, and is therefore not much different than the test above.
I would suggest keeping this test (because it's unsigned) and removing the test above

385

For whilels (unsigned), this is the same as -1, and is therefore not much different than the test above.
I would suggest keeping this test (because it's unsigned) and removing the test above.

dtemirbulatov marked 5 inline comments as done.

Addressed suggestions.

dtemirbulatov marked 2 inline comments as done.

Missed two comments, addressed those as well.

sdesmalen added inline comments.Dec 14 2022, 9:31 AM
llvm/test/CodeGen/AArch64/sve2-intrinsics-while.ll
141

Similar comment as for whilegt, it's better to pick some numbers that actually end up as a valid vl if you remove the overflow check, e.g.

@llvm.aarch64.sve.whilege.nxv16i1.i32(i32 -2147483641, i32 2147483647)

245

Similar comment as for whilehi, this doesn't overflow. You could use

@llvm.aarch64.sve.whilehs.nxv16i1.i32(i32 6, i32 4294967295)    ; this would wrap around to vl8
374

It's better to pick some numbers that actually end up as a valid vl if you remove the overflow check, e.g.

@llvm.aarch64.sve.whilegt.nxv16i1.i32(i32 -2147483641, i32 2147483647)

Can you use similar numbers whilege?

449

This doesn't overflow right? 4294967295 - 6 => 4294967289 (no overflow here).

For whilehi, it would overflow if the start value would be lower than the end value, e.g.

@llvm.aarch64.sve.whilehi.nxv16i1.i32(i32 7, i32 4294967295)    ; this would wrap around to vl8
dtemirbulatov marked 4 inline comments as done.

Fixed remarks.

sdesmalen accepted this revision.Dec 16 2022, 3:54 AM
This revision is now accepted and ready to land.Dec 16 2022, 3:54 AM
This revision was landed with ongoing or failed builds.Dec 17 2022, 5:28 PM
This revision was automatically updated to reflect the committed changes.