This is an archive of the discontinued LLVM Phabricator instance.

[SVE ACLE] Implement IR combines to convert intrinsics used for _m C/C++ builtins
ClosedPublic

Authored by jolanta.jensen on Jun 2 2023, 9:50 AM.

Details

Summary

This patch implements IR combines to convert intrinsics used for _m C/C++ builtins
which take an all active predicate to their equivalent _u intrinsic.

Diff Detail

Event Timeline

jolanta.jensen created this revision.Jun 2 2023, 9:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2023, 9:50 AM
jolanta.jensen requested review of this revision.Jun 2 2023, 9:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2023, 9:50 AM
Matt added a subscriber: Matt.Jun 2 2023, 3:35 PM
paulwalker-arm added inline comments.Jun 5 2023, 5:11 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1399–1418

Rather than have nested switch statements perhaps you can pass the new IntrinsicID into this function?

Since both this function and instCombineSVEAllActive3VA don't change the original call's operands, I think you should be able to make use of II.args(). I'm not sure if there's a variant of CreateIntrinsic that'll take this directly but if not, you should be able to create a SmallVector from it and pass that to CreateIntrinsic.

1406–1407

InstCombiner comes with its own IRBuilder already set up. It should be as simple as replacing your new Builder.Create...lines with IC.Builder.Create.... You'll see the existing SVE functions have been updated similarly.

Review comments addressed.
Corrected a bug where arguments to fused intrinsics were unnecessarily swapped.

Still missing conversions for mul, fmul, add, fadd, sub and fsub to their _u variants.

jolanta.jensen added inline comments.Jun 6 2023, 10:05 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1399–1418

Fixed. So much better now :)

1406–1407

Changed to InstCombiner own IRBuilder.

mgabka added a comment.Jun 7 2023, 1:38 AM

I think it would be good to explain in the commit message why conversion to _u is a good thing, otherwise it is not clear why we are doing it.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1406–1410

this isn't efficient (as you are copying element by element and doing unnecessary clear), and it is a lot of line of code, SmallVector has a dedicated constructor which can take an iterator_range and create a vector from it, please use it instead.

1783–1842

would be good if the order of intrinsics here aligned with the order in the test file so it would be easier to check that we have full coverage

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-m-to-x.ll
1 ↗(On Diff #528915)

I am not a big fan of the name of this file, it suggests that there should be intrinsics with _x or _m suffix being used but there no such ones.

I think it would also be good to sort the test in alphabetical order by the name of the intrinsic being tested, that would make navigation through this long file easier.

jolanta.jensen added inline comments.Jun 7 2023, 8:55 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1783–1842

I think they are. But I could make a mistake, I'll check.

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-m-to-x.ll
1 ↗(On Diff #528915)

_x and _m suffixes were inspired by acle docs. I happily take suggestions to rename the file.

Shall the alphabetical sorting be for the whole file or within the sections, i.e. within Float arithmetics, Integer arithmetics, Shifts and Logical operations?

mgabka added inline comments.Jun 7 2023, 9:36 AM
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-m-to-x.ll
1 ↗(On Diff #528915)

both options are fine, I think the one you proposed sounds better.

Yeah I know from where the _x and _m is coming, I just think that in this case it would be easier to use file name like sve-intrinsics-combine-to-u-forms.ll or something simialar

Review comments addressed.
Bug fix for not propagating fast math flags and attributes.

jolanta.jensen added inline comments.Jun 8 2023, 10:03 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1406–1410

Fixed.

1783–1842

Should be now in the same order as in the test file. And the test file is renamed and alphabetically ordered within the sections.

1841

mul and mul_u execute identical code. Please advise if this can be solved some better way.

Minor refactoring.
All IR combines are now implemented.

paulwalker-arm added inline comments.Jun 13 2023, 5:01 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1408–1411

Having to manually choose which flags and attribute to copy seems fragile. The current set doesn't include things like fast math flags (as can be seen from sve-intrinsics-combine-to-u-forms.ll: replace_fabd_intrinsic_half) and debug metadata.

This instcombine is essentially changing the name of the function called with everything else expected to remain as was so I think you should be able to use II.setCalledFunction(), which takes the new function declaration but will leave everything else (operands, flags, attributes...) as they are.

NOTE: This means you're not creating a new call node thus removes the need to call replaceInstUsesWith().
1785

I think my setCalledFunction() suggestion hopefully means this will no longer be required.

1787–1789

For this and other cases can you try to push the logic into the main function of the operation. So in this case it would be better to have all the add related combines within instCombineSVEVectorAdd.

Refactoring after review comments.
Bug fix where flags and attributes were not propagated correctly.

jolanta.jensen added inline comments.Jun 15 2023, 7:28 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1408–1411

Fixed. And the test corrected.

1785

It is no longer required. Removed.

1787–1789

Fixed.

mgabka added inline comments.Jun 15 2023, 7:29 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1283

looks like it is not used anymore so can be removed

mgabka added inline comments.Jun 15 2023, 8:42 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1291

I think the preference is to use auto

1292

I think the preference is to use auto

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-combine-to-u-forms.ll
67

I have a question here, would it make more sense to have a separate test file for the cases where intrinsic is combined to a LLVM instructions?
or maybe worth to add a comment to clarify why in this case we do not expect _u intrinsic?

paulwalker-arm added inline comments.Jun 15 2023, 8:56 AM
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-combine-to-u-forms.ll
67

These tests do exercise the new code so I think they belong in this file. Adding a comment to acknowledge the perhaps unexpected output sounds reasonable.

Addressed review comments.

jolanta.jensen added inline comments.Jun 16 2023, 6:19 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1283

Removed.

1291

Changed to auto.

1292

Changed to auto.

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-combine-to-u-forms.ll
67

Added comments for fadd, fmul and fsub.

paulwalker-arm added inline comments.Jun 16 2023, 8:41 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1283

Please add a function description comment along the lines of "Canonicalise operations that take an all active predicate (e.g. sve.add -> sve.add_u).".

1302

I guess this test makes the functions easier to use. Perhaps move the test into instCombineSVEAllActive so the slight weirdness is hidden?

1353–1357

I'm pretty sure it's not safe to move this code here because instCombineSVEVectorAdd is called for both sve.add and sve.add_u. If you consider:

sve.add(a, sve.mul.u(b, c))

here the inactive lanes of the result are defined to come from a. However this code will combine the IR into:

sve.mla.u(a, b, c)

where the result for inactive lanes will be undefined. This is why the combine originally lived outside of instCombineSVEVectorAdd and must remain outside after this work.

This also means https://reviews.llvm.org/D150768 has introduced a bug that I missed during review.

1395–1399

Same comment as with instCombineSVEVectorAdd.

1765–1767

Based on my comment above, I think this code has to say here and you'll want to add a call to instCombineSVEAllActive.

1775–1777

As above.

paulwalker-arm added inline comments.Jun 18 2023, 3:46 PM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1353–1357

I've committed https://reviews.llvm.org/rGb7287a82d33b to fix the bug introduced by D150768.

Rebase on top of rGb7287a82d33b.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1283

Added.

1302

If I move the test to instCombineAllActive() I will not know if the returned intrinsic is the renamed one or the same _u intrinsic I got via the argument list here. If I already got an _u intrinsic I want to execute the remaining code in this function, not to return the intrinsic. But I removed this test as of now, instCombineVectorAdd() will never be called for an _u intrinsics. However I still have this check in instCombineSVEVectorMul(), which is called with both _u and non _u intrinsics. Happy for any suggestions how it could be rewritten.

1353–1357

Right, I now recall a comment in https://reviews.llvm.org/D144413 about the issue. Corrected by rebase on top of rGb7287a82d33b

1353–1357

Thank you!

1395–1399

Corrected by rebase on top of rGb7287a82d33b

1765–1767

Corrected by rebase on top of rGb7287a82d33b

1775–1777

Fixed by rebase on top of rGb7287a82d33b.

paulwalker-arm added inline comments.Jun 19 2023, 9:50 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1302

Thanks for investigating. As you say, the refactoring means there's now only a single instance so not worth worrying about.

mgabka added inline comments.Jun 20 2023, 2:13 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1292

to me this does not look right, what are you trying to do here?
why not use getModule function?

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-combine-to-u-forms.ll
12

type, should be "replace with"

14

to me the singular form is more correct in this context

727

to me the singular form is more correct in this context

2013

if these are SVE2 instructions (and they are implemented like that currently in LLVM) then the attached attribute to these functions isn't correct, it should have sve2 attribute attached otherwise those intrinsics can not be code generated. you can try to run llc on this file and check it yourself , you will observe a compiler crash with the current version.

Addressed review comments.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1292

Changed to getModule(). It's not a function defined in llvm::IntrinsicInst but inherited and I did not find it looking for something suitable. So I reused existing snippet of code.

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-combine-to-u-forms.ll
12

Fixed.

14

Fixed.

727

Fixed.

2013

Added +sve2 to the attributes.

paulwalker-arm accepted this revision.Jun 20 2023, 7:04 AM

A minor test issue to fix, but otherwise looks good.

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-combine-to-u-forms.ll
254

Should this be fmaxnm to match the function name?

This revision is now accepted and ready to land.Jun 20 2023, 7:04 AM

Minor test correction after review comment.

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-combine-to-u-forms.ll
254

Yes, it should. Corrected.

Via Conduit