Page MenuHomePhabricator

[llvm][AArch64][SVE] Fold literals into math instructions
Needs ReviewPublic

Authored by DavidTruby on Mar 22 2021, 6:15 AM.



SVE has predicated literal forms of some instructions for specific
literals, which currently are generated correctly when using ACLE
but not when those instructions are generated directly.

This adds the patterns to generate those instructions when
generating from standard LLVM IR instructions.

Co-authored-by: Sander De Smalen <>
Co-authored-by: Cullen Rhodes <>

Diff Detail

Event Timeline

DavidTruby created this revision.Mar 22 2021, 6:15 AM
DavidTruby requested review of this revision.Mar 22 2021, 6:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2021, 6:15 AM
david-arm added inline comments.

Hi @DavidTruby, this is quite a lot of very similar looking patterns to maintain here. Is there a possible way to condense them somehow, i.e. the A and B patterns are duplicates, so you could invoke a multiclass for A, then a multiclass for B?

Added extra tests for fmax, fmaxnm, fmin, fminnm

Refactored patterns into helper multiclass

DavidTruby added inline comments.Mar 29 2021, 7:28 AM

@david-arm hopefully this helps a bit, it at least means that each pattern is only mentioned once so only need changing once if they have to be changed in future

joechrisellis added inline comments.Apr 6 2021, 3:28 AM

nit: really small thing but the naming here is inconsistent with fpimm0. I would expect fpimm1 and fpimm2. I suppose the difficulty here is shoehoring fpimm_half into this naming convention. If you can think of a sensible way of doing this, I would prefer we change this. Otherwise, ignore me. :-)


General test suggestion -- prefer:

target triple = "aarch64-unknown-linux-gnu"


attributes #0 = { "target-features"="+sve" }

... and tag your functions with #0. I personally find this to be a bit more reliable when you're running llc on this file. 😄

Also might be worth adding:

; RUN: ... 2>%t ...
; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t

Please can we add:

; If this check fails please read test/CodeGen/AArch64/README for instructions on how to resolve it.
; WARN-NOT: warning
DavidTruby updated this revision to Diff 335785.Apr 7 2021, 5:47 AM

Moved the zeroing predicates behind the UseExperimentalZeroingPseudos feature
Added additional tests for one of the patterns
Addressed misc reivew comments

paulwalker-arm added inline comments.Mon, Apr 19, 3:37 AM

These will need to be split in two because the the UNDEF forms are safe to use by default but the ZERO forms must be protected by UseExperimentalZeroingPseudos. You can see how we've done this for FADD_ZPZZ where there exists two multiclasses, sve_fp_bin_pred_hfd and sve_fp_2op_p_zds_zeroing_hsd.


My immediate thought is whether these (or at least some of these) be rolled into the instruction/pseudo multiclasses. By this I mean can sve_fp_2op_i_p_zds be extended with extra parameters so the necessary intrinsic patterns can be added.

Likewise for the multiclasses used to create the UNDEF and ZERO psuedos, can the multiclasses themselves contain the IR patterns plus the vselect variants.


Just a note to say this and the above RUN: ... 2>%t request are no longer necessary because llc will now error in these circumstances.


Minor point but attributes are normally placed after the function definitions as that where the IR printer would put them.


Please use to generate the CHECK lines.

Matt added a subscriber: Matt.Tue, May 4, 9:46 AM
DavidTruby updated this revision to Diff 343020.Wed, May 5, 6:01 AM

Separated zeroing forms and regenreated check lines

This comment was removed by DavidTruby.

Fixed accidentally overwriting this with a different patch...