This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add f16 fpimm patterns
ClosedPublic

Authored by RosieSumpter on Jul 18 2022, 1:47 AM.

Details

Summary

Depends on D129623

This patch recognizes f16 immediates as legal and adds the necessary
patterns. This allows the fadda folding introduced in D129623 to be
applied to the f16 cases.

Diff Detail

Event Timeline

RosieSumpter created this revision.Jul 18 2022, 1:47 AM
Herald added a project: Restricted Project. · View Herald Transcript
RosieSumpter requested review of this revision.Jul 18 2022, 1:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2022, 1:47 AM
  • Removed comment from sve-fadda-select.ll test
dmgreen added inline comments.Jul 18 2022, 4:31 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1129

I think if this is added we need to add extra lowering for FP16 constants of any kind, and it probably shouldn't be guarded by hasSVE but done in general.

Any i16 constant can be materialized with a single mov, so the worst case is that plus a gpr->fpr, which is likely better than the load we currently produce.
https://godbolt.org/z/r59arn5bq

Matt added a subscriber: Matt.Jul 18 2022, 11:31 AM
RosieSumpter retitled this revision from [AArch64][SVE] Recognize splats of f16 -0.0 for fadda to [AArch64] Add f16 fpimm patterns.
RosieSumpter edited the summary of this revision. (Show Details)
  • Updated title and summary
  • Made f16 immediates legal for hasFullFP16 rather than just SVE
  • Added f16 fpimm patterns
  • Updated tests
RosieSumpter added inline comments.Jul 20 2022, 9:04 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1129

Hi Dave, thanks for having a look. I've had a go at doing this - let me know if it looks alright.

Yeah seems OK as far as I can tell.

llvm/lib/Target/AArch64/AArch64InstrInfo.td
1629 ↗(On Diff #446167)

COPY's are quite loose in their definition, but it is probably best to keep them copying between values of the same size. I think this can use FMOVWHr directly, so long as the instruction has already been defined.
It should probably have a: let Predicates = [HasFullFP16] too.

paulwalker-arm added inline comments.Jul 21 2022, 3:18 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
1629 ↗(On Diff #446167)

Is HasFullFP16 relevant? The lowering requires no special instructions so I'd even argue the setOperationAction call requires no protection. What's important is whether the f16 type is legal. Once we get to operation lowering and isel we only need to protect instructions not types.

We made this mistake for bf16 for SVE and it just lead to unnecessary complexity which was eventually removed.

paulwalker-arm added inline comments.Jul 21 2022, 3:48 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
1629 ↗(On Diff #446167)

I've checked and FMOVWHr does require HasFullFP16 so the question becomes, should we allow good code generation even for the case when HasFullFP16 is not available? Or is HasFullFP16 so universally available these days for us not to care?

dmgreen added inline comments.Jul 21 2022, 3:49 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
1629 ↗(On Diff #446167)

FMOVWHr requires FullFP16. It could use FMOVWSr instead, but then the types are a little awkward.

The call to setOperationAction is currently inside a HasFullFP16 block, so It could be moved out and we could always be using generic instructions. If we do go that route then make sure that we have tests for it.

RosieSumpter marked an inline comment as done.
  • Use FMOVWHr instead of COPY_TO_REGCLASS
  • Add let Predicates = [HasFullFP16]
  • Removed code to copy from GPR32 to FPR16
  • Updated a couple of recently added tests
dmgreen accepted this revision.Jul 22 2022, 2:54 AM

LGTM. Thanks.

I've checked and FMOVWHr does require HasFullFP16 so the question becomes, should we allow good code generation even for the case when HasFullFP16 is not available? Or is HasFullFP16 so universally available these days for us not to care?

I think it's fine for this patch. It's a nice general improvement for when fp16 is available, and fixes the original issue of recognizing scalable splat constants.

This revision is now accepted and ready to land.Jul 22 2022, 2:54 AM
This revision was landed with ongoing or failed builds.Jul 25 2022, 1:12 AM
This revision was automatically updated to reflect the committed changes.