This is an archive of the discontinued LLVM Phabricator instance.

[SVE] Lower fixed length VECREDUCE_[FMAX|FMIN] to Scalable
ClosedPublic

Authored by cameron.mcinally on Sep 28 2020, 1:10 PM.

Details

Summary

This patch is pretty similar to the fixed length integer reductions, with some notes:

  1. The OverrideNEON flag isn't so straight forward now. But I don't see a cleaner solution. Any thoughts on this?
  1. There's no vector of f16 NEON instructions, so use SVE for them.

Diff Detail

Event Timeline

cameron.mcinally requested review of this revision.Sep 28 2020, 1:10 PM

Also, while I have everyone's attention, there are a number of unhandled vector reduction intrinsics with SVE support. Do we want to add lowerings for those? E.g. ANDV.

Also, while I have everyone's attention, there are a number of unhandled vector reduction intrinsics with SVE support. Do we want to add lowerings for those? E.g. ANDV.

Yes please.

paulwalker-arm added inline comments.Sep 29 2020, 3:50 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
9667–9669

There are f16 MAX/MIN reduction instructions for NEON, it's just they're an optional v8.2 extension. However, when SVE is implemented the extension is mandatory so we can rely on them.

I did a quick test using your tests against master and they just worked.

llvm/test/CodeGen/AArch64/sve-fixed-length-fp-reduce.ll
199

What's going on here?

211

And here?

cameron.mcinally marked 3 inline comments as done.

Address reviews...

llvm/test/CodeGen/AArch64/sve-fixed-length-fp-reduce.ll
199

Fatigue error. Botched the CHECK line copy-and-paste and missed it. Sorry about that.

llvm/test/CodeGen/AArch64/sve-fixed-length-fp-reduce.ll
199

Pffff. Looks like this copy-and-paste problem has history. Correcting other tests now...

llvm/test/CodeGen/AArch64/sve-fixed-length-fp-reduce.ll
199

Fixed with 01c95f79424d.

paulwalker-arm added inline comments.Sep 29 2020, 9:03 AM
llvm/test/CodeGen/AArch64/sve-fixed-length-fp-reduce.ll
69

1024

131

1024

142

2048

259

1024

322

1024

333

2048

Update more typos...

cameron.mcinally marked 5 inline comments as done.Sep 29 2020, 9:13 AM

Sorry again, Paul. Still looking at how far this propagated. Looks like it was introduced with the VECREDUC_ADD patch. Need some more time...

Ok, I think that's all of them. Looks like it started with D87796 and was buried in other changes. To confirm:

<scrubbed> CodeGen/AArch64> grep -rn VBITS_GE_1048 *
<scrubbed> CodeGen/AArch64> grep -rn VBITS_GE_2086 *
<scrubbed> CodeGen/AArch64> grep -rn VBITS_GE_2096 *
<scrubbed> CodeGen/AArch64>

Sorry again. I shouldn't have done that.

paulwalker-arm accepted this revision.Sep 29 2020, 9:45 AM

@cameron.mcinally No worries, I clearly didn't do the best job reviewing that patch either.

This revision is now accepted and ready to land.Sep 29 2020, 9:45 AM