This is an archive of the discontinued LLVM Phabricator instance.

[x86] use more phadd for reductions
Closed, Public

Authored by spatel on Jul 15 2019, 10:13 AM.

Details

Summary

This is part of what is requested by PR42023:
https://bugs.llvm.org/show_bug.cgi?id=42023

There's an extension needed for FP add, but exactly how we would specify that using flags is not clear to me, so I left that as a TODO.
We're still missing patterns for partial reductions when the input vector is 256-bit or 512-bit, but I think that's a failure of vector narrowing. If we can reduce the widths, then this matching should work on those tests.
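
For illustration, the 128-bit integer case can be modeled in scalar C++. This is only a rough sketch, not code from the patch: `V4`, `hadd`, and `reduceAddV4` are hypothetical names, and `hadd` merely imitates the lane behavior of PHADDD on four 32-bit elements.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar stand-in for a 128-bit v4i32 register.
using V4 = std::array<std::uint32_t, 4>;

// Models PHADDD: sums of adjacent pairs of `a` fill the low half of the
// result, sums of adjacent pairs of `b` fill the high half.
V4 hadd(const V4 &a, const V4 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// Full add reduction of four lanes using log2(4) = 2 horizontal adds.
std::uint32_t reduceAddV4(const V4 &v) {
  V4 t = hadd(v, v); // {v0+v1, v2+v3, v0+v1, v2+v3}
  t = hadd(t, t);    // lane 0 now holds v0+v1+v2+v3
  return t[0];       // corresponds to extracting element 0
}
```

Unsigned lanes are used so that overflow wraps, as the integer instruction does.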

Event Timeline

spatel created this revision. Jul 15 2019, 10:13 AM
RKSimon added inline comments. Jul 15 2019, 11:14 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
35618

(style) early-out if this is not the case

llvm/test/CodeGen/X86/vector-reduce-add.ll
680

ideally we'd have these cases combine as ll

spatel marked an inline comment as done. Jul 15 2019, 4:51 PM
spatel added inline comments.
llvm/test/CodeGen/X86/vector-reduce-add.ll
680

'll' - do you mean alter/override the ExpandReductions pass in IR?

spatel updated this revision to Diff 210012. Jul 15 2019, 7:18 PM

Patch updated:
Early exit if wrong types or subtarget.

RKSimon added inline comments. Jul 16 2019, 1:07 AM
llvm/test/CodeGen/X86/vector-reduce-add.ll
680

sorry - "ideally we'd have these cases combine as well"

spatel updated this revision to Diff 210093. Jul 16 2019, 7:22 AM

Patch updated:
Allow 256-bit reductions by extracting and using 1 more 128-bit hop.
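
One plausible reading of the extra hop, modeled in scalar C++. This is a sketch under assumptions rather than the code in the diff: `V4`, `V8`, `hadd`, and `reduceAddV8` are hypothetical names, and `hadd` only imitates the lane behavior of PHADDD. The 256-bit input is split into two 128-bit halves, and a single horizontal add of the halves folds eight lanes into four before the usual two hops.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar stand-ins for 128-bit v4i32 and 256-bit v8i32 values.
using V4 = std::array<std::uint32_t, 4>;
using V8 = std::array<std::uint32_t, 8>;

// Models PHADDD: adjacent pairs of `a` go to the low half of the result,
// adjacent pairs of `b` go to the high half.
V4 hadd(const V4 &a, const V4 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// Add reduction of eight lanes: split into halves, then three horizontal adds.
std::uint32_t reduceAddV8(const V8 &v) {
  V4 lo = {v[0], v[1], v[2], v[3]}; // lower 128 bits
  V4 hi = {v[4], v[5], v[6], v[7]}; // upper 128 bits
  V4 t = hadd(lo, hi); // extra hop: {v0+v1, v2+v3, v4+v5, v6+v7}
  t = hadd(t, t);      // {v0..v3, v4..v7, v0..v3, v4..v7}
  t = hadd(t, t);      // lane 0 now holds the sum of all eight lanes
  return t[0];
}
```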

spatel updated this revision to Diff 210121. Jul 16 2019, 9:40 AM

Patch updated - no functional changes from the previous draft:

  1. Move local variable for NumElts closer to uses.
  2. Add TODO comment about handling bigger-than-256-bit types.

RKSimon accepted this revision. Jul 16 2019, 10:04 AM

LGTM - cheers

This revision is now accepted and ready to land. Jul 16 2019, 10:04 AM
This revision was automatically updated to reflect the committed changes.