This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX2] Missing AVX2 memory folding instructions
ClosedPublic

Authored by RKSimon on Feb 8 2015, 1:02 PM.

Details

Summary

Added most of the missing vector folding patterns for AVX2 (as well as fixing the vpermpd and verpmq patterns)

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 19553.Feb 8 2015, 1:02 PM
RKSimon retitled this revision from to [X86][AVX2] Missing AVX2 memory folding instructions.
RKSimon updated this object.
RKSimon edited the test plan for this revision. (Show Details)
RKSimon added reviewers: qcolombet, mkuper, andreadb, spatel.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: Unknown Object (MLST).
mkuper edited edge metadata.Feb 9 2015, 4:57 AM

What bothers me a bit is that all of the pre-existing broadcast instructions have TB_NO_REVERSE, and the new ones don't.
Unfortunately I can't remember *why* they have TB_NO_REVERSE.

Nadav, do you remember, by any chance?

mkuper added a subscriber: nadav.Feb 9 2015, 4:58 AM

What bothers me a bit is that all of the pre-existing broadcast instructions have TB_NO_REVERSE, and the new ones don't.
Unfortunately I can't remember *why* they have TB_NO_REVERSE.

I believe its because the VBROADCASTS{SD}rr register instructions were an AVX2 addition while the VBROADCASTS{SD}rm memory instructions were available from AVX1 - so it prevents unfolding from introducing an illegal instruction on AVX1 targets. The VPBROADCAST instructions are all AVX2 instructions so don't need an equivalent limitation.

Nadav might be able to confirm?

mkuper accepted this revision.Feb 9 2015, 10:34 PM
mkuper edited edge metadata.

Ah, right!
Thanks for the patch and explanation, Simon.

LGTM.

This revision is now accepted and ready to land.Feb 9 2015, 10:34 PM

Thanks Michael + Nadav - I should add that I don't know why the AVX-512 broadcast instructions have set TB_NO_REVERSE - I'm assuming its just a copy + paste error but I don't know enough about the ISA to recommend its removal.

This revision was automatically updated to reflect the committed changes.