This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Vector] Prevent AVX2 lowering for non-f32 transpose ops
Closed, Public

Authored by dcaballe on Feb 23 2022, 12:08 PM.

Details

Summary

The AVX2 lowering for transpose operations is only applicable to f32 vector types.

Diff Detail

Event Timeline

dcaballe created this revision. Feb 23 2022, 12:08 PM
dcaballe requested review of this revision. Feb 23 2022, 12:08 PM
aartbik accepted this revision. Feb 23 2022, 2:52 PM

Just curious if F32 data movement would not just work for any 32-bit entity, but change looks good to me.

Make sure to get presubmit green first before committing though.

This revision is now accepted and ready to land. Feb 23 2022, 2:52 PM

Thanks!

> Just curious if F32 data movement would not just work for any 32-bit entity, but change looks good to me.

The existing patterns currently use f32-specific instructions (note the Ps suffix):

Value t0 = mm256UnpackLoPs(ib, vs[0], vs[1]);
Value t1 = mm256UnpackHiPs(ib, vs[0], vs[1]);
Value t2 = mm256UnpackLoPs(ib, vs[2], vs[3]);
Value t3 = mm256UnpackHiPs(ib, vs[2], vs[3]);
Value t4 = mm256UnpackLoPs(ib, vs[4], vs[5]);
Value t5 = mm256UnpackHiPs(ib, vs[4], vs[5]);
Value t6 = mm256UnpackLoPs(ib, vs[6], vs[7]);
Value t7 = mm256UnpackHiPs(ib, vs[6], vs[7]);

It should be relatively easy to enable them for i32, since i32 and f32 elements have the same size.
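
To illustrate the same-size argument, here is a minimal scalar sketch of the lane movement performed by the unpack-lo/unpack-hi steps. It is not the MLIR lowering and does not use AVX2: Lanes8, unpackLo32, unpackHi32 and exampleI32 are hypothetical names introduced only for this sketch. It assumes the usual 256-bit behavior, where each 128-bit half is interleaved independently, and shows that the shuffle only moves 32-bit lanes around, so the same code works for float and int32_t.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar stand-in for a 256-bit register holding eight 32-bit lanes.
// Hypothetical type for illustration; not part of MLIR.
template <typename T>
using Lanes8 = std::array<T, 8>;

// Models the lane movement of an AVX "unpack lo" on 32-bit lanes:
// within each 128-bit half, interleave the two low lanes of a and b.
template <typename T>
Lanes8<T> unpackLo32(const Lanes8<T> &a, const Lanes8<T> &b) {
  static_assert(sizeof(T) == 4, "lanes must be 32 bits wide");
  return {a[0], b[0], a[1], b[1], a[4], b[4], a[5], b[5]};
}

// Models "unpack hi": within each 128-bit half, interleave the two
// high lanes of a and b.
template <typename T>
Lanes8<T> unpackHi32(const Lanes8<T> &a, const Lanes8<T> &b) {
  static_assert(sizeof(T) == 4, "lanes must be 32 bits wide");
  return {a[2], b[2], a[3], b[3], a[6], b[6], a[7], b[7]};
}

// Usage example: the same model applied to i32 lanes.
inline void exampleI32() {
  Lanes8<int32_t> a{0, 1, 2, 3, 4, 5, 6, 7};
  Lanes8<int32_t> b{10, 11, 12, 13, 14, 15, 16, 17};
  assert((unpackLo32(a, b) == Lanes8<int32_t>{0, 10, 1, 11, 4, 14, 5, 15}));
  assert((unpackHi32(a, b) == Lanes8<int32_t>{2, 12, 3, 13, 6, 16, 7, 17}));
}
```

Nothing in the model depends on how the 32 bits are interpreted, which is the basis for the i32 remark above. How the actual patterns would be extended (for example, through bitcasts or integer variants of the instructions) is outside the scope of this sketch.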

This revision was landed with ongoing or failed builds. Feb 25 2022, 11:30 AM
This revision was automatically updated to reflect the committed changes.