This is an archive of the discontinued LLVM Phabricator instance.

[X86][intrinsics] lower _mm[256|512]_mask[z]_set1_epi[8|16|32|64] intrinsic to IR
ClosedPublic

Authored by jina.nahias on Sep 10 2017, 4:34 AM.

Event Timeline

jina.nahias created this revision.Sep 10 2017, 4:34 AM

Reduce the number of code lines in /lib/IR/AutoUpgrade.cpp.

craig.topper edited edge metadata.Sep 11 2017, 9:27 AM

This also seems to be only a partial diff now.

craig.topper added inline comments.Sep 12 2017, 12:01 AM
lib/IR/AutoUpgrade.cpp
75–79

Version number should be 6.0

RKSimon edited edge metadata.Sep 12 2017, 2:48 AM

If you could start the relevant -fast-isel test files (at least for these cases) it'd be much appreciated - I've never found the strength of will to start testing all the avx512 instructions...

There is a problem: intrinsics that use 64-bit masks (e.g. *epi8) produce suboptimal asm (hundreds of lines) on a 32-bit machine when run with the fast-isel flag. I am working on a solution and will post it in another patch.

Do we have no test coverage of x86_avx512_mask_pbroadcast_q_mem_512? Still worth adding an auto-upgrade path?

@RKSimon - added a new test and an upgrade path in AutoUpgrade.

What's going on with vselect-packss.ll ?

It was a diff from an older commit. Now it's OK.

craig.topper added inline comments.Sep 14 2017, 3:15 PM
test/CodeGen/X86/avx512-intrinsics-upgrade.ll
2

Can you add a 32-bit mode run line here? I want to make sure we handle this well.

54

The pbroadcast.q.mem.512 intrinsic seems to have never worked. We don't need to add a test or autoupgrade for it if that's the case.

test/CodeGen/X86/avx512-intrinsics.ll
4226

We really need a 32-bit run line on this file. This test case throws an unreachable on trunk today.

test/CodeGen/X86/avx512bw-intrinsics-fast-isel.ll
8

Any idea why this generated such terrible code?

jina.nahias added inline comments.Sep 17 2017, 10:59 PM
test/CodeGen/X86/avx512bw-intrinsics-fast-isel.ll
8

There is a problem: intrinsics that use 64-bit masks (e.g. *epi8) produce suboptimal asm (hundreds of lines) on a 32-bit machine when run with the fast-isel flag. I am working on a solution and will post it in another patch.
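For readers unfamiliar with these intrinsics, here is a minimal scalar sketch of what a masked set1 on 64 byte lanes computes, and of why a 64-bit mask is awkward on a 32-bit target. It is not code from this patch: the function names `mask_set1_epi8_model` and `mask_set1_epi8_split` are hypothetical, and the split into two 32-bit halves is an assumption about how a 32-bit machine has to carry the mask in general-purpose registers, not a description of the actual fast-isel output.

```cpp
#include <array>
#include <cstdint>

// Hypothetical reference model, not LLVM code: lane i receives the broadcast
// value `a` when bit i of `mask` is set, otherwise it keeps src[i].
// This is a per-lane select between a splat of `a` and the pass-through vector.
std::array<int8_t, 64> mask_set1_epi8_model(const std::array<int8_t, 64>& src,
                                            uint64_t mask, int8_t a) {
    std::array<int8_t, 64> out{};
    for (int i = 0; i < 64; ++i)
        out[i] = ((mask >> i) & 1) ? a : src[i];
    return out;
}

// Same computation with the mask supplied as two 32-bit halves (assumed here
// to mirror a target that has no 64-bit general-purpose register).
std::array<int8_t, 64> mask_set1_epi8_split(const std::array<int8_t, 64>& src,
                                            uint32_t lo, uint32_t hi, int8_t a) {
    std::array<int8_t, 64> out{};
    for (int i = 0; i < 32; ++i) {
        out[i]      = ((lo >> i) & 1) ? a : src[i];            // lanes 0..31
        out[i + 32] = ((hi >> i) & 1) ? a : src[i + 32];       // lanes 32..63
    }
    return out;
}
```

Passing an all-zero `src` gives the zero-masking (maskz) behaviour in the same model.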

delete broadcast_mem test

jina.nahias added inline comments.Sep 18 2017, 5:14 AM
test/CodeGen/X86/avx512-intrinsics-upgrade.ll
2

The code currently generated for 32-bit is not optimal. 32-bit needs some more work, which will come in a follow-up patch.

jina.nahias added inline comments.Sep 18 2017, 5:16 AM
test/CodeGen/X86/avx512-intrinsics.ll
4226

It still throws an unreachable after my changes.

This revision is now accepted and ready to land.Sep 18 2017, 9:27 AM
This revision was automatically updated to reflect the committed changes.