This is an archive of the discontinued LLVM Phabricator instance.

[X86] Emit native IR for pmuldq/pmuludq builtins.
ClosedPublic

Authored by craig.topper on Apr 8 2018, 6:28 PM.

Details

Summary

I believe all the pieces are now in place in the backend to make this correctly. We can truncate the vXi64 type to vXi32, extend it back up to the original width and multiply.

In the backend the truncate+extend will becomes sign_extend_inreg/zero_extend_inreg(really an and). Then those will be combined with the mul to PMULDQ/PMULUDQ. Then SimplifyDemandedBits will strip the sign_extend_inreg/zero_extend_inreg out.

The only question I have is whether its ok to emit the v2i32 intermediate type for the 128-bit version. I wasn't sure of any examples where we use an illegal type in our intrinsic/builtin handling. At least not a narrower type. I know pavg uses a wider type.

I think I could probably do this all in the header file using __builtin_convertvector if that's desired.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Apr 8 2018, 6:28 PM
spatel added a comment.Apr 9 2018, 6:47 AM

The only question I have is whether its ok to emit the v2i32 intermediate type for the 128-bit version. I wasn't sure of any examples where we use an illegal type in our intrinsic/builtin handling. At least not a narrower type. I know pavg uses a wider type.

I don't know of any precedence at this level, but we created illegal scalar int types (i128/i256) as part of memcmp expansion knowing that we'd match and combine those specific patterns in the DAG for x86. I figured that as long as we take responsibility for handling the illegal types, it's ok to do that...nobody has complained so far. :)

IIRC the SSE pmovsx/pmovzx generic implementations do this?

Yes. @RKSimon is correct. pmovzx/pmovsx do use illegal types already.

spatel added inline comments.Apr 9 2018, 10:01 AM
test/CodeGen/sse2-builtins.c
7 ↗(On Diff #141577)

There should be matching codegen tests for the new IR patterns here or have they moved?

Yes. I'll make the llvm changes before committing this. Just wanted to make sure this direction was ok first.

Yes. I'll make the llvm changes before committing this. Just wanted to make sure this direction was ok first.

Ah, seems ok then. But instcombine is going to turn these casts into 'and' or 'shl+ashr', right? Shouldn't clang produce those patterns directly? More efficient and no need to toe the illegal type line.

Use shifts or and to match what InstCombine will do. This sidesteps the illegal type question.

spatel accepted this revision.Apr 9 2018, 11:38 AM

LGTM (assuming the backend gets this right in all cases and we have tests for that).

This revision is now accepted and ready to land.Apr 9 2018, 11:38 AM
This revision was automatically updated to reflect the committed changes.