This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX512] Instructions fixups
ClosedPublic

Authored by coby on Nov 17 2016, 7:05 AM.

Details

Summary

Some instructions were missing, other implemented falsely. this patch aims at amending those issues. full list:

vcvtps2pd
vcvtudq2pd
vcvtps2qq
vcvttps2qq
vcvtps2uqq
vcvttps2uqq

variants are:

[Dst]XMM(zero-masked/merge-masked/unmasked)
[Src]Mem64

Differential Revision: https://reviews.llvm.org/D26799

Diff Detail

Repository
rL LLVM

Event Timeline

coby updated this revision to Diff 78364.Nov 17 2016, 7:05 AM
coby retitled this revision from to [X86][AVX512] Instructions fixups.
coby updated this object.
coby set the repository for this revision to rL LLVM.
coby added a subscriber: llvm-commits.
craig.topper added inline comments.Nov 17 2016, 7:04 PM
lib/Target/X86/X86InstrAVX512.td
0–1

Shouldn't this be i64mem too? The only difference between VCVTDQ2PD and VCVTUDQ2PD is that one is signed and the other is unsigned. They should have the same load size.

coby added inline comments.Nov 18 2016, 12:17 AM
lib/Target/X86/X86InstrAVX512.td
0–1

Note we are altering (in this case - preserving) the AVX512(f,vl) variants alone, which are to load from an xmm-width memory locations (according to formal documentation).
quad-word instances are restricted to AVX/SSE.
OTOH, VCVTUDQ2PD, is indeed spec'd to load from a quad-word.

craig.topper edited edge metadata.Nov 18 2016, 8:05 AM

That has to be a mistake in the documentation. It seems very unlikely they would have changed the behavior between SSE/AVX and AVX-512. The description under the table in the Intel docs does mention 64-bit memory location.

"EVEX encoded versions: The source operand can be a YMM/XMM/XMM (low 64 bits) register, a 256/128/64-bit
memory location or a 256/128/64-bit vector broadcasted from a 32-bit memory location. The destination operand
is a ZMM/YMM/XMM register conditionally updated with writemask k1. Attempt to encode this instruction with EVEX
embedded rounding is ignored."

delena added inline comments.Nov 19 2016, 12:28 PM
lib/Target/X86/X86InstrAVX512.td
0–1

I also think that it should be i64mem for XMM-destination form and i128mem for YMM-destination form.
So, in this case I agree with Craig. The bug is in documentation.

coby added inline comments.Nov 20 2016, 12:15 AM
lib/Target/X86/X86InstrAVX512.td
0–1

Then i'll refer it as such. will make amends.

coby updated this revision to Diff 78650.Nov 20 2016, 12:21 AM
coby edited edge metadata.

vcvtdq2pd --> now sets to load from a i64mem.
added appropriate test cases.

delena accepted this revision.Nov 20 2016, 5:50 AM
delena edited edge metadata.
This revision is now accepted and ready to land.Nov 20 2016, 5:50 AM
coby updated this object.Nov 20 2016, 9:14 AM
coby edited edge metadata.
This revision was automatically updated to reflect the committed changes.