This is an archive of the discontinued LLVM Phabricator instance.

[X86] Fix bug in masked compress
ClosedPublic

Authored by aymanmus on Aug 29 2016, 1:16 AM.

Details

Summary

From the description of the x86 masked compress instruction:
"The opmask register k1 selects the active elements (partial vector or possibly non-contiguous if less than 16 active
elements) from the source operand to compress into a contiguous vector. The contiguous vector is written to the
destination starting from the low element of the destination operand.
Memory destination version: Only the contiguous vector is written to the destination memory location. EVEX.z
must be zero.
"
Which means that the instruction leaves the upper part of the memory untouched.
The current pattern assumes that the instruction pads the rest of the space with zeros and thus replace the select with ImmAllZeros with the masked compress instruction.

Diff Detail

Event Timeline

aymanmus updated this revision to Diff 69536.Aug 29 2016, 1:16 AM
aymanmus retitled this revision from to [X86] Fix bug in masked compress.
aymanmus updated this object.
aymanmus added reviewers: delena, mkuper, igorb.
aymanmus added a subscriber: llvm-commits.
delena added inline comments.Aug 29 2016, 1:35 AM
test/CodeGen/X86/compress-maskz.ll
10 ↗(On Diff #69536)

I don't see any store instruction in your check. Could you, please, use the "update_" utility to generate the full picture.

igorb added inline comments.Aug 29 2016, 1:52 AM
lib/Target/X86/X86ISelLowering.cpp
0–1

Chain should be changed to LoadAddress

test/CodeGen/X86/compress-maskz.ll
1 ↗(On Diff #69536)

could you please move the test to avx512vl-intrinsics.ll

aymanmus updated this revision to Diff 69553.Aug 29 2016, 4:32 AM
  • Fixed chains in load/store instructions
  • Moved test to avx512vl-intrinsics.ll
aymanmus updated this revision to Diff 69555.Aug 29 2016, 4:47 AM
mkuper edited edge metadata.Aug 30 2016, 2:03 PM

I'll let Igor/Elena review the logic, just a couple of comments on style.

lib/Target/X86/X86ISelLowering.cpp
1

Why is this a dyn_cast? if DAG.getLoad() is guaranteed to return a LoadSDNode (and I'd assume it is), you want a cast<>. If it's not, then you need to check the result of the dyn_cast.

2

The formatting here looks wrong.

delena added inline comments.Aug 31 2016, 2:06 AM
lib/Target/X86/X86ISelLowering.cpp
1

We discussed with Igor and got into conclusion that this solution is not safe. The existing solution has the same problem.
If any of subsequent optimizations will break the store-vselect-compess sequence it may end up with memory exception.
In all other cases we handle masked store as a sole node in order to exclude this situation.

I suggest to add "IsCompressed" flag to MaskedStoreSDNode and than use it for COMPRESS_TO_MEM. Igor implemented masked_truncstore, you can take a look.

aymanmus updated this revision to Diff 70704.Sep 8 2016, 8:30 AM
aymanmus edited edge metadata.

Entirely changed the approach of the solution as Elena suggested.
Masked compressed store is now lowered to a sole node of masked store with a new isCompressed flag.
Later on the node is replaced with the specific X86 avx3 compress machine instruction.

delena added inline comments.Sep 8 2016, 12:38 PM
include/llvm/CodeGen/SelectionDAGNodes.h
17

Please change it to IsCompressing. It will be compatible with IsTrucating.

lib/Target/X86/X86ISelLowering.cpp
3

do not leave commented out code

aymanmus updated this revision to Diff 72424.Sep 25 2016, 6:13 AM
  • Removed commented out code.
  • Changed "compressed" -> "compressing".
delena added inline comments.Sep 25 2016, 6:20 AM
include/llvm/CodeGen/SelectionDAG.h
972

wrong code alignment

include/llvm/CodeGen/SelectionDAGNodes.h
1974

Please add comments before the API. Please explain carefully what the compression means.

lib/Target/X86/X86ISelLowering.cpp
18829

Code alignment

aymanmus updated this revision to Diff 72425.Sep 25 2016, 6:42 AM
aymanmus updated this revision to Diff 72426.Sep 25 2016, 6:49 AM

Fixed comment/

aymanmus updated this revision to Diff 72427.Sep 25 2016, 7:26 AM
delena accepted this revision.Sep 25 2016, 9:34 AM
delena edited edge metadata.
This revision is now accepted and ready to land.Sep 25 2016, 9:34 AM
This revision was automatically updated to reflect the committed changes.