zvi (Zvi Rackover)
User

Projects

User does not belong to any projects.

User Details

User Since
Jun 8 2016, 12:50 PM (127 w, 4 d)

Recent Activity

Jul 27 2018

zvi accepted D49829: [X86] Add pattern matching for PMADDUBSW.
Jul 27 2018, 12:58 PM
zvi added a comment to D49829: [X86] Add pattern matching for PMADDUBSW.

LGTM. Regarding the SSE2->SSSE3 test change, I think it's fine. Can you update the --check-prefix to SSSE3 in a follow-up commit? I think it's convenient to review as-is, but in the longer term it would be misleading to leave it as-is.

Jul 27 2018, 12:58 PM

Jul 26 2018

zvi accepted D49636: [X86] Add matching for another pattern of PMADDWD..

LGTM. Thanks!

Jul 26 2018, 12:26 PM

Jun 27 2018

zvi added a comment to D45806: DAGcombiner: Handle correctly non-splat power of 2 -1 divisor.

Sorry for not being responsive. Won't have time to work on this, so thanks @RKSimon for taking charge.

Jun 27 2018, 8:39 AM

Apr 22 2018

zvi added a comment to D43608: [X86] Use setcc ISD opcode for AVX512 integer comparisons all the way to isel.

Is this patch pending for review? If yes, can you please rebase it on ToT?

Apr 22 2018, 12:38 PM
zvi added a comment to D45585: [DAGCombiner][X86] When promoting loads don't use ZEXTLOAD even its legal.

LGTM

Apr 22 2018, 11:28 AM
zvi added a comment to D45929: [X86] Add vector element insertion/extraction scheduler classes.

LGTM

Apr 22 2018, 11:12 AM

Apr 19 2018

zvi updated the diff for D45806: DAGcombiner: Handle correctly non-splat power of 2 -1 divisor.

Minor improvement: No need for the select if there are no negatives

Apr 19 2018, 1:37 AM
zvi created D45806: DAGcombiner: Handle correctly non-splat power of 2 -1 divisor.
Apr 19 2018, 1:28 AM

Apr 11 2018

zvi abandoned D42171: X86CallFrameOptimization: Bail on win64cc calls.
Apr 11 2018, 2:01 PM

Apr 8 2018

zvi added inline comments to D42044: X86: Utilize ZeroableElements for canWidenShuffleElements.
Apr 8 2018, 4:47 PM
zvi updated the diff for D42044: X86: Utilize ZeroableElements for canWidenShuffleElements.
  • Rebase
  • Simplify the creation of the shuffle by only modifying the widened mask to take all zeros from the all-zero operand.
Apr 8 2018, 4:37 PM
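
To illustrate the idea behind D42044, here is a minimal sketch of widening a shuffle mask by pairing adjacent elements while treating known-zero lanes as a separate sentinel. It is a standalone model, not the X86 backend code: the function name widenShuffleMask and the sentinels kUndef and kZero are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical sentinels for this sketch only.
constexpr int kUndef = -1; // lane value does not matter
constexpr int kZero = -2;  // lane is known to be zero

// Try to halve the number of mask elements by merging each adjacent pair
// into one element of twice the width. Returns std::nullopt if some pair
// cannot be expressed as a single wide element.
std::optional<std::vector<int>> widenShuffleMask(const std::vector<int> &mask) {
  assert(mask.size() % 2 == 0 && "mask must have an even number of elements");
  std::vector<int> widened;
  widened.reserve(mask.size() / 2);
  for (std::size_t i = 0; i < mask.size(); i += 2) {
    int lo = mask[i], hi = mask[i + 1];
    bool loFree = (lo == kUndef || lo == kZero);
    bool hiFree = (hi == kUndef || hi == kZero);
    if (lo == kUndef && hi == kUndef) {
      widened.push_back(kUndef);   // nothing is demanded from this pair
    } else if (loFree && hiFree) {
      widened.push_back(kZero);    // zero paired with zero or undef
    } else if (lo == kZero || hi == kZero) {
      return std::nullopt;         // zero next to a real index: cannot widen
    } else if (lo == kUndef) {
      if (hi % 2 != 1) return std::nullopt;
      widened.push_back(hi / 2);   // hi must be the upper half of a wide lane
    } else if (hi == kUndef) {
      if (lo % 2 != 0) return std::nullopt;
      widened.push_back(lo / 2);   // lo must be the lower half of a wide lane
    } else if (lo % 2 == 0 && hi == lo + 1) {
      widened.push_back(lo / 2);   // aligned, consecutive pair
    } else {
      return std::nullopt;
    }
  }
  return widened;
}
```

For example, the mask {0, 1, zero, zero, 6, 7, undef, 5} widens to {0, zero, 3, 2}, while {1, 2, 4, 5} cannot be widened because its pairs are not aligned.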
zvi committed rL329525: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.
Apr 8 2018, 4:38 AM
zvi closed D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.
Apr 8 2018, 4:38 AM

Apr 5 2018

zvi added inline comments to D43619: [X86] Limit Store Forwarding Block only to cases where we can prove that the memcpy does not overlap.
Apr 5 2018, 3:44 PM
zvi updated the diff for D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.
  • Remove deprecated comment
  • Fix variable names
  • N1C can be passed to matchUnaryPredicate also when the divisor is a constant-splat. Makes check more efficient as we avoid re-checking the splatted element for every vector element. (why doesn't matchUnaryPredicate check for splat and do the same?)
Apr 5 2018, 3:20 PM
zvi committed rL329356: X86 Tests: Add a case for combining sdiv by a splatted pow2 negative. NFC..
Apr 5 2018, 3:01 PM
zvi added a comment to D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.

I still think it'd be better if you treated splat vectors as a vector instead of a scalar - your change to matchUnaryPredicate means that we're accepting UNDEF elements where we weren't before, which for DIV/REM opcodes is supposed to be a big no-no.

I think that at the point this combine runs there should not be any undef elements, because there is an earlier combine that handles division by undef (or by a vector with any undef elements); see SelectionDAG::isUndef.
Having said that, there is room for improvement by visiting only the source splatted value instead of every BUILD_VECTOR operand. Will upload an improved patch.

Apr 5 2018, 2:36 PM
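
As background for the D42479 discussion, the following is a minimal scalar sketch of the standard way to turn a signed division by a power of two into shifts, applied lane by lane with a different shift amount per lane (the non-splat case). It is not the DAGCombiner code; the names sdivPow2 and sdivPow2Lanes are hypothetical, and the sketch assumes positive divisors 2^k with 1 <= k <= 31 and an arithmetic right shift on signed values (guaranteed since C++20, universal in practice before that).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Divide x by 2^k, rounding toward zero like the C++ '/' operator.
inline int32_t sdivPow2(int32_t x, unsigned k) {
  assert(k >= 1 && k <= 31 && "shift amount out of range");
  // All-ones when x is negative, zero otherwise.
  uint32_t sign = x < 0 ? 0xFFFFFFFFu : 0u;
  // Bias is 2^k - 1 for negative x and 0 for non-negative x.
  uint32_t bias = sign >> (32 - k);
  // x is negative whenever bias is non-zero, so this sum cannot overflow.
  int32_t biased = x + static_cast<int32_t>(bias);
  // Arithmetic shift right completes the round-toward-zero division.
  return biased >> k;
}

// Non-splat case: every lane has its own log2 of the divisor.
inline std::array<int32_t, 4>
sdivPow2Lanes(const std::array<int32_t, 4> &x,
              const std::array<unsigned, 4> &log2Divisor) {
  std::array<int32_t, 4> out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = sdivPow2(x[i], log2Divisor[i]);
  return out;
}
```

The bias step is what distinguishes this from a bare arithmetic shift, which would round negative values toward negative infinity instead of toward zero.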

Apr 1 2018

zvi updated the diff for D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.

The previous revision was a rebase on a dated revision. This is a rebase on the latest.

Apr 1 2018, 8:21 PM
zvi updated the diff for D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.

Rebase

Apr 1 2018, 12:10 PM

Mar 19 2018

zvi added a comment to D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.

I apologize about the delay. Got some other priority work. Will make an effort to continue this week. If not, please go ahead and complete the work, Simon.

Mar 19 2018, 6:26 AM

Feb 15 2018

zvi accepted D37418: [X86] Use btc/btr/bts to implement xor/and/or that affects a single bit in the upper 32-bits of a 64-bit operation..

LGTM

Feb 15 2018, 11:14 AM
zvi updated the diff for D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.

Addressing Simon's comments

Feb 15 2018, 11:05 AM

Feb 14 2018

zvi accepted D42896: [SelectionDAG] Add initial implementation of TargetLowering::SimplifyDemandedVectorElts.

LGTM after fixing the signed/unsigned mismatches.

Feb 14 2018, 1:10 PM
zvi added inline comments to D42896: [SelectionDAG] Add initial implementation of TargetLowering::SimplifyDemandedVectorElts.
Feb 14 2018, 7:21 AM
zvi added inline comments to D42896: [SelectionDAG] Add initial implementation of TargetLowering::SimplifyDemandedVectorElts.
Feb 14 2018, 6:46 AM

Feb 11 2018

zvi retitled D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor from DAGCombiner: Combine SDIV with non-splat vector pow2 divider to DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.
Feb 11 2018, 2:34 PM
zvi updated the diff for D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.
  1. matchBinaryPredicate -> matchUnaryPredicate
  2. Use Simon's uniform scalar/vector code suggestion for computing INEXACT
Feb 11 2018, 2:27 PM

Feb 6 2018

zvi added inline comments to D42770: [X86] Don't emit KTEST instructions unless only the Z flag is being used.
Feb 6 2018, 11:03 AM

Feb 4 2018

zvi updated the diff for D42044: X86: Utilize ZeroableElements for canWidenShuffleElements.

Rebase + ping

Feb 4 2018, 11:42 AM
zvi committed rL324200: X86 Tests: Add shuffle that can be improved by widening elements. NFC.
Feb 4 2018, 11:33 AM
zvi updated the diff for D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.

Following Simon's suggestions, dropping the TLI hook seems to improve all cases except for v2i64 on SSE/AVX1.

Feb 4 2018, 10:34 AM
zvi added a comment to D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.

How bad does the codegen get if we don't limit this to targets with vector shifts? Again, thinking AVX1 (Jaguar) here, but combine_vec_sdiv_by_pow2b_v4i64 looks like a missed opportunity.

I think you are right. Probably all cases will profit except for v2i64. Will try to drop the TLI hook.

Feb 4 2018, 10:17 AM

Jan 30 2018

zvi accepted D42526: [X86][XOP] Update isVectorShiftByScalarCheap with cases covered by XOP.

LGTM

Jan 30 2018, 5:25 AM

Jan 25 2018

zvi committed rL323418: X86 Tests: Add AVX+XOP config to SDIV combine tests.
Jan 25 2018, 6:09 AM

Jan 24 2018

zvi accepted D42258: [X86][SSE] Aggressively use PMADDWD for v4i32 multiplies with 17 or more leading zeros.

LGTM with a minor request:

Jan 24 2018, 10:05 AM
zvi committed rL323343: InstSimplify: If divisor element is undef simplify to undef.
Jan 24 2018, 9:24 AM
zvi closed D42485: InstSimplify: If divisor element is undef simplify to undef.
Jan 24 2018, 9:23 AM
zvi added a comment to D42485: InstSimplify: If divisor element is undef simplify to undef.

LGTM. Just curious - do we have vector intrinsics or any passes that create vector integer division?

Jan 24 2018, 9:20 AM
zvi created D42485: InstSimplify: If divisor element is undef simplify to undef.
Jan 24 2018, 8:25 AM
zvi created D42479: DAGCombiner: Combine SDIV with non-splat vector pow2 divisor.
Jan 24 2018, 7:09 AM
zvi committed rL323329: X86 Tests: Add more sdiv combine cases. NFC.
Jan 24 2018, 7:06 AM

Jan 23 2018

zvi closed D42437: X86: Update isVectorShiftByScalarCheap with cases covered by AVX512BW.
Jan 23 2018, 5:38 PM
zvi committed rL323292: X86: Update isVectorShiftByScalarCheap with cases covered by AVX512BW.
Jan 23 2018, 5:38 PM
zvi updated the diff for D42437: X86: Update isVectorShiftByScalarCheap with cases covered by AVX512BW.

Rebase

Jan 23 2018, 5:36 PM
zvi created D42437: X86: Update isVectorShiftByScalarCheap with cases covered by AVX512BW.
Jan 23 2018, 12:11 PM
zvi committed rL323242: X86 Tests: Add AVX512BW config to CodeGenPrepare test. NFC.
Jan 23 2018, 11:22 AM
zvi accepted D42431: [X86][AVX] LowerBUILD_VECTORAsVariablePermute - add support for VPERMILPV to v2i64/v2f64.

LGTM

Jan 23 2018, 11:21 AM
zvi added a comment to D42380: [X86][SSE] LowerBUILD_VECTORAsVariablePermute - fix PSHUFB source/index operand ordering.

Thanks for the fix.

Jan 23 2018, 2:00 AM

Jan 19 2018

zvi added inline comments to D42044: X86: Utilize ZeroableElements for canWidenShuffleElements.
Jan 19 2018, 9:10 AM

Jan 17 2018

zvi added a comment to D42205: [X86] Add intrinsic support for the RDPID instruction.

Add some basic encoding tests?

Jan 17 2018, 10:55 PM
zvi added a comment to D42171: X86CallFrameOptimization: Bail on win64cc calls.
In D42171#978790, @rnk wrote:

Isn't this MI buggy? We're adjusting SP down by 40 bytes and storing to SP+48, which could overwrite data. I think the assert is valid.

Jan 17 2018, 1:49 PM
zvi added a comment to D42171: X86CallFrameOptimization: Bail on win64cc calls.

I would appreciate suggestions for alternative solutions.

Jan 17 2018, 6:11 AM
zvi created D42171: X86CallFrameOptimization: Bail on win64cc calls.
Jan 17 2018, 6:10 AM

Jan 16 2018

zvi added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.
  • The way you are modeling the new flag means that existing bitcode/.ll files will change semantics when read with newer compilers. I'm not sure that is a good idea; at the very least you have to provide AutoUpgrade logic for that.

This seems like a real issue. With no version info in the module, how can AutoUpgrade tell if a divide with no 'nof' attribute is of the old form or new form? This is really a performance issue, because AutoUpgrade can always pessimistically not add 'nof' if the version of the incoming module is unknown. Possible solutions:

Jan 16 2018, 1:50 AM

Jan 14 2018

zvi created D42044: X86: Utilize ZeroableElements for canWidenShuffleElements.
Jan 14 2018, 3:50 PM

Jan 13 2018

zvi abandoned D40865: X86 AVX2: Prefer one VPERMV over ShuffleAsRepeatedMaskAndLanePermute.
Jan 13 2018, 9:54 AM
zvi committed rL322446: X86: Add pattern matching for PMADDWD.
Jan 13 2018, 9:43 AM
zvi closed D41811: X86: Add pattern matching for PMADDWD.
Jan 13 2018, 9:43 AM
zvi updated the diff for D41811: X86: Add pattern matching for PMADDWD.

Generalize to account for commutativity of add and mul

Jan 13 2018, 12:24 AM
zvi committed rL322434: X86 Tests: add more pamddwd cases. NFC.
Jan 13 2018, 12:22 AM

Jan 12 2018

zvi updated the diff for D41811: X86: Add pattern matching for PMADDWD.

Check both BUILD_VECTOR nodes together if one is composed of odd-indexed extracts and the other is composed of even-indexed extracts.

Jan 12 2018, 1:44 AM
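
For readers unfamiliar with the instruction that D41811 targets, here is a small scalar model of the 128-bit PMADDWD operation: the even-indexed and odd-indexed 16-bit products of each adjacent pair are summed into one 32-bit lane, which is why the pattern match looks for even and odd extracts together. The function name pmaddwdModel is hypothetical, and the sketch models only the arithmetic, not the patch's DAG matching.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of 128-bit PMADDWD: 8 x i16 inputs produce 4 x i32 outputs.
inline std::array<int32_t, 4>
pmaddwdModel(const std::array<int16_t, 8> &a, const std::array<int16_t, 8> &b) {
  std::array<int32_t, 4> out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    // Sign-extend each 16-bit element to 32 bits before multiplying.
    int32_t even = int32_t(a[2 * i]) * int32_t(b[2 * i]);
    int32_t odd = int32_t(a[2 * i + 1]) * int32_t(b[2 * i + 1]);
    // The sum wraps only when all four inputs are -32768; add as unsigned
    // so the model reproduces that wraparound without signed overflow.
    out[i] = static_cast<int32_t>(static_cast<uint32_t>(even) +
                                  static_cast<uint32_t>(odd));
  }
  return out;
}
```

Because the two products in each lane are simply added, swapping the even and odd operands or the two multiplicands gives the same result, which matches the later update that generalizes the match for commutativity of add and mul.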

Jan 11 2018

zvi updated the diff for D41811: X86: Add pattern matching for PMADDWD.

Rebase

Jan 11 2018, 10:09 AM
zvi committed rL322300: DAGCombine: Let truncates negate extension through extract-subvector.
Jan 11 2018, 10:04 AM
zvi closed D41927: DAGCombine: Let truncates negate extension through extract-subvector.
Jan 11 2018, 10:04 AM
zvi updated the diff for D41927: DAGCombine: Let truncates negate extension through extract-subvector.

Rebase after adding the missing zext cases

Jan 11 2018, 9:56 AM
zvi committed rL322297: X86 Tests: Add zext cases in (trunc (subvector)) test. NFC.
Jan 11 2018, 9:51 AM
zvi committed rL322296: X86: Refactor type-splitting to target-legal size vector to a helper function.
Jan 11 2018, 9:31 AM
zvi closed D41925: X86: Refactor type-splitting to target-legal size vector to a helper function.
Jan 11 2018, 9:31 AM
zvi updated the diff for D41925: X86: Refactor type-splitting to target-legal size vector to a helper function.

Add assertions for type sizes and fix typo in comment

Jan 11 2018, 9:25 AM
zvi committed rL322272: X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices.
Jan 11 2018, 4:28 AM
zvi closed D41865: X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices.
Jan 11 2018, 4:28 AM

Jan 10 2018

zvi added a dependent revision for D41925: X86: Refactor type-splitting to target-legal size vector to a helper function: D41811: X86: Add pattern matching for PMADDWD.
Jan 10 2018, 4:53 PM
zvi added a dependency for D41811: X86: Add pattern matching for PMADDWD: D41925: X86: Refactor type-splitting to target-legal size vector to a helper function.
Jan 10 2018, 4:53 PM
zvi added a dependent revision for D41811: X86: Add pattern matching for PMADDWD: D41927: DAGCombine: Let truncates negate extension through extract-subvector.
Jan 10 2018, 4:53 PM
zvi added a dependency for D41927: DAGCombine: Let truncates negate extension through extract-subvector: D41811: X86: Add pattern matching for PMADDWD.
Jan 10 2018, 4:53 PM
zvi created D41927: DAGCombine: Let truncates negate extension through extract-subvector.
Jan 10 2018, 4:51 PM
zvi updated the diff for D41811: X86: Add pattern matching for PMADDWD.

Rebase on top of D41925

Jan 10 2018, 4:48 PM
zvi created D41925: X86: Refactor type-splitting to target-legal size vector to a helper function.
Jan 10 2018, 4:45 PM
zvi added a comment to D40055: [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes.

There are some occurrences of calls to getMaskedGather in DAGCombine.cpp which I do not see being addressed by this patch. I guess they are not being covered by tests?

Jan 10 2018, 9:50 AM
zvi updated the diff for D41865: X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices.

Fix issue identified by Simon: use original vector type for the insert_vector

Jan 10 2018, 9:25 AM
zvi committed rL322192: X86 Tests: Add isel tests for truncate-extract_vector-extend. NFC..
Jan 10 2018, 6:57 AM
zvi added inline comments to D41811: X86: Add pattern matching for PMADDWD.
Jan 10 2018, 5:54 AM
zvi added inline comments to D40055: [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes.
Jan 10 2018, 4:32 AM
zvi added a comment to D41865: X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices.
In D41865#971307, @zvi wrote:

Sure, but looking at your example the return type should have the same number of elements as the indices vector, right?

Yup, sorry for the typo. Are you intending to support cases like this?

Jan 10 2018, 4:20 AM
zvi updated the diff for D41811: X86: Add pattern matching for PMADDWD.

Average lowering fully using the refactored type-splitting code.

Jan 10 2018, 1:15 AM
zvi updated the diff for D41811: X86: Add pattern matching for PMADDWD.
  1. Following Simon's suggestion, refactored out the code that splits the vector into legal types to 'LowerBinTo' (the function name probably needs revision) and applied it to PMADDWD.
  2. Added a missing DAGCombine to let a truncate negate a sext through an EXTRACT_SUBVECTOR.
Jan 10 2018, 12:47 AM

Jan 9 2018

zvi updated the diff for D41811: X86: Add pattern matching for PMADDWD.

Fixes for Craig's comments

Jan 9 2018, 12:31 PM
zvi added inline comments to D41811: X86: Add pattern matching for PMADDWD.
Jan 9 2018, 12:09 PM
zvi updated the diff for D41865: X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices.

Added test with source vector larger than indices vector

Jan 9 2018, 10:30 AM
zvi added a comment to D41865: X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices.

Sure, but looking at your example the return type should have the same number of elements as the indices vector, right?

Jan 9 2018, 10:04 AM
zvi accepted D41850: [X86] Add a DAG combine to combine (sext (setcc)) with VLX.

LGTM

Jan 9 2018, 8:47 AM
zvi committed rL322090: X86 Tests: Update more isel tests with FastVariableShuffle feature.
Jan 9 2018, 8:27 AM
This revision was not accepted when it landed; it landed in state Needs Review.
Jan 9 2018, 8:27 AM
zvi updated the diff for D41851: X86 Tests: Update more isel tests with FastVariableShuffle feature.

Rebase + apply fixes for Simon's comments. Will commit this change right away to avoid conflicts.

Jan 9 2018, 8:20 AM
zvi committed rL322089: X86 Tests: Add common check prefix to test-case. NFC..
Jan 9 2018, 8:15 AM
zvi added inline comments to D41851: X86 Tests: Update more isel tests with FastVariableShuffle feature.
Jan 9 2018, 8:01 AM
zvi created D41865: X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices.
Jan 9 2018, 7:53 AM
zvi added a comment to D41851: X86 Tests: Update more isel tests with FastVariableShuffle feature.
Assuming D41436 is accepted, is the plan to remove the +fast-variable-shuffle arg from the avx512 cases? In which case might it make sense to commit the avx2 and avx512 changes separately?
Jan 9 2018, 5:04 AM

Jan 8 2018

zvi created D41851: X86 Tests: Update more isel tests with FastVariableShuffle feature.
Jan 8 2018, 11:20 PM