Fix typo in comment pointed out by @xbolva00
Following suggestion by @lebedev.ri to support non-splat vector shift amount
Addressed comments by @lebedev.ri
Jul 27 2018
LGTM. Regarding the SSE2->SSSE3 test change, I think it's fine. Can you update the --check-prefix to SSSE3 in a follow-up commit? It's convenient to review as-is, but leaving it that way would be misleading in the longer term.
Jul 26 2018
Jun 27 2018
Sorry for not being responsive. Won't have time to work on this, so thanks @RKSimon for taking charge.
Apr 22 2018
Is this patch pending review? If yes, can you please rebase it on ToT?
Apr 19 2018
Minor improvement: No need for the select if there are no negatives
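For readers following the sdiv-by-power-of-two thread, here is a minimal scalar sketch of the shift-based division, assuming the "select" above is the one that chooses between the quotient and its negation when a divisor is negative. The function name and parameters are hypothetical and this is not the DAGCombiner code; a non-splat vector version would apply the same steps per lane, each lane with its own shift amount.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: divides x by +/-(1 << k), rounding toward zero,
// using shifts, one add and a final conditional negate.
int32_t sdivByPow2(int32_t x, unsigned k, bool negativeDivisor) {
  assert(k >= 1 && k <= 30 && "sketch covers divisors 2 .. 2^30 only");
  // All-ones when x is negative, zero otherwise. Right-shifting a negative
  // value is arithmetic on mainstream compilers and guaranteed since C++20.
  int32_t sign = x >> 31;
  // (2^k - 1) for negative x, 0 otherwise: the bias that turns the
  // floor-rounding shift into round-toward-zero.
  uint32_t bias = static_cast<uint32_t>(sign) >> (32 - k);
  // The add is done on unsigned values to avoid signed-overflow UB.
  int32_t quotient =
      static_cast<int32_t>(static_cast<uint32_t>(x) + bias) >> k;
  // This is the step that plays the role of the select: when no divisor
  // is negative it can be dropped entirely.
  return negativeDivisor ? -quotient : quotient;
}
```

When every divisor is known to be positive, the last line reduces to returning the quotient, which is the saving the entry describes.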
Apr 11 2018
Apr 8 2018
- Simplify the creation of the shuffle by only modifying the widened mask to take all zeros from the all-zero operand.
Apr 5 2018
- Remove deprecated comment
- Fix variable names
- N1C can also be passed to matchUnaryPredicate when the divisor is a constant splat. This makes the check more efficient, as we avoid re-checking the splatted element for every vector element. (Why doesn't matchUnaryPredicate check for a splat and do the same?)
I still think it'd be better if you treated splat vectors as a vector instead of a scalar - your change to matchUnaryPredicate means that we're accepting UNDEF elements where we weren't before, which for DIV/REM opcodes is supposed to be a big no-no.
I think that at the point this combine runs there should not be any undef elements, because an earlier combine handles division by undef (or by a vector with any undef elements); see SelectionDAG::isUndef.
Having said that, there is room for improvement by visiting only the source splatted value instead of every BUILD_VECTOR operand. Will upload an improved patch.
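To make the undef concern concrete, here is a small standalone model, not LLVM code: lanes are modelled as std::optional values with an empty optional standing in for UNDEF, and the names matchAllLanes and isPowerOfTwo are hypothetical. It only illustrates how a per-element predicate check can either reject or silently skip undef lanes.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// An empty optional models an UNDEF vector element.
using Lane = std::optional<int64_t>;

// Returns true if every defined lane satisfies pred. Undef lanes are
// accepted only when allowUndef is set; otherwise they fail the match.
bool matchAllLanes(const std::vector<Lane>& lanes,
                   const std::function<bool(int64_t)>& pred,
                   bool allowUndef) {
  for (const Lane& lane : lanes) {
    if (!lane) {
      if (!allowUndef) return false;
      continue;
    }
    if (!pred(*lane)) return false;
  }
  return true;
}

// Example predicate: positive power of two.
bool isPowerOfTwo(int64_t v) { return v > 0 && (v & (v - 1)) == 0; }
```

With a divisor vector such as {4, 4, undef, 4}, the strict form rejects the match while the permissive form accepts it, which is the behaviour difference the review comment warns about for DIV/REM.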
Apr 1 2018
The previous revision was a rebase on a dated revision. This is a rebase on the latest.
Mar 19 2018
I apologize for the delay. I had some other priority work. Will make an effort to continue this week. If not, please go ahead and complete the work, Simon.
Feb 15 2018
Addressing Simon's comments
Feb 14 2018
LGTM after fixing the signed/unsigned mismatches.
Feb 11 2018
- matchBinaryPredicate -> matchUnaryPredicate
- Use Simon's uniform scalar/vector code suggestion for computing INEXACT
Feb 6 2018
Feb 4 2018
Rebase + ping
Following Simon's suggestions, dropping the TLI hook seems to improve all cases except for v2i64 on SSE/AVX1.
How bad does the codegen get if we don't limit this to targets with vector shifts? Again, thinking AVX1 (Jaguar) here, but combine_vec_sdiv_by_pow2b_v4i64 looks like a missed opportunity.
I think you are right. Probably all cases will profit except for v2i64. Will try to drop the TLI hook.
Jan 30 2018
Jan 25 2018
Jan 24 2018
LGTM with a minor request:
LGTM. Just curious - do we have vector intrinsics or any passes that create vector integer division?
Jan 23 2018
Thanks for the fix.
Jan 19 2018
Jan 17 2018
Add some basic encoding tests?
Isn't this MI buggy? We're adjusting SP down by 40 bytes and storing to SP+48, which could overwrite data. I think the assert is valid.
I would appreciate suggestions for alternative solutions.
Jan 16 2018
- With the way you are modeling the new flag, existing bitcode/.ll files will change semantics when read with newer compilers. I'm not sure that is a good idea; at the very least you have to provide AutoUpgrade logic for that.
This seems like a real issue. With no version info in the module, how can AutoUpgrade tell if a divide with no 'nof' attribute is of the old form or new form? This is really a performance issue, because AutoUpgrade can always pessimistically not add 'nof' if the version of the incoming module is unknown. Possible solutions:
Jan 14 2018
Jan 13 2018
Generalize to account for commutativity of add and mul
Jan 12 2018
Check both BUILD_VECTOR nodes together if one is composed of odd-indexed extracts and the other of even-indexed extracts.
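The entry above does not name the instruction being matched. Assuming it is a pairwise multiply-add of the PMADDWD kind (an assumption on my part), the even-indexed and odd-indexed extracts correspond to the two products summed into each output lane. The sketch below is a plain scalar model with a hypothetical function name, not the DAG pattern itself.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a pairwise multiply-add over eight 16-bit lanes:
// out[i] = a[2i] * b[2i] + a[2i+1] * b[2i+1], widened to 32 bits.
std::array<int32_t, 4> pairwiseMulAdd(const std::array<int16_t, 8>& a,
                                      const std::array<int16_t, 8>& b) {
  std::array<int32_t, 4> out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    // Product of the even-indexed elements of each pair.
    int64_t even = int64_t{a[2 * i]} * b[2 * i];
    // Product of the odd-indexed elements of each pair.
    int64_t odd = int64_t{a[2 * i + 1]} * b[2 * i + 1];
    // Sum in 64 bits, then wrap to 32 bits to avoid signed-overflow UB.
    out[i] = static_cast<int32_t>(static_cast<uint32_t>(even + odd));
  }
  return out;
}
```

Because both the multiply and the add are commutative, swapping the operands, or swapping which product comes from the even and which from the odd extracts, gives the same result; a matcher has to accept all of those orderings.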
Jan 11 2018
Rebase after adding the missing zext cases
Add assertions for type sizes and fix typo in comment
Jan 10 2018
Rebase on top of D41925
There are some occurrences of calls to getMaskedGather in DAGCombine.cpp which I do not see being addressed by this patch. I guess they are not being covered by tests?
Fix issue identified by Simon: use original vector type for the insert_vector
Average lowering fully using the refactored type-splitting code.
- Following Simon's suggestion, refactored out the code that splits the vector to legal types into 'LowerBinTo' (the function name probably needs revision) and applied it to PMADDWD.
- Added a missing DAGCombine to let a truncate negate a sext through an EXTRACT_SUBVECTOR.
Jan 9 2018
Fixes for Craig's comments
Added test with source vector larger than indices vector
Sure, but looking at your example the return type should have the same number of elements as the indices vector, right?