[SLP]Improve gathering of scalar elements.
- Better sorting of scalars to be gathered. Trying to insert constants/arguments/instructions-out-of-loop at first and only then the instructions which are inside the loop. It improves hoisting of invariant insertelements instructions.
- Better detection of shuffle candidates in gathering function.
- The cost of insertelement for constants is 0.
Part of D57059.
Differential Revision: https://reviews.llvm.org/D103458