Vectorization of a memory instruction (Load/Store) is possible when its pointer operand comes from a GEP. Analyzing the GEP lets us estimate the profitability of vectorization.
In some cases there is a "bitcast" between the GEP and the memory instruction.
I added code that skips the "bitcast", so the GEP behind it is still analyzed.
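To illustrate the idea, here is a minimal self-contained sketch. It is not the code from the patch: `Node`, `Kind` and `findAddressGEP` are hypothetical toy stand-ins rather than LLVM classes, and looking through exactly one bitcast is an assumption made for the example.

```cpp
#include <cassert>

// Toy stand-ins for IR values. These are NOT LLVM classes.
enum class Kind { GEP, BitCast, Other };

struct Node {
  Kind kind;
  const Node *operand; // the one operand relevant here, or nullptr
};

// Hypothetical helper: given the pointer operand of a load/store,
// look through a single bitcast (an assumption of this sketch) and
// return the GEP that computes the address, or nullptr if none.
const Node *findAddressGEP(const Node *ptr) {
  if (ptr && ptr->kind == Kind::BitCast)
    ptr = ptr->operand; // skip the cast, inspect its source
  return (ptr && ptr->kind == Kind::GEP) ? ptr : nullptr;
}

// Small demo: the GEP is found both directly and behind a bitcast.
bool demoFindsGepThroughBitcast() {
  Node base{Kind::Other, nullptr};
  Node gep{Kind::GEP, &base};
  Node cast{Kind::BitCast, &gep};
  return findAddressGEP(&gep) == &gep &&
         findAddressGEP(&cast) == &gep &&
         findAddressGEP(&base) == nullptr;
}
```

Without the bitcast check, the second case would return nullptr and the access would look like it has no GEP to analyze.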
I replaced the vectorization-remarks-profitable.ll test. As a result of the optimization, the remarks about non-beneficial vectorization disappeared and the loop became vectorizable. I put in other loops that generate the same remarks as the previous one did.