This patch adds a new combine that tries to scalarize chains of
extractelement (load %ptr), %idx to load (gep %ptr, %idx). This is
profitable when extracting only a few elements out of a large vector.
At the moment, store (extractelement (load %ptr), %idx), %ptr
operations on large vectors result in huge code in the backend.
This can easily be triggered by using the matrix extension, e.g.
https://clang.godbolt.org/z/qsccPdPf4
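
As a rough illustration of the rewrite described above, the following C
sketch models both forms with plain arrays. It is not the combine itself
and does not reproduce the godbolt example. The width N, the float element
type, and the function names are assumptions made for illustration only.
The sketch also assumes %idx is within the bounds of the vector; the two
forms are only interchangeable under that condition.

```c
#include <string.h>

#define N 64 /* hypothetical width of the "large vector" */

/* Before: load the whole N-element vector, then pick one lane.
   Models: extractelement (load %ptr), %idx */
static float extract_after_full_load(const float *ptr, unsigned idx) {
    float vec[N];
    memcpy(vec, ptr, sizeof vec); /* full-width vector load */
    return vec[idx];              /* extract a single element */
}

/* After: load only the requested element.
   Models: load (gep %ptr, %idx) */
static float scalarized_load(const float *ptr, unsigned idx) {
    return ptr[idx];
}

/* Returns 1 if both forms agree for every in-bounds index, 0 otherwise. */
static int lanes_agree(const float *ptr) {
    for (unsigned idx = 0; idx < N; ++idx) {
        if (extract_after_full_load(ptr, idx) != scalarized_load(ptr, idx))
            return 0;
    }
    return 1;
}
```

The first form touches all N elements to produce one value, while the
second touches a single element, which is why the scalar form is expected
to pay off when only a few elements of a large vector are used.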
This should complement D98240.
I haven't looked into the bot failure, but a previous sanitizer failure was addressed by adding a predicate here.