This is a fix for PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
The idea is to take the existing fold of:
bitcast ( trunc ( lshr ( bitcast X))) --> extractelement (bitcast X)
( http://reviews.llvm.org/rL112232 )
And break it into 2 less specific transforms so we'll catch more cases such as the example in the bug report:
bitcast ( trunc ( lshr ( bitcast X))) --> bitcast ( extractelement (bitcast X)) --> extractelement (bitcast X)
D14879 handles the 2nd transform: folding of bitcasts around the extractelement.
I'm a bit surprised by this change. We used to prefer the vector bitcast over the scalar one, and with this change, we prefer the scalar bitcast after the abstract. I can see benefits to this as a canonical form (even through some backends will need to work somewhat harder to retail good code quality: vector bitcasts are often cheaper than scalar ones). However, what happens when both elements are extracted? Do we end up with multiple scalar bitcasts?