This patch improves the target-specific cost model to better handle signed
division by a power of two. The immediate result is that the SLP vectorizer
and loop vectorizer can make better cost decisions for code that contains
such divisions.
Just something I noticed in passing. The X86 and AArch64 backends already do this.
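
For context on why this case deserves a lower cost than a general division:
a signed divide by 2^k can be computed with one add and a couple of shifts
instead of a real divide. The sketch below only illustrates that arithmetic
identity. It is not code from this patch, and the helper name sdiv_pow2 is
hypothetical. It assumes 32-bit integers, 0 < k < 32, and an arithmetic right
shift for negative values (guaranteed since C++20, and what mainstream
compilers do before that).

```cpp
#include <cstdint>

// Hypothetical helper (not part of the patch): computes x / 2^k with
// C-style rounding toward zero, for 0 < k < 32, without a divide.
int32_t sdiv_pow2(int32_t x, unsigned k) {
    // All ones if x is negative, zero otherwise.
    uint32_t sign = x < 0 ? 0xFFFFFFFFu : 0u;
    // Bias is 2^k - 1 for negative x and 0 for non-negative x.
    int32_t bias = static_cast<int32_t>(sign >> (32 - k));
    // Adding the bias turns the floor rounding of the shift into
    // rounding toward zero. The sum cannot overflow: the bias is
    // non-zero only when x is negative, and it is at most 2^31 - 1.
    return (x + bias) >> k;  // assumes an arithmetic right shift
}
```

A plain arithmetic shift alone would round toward negative infinity, so
-7 >> 1 gives -4 while -7 / 2 is -3; the bias is what corrects this.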