We already had code for this for 32-bit ARM, but it doesn't need to live in target-specific code; generalize it.
Not sure if we also need this for x86; I would guess we do, but I haven't tried.
(I think this started showing up recently because we added an optimization that converts pow to powi.)