D37076 makes LICM duplicate instructions into exit blocks if the instruction is free. For GEPs, the motivation appears to be that this allows the GEP to be folded into addressing modes, while non-foldable users outside the loop might prevent this. TBH I don't think LICM is the place to do this (why doesn't CGP apply this heuristic itself?) but at least I understand the motivation.
However, the transform is also applied to all other "free" instructions, which are just that: free, i.e. removed during lowering rather than "folded" into a user in some way. For such instructions, this transform seems somewhere between useless, counter-productive (undoing CSE/GVN) and actively incorrect. For example, this transform can duplicate freeze instructions, which is illegal, because two separate freezes of the same undef/poison value may each pick a different value.
This patch limits the transform to just foldable GEPs, though we might want to drop it from LICM entirely as a followup.
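To make the restriction concrete, here is a minimal standalone sketch of the kind of check this leaves behind. It is a toy model, not the LICM code: Kind, User, Inst, costIsFree and isFoldableInLoop are hypothetical stand-ins, and the rule for in-loop users (loads/stores in the same block) is my assumption about what "foldable into an addressing mode" means here.

```cpp
#include <vector>

// Hypothetical stand-ins for LLVM instruction kinds; not the real classes.
enum class Kind { GEP, Load, Store, Freeze, BitCast, Other };

// A user of an instruction, reduced to what the toy check needs.
struct User {
  Kind kind;
  bool inLoop;    // user lives inside the loop
  bool sameBlock; // user lives in the same basic block as the instruction
};

struct Inst {
  Kind kind;
  bool costIsFree; // stand-in for the TTI cost query reporting "free"
  std::vector<User> users;
};

// Toy version of the restricted heuristic: only GEPs can qualify.
bool isFoldableInLoop(const Inst &I) {
  // Freeze, casts and every other "free" instruction are rejected up front,
  // so no cost query is needed for them.
  if (I.kind != Kind::GEP)
    return false;
  if (!I.costIsFree)
    return false;
  for (const User &U : I.users) {
    // Users outside the loop get the duplicate in the exit block instead.
    if (!U.inLoop)
      continue;
    // Assumption: an in-loop user can fold the GEP only if it is a load or
    // store in the same block.
    if (!U.sameBlock || (U.kind != Kind::Load && U.kind != Kind::Store))
      return false;
  }
  return true;
}
```

In this model a freeze never qualifies, no matter what the cost query would say, and a GEP with a non-memory user inside the loop is rejected as well.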
This is also a small compile-time improvement, because querying the TTI cost model for every single instruction is expensive: http://llvm-compile-time-tracker.com/compare.php?from=057b5f1f3573ddceb04d9eb6fb9973358d53fece&to=1211bdf470f784888b8bef867e1e613539998e9b&stat=instructions:u