Currently, in OptimizeGlobalAddressOfMalloc, the transformation for global loads assumes that they have the same Type. With the support for ConstantExpr (https://reviews.llvm.org/D106589), this may no longer be true (as seen in the test case), and the code to handle that case is missing. This patch adds it.
Hopefully, this is the problem that @saugustine reported in https://reviews.llvm.org/D106589. The test case I added here produces a message similar to the one you saw. Please verify. Thanks.
Why do you need to mess with the way you're iterating over the uses of LI? Calling ConstantExpr::getBitCast() doesn't mess with the use-list of LI.
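To illustrate the distinction behind that question, here is a toy sketch. It is not LLVM code: a `std::list<int>` stands in for a use-list, and the function names `sumUses` and `eraseOddUses` are hypothetical. The point is only that a loop needs special iteration when its body removes entries from the list being walked, and a plain loop is fine when it does not.

```cpp
#include <list>

// Toy stand-in for a use-list; this is an analogy, not LLVM's API.

// The body only reads each element and never modifies the list,
// so a plain range-based for loop is safe.
int sumUses(const std::list<int> &uses) {
    int total = 0;
    for (int u : uses)
        total += u;
    return total;
}

// The body erases the current element. Erasing invalidates the iterator
// to that element, so the loop continues from the iterator that erase()
// returns instead of incrementing the old one.
int eraseOddUses(std::list<int> &uses) {
    int erased = 0;
    for (auto it = uses.begin(); it != uses.end();) {
        if (*it % 2 != 0) {
            it = uses.erase(it);
            ++erased;
        } else {
            ++it;
        }
    }
    return erased;
}
```

If the loop over the uses of LI only creates new constants and never removes or replaces a use of LI inside the body, it corresponds to the first function, and the simpler iteration should be enough.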