If X is -NaN, this transform will change the result of fabs(X) * fabs(X) from +NaN to -NaN. (In all other cases, it will produce the same result.)
So my recommendation is to guard this transformation by I.hasNoNaNs().
My understanding of the standard is that NaN has a sign in order to avoid special casing it for operation like fabs. There is no expectation that it has a well defined value.
I'm not a huge fan of SimplifyLibCalls and would prefer optimizations stick to the intrinsics. SimplifyLibCalls has the assumption that you you only can optimize when TargetLibraryInfo reports having a sqrt lib call (even for the intrinsic). This is not helpful for targets that do not have libcalls, and may have hardware instructions or custom expansions for the operations. Especially since we treat the math intrinsics as "fast" versions with looser rules than the library function, I don't think we should conflate the intrinsics and library calls so much.
I would prefer if optimizations stuck to the intrinsic whenever possible. See my RFC from a couple days ago: http://lists.llvm.org/pipermail/llvm-dev/2016-January/094593.html