This helps in C library function fmod. remainder needs similar optimization, but since that's a libcall instead of a IR/DAG op, we can do it in future patches.
I see you have added an isLegal check for FREM. But I don't understand why you are doing this as a combine in the first place, instead of changing the way FREM is legalized to do this as a lowering instead of calling a libcall.
I'm not sure it's good idea to consider fast-math flags in legalizing.. Here frem just looks like how fdiv is transformed into series of operations in combiner.
I like this as a combine because we can chain it with other combines. For example, the fdiv becomes a reciprocal op in the current tests.
I'd like to see tests for other targets though. I did a quick test, and this may trigger for any of AArch64, AMDGPU, or x86 (use -mattr=avx to make sure ftrunc is supported).
There's an open question about exactly which FMF are needed to enable this transform. Should we require 'reassoc' or others? Is 'reassoc' enough by itself?
The current tests show 'fast' only, so we should reduce at least one of those to whatever we decide is the minimum requirement.
There's also a possible missing constraint when optimizing for minimum size - if we know this node can be lowered to a libcall, is that smaller than the expansion?
We might be able to answer some of these questions and share code with the transforms under DAGCombiner::visitFPOW().