This patch tries to reassociate two patterns related to FMA to expose more ILP on PowerPC.
// Pattern 1:
//   A = FADD X, Y          (Leaf)
//   B = FMA  A, M21, M22   (Prev)
//   C = FMA  B, M31, M32   (Root)
// -->
//   A = FMA  X, M21, M22
//   B = FMA  Y, M31, M32
//   C = FADD A, B
//
// Pattern 2:
//   A = FMA  X, M11, M12   (Leaf)
//   B = FMA  A, M21, M22   (Prev)
//   C = FMA  B, M31, M32   (Root)
// -->
//   A = FMUL M11, M12
//   B = FMA  X, M21, M22
//   D = FMA  A, M31, M32
//   C = FADD B, D
Both rewrites break the dependency between A and B, so the FMAs can execute in parallel (or back-to-back in the pipeline) instead of each waiting on the previous result.
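As a sanity check on the algebra, here is a minimal C++ sketch of both patterns before and after reassociation. It is not the patch's implementation. It assumes that `FMA a, m1, m2` in the notation above means `a + m1 * m2`, which is what the FMUL in Pattern 2 implies. The names `fma_op`, `Muls`, and the `pattern*` functions are hypothetical and exist only for illustration.

```cpp
#include <cmath>

// Assumed reading of the notation above: FMA a, m1, m2 == a + m1 * m2.
// std::fma takes the multiplicands first, so the addend moves to the end.
static double fma_op(double a, double m1, double m2) {
  return std::fma(m1, m2, a);
}

// Hypothetical bundle of the multiplicand operands named in the patterns.
struct Muls {
  double m11, m12, m21, m22, m31, m32;
};

// Pattern 1, original form: a serial chain FADD -> FMA -> FMA.
double pattern1_before(double x, double y, const Muls &m) {
  double a = x + y;                    // Leaf
  double b = fma_op(a, m.m21, m.m22);  // Prev, waits on a
  return fma_op(b, m.m31, m.m32);      // Root, waits on b
}

// Pattern 1, reassociated: the two FMAs no longer depend on each other.
double pattern1_after(double x, double y, const Muls &m) {
  double a = fma_op(x, m.m21, m.m22);
  double b = fma_op(y, m.m31, m.m32);
  return a + b;
}

// Pattern 2, original form: a serial chain of three FMAs.
double pattern2_before(double x, const Muls &m) {
  double a = fma_op(x, m.m11, m.m12);  // Leaf
  double b = fma_op(a, m.m21, m.m22);  // Prev, waits on a
  return fma_op(b, m.m31, m.m32);      // Root, waits on b
}

// Pattern 2, reassociated: an FMUL and an FMA start independently.
double pattern2_after(double x, const Muls &m) {
  double a = m.m11 * m.m12;            // FMUL
  double b = fma_op(x, m.m21, m.m22);  // independent of a
  double d = fma_op(a, m.m31, m.m32);
  return b + d;
}
```

With small integer inputs every intermediate value is exactly representable, so the before and after forms agree bit for bit. In general the two forms can round differently, because floating-point addition is not associative, so a rewrite like this is only legal where reassociation is permitted.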
Pattern? -> P? Or replace P with Pattern in the following line.