Page MenuHomePhabricator

[AMDGPU][GlobalISel] Transform (fsub (fmul x, y), z) -> (fma x, y, -z)
Needs ReviewPublic

Authored by matejam on Feb 12 2021, 9:29 AM.

Details

Summary

Instead of sub and mul instructions, use v_mad, v_mac or v_fma if fma
instructions are faster and are legal for the given architecture.
Combiner for a simple case that has only one subtraction and one
multiplication instruction and transforms them into some of the fma
instructions depending on the architecture.

Diff Detail

Event Timeline

matejam created this revision.Feb 12 2021, 9:29 AM
matejam requested review of this revision.Feb 12 2021, 9:29 AM
arsenm added inline comments.Feb 12 2021, 9:39 AM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
4085–4086

This should probably allow vectors we can break down later too

4088

Don't see where isFMADLegal is fedined

4107–4119

I'm not sure I follow this heuristic, or what SwapPriority means

4160–4161

The types are all identical, there's no reason to query every type

4164

You can directly use the type and avoid the explicit createGenericVirtualRegister with
auto Neg = B.buildFNeg(Ty, X)

llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-sub-mul.ll
118–119

Why did we fail to fold the modifier here?

matejam updated this revision to Diff 327459.Mar 2 2021, 8:05 AM

Added support for vector types and some refactoring.

Thanks for the suggestions!

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
4088

In the other revision (the parent): D93305

4107–4119

If SwapPriority is equal to 0 that means that the first and second operands aren't both fmul instructions, if it's equal to 2 it means that both of the arguments are fmul and that the second arg has fewer uses so we pick him for folding, vice versa if it's equal to 1. I will make it more simple in the next version.

matejam updated this revision to Diff 328111.Mar 4 2021, 3:45 AM
This comment was removed by matejam.
matejam updated this revision to Diff 329335.Mar 9 2021, 7:18 AM

Put back the accidentally deleted combiner from the list of combiners (load_or_combine).

matejam updated this revision to Diff 343374.May 6 2021, 5:47 AM

Minor changes in CombinerHelper.cpp and in the tests.

matejam updated this revision to Diff 346787.May 20 2021, 10:27 AM

Use mi_match for comparing instructions instead of comparing the opcodes.

foad added a comment.May 28 2021, 7:51 AM

Does this assume that all targets can do the fneg for free? Or can a target choose to fold only (fadd (fmul x, y), z), not (fsub (fmul x, y), z) ?

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
4438

Typo "refers", and they're called MI0 and MI1.

4443

Use hasMoreUses() from the previous patch.

Does this assume that all targets can do the fneg for free? Or can a target choose to fold only (fadd (fmul x, y), z), not (fsub (fmul x, y), z) ?

Similar to SelectionDAG which negates an SDValue:

fold (fsub (fmul x, y), z) -> (fma x, y, (fneg z))
DAG.getNode(PreferredFusedOpcode, SL, VT, XY.getOperand(0),
            XY.getOperand(1), DAG.getNode(ISD::FNEG, SL, VT, Z));
matejam updated this revision to Diff 349856.Jun 4 2021, 7:10 AM

Typos and refactoring.