Page MenuHomePhabricator

[AMDGPU][GlobalISel] Transform (fadd (fmul x, y), z) -> (fma x, y, z)
AcceptedPublic

Authored by matejam on Dec 15 2020, 8:21 AM.

Details

Summary

Instead of add (sub) and mul instructions, use v_mad, v_mac or v_fma
if fma instructions are faster and are legal for the given architecture.
Combiner for a simple case that has only one add and one mul instruction
and transforms them into some of the fma instructions depending on the
architecture.

Diff Detail

Unit TestsFailed

TimeTest
2,500 msx64 debian > libarcher.races::task-two.c
Script: -- : 'RUN: at line 13'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-two.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/deflake.bash /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-two.c

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
matejam requested review of this revision.Dec 15 2020, 8:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2020, 8:21 AM
arsenm added inline comments.Dec 15 2020, 3:43 PM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
3145–3146

Using bf16 here instead of regular half is wrong. We also have a utility function to get the FP type from the LLT

3159

Checking the denormal mode here isn't right. The denormal mode only impacts whether this should be used for AMDGPU and isn't a generic property

3185–3187

return compare directly

3196

The matching logic here implies this should be able to use fma also

matejam updated this revision to Diff 323356.Feb 12 2021, 9:24 AM
matejam retitled this revision from [AMDGPU][GlobalISel] Add combiner for generating G_FMAD to [AMDGPU][GlobalISel] Add combiners for v_mad/v_mac/v_fma.

Instead of add and mul instructions, use v_mad, v_mac or v_fma
if fma instructions are faster and are legal for the given architecture.
Combiner for a simple case that has only one add and one mul instruction
and transforms them into some of the fma instructions depending on the
architecture.

matejam retitled this revision from [AMDGPU][GlobalISel] Add combiners for v_mad/v_mac/v_fma to [AMDGPU][GlobalISel] Transform (fadd (fmul x, y), z) -> (fma x, y, z).Feb 12 2021, 9:28 AM
matejam edited the summary of this revision. (Show Details)
arsenm added inline comments.Feb 12 2021, 9:42 AM
llvm/include/llvm/CodeGen/TargetLowering.h
796

This will break for vectors

1116

This will break for vectors. Also, this should not be built on top of the DAG legalizer rules. This should be something in LegalizerInfo

2744

Broken for vectors

foad added inline comments.Feb 12 2021, 10:49 AM
llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
112

isLegal needs its own comment.

333

Comment should mention fmad.

334

Capitalise them like "FAdd" and "FMul" and "FMA" and "FMad". There are akready examples of this in GlobalISel.

550

"FMul"

llvm/include/llvm/CodeGen/TargetLowering.h
795

Function needs a comment.

798

Do you need EVT() here? I thought MVT would implicitly convert to EVT.

1114

Function needs a comment.

2693

Function needs a comment.

2734

Function needs a comment.

2739

Function needs a comment.

llvm/include/llvm/Target/GlobalISel/Combine.td
560

"combine_fadd_fmul_to_fmad_or_fma"?

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
3575

Function needs a comment.

matejam updated this revision to Diff 328110.Mar 4 2021, 3:44 AM

Added support for vector types and some refactoring.

foad added inline comments.Mar 5 2021, 1:42 AM
llvm/include/llvm/Target/GlobalISel/Combine.td
608

You've lost load_or_combine.

matejam updated this revision to Diff 329329.Mar 9 2021, 7:14 AM

Put back the accidentally deleted combiner from the list of combiners (load_or_combine).

matejam updated this revision to Diff 336459.EditedApr 9 2021, 7:57 AM

Added two mir tests (prelegalize and postlegalize combiner) and a few adjustments in CombinerHelper.cpp.

matejam updated this revision to Diff 336833.Apr 12 2021, 7:58 AM

Code formatting in CombinerHelper.cpp.

matejam updated this revision to Diff 337401.Apr 14 2021, 5:36 AM

Adjustments in the tests.

foad added inline comments.Apr 22 2021, 4:27 AM
llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
110–111

Comment seems wrong. Should it be "... running after legalization ..."?

llvm/include/llvm/CodeGen/TargetLowering.h
2717

"const LLT" seems weird for a parameter type. Is there a reason for the const?

2729

"const LLT" seems weird.

But why do we need this isFMADLegal function, which calls into selectiondag legalization code? Can't you use CombinerHelper isLegal methods instead?

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
3906

This could be static, just like isContractable?

3910

return AllowFusionGlobally || isContractable(MI);?

3927

MI.getMF()

3939

This expression is exactly what "isLegalOrBeforeLegalization" does, so you didn't need to create isLegal.

3951

!isContractable already implies !MI.FmReassoc so you don't need to check it again here.

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3330

Can you submit this change as a separate patch? (Does it affect any existing lit tests?)

matejam added inline comments.Apr 22 2021, 5:31 AM
llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
110–111

That seems ok, I'll change that.

llvm/include/llvm/CodeGen/TargetLowering.h
2717

I don't think there is a good reason for that, I'll change that.

2729

I'll change the const LLT.
I tried to find a way to call methods like "hasMadF16", "hasFP64FP16Denormals", etc. from CombinerHelper, but using adjusted SelectionDAG legalization code seemed much easier and prettier to me.

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
3906

It probably should, I'll change that.

3910

Thanks!

3927

Thanks!

3939

"isLegalOrBeforeLegalization" returns the value of "!LI || LI->getAction(Query).Action == LegalizeActions::Legal"
and isLegal returns the value of "LI && LI->getAction(Query).Action == LegalizeActions::Legal", so I thinks it's not exactly the same.

3951

Thanks!

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3330

Yes, it should be in a separate patch, it was added here by mistake.

foad added inline comments.Apr 22 2021, 6:04 AM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
3939

I mean !LegalOperations || isLegal(...) is the same as isLegalOrBeforeLegalization(...). Because LegalOperations is just another name for LI.

foad added inline comments.Apr 22 2021, 6:15 AM
llvm/include/llvm/CodeGen/TargetLowering.h
2729

I was hoping that hasFP64FP16Denormals etc would be tested in AMDGPULegalizerInfo::AMDGPULegalizerInfo, where it decides whether G_FMAD is legal, so the normal isLegal* functions would work. But now I see that hasFP64FP16Denormals depends on function attributes, so it can't be done that way.

matejam added inline comments.Apr 22 2021, 11:17 AM
llvm/include/llvm/CodeGen/TargetLowering.h
2729

So do you think I should keep it this way?

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
3939

You're right, I'll change that. Thanks!

foad added inline comments.Apr 23 2021, 2:00 AM
llvm/include/llvm/CodeGen/TargetLowering.h
2729

I'm not sure. Alternatively you could just check "isLegalOrBeforeLegalizer" on the FMAD instructions. Then pre-legalization we would always combine to FMAD, and then the legalizer might decide that the FMAD is not legal and turn it into either MUL+ADD or FMA. Would that work?

foad added a comment.Apr 29 2021, 4:20 AM

I think this will look good once you've updated the patch for the outstanding minor comments.

llvm/include/llvm/CodeGen/TargetLowering.h
2729

We discussed this offline. I still think this would be an interesting experiment to try, but I also think the approach in the current patch is OK for now.

matejam updated this revision to Diff 341512.Apr 29 2021, 7:19 AM

A few minor adjustments in CombinerHelper.cpp.

matejam updated this revision to Diff 343368.May 6 2021, 5:45 AM

Minor changes in CombinerHelper.cpp and in the tests.

matejam updated this revision to Diff 346781.May 20 2021, 10:24 AM

Use mi_match for comparing instructions instead of comparing the opcodes.

foad accepted this revision.May 28 2021, 6:23 AM
foad added reviewers: aemerson, paquette.

LGTM but I'm adding a couple more globalisel reviewers. Please wait a couple more days in case they have comments.

llvm/include/llvm/CodeGen/TargetLowering.h
797

"benefits"

2735

"if \p MI can be combined with another instruction to form"?

This revision is now accepted and ready to land.May 28 2021, 6:23 AM
matejam updated this revision to Diff 348797.May 31 2021, 7:09 AM

A few typos.

arsenm added inline comments.Jun 1 2021, 4:27 PM
llvm/include/llvm/CodeGen/TargetLowering.h
2729

Right, the FP mode is not a property of the target. It's a property of the FP mode of the function, which is contextually valid or not. The legalizer rules are not allowed to differ based on context like that