This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][InstCombine] Remove zero LOD bias
ClosedPublic

Authored by sebastian-ne on Dec 20 2021, 7:54 AM.

Details

Summary

If the bias is zero, we can remove it from the image instruction.
Also move the image optimizations to IR combines.

Diff Detail

Event Timeline

sebastian-ne created this revision.Dec 20 2021, 7:54 AM
sebastian-ne requested review of this revision.Dec 20 2021, 7:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2021, 7:54 AM
arsenm added inline comments.Dec 20 2021, 7:55 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4455 ↗(On Diff #395452)

The DAG version doesn't have the isNegative check?

foad added a comment.Dec 20 2021, 7:59 AM

Could this be done at the IR level instead, in AMDGPUInstCombineIntrinsic?

sebastian-ne marked an inline comment as done.

Remove the isNegative check from GlobalISel and add tests to the pre-committed test files.

Could this be done at the IR level instead, in AMDGPUInstCombineIntrinsic?

I think so. All the three combines (l -> lz, mip -> nomip and bias -> nobias) can probably be combines. It’s on my todo list, do you think it should be done before this patch?

Remove the isNegative check from GlobalISel and add tests to the pre-committed test files.

Could this be done at the IR level instead, in AMDGPUInstCombineIntrinsic?

I think so. All the three combines (l -> lz, mip -> nomip and bias -> nobias) can probably be combines. It’s on my todo list, do you think it should be done before this patch?

I think optimizations should only be repeated in codegen if they could plausibly show up as a result of legalization exposing the pattern

foad added a comment.Dec 20 2021, 8:19 AM

If we do it at the IR level, why do we need this patch?

If we do it at the IR level, why do we need this patch?

Oh, my question was meant to ask if I should do 1. push this patch, 2. move all combines to IR combines, or if it should be 1. move the existing optimizations to IR combines, 2. implement the zero bias optimization as IR combine.

I think optimizations should only be repeated in codegen if they could plausibly show up as a result of legalization exposing the pattern

I interpret this as: We shall implement these three optimizations as IR combines and remove them from the code generation (as compared to duplicate the code as IR combine and leave it in codgen)?

foad added a comment.Dec 20 2021, 8:44 AM

If we do it at the IR level, why do we need this patch?

Oh, my question was meant to ask if I should do 1. push this patch, 2. move all combines to IR combines, or if it should be 1. move the existing optimizations to IR combines, 2. implement the zero bias optimization as IR combine.

It seems a bit silly to review and push this code if you are planning to remove it again. But I don't really mind.

Move all image optimizations to IR combines and add image_gather instructions to zero-bias optimization.

sebastian-ne edited the summary of this revision. (Show Details)Dec 21 2021, 2:27 AM
foad added a comment.Dec 21 2021, 6:13 AM

I like it.

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
103

I don't understand why this needs to be a template, if FuncT is always the same as you have documented.

142

Is it possible that more than one of these optimizations applies to a single intrinsic call, and if so do you handle that case correctly? (E.g. when you apply one optimization, does simplifyAMDGCNImageIntrinsic get called again to apply other optimizations to the new intrinsic?)

arsenm added inline comments.Dec 21 2021, 6:18 AM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
6249 ↗(On Diff #395624)

I would split the codegen change into a separate patch

llvm/test/CodeGen/AMDGPU/image_ls_mipmap_zero.ll
132 ↗(On Diff #395624)

I don't know about deleting codegen tests for this, these still show the behavior in this scenario

sebastian-ne marked 2 inline comments as done.

Remove codegen changes from this patch and use std::function instead of template.

arsenm accepted this revision.Jan 4 2022, 12:12 PM
This revision is now accepted and ready to land.Jan 4 2022, 12:12 PM
sebastian-ne added inline comments.Jan 5 2022, 1:57 AM
llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
142

Yes, it gets called again. (E.g. the combines for Intrinsic::amdgcn_fma_legacy also return early when one of them matches.)
The @sample_b_1d_a16 test also checks that the no-bias and a16 combine is applied.

llvm/test/CodeGen/AMDGPU/image_ls_mipmap_zero.ll
132 ↗(On Diff #395624)

I deleted them because they started failing for me. I guess the codegen tests don’t run the combines? (Anyway, this moved to D116116)

This revision was landed with ongoing or failed builds.Jan 21 2022, 3:09 AM
This revision was automatically updated to reflect the committed changes.