This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Optimize (add (mul x, c0), c1)
ClosedPublic

Authored by benshi001 on Aug 23 2021, 8:54 PM.

Details

Summary

Optimize (add (mul x, c0), c1) -> (ADDI (MUL (ADDI, c1/c0), c0), c1%c0),
if c1/c0 and c1%c0 are simm12, while c1 is not.

Optimize (add (mul x, c0), c1) -> (MUL (ADDI, c1/c0), c0),
if c1%c0 is zero, and c1/c0 is simm12 while c1 is not.

Diff Detail

Event Timeline

benshi001 created this revision.Aug 23 2021, 8:54 PM
benshi001 requested review of this revision.Aug 23 2021, 8:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2021, 8:54 PM
benshi001 updated this revision to Diff 368264.Aug 23 2021, 8:56 PM
benshi001 edited the summary of this revision. (Show Details)
craig.topper added inline comments.Aug 23 2021, 9:14 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
5864

You just use isInt<12> from MathUtils.h and not use APInt.

5864

You need to guard against C0 being 0.

I'm also concerned about the possibility of signed overflow if C1=INT_MIN C0 = -1.

5875

No need for else after a return.

benshi001 marked 3 inline comments as done.
benshi001 added inline comments.
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
5864

I think we can skip if C1 is -1/0/+1, then all corner cases are excluded.

Thanks, I have changed according to your comments.

What's more, I think for c1==1/0/-1, I can just skip and let other passes handle it.

benshi001 updated this revision to Diff 368335.Aug 24 2021, 6:59 AM
benshi001 edited the summary of this revision. (Show Details)
craig.topper added inline comments.Sep 16 2021, 8:30 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
6200

Please put this call in performADDCombine. The idea was that all ADD specific things should be in one place. This switch is getting pretty crazy.

benshi001 updated this revision to Diff 373123.Sep 16 2021, 7:02 PM
benshi001 marked an inline comment as done.
craig.topper added inline comments.Sep 18 2021, 3:20 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
5851

VT is guaranteed integer since you're starting from an integer add instruction.

5859

I believe N0C and N1C are more consistent with other DAGCombiner code.

5887

I don't like the name N0 here. SelectionDAG code very often uses N0 to mean N->getOperand(0).

I'd suggest

if (SDValue V = transformAddImmMulImm(N, DAG, Subtarget))
  return V;

V is more generic name and putting in the declare in the if limits its scope.

benshi001 updated this revision to Diff 373447.Sep 19 2021, 2:17 AM
benshi001 marked 3 inline comments as done.Sep 19 2021, 2:18 AM
benshi001 updated this revision to Diff 373448.Sep 19 2021, 2:46 AM
This revision is now accepted and ready to land.Sep 20 2021, 9:42 AM
This revision was landed with ongoing or failed builds.Sep 21 2021, 7:13 AM
This revision was automatically updated to reflect the committed changes.

This commit causes an infinite loop when compiling the Linux kernel:

$ git bisect log
# bad: [43d6991c2a4cc2ac374e68c029634f2b59ffdfdf] [IR] Look through bitcast in hasFnAttribute()
# good: [6a2c2263fbca07a59b9f41181c1418df052e24d1] [X86] Improve i8 all-ones element insertion in pre-SSE4.1
git bisect start '43d6991c2a4cc2ac374e68c029634f2b59ffdfdf' '6a2c2263fbca07a59b9f41181c1418df052e24d1'
# good: [792101fff749191dfd4dadabe2ecd30a4d8cd973] [RISCV] Add test cases for missed opportunity to use vfmacc.vf. NFC
git bisect good 792101fff749191dfd4dadabe2ecd30a4d8cd973
# good: [20b58855e0cfc263d609e8bb59e692024ecb42aa] [CodeGen] SelectionDAGBuilder - Use const-ref iterator in for-range loops. NFCI.
git bisect good 20b58855e0cfc263d609e8bb59e692024ecb42aa
# bad: [08ef71ca92d9bd474ac4bbb5109e9ced19c77f0e] [InstCombine] move/add tests for trunc-of-lshr; NFC
git bisect bad 08ef71ca92d9bd474ac4bbb5109e9ced19c77f0e
# bad: [744ec74b305a5039dc74e2d043e1c136e06beac1] [NFC] `goto fail` has failed us in the past...
git bisect bad 744ec74b305a5039dc74e2d043e1c136e06beac1
# good: [b8e7f5320825812fc4b597ee5df3dccc53fe78bb] [AMDGPU][MC][GFX10] Enabled dlc for FLAT and GLOBAL atomics
git bisect good b8e7f5320825812fc4b597ee5df3dccc53fe78bb
# good: [ee31ad0ab5f7d443b0bd582582a3524dcf7f13f0] [clang-offload-bundler][docs][NFC] Add archive unbundling documentation
git bisect good ee31ad0ab5f7d443b0bd582582a3524dcf7f13f0
# good: [32b994bca66641cdac8586f25315daf349921ebc] [OpenCL] Defines helper function for OpenCL default address space
git bisect good 32b994bca66641cdac8586f25315daf349921ebc
# bad: [b3052013b43617defd777b1f3b4339540c4c07df] [RISCV] Optimize (add (mul x, c0), c1)
git bisect bad b3052013b43617defd777b1f3b4339540c4c07df
# first bad commit: [b3052013b43617defd777b1f3b4339540c4c07df] [RISCV] Optimize (add (mul x, c0), c1)

$ cat evergreen.i
int evergreen_dram_bandwidth_for_display_wm_0,
    evergreen_dram_bandwidth_for_display_disp_dram_allocation_0,
    evergreen_dram_bandwidth_for_display_bandwidth_0,
    evergreen_dram_bandwidth_for_display_a_0;
void evergreen_dram_bandwidth_for_display() {
  evergreen_dram_bandwidth_for_display_a_0 = 1000 << 12;
  evergreen_dram_bandwidth_for_display_disp_dram_allocation_0 = 3 << 12;
  long tmp = evergreen_dram_bandwidth_for_display_disp_dram_allocation_0 << 13;
  tmp = tmp / evergreen_dram_bandwidth_for_display_a_0;
  evergreen_dram_bandwidth_for_display_bandwidth_0 =
      evergreen_dram_bandwidth_for_display_wm_0 * tmp + 2048;
}

$ timeout 10s clang -O2 --target=riscv64-linux-gnu -c -o /dev/null evergreen.i

$ echo $?
124

This is still reproducible at current ToT (3778c1cd6ef5a3286d5d49e842a2c65fffb8f3a6).

This commit causes an infinite loop when compiling the Linux kernel:

$ git bisect log
# bad: [43d6991c2a4cc2ac374e68c029634f2b59ffdfdf] [IR] Look through bitcast in hasFnAttribute()
# good: [6a2c2263fbca07a59b9f41181c1418df052e24d1] [X86] Improve i8 all-ones element insertion in pre-SSE4.1
git bisect start '43d6991c2a4cc2ac374e68c029634f2b59ffdfdf' '6a2c2263fbca07a59b9f41181c1418df052e24d1'
# good: [792101fff749191dfd4dadabe2ecd30a4d8cd973] [RISCV] Add test cases for missed opportunity to use vfmacc.vf. NFC
git bisect good 792101fff749191dfd4dadabe2ecd30a4d8cd973
# good: [20b58855e0cfc263d609e8bb59e692024ecb42aa] [CodeGen] SelectionDAGBuilder - Use const-ref iterator in for-range loops. NFCI.
git bisect good 20b58855e0cfc263d609e8bb59e692024ecb42aa
# bad: [08ef71ca92d9bd474ac4bbb5109e9ced19c77f0e] [InstCombine] move/add tests for trunc-of-lshr; NFC
git bisect bad 08ef71ca92d9bd474ac4bbb5109e9ced19c77f0e
# bad: [744ec74b305a5039dc74e2d043e1c136e06beac1] [NFC] `goto fail` has failed us in the past...
git bisect bad 744ec74b305a5039dc74e2d043e1c136e06beac1
# good: [b8e7f5320825812fc4b597ee5df3dccc53fe78bb] [AMDGPU][MC][GFX10] Enabled dlc for FLAT and GLOBAL atomics
git bisect good b8e7f5320825812fc4b597ee5df3dccc53fe78bb
# good: [ee31ad0ab5f7d443b0bd582582a3524dcf7f13f0] [clang-offload-bundler][docs][NFC] Add archive unbundling documentation
git bisect good ee31ad0ab5f7d443b0bd582582a3524dcf7f13f0
# good: [32b994bca66641cdac8586f25315daf349921ebc] [OpenCL] Defines helper function for OpenCL default address space
git bisect good 32b994bca66641cdac8586f25315daf349921ebc
# bad: [b3052013b43617defd777b1f3b4339540c4c07df] [RISCV] Optimize (add (mul x, c0), c1)
git bisect bad b3052013b43617defd777b1f3b4339540c4c07df
# first bad commit: [b3052013b43617defd777b1f3b4339540c4c07df] [RISCV] Optimize (add (mul x, c0), c1)

$ cat evergreen.i
int evergreen_dram_bandwidth_for_display_wm_0,
    evergreen_dram_bandwidth_for_display_disp_dram_allocation_0,
    evergreen_dram_bandwidth_for_display_bandwidth_0,
    evergreen_dram_bandwidth_for_display_a_0;
void evergreen_dram_bandwidth_for_display() {
  evergreen_dram_bandwidth_for_display_a_0 = 1000 << 12;
  evergreen_dram_bandwidth_for_display_disp_dram_allocation_0 = 3 << 12;
  long tmp = evergreen_dram_bandwidth_for_display_disp_dram_allocation_0 << 13;
  tmp = tmp / evergreen_dram_bandwidth_for_display_a_0;
  evergreen_dram_bandwidth_for_display_bandwidth_0 =
      evergreen_dram_bandwidth_for_display_wm_0 * tmp + 2048;
}

$ timeout 10s clang -O2 --target=riscv64-linux-gnu -c -o /dev/null evergreen.i

$ echo $?
124

This is still reproducible at current ToT (3778c1cd6ef5a3286d5d49e842a2c65fffb8f3a6).

Thanks for the report. Hopefully https://reviews.llvm.org/rG40b230f6856d fixes this

Thanks, I can confirm that commit fixes it!