Page MenuHomePhabricator

[AAch64] Optimize muls with operands having enough sign bits. One operand is a sub.
Needs ReviewPublic

Authored by bipmis on Dec 6 2022, 3:58 AM.

Details

Reviewers
dmgreen
samtebbs
Summary

Muls with 64bit operands where one of the operands is a register with enough sign bits
The other operand is a sub with enough sign bits.
We can generate a 32bit sub and a single smull instruction on a 32bit operand.

Diff Detail

Unit TestsFailed

TimeTest
60,040 msx64 debian > libFuzzer.libFuzzer::fuzzer-leak.test
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/lib/fuzzer -m64 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/fuzzer/LeakTest.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/fuzzer/X86_64DefaultLinuxConfig/Output/fuzzer-leak.test.tmp-LeakTest
60,050 msx64 debian > libFuzzer.libFuzzer::minimize_crash.test
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/lib/fuzzer -m64 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/fuzzer/NullDerefTest.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/fuzzer/X86_64DefaultLinuxConfig/Output/minimize_crash.test.tmp-NullDerefTest

Event Timeline

bipmis requested review of this revision.Dec 6 2022, 3:58 AM
bipmis created this revision.

Can you give some more details about why is this true? I would expect the sub to have 31 sign bits.

The mul in submulwithsignbits will be commutative, so will match either way. The code needs to account for that I think, not just check for operand(1).

bipmis added a comment.EditedDec 6 2022, 8:00 AM

Can you give some more details about why is this true? I would expect the sub to have 31 sign bits.

The mul in submulwithsignbits will be commutative, so will match either way. The code needs to account for that I think, not just check for operand(1).

Basically if looked from IR perspective we are trying to implement

define i64 @smull_sext_sub(i32* %x0, i32 %x1, i32 %x2) {
entry:
  %ext64 = load i32, i32* %x0
  %sext = sext i32 %ext64 to i64
  %sext2 = sext i32 %x1 to i64
  %sext3 = sext i32 %x2 to i64
  %sub = sub i64 %sext, %sext2
  %mul = mul i64 %sext3, %sub
  ret i64 %mul
}

as

define i64 @smull_sext_sub2(i32* %x0, i32 %x1, i32 %x2) {
entry:
  %ext64 = load i32, i32* %x0
  %sext3 = sext i32 %x2 to i64
  %sub = sub i32 %ext64, %x1
  %sext = sext i32 %sub to i64
  %mul = mul i64 %sext3, %sext
  ret i64 %mul
}

Why 31 bits. If we look sub and mul as arithmetic instructions they both need the same number of sign bits to determine of a 64bit arithmetic can be reduced to an equivalent 32bit.
A simple example - https://alive2.llvm.org/ce/z/RJU_ua