This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Optimize muls with operands having enough sign bits. One operand is a sub.
Abandoned, Public

Authored by bipmis on Dec 6 2022, 3:58 AM.

Details

Reviewers
dmgreen
samtebbs
Summary

This handles muls with 64-bit operands where one operand is a register with enough sign bits
and the other operand is a sub with enough sign bits.
In that case we can generate a 32-bit sub and a single smull instruction on the 32-bit operands.
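
As an illustration of what "enough sign bits" means here, the following Python sketch counts the sign bits of a 64-bit value by brute force. The helper names `num_sign_bits` and `fits_in_i32` are hypothetical and are not taken from the patch, and this is not how LLVM computes sign bits; it only models the property being checked.

```python
def num_sign_bits(value: int, width: int = 64) -> int:
    """Count the leading bits of a width-bit two's-complement value
    that are equal to its sign bit (the sign bit itself included)."""
    bits = value & ((1 << width) - 1)       # two's-complement bit pattern
    sign = (bits >> (width - 1)) & 1
    count = 0
    for i in range(width - 1, -1, -1):
        if (bits >> i) & 1 != sign:
            break
        count += 1
    return count


def fits_in_i32(value: int) -> bool:
    # A 64-bit value is the sign extension of some 32-bit value
    # exactly when it has at least 33 sign bits.
    return num_sign_bits(value, 64) >= 33
```

Under this model, any value produced by sign-extending an i32 to i64 has at least 33 sign bits, while a value such as 2**31 has only 32 and does not fit in a signed 32-bit register.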


Event Timeline

bipmis requested review of this revision. Dec 6 2022, 3:58 AM
bipmis created this revision.

Can you give some more details about why is this true? I would expect the sub to have 31 sign bits.

The mul in submulwithsignbits will be commutative, so will match either way. The code needs to account for that I think, not just check for operand(1).
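
The commutativity point can be sketched as follows. This is a minimal Python model, not the patch's code: the `Node` type and the `match_mul_with_sub` helper are hypothetical stand-ins for a DAG node and a matcher, used only to show a check that tries both operand orders instead of looking at operand 1 alone.

```python
from typing import NamedTuple, Optional, Tuple


class Node(NamedTuple):
    # Hypothetical stand-in for a DAG node: an opcode plus its operands.
    opcode: str
    operands: Tuple["Node", ...] = ()


def match_mul_with_sub(mul: Node) -> Optional[Tuple[Node, Node]]:
    """Return (other_operand, sub_operand) if either operand of the
    mul is a sub, regardless of which side it is on; otherwise None."""
    if mul.opcode != "mul" or len(mul.operands) != 2:
        return None
    lhs, rhs = mul.operands
    # mul is commutative, so try the sub on the right first, then on the left.
    for other, candidate in ((lhs, rhs), (rhs, lhs)):
        if candidate.opcode == "sub":
            return other, candidate
    return None
```

With this shape, `mul(reg, sub)` and `mul(sub, reg)` both produce the same match, so later checks on the two operands do not depend on operand order.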

bipmis added a comment. Edited Dec 6 2022, 8:00 AM

> Can you give some more details about why is this true? I would expect the sub to have 31 sign bits.
>
> The mul in submulwithsignbits will be commutative, so will match either way. The code needs to account for that I think, not just check for operand(1).

Basically, from the IR perspective, we are trying to implement

define i64 @smull_sext_sub(i32* %x0, i32 %x1, i32 %x2) {
entry:
  %ext64 = load i32, i32* %x0
  %sext = sext i32 %ext64 to i64
  %sext2 = sext i32 %x1 to i64
  %sext3 = sext i32 %x2 to i64
  %sub = sub i64 %sext, %sext2
  %mul = mul i64 %sext3, %sub
  ret i64 %mul
}

as

define i64 @smull_sext_sub2(i32* %x0, i32 %x1, i32 %x2) {
entry:
  %ext64 = load i32, i32* %x0
  %sext3 = sext i32 %x2 to i64
  %sub = sub i32 %ext64, %x1
  %sext = sext i32 %sub to i64
  %mul = mul i64 %sext3, %sext
  ret i64 %mul
}

Why 31 bits? If we look at sub and mul as arithmetic instructions, they both need the same number of sign bits to determine whether a 64-bit operation can be reduced to an equivalent 32-bit one.
A simple example: https://alive2.llvm.org/ce/z/RJU_ua
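
For experimenting with the two IR functions above, here is a small Python model of each. It is a sketch under stated assumptions: the loaded value is passed directly as `x0` instead of through a pointer, the `sext` helper is hypothetical, and Python's unbounded integers are masked by hand to imitate i32 and i64 wrapping.

```python
def sext(value: int, from_bits: int) -> int:
    """Interpret the low from_bits bits of value as a signed integer."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    return (value ^ sign) - sign


def smull_sext_sub(x0: int, x1: int, x2: int) -> int:
    # Models @smull_sext_sub: sign-extend first, subtract in 64 bits,
    # then multiply in 64 bits.
    sub = sext(sext(x0, 32) - sext(x1, 32), 64)
    return sext(sext(x2, 32) * sub, 64)


def smull_sext_sub2(x0: int, x1: int, x2: int) -> int:
    # Models @smull_sext_sub2: subtract in 32 bits (wrapping),
    # sign-extend the result, then multiply in 64 bits.
    sub = sext(x0 - x1, 32)
    return sext(sext(x2, 32) * sub, 64)
```

In this model the two functions return the same result whenever `x0 - x1` fits in a signed 32-bit integer, for example `(5, 3, 7)` gives 14 in both. They differ when the 32-bit subtraction wraps: for `(-2**31, 1, 1)` the first returns -2147483649 and the second returns 2147483647, which is the boundary case the sign-bit question above is about.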

bipmis abandoned this revision. Jul 27 2023, 1:55 AM