Though maybe the exact patterns could be generated automatically with nested loops, to reduce the mental burden of proving correctness? I've sketched something and it seems doable:

>>> alsl = lambda j, k, imm: (j << imm) + k
>>> a = [alsl(alsl(1, 1, i), alsl(1, 1, i), j) for j in range(1,5) for i in range(1,5)]
>>> a
[9, 15, 27, 51, 15, 25, 45, 85, 27, 45, 81, 153, 51, 85, 153, 289]
>>> list(sorted(set(a)))
[9, 15, 25, 27, 45, 51, 81, 85, 153, 289]
>>> b = [alsl(alsl(1, 1, i), 1, j) for j in range(1,5) for i in range(1,5)]
>>> b
[7, 11, 19, 35, 13, 21, 37, 69, 25, 41, 73, 137, 49, 81, 145, 273]
>>> ab = set(a).union(set(b))
>>> len(ab)
24
>>> list(sorted(set(a).intersection(set(b))))
[25, 81]

>>> c = [alsl(1, alsl(1, 1, i), j) for j in range(1,5) for i in range(1,5)]
>>> c
[5, 7, 11, 19, 7, 9, 13, 21, 11, 13, 17, 25, 19, 21, 25, 33]
>>> list(sorted(set(c)))
[5, 7, 9, 11, 13, 17, 19, 21, 25, 33]

>>> set(c).difference(ab)
{33, 5, 17}
>>> all_consts = set(c).union(ab).difference({3, 5, 9, 17})
>>> len(all_consts)
24
>>> list(sorted(all_consts))
[7, 11, 13, 15, 19, 21, 25, 27, 33, 35, 37, 41, 45, 49, 51, 69, 73, 81, 85, 137, 145, 153, 273, 289]

So basically, we can strength-reduce a total of 24 different constant-multiplications with two alsl's:

case 1: alsl T, X, X, i; alsl Y, T, T, j: 15, 25, 27, 45, 51, 81, 85, 153, 289
case 2: alsl T, X, X, i; alsl Y, T, X, j: 7, 11, 13, 19, 21, 25, 35, 37, 41, 49, 69, 73, 81, 137, 145, 273
case 3: alsl T, X, X, i; alsl Y, X, T, j: 7, 11, 13, 19, 21, 25, 33

The problem is that the three combinations overlap with each other, and cases 1 and 3 each produce some constants more than once. If we could somehow avoid producing conflicting rules, then leveraging TableGen's loop and computation abilities would probably produce code that's easier to maintain. Otherwise, simplifying the code with some macros could also be beneficial.
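
One way to avoid conflicting rules is to give each constant to the first case that produces it. A minimal Python sketch, assuming a priority order of case 2, then case 1, then case 3; the order and the helper name `assign_cases` are illustrative choices, not something the patch prescribes. The constants 3, 5, 9 and 17 are skipped as in the session above (each is 2^k + 1, which a single alsl already produces).

```python
def alsl(j, k, imm):
    """Model of alsl: (j << imm) + k."""
    return (j << imm) + k

SHIFTS = range(1, 5)          # shift amounts 1..4
EXCLUDED = {3, 5, 9, 17}      # skipped as in the session above

# Each generator maps (i, j) to the multiplier produced by that case.
# The ordering of this dict is the (assumed) priority order.
CASES = {
    "case 2": lambda i, j: alsl(alsl(1, 1, i), 1, j),
    "case 1": lambda i, j: alsl(alsl(1, 1, i), alsl(1, 1, i), j),
    "case 3": lambda i, j: alsl(1, alsl(1, 1, i), j),
}

def assign_cases():
    """Return {constant: (case, i, j)}, keeping only the first producer."""
    owner = {}
    for name, gen in CASES.items():   # dicts preserve insertion order
        for j in SHIFTS:
            for i in SHIFTS:
                c = gen(i, j)
                if c not in EXCLUDED and c not in owner:
                    owner[c] = (name, i, j)
    return owner

owner = assign_cases()
by_case = {}
for c, (name, _, _) in sorted(owner.items()):
    by_case.setdefault(name, []).append(c)
for name, consts in by_case.items():
    print(name, consts)
```

With this ordering every constant has exactly one owner, so no two generated rules would match the same multiplier. A different priority order would move the shared constants (such as 25 and 81) to another case without changing the total.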

llvm/lib/Target/LoongArch/LoongArchInstrInfo.td, lines 865–866 (on Diff #509937): The inner `if` could be simplified into just `return N1C->hasOneUse()`. The outer `return false` could be kept though, to avoid an overly complex single return expression.

Harbormaster completed remote builds in B222936: Diff 509937. Mar 31 2023, 3:22 AM

In D147305#4236071, @xen0n wrote:
This is good strength reduction overall, thanks for the insight!


Thanks for your suggestion! Using foreach really makes the code more clear!

BTW: ALSL only accepts shift amounts 1, 2, 3 and 4; the value 5 is not supported.

In D147305#4236226, @benshi001 wrote:
Thanks for your suggestion! Using foreach really makes the code more clear!

My pleasure. ;-)

BTW: ALSL only accepts shift amounts 1, 2, 3 and 4; the value 5 is not supported.

That's just one of Python's idiosyncrasies: range(1, 5) yields 1, 2, 3, 4 because the end value is exclusive, just like for (int i = 1; i < 5; i++) in C.
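
For anyone who wants to check this, a short Python sketch; the variable names are illustrative only.

```python
shifts = list(range(1, 5))    # the stop value 5 is exclusive
print(shifts)

# The C-style loop `for (int i = 1; i < 5; i++)`, written as a while loop.
c_style = []
i = 1
while i < 5:
    c_style.append(i)
    i += 1
print(c_style == shifts)
```

So the sketch never passes 5 as a shift amount; only 1 through 4 reach `alsl`.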

Harbormaster completed remote builds in B222972: Diff 509975. Mar 31 2023, 8:22 AM

Thanks for the improvements.
So this is only for the case 2 mentioned by @xen0n, right? Seems that the test for 81 is missing.

Will case 1 and case 3 be handled later?

benshi001 updated this revision to Diff 510162. Mar 31 2023, 7:26 PM

Harbormaster completed remote builds in B223102: Diff 510162. Mar 31 2023, 7:27 PM

In D147305#4237944, @SixWeining wrote:

Thanks for the improvements.
So this is only for the case 2 mentioned by @xen0n, right? Seems that the test for 81 is missing.

Will case 1 and case 3 be handled later?

Thanks for your comments.

The missing case 81 is added.
I will implement case 1 and case 3 later in another patch. BTW: the total number of constants in these two cases is small, and some of them even duplicate values in case 2, so how about implementing them without foreach, as standalone Pat definitions written one by one?

benshi001 updated this revision to Diff 510164. Mar 31 2023, 7:46 PM

Harbormaster completed remote builds in B223104: Diff 510164. Mar 31 2023, 8:26 PM

In D147305#4237973, @benshi001 wrote:

Thanks for your comments.

The missing case 81 is added.

I will implement case 1 and case 3 later in another patch. BTW: the total number of constants in these two cases is small, and some of them even duplicate values in case 2, so how about implementing them without foreach, as standalone Pat definitions written one by one?

As for point 2, fine by me. Case 1 would still have many remaining constants, so a macro would go a long way (you could iterate over only the distinct Imm1 and Imm2 combinations and auto-compute the source constant, as you did for Case 2), and only 33 would be left for Case 3, so you may write that pattern out directly.
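
The Case 1 suggestion can be sketched in Python before writing any TableGen. Case 1 computes (2^Imm1 + 1) * (2^Imm2 + 1), so swapping Imm1 and Imm2 gives the same constant and only combinations with Imm1 <= Imm2 are needed. The names `case1_patterns` and `CASE2` are hypothetical; skipping 9 (a single alsl) and the values already covered by case 2 follows the discussion above.

```python
SHIFTS = range(1, 5)

# Multipliers already covered by case 2: ((2^i + 1) << j) + 1.
CASE2 = {(((1 << i) + 1) << j) + 1 for i in SHIFTS for j in SHIFTS}

def case1_patterns():
    """Yield (imm1, imm2, constant) for the distinct case 1 patterns."""
    for imm1 in SHIFTS:
        for imm2 in range(imm1, 5):       # imm1 <= imm2 avoids mirror pairs
            t = (1 << imm1) + 1           # alsl T, X, X, imm1
            const = (t << imm2) + t       # alsl Y, T, T, imm2
            if const == 9 or const in CASE2:
                continue                  # handled elsewhere
            yield imm1, imm2, const

for imm1, imm2, const in case1_patterns():
    print(f"alsl T, X, X, {imm1}; alsl Y, T, T, {imm2}  ->  X * {const}")
```

Under these assumptions seven patterns remain for Case 1, each with a unique source constant, so a loop over the combinations would not generate conflicting rules.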

In D147305#4238054, @xen0n wrote:

As for point 2, fine by me. Case 1 would still have many remaining constants, so a macro would go a long way (you could iterate over only the distinct Imm1 and Imm2 combinations and auto-compute the source constant, as you did for Case 2), and only 33 would be left for Case 3, so you may write that pattern out directly.

Thanks for your suggestion. I will do that in my next patch.

benshi001 edited the summary of this revision. Apr 1 2023, 12:53 AM

xen0n accepted this revision. Apr 1 2023, 1:12 AM

This revision is now accepted and ready to land. Apr 1 2023, 1:12 AM

LGTM.

Closed by commit rG734c21300430: [LoongArch] Optimize multiplication with immediates (authored by benshi001). Apr 1 2023, 3:12 AM

This revision was automatically updated to reflect the committed changes.

benshi001 added a commit: rG734c21300430: [LoongArch] Optimize multiplication with immediates.

benshi001 mentioned this in rG918209bf856e: [LoongArch][NFC] Add tests of multiplication with immediates (for D147305).

In D147305#4238054, @xen0n wrote:

As for point 2, fine by me. Case 1 would still have many remaining constants, so a macro would go a long way (you could iterate over only the distinct Imm1 and Imm2 combinations and auto-compute the source constant, as you did for Case 2), and only 33 would be left for Case 3, so you may write that pattern out directly.

Unfortunately, the case 3 you mentioned cannot be implemented: apart from the values that duplicate case 2, the only remaining immediate is 33, and x * 33 will be optimized to (x << 5) + x.
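
A quick Python check of that observation. It only verifies the arithmetic; that the compiler actually emits the shift-and-add form for x * 33 is taken from the comment above, not checked against the backend.

```python
def alsl(j, k, imm):
    """Model of alsl: (j << imm) + k."""
    return (j << imm) + k

SHIFTS = range(1, 5)
case2 = {alsl(alsl(1, 1, i), 1, j) for i in SHIFTS for j in SHIFTS}
case3 = {alsl(1, alsl(1, 1, i), j) for i in SHIFTS for j in SHIFTS}

# Drop 5, 9 and 17 (single-alsl constants) and everything case 2 covers.
remaining = sorted(case3 - case2 - {5, 9, 17})
print(remaining)

# x * 33 equals a shift by 5 plus x for any integer x.
assert all(x * 33 == (x << 5) + x for x in range(-1000, 1000))
```

Only 33 survives the filtering, and the shift amount needed to express it in one step is 5, which is outside the 1..4 range that the alsl sketch uses.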

Diff 509930

llvm/test/CodeGen/LoongArch/ir-instruction/mul.ll

; LA64: # %bb.0:
; LA64-NEXT: mulw.d.wu $a0, $a0, $a1
; LA64-NEXT: ret
%1 = zext i32 %a to i64
%2 = zext i32 %b to i64
%3 = mul i64 %1, %2
ret i64 %3
}

				define signext i32 @mul_i32_11(i32 %a) {
				; LA32-LABEL: mul_i32_11:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 11
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_11:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 11
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 11
				ret i32 %b
				}

				define signext i32 @mul_i32_13(i32 %a) {
				; LA32-LABEL: mul_i32_13:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 13
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_13:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 13
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 13
				ret i32 %b
				}

				define signext i32 @mul_i32_19(i32 %a) {
				; LA32-LABEL: mul_i32_19:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 19
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_19:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 19
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 19
				ret i32 %b
				}

				define signext i32 @mul_i32_21(i32 %a) {
				; LA32-LABEL: mul_i32_21:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 21
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_21:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 21
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 21
				ret i32 %b
				}

				define signext i32 @mul_i32_25(i32 %a) {
				; LA32-LABEL: mul_i32_25:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 25
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_25:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 25
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 25
				ret i32 %b
				}

				define signext i32 @mul_i32_35(i32 %a) {
				; LA32-LABEL: mul_i32_35:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 35
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_35:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 35
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 35
				ret i32 %b
				}

				define signext i32 @mul_i32_37(i32 %a) {
				; LA32-LABEL: mul_i32_37:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 37
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_37:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 37
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 37
				ret i32 %b
				}

				define signext i32 @mul_i32_41(i32 %a) {
				; LA32-LABEL: mul_i32_41:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 41
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_41:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 41
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 41
				ret i32 %b
				}

				define signext i32 @mul_i32_49(i32 %a) {
				; LA32-LABEL: mul_i32_49:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 49
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_49:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 49
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 49
				ret i32 %b
				}

				define signext i32 @mul_i32_69(i32 %a) {
				; LA32-LABEL: mul_i32_69:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 69
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_69:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 69
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 69
				ret i32 %b
				}

				define signext i32 @mul_i32_73(i32 %a) {
				; LA32-LABEL: mul_i32_73:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 73
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_73:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 73
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 73
				ret i32 %b
				}

				define signext i32 @mul_i32_137(i32 %a) {
				; LA32-LABEL: mul_i32_137:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 137
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_137:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 137
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 137
				ret i32 %b
				}

				define signext i32 @mul_i32_145(i32 %a) {
				; LA32-LABEL: mul_i32_145:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 145
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_145:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 145
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 145
				ret i32 %b
				}

				define signext i32 @mul_i32_273(i32 %a) {
				; LA32-LABEL: mul_i32_273:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a1, $zero, 273
				; LA32-NEXT: mul.w $a0, $a0, $a1
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i32_273:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 273
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: addi.w $a0, $a0, 0
				; LA64-NEXT: ret
				%b = mul i32 %a, 273
				ret i32 %b
				}

				define i64 @mul_i64_11(i64 %a) {
				; LA32-LABEL: mul_i64_11:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 11
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_11:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 11
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 11
				ret i64 %b
				}

				define i64 @mul_i64_13(i64 %a) {
				; LA32-LABEL: mul_i64_13:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 13
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_13:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 13
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 13
				ret i64 %b
				}

				define i64 @mul_i64_19(i64 %a) {
				; LA32-LABEL: mul_i64_19:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 19
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_19:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 19
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 19
				ret i64 %b
				}

				define i64 @mul_i64_21(i64 %a) {
				; LA32-LABEL: mul_i64_21:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 21
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_21:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 21
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 21
				ret i64 %b
				}

				define i64 @mul_i64_25(i64 %a) {
				; LA32-LABEL: mul_i64_25:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 25
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_25:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 25
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 25
				ret i64 %b
				}

				define i64 @mul_i64_35(i64 %a) {
				; LA32-LABEL: mul_i64_35:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 35
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_35:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 35
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 35
				ret i64 %b
				}

				define i64 @mul_i64_37(i64 %a) {
				; LA32-LABEL: mul_i64_37:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 37
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_37:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 37
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 37
				ret i64 %b
				}

				define i64 @mul_i64_41(i64 %a) {
				; LA32-LABEL: mul_i64_41:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 41
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_41:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 41
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 41
				ret i64 %b
				}

				define i64 @mul_i64_49(i64 %a) {
				; LA32-LABEL: mul_i64_49:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 49
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_49:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 49
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 49
				ret i64 %b
				}

				define i64 @mul_i64_69(i64 %a) {
				; LA32-LABEL: mul_i64_69:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 69
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_69:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 69
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 69
				ret i64 %b
				}

				define i64 @mul_i64_73(i64 %a) {
				; LA32-LABEL: mul_i64_73:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 73
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_73:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 73
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 73
				ret i64 %b
				}

				define i64 @mul_i64_137(i64 %a) {
				; LA32-LABEL: mul_i64_137:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 137
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_137:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 137
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 137
				ret i64 %b
				}

				define i64 @mul_i64_145(i64 %a) {
				; LA32-LABEL: mul_i64_145:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 145
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_145:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 145
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 145
				ret i64 %b
				}

				define i64 @mul_i64_273(i64 %a) {
				; LA32-LABEL: mul_i64_273:
				; LA32: # %bb.0:
				; LA32-NEXT: ori $a2, $zero, 273
				; LA32-NEXT: mul.w $a1, $a1, $a2
				; LA32-NEXT: mulh.wu $a3, $a0, $a2
				; LA32-NEXT: add.w $a1, $a3, $a1
				; LA32-NEXT: mul.w $a0, $a0, $a2
				; LA32-NEXT: ret
				;
				; LA64-LABEL: mul_i64_273:
				; LA64: # %bb.0:
				; LA64-NEXT: ori $a1, $zero, 273
				; LA64-NEXT: mul.d $a0, $a0, $a1
				; LA64-NEXT: ret
				%b = mul i64 %a, 273
				ret i64 %b
				}

This is an archive of the discontinued LLVM Phabricator instance.

[LoongArch] Optimize multiplication with immediates
ClosedPublic

