Details

Reviewers

dylanmckay
aykevl

Commits

rG25531a1d9657: [AVR] Optimize 8-bit logic left/right shifts

Diff Detail

Event Timeline

benshi001 created this revision. Oct 8 2020, 7:26 AM

Herald added a project: Restricted Project. Oct 8 2020, 7:26 AM

Herald added subscribers: llvm-commits, Jim, hiraditya.

benshi001 requested review of this revision. Oct 8 2020, 7:26 AM

The original AVR backend TD is also wrong in that it maps the IR bswap to AVR's swap instruction.

The IR bswap is a byte-level operation, so the smallest data type it can operate on is 16 bits wide.

But AVR's swap operates at the half-byte (nibble) level.
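
To make the mismatch concrete, here is a minimal C++ sketch (not part of the patch) of the two operations. The helper names `avr_swap` and `bswap16` are made up for illustration; they only model the semantics described above and assume a C++11 or later compiler.

```cpp
#include <cstdint>

// Models AVR `swap Rd`: exchanges the high and low nibbles of one
// 8-bit register. (Hypothetical helper name.)
constexpr std::uint8_t avr_swap(std::uint8_t v) {
  return static_cast<std::uint8_t>((v << 4) | (v >> 4));
}

// Models a byte swap on a 16-bit value: exchanges the two bytes.
// (Hypothetical helper name.)
constexpr std::uint16_t bswap16(std::uint16_t v) {
  return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// A nibble swap reorders half-bytes inside one byte ...
static_assert(avr_swap(0xA5) == 0x5A, "nibbles exchanged");
// ... while a byte swap reorders whole bytes and leaves each byte intact.
static_assert(bswap16(0x1234) == 0x3412, "bytes exchanged");
```

An 8-bit value has only one byte, so a byte swap has nothing to reorder there, whereas a nibble swap generally changes the value.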

Harbormaster completed remote builds in B74446: Diff 296967. Oct 8 2020, 8:33 AM

benshi001 updated this revision to Diff 298275. Oct 14 2020, 7:02 PM

benshi001 edited the summary of this revision.

> The original AVR backend TD is also wrong in that it maps the IR bswap to AVR's swap instruction.
>
> The IR bswap is a byte-level operation, so the smallest data type it can operate on is 16 bits wide.
>
> But AVR's swap operates at the half-byte (nibble) level.

I agree it looks wrong, but I can't get it to produce invalid code: https://godbolt.org/z/3YTnzx

Also, while this is an improvement I wonder whether there is a more systematic way to do this? This optimizes a very specific code pattern but it doesn't seem easy to extend it to other shifts and rotates, such as ((char)x) << 3 for example. The thing I have in mind works like this:

8-bit shifts are all implemented with similar code that uses shifts, swaps, ANDs, etc. to reach the correct shift amount.
16-bit shifts can be built on top of that, either by shifting the two bytes individually and ORing the results together, or, for larger (>= 8 bit) shifts, by swapping the bytes and using an all-zero or all-ones byte for the other.

Or perhaps there are better ways to do this.

If you play around with godbolt.org (https://godbolt.org/z/foW7qc) you can see that avr-gcc manages to produce very short code for nearly all shifts, apparently falling back to loops for hard cases. It even changes its strategy to inlining the entire shift (no loop) when using -O2.
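
The 16-bit scheme outlined in this comment could look roughly like the following C++ model. It is an illustration only, not code from this patch or from avr-gcc: it covers just the logical left shift (zero fill) for shift amounts 0 to 15, the names `Bytes16`, `shl16_via_bytes`, `split`, and `join` are hypothetical, and it assumes C++14 or later.

```cpp
#include <cstdint>

// A 16-bit value held in two 8-bit registers. (Hypothetical type.)
struct Bytes16 {
  std::uint8_t lo;
  std::uint8_t hi;
};

constexpr Bytes16 split(std::uint16_t v) {
  return {static_cast<std::uint8_t>(v & 0xff),
          static_cast<std::uint8_t>(v >> 8)};
}

constexpr std::uint16_t join(Bytes16 b) {
  return static_cast<std::uint16_t>((b.hi << 8) | b.lo);
}

// Logical left shift by n (0 <= n <= 15), built only from 8-bit pieces.
constexpr Bytes16 shl16_via_bytes(Bytes16 v, unsigned n) {
  if (n >= 8) {
    // Large shift: the low byte moves into the high position and the
    // vacated low byte is all zeros; the rest is an 8-bit shift.
    return {0, static_cast<std::uint8_t>(v.lo << (n - 8))};
  }
  // Small shift: shift both bytes individually, then OR the bits that
  // fall out of the low byte into the high byte.
  const auto carry = static_cast<std::uint8_t>(v.lo >> (8 - n));
  return {static_cast<std::uint8_t>(v.lo << n),
          static_cast<std::uint8_t>((v.hi << n) | carry)};
}

static_assert(join(shl16_via_bytes(split(0x12AB), 3)) ==
                  static_cast<std::uint16_t>(0x12AB << 3),
              "small shift matches a plain 16-bit shift");
static_assert(join(shl16_via_bytes(split(0x12AB), 11)) ==
                  static_cast<std::uint16_t>(0x12AB << 11),
              "large shift matches a plain 16-bit shift");
```

An arithmetic right shift would follow the same shape, with the fill byte being all ones for negative inputs instead of zero.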

aykevl mentioned this in D90092: [AVR] Optimize 16-bit int shift. Oct 31 2020, 2:17 PM

> Also, while this is an improvement I wonder whether there is a more systematic way to do this? This optimizes a very specific code pattern but it doesn't seem easy to extend it to other shifts and rotates, such as ((char)x) << 3 for example. The thing I have in mind works like this:

The optimizations for shifts are a bit complex, and I would like to do them in smaller, separate patches. That makes them easier to review.

> 8-bit shifts are all implemented with similar code that uses shifts, swaps, ANDs, etc. to reach the correct shift amount.

This is hard to do for an 8-bit shift with shiftAmount = 3.

> 16-bit shifts can be built on top of that, either by shifting the two bytes individually and ORing the results together, or, for larger (>= 8 bit) shifts, by swapping the bytes and using an all-zero or all-ones byte for the other.

In the current patch, I would like to optimize only 8-bit shifts and leave 16-bit shifts to other patches.

> Or perhaps there are better ways to do this.
>
> If you play around with godbolt.org (https://godbolt.org/z/foW7qc) you can see that avr-gcc manages to produce very short code for nearly all shifts, apparently falling back to loops for hard cases. It even changes its strategy to inlining the entire shift (no loop) when using -O2.

For 8-bit shifts, my patch performs the same as avr-gcc, except for shiftAmount = 7. I will improve it soon.

benshi001 updated this revision to Diff 302242. Nov 2 2020, 4:38 AM

benshi001 retitled this revision from [AVR] Optimize logic left/right shift to [AVR] Optimize 8-bit logic left/right shifts.

benshi001 edited the summary of this revision.

I have uploaded a new revision with more tests.

Now llvm-avr generates the same asm as avr-gcc for logical left/right shifts when ShiftAmount is 1 through 6; ShiftAmount = 7 is the only exception.

This is shown in the test llvm/test/CodeGen/AVR/shift.ll.

Shall we commit this first? Then I will handle the specific case ShiftAmount = 7 in another patch.
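
For readers following along, the lowering in this revision can be modeled in plain C++ roughly as follows. This is a sketch of the idea, not the backend code: `swap_nibbles`, `lsl8_lowered`, `lsr8_lowered`, and `matches_plain_shifts` are hypothetical names, each loop iteration stands in for one `lsl` or `lsr` instruction, and the code assumes C++14 or later.

```cpp
#include <cstdint>

// Models AVR `swap Rd`: exchange bits 7:4 with bits 3:0.
constexpr std::uint8_t swap_nibbles(std::uint8_t v) {
  return static_cast<std::uint8_t>((v << 4) | (v >> 4));
}

// Models a constant 8-bit logical left shift by n (0 <= n <= 7).
constexpr std::uint8_t lsl8_lowered(std::uint8_t v, unsigned n) {
  if (n >= 4 && n < 7) {
    // swap + andi 0xf0 is equivalent to four single-bit left shifts.
    v = static_cast<std::uint8_t>(swap_nibbles(v) & 0xf0);
    n -= 4;
  }
  while (n--)  // remaining single-bit shifts, one `lsl` each
    v = static_cast<std::uint8_t>(v << 1);
  return v;
}

// Models a constant 8-bit logical right shift by n (0 <= n <= 7).
constexpr std::uint8_t lsr8_lowered(std::uint8_t v, unsigned n) {
  if (n >= 4 && n < 7) {
    // swap + andi 0x0f is equivalent to four single-bit right shifts.
    v = static_cast<std::uint8_t>(swap_nibbles(v) & 0x0f);
    n -= 4;
  }
  while (n--)  // remaining single-bit shifts, one `lsr` each
    v = static_cast<std::uint8_t>(v >> 1);
  return v;
}

// Exhaustively compare the model against plain shifts for every
// 8-bit input and every shift amount from 0 to 7.
constexpr bool matches_plain_shifts() {
  for (unsigned v = 0; v < 256; ++v) {
    for (unsigned n = 0; n < 8; ++n) {
      const auto b = static_cast<std::uint8_t>(v);
      if (lsl8_lowered(b, n) != static_cast<std::uint8_t>(v << n)) return false;
      if (lsr8_lowered(b, n) != static_cast<std::uint8_t>(v >> n)) return false;
    }
  }
  return true;
}

static_assert(matches_plain_shifts(), "model agrees with plain shifts");
```

For shift amounts 4, 5, and 6 this replaces four single-bit shifts with two instructions, which is where the shorter sequences in shift.ll come from; in this model a shift by 7 still takes the seven-instruction path.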

benshi001 mentioned this in D90678: [AVR] Optimize 8-bit int shift. Nov 3 2020, 5:17 AM

This comment was removed by benshi001.

Now llvm-avr generates the same asm for 8-bit shifts as AVR-GCC does, when ShiftAmount = 1,2,3,4,5,6, 7.

dsprenkels added a subscriber: dsprenkels. Nov 7 2020, 2:12 AM

Nice patch, cheers.

Just a small comment around a latent FIXME, then it's good to approve.

llvm/lib/Target/AVR/AVRISelLowering.cpp
353	> Now llvm-avr generates the same asm for 8-bit shifts as AVR-GCC does, when ShiftAmount = 1,2,3,4,5,6, 7.

This suggests that either this `TODO` comment is now unnecessary, or there is another optimization that could be implemented specifically for 7-bit shifts that AVR-GCC does not implement. Remove the TODO comment, or, if you feel it is important, add a sentence describing the TODO.

This revision now requires changes to proceed. Nov 18 2020, 3:15 AM

benshi001 updated this revision to Diff 306067. Nov 18 2020, 5:23 AM

benshi001 marked an inline comment as done.

benshi001 added inline comments.

llvm/lib/Target/AVR/AVRISelLowering.cpp
353	The TODO is removed.

ping ...

ping

dylanmckay accepted this revision. Jan 23 2021, 1:03 AM

This revision is now accepted and ready to land. Jan 23 2021, 1:03 AM

Closed by commit rG25531a1d9657: [AVR] Optimize 8-bit logic left/right shifts (authored by benshi001). Jan 23 2021, 7:54 AM

This revision was automatically updated to reflect the committed changes.

benshi001 added a commit: rG25531a1d9657: [AVR] Optimize 8-bit logic left/right shifts.

aykevl mentioned this in D96506: [AVR] Optimize 16-bit shifts. Feb 24 2021, 9:11 AM

Diff 302242

llvm/lib/Target/AVR/AVRISelLowering.h

@@ ... @@ enum NodeType {
   /// or TEST instruction.
   BRCOND,
   /// Compare instruction.
   CMP,
   /// Compare with carry instruction.
   CMPC,
   /// Test for zero or minus instruction.
   TST,
+  /// Swap Rd[7:4] <-> Rd[3:0].
+  SWAP,
   /// Operand 0 and operand 1 are selection variable, operand 2
   /// is condition code and operand 3 is flag operand.
   SELECT_CC
 };

 } // end of namespace AVRISD

 class AVRSubtarget;

llvm/lib/Target/AVR/AVRISelLowering.cpp

@@ ... @@ case ISD::SRL:
     break;
   case ISD::SHL:
     Opc8 = AVRISD::LSL;
     break;
   default:
     llvm_unreachable("Invalid shift opcode");
   }

+  // Optimize int8 shifts.
+  if (VT.getSizeInBits() == 8) {
+    if (Op.getOpcode() == ISD::SHL && 4 <= ShiftAmount && ShiftAmount < 7) {
+      // Optimize LSL when 4 <= ShiftAmount <= 6.
+      Victim = DAG.getNode(AVRISD::SWAP, dl, VT, Victim);
+      Victim =
+          DAG.getNode(ISD::AND, dl, VT, Victim, DAG.getConstant(0xf0, dl, VT));
+      ShiftAmount -= 4;
+    } else if (Op.getOpcode() == ISD::SRL && 4 <= ShiftAmount &&
+               ShiftAmount < 7) {
+      // Optimize LSR when 4 <= ShiftAmount <= 6.
+      Victim = DAG.getNode(AVRISD::SWAP, dl, VT, Victim);
+      Victim =
+          DAG.getNode(ISD::AND, dl, VT, Victim, DAG.getConstant(0x0f, dl, VT));
+      ShiftAmount -= 4;
+      // TODO
+      // } else if (Op.getOpcode() == ISD::SHL && ShiftAmount == 7) {
+      // } else if (Op.getOpcode() == ISD::SRL && ShiftAmount == 7) {
+    }
+  }
+
   while (ShiftAmount--) {
     Victim = DAG.getNode(Opc8, dl, VT, Victim);
   }

   return Victim;
 }

 SDValue AVRTargetLowering::LowerDivRem(SDValue Op, SelectionDAG &DAG) const {

llvm/lib/Target/AVR/AVRInstrInfo.td

@@ ... @@

 // Pseudo shift nodes for non-constant shift amounts.
 def AVRlslLoop : SDNode<"AVRISD::LSLLOOP", SDTIntShiftOp>;
 def AVRlsrLoop : SDNode<"AVRISD::LSRLOOP", SDTIntShiftOp>;
 def AVRrolLoop : SDNode<"AVRISD::ROLLOOP", SDTIntShiftOp>;
 def AVRrorLoop : SDNode<"AVRISD::RORLOOP", SDTIntShiftOp>;
 def AVRasrLoop : SDNode<"AVRISD::ASRLOOP", SDTIntShiftOp>;

+// SWAP node.
+def AVRSwap : SDNode<"AVRISD::SWAP", SDTIntUnaryOp>;
+
 //===----------------------------------------------------------------------===//
 // AVR Operands, Complex Patterns and Transformations Definitions.
 //===----------------------------------------------------------------------===//

 def imm8_neg_XFORM : SDNodeXForm<imm,
 [{
   return CurDAG->getTargetConstant(-N->getAPIntValue(), SDLoc(N), MVT::i8);
 }]>;
@@ ... @@
 // SWAP Rd
 // Swaps the high and low nibbles in a register.
 let Constraints = "$src = $rd" in
 def SWAPRd : FRd<0b1001,
                  0b0100010,
                  (outs GPR8:$rd),
                  (ins GPR8:$src),
                  "swap\t$rd",
-                 [(set i8:$rd, (bswap i8:$src))]>;
+                 [(set i8:$rd, (AVRSwap i8:$src))]>;

 // IO register bit set/clear operations.
 //:TODO: add patterns when popcount(imm)==2 to be expanded with 2 sbi/cbi
 // instead of in+ori+out which requires one more instr.
 def SBIAb : FIOBIT<0b10,
                    (outs),
                    (ins imm_port5:$addr, i8imm:$bit),
                    "sbi\t$addr, $bit",

llvm/test/CodeGen/AVR/ctlz.ll

 ; RUN: llc < %s -march=avr | FileCheck %s

 define i8 @count_leading_zeros(i8) unnamed_addr {
 entry-block:
   %1 = tail call i8 @llvm.ctlz.i8(i8 %0)
   ret i8 %1
 }

 declare i8 @llvm.ctlz.i8(i8)

 ; CHECK-LABEL: count_leading_zeros:
 ; CHECK: cpi [[RESULT:r[0-9]+]], 0
-; CHECK: brne .LBB0_1
-; CHECK: rjmp .LBB0_2
+; CHECK: breq .LBB0_2
 ; CHECK: mov [[SCRATCH:r[0-9]+]], {{.*}}[[RESULT]]
 ; CHECK: lsr {{.*}}[[SCRATCH]]
 ; CHECK: or {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
 ; CHECK: mov {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: lsr {{.*}}[[RESULT]]
 ; CHECK: lsr {{.*}}[[RESULT]]
 ; CHECK: or {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: mov {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
+; CHECK: swap {{.*}}[[SCRATCH]]
+; CHECK: andi {{.*}}[[SCRATCH]], 15
 ; CHECK: or {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
 ; CHECK: com {{.*}}[[SCRATCH]]
 ; CHECK: mov {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: lsr {{.*}}[[RESULT]]
 ; CHECK: andi {{.*}}[[RESULT]], 85
 ; CHECK: sub {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
 ; CHECK: mov {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: andi {{.*}}[[RESULT]], 51
 ; CHECK: lsr {{.*}}[[SCRATCH]]
 ; CHECK: lsr {{.*}}[[SCRATCH]]
 ; CHECK: andi {{.*}}[[SCRATCH]], 51
 ; CHECK: add {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
 ; CHECK: mov {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[RESULT]]
-; CHECK: lsr {{.*}}[[RESULT]]
-; CHECK: lsr {{.*}}[[RESULT]]
-; CHECK: lsr {{.*}}[[RESULT]]
+; CHECK: swap {{.*}}[[RESULT]]
 ; CHECK: add {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: andi {{.*}}[[RESULT]], 15
 ; CHECK: ret
 ; CHECK: LBB0_2:
 ; CHECK: ldi {{.*}}[[RESULT]], 8
 ; CHECK: ret

llvm/test/CodeGen/AVR/ctpop.ll

@@ ... @@
 ; CHECK: sub {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: mov {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
 ; CHECK: andi {{.*}}[[SCRATCH]], 51
 ; CHECK: lsr {{.*}}[[RESULT]]
 ; CHECK: lsr {{.*}}[[RESULT]]
 ; CHECK: andi {{.*}}[[RESULT]], 51
 ; CHECK: add {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: mov {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
+; CHECK: swap {{.*}}[[SCRATCH]]
 ; CHECK: add {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
 ; CHECK: andi {{.*}}[[SCRATCH]], 15
 ; CHECK: mov {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: ret

llvm/test/CodeGen/AVR/cttz.ll

@@ ... @@
 ; CHECK: sub {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: mov {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
 ; CHECK: andi {{.*}}[[SCRATCH]], 51
 ; CHECK: lsr {{.*}}[[RESULT]]
 ; CHECK: lsr {{.*}}[[RESULT]]
 ; CHECK: andi {{.*}}[[RESULT]], 51
 ; CHECK: add {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: mov {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
-; CHECK: lsr {{.*}}[[SCRATCH]]
+; CHECK: swap {{.*}}[[SCRATCH]]
 ; CHECK: add {{.*}}[[SCRATCH]], {{.*}}[[RESULT]]
 ; CHECK: andi {{.*}}[[SCRATCH]], 15
 ; CHECK: mov {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: ret
 ; CHECK: [[END_BB]]:
 ; CHECK: ldi {{.*}}[[SCRATCH]], 8
 ; CHECK: mov {{.*}}[[RESULT]], {{.*}}[[SCRATCH]]
 ; CHECK: ret

llvm/test/CodeGen/AVR/shift.ll

 ; RUN: llc < %s -march=avr | FileCheck %s

 ; CHECK-LABEL: shift_i64_i64
 define i64 @shift_i64_i64(i64 %a, i64 %b) {
 ; CHECK: call __ashldi3
   %result = shl i64 %a, %b
   ret i64 %result
 }
+
+define i8 @lsl_i8_1(i8 %a) {
+; CHECK-LABEL: lsl_i8_1:
+; CHECK: lsl r24
+  %res = shl i8 %a, 1
+  ret i8 %res
+}
+
+define i8 @lsl_i8_2(i8 %a) {
+; CHECK-LABEL: lsl_i8_2:
+; CHECK: lsl r24
+; CHECK-NEXT: lsl r24
+  %res = shl i8 %a, 2
+  ret i8 %res
+}
+
+define i8 @lsl_i8_3(i8 %a) {
+; CHECK-LABEL: lsl_i8_3:
+; CHECK: lsl r24
+; CHECK-NEXT: lsl r24
+; CHECK-NEXT: lsl r24
+  %res = shl i8 %a, 3
+  ret i8 %res
+}
+
+define i8 @lsl_i8_4(i8 %a) {
+; CHECK-LABEL: lsl_i8_4:
+; CHECK: swap r24
+; CHECK-NEXT: andi r24, -16
+  %res = shl i8 %a, 4
+  ret i8 %res
+}
+
+define i8 @lsl_i8_5(i8 %a) {
+; CHECK-LABEL: lsl_i8_5:
+; CHECK: swap r24
+; CHECK-NEXT: andi r24, -16
+; CHECK-NEXT: lsl r24
+  %res = shl i8 %a, 5
+  ret i8 %res
+}
+
+define i8 @lsl_i8_6(i8 %a) {
+; CHECK-LABEL: lsl_i8_6:
+; CHECK: swap r24
+; CHECK-NEXT: andi r24, -16
+; CHECK-NEXT: lsl r24
+; CHECK-NEXT: lsl r24
+  %res = shl i8 %a, 6
+  ret i8 %res
+}
+
+define i8 @lsr_i8_1(i8 %a) {
+; CHECK-LABEL: lsr_i8_1:
+; CHECK: lsr r24
+  %res = lshr i8 %a, 1
+  ret i8 %res
+}
+
+define i8 @lsr_i8_2(i8 %a) {
+; CHECK-LABEL: lsr_i8_2:
+; CHECK: lsr r24
+; CHECK-NEXT: lsr r24
+  %res = lshr i8 %a, 2
+  ret i8 %res
+}
+
+define i8 @lsr_i8_3(i8 %a) {
+; CHECK-LABEL: lsr_i8_3:
+; CHECK: lsr r24
+; CHECK-NEXT: lsr r24
+; CHECK-NEXT: lsr r24
+  %res = lshr i8 %a, 3
+  ret i8 %res
+}
+
+define i8 @lsr_i8_4(i8 %a) {
+; CHECK-LABEL: lsr_i8_4:
+; CHECK: swap r24
+; CHECK-NEXT: andi r24, 15
+  %res = lshr i8 %a, 4
+  ret i8 %res
+}
+
+define i8 @lsr_i8_5(i8 %a) {
+; CHECK-LABEL: lsr_i8_5:
+; CHECK: swap r24
+; CHECK-NEXT: andi r24, 15
+; CHECK-NEXT: lsr r24
+  %res = lshr i8 %a, 5
+  ret i8 %res
+}
+
+define i8 @lsr_i8_6(i8 %a) {
+; CHECK-LABEL: lsr_i8_6:
+; CHECK: swap r24
+; CHECK-NEXT: andi r24, 15
+; CHECK-NEXT: lsr r24
+; CHECK-NEXT: lsr r24
+  %res = lshr i8 %a, 6
+  ret i8 %res
+}

This is an archive of the discontinued LLVM Phabricator instance.

[AVR] Optimize 8-bit logic left/right shifts
Closed, Public
