Add the `ashr` instruction to the DAG post-dominated by `trunc`, allowing `TruncInstCombine` to reduce the bitwidth of expressions containing it. We must be shifting by less than the target bitwidth. Also, it is sufficient to require that all truncated bits of the value being shifted are sign bits (all zeros or all ones) and that one sign bit is left untruncated: https://alive2.llvm.org/ce/z/Ajo2__ Part of https://reviews.llvm.org/D107766
Diff Detail
- Repository
- rG LLVM Github Monorepo
I'm pretty sure we should count the number of sign bits here, much like how we count the number of leading zeros:
https://godbolt.org/z/vGsf664Gf => https://alive2.llvm.org/ce/z/gIc43E
Ok, I've extended both lshr and ashr to use countMinSignBits(), unifying both cases, but we have to call computeKnownBits() one more time here.
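For intuition, a minimal sign-bit count can be derived from a known-bits pair like so. The `KnownBits32` struct below is a hypothetical stand-in for `llvm::KnownBits` (whose real API provides `countMinSignBits()`); this is only a sketch of the idea, not the LLVM implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::KnownBits at width 32: Zero has a bit
// set where that bit is known 0, One where it is known 1.
struct KnownBits32 {
  uint32_t Zero = 0, One = 0;
};

// Conservative analog of KnownBits::countMinSignBits(): if the sign bit
// is known, count how many consecutive top bits are known to equal it;
// otherwise only the sign bit itself is guaranteed.
unsigned countMinSignBits(KnownBits32 K) {
  if (K.One & 0x80000000u) { // known negative: count known leading ones
    unsigned n = 0;
    for (int i = 31; i >= 0 && ((K.One >> i) & 1); --i)
      ++n;
    return n;
  }
  if (K.Zero & 0x80000000u) { // known non-negative: count known leading zeros
    unsigned n = 0;
    for (int i = 31; i >= 0 && ((K.Zero >> i) & 1); --i)
      ++n;
    return n;
  }
  return 1; // sign bit unknown: every value has at least one sign bit
}
```

For example, a value whose top 24 bits are known ones (`One = 0xFFFFFF00`) has at least 24 sign bits, which is enough to narrow a shift from i32 down to i9 or wider.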
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
299–307 |
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
295–298 | Ok, I'm going to split this | |
299–307 | I've intentionally merged these two cases, since ashr and lshr processing doesn't differ. What really matters is the positivity/negativity of their operand; these are symmetrical cases (see the updated summary). For instance, with your change we still have to replace ashr with lshr if the sign bits were zeros and keep ashr if they were ones. Why not do the same for lshr? | |
409–410 | If the sign bits were ones, we have to replace the original shift with ashr; lshr doesn't work (see the @lshr_negative_operand_but_short() test). |
This patch does not apply for me.
Please precommit the tests, and rebase the patch so that it applies to main.
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
299–307 | (note that there's a typo in my diff, it should be |
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
409–410 | But, this brings another question - should this assert that the sign bit is known? |
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
409–410 | Actually, let me backtrack this. |
Take a step back, leaving lshr untouched and counting sign bits for ashr only.
The motivation is that in the natural cases unsigned values are
zexted and lshred whereas signed values are sexted and ashred,
so it is natural to handle only these two cases. We miss cases like truncation of ashr(zext(*)),
but for AIC those are transformed by the preceding IC to lshr(zext(*)). Also remove unrelated tests.
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
299–307 | Ok, thanks, used countMinSignBits() for ashr. | |
409–410 | Ok, I've changed those lines. |
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
409–410 |
Sure, I've eventually done it that way: lshr is transformed as before, and the ashr transformation works for numbers treated as signed. |
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
411–413 | FWIW I think we can generalize this like that in the future. |
Update setting exactness
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
411–413 | Thanks, done. |
Sorry - I missed the earlier ping.
LGTM.
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
290–293 | if (I->isShift()) { |
llvm/lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp | ||
---|---|---|
290–293 | Thanks, done |