
Details

Reviewers

spatel
RKSimon
efriedma
ZaMaZaN4iK
bogner

Commits

rGa686c60c45d5: [DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673)
rL367419: [DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673)
rGc75cdd056f69: [DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)
rL367288: [DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)

Summary

While the -div-rem-pairs pass can decompose the rem of a div+rem pair when the target
does not support a combined div-rem operation, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine, since neither of those has access to TTI.
So it makes the most sense to teach -div-rem-pairs about it.

If we matched a rem in expanded form, we know we will be able to place the div and rem
next to each other, so we won't regress the situation.
Also, we shouldn't decompose the rem if what we matched was the already-decomposed form.
Otherwise, this is surprisingly straightforward.
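
As a quick sanity check of the identity the fold relies on, here is a minimal standalone C++ sketch. It is not part of the patch, and the function names are made up for illustration. It assumes a non-zero divisor and, for the signed case, excludes INT32_MIN / -1, where the division itself is undefined.

```cpp
#include <cassert>
#include <cstdint>

// Expanded form of the remainder, i.e. what the decomposition produces:
//   X - ((X / Y) * Y)
// Assumption: Y != 0, and (X, Y) != (INT32_MIN, -1) in the signed case.
int32_t expandedSRem(int32_t X, int32_t Y) { return X - ((X / Y) * Y); }
uint32_t expandedURem(uint32_t X, uint32_t Y) { return X - ((X / Y) * Y); }

// True if the expanded form agrees with the built-in remainder operator.
bool sremAgrees(int32_t X, int32_t Y) { return expandedSRem(X, Y) == X % Y; }
bool uremAgrees(uint32_t X, uint32_t Y) { return expandedURem(X, Y) == X % Y; }
```

Both the division and the remainder truncate toward zero in C++11 and later, which matches sdiv/srem, so the identity holds for negative operands as well.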

https://bugs.llvm.org/show_bug.cgi?id=42673

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lebedev.ri created this revision. · Jul 25 2019, 1:37 PM

Herald added a subscriber: hiraditya. · Jul 25 2019, 1:37 PM

LGTM - but I'd prefer that someone with a better grasp of C++ / PointerIntPair also have a look to make sure I'm not missing any implementation subtleties.

One question: should we keep track of the dead multiply instructions and remove them in this pass (for example, using RecursivelyDeleteTriviallyDeadInstructions())?

In D65298#1602367, @spatel wrote:

LGTM - but I'd prefer that someone with a better grasp of C++ / PointerIntPair also have a look to make sure I'm not missing any implementation subtleties.

Thank you. I personally don't think there are any subtleties there.

One question: should we keep track of the dead multiply instructions and remove them in this pass (for example, using RecursivelyDeleteTriviallyDeadInstructions())?

I have thought about that, but I don't quite see how to fit it in nicely here.
The even bigger issue is that we could now rewrite that ((x / y) * y) as x - (x % y),
thus replacing the mul with a sub, but that doesn't exactly fit here either.
So I'd personally prefer to leave it for later passes to deal with
(yes, I have seen the comment in the git log about running after instcombine,
so I guess that leaves dagcombine).
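
A minimal standalone C++ sketch of that leftover-multiply rewrite, for illustration only. roundDownViaMul and roundDownViaSub are hypothetical names, and the divisor is assumed to be non-zero.

```cpp
#include <cassert>
#include <cstdint>

// The value left behind after recomposition: ((X / Y) * Y).
int32_t roundDownViaMul(int32_t X, int32_t Y) { return (X / Y) * Y; }

// The same value expressed through the recomposed remainder: X - (X % Y).
// Once X % Y exists anyway, this trades the multiply for a subtract.
int32_t roundDownViaSub(int32_t X, int32_t Y) { return X - (X % Y); }
```

Whether the sub form is actually cheaper depends on the target, which is one reason to leave the decision to a later pass.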

LGTM

This revision is now accepted and ready to land. · Jul 29 2019, 1:40 PM

@spatel @bogner thank you for the review!

Closed by commit rL367288: [DivRemPairs] Handling for expanded-form rem - recomposition (PR42673) (authored by lebedevri). · Jul 30 2019, 12:10 AM

This revision was automatically updated to reflect the committed changes.

And reverted.
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/25150/steps/test-suite/logs/test.log

Only PHI nodes may reference their own value!
  %sub33 = srem i32 %sub33, %ranks_in_i

Not sure what's going on yet.

This revision is now accepted and ready to land. · Jul 30 2019, 12:44 AM

Reduced:

$ cat zz.ll 
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i32 @d(i32 %X, i32 %Y, i32 %Z) {
bb:
  %t0 = mul nsw i32 %Z, %Y
  %t1 = sdiv i32 %X, %t0
  %t2 = mul nsw i32 %t0, %t1
  %t3 = sub nsw i32 %X, %t2
  %t4 = sdiv i32 %t3, %Y
  %t5 = mul nsw i32 %t4, %Y
  %t6 = sub nsw i32 %t3, %t5
  ret i32 %t6
}
$ /builddirs/llvm-project/build-Clang8-unknown/bin/opt -div-rem-pairs zz.ll -o - -S 
Only PHI nodes may reference their own value!
  %t6 = srem i32 %t6, %Y
in function d
LLVM ERROR: Broken function found, compilation aborted!
$ /builddirs/llvm-project/build-Clang8-unknown/bin/opt -instcombine zz.ll -o - -S 
; ModuleID = 'zz.ll'
source_filename = "zz.ll"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i32 @d(i32 %X, i32 %Y, i32 %Z) {
bb:
  %t0 = mul nsw i32 %Z, %Y
  %0 = srem i32 %X, %t0
  %1 = srem i32 %0, %Y
  ret i32 %1
}

So yeah, that obviously doesn't work.

We do RAUW but never update the maps we created, so if one of the values we RAUW'd
happens to be a dividend or divisor (and so is part of a key in those maps), we get UB.

While this patch exposed the issue, the same bug already exists in trunk: https://bugs.llvm.org/show_bug.cgi?id=42823

It is trivial to catch this via PoisoningVH<>/AssertingVH<>,
but the fix is not obvious - they are used as a key in a map,
so just using TrackingVH/WeakTrackingVH won't work.

There's ValueMap, but we use DivRemMapKey as key, not a single Value.
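
To make the pitfall concrete, here is a toy C++ sketch with no LLVM dependency. Node, replaceAllUses and Entry are hypothetical stand-ins, not LLVM APIs. It shows a map keyed by operand pointers going stale after a replacement, and a worklist-style entry that re-reads operands from the div on demand, which is roughly the indirection DivRemPairWorklistEntry provides. In the real pass the replaced instruction is also erased, so the stale key is a dangling pointer rather than merely an outdated one.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR instruction with two operands.
struct Node {
  Node *Op0 = nullptr;
  Node *Op1 = nullptr;
};

// Toy "replace all uses with": repoint every operand that refers to Old.
void replaceAllUses(std::vector<Node *> &All, Node *Old, Node *New) {
  for (Node *N : All) {
    if (N->Op0 == Old) N->Op0 = New;
    if (N->Op1 == Old) N->Op1 = New;
  }
}

using Key = std::pair<Node *, Node *>; // (dividend, divisor)

// Does a map keyed by operand pointers still find Div under its current
// operands after its dividend has been replaced? (It does not.)
bool mapKeySurvivesReplacement() {
  Node Y, Old, New, Div;
  Div.Op0 = &Old; // Div = Old / Y
  Div.Op1 = &Y;
  std::vector<Node *> All = {&Y, &Old, &New, &Div};
  std::map<Key, Node *> DivMap;
  DivMap[Key(Div.Op0, Div.Op1)] = &Div;
  replaceAllUses(All, &Old, &New);
  return DivMap.count(Key(Div.Op0, Div.Op1)) != 0;
}

// Worklist-style entry: remember only the div, read operands lazily.
struct Entry {
  Node *Div;
  Node *dividend() const { return Div->Op0; }
};

// Does the worklist entry observe the replaced dividend? (It does.)
bool worklistSeesReplacement() {
  Node Y, Old, New, Div;
  Div.Op0 = &Old;
  Div.Op1 = &Y;
  std::vector<Node *> All = {&Y, &Old, &New, &Div};
  Entry E{&Div};
  replaceAllUses(All, &Old, &New);
  return E.dividend() == &New;
}
```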

Also, this pass assumes that there is at most one div instruction and at most one rem instruction with a given pair of operands. When that does not hold, only one of the pairs is processed:

$ cat zz2.ll 
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

define void @d(i32 %X, i32 %Y, i32 %Z, i1 %c, i32* %dst00, i32* %dst01, i32* %dst10, i32* %dst11) {
bb:
  %t0 = mul nsw i32 %Z, %Y
  br i1 %c, label %bb1, label %bb2
bb1:
  %t1 = sdiv i32 %X, %t0
  %t3.recomposed = srem i32 %X, %t0
  store i32 %t1, i32* %dst00
  store i32 %t3.recomposed, i32* %dst01
  br label %end
bb2:
  %t12 = sdiv i32 %X, %t0
  %t32.recomposed = srem i32 %X, %t0
  store i32 %t12, i32* %dst10
  store i32 %t32.recomposed, i32* %dst11
  br label %end
end:
  ret void
}
$ /builddirs/llvm-project/build-Clang8-unknown/bin/opt -div-rem-pairs zz2.ll -o - -S 
; ModuleID = 'zz2.ll'
source_filename = "zz2.ll"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

define void @d(i32 %X, i32 %Y, i32 %Z, i1 %c, i32* %dst00, i32* %dst01, i32* %dst10, i32* %dst11) {
bb:
  %t0 = mul nsw i32 %Z, %Y
  br i1 %c, label %bb1, label %bb2

bb1:                                              ; preds = %bb
  %t1 = sdiv i32 %X, %t0
  %t3.recomposed = srem i32 %X, %t0
  store i32 %t1, i32* %dst00
  store i32 %t3.recomposed, i32* %dst01
  br label %end

bb2:                                              ; preds = %bb
  %t12 = sdiv i32 %X, %t0
  %0 = mul i32 %t12, %t0
  %1 = sub i32 %X, %0
  store i32 %t12, i32* %dst10
  store i32 %1, i32* %dst11
  br label %end

end:                                              ; preds = %bb2, %bb1
  ret void
}
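
One plausible reading of the output above is that the maps are filled with plain Map[Key] = &I assignments, so a later instruction with the same key replaces the earlier one. The sketch below is a toy model of that behaviour, not the pass itself: the Key alias mimics DivRemMapKey, and value names stand in for Value pointers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key mirroring DivRemMapKey: (signed?, dividend, divisor).
using Key = std::tuple<bool, std::string, std::string>;

// Record the two sdivs from zz2.ll in program order. Both have the same
// key, so the second assignment overwrites the first.
std::map<Key, std::string> recordDivs() {
  std::map<Key, std::string> DivMap;
  DivMap[Key(true, "%X", "%t0")] = "%t1";  // sdiv in bb1
  DivMap[Key(true, "%X", "%t0")] = "%t12"; // sdiv in bb2, same operands
  return DivMap;
}
```

Under this model only the bb2 pair remains in the maps, which is consistent with bb1 being left untouched in the output.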

@spatel PTAL https://reviews.llvm.org/D65451

Diffusion mentioned this in rL367325: [DivRemPairs] Add srem-of-srem tests (PR42823, D65298, D65451). · Jul 30 2019, 8:46 AM

lebedev.ri mentioned this in rG5e0adce40f34: [DivRemPairs] Add srem-of-srem tests (PR42823, D65298, D65451). · Jul 30 2019, 8:46 AM

Rebased on top of D65451, simplified code.

This revision is now accepted and ready to land. · Jul 30 2019, 11:07 AM

lebedev.ri added a parent revision: D65451: [DivRemPairs] Avoid RAUW pitfalls (PR42823). · Jul 30 2019, 11:07 AM

NFC, tidy up the diff a bit more.

Closed by commit rL367419: [DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673) (authored by lebedevri). · Jul 31 2019, 5:08 AM

This revision was automatically updated to reflect the committed changes.

Diffusion mentioned this in rL367417: [DivRemPairs] Avoid RAUW pitfalls (PR42823).

lebedev.ri mentioned this in rG5f616901f579: [DivRemPairs] Avoid RAUW pitfalls (PR42823). · Jul 31 2019, 5:10 AM

hans mentioned this in rL367531: Merging r367417:. · Aug 1 2019, 2:10 AM

hansw mentioned this in rGf08bb47ab4f6: Merging r367417: --------------------------------------------------------------…. · Aug 1 2019, 2:15 AM

Diff 212388

llvm/include/llvm/Transforms/Utils/BypassSlowDivision.h

	Show All 26 Lines
	class BasicBlock;			class BasicBlock;
	class Value;			class Value;

	struct DivRemMapKey {			struct DivRemMapKey {
	bool SignedOp;			bool SignedOp;
	AssertingVH<Value> Dividend;			AssertingVH<Value> Dividend;
	AssertingVH<Value> Divisor;			AssertingVH<Value> Divisor;

				DivRemMapKey() = default;

	DivRemMapKey(bool InSignedOp, Value *InDividend, Value *InDivisor)			DivRemMapKey(bool InSignedOp, Value *InDividend, Value *InDivisor)
	: SignedOp(InSignedOp), Dividend(InDividend), Divisor(InDivisor) {}			: SignedOp(InSignedOp), Dividend(InDividend), Divisor(InDivisor) {}
	};			};

	template <> struct DenseMapInfo<DivRemMapKey> {			template <> struct DenseMapInfo<DivRemMapKey> {
	static bool isEqual(const DivRemMapKey &Val1, const DivRemMapKey &Val2) {			static bool isEqual(const DivRemMapKey &Val1, const DivRemMapKey &Val2) {
	return Val1.SignedOp == Val2.SignedOp && Val1.Dividend == Val2.Dividend &&			return Val1.SignedOp == Val2.SignedOp && Val1.Dividend == Val2.Dividend &&
	Val1.Divisor == Val2.Divisor;			Val1.Divisor == Val2.Divisor;
	Show All 30 Lines

llvm/lib/Transforms/Scalar/DivRemPairs.cpp

//===- DivRemPairs.cpp - Hoist/decompose division and remainder -*- C++ -*-===//		//===- DivRemPairs.cpp - Hoist/[dr]ecompose division and remainder --------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This pass hoists and/or decomposes integer division and remainder		// This pass hoists and/or decomposes/recomposes integer division and remainder
// instructions to enable CFG improvements and better codegen.		// instructions to enable CFG improvements and better codegen.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Scalar/DivRemPairs.h"		#include "llvm/Transforms/Scalar/DivRemPairs.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/GlobalsModRef.h"		#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/TargetTransformInfo.h"		#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
		#include "llvm/IR/PatternMatch.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Support/DebugCounter.h"		#include "llvm/Support/DebugCounter.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Utils/BypassSlowDivision.h"		#include "llvm/Transforms/Utils/BypassSlowDivision.h"

using namespace llvm;		using namespace llvm;
		using namespace llvm::PatternMatch;

#define DEBUG_TYPE "div-rem-pairs"		#define DEBUG_TYPE "div-rem-pairs"
STATISTIC(NumPairs, "Number of div/rem pairs");		STATISTIC(NumPairs, "Number of div/rem pairs");
		STATISTIC(NumRecomposed, "Number of instructions recomposed");
STATISTIC(NumHoisted, "Number of instructions hoisted");		STATISTIC(NumHoisted, "Number of instructions hoisted");
STATISTIC(NumDecomposed, "Number of instructions decomposed");		STATISTIC(NumDecomposed, "Number of instructions decomposed");
DEBUG_COUNTER(DRPCounter, "div-rem-pairs-transform",		DEBUG_COUNTER(DRPCounter, "div-rem-pairs-transform",
"Controls transformations in div-rem-pairs pass");		"Controls transformations in div-rem-pairs pass");

		namespace {
		struct ExpandedMatch {
		DivRemMapKey Key;
		Instruction *Value;
		};
		} // namespace

		/// See if we can match: (which is the form we expand into)
		/// X - ((X ?/ Y) * Y)
		/// which is equivalent to:
		/// X ?% Y
		static llvm::Optional<ExpandedMatch> matchExpandedRem(Instruction &I) {
		Value *Dividend, *XroundedDownToMultipleOfY;
		if (!match(&I, m_Sub(m_Value(Dividend), m_Value(XroundedDownToMultipleOfY))))
		return llvm::None;

		Value *Divisor;
		Instruction *Div;
		// Look for ((X / Y) * Y)
		if (!match(
		XroundedDownToMultipleOfY,
		m_c_Mul(m_CombineAnd(m_IDiv(m_Specific(Dividend), m_Value(Divisor)),
		m_Instruction(Div)),
		m_Deferred(Divisor))))
		return llvm::None;

		ExpandedMatch M;
		M.Key.SignedOp = Div->getOpcode() == Instruction::SDiv;
		M.Key.Dividend = Dividend;
		M.Key.Divisor = Divisor;
		M.Value = &I;
		return M;
		}

/// A thin wrapper to store two values that we matched as div-rem pair.		/// A thin wrapper to store two values that we matched as div-rem pair.
/// We want this extra indirection to avoid dealing with RAUW'ing the map keys.		/// We want this extra indirection to avoid dealing with RAUW'ing the map keys.
struct DivRemPairWorklistEntry {		struct DivRemPairWorklistEntry {
/// The actual udiv/sdiv instruction. Source of truth.		/// The actual udiv/sdiv instruction. Source of truth.
AssertingVH<Instruction> DivInst;		AssertingVH<Instruction> DivInst;

/// The instruction that we have matched as a remainder instruction.		/// The instruction that we have matched as a remainder instruction.
/// Should only be used as Value, don't introspect it.		/// Should only be used as Value, don't introspect it.
Show All 13 Lines	struct DivRemPairWorklistEntry {
Type *getType() const { return DivInst->getType(); }		Type *getType() const { return DivInst->getType(); }

/// Is this pair signed or unsigned?		/// Is this pair signed or unsigned?
bool isSigned() const { return DivInst->getOpcode() == Instruction::SDiv; }		bool isSigned() const { return DivInst->getOpcode() == Instruction::SDiv; }

/// In this pair, what are the divident and divisor?		/// In this pair, what are the divident and divisor?
Value *getDividend() const { return DivInst->getOperand(0); }		Value *getDividend() const { return DivInst->getOperand(0); }
Value *getDivisor() const { return DivInst->getOperand(1); }		Value *getDivisor() const { return DivInst->getOperand(1); }

		bool isRemExpanded() const {
		switch (RemInst->getOpcode()) {
		case Instruction::SRem:
		case Instruction::URem:
		return false; // single 'rem' instruction - unexpanded form.
		default:
		return true; // expanded form.
		}
		}
};		};
using DivRemWorklistTy = SmallVector<DivRemPairWorklistEntry, 4>;		using DivRemWorklistTy = SmallVector<DivRemPairWorklistEntry, 4>;

/// Find matching pairs of integer div/rem ops (they have the same numerator,		/// Find matching pairs of integer div/rem ops (they have the same numerator,
/// denominator, and signedness). Place those pairs into a worklist for further		/// denominator, and signedness). Place those pairs into a worklist for further
/// processing. This indirection is needed because we have to use TrackingVH<>		/// processing. This indirection is needed because we have to use TrackingVH<>
/// because we will be doing RAUW, and if one of the rem instructions we change		/// because we will be doing RAUW, and if one of the rem instructions we change
/// happens to be an input to another div/rem in the maps, we'd have problems.		/// happens to be an input to another div/rem in the maps, we'd have problems.
Show All 9 Lines	for (auto &I : BB) {
if (I.getOpcode() == Instruction::SDiv)		if (I.getOpcode() == Instruction::SDiv)
DivMap[DivRemMapKey(true, I.getOperand(0), I.getOperand(1))] = &I;		DivMap[DivRemMapKey(true, I.getOperand(0), I.getOperand(1))] = &I;
else if (I.getOpcode() == Instruction::UDiv)		else if (I.getOpcode() == Instruction::UDiv)
DivMap[DivRemMapKey(false, I.getOperand(0), I.getOperand(1))] = &I;		DivMap[DivRemMapKey(false, I.getOperand(0), I.getOperand(1))] = &I;
else if (I.getOpcode() == Instruction::SRem)		else if (I.getOpcode() == Instruction::SRem)
RemMap[DivRemMapKey(true, I.getOperand(0), I.getOperand(1))] = &I;		RemMap[DivRemMapKey(true, I.getOperand(0), I.getOperand(1))] = &I;
else if (I.getOpcode() == Instruction::URem)		else if (I.getOpcode() == Instruction::URem)
RemMap[DivRemMapKey(false, I.getOperand(0), I.getOperand(1))] = &I;		RemMap[DivRemMapKey(false, I.getOperand(0), I.getOperand(1))] = &I;
		else if (auto Match = matchExpandedRem(I))
		RemMap[Match->Key] = Match->Value;
}		}
}		}

// We'll accumulate the matching pairs of div-rem instructions here.		// We'll accumulate the matching pairs of div-rem instructions here.
DivRemWorklistTy Worklist;		DivRemWorklistTy Worklist;

// We can iterate over either map because we are only looking for matched		// We can iterate over either map because we are only looking for matched
// pairs. Choose remainders for efficiency because they are usually even more		// pairs. Choose remainders for efficiency because they are usually even more
Show All 34 Lines	static bool optimizeDivRem(Function &F, const TargetTransformInfo &TTI,
bool Changed = false;		bool Changed = false;

// Get the matching pairs of div-rem instructions. We want this extra		// Get the matching pairs of div-rem instructions. We want this extra
// indirection to avoid dealing with having to RAUW the keys of the maps.		// indirection to avoid dealing with having to RAUW the keys of the maps.
DivRemWorklistTy Worklist = getWorklist(F);		DivRemWorklistTy Worklist = getWorklist(F);

// Process each entry in the worklist.		// Process each entry in the worklist.
for (DivRemPairWorklistEntry &E : Worklist) {		for (DivRemPairWorklistEntry &E : Worklist) {
		if (!DebugCounter::shouldExecute(DRPCounter))
		continue;

bool HasDivRemOp = TTI.hasDivRemOp(E.getType(), E.isSigned());		bool HasDivRemOp = TTI.hasDivRemOp(E.getType(), E.isSigned());

auto &DivInst = E.DivInst;		auto &DivInst = E.DivInst;
auto &RemInst = E.RemInst;		auto &RemInst = E.RemInst;

		const bool RemOriginallyWasInExpandedForm = E.isRemExpanded();

		if (HasDivRemOp && E.isRemExpanded()) {
		// The target supports div+rem but the rem is expanded.
		// We should recompose it first.
		Value *X = E.getDividend();
		Value *Y = E.getDivisor();
		Instruction *RealRem = E.isSigned() ? BinaryOperator::CreateSRem(X, Y)
		: BinaryOperator::CreateURem(X, Y);
		// Note that we place it right next to the original expanded instruction,
		// and letting further handling to move it if needed.
		RealRem->setName(RemInst->getName() + ".recomposed");
		RealRem->insertAfter(RemInst);
		// And finally, replace the instruction.
		Instruction *OrigRemInst = RemInst;
		RemInst = RealRem;
		OrigRemInst->replaceAllUsesWith(RealRem);
		OrigRemInst->eraseFromParent();
		NumRecomposed++;
		// Note that we have left ((X / Y) * Y) around.
		// If it had other uses we could rewrite it as X - X % Y
		}

		assert(((RemInst->getOpcode() == Instruction::SRem ||
		RemInst->getOpcode() == Instruction::URem) ||
		!HasDivRemOp) &&
		"If the target supports div-rem, then by now the RemInst is "
		"Instruction::[US]Rem.");

// If the target supports div+rem and the instructions are in the same block		// If the target supports div+rem and the instructions are in the same block
// already, there's nothing to do. The backend should handle this. If the		// already, there's nothing to do. The backend should handle this. If the
// target does not support div+rem, then we will decompose the rem.		// target does not support div+rem, then we will decompose the rem.
if (HasDivRemOp && RemInst->getParent() == DivInst->getParent())		if (HasDivRemOp && RemInst->getParent() == DivInst->getParent())
continue;		continue;

bool DivDominates = DT.dominates(DivInst, RemInst);		bool DivDominates = DT.dominates(DivInst, RemInst);
if (!DivDominates && !DT.dominates(RemInst, DivInst))		if (!DivDominates && !DT.dominates(RemInst, DivInst)) {
		// We have matching div-rem pair, but they are in two different blocks,
		// neither of which dominates one another.
		assert(!RemOriginallyWasInExpandedForm &&
		"Won't happen for expanded-form rem.");
		// FIXME: We could hoist both ops to the common predecessor block?
continue;		continue;
		}

if (!DebugCounter::shouldExecute(DRPCounter))		// The target does not have a single div/rem operation,
		// and the rem is already in expanded form. Nothing to do.
		if (!HasDivRemOp && E.isRemExpanded())
continue;		continue;

if (HasDivRemOp) {		if (HasDivRemOp) {
// The target has a single div/rem operation. Hoist the lower instruction		// The target has a single div/rem operation. Hoist the lower instruction
// to make the matched pair visible to the backend.		// to make the matched pair visible to the backend.
if (DivDominates)		if (DivDominates)
RemInst->moveAfter(DivInst);		RemInst->moveAfter(DivInst);
else		else
DivInst->moveAfter(RemInst);		DivInst->moveAfter(RemInst);
NumHoisted++;		NumHoisted++;
} else {		} else {
// The target does not have a single div/rem operation. Decompose the		// The target does not have a single div/rem operation,
// remainder calculation as:		// and the rem is not in a already-expanded form.
		// Decompose the remainder calculation as:
// X % Y --> X - ((X / Y) * Y).		// X % Y --> X - ((X / Y) * Y).

		assert(!RemOriginallyWasInExpandedForm &&
		"We should not be expanding if the rem was in expanded form to "
		"begin with.");

Value *X = E.getDividend();		Value *X = E.getDividend();
Value *Y = E.getDivisor();		Value *Y = E.getDivisor();
Instruction *Mul = BinaryOperator::CreateMul(DivInst, Y);		Instruction *Mul = BinaryOperator::CreateMul(DivInst, Y);
Instruction *Sub = BinaryOperator::CreateSub(X, Mul);		Instruction *Sub = BinaryOperator::CreateSub(X, Mul);

// If the remainder dominates, then hoist the division up to that block:		// If the remainder dominates, then hoist the division up to that block:
//		//
// bb1:		// bb1:
▲ Show 20 Lines • Show All 99 Lines • Show Last 20 Lines

llvm/test/Transforms/DivRemPairs/X86/div-expanded-rem-pair.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -div-rem-pairs -S -mtriple=x86_64-unknown-unknown | FileCheck %s		; RUN: opt < %s -div-rem-pairs -S -mtriple=x86_64-unknown-unknown | FileCheck %s

declare void @foo(i32, i32)		declare void @foo(i32, i32)

define void @decompose_illegal_srem_same_block(i32 %a, i32 %b) {		define void @decompose_illegal_srem_same_block(i32 %a, i32 %b) {
; CHECK-LABEL: @decompose_illegal_srem_same_block(		; CHECK-LABEL: @decompose_illegal_srem_same_block(
; CHECK-NEXT: [[DIV:%.*]] = sdiv i32 [[A:%.*]], [[B:%.*]]		; CHECK-NEXT: [[DIV:%.*]] = sdiv i32 [[A:%.*]], [[B:%.*]]
; CHECK-NEXT: [[T0:%.*]] = mul i32 [[DIV]], [[B]]		; CHECK-NEXT: [[T0:%.*]] = mul i32 [[DIV]], [[B]]
; CHECK-NEXT: [[REM:%.*]] = sub i32 [[A]], [[T0]]		; CHECK-NEXT: [[REM:%.*]] = srem i32 [[A]], [[B]]
; CHECK-NEXT: call void @foo(i32 [[REM]], i32 [[DIV]])		; CHECK-NEXT: call void @foo(i32 [[REM]], i32 [[DIV]])
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
%div = sdiv i32 %a, %b		%div = sdiv i32 %a, %b
%t0 = mul i32 %div, %b		%t0 = mul i32 %div, %b
%rem = sub i32 %a, %t0		%rem = sub i32 %a, %t0
call void @foo(i32 %rem, i32 %div)		call void @foo(i32 %rem, i32 %div)
ret void		ret void
}		}

define void @decompose_illegal_urem_same_block(i32 %a, i32 %b) {		define void @decompose_illegal_urem_same_block(i32 %a, i32 %b) {
; CHECK-LABEL: @decompose_illegal_urem_same_block(		; CHECK-LABEL: @decompose_illegal_urem_same_block(
; CHECK-NEXT: [[DIV:%.*]] = udiv i32 [[A:%.*]], [[B:%.*]]		; CHECK-NEXT: [[DIV:%.*]] = udiv i32 [[A:%.*]], [[B:%.*]]
; CHECK-NEXT: [[T0:%.*]] = mul i32 [[DIV]], [[B]]		; CHECK-NEXT: [[T0:%.*]] = mul i32 [[DIV]], [[B]]
; CHECK-NEXT: [[REM:%.*]] = sub i32 [[A]], [[T0]]		; CHECK-NEXT: [[REM:%.*]] = urem i32 [[A]], [[B]]
; CHECK-NEXT: call void @foo(i32 [[REM]], i32 [[DIV]])		; CHECK-NEXT: call void @foo(i32 [[REM]], i32 [[DIV]])
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
%div = udiv i32 %a, %b		%div = udiv i32 %a, %b
%t0 = mul i32 %div, %b		%t0 = mul i32 %div, %b
%rem = sub i32 %a, %t0		%rem = sub i32 %a, %t0
call void @foo(i32 %rem, i32 %div)		call void @foo(i32 %rem, i32 %div)
ret void		ret void
}		}

; Recompose and hoist the srem if it's safe and free, otherwise keep as-is..		; Recompose and hoist the srem if it's safe and free, otherwise keep as-is..

define i16 @hoist_srem(i16 %a, i16 %b) {		define i16 @hoist_srem(i16 %a, i16 %b) {
; CHECK-LABEL: @hoist_srem(		; CHECK-LABEL: @hoist_srem(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[DIV:%.*]] = sdiv i16 [[A:%.*]], [[B:%.*]]		; CHECK-NEXT: [[DIV:%.*]] = sdiv i16 [[A:%.*]], [[B:%.*]]
		; CHECK-NEXT: [[REM:%.*]] = srem i16 [[A]], [[B]]
; CHECK-NEXT: [[CMP:%.*]] = icmp eq i16 [[DIV]], 42		; CHECK-NEXT: [[CMP:%.*]] = icmp eq i16 [[DIV]], 42
; CHECK-NEXT: br i1 [[CMP]], label [[IF:%.*]], label [[END:%.*]]		; CHECK-NEXT: br i1 [[CMP]], label [[IF:%.*]], label [[END:%.*]]
; CHECK: if:		; CHECK: if:
; CHECK-NEXT: [[T0:%.*]] = mul i16 [[DIV]], [[B]]		; CHECK-NEXT: [[T0:%.*]] = mul i16 [[DIV]], [[B]]
; CHECK-NEXT: [[REM:%.*]] = sub i16 [[A]], [[T0]]
; CHECK-NEXT: br label [[END]]		; CHECK-NEXT: br label [[END]]
; CHECK: end:		; CHECK: end:
; CHECK-NEXT: [[RET:%.*]] = phi i16 [ [[REM]], [[IF]] ], [ 3, [[ENTRY:%.*]] ]		; CHECK-NEXT: [[RET:%.*]] = phi i16 [ [[REM]], [[IF]] ], [ 3, [[ENTRY:%.*]] ]
; CHECK-NEXT: ret i16 [[RET]]		; CHECK-NEXT: ret i16 [[RET]]
;		;
entry:		entry:
%div = sdiv i16 %a, %b		%div = sdiv i16 %a, %b
%cmp = icmp eq i16 %div, 42		%cmp = icmp eq i16 %div, 42
Show All 10 Lines
}		}

; Recompose and hoist the urem if it's safe and free, otherwise keep as-is..		; Recompose and hoist the urem if it's safe and free, otherwise keep as-is..

define i8 @hoist_urem(i8 %a, i8 %b) {		define i8 @hoist_urem(i8 %a, i8 %b) {
; CHECK-LABEL: @hoist_urem(		; CHECK-LABEL: @hoist_urem(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[DIV:%.*]] = udiv i8 [[A:%.*]], [[B:%.*]]		; CHECK-NEXT: [[DIV:%.*]] = udiv i8 [[A:%.*]], [[B:%.*]]
		; CHECK-NEXT: [[REM:%.*]] = urem i8 [[A]], [[B]]
; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[DIV]], 42		; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[DIV]], 42
; CHECK-NEXT: br i1 [[CMP]], label [[IF:%.*]], label [[END:%.*]]		; CHECK-NEXT: br i1 [[CMP]], label [[IF:%.*]], label [[END:%.*]]
; CHECK: if:		; CHECK: if:
; CHECK-NEXT: [[T0:%.*]] = mul i8 [[DIV]], [[B]]		; CHECK-NEXT: [[T0:%.*]] = mul i8 [[DIV]], [[B]]
; CHECK-NEXT: [[REM:%.*]] = sub i8 [[A]], [[T0]]
; CHECK-NEXT: br label [[END]]		; CHECK-NEXT: br label [[END]]
; CHECK: end:		; CHECK: end:
; CHECK-NEXT: [[RET:%.*]] = phi i8 [ [[REM]], [[IF]] ], [ 3, [[ENTRY:%.*]] ]		; CHECK-NEXT: [[RET:%.*]] = phi i8 [ [[REM]], [[IF]] ], [ 3, [[ENTRY:%.*]] ]
; CHECK-NEXT: ret i8 [[RET]]		; CHECK-NEXT: ret i8 [[RET]]
;		;
entry:		entry:
%div = udiv i8 %a, %b		%div = udiv i8 %a, %b
%cmp = icmp eq i8 %div, 42		%cmp = icmp eq i8 %div, 42
Show All 31 Lines	;
%t6.recomposed = srem i32 %t3.recomposed, %Y		%t6.recomposed = srem i32 %t3.recomposed, %Y
ret i32 %t6.recomposed		ret i32 %t6.recomposed
}		}
define i32 @srem_of_srem_expanded(i32 %X, i32 %Y, i32 %Z) {		define i32 @srem_of_srem_expanded(i32 %X, i32 %Y, i32 %Z) {
; CHECK-LABEL: @srem_of_srem_expanded(		; CHECK-LABEL: @srem_of_srem_expanded(
; CHECK-NEXT: [[T0:%.*]] = mul nsw i32 [[Z:%.*]], [[Y:%.*]]		; CHECK-NEXT: [[T0:%.*]] = mul nsw i32 [[Z:%.*]], [[Y:%.*]]
; CHECK-NEXT: [[T1:%.*]] = sdiv i32 [[X:%.*]], [[T0]]		; CHECK-NEXT: [[T1:%.*]] = sdiv i32 [[X:%.*]], [[T0]]
; CHECK-NEXT: [[T2:%.*]] = mul nsw i32 [[T0]], [[T1]]		; CHECK-NEXT: [[T2:%.*]] = mul nsw i32 [[T0]], [[T1]]
; CHECK-NEXT: [[T3:%.*]] = sub nsw i32 [[X]], [[T2]]		; CHECK-NEXT: [[T3:%.*]] = srem i32 [[X]], [[T0]]
; CHECK-NEXT: [[T4:%.*]] = sdiv i32 [[T3]], [[Y]]		; CHECK-NEXT: [[T4:%.*]] = sdiv i32 [[T3]], [[Y]]
; CHECK-NEXT: [[T5:%.*]] = mul nsw i32 [[T4]], [[Y]]		; CHECK-NEXT: [[T5:%.*]] = mul nsw i32 [[T4]], [[Y]]
; CHECK-NEXT: [[T6:%.*]] = sub nsw i32 [[T3]], [[T5]]		; CHECK-NEXT: [[T6:%.*]] = srem i32 [[T3]], [[Y]]
; CHECK-NEXT: ret i32 [[T6]]		; CHECK-NEXT: ret i32 [[T6]]
;		;
%t0 = mul nsw i32 %Z, %Y		%t0 = mul nsw i32 %Z, %Y
%t1 = sdiv i32 %X, %t0		%t1 = sdiv i32 %X, %t0
%t2 = mul nsw i32 %t0, %t1		%t2 = mul nsw i32 %t0, %t1
%t3 = sub nsw i32 %X, %t2		%t3 = sub nsw i32 %X, %t2
%t4 = sdiv i32 %t3, %Y		%t4 = sdiv i32 %t3, %Y
%t5 = mul nsw i32 %t4, %Y		%t5 = mul nsw i32 %t4, %Y
Show All 34 Lines

llvm/test/Transforms/DivRemPairs/X86/div-rem-pairs.ll

Show First 20 Lines • Show All 166 Lines • ▼ Show 20 Lines	;
%t6.recomposed = srem i32 %t3.recomposed, %Y		%t6.recomposed = srem i32 %t3.recomposed, %Y
ret i32 %t6.recomposed		ret i32 %t6.recomposed
}		}
define i32 @srem_of_srem_expanded(i32 %X, i32 %Y, i32 %Z) {		define i32 @srem_of_srem_expanded(i32 %X, i32 %Y, i32 %Z) {
; CHECK-LABEL: @srem_of_srem_expanded(		; CHECK-LABEL: @srem_of_srem_expanded(
; CHECK-NEXT: [[T0:%.*]] = mul nsw i32 [[Z:%.*]], [[Y:%.*]]		; CHECK-NEXT: [[T0:%.*]] = mul nsw i32 [[Z:%.*]], [[Y:%.*]]
; CHECK-NEXT: [[T1:%.*]] = sdiv i32 [[X:%.*]], [[T0]]		; CHECK-NEXT: [[T1:%.*]] = sdiv i32 [[X:%.*]], [[T0]]
; CHECK-NEXT: [[T2:%.*]] = mul nsw i32 [[T0]], [[T1]]		; CHECK-NEXT: [[T2:%.*]] = mul nsw i32 [[T0]], [[T1]]
; CHECK-NEXT: [[T3:%.*]] = sub nsw i32 [[X]], [[T2]]		; CHECK-NEXT: [[T3:%.*]] = srem i32 [[X]], [[T0]]
; CHECK-NEXT: [[T4:%.*]] = sdiv i32 [[T3]], [[Y]]		; CHECK-NEXT: [[T4:%.*]] = sdiv i32 [[T3]], [[Y]]
; CHECK-NEXT: [[T5:%.*]] = mul nsw i32 [[T4]], [[Y]]		; CHECK-NEXT: [[T5:%.*]] = mul nsw i32 [[T4]], [[Y]]
; CHECK-NEXT: [[T6:%.*]] = sub nsw i32 [[T3]], [[T5]]		; CHECK-NEXT: [[T6:%.*]] = srem i32 [[T3]], [[Y]]
; CHECK-NEXT: ret i32 [[T6]]		; CHECK-NEXT: ret i32 [[T6]]
;		;
%t0 = mul nsw i32 %Z, %Y		%t0 = mul nsw i32 %Z, %Y
%t1 = sdiv i32 %X, %t0		%t1 = sdiv i32 %X, %t0
%t2 = mul nsw i32 %t0, %t1		%t2 = mul nsw i32 %t0, %t1
%t3 = sub nsw i32 %X, %t2		%t3 = sub nsw i32 %X, %t2
%t4 = sdiv i32 %t3, %Y		%t4 = sdiv i32 %t3, %Y
%t5 = mul nsw i32 %t4, %Y		%t5 = mul nsw i32 %t4, %Y
▲ Show 20 Lines • Show All 154 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)
ClosedPublic
