This is an archive of the discontinued LLVM Phabricator instance.

Differential D32245

Add an IR expansion pass for the experimental reductions
ClosedPublic

Authored by aemerson on Apr 19 2017, 2:34 PM.

Details

Reviewers

mkuper
delena
rengolin

Commits

rG836b0f48c116: Add a late IR expansion pass for the experimental reduction intrinsics.
rL302631: Add a late IR expansion pass for the experimental reduction intrinsics.

Summary

This is an IR expansion pass intended to allow targets to opt-in to using the experimental reduction intrinsics introduced in D30086.

Its purpose is to see the effects of switching to the intrinsics in the IR, so this pass should be added to a target's pass config late, just before codegen. The expansion should result in the same shufflevector sequence form that targets currently expect reductions to be in.
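
The "shufflevector sequence form" mentioned above can be modelled in plain C++ without LLVM. The following is only a sketch of the idea: `shuffleReduceModel` and `reduceAddModel` are hypothetical names, and a `std::vector` stands in for an IR vector value. Each round folds the upper half of the live lanes into the lower half, which is what a `shufflevector` followed by a vector op does, and the result is read from lane 0, as in the `extractelement ..., i32 0` line of the test file.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Scalar model of a log2 shuffle reduction (illustrative, not LLVM code).
// Each round combines the upper half of the live lanes into the lower half.
template <typename T, typename BinOp>
T shuffleReduceModel(std::vector<T> Lanes, BinOp Op) {
  const std::size_t VF = Lanes.size();
  // Same precondition as the assertion in getShuffleReduction.
  assert(VF != 0 && (VF & (VF - 1)) == 0 && "power-of-two lane count only");
  for (std::size_t Width = VF; Width != 1; Width >>= 1) {
    const std::size_t Half = Width / 2;
    for (std::size_t Lane = 0; Lane != Half; ++Lane)
      Lanes[Lane] = Op(Lanes[Lane], Lanes[Lane + Half]);
  }
  return Lanes[0]; // models extracting element 0 of the final vector
}

// Convenience wrapper for an integer add reduction.
int reduceAddModel(const std::vector<int> &Lanes) {
  return shuffleReduceModel(Lanes, std::plus<int>());
}
```

For a vector of VF lanes the loop runs log2(VF) rounds, so a 4-lane add reduction takes two shuffle-and-add steps before the final extract.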

Diff Detail

Repository: rL LLVM

Event Timeline

aemerson created this revision. Apr 19 2017, 2:34 PM

Herald added a subscriber: mgorny. Apr 19 2017, 2:34 PM

aemerson added a parent revision: D30086: Add generic IR vector reductions. Apr 19 2017, 2:35 PM

tschuett added a subscriber: tschuett. Apr 19 2017, 11:05 PM

Adding some target people - I think they ought to care about this more than I do. :-)

lib/CodeGen/ExpandReductions.cpp
82	I'm not a huge fan of this - I would prefer not to rely on the invalidation semantics. Maybe collect all relevant instructions into a vector first, then do the replacement? (But if others disagree, this is fine.)
84	How do you expect this to happen?
92	What do we expect to happen in this case?
96	Please annotate the fallthrough here. (And perhaps it would be better to rewrite this to avoid it)
105	What about the internal instructions?
117	I'd expect a target query somewhere, regarding whether the intrinsic needs to be expanded.
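
The alternative raised in the comment on line 82 (collect the relevant instructions first, then replace them) can be illustrated without LLVM. This is a minimal sketch under stated assumptions: a `std::list<int>` stands in for a function's instruction list, negative values play the role of reduction intrinsics, and `InstList` and `expandCollected` are hypothetical names that do not appear in the patch.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical stand-in for an instruction list: negative values play the
// role of reduction intrinsics that must be expanded and erased.
using InstList = std::list<int>;

// Phase 1 only reads the list and records candidates; phase 2 rewrites.
// No iterator is advanced across an erased element, so the loop does not
// depend on the container's invalidation rules. Returns the number of
// "intrinsics" expanded.
std::size_t expandCollected(InstList &Insts) {
  std::vector<InstList::iterator> Worklist;
  for (auto It = Insts.begin(), E = Insts.end(); It != E; ++It)
    if (*It < 0)
      Worklist.push_back(It);

  for (InstList::iterator It : Worklist) {
    Insts.insert(It, -*It); // emit the "expanded" replacement before it
    Insts.erase(It);        // in std::list this invalidates only It itself
  }
  return Worklist.size();
}
```

The trade-off is an extra vector of pointers per function in exchange for a loop whose correctness is easier to see at a glance.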

aemerson marked 4 inline comments as done. Apr 26 2017, 3:18 AM

aemerson added a subscriber: RKSimon.

aemerson added inline comments.

lib/CodeGen/ExpandReductions.cpp
82	I can do that, but I've seen this kind of thing done before in other places.
84	Sorry, not entirely sure what you mean? This is an early exit if the given instruction isn't an intrinsic call.
92	As no in-tree target currently supports ordered reductions, and given that for SVE we want to enable support completely without using this expansion pass, I decided against trying to handle ordered reductions here. We just skip the intrinsic if we find it's an ordered reduction. If other targets want to experiment with ordered I think they can implement expansion via some scalarization method here.
117	My expectation was that targets wouldn't need, at least at first, that level of granularity. @RKSimon what do you think about this?
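
The distinction in the comment on line 92 between ordered and unordered floating-point reductions matters because reassociation can change the result. The sketch below is illustrative only (`orderedFAdd` and `treeFAdd` are hypothetical helpers, and it assumes IEEE single-precision evaluation without fast-math compiler flags): the same four lanes sum to different values depending on the order of the additions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Ordered reduction: strict left-to-right accumulation.
float orderedFAdd(const std::vector<float> &V) {
  float Acc = 0.0f;
  for (float X : V)
    Acc += X;
  return Acc;
}

// Reassociated reduction: fold the upper half into the lower half each
// round, mirroring the log2 shuffle form. Requires a power-of-two size.
float treeFAdd(std::vector<float> V) {
  assert(!V.empty() && (V.size() & (V.size() - 1)) == 0);
  for (std::size_t Width = V.size(); Width != 1; Width /= 2)
    for (std::size_t I = 0; I != Width / 2; ++I)
      V[I] += V[I + Width / 2];
  return V[0];
}
```

With lanes `{1e8f, 1.0f, -1e8f, 1.0f}` the ordered form loses the first `1.0f` when it is added to `1e8f`, while the tree form cancels the two large values first. This is why the shuffle expansion is only valid when the call carries the fast-math flags that permit reassociation.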

Please add full context to the diff - especially as it's dependent on another (in progress) patch.

lib/CodeGen/ExpandReductions.cpp
117	Are we guaranteeing that the reductions will match the ones supported by TargetTransformInfo::getReductionCost ?
test/CodeGen/Generic/expand-experimental-reductions.ll
1	I'd prefer to see the full reduction codegen here - regenerate with utils\update_test_checks.py ?

aemerson marked 3 inline comments as done. Apr 26 2017, 5:47 AM

aemerson added inline comments.

lib/CodeGen/ExpandReductions.cpp
117	Do you mean useReductionIntrinsic()? If so, I suppose it comes down to the exact use case of this expansion. Michael originally asked for this so that targets could check the effect of using the intrinsics at the IR level only, and at a very late stage converting them into the shuffle form we have now. For that, I don't see why you would care about which individual intrinsics are expanded, rather than a simple on/off decision. If however there might be more uses of this, for example in future, if we want to enable intrinsic forms for all targets as a canonical form, and then use this pass with TTI to make a target dependent decision on which codegen-level form is preferred, then I think a TTI hook would make sense. I can add a hook anyway, perhaps defaulting to "expand all intrinsics" unless the target overrides it.

RKSimon added inline comments. Apr 26 2017, 6:35 AM

lib/CodeGen/ExpandReductions.cpp
96	You've marked it as done but LLVM_FALLTHROUGH is still missing from these - you will get warnings on some buildbots

aemerson added inline comments. Apr 26 2017, 6:39 AM

lib/CodeGen/ExpandReductions.cpp
96	I might be misunderstanding what the "Done" means. I used it to mean I'll address this in the next patch when I upload it. I haven't got around to that yet.

mkuper added inline comments. Apr 26 2017, 10:45 AM

lib/CodeGen/ExpandReductions.cpp
82	I'd suggest getting another reviewer's opinion on this.
84	Sorry, I misread this as dyn_cast<Instruction>, ignore.
92	The issue is that target-independent intrinsics are, by definition, supposed to be handled by any target. I shouldn't see a backend crash if I write IR that has the ordered intrinsic, and try to compile it for x86. Having said that - this is fine for now, but if we ever want to make these intrinsics non-experimental, this will have to be dealt with somehow. Please add a TODO.
117	> I can add a hook anyway, perhaps defaulting to "expand all intrinsics" unless the target overrides it.
	That's exactly what I'd expect, thanks.
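
The hook agreed on above ("expand all intrinsics" unless the target overrides it) follows a common default-with-override pattern. The sketch below is not LLVM's TargetTransformInfo interface: `TargetHooksBase`, `NativeAddTarget`, `ReductionKind` and `countExpanded` are hypothetical names, and the hook signature is simplified for illustration. Only the name `shouldExpandReduction` comes from this review.

```cpp
#include <vector>

// Hypothetical, simplified stand-ins; not LLVM's actual classes.
enum class ReductionKind { Add, Mul, And, Or, Xor, SMax, SMin, FMax, FMin };

struct TargetHooksBase {
  virtual ~TargetHooksBase() = default;
  // Default policy: expand every reduction intrinsic into the shuffle form.
  virtual bool shouldExpandReduction(ReductionKind) const { return true; }
};

// An imagined target that can select integer add reductions natively and
// therefore asks the pass to leave those intrinsics alone.
struct NativeAddTarget : TargetHooksBase {
  bool shouldExpandReduction(ReductionKind Kind) const override {
    return Kind != ReductionKind::Add;
  }
};

// Models the pass's per-intrinsic query: counts how many would be expanded.
unsigned countExpanded(const TargetHooksBase &Hooks,
                       const std::vector<ReductionKind> &Kinds) {
  unsigned Expanded = 0;
  for (ReductionKind Kind : Kinds)
    if (Hooks.shouldExpandReduction(Kind))
      ++Expanded;
  return Expanded;
}
```

With this shape, a target that does nothing keeps the shuffle form it already expects, and a target that wants some intrinsics to reach codegen overrides a single function.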

Addressed review comments, rewritten the pass a bit to be somewhat neater. D30086 is now committed, so this is ready to go if it looks ok.

Herald added a subscriber: javed.absar. May 9 2017, 7:56 AM

LGTM, with a nit.

lib/CodeGen/ExpandReductions.cpp
138	I don't believe you should ever be in the situation you don't have a TTI here. So it should be safe to just do: const auto *TTI = getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);

This revision is now accepted and ready to land. May 9 2017, 3:38 PM

Thanks, I'll make that change and commit.

Closed by commit rL302631: Add a late IR expansion pass for the experimental reduction intrinsics. (authored by aemerson). May 10 2017, 2:56 AM

This revision was automatically updated to reflect the committed changes.

ZhangKang marked an inline comment as done. Sep 25 2019, 12:46 AM

ZhangKang added a subscriber: ZhangKang.

ZhangKang added inline comments.

llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h

466 ↗

(On Diff #98420)

Hello @aemerson,
Here you set the function shouldExpandReduction to return true.
Consider the test case below:

declare i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a)
define i8 @test_v3i8(<3 x i8> %a) nounwind {
  %b = call i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a)
  ret i8 %b
}

If I build the above case for PPC, I get the error below:

llc error_case.ll -mtriple=powerpc64-unknown-linux-gnu
llc: /home/shkzhang/llvm/llvm/lib/Transforms/Utils/LoopUtils.cpp:828: llvm::Value *llvm::getShuffleReduction(IRBuilder<> &, llvm::Value *, unsigned int, RecurrenceDescriptor::MinMaxRecurrenceKind, ArrayRef<llvm::Value *>): Assertion `isPowerOf2_32(VF) && "Reduction emission only supported for pow2 vectors!"' failed.
Stack dump:
0.	Program arguments: llc error_case.ll -mtriple=powerpc64-unknown-linux-gnu
1.	Running pass 'Function Pass Manager' on module 'error_case.ll'.
2.	Running pass 'Expand reduction intrinsics' on function '@test_v3i8'
 #0 0x000000001244d094 PrintStackTraceSignalHandler(void*) (/home/shkzhang/llvm/build/bin/llc+0x1244d094)
 #1 0x000000001244a348 llvm::sys::RunSignalHandlers() (/home/shkzhang/llvm/build/bin/llc+0x1244a348)
 #2 0x000000001244d6cc SignalHandler(int) (/home/shkzhang/llvm/build/bin/llc+0x1244d6cc)
 #3 0x00007869689104d8 (linux-vdso64.so.1+0x4d8)
 #4 0x00007869681ee98c __libc_signal_restore_set /build/glibc-uvws04/glibc-2.27/signal/../sysdeps/unix/sysv/linux/nptl-signals.h:80:0
 #5 0x00007869681ee98c raise /build/glibc-uvws04/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:48:0
 #6 0x00007869681f0be0 abort /build/glibc-uvws04/glibc-2.27/stdlib/abort.c:79:0
 #7 0x00007869681dbb38 __assert_fail_base /build/glibc-uvws04/glibc-2.27/assert/assert.c:92:0
 #8 0x00007869681dbbe4 __assert_fail /build/glibc-uvws04/glibc-2.27/assert/assert.c:101:0
 #9 0x00000000124e036c llvm::getShuffleReduction(llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::Value*, unsigned int, llvm::RecurrenceDescriptor::MinMaxRecurrenceKind, llvm::ArrayRef<llvm::Value*>) (/home/shkzhang/llvm/build/bin/llc+0x124e036c)
#10 0x000000001175de9c (anonymous namespace)::expandReductions(llvm::Function&, llvm::TargetTransformInfo const*) (/home/shkzhang/llvm/build/bin/llc+0x1175de9c)
#11 0x0000000011cc9700 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/shkzhang/llvm/build/bin/llc+0x11cc9700)
#12 0x0000000011cc9b90 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/shkzhang/llvm/build/bin/llc+0x11cc9b90)
#13 0x0000000011cca354 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/shkzhang/llvm/build/bin/llc+0x11cca354)
#14 0x0000000011cca9ec llvm::legacy::PassManager::run(llvm::Module&) (/home/shkzhang/llvm/build/bin/llc+0x11cca9ec)
#15 0x0000000010377408 compileModule(char**, llvm::LLVMContext&) (/home/shkzhang/llvm/build/bin/llc+0x10377408)
#16 0x0000000010374a3c main (/home/shkzhang/llvm/build/bin/llc+0x10374a3c)
#17 0x00007869681c441c generic_start_main /build/glibc-uvws04/glibc-2.27/csu/../csu/libc-start.c:310:0
#18 0x00007869681c4618 __libc_start_main /build/glibc-uvws04/glibc-2.27/csu/../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116:0
Aborted (core dumped)

This happens because the test uses v3i8, whose element count is not a power of 2. On targets such as AArch64 this case passes, because shouldExpandReduction returns false there.

My question is whether we should fix the above error. For example, if the number of elements is not a power of 2, should we avoid calling shouldExpandReduction?
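
One possible shape for the fix asked about above is a guard on the lane count before the shuffle sequence is chosen. This is only a sketch of that idea and does not describe what LLVM ended up doing: `Lowering`, `chooseLowering` and `reduceAndScalar` are hypothetical names, and the scalar chain is just one fallback that works for any lane count, including the `<3 x i8>` case from the report.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical lowering choices for a reduction of VF lanes.
enum class Lowering { ShuffleSequence, ScalarChain };

bool isPowerOf2(unsigned VF) { return VF != 0 && (VF & (VF - 1)) == 0; }

// Only pick the log2 shuffle form when its precondition holds; otherwise
// fall back instead of tripping the assertion.
Lowering chooseLowering(unsigned VF) {
  return isPowerOf2(VF) ? Lowering::ShuffleSequence : Lowering::ScalarChain;
}

// Scalar fallback for an `and` reduction over i8 lanes: extract each lane
// and combine it into a running value. Valid for any number of lanes.
std::uint8_t reduceAndScalar(const std::vector<std::uint8_t> &Lanes) {
  std::uint8_t Acc = 0xFF; // identity element for bitwise and on 8 bits
  for (std::uint8_t Lane : Lanes)
    Acc = static_cast<std::uint8_t>(Acc & Lane);
  return Acc;
}
```

A scalar chain costs VF - 1 operations rather than log2(VF) vector steps, so a guard like this trades speed for not crashing on unusual vector widths.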

Herald added a project: Restricted Project. Sep 25 2019, 12:46 AM

Revision Contents

Path                                                    Size
include/llvm/CodeGen/ExpandReductions.h                 24 lines
include/llvm/CodeGen/Passes.h                           4 lines
include/llvm/InitializePasses.h                         1 line
include/llvm/Transforms/Utils/LoopUtils.h               6 lines
lib/CodeGen/CMakeLists.txt                              1 line
lib/CodeGen/ExpandReductions.cpp                        160 lines
lib/Transforms/Utils/LoopUtils.cpp                      6 lines
test/CodeGen/Generic/expand-experimental-reductions.ll  167 lines
tools/llc/llc.cpp                                       1 line
tools/opt/opt.cpp                                       1 line

Diff 95820

include/llvm/CodeGen/ExpandReductions.h

This file was added.

				//===----- ExpandReductions.h - Expand experimental reduction intrinsics --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CODEGEN_EXPANDREDUCTIONS_H
				#define LLVM_CODEGEN_EXPANDREDUCTIONS_H

				#include "llvm/IR/PassManager.h"

				namespace llvm {

				class ExpandReductionsPass
				: public PassInfoMixin<ExpandReductionsPass> {
				public:
				PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
				};
				} // end namespace llvm

				#endif // LLVM_CODEGEN_EXPANDREDUCTIONS_H

include/llvm/CodeGen/Passes.h

Context not available.
  /// This pass frees the memory occupied by the MachineFunction.
  FunctionPass *createFreeMachineFunctionPass();

  /// This pass combine basic blocks guarded by the same branch.
  extern char &BranchCoalescingID;

  /// This pass performs outlining on machine instructions directly before
  /// printing assembly.
  ModulePass *createMachineOutlinerPass();

+ /// This pass expands the experimental reduction intrinsics into sequences of
+ /// shuffles.
+ FunctionPass *createExpandReductionsPass();

  } // End llvm namespace

  /// Target machine pass initializer for passes with dependencies. Use with
  /// INITIALIZE_TM_PASS_END.
  #define INITIALIZE_TM_PASS_BEGIN INITIALIZE_PASS_BEGIN

  /// Target machine pass initializer for passes with dependencies. Use with
  /// INITIALIZE_TM_PASS_BEGIN.
  #define INITIALIZE_TM_PASS_END(passName, arg, name, cfg, analysis) \
  PassInfo *PI = new PassInfo( \
Context not available.

include/llvm/InitializePasses.h

Context not available.
  void initializeDominatorTreeWrapperPassPass(PassRegistry&);
  void initializeDwarfEHPreparePass(PassRegistry&);
  void initializeEarlyCSELegacyPassPass(PassRegistry&);
  void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry&);
  void initializeEarlyIfConverterPass(PassRegistry&);
  void initializeEdgeBundlesPass(PassRegistry&);
  void initializeEfficiencySanitizerPass(PassRegistry&);
  void initializeEliminateAvailableExternallyLegacyPassPass(PassRegistry&);
  void initializeExpandISelPseudosPass(PassRegistry&);
  void initializeExpandPostRAPass(PassRegistry&);
+ void initializeExpandReductionsPass(PassRegistry&);
  void initializeExternalAAWrapperPassPass(PassRegistry&);
  void initializeFEntryInserterPass(PassRegistry&);
  void initializeFinalizeMachineBundlesPass(PassRegistry&);
  void initializeFlattenCFGPassPass(PassRegistry&);
  void initializeFloat2IntLegacyPassPass(PassRegistry&);
  void initializeForceFunctionAttrsLegacyPassPass(PassRegistry&);
  void initializeForwardControlFlowIntegrityPass(PassRegistry&);
  void initializeFuncletLayoutPass(PassRegistry&);
  void initializeFunctionImportLegacyPassPass(PassRegistry&);
  void initializeGCMachineCodeAnalysisPass(PassRegistry&);
Context not available.

include/llvm/Transforms/Utils/LoopUtils.h

Context not available.
  /// preheader to loop body (no speculation).
  /// If SafetyInfo is not null, we are checking for hoisting/sinking
  /// instructions from loop body to preheader/exit. Check if the instruction
  /// can execute speculatively.
  /// If \p ORE is set use it to emit optimization remarks.
  bool canSinkOrHoistInst(Instruction &I, AAResults *AA, DominatorTree *DT,
  Loop *CurLoop, AliasSetTracker *CurAST,
  LoopSafetyInfo *SafetyInfo,
  OptimizationRemarkEmitter *ORE = nullptr);

+ /// Generates a vector reduction using shufflevectors to reduce the value.
+ Value *getShuffleReduction(
+ IRBuilder<> &Builder, Value *Src, unsigned Op,
+ RecurrenceDescriptor::MinMaxRecurrenceKind *MinMaxKind = nullptr,
+ ArrayRef<Value *> RedOps = ArrayRef<Value *>());

  /// Create a target reduction of the given vector. The reduction must be simple,
  /// that is, it must not be complex like a minmax reduction.
  Value *createTargetReduction(IRBuilder<> &B, const TargetTransformInfo *TTI,
  unsigned Opcode, Value *Src,
  ArrayRef<Value *> RedOps = ArrayRef<Value *>());

  /// Create a generic target reduction. This queries the target to determine if
  /// it wants the given reduction type as an intrinsic or a log2 shuffle IR
  /// pattern.
  Value *createTargetReduction(IRBuilder<> &B,
Context not available.

lib/CodeGen/CMakeLists.txt

Context not available.
  CriticalAntiDepBreaker.cpp
  DeadMachineInstructionElim.cpp
  DetectDeadLanes.cpp
  DFAPacketizer.cpp
  DwarfEHPrepare.cpp
  EarlyIfConversion.cpp
  EdgeBundles.cpp
  ExecutionDepsFix.cpp
  ExpandISelPseudos.cpp
  ExpandPostRAPseudos.cpp
+ ExpandReductions.cpp
  FaultMaps.cpp
  FEntryInserter.cpp
  FuncletLayout.cpp
  GCMetadata.cpp
  GCMetadataPrinter.cpp
  GCRootLowering.cpp
  GCStrategy.cpp
  GlobalMerge.cpp
  IfConversion.cpp
  ImplicitNullChecks.cpp
Context not available.

lib/CodeGen/ExpandReductions.cpp

This file was added.

				//===--- ExpandReductions.cpp - Expand experimental reduction intrinsics --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This pass implements IR expansion for reduction intrinsics, allowing targets
				// to enable the experimental intrinsics until just before codegen.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/CodeGen/ExpandReductions.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/IR/Function.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/InstIterator.h"
				#include "llvm/IR/Intrinsics.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/Module.h"
				#include "llvm/Transforms/Utils/LoopUtils.h"
				#include "llvm/Pass.h"

				using namespace llvm;

				namespace {

				unsigned getOpcode(Intrinsic::ID ID) {
				switch (ID) {
				case Intrinsic::experimental_vector_reduce_fadd:
				return Instruction::FAdd;
				case Intrinsic::experimental_vector_reduce_fmul:
				return Instruction::FMul;
				case Intrinsic::experimental_vector_reduce_add:
				return Instruction::Add;
				case Intrinsic::experimental_vector_reduce_mul:
				return Instruction::Mul;
				case Intrinsic::experimental_vector_reduce_and:
				return Instruction::And;
				case Intrinsic::experimental_vector_reduce_or:
				return Instruction::Or;
				case Intrinsic::experimental_vector_reduce_xor:
				return Instruction::Xor;
				case Intrinsic::experimental_vector_reduce_smax:
				case Intrinsic::experimental_vector_reduce_smin:
				case Intrinsic::experimental_vector_reduce_umax:
				case Intrinsic::experimental_vector_reduce_umin:
				return Instruction::ICmp;
				case Intrinsic::experimental_vector_reduce_fmax:
				case Intrinsic::experimental_vector_reduce_fmin:
				return Instruction::FCmp;
				default:
				llvm_unreachable("Unexpected ID");
				}
				}

				RecurrenceDescriptor::MinMaxRecurrenceKind getMRK(Intrinsic::ID ID) {
				switch (ID) {
				case Intrinsic::experimental_vector_reduce_smax:
				return RecurrenceDescriptor::MRK_SIntMax;
				case Intrinsic::experimental_vector_reduce_smin:
				return RecurrenceDescriptor::MRK_SIntMin;
				case Intrinsic::experimental_vector_reduce_umax:
				return RecurrenceDescriptor::MRK_UIntMax;
				case Intrinsic::experimental_vector_reduce_umin:
				return RecurrenceDescriptor::MRK_UIntMin;
				case Intrinsic::experimental_vector_reduce_fmax:
				return RecurrenceDescriptor::MRK_FloatMax;
				case Intrinsic::experimental_vector_reduce_fmin:
				return RecurrenceDescriptor::MRK_FloatMin;
				default:
				llvm_unreachable("Unexpected ID");
				}
				}

				bool expandReductions(Function &F) {
				bool Changed = false;
				inst_iterator NextIt;
				for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; I = NextIt) {
				NextIt = std::next(I);
				auto II = dyn_cast<IntrinsicInst>(&*I);
				if (!II)
				continue;
				IRBuilder<> Builder(II);
				Value *Vec = nullptr;
				auto ID = II->getIntrinsicID();
				switch (ID) {
				case Intrinsic::experimental_vector_reduce_fadd:
				case Intrinsic::experimental_vector_reduce_fmul:
				// FMFs must be attached to the call, otherwise it's an ordered reduction
				// and it can't be handled by generating this shuffle sequence.
				if (!II->getFastMathFlags().unsafeAlgebra())
				continue;
				Vec = II->getArgOperand(1);
				case Intrinsic::experimental_vector_reduce_add:
				case Intrinsic::experimental_vector_reduce_mul:
				case Intrinsic::experimental_vector_reduce_and:
				case Intrinsic::experimental_vector_reduce_or:
				case Intrinsic::experimental_vector_reduce_xor: {
				if (!Vec)
				Vec = II->getArgOperand(0);
				auto Rdx = getShuffleReduction(Builder, Vec, getOpcode(ID));
				cast<Instruction>(Rdx)->setDebugLoc(II->getDebugLoc());
				II->replaceAllUsesWith(Rdx);
				II->eraseFromParent();
				Changed = true;
				continue;
				}
				case Intrinsic::experimental_vector_reduce_smax:
				case Intrinsic::experimental_vector_reduce_smin:
				case Intrinsic::experimental_vector_reduce_umax:
				case Intrinsic::experimental_vector_reduce_umin:
				case Intrinsic::experimental_vector_reduce_fmax:
				case Intrinsic::experimental_vector_reduce_fmin: {
				Vec = II->getArgOperand(0);
				auto MRK = getMRK(ID);
				auto Rdx = getShuffleReduction(Builder, Vec, getOpcode(ID), &MRK);
				cast<Instruction>(Rdx)->setDebugLoc(II->getDebugLoc());
				II->replaceAllUsesWith(Rdx);
				II->eraseFromParent();
				Changed = true;
				continue;
				}
				default:
				continue;
				}
				}
				return Changed;
				}

				class ExpandReductions : public FunctionPass {
				public:
				static char ID;
				ExpandReductions() : FunctionPass(ID) {}

				bool runOnFunction(Function &F) {
				return expandReductions(F);
				}
				};

				char ExpandReductions::ID;
				}

				INITIALIZE_PASS(ExpandReductions, "expand-reductions",
				"Expand reduction intrinsics", false, false)

				namespace llvm {
				FunctionPass *createExpandReductionsPass() {
				return new ExpandReductions();
				}

				PreservedAnalyses ExpandReductionsPass::run(Function &F,
				FunctionAnalysisManager &AM) {
				if (!expandReductions(F))
				return PreservedAnalyses::all();
				return PreservedAnalyses::none();
				}
				} // End llvm namespace

lib/Transforms/Utils/LoopUtils.cpp

Context not available.
  static Value *addFastMathFlag(Value *V) {
  if (isa<FPMathOperator>(V)) {
  FastMathFlags Flags;
  Flags.setUnsafeAlgebra();
  cast<Instruction>(V)->setFastMathFlags(Flags);
  }
  return V;
  }

  // Helper to generate a log2 shuffle reduction.
- static Value *getShuffleReduction(
+ Value *llvm::getShuffleReduction(
  IRBuilder<> &Builder, Value *Src, unsigned Op,
- RecurrenceDescriptor::MinMaxRecurrenceKind *MinMaxKind = nullptr,
- ArrayRef<Value *> RedOps = ArrayRef<Value *>()) {
+ RecurrenceDescriptor::MinMaxRecurrenceKind *MinMaxKind,
+ ArrayRef<Value *> RedOps) {
  unsigned VF = Src->getType()->getVectorNumElements();
  // VF is a power of 2 so we can emit the reduction using log2(VF) shuffles
  // and vector ops, reducing the set of values being computed by half each
  // round.
  assert(isPowerOf2_32(VF) &&
  "Reduction emission only supported for pow2 vectors!");
  Value *TmpVec = Src;
  SmallVector<Constant *, 32> ShuffleMask(VF, nullptr);
  for (unsigned i = VF; i != 1; i >>= 1) {
  // Move the upper half of the vector to the lower half.
Context not available.
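
The comment in getShuffleReduction describes halving the set of live values each round. As a rough illustration only, the same scheme can be modelled on a plain std::vector in standalone C++. The name shuffleReduce and its signature are hypothetical and are not part of LLVM; the sketch mimics the shape of the emitted shuffle, vector op, and final extract of lane 0, without using IRBuilder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper (not an LLVM API): models a log2 shuffle reduction on a
// power-of-two-length vector. Each round combines the lower half with the
// upper half, so the number of meaningful lanes is halved every iteration.
template <typename T, typename Op>
T shuffleReduce(std::vector<T> Vec, Op Combine) {
  const std::size_t VF = Vec.size();
  assert(VF != 0 && (VF & (VF - 1)) == 0 &&
         "Reduction only modelled for pow2 vectors!");
  for (std::size_t I = VF; I != 1; I >>= 1) {
    const std::size_t Half = I / 2;
    // "Shuffle": lane J of the shuffled vector would hold lane J + Half.
    // "Vector op": lane J becomes Combine(Vec[J], Shuffled[J]).
    for (std::size_t J = 0; J != Half; ++J)
      Vec[J] = Combine(Vec[J], Vec[J + Half]);
  }
  // Counterpart of the final "extractelement ..., i32 0".
  return Vec[0];
}
```

Passing std::plus gives an add reduction, while passing a lambda such as `A > B ? A : B` models the compare-and-select form used for the min/max reductions. For a 4-lane input the loop runs two rounds, which matches the two shuffle/op pairs expected by the 4 x float tests in this revision.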

test/CodeGen/Generic/expand-experimental-reductions.ll

This file was added.

				; RUN: opt < %s -mtriple=aarch64 -expand-reductions -S | FileCheck %s
				Inline comment from RKSimon: I'd prefer to see the full reduction codegen here - regenerate with utils\update_test_checks.py?
				declare i64 @llvm.experimental.vector.reduce.add.i64.v2i64(<2 x i64>)
				declare i64 @llvm.experimental.vector.reduce.mul.i64.v2i64(<2 x i64>)
				declare i64 @llvm.experimental.vector.reduce.and.i64.v2i64(<2 x i64>)
				declare i64 @llvm.experimental.vector.reduce.or.i64.v2i64(<2 x i64>)
				declare i64 @llvm.experimental.vector.reduce.xor.i64.v2i64(<2 x i64>)

				declare float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float, <4 x float>)
				declare float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float, <4 x float>)

				declare i64 @llvm.experimental.vector.reduce.smax.i64.v2i64(<2 x i64>)
				declare i64 @llvm.experimental.vector.reduce.smin.i64.v2i64(<2 x i64>)
				declare i64 @llvm.experimental.vector.reduce.umax.i64.v2i64(<2 x i64>)
				declare i64 @llvm.experimental.vector.reduce.umin.i64.v2i64(<2 x i64>)

				declare double @llvm.experimental.vector.reduce.fmax.f64.v2f64(<2 x double>)
				declare double @llvm.experimental.vector.reduce.fmin.f64.v2f64(<2 x double>)


				define i64 @add_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @add_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[RDX:%[a-zA-Z0-9.]+]] = add <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[RDX]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.add.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define i64 @mul_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @mul_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[RDX:%[a-zA-Z0-9.]+]] = mul <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[RDX]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.mul.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define i64 @and_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @and_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[RDX:%[a-zA-Z0-9.]+]] = and <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[RDX]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.and.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define i64 @or_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @or_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[RDX:%[a-zA-Z0-9.]+]] = or <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[RDX]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.or.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define i64 @xor_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @xor_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[RDX:%[a-zA-Z0-9.]+]] = xor <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[RDX]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.xor.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define float @fadd_f32(<4 x float> %vec) {
				; CHECK-LABEL: @fadd_f32
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <4 x float> %vec
				; CHECK-NEXT: [[RDX:%[a-zA-Z0-9.]+]] = fadd fast <4 x float> %vec, [[SHUF]]
				; CHECK-NEXT: [[SHUF2:%[a-zA-Z0-9.]+]] = shufflevector <4 x float> [[RDX]]
				; CHECK-NEXT: [[RDX2:%[a-zA-Z0-9.]+]] = fadd fast <4 x float> [[RDX]], [[SHUF2]]
				; CHECK-NEXT: extractelement <4 x float> [[RDX2]], i32 0
				entry:
				%r = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %vec)
				ret float %r
				}

				define float @fadd_f32_strict(<4 x float> %vec) {
				entry:
				; CHECK-LABEL: @fadd_f32_strict
				; CHECK-NOT: shufflevector
				; CHECK: call float @llvm.experimental.vector.reduce.fadd
				%r = call float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %vec)
				ret float %r
				}

				define float @fmul_f32(<4 x float> %vec) {
				; CHECK-LABEL: @fmul_f32
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <4 x float> %vec
				; CHECK-NEXT: [[RDX:%[a-zA-Z0-9.]+]] = fmul fast <4 x float> %vec, [[SHUF]]
				; CHECK-NEXT: [[SHUF2:%[a-zA-Z0-9.]+]] = shufflevector <4 x float> [[RDX]]
				; CHECK-NEXT: [[RDX2:%[a-zA-Z0-9.]+]] = fmul fast <4 x float> [[RDX]], [[SHUF2]]
				; CHECK-NEXT: extractelement <4 x float> [[RDX2]], i32 0
				entry:
				%r = call fast float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float undef, <4 x float> %vec)
				ret float %r
				}

				define i64 @smax_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @smax_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[CMP:%[a-zA-Z0-9.]+]] = icmp sgt <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: [[SEL:%[a-zA-Z0-9.]+]] = select <2 x i1> [[CMP]], <2 x i64> %vec, <2 x i64> [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[SEL]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.smax.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define i64 @smin_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @smin_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[CMP:%[a-zA-Z0-9.]+]] = icmp slt <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: [[SEL:%[a-zA-Z0-9.]+]] = select <2 x i1> [[CMP]], <2 x i64> %vec, <2 x i64> [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[SEL]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.smin.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define i64 @umax_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @umax_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[CMP:%[a-zA-Z0-9.]+]] = icmp ugt <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: [[SEL:%[a-zA-Z0-9.]+]] = select <2 x i1> [[CMP]], <2 x i64> %vec, <2 x i64> [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[SEL]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.umax.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define i64 @umin_i64(<2 x i64> %vec) {
				; CHECK-LABEL: @umin_i64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x i64> %vec
				; CHECK-NEXT: [[CMP:%[a-zA-Z0-9.]+]] = icmp ult <2 x i64> %vec, [[SHUF]]
				; CHECK-NEXT: [[SEL:%[a-zA-Z0-9.]+]] = select <2 x i1> [[CMP]], <2 x i64> %vec, <2 x i64> [[SHUF]]
				; CHECK-NEXT: extractelement <2 x i64> [[SEL]], i32 0
				entry:
				%r = call i64 @llvm.experimental.vector.reduce.umin.i64.v2i64(<2 x i64> %vec)
				ret i64 %r
				}

				define double @fmax_f64(<2 x double> %vec) {
				; CHECK-LABEL: @fmax_f64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x double> %vec
				; CHECK-NEXT: [[CMP:%[a-zA-Z0-9.]+]] = fcmp fast ogt <2 x double> %vec, [[SHUF]]
				; CHECK-NEXT: [[SEL:%[a-zA-Z0-9.]+]] = select <2 x i1> [[CMP]], <2 x double> %vec, <2 x double> [[SHUF]]
				; CHECK-NEXT: extractelement <2 x double> [[SEL]], i32 0
				entry:
				%r = call double @llvm.experimental.vector.reduce.fmax.f64.v2f64(<2 x double> %vec)
				ret double %r
				}

				define double @fmin_f64(<2 x double> %vec) {
				; CHECK-LABEL: @fmin_f64
				; CHECK: [[SHUF:%[a-zA-Z0-9.]+]] = shufflevector <2 x double> %vec
				; CHECK-NEXT: [[CMP:%[a-zA-Z0-9.]+]] = fcmp fast olt <2 x double> %vec, [[SHUF]]
				; CHECK-NEXT: [[SEL:%[a-zA-Z0-9.]+]] = select <2 x i1> [[CMP]], <2 x double> %vec, <2 x double> [[SHUF]]
				; CHECK-NEXT: extractelement <2 x double> [[SEL]], i32 0
				entry:
				%r = call double @llvm.experimental.vector.reduce.fmin.f64.v2f64(<2 x double> %vec)
				ret double %r
				}
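
The fadd_f32_strict test above expects the call without the fast flag to stay unexpanded, while the fast variants become shuffle sequences. A plausible reading is that the shuffle form reassociates the additions, which is only safe when reordering is permitted. The standalone C++ sketch below illustrates the difference; orderedSum and treeSum are hypothetical names, and the input values are chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: strict left-to-right accumulation, the order an
// in-order (non-fast) floating-point reduction has to respect.
float orderedSum(const std::vector<float> &V) {
  float Acc = 0.0f;
  for (float X : V)
    Acc += X;
  return Acc;
}

// Hypothetical helper: the pairing produced by a log2 shuffle reduction,
// where lane J is combined with lane J + Half in every round.
float treeSum(std::vector<float> V) {
  assert(!V.empty() && (V.size() & (V.size() - 1)) == 0 &&
         "only power-of-two lengths are modelled");
  for (std::size_t I = V.size(); I != 1; I >>= 1)
    for (std::size_t J = 0; J != I / 2; ++J)
      V[J] += V[J + I / 2];
  return V[0];
}
```

With the input {1e8f, 1.0f, -1e8f, 1.0f}, the ordered sum loses the first 1.0f when it is added to 1e8f, whereas the tree form first cancels 1e8f against -1e8f, so under IEEE single-precision evaluation the two functions return different values.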

tools/llc/llc.cpp

Context not available.
   PassRegistry *Registry = PassRegistry::getPassRegistry();
   initializeCore(*Registry);
   initializeCodeGen(*Registry);
   initializeLoopStrengthReducePass(*Registry);
   initializeLowerIntrinsicsPass(*Registry);
   initializeCountingFunctionInserterPass(*Registry);
   initializeUnreachableBlockElimLegacyPassPass(*Registry);
   initializeConstantHoistingLegacyPassPass(*Registry);
   initializeScalarOpts(*Registry);
   initializeVectorization(*Registry);
+  initializeExpandReductionsPass(*Registry);

   // Register the target printer for --version.
   cl::AddExtraVersionPrinter(TargetRegistry::printRegisteredTargetsForVersion);

   cl::ParseCommandLineOptions(argc, argv, "llvm system compiler\n");

   Context.setDiscardValueNames(DiscardValueNames);

   // Set a diagnostic handler that doesn't exit on the first error
   bool HasError = false;
Context not available.

tools/opt/opt.cpp

Context not available.
   initializeRewriteSymbolsLegacyPassPass(Registry);
   initializeWinEHPreparePass(Registry);
   initializeDwarfEHPreparePass(Registry);
   initializeSafeStackPass(Registry);
   initializeSjLjEHPreparePass(Registry);
   initializePreISelIntrinsicLoweringLegacyPassPass(Registry);
   initializeGlobalMergePass(Registry);
   initializeInterleavedAccessPass(Registry);
   initializeCountingFunctionInserterPass(Registry);
   initializeUnreachableBlockElimLegacyPassPass(Registry);
+  initializeExpandReductionsPass(Registry);

 #ifdef LINK_POLLY_INTO_TOOLS
   polly::initializePollyPasses(Registry);
 #endif

   cl::ParseCommandLineOptions(argc, argv,
     "llvm .bc -> .bc modular optimizer and analysis printer\n");

   if (AnalyzeOnly && NoOutput) {
     errs() << argv[0] << ": analyze mode conflicts with no-output mode.\n";
Context not available.

Revision Contents

Diff 95820