This is an archive of the discontinued LLVM Phabricator instance.

Differential D90830

[OpenMPIRBuilder] Implement CreateCanonicalLoop.
ClosedPublic

Authored by Meinersbur on Nov 5 2020, 12:48 AM.

Details

Reviewers

jdoerfert
AMDChirag
anchu-rajendran
kiranchandramohan
AlexisPerry
SouraVX
kiranktp
fghanim

Commits

rGf44ee0f5e7d1: [OpenMPIRBuilder] Implement CreateCanonicalLoop.

Summary

CreateCanonicalLoop generates a standardized control flow structure for OpenMP canonical for loops. The structure can be consumed by loop-associated directives such as worksharing-loop, distribute, and simd, as well as by loop transformations such as tile and unroll.

This is a first design that does not yet consider all complexities. The control flow emits more basic blocks than strictly necessary, but these will be optimized away by SimplifyCFG anyway; they provide a clean separation of concerns and might be useful later in more complex scenarios. I successfully implemented a basic tile construct using this API; it is not part of this patch.
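As an illustration only, the block structure can be mimicked in plain C++ with one label per basic block. This is a minimal sketch, not the IR the patch emits: `runCanonicalLoop` and its parameters are hypothetical names, and the labels merely echo the block roles (preheader, header, cond, body, latch, exit, after) that appear in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical model (not part of the patch): runs Body once per logical
// iteration, with one label per basic block of the canonical loop skeleton.
std::uint64_t runCanonicalLoop(
    std::uint64_t TripCount,
    const std::function<void(std::uint64_t)> &Body) {
  assert(Body && "body callback must be callable");
  std::uint64_t IV = 0;       // stands in for the induction-variable PHI
  std::uint64_t Executed = 0; // bookkeeping for this sketch only
  goto preheader;
preheader: // single entry edge into the loop
  goto header;
header: // entry of every iteration; in IR it would hold only the PHI
  goto cond;
cond: // decides whether another iteration runs
  if (IV < TripCount)
    goto body;
  goto loop_exit;
body: // user code goes here
  Body(IV);
  ++Executed;
  goto latch;
latch: // increments the induction variable and loops back
  IV += 1;
  goto header;
loop_exit: // no more iterations
  goto after;
after: // clean-up code would go here
  return Executed;
}
```

Several of these labels are trivially redundant in this sketch, which mirrors the trade-off in the summary: a simplification pass can fold them, but keeping them separate gives each role its own block.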

The fundamental building block is the CreateCanonicalLoop overload that takes only the loop trip count and operates solely on the logical iteration space. An overload of CreateCanonicalLoop taking LB, UB, and increment is provided as well, but at least for C++, Clang will need to implement the mapping from loop counter to logical induction variable anyway, since iterator overload resolution cannot be done in LLVMFrontend.
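To make the split between logical iteration space and loop counter concrete, here is a minimal sketch in plain C++ rather than LLVM IR. It assumes 32-bit signed counters and a non-zero step; `tripCount` and `counterValue` are hypothetical helper names, not functions from the patch. The arithmetic loosely follows the signed branch of the LB/UB/increment overload shown in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of iterations of a loop that starts at Start,
// advances by Step, and stops at Stop (included only if InclusiveStop).
std::uint32_t tripCount(std::int32_t Start, std::int32_t Stop,
                        std::int32_t Step, bool InclusiveStop) {
  assert(Step != 0 && "a zero step never terminates");
  bool IsNeg = Step < 0;
  // Magnitude of the step, in unsigned arithmetic so INT32_MIN is representable.
  std::uint32_t Incr = IsNeg ? 0u - static_cast<std::uint32_t>(Step)
                             : static_cast<std::uint32_t>(Step);
  // When counting down, swap the bounds so that LB <= UB for non-empty loops.
  std::int32_t LB = IsNeg ? Stop : Start;
  std::int32_t UB = IsNeg ? Start : Stop;
  bool NoIterations = InclusiveStop ? UB < LB : UB <= LB;
  if (NoIterations)
    return 0;
  // Distance between the bounds; fits in 32 unsigned bits because UB >= LB.
  std::uint32_t Span =
      static_cast<std::uint32_t>(UB) - static_cast<std::uint32_t>(LB);
  if (InclusiveStop)
    return Span / Incr + 1; // wraps if the count does not fit in 32 bits
  if (Span <= Incr)
    return 1;
  return (Span - 1) / Incr + 1;
}

// Hypothetical helper: maps the logical induction variable (0, 1, 2, ...)
// back to the user-visible loop counter.
std::int32_t counterValue(std::int32_t Start, std::int32_t Step,
                          std::uint32_t IV) {
  std::int64_t V = static_cast<std::int64_t>(Start) +
                   static_cast<std::int64_t>(IV) * Step;
  return static_cast<std::int32_t>(V);
}
```

For example, `for (int i = 10; i > 0; i -= 3)` corresponds to `tripCount(10, 0, -3, false)`, and the body would recover `i` from the logical induction variable via `counterValue(10, -3, IV)`.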

As there is currently no user of CreateCanonicalLoop, it is only called from unit tests. Similarly, CanonicalLoopInfo::eraseFromParent() is used in my tile implementation and might be generally useful for implementing loop-associated constructs, but is not used in this patch itself.

The following non-exhaustive list describes items not yet covered:

- collapse clause (including non-rectangular and non-perfectly nested); the idea is to provide an OpenMPIRBuilder::collapseLoopNest method that consumes multiple nested loops and returns a new CanonicalLoopInfo that can be used for loop-associated directives.
- similarly: ordered clause for DOACROSS loops
- branch weights
- Cancellation point (?)
- AllocaIP
- break statement (if needed at all)
- Exceptions (if not completely handled in the front-end)
- Using it in Clang; this requires implementing at least one loop-associated construct.
- ...

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Meinersbur created this revision. (Nov 5 2020, 12:48 AM)

Herald added subscribers: guansong, hiraditya, yaxunl. (Nov 5 2020, 12:48 AM)

Meinersbur requested review of this revision. (Nov 5 2020, 12:48 AM)

Herald added a subscriber: sstefan1. (Nov 5 2020, 12:48 AM)

Harbormaster completed remote builds in B77668: Diff 303049. (Nov 5 2020, 1:32 AM)

Meinersbur edited the summary of this revision. (Nov 5 2020, 8:04 AM)

clementval added a subscriber: clementval. (Nov 5 2020, 12:24 PM)

Some minor comments, overall this looks reasonable. I'll let someone else take a look.

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
702	front = getTerminator, right? Can we add a few more assertions here and elsewhere, just to make sure the structure is as we expect it.
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
887	I usually prefer the `assert(CL->isValid() && "...")` style, but no objection at the end of the day. (The reason is that the other one can be used to dump more of the context easily, e.g., `if (!isValid()) dump_stuff();`.)
903	Nit: `&& "Loop counter type mismatch"` or similar.
958	Nit: msg.
978	I thought we have an API to delete blocks without the need for this. Maybe try to use that instead?

Thanks for the Patch. Generally looks Good.
Just a couple of very minor comments/questions

My understanding of the use case is that when we get CreateDistribute, CreateSimd, CreateLoop, etc., they will use this to generate the required loop(s), but this is not meant to be used directly by Clang or Flang. Is that correct? If so, why are they public methods? Wouldn't it be better to make them private?

Also, a question; I think when we have something like:

#pragma omp for
for ( ... ; ... ; ...) {
 //body
}

this is going to be encoded in the Clang AST as the omp for statement, whose body is the ForStatement node. The first will end up using the CreateForDirective() that we plan to add to the OMPBuilder, while the latter will basically have Clang generate the loop for us (i.e. using `emitForStatement()`). Is this correct? If so, wouldn't that mean there would be some redundancy, or is this for a different case?

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1393	did you mean to say `Cond->getTerminator()`?

In D90830#2377846, @fghanim wrote:

Thanks for the Patch. Generally looks Good.
Just a couple of very minor comments/questions

My understanding of the use case is that when we get CreateDistribute, CreateSimd, CreateLoop, etc., they will use this to generate the required loop(s), but this is not meant to be used directly by Clang or Flang. Is that correct? If so, why are they public methods? Wouldn't it be better to make them private?

OpenMP 5.1 includes the tile and unroll directives, which allow chaining loop-associated constructs under the front-end's control. I.e., for

#pragma omp for
#pragma omp unroll partial
for (int i = 0; i < N; i+=1)

the front-end will do:

auto LiteralLoop = CreateCanonicalLoop(ForStmt);
auto UnrolledLoop = CreateUnrollDirective(LiteralLoop);
CreateWorksharingDirective(UnrolledLoop);

Since LLVMFrontend cannot implement every combination in which tiling, unrolling, and other loop-associated constructs can be composed, Clang/Flang must have access to CreateCanonicalLoop to represent the composition in the right order.

The concept is more powerful, e.g. for combined directives:

auto LiteralLoop = CreateCanonicalLoop(ForStmt);
auto VectorizedLoop = CreateSimdDirective(LiteralLoop);
CreateWorksharingDirective(VectorizedLoop);

to implement #pragma omp for simd without OMPIRBuilder necessarily implementing every valid combination of directives.
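The ordering argument can be illustrated with a toy model in plain C++. Nothing here is the OpenMPIRBuilder API: `ToyLoop`, `unrollPartial`, and `runWorkshared` are hypothetical stand-ins for a canonical-loop handle, an unroll transformation, and a worksharing directive, and "threads" are simulated sequentially. The point is only that each step consumes a loop handle and the caller decides the order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Toy stand-in for a canonical-loop handle: a trip count plus a body that
// receives the logical induction variable.
struct ToyLoop {
  std::uint64_t TripCount;
  std::function<void(std::uint64_t)> Body;
};

// Hypothetical analogue of a partial unroll: the returned loop has
// ceil(TripCount / Factor) iterations, each running up to Factor original ones.
// (Overflow of the ceiling division is ignored in this sketch.)
ToyLoop unrollPartial(const ToyLoop &L, std::uint64_t Factor) {
  assert(Factor > 0 && "unroll factor must be positive");
  ToyLoop Result;
  Result.TripCount = (L.TripCount + Factor - 1) / Factor;
  Result.Body = [L, Factor](std::uint64_t I) {
    std::uint64_t End = std::min(L.TripCount, (I + 1) * Factor);
    for (std::uint64_t J = I * Factor; J < End; ++J)
      L.Body(J);
  };
  return Result;
}

// Hypothetical analogue of a worksharing directive: distributes iterations
// round-robin over NumThreads simulated threads, executed one after another.
void runWorkshared(const ToyLoop &L, std::uint64_t NumThreads) {
  assert(NumThreads > 0 && "need at least one thread");
  for (std::uint64_t T = 0; T < NumThreads; ++T)
    for (std::uint64_t I = T; I < L.TripCount; I += NumThreads)
      L.Body(I);
}
```

A caller would build the literal loop first, pass it to `unrollPartial`, and hand the result to `runWorkshared`; swapping the order of the two calls would express a different composition without either helper knowing about the other.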

Also, a question; I think when we have something like:
#pragma omp for
for ( ... ; ... ; ...) {
 //body
}
this is going to be encoded in the Clang AST as the omp for statement, whose body is the ForStatement node. The first will end up using the CreateForDirective() that we plan to add to the OMPBuilder, while the latter will basically have Clang generate the loop for us (i.e. using `emitForStatement()`). Is this correct? If so, wouldn't that mean there would be some redundancy, or is this for a different case?

Unfortunately, CodeGenFunction::EmitForStmt neither generates a predictable control flow nor provides a way to get the loop's trip count, which we need to determine the logical iterations. I.e. we have to implement that ourselves. However, my experimental Clang code started its life as a copy & paste of CodeGenFunction::EmitForStmt, and I try to keep them similar.

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
702	front is the first instruction in the block; this is the compare instruction for `i < TripCount`. I.e. the second operand is the trip count. This is already asserted in `assertOK`, but I can add one here as well.
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
887	This follows the pattern `AssertOK` used by LLVM's instructions. I usually prefer this pattern because the crash message includes what is wrong, not just that it is invalid. Dumping stuff also assumes that the data structure is valid.
1393	Yep.

Address review comments

re-add CreateCopyPrivate

Harbormaster completed remote builds in B77983: Diff 303621. (Nov 7 2020, 12:32 AM)

Harbormaster completed remote builds in B77984: Diff 303622. (Nov 7 2020, 12:43 AM)

Thanks for the response, I think I am beginning to understand where this fits.
Just one Nit, O/W LGTM

I am going to wait for @jdoerfert to finish his review, and I'll accept the patch if he doesn't

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
625–639	Thank you for adding this!
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1402	Nit: must have two successors

LGTM. We will make progress based on this as we go. Thanks :)

This revision is now accepted and ready to land. (Nov 9 2020, 7:47 AM)

This revision was landed with ongoing or failed builds. (Nov 9 2020, 1:03 PM)

Closed by commit rGf44ee0f5e7d1: [OpenMPIRBuilder] Implement CreateCanonicalLoop. (authored by Meinersbur).

This revision was automatically updated to reflect the committed changes.

Meinersbur marked an inline comment as done.

Meinersbur added a commit: rGf44ee0f5e7d1: [OpenMPIRBuilder] Implement CreateCanonicalLoop..

Meinersbur mentioned this in D89671: [LLVM][OpenMP] Adding support for OpenMP sections construct in OpenMPIRBuilder. (Nov 9 2020, 1:18 PM)

kiranchandramohan mentioned this in D91982: [mlir] Add conversion from SCF parallel loops to OpenMP. (Nov 24 2020, 9:52 AM)

Meinersbur mentioned this in D92974: [OpenMPIRBuilder] Implement tileLoops. (Dec 9 2020, 3:07 PM)

Revision Contents

Path                                                 Size
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h     193 lines
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp            227 lines
llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp      130 lines

Diff 303965

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h

[... 12 unchanged lines not shown ...]

#ifndef LLVM_OPENMP_IR_IRBUILDER_H
#define LLVM_OPENMP_IR_IRBUILDER_H

#include "llvm/Frontend/OpenMP/OMPConstants.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/Allocator.h"
		#include <forward_list>

namespace llvm {
		class CanonicalLoopInfo;

/// An interface to create LLVM-IR for OpenMP directives.
///
/// Each OpenMP directive has a corresponding public generator method.
class OpenMPIRBuilder {
public:
/// Create a new OpenMPIRBuilder operating on the given module \p M. This will
/// not have an effect on \p M (see initialize).
[... 60 unchanged lines not shown ...]
/// placed.
/// \param ContinuationBB is the basic block target to leave the body.
///
/// Note that all blocks pointed to by the arguments have terminators.
using BodyGenCallbackTy =
function_ref<void(InsertPointTy AllocaIP, InsertPointTy CodeGenIP,
BasicBlock &ContinuationBB)>;

		/// Callback type for loop body code generation.
		///
		/// \param CodeGenIP is the insertion point where the loop's body code must be
		/// placed. This will be a dedicated BasicBlock with a
		/// conditional branch from the loop condition check and
		/// terminated with an unconditional branch to the loop
		/// latch.
		/// \param IndVar is the induction variable usable at the insertion point.
		using LoopBodyGenCallbackTy =
		function_ref<void(InsertPointTy CodeGenIP, Value *IndVar)>;

/// Callback type for variable privatization (think copy & default
/// constructor).
///
/// \param AllocaIP is the insertion point at which new alloca instructions
/// should be placed.
/// \param CodeGenIP is the insertion point at which the privatization code
/// should be placed.
/// \param Val The value being copied/created.
[... 61 unchanged lines not shown ...]
/// \returns The insertion position after the parallel.
IRBuilder<>::InsertPoint
CreateParallel(const LocationDescription &Loc, InsertPointTy AllocaIP,
BodyGenCallbackTy BodyGenCB, PrivatizeCallbackTy PrivCB,
FinalizeCallbackTy FiniCB, Value *IfCondition,
Value *NumThreads, omp::ProcBindKind ProcBind,
bool IsCancellable);

		/// Generator for the control flow structure of an OpenMP canonical loop.
		///
		/// This generator operates on the logical iteration space of the loop, i.e.
		/// the caller only has to provide a loop trip count of the loop as defined by
		/// base language semantics. The trip count is interpreted as an unsigned
		/// integer. The induction variable passed to \p BodyGenCB will be of the same
		/// type and run from 0 to \p TripCount - 1. It is up to the callback to
		/// convert the logical iteration variable to the loop counter variable in the
		/// loop body.
		///
		/// \param Loc The insert and source location description.
		/// \param BodyGenCB Callback that will generate the loop body code.
		/// \param TripCount Number of iterations the loop body is executed.
		///
		/// \returns An object representing the created control flow structure which
		/// can be used for loop-associated directives.
		CanonicalLoopInfo *CreateCanonicalLoop(const LocationDescription &Loc,
		LoopBodyGenCallbackTy BodyGenCB,
		Value *TripCount);

		/// Generator for the control flow structure of an OpenMP canonical loop.
		///
		/// Instead of a logical iteration space, this allows specifying user-defined
		/// loop counter values using increment, upper- and lower bounds. To
		/// disambiguate the terminology when counting downwards, instead of lower
		/// bounds we use \p Start for the loop counter value in the first body
		/// iteration.
		///
		/// Consider the following limitations:
		///
		/// * A loop counter space over all integer values of its bit-width cannot be
		/// represented. E.g. using uint8_t, its loop trip count of 256 cannot be
		/// stored into an 8 bit integer:
		///
		/// DO I = 0, 255, 1
		///
		/// * Unsigned wrapping is only supported when wrapping only "once"; E.g.
		/// effectively counting downwards:
		///
		/// for (uint8_t i = 100u; i > 0; i += 127u)
		///
		///
		/// TODO: May need to add additional parameters to represent:
		///
		/// * Allow representing downcounting with unsigned integers.
		///
		/// * Sign of the step and the comparison operator might disagree:
		///
		/// for (int i = 0; i < 42; --i)
		///
		///
		/// \param Loc The insert and source location description.
		/// \param BodyGenCB Callback that will generate the loop body code.
		/// \param Start Value of the loop counter for the first iteration.
		/// \param Stop Loop counter values past this will stop the
		/// iterations.
		/// \param Step Loop counter increment after each iteration; negative
		/// means counting down.
		/// \param IsSigned Whether Start, Stop and Step are signed integers.
		/// \param InclusiveStop Whether \p Stop itself is a valid value for the loop
		/// counter.
		///
		/// \returns An object representing the created control flow structure which
		/// can be used for loop-associated directives.
		CanonicalLoopInfo *CreateCanonicalLoop(const LocationDescription &Loc,
		LoopBodyGenCallbackTy BodyGenCB,
		Value *Start, Value *Stop, Value *Step,
		bool IsSigned, bool InclusiveStop);

/// Generator for '#omp flush'
///
/// \param Loc The location where the flush directive was encountered
void CreateFlush(const LocationDescription &Loc);

/// Generator for '#omp taskwait'
///
/// \param Loc The location where the taskwait directive was encountered.
[... 121 unchanged lines not shown; context: struct OutlineInfo { ...]
/// vector and set.
void collectBlocks(SmallPtrSetImpl<BasicBlock *> &BlockSet,
SmallVectorImpl<BasicBlock *> &BlockVector);
};

/// Collection of regions that need to be outlined during finalization.
SmallVector<OutlineInfo, 16> OutlineInfos;

		/// Collection of owned canonical loop objects that eventually need to be
		/// free'd.
		std::forward_list<CanonicalLoopInfo> LoopInfos;

/// Add a new region that will be outlined later.
void addOutlineInfo(OutlineInfo &&OI) { OutlineInfos.emplace_back(OI); }

/// An ordered map of auto-generated variables to their unique names.
/// It stores variables with the following names: 1) ".gomp_critical_user_" +
/// <critical_section_name> + ".var" for "omp critical" directives; 2)
/// <mangled_name_for_global_var> + ".cache." for cache for threadprivate
/// variables.
[... 203 unchanged lines not shown ...]
/// Returns corresponding lock object for the specified critical region
/// name. If the lock object does not exist it is created, otherwise the
/// reference to the existing copy is returned.
/// \param CriticalName Name of the critical region.
///
Value *getOMPCriticalRegionLock(StringRef CriticalName);
};

		/// Class to represent the control flow structure of an OpenMP canonical loop.
		///
		/// The control-flow structure is standardized for easy consumption by
		/// directives associated with loops. For instance, the worksharing-loop
		/// construct may change this control flow such that each loop iteration is
		/// executed on only one thread.
		///
		/// The control flow can be described as follows:
		///
		///     Preheader
		///        |
		///  /-> Header
		///  |     |
		///  |    Cond---\
		///  |     |     |
		///  |    Body   |
		///  |     |     |
		///   \--Latch   |
		///              |
		///             Exit
		///              |
		///            After
		///
		/// Code in the header, condition block, latch and exit block must not have any
		/// side-effect.
		///
		/// Defined outside OpenMPIRBuilder because one cannot forward-declare nested
		/// classes.
		class CanonicalLoopInfo {
		friend class OpenMPIRBuilder;

		private:
		/// Whether this object currently represents a loop.
		bool IsValid = false;

		BasicBlock *Preheader;
		BasicBlock *Header;
		BasicBlock *Cond;
		BasicBlock *Body;
		BasicBlock *Latch;
		BasicBlock *Exit;
		BasicBlock *After;

		/// Delete this loop if unused.
		void eraseFromParent();

		public:
		/// The preheader ensures that there is only a single edge entering the loop.
		/// Code that must be executed before any loop iteration can be emitted here,
		/// such as computing the loop trip count and begin lifetime markers. Code in
		/// the preheader is not considered part of the canonical loop.
		BasicBlock *getPreheader() const { return Preheader; }

		/// The header is the entry for each iteration. In the canonical control flow,
		/// it only contains the PHINode for the induction variable.
		BasicBlock *getHeader() const { return Header; }

		/// The condition block computes whether there is another loop iteration. If
		/// yes, branches to the body; otherwise to the exit block.
		BasicBlock *getCond() const { return Cond; }

		/// The body block is the single entry for a loop iteration and not controlled
		/// by CanonicalLoopInfo. It can contain arbitrary control flow but must
		/// eventually branch to the \p Latch block.
		BasicBlock *getBody() const { return Body; }

		/// Reaching the latch indicates the end of the loop body code. In the
		/// canonical control flow, it only contains the increment of the induction
		/// variable.
		BasicBlock *getLatch() const { return Latch; }

		/// Reaching the exit indicates no more iterations are being executed.
		BasicBlock *getExit() const { return Exit; }

		/// The after block is intended for clean-up code such as lifetime end
		/// markers. It is separate from the exit block to ensure that, analogous
		/// to the preheader, it has just a single entry edge and is free from PHI
		/// nodes should there be multiple loop exits (such as from break
		/// statements/cancellations).
		BasicBlock *getAfter() const { return After; }

		/// Returns the llvm::Value containing the number of loop iterations. It must
		/// be valid in the preheader and always interpreted as an unsigned integer of
		/// any bit-width.
		Value *getTripCount() const {
		Instruction *CmpI = &Cond->front();
		assert(isa<CmpInst>(CmpI) && "First inst must compare IV with TripCount");
		return CmpI->getOperand(1);
		}

		/// Returns the instruction representing the current logical induction
		/// variable. Always unsigned, always starting at 0 with an increment of one.
		Instruction *getIndVar() const {
		Instruction *IndVarPHI = &Header->front();
		assert(isa<PHINode>(IndVarPHI) && "First inst must be the IV PHI");
		return IndVarPHI;
		}

		/// Return the insertion point for user code after the loop.
		OpenMPIRBuilder::InsertPointTy getAfterIP() const {
		return {After, After->begin()};
		}

		/// Consistency self-check.
		void assertOK() const;
		};

} // end namespace llvm

#endif // LLVM_IR_IRBUILDER_H

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

[... 807 unchanged lines not shown; context: OpenMPIRBuilder::CreateMaster(const LocationDescription &Loc, ...]

Function *ExitRTLFn = getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_end_master);
Instruction *ExitCall = Builder.CreateCall(ExitRTLFn, Args);

return EmitOMPInlinedRegion(OMPD, EntryCall, ExitCall, BodyGenCB, FiniCB,
/*Conditional*/ true, /*hasFinalize*/ true);
}

		CanonicalLoopInfo *
		OpenMPIRBuilder::CreateCanonicalLoop(const LocationDescription &Loc,
		LoopBodyGenCallbackTy BodyGenCB,
		Value *TripCount) {
		BasicBlock *BB = Loc.IP.getBlock();
		BasicBlock *NextBB = BB->getNextNode();
		Function *F = BB->getParent();
		Type *IndVarTy = TripCount->getType();

		// Create the basic block structure.
		BasicBlock *Preheader =
		BasicBlock::Create(M.getContext(), "omp_for.preheader", F, NextBB);
		BasicBlock *Header =
		BasicBlock::Create(M.getContext(), "omp_for.header", F, NextBB);
		BasicBlock *Cond =
		BasicBlock::Create(M.getContext(), "omp_for.cond", F, NextBB);
		BasicBlock *Body =
		BasicBlock::Create(M.getContext(), "omp_for.body", F, NextBB);
		BasicBlock *Latch =
		BasicBlock::Create(M.getContext(), "omp_for.inc", F, NextBB);
		BasicBlock *Exit =
		BasicBlock::Create(M.getContext(), "omp_for.exit", F, NextBB);
		BasicBlock *After =
		BasicBlock::Create(M.getContext(), "omp_for.after", F, NextBB);

		updateToLocation(Loc);
		Builder.CreateBr(Preheader);

		Builder.SetInsertPoint(Preheader);
		Builder.CreateBr(Header);

		Builder.SetInsertPoint(Header);
		PHINode *IndVarPHI = Builder.CreatePHI(IndVarTy, 2, "omp_for.iv");
		IndVarPHI->addIncoming(ConstantInt::get(IndVarTy, 0), Preheader);
		Builder.CreateBr(Cond);

		Builder.SetInsertPoint(Cond);
		Value *Cmp = Builder.CreateICmpULT(IndVarPHI, TripCount, "omp_for.cmp");
		Builder.CreateCondBr(Cmp, Body, Exit);

		Builder.SetInsertPoint(Body);
		Builder.CreateBr(Latch);

		Builder.SetInsertPoint(Latch);
		Value *Next = Builder.CreateAdd(IndVarPHI, ConstantInt::get(IndVarTy, 1),
		"omp_for.next", /*HasNUW=*/true);
		Builder.CreateBr(Header);
		IndVarPHI->addIncoming(Next, Latch);

		Builder.SetInsertPoint(Exit);
		Builder.CreateBr(After);

		// After all control flow has been created, insert the body user code.
		BodyGenCB(InsertPointTy(Body, Body->begin()), IndVarPHI);

		// Remember and return the canonical control flow.
		LoopInfos.emplace_front();
		CanonicalLoopInfo *CL = &LoopInfos.front();

		CL->Preheader = Preheader;
		CL->Header = Header;
		CL->Cond = Cond;
		CL->Body = Body;
		CL->Latch = Latch;
		CL->Exit = Exit;
		CL->After = After;

		CL->IsValid = true;

		#ifndef NDEBUG
		CL->assertOK();
		#endif
		return CL;
		}

		CanonicalLoopInfo *OpenMPIRBuilder::CreateCanonicalLoop(
		const LocationDescription &Loc, LoopBodyGenCallbackTy BodyGenCB,
		Value *Start, Value *Stop, Value *Step, bool IsSigned, bool InclusiveStop) {
		// Consider the following difficulties (assuming 8-bit signed integers):
		// * Adding \p Step to the loop counter which passes \p Stop may overflow:
		// DO I = 1, 100, 50
		// * A \p Step of INT_MIN cannot be normalized to a positive direction:
		// DO I = 100, 0, -128

		// Start, Stop and Step must be of the same integer type.
		auto *IndVarTy = cast<IntegerType>(Start->getType());
		assert(IndVarTy == Stop->getType() && "Stop type mismatch");
		assert(IndVarTy == Step->getType() && "Step type mismatch");

		updateToLocation(Loc);

		ConstantInt *Zero = ConstantInt::get(IndVarTy, 0);
		ConstantInt *One = ConstantInt::get(IndVarTy, 1);

		// Like Step, but always positive.
		Value *Incr = Step;

		// Distance between Start and Stop; always positive.
		Value *Span;

		// Condition for whether no iterations are executed at all, e.g. because
		// UB < LB.
		Value *ZeroCmp;

		if (IsSigned) {
		// Ensure that increment is positive. If not, negate and invert LB and UB.
		Value *IsNeg = Builder.CreateICmpSLT(Step, Zero);
		Incr = Builder.CreateSelect(IsNeg, Builder.CreateNeg(Step), Step);
		Value *LB = Builder.CreateSelect(IsNeg, Stop, Start);
		Value *UB = Builder.CreateSelect(IsNeg, Start, Stop);
		Span = Builder.CreateSub(UB, LB, "", false, true);
		ZeroCmp = Builder.CreateICmp(
		InclusiveStop ? CmpInst::ICMP_SLT : CmpInst::ICMP_SLE, UB, LB);
		} else {
		Span = Builder.CreateSub(Stop, Start, "", true);
		ZeroCmp = Builder.CreateICmp(
		InclusiveStop ? CmpInst::ICMP_ULT : CmpInst::ICMP_ULE, Stop, Start);
		}

		Value *CountIfLooping;
		if (InclusiveStop) {
		CountIfLooping = Builder.CreateAdd(Builder.CreateUDiv(Span, Incr), One);
		} else {
		// Avoid incrementing past stop since it could overflow.
		Value *CountIfTwo = Builder.CreateAdd(
		Builder.CreateUDiv(Builder.CreateSub(Span, One), Incr), One);
		Value *OneCmp = Builder.CreateICmp(
		InclusiveStop ? CmpInst::ICMP_ULT : CmpInst::ICMP_ULE, Span, Incr);
		CountIfLooping = Builder.CreateSelect(OneCmp, One, CountIfTwo);
		}
		Value *TripCount = Builder.CreateSelect(ZeroCmp, Zero, CountIfLooping);

		auto BodyGen = [=](InsertPointTy CodeGenIP, Value *IV) {
		Builder.restoreIP(CodeGenIP);
		Value *Span = Builder.CreateMul(IV, Step);
		Value *IndVar = Builder.CreateAdd(Span, Start);
		BodyGenCB(Builder.saveIP(), IndVar);
		};
		return CreateCanonicalLoop(Builder.saveIP(), BodyGen, TripCount);
		}

		void CanonicalLoopInfo::eraseFromParent() {
		assert(IsValid && "can only erase previously valid loop cfg");
		IsValid = false;

		SmallVector<BasicBlock *, 5> BBsToRemove{Header, Cond, Latch, Exit};
		SmallVector<Instruction *, 16> InstsToRemove;

		// Only remove preheader if not re-purposed somewhere else.
		if (Preheader->getNumUses() == 0)
		BBsToRemove.push_back(Preheader);

		DeleteDeadBlocks(BBsToRemove);
		}

OpenMPIRBuilder::InsertPointTy		OpenMPIRBuilder::InsertPointTy
OpenMPIRBuilder::CreateCopyPrivate(const LocationDescription &Loc,		OpenMPIRBuilder::CreateCopyPrivate(const LocationDescription &Loc,
llvm::Value BufSize, llvm::Value CpyBuf,		llvm::Value BufSize, llvm::Value CpyBuf,
llvm::Value CpyFn, llvm::Value DidIt) {		llvm::Value CpyFn, llvm::Value DidIt) {
if (!updateToLocation(Loc))		if (!updateToLocation(Loc))
return Loc.IP;		return Loc.IP;

Constant *SrcLocStr = getOrCreateSrcLocStr(Loc);		Constant *SrcLocStr = getOrCreateSrcLocStr(Loc);
		jdoerfertUnsubmitted Done Reply Inline Actions I thought we have an API to delete blocks without the need for this. Maybe try to use that instead? jdoerfert: I thought we have an API to delete blocks without the need for this. Maybe try to use that…
Value *Ident = getOrCreateIdent(SrcLocStr);		Value *Ident = getOrCreateIdent(SrcLocStr);
Value *ThreadId = getOrCreateThreadID(Ident);		Value *ThreadId = getOrCreateThreadID(Ident);

llvm::Value *DidItLD = Builder.CreateLoad(DidIt);		llvm::Value *DidItLD = Builder.CreateLoad(DidIt);

Value *Args[] = {Ident, ThreadId, BufSize, CpyBuf, CpyFn, DidItLD};		Value *Args[] = {Ident, ThreadId, BufSize, CpyBuf, CpyFn, DidItLD};

Function *Fn = getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_copyprivate);		Function *Fn = getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_copyprivate);
		[381 lines not shown] void OpenMPIRBuilder::OutlineInfo::collectBlocks(
		while (!Worklist.empty()) {
		BasicBlock *BB = Worklist.pop_back_val();
		BlockVector.push_back(BB);
		for (BasicBlock *SuccBB : successors(BB))
		if (BlockSet.insert(SuccBB).second)
		Worklist.push_back(SuccBB);
		}
		}

		void CanonicalLoopInfo::assertOK() const {
		#ifndef NDEBUG
		if (!IsValid)
		return;

		// Verify standard control-flow we use for OpenMP loops.
		assert(Preheader);
		assert(isa<BranchInst>(Preheader->getTerminator()) &&
		"Preheader must terminate with unconditional branch");
		assert(Preheader->getSingleSuccessor() == Header &&
		"Preheader must jump to header");

		assert(Header);
		assert(isa<BranchInst>(Header->getTerminator()) &&
		"Header must terminate with unconditional branch");
		assert(Header->getSingleSuccessor() == Cond &&
		"Header must jump to exiting block");
		fghanim: did you mean to say `Cond->getTerminator()`?
		Meinersbur: Yep.

		assert(Cond);
		assert(Cond->getSinglePredecessor() == Header &&
		"Exiting block only reachable from header");

		assert(isa<BranchInst>(Cond->getTerminator()) &&
		"Exiting block must terminate with conditional branch");
		assert(size(successors(Cond)) == 2 &&
		"Exiting block must have two successors");
		fghanim: Nit: must have two successors
		assert(cast<BranchInst>(Cond->getTerminator())->getSuccessor(0) == Body &&
		"Exiting block's first successor must jump to the body");
		assert(cast<BranchInst>(Cond->getTerminator())->getSuccessor(1) == Exit &&
		"Exiting block's second successor must exit the loop");

		assert(Body);
		assert(Body->getSinglePredecessor() == Cond &&
		"Body only reachable from exiting block");

		assert(Latch);
		assert(isa<BranchInst>(Latch->getTerminator()) &&
		"Latch must terminate with unconditional branch");
		assert(Latch->getSingleSuccessor() == Header && "Latch must jump to header");

		assert(Exit);
		assert(isa<BranchInst>(Exit->getTerminator()) &&
		"Exit block must terminate with unconditional branch");
		assert(Exit->getSingleSuccessor() == After &&
		"Exit block must jump to after block");

		assert(After);
		assert(After->getSinglePredecessor() == Exit &&
		"After block only reachable from exit block");

		Instruction *IndVar = getIndVar();
		assert(IndVar && "Canonical induction variable not found?");
		assert(isa<IntegerType>(IndVar->getType()) &&
		"Induction variable must be an integer");
		assert(cast<PHINode>(IndVar)->getParent() == Header &&
		"Induction variable must be a PHI in the loop header");

		Value *TripCount = getTripCount();
		assert(TripCount && "Loop trip count not found?");
		assert(IndVar->getType() == TripCount->getType() &&
		"Trip count and induction variable must have the same type");

		auto *CmpI = cast<CmpInst>(&Cond->front());
		assert(CmpI->getPredicate() == CmpInst::ICMP_ULT &&
		"Exit condition must be an unsigned less-than comparison");
		assert(CmpI->getOperand(0) == IndVar &&
		"Exit condition must compare the induction variable");
		assert(CmpI->getOperand(1) == TripCount &&
		"Exit condition must compare with the trip count");
		#endif
		}
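The block layout that `assertOK()` checks can be read as an ordinary counted loop. The sketch below replays that control flow in plain C++, with one label per basic block. `runCanonicalLoop` is a hypothetical stand-in, not an API from the patch, and the increment-by-one in the latch is an assumption; the assertions above only confirm the PHI in the header and the unsigned less-than comparison against the trip count.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for the generated control flow: calls BodyGen once
// per logical iteration 0 .. TripCount-1.
static void runCanonicalLoop(uint32_t TripCount,
                             const std::function<void(uint32_t)> &BodyGen) {
  // Preheader: branches unconditionally to the header.
  // The induction variable plays the role of the PHI in the header:
  // 0 when coming from the preheader, the incremented value from the latch.
  uint32_t IndVar = 0;
header:
  goto cond; // Header only jumps to the exiting block.
cond:
  if (IndVar < TripCount) // Unsigned less-than, like ICMP_ULT.
    goto body;            // First successor: the body.
  goto exit_block;        // Second successor: leave the loop.
body:
  BodyGen(IndVar);
  goto latch;
latch:
  IndVar += 1; // Assumed increment; not visible in the assertions above.
  goto header;
exit_block:
  goto after; // Exit block only jumps to the after block.
after:
  return;
}

// Usage: a trip count of 3 visits the logical iterations 0, 1 and 2.
static void runCanonicalLoopExample() {
  std::vector<uint32_t> Seen;
  runCanonicalLoop(3, [&](uint32_t IV) { Seen.push_back(IV); });
  assert((Seen == std::vector<uint32_t>{0, 1, 2}));
}
```

The extra header, exit and after blocks do no work here, which matches the summary's remark that the structure emits more blocks than strictly necessary.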

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp

		[823 lines not shown] if (!isa<ReturnInst>(ExitBB->front())) {
		ASSERT_TRUE(isa<BranchInst>(ExitBB->front()));
		ASSERT_EQ(cast<BranchInst>(ExitBB->front()).getNumSuccessors(), 1U);
		ASSERT_TRUE(isa<ReturnInst>(
		cast<BranchInst>(ExitBB->front()).getSuccessor(0)->front()));
		}
		}
		}

		TEST_F(OpenMPIRBuilderTest, CanonicalLoopSimple) {
		using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
		OpenMPIRBuilder OMPBuilder(*M);
		OMPBuilder.initialize();
		IRBuilder<> Builder(BB);
		OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
		Value *TripCount = F->getArg(0);

		unsigned NumBodiesGenerated = 0;
		auto LoopBodyGenCB = [&](InsertPointTy CodeGenIP, llvm::Value *LC) {
		NumBodiesGenerated += 1;

		Builder.restoreIP(CodeGenIP);

		Value *Cmp = Builder.CreateICmpEQ(LC, TripCount);
		Instruction *ThenTerm, *ElseTerm;
		SplitBlockAndInsertIfThenElse(Cmp, CodeGenIP.getBlock()->getTerminator(),
		&ThenTerm, &ElseTerm);
		};

		CanonicalLoopInfo *Loop =
		OMPBuilder.CreateCanonicalLoop(Loc, LoopBodyGenCB, TripCount);

		Builder.restoreIP(Loop->getAfterIP());
		ReturnInst *RetInst = Builder.CreateRetVoid();
		OMPBuilder.finalize();

		Loop->assertOK();
		EXPECT_FALSE(verifyModule(*M, &errs()));

		EXPECT_EQ(NumBodiesGenerated, 1U);

		// Verify control flow structure (in addition to Loop->assertOK()).
		EXPECT_EQ(Loop->getPreheader()->getSinglePredecessor(), &F->getEntryBlock());
		EXPECT_EQ(Loop->getAfter(), Builder.GetInsertBlock());

		Instruction *IndVar = Loop->getIndVar();
		EXPECT_TRUE(isa<PHINode>(IndVar));
		EXPECT_EQ(IndVar->getType(), TripCount->getType());
		EXPECT_EQ(IndVar->getParent(), Loop->getHeader());

		EXPECT_EQ(Loop->getTripCount(), TripCount);

		BasicBlock *Body = Loop->getBody();
		Instruction *CmpInst = &Body->getInstList().front();
		EXPECT_TRUE(isa<ICmpInst>(CmpInst));
		EXPECT_EQ(CmpInst->getOperand(0), IndVar);

		BasicBlock *LatchPred = Loop->getLatch()->getSinglePredecessor();
		EXPECT_TRUE(llvm::all_of(successors(Body), [=](BasicBlock *SuccBB) {
		return SuccBB->getSingleSuccessor() == LatchPred;
		}));

		EXPECT_EQ(&Loop->getAfter()->front(), RetInst);
		}

		TEST_F(OpenMPIRBuilderTest, CanonicalLoopBounds) {
		using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
		OpenMPIRBuilder OMPBuilder(*M);
		OMPBuilder.initialize();
		IRBuilder<> Builder(BB);

		// Check the trip count is computed correctly. We generate the canonical loop
		// but rely on the IRBuilder's constant folder to compute the final result
		// since all inputs are constant. To verify overflow situations, limit the
		// trip count / loop counter widths to 16 bits.
		auto EvalTripCount = [&](int64_t Start, int64_t Stop, int64_t Step,
		bool IsSigned, bool InclusiveStop) -> int64_t {
		OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
		Type *LCTy = Type::getInt16Ty(Ctx);
		Value *StartVal = ConstantInt::get(LCTy, Start);
		Value *StopVal = ConstantInt::get(LCTy, Stop);
		Value *StepVal = ConstantInt::get(LCTy, Step);
		auto LoopBodyGenCB = [&](InsertPointTy CodeGenIP, llvm::Value *LC) {};
		CanonicalLoopInfo *Loop =
		OMPBuilder.CreateCanonicalLoop(Loc, LoopBodyGenCB, StartVal, StopVal,
		StepVal, IsSigned, InclusiveStop);
		Loop->assertOK();
		Builder.restoreIP(Loop->getAfterIP());
		Value *TripCount = Loop->getTripCount();
		return cast<ConstantInt>(TripCount)->getValue().getZExtValue();
		};

		ASSERT_EQ(EvalTripCount(0, 0, 1, false, false), 0);
		ASSERT_EQ(EvalTripCount(0, 1, 2, false, false), 1);
		ASSERT_EQ(EvalTripCount(0, 42, 1, false, false), 42);
		ASSERT_EQ(EvalTripCount(0, 42, 2, false, false), 21);
		ASSERT_EQ(EvalTripCount(21, 42, 1, false, false), 21);
		ASSERT_EQ(EvalTripCount(0, 5, 5, false, false), 1);
		ASSERT_EQ(EvalTripCount(0, 9, 5, false, false), 2);
		ASSERT_EQ(EvalTripCount(0, 11, 5, false, false), 3);
		ASSERT_EQ(EvalTripCount(0, 0xFFFF, 1, false, false), 0xFFFF);
		ASSERT_EQ(EvalTripCount(0xFFFF, 0, 1, false, false), 0);
		ASSERT_EQ(EvalTripCount(0xFFFE, 0xFFFF, 1, false, false), 1);
		ASSERT_EQ(EvalTripCount(0, 0xFFFF, 0x100, false, false), 0x100);
		ASSERT_EQ(EvalTripCount(0, 0xFFFF, 0xFFFF, false, false), 1);

		ASSERT_EQ(EvalTripCount(0, 6, 5, false, false), 2);
		ASSERT_EQ(EvalTripCount(0, 0xFFFF, 0xFFFE, false, false), 2);
		ASSERT_EQ(EvalTripCount(0, 0, 1, false, true), 1);
		ASSERT_EQ(EvalTripCount(0, 0, 0xFFFF, false, true), 1);
		ASSERT_EQ(EvalTripCount(0, 0xFFFE, 1, false, true), 0xFFFF);
		ASSERT_EQ(EvalTripCount(0, 0xFFFE, 2, false, true), 0x8000);

		ASSERT_EQ(EvalTripCount(0, 0, -1, true, false), 0);
		ASSERT_EQ(EvalTripCount(0, 1, -1, true, true), 0);
		ASSERT_EQ(EvalTripCount(20, 5, -5, true, false), 3);
		ASSERT_EQ(EvalTripCount(20, 5, -5, true, true), 4);
		ASSERT_EQ(EvalTripCount(-4, -2, 2, true, false), 1);
		ASSERT_EQ(EvalTripCount(-4, -3, 2, true, false), 1);
		ASSERT_EQ(EvalTripCount(-4, -2, 2, true, true), 2);

		ASSERT_EQ(EvalTripCount(INT16_MIN, 0, 1, true, false), 0x8000);
		ASSERT_EQ(EvalTripCount(INT16_MIN, 0, 1, true, true), 0x8001);
		ASSERT_EQ(EvalTripCount(INT16_MIN, 0x7FFF, 1, true, false), 0xFFFF);
		ASSERT_EQ(EvalTripCount(INT16_MIN + 1, 0x7FFF, 1, true, true), 0xFFFF);
		ASSERT_EQ(EvalTripCount(INT16_MIN, 0, 0x7FFF, true, false), 2);
		ASSERT_EQ(EvalTripCount(0x7FFF, 0, -1, true, false), 0x7FFF);
		ASSERT_EQ(EvalTripCount(0, INT16_MIN, -1, true, false), 0x8000);
		ASSERT_EQ(EvalTripCount(0, INT16_MIN, -16, true, false), 0x800);
		ASSERT_EQ(EvalTripCount(0x7FFF, INT16_MIN, -1, true, false), 0xFFFF);
		ASSERT_EQ(EvalTripCount(0x7FFF, 1, INT16_MIN, true, false), 1);
		ASSERT_EQ(EvalTripCount(0x7FFF, -1, INT16_MIN, true, true), 2);

		// Finalize the function and verify it.
		Builder.CreateRetVoid();
		OMPBuilder.finalize();
		EXPECT_FALSE(verifyModule(*M, &errs()));
		}
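The expected values in CanonicalLoopBounds can be reproduced without LLVM by doing the same arithmetic on 16-bit integers by hand. The sketch below is one way to compute such a trip count; it is not necessarily the computation the patch emits. `tripCount16` and `asSigned16` are hypothetical names, and a step of zero is assumed never to occur.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Reinterpret a 16-bit pattern as a two's complement signed value.
static int32_t asSigned16(uint16_t V) {
  return V >= 0x8000 ? static_cast<int32_t>(V) - 0x10000
                     : static_cast<int32_t>(V);
}

// Hypothetical helper: trip count of a loop from Start to Stop by Step, with
// all values truncated to 16 bits. Step must not be zero.
static uint16_t tripCount16(int64_t Start, int64_t Stop, int64_t Step,
                            bool IsSigned, bool InclusiveStop) {
  uint16_t LB = static_cast<uint16_t>(Start);
  uint16_t UB = static_cast<uint16_t>(Stop);
  uint16_t Incr = static_cast<uint16_t>(Step);

  bool NoIterations;
  if (IsSigned) {
    // Normalize a negative step: negate it and swap the bounds.
    if (asSigned16(Incr) < 0) {
      Incr = static_cast<uint16_t>(0u - Incr);
      std::swap(LB, UB);
    }
    NoIterations = InclusiveStop ? asSigned16(UB) < asSigned16(LB)
                                 : asSigned16(UB) <= asSigned16(LB);
  } else {
    NoIterations = InclusiveStop ? UB < LB : UB <= LB;
  }
  if (NoIterations)
    return 0;

  // Distance between the bounds, computed modulo 2^16 so that it stays
  // correct even when the signed difference would overflow.
  uint16_t Span = static_cast<uint16_t>(UB - LB);
  if (InclusiveStop)
    return static_cast<uint16_t>(Span / Incr + 1);
  // Exclusive stop: Span >= 1 here, so Span - 1 cannot wrap around.
  return static_cast<uint16_t>((Span - 1) / Incr + 1);
}

// Usage: a few of the cases asserted in the unit test above.
static void tripCount16Example() {
  assert(tripCount16(0, 42, 2, false, false) == 21);
  assert(tripCount16(20, 5, -5, true, true) == 4);
  assert(tripCount16(INT16_MIN, 0x7FFF, 1, true, false) == 0xFFFF);
}
```

Dividing `Span - 1` rather than adding the step to the counter avoids stepping past the stop value, which is where a 16-bit counter would overflow.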

		TEST_F(OpenMPIRBuilderTest, MasterDirective) {
		using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
		OpenMPIRBuilder OMPBuilder(*M);
		OMPBuilder.initialize();
		F->setName("func");
		IRBuilder<> Builder(BB);

		OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
		[265 lines not shown]
