This is an archive of the discontinued LLVM Phabricator instance.

Scalar/PHI code generation
ClosedPublic

Authored by jdoerfert on Feb 9 2015, 10:34 AM.

Details

Reviewers

sebpop
zinob
grosser
simbuerg

Commits

rGecff11dcfb49: Add scalar and phi code generation
rPLO238070: Add scalar and phi code generation
rL238070: Add scalar and phi code generation

Summary

Added scalar and phi code generation to the backend, hence making the polly
prepare and independent block pass obsolete.

Initial statistics with the LLVM test suite show that we now detect more
SCoPs and, while they are smaller (wrt. #basic blocks), they still cover the
same source lines. Additionally, compile time decreased for various
benchmarks while execution time varies only very little.

TODO: - add more of my unit tests
      - create unit tests for the special cases that occurred in lnt
      - add more explanation in the commit message
      - do a more comprehensive comparison with the polly prepare and
        independent blocks passes.

FIXME: There is/was one failing lnt test (MultiSource/Applications/oggenc),
       however the problematic SCoP doesn't even contain scalar
       accesses but only real memory accesses with simple __and__
       existentially quantified access functions. For now it's safe to
       assume this is just a new, failing, SCoP, not an error in this
       patch.

Diff Detail

Repository: rL LLVM

Event Timeline

jdoerfert updated this revision to Diff 19598. Feb 9 2015, 10:34 AM

jdoerfert retitled this revision to Scalar/PHI code generation.

jdoerfert edited the test plan for this revision.

jdoerfert added reviewers: grosser, sebpop, simbuerg, zinob.

jdoerfert updated this object.

jdoerfert added subscribers: Restricted Project, Unknown Object (MLST).

jdoerfert added inline comments. Feb 9 2015, 10:38 AM

include/polly/CodeGen/BlockGenerators.h
267 ↗	(On Diff #19598)	The LTS param was deleted by accident.

Hi Johannes,

the general structure of the patch looks very good. I went straight through it in detail to understand the finer details and the corner cases/difficulties you have addressed.

I have not finished going through the patch yet, but wanted to send you my first comments already.

include/polly/CodeGen/BlockGenerators.h
121 ↗	(On Diff #19598)	Is there a reason you use two different maps? As an instruction is either a scalar or a PHI, you could just store them in a single map, no? This might avoid the need to later choose between the two. While reading further through the source code, it seems a PHI instruction may have two allocations: one for the incoming values and one for the outgoing values. Maybe it would be worth a comment and an example here that explains which values belong where and why PHI nodes can require two different allocs.
lib/CodeGen/BlockGenerators.cpp
461 ↗	(On Diff #19598)	The non-PHI case is pretty straightforward, but it becomes unclear due to the mix of PHI and non-PHI code. My comments below try to point out how those code paths could be made more separate.
463 ↗	(On Diff #19598)	This could be folded into the if(ScalarBasePHI) condition, no?
465 ↗	(On Diff #19598)	artificial.
476 ↗	(On Diff #19598)	A single map might avoid this ternary condition on PHIOpMap, no? Also, moving this below the if(ScalarBasePHI) block would allow moving the continue for artificial self write accesses into the ScalarBasePHI block as well. That way there is just a single block that contains all the PHI special cases.
482 ↗	(On Diff #19598)	You could possibly put this into an if (!ScalarBasePHI) { ScalarValue = ScalarInst; } else { }. This would make clear that these are two exclusive code paths: one for the common case, one for the PHI.
484 ↗	(On Diff #19598)	PHI? You sometimes write it uppercase, sometimes lowercase.
509 ↗	(On Diff #19598)	typo.
512 ↗	(On Diff #19598)	Why would we reload a value to directly store it back? Could we instead just neither load nor store the value?
543 ↗	(On Diff #19598)	PHI?
562 ↗	(On Diff #19598)	typo
594 ↗	(On Diff #19598)	typo

answers

include/polly/CodeGen/BlockGenerators.h
121 ↗	(On Diff #19598)	For almost all corner cases and most of the decisions I need to add unit tests (I know). Nevertheless, I am quite sure you need two allocations for a PHI node. The general test case structure looks like:
    x1 = ...
    for (i=0...N) {
      x2 = phi(x1, add)
      add = x2 + A[i];
    }
    print(x1)
    print(x2)
    print(add)
lib/CodeGen/BlockGenerators.cpp
463 ↗	(On Diff #19598)	It is used twice so I moved it out of the two if cases.
476 ↗	(On Diff #19598)	We could probably extract the phi path, but I still think we need 2 maps.
482 ↗	(On Diff #19598)	ok
484 ↗	(On Diff #19598)	True. Which do you prefer?
512 ↗	(On Diff #19598)	Unit test idea:
    bb1:
      %x =
      br %cond, label %then, label %else
    then:
      br label %merge
    else:
      %y =
    merge:
      %p = phi[(%x, %then), (%y, %else)]
Now we have a write access in the %then block which needs to reload %x and store it in the operand location of %p.
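
To make the loop example from the answer on line 121 concrete, here is a minimal, self-contained C++ simulation of the demotion. It is only a sketch and not Polly code: the slot names x2_phiops, x2_s2a and add_s2a are hypothetical stand-ins for allocas that would be found through the PHIOpMap and the ScalarMap, and the loop is assumed to run at least once.

```cpp
#include <cassert>
#include <vector>

struct Values {
  int x1, x2, add;
};

// Reference semantics of the SSA form (assumes A is non-empty):
//   x2 = phi(x1, add); add = x2 + A[i];
Values runSSA(int x1, const std::vector<int> &A) {
  int incoming = x1; // value flowing into the PHI on the current edge
  int x2 = 0, add = 0;
  for (int a : A) {
    x2 = incoming;
    add = x2 + a;
    incoming = add;
  }
  return {x1, x2, add};
}

// The same loop with the scalars demoted to memory slots.
Values runDemoted(int x1, const std::vector<int> &A) {
  int x2_phiops = 0; // hypothetical slot: operand location of the PHI x2
  int x2_s2a = 0;    // hypothetical slot: the value of x2 itself
  int add_s2a = 0;   // hypothetical slot: the value of add itself

  x2_phiops = x1; // entry edge: x1 is communicated to the PHI
  for (int a : A) {
    int x2 = x2_phiops; // the PHI reads its operand location
    x2_s2a = x2;        // x2 is used after the loop, so it is stored
    int add = x2 + a;
    add_s2a = add;      // add is used after the loop, so it is stored
    x2_phiops = add;    // back edge: add is also an operand of the PHI
  }
  return {x1, x2_s2a, add_s2a};
}
```

After the last iteration x2_phiops already holds the value of add, while x2_s2a still holds the value of x2. With a single shared slot, print(x2) would observe the value of add, which is why the sketch keeps two locations for the PHI.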

Second set of comments. Again, mostly comments to help me understand better.

include/polly/CodeGen/BlockGenerators.h
121 ↗	(On Diff #19598)	Right. I agree that two different allocations might be needed. If you could comment in the code (possibly with example), why PHI nodes need two allocations that would be great. Also, I wonder if you can somehow explain which of the two allocations is "normal" and ends up in the ScalarMap and which is special and ends up in the PHIOpMap, that would help my understanding.
153 ↗	(On Diff #19598)	You explain that this function generates both the loads and the stores of a PHI node, but it is not 100% clear to me why these two need to be generated together, instead of just using the same code generation for their loads/stores as is used for normal scalar loads & stores. Is this for correctness reasons or something else? (In fact, for some PHIs you seem to do so; only self-referencing PHIs seem to be special.)
lib/CodeGen/BlockGenerators.cpp
321 ↗	(On Diff #19598)	I do not understand why this condition (and the forwarding of the instructions in copyInstruction) is necessary. If I remove the full if (InstCopy) block, all test cases still pass for me. Is the BBMap not already updated when the statements are copied? For example:
    // Compute NewStore before its insertion in BBMap to make the insertion
    // deterministic.
    BBMap[Store] = NewStore;
    return NewStore;
463 ↗	(On Diff #19598)	I missed the second use. It is in the special case where you check if a PHI is self-referencing. (I commented earlier that I do not yet understand why stores caused by self-referencing PHIs can not be handled in generateScalarStores. If they could, the second use would not be needed any more).
484 ↗	(On Diff #19598)	I do not have a preference. Maybe just check:
    grep -R " PHI " lib/ | wc
    grep -R " phi " lib/ | wc
and use the one that is more common in Polly.
512 ↗	(On Diff #19598)	Ah, I see. This is also PHI node specific. This indeed makes a lot of sense. Mentioning this in the comment and/or folding this into the PHI block might make this more obvious.
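
The reload-then-store case discussed for line 512 can be illustrated with a small C++ simulation. This is a hedged sketch under assumptions, not the patch's code: x_s2a and p_phiops are hypothetical names for the slot of %x and for the operand location of %p, and the control flow mirrors the bb1/then/else/merge unit test idea from the answer above.

```cpp
#include <cassert>

// Reference semantics: %p = phi[(%x, %then), (%y, %else)]
int mergeSSA(bool cond, int x, int y) { return cond ? x : y; }

// The same control flow with scalars demoted to memory.
int mergeDemoted(bool cond, int xVal, int yVal) {
  int x_s2a = 0;    // hypothetical slot holding the demoted value of %x
  int p_phiops = 0; // hypothetical operand location of the PHI %p

  // bb1: %x is computed here and demoted to its slot.
  x_s2a = xVal;

  if (cond) {
    // then: nothing is computed in this block, but it is the incoming
    // block that provides %x to %p. The write access therefore has to
    // reload %x and store it into the operand location of %p.
    int xReloaded = x_s2a;
    p_phiops = xReloaded;
  } else {
    // else: %y is computed here and written to the operand location.
    int y = yVal;
    p_phiops = y;
  }

  // merge: the PHI reads its operand location.
  return p_phiops;
}
```

Dropping both the load and the store in the then branch would leave p_phiops without the value of %x, so the merge block would read a stale location.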

Initial comments.

lib/CodeGen/BlockGenerators.cpp
321 ↗	(On Diff #19598)	During development this looked different, and when I changed it I assumed that if there is an inst copy there is a mapping. However, for synthesizable instructions there is none (currently); the same holds for phi nodes (but then there is no inst copy either).
461 ↗	(On Diff #19598)	I will separate them as far as possible.

Hi Johannes,

a last minor comment.

Also, to make it clear. I think the general structure of the patch is great. There are some parts where I try to give some comments to better understand the patch or to improve readability, but none is affecting the general structure of the patch.

Looking forward to an updated version,
Tobias

lib/CodeGen/BlockGenerators.cpp
321 ↗	(On Diff #19598)	I understand partially. ;-) Are you saying this is needed for PHI nodes and synthesizable instructions, but we just don't have a test case? Or was this needed at some point, but is not needed any more?

New Version that handles loops in non-affine regions. Passes all lnt tests with and without scalar/phi modeling.

Hi Johannes,

this patch looks great. Thanks for pushing this ahead.

I added a couple of comments. Mostly minor typos. There is one comment on stuff that could be split off into a separate patch (would be nice, but not essential) as well as one possible correctness issue in the linked-list implementation.

Otherwise, this LGTM.

Also, we should try to enable this quickly. I think just having LNT performance numbers for the latest patch that confirm we do not badly regress should be enough to justify this switch.

Best,
Tobias

include/polly/CodeGen/BlockGenerators.h
297 ↗	(On Diff #24874)	This change is not needed. I would suggest to remove it from the patch to not distract from the core changes.
342 ↗	(On Diff #24874)	@returns missing
include/polly/CodeGen/IslNodeBuilder.h
107 ↗	(On Diff #24874)	This (and below) seems like an unrelated change. Maybe commit separately.
lib/Analysis/ScopInfo.cpp
888 ↗	(On Diff #24874)	Is this correct? In case we add three memory accesses A, B, C: would this code not first add A to the map, then add a link from A to B, and then overwrite the link to B with C, which results in us losing the link to B entirely? I somehow feel the manually implemented linked list adds unnecessary complexity and also somehow mixes the MemoryAccess definition and the InstructionToAccess mapping. I wonder if it might not be clearer to make the InstructionToAccess map a mapping from instructions to a vector of accesses.
lib/CodeGen/BlockGenerators.cpp
278 ↗	(On Diff #24874)	As this function only returns a Value, there is no need to change this to 'Instruction' (similar comment below). To not distract from the core change, I would suggest to avoid this change (or perform it in a separate patch).
360 ↗	(On Diff #24874)	This patch modifies various functions to pass the newly created instruction on. I seem to remember you earlier passed these on for some other use case, but the current patch only seems to need these changes for the assert. If the assert has proven helpful during development, it seems reasonable to commit it (and the associated changes). However, in the optimal case, this would be a separate patch committed ahead of time. If that is too much work, I do not insist on separating this out. Also, a general Phabricator issue is that the updated commit messages are not visible. It would be nice if you could add a comment in the commit message that explains the intention of this set of changes (to help people understand that they are not strictly necessary for the core goal of this patch).
419 ↗	(On Diff #24874)	an
483 ↗	(On Diff #24874)	an
489 ↗	(On Diff #24874)	out of the region
494 ↗	(On Diff #24874)	grammar?
536 ↗	(On Diff #24874)	Use a range-based for loop?
1220 ↗	(On Diff #24874)	Unfinished sentence.
lib/CodeGen/IslCodeGeneration.cpp
141 ↗	(On Diff #24874)	leftover debugging stuff?

This revision is now accepted and ready to land. May 5 2015, 10:00 AM

Quick comment about lnt:

We mostly benefit from this (as 5 runs on my laptop show) and do not regress too badly anywhere.

My last concern:

I haven't tested this with parallel code generation yet. We need to do that before enabling it (though not before committing it).

include/polly/CodeGen/BlockGenerators.h
121 ↗	(On Diff #19598)	I added a comment for both maps to describe what they are used for.
153 ↗	(On Diff #19598)	I removed this. I use generateScalarStorePHI now from generateScalarStore and the generateScalarLoad code is easy for both cases.
lib/CodeGen/BlockGenerators.cpp
321 ↗	(On Diff #19598)	I put synthesized instructions in the BBMap and removed the conditional you asked about.

I will change the stuff according to the comments (including the spelling) and then commit it.

lib/Analysis/ScopInfo.cpp
888 ↗	(On Diff #24874)	We add A and MA is null. We add B and MA is now A (as it was already in the Inst2Acc map). As MA is not null, we link B (the last access) to A:
    B -> A
and change the mapping in Inst2Acc to B:
    Inst2Acc[AccessInst] ==> B
This way nothing is lost and in the end it looks like:
    Inst2Acc[AccessInst] ==> C -> B -> A
We could map it to a vector but I thought the overhead would be disproportionate. If you think it is worth it I can change it again.
lib/CodeGen/BlockGenerators.cpp
278 ↗	(On Diff #24874)	Ok.
360 ↗	(On Diff #24874)	This might be true, I'll check if I can revert the return Instruction/Value part.
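
The prepend scheme described for line 888 can be sketched in isolation as follows. This is an illustration under assumptions rather than the code in ScopInfo.cpp: Access, its Next pointer, InstId and the helper names are hypothetical, and std::map stands in for the Inst2Acc map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory access with an intrusive "next" link.
struct Access {
  std::string Name;
  Access *Next = nullptr;
  explicit Access(std::string N) : Name(std::move(N)) {}
};

using InstId = int; // hypothetical stand-in for the access instruction

// Prepend New to the chain of accesses recorded for Inst.
void addAccess(std::map<InstId, Access *> &Inst2Acc, InstId Inst,
               Access *New) {
  Access *&MA = Inst2Acc[Inst]; // null if Inst had no access so far
  if (MA)
    New->Next = MA; // link the new access to the previous head
  MA = New;         // the map now points to the new head
}

// Collect the names along the chain, starting at the head.
std::vector<std::string> chain(const std::map<InstId, Access *> &Inst2Acc,
                               InstId Inst) {
  std::vector<std::string> Names;
  auto It = Inst2Acc.find(Inst);
  const Access *A = (It == Inst2Acc.end()) ? nullptr : It->second;
  for (; A; A = A->Next)
    Names.push_back(A->Name);
  return Names;
}
```

Adding A, B and C in that order for the same instruction yields the chain C, B, A, so no link is lost as long as every new access is linked to the old head before the map entry is overwritten.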

In D7513#168961, @jdoerfert wrote:

I will change the stuff according to the comments (including the spelling) and then commit it.

OK. I commented on the last issue that could need some feedback.

Tobias

lib/Analysis/ScopInfo.cpp
888 ↗	(On Diff #24874)	Alright! I got a little confused by the use of MemAccs.back() as well as the *&MA items, which works both as a pointer that is passed to setNextMA() as well as a way to update Inst2Acc. Instead of using a vector we can also use std::forward_list, which should have a cost similar to what we have today, but it makes it explicit that Inst2Acc is indeed mapping to a list of accesses:
    auto NewAccess = new MemoryAccess(Access, AccessInst, this, SAI);
    MemAccs.push_back(NewAccess);
    if (!InstructionToAccess.count(AccessInst))
      InstructionToAccess[AccessInst] = {NewAccess};
    else
      InstructionToAccess[AccessInst].push_front(NewAccess);
In addition, the small but needed changes to lookupAccessFor() would clarify that there are indeed multiple accesses per instruction. On the other hand, all the manual list-management code would not be needed and we get some proper iterators for this as well.
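
A standalone sketch of the std::forward_list variant suggested above might look as follows. It is only an approximation under assumptions: MemoryAccessStub, InstId, addAccessToList, lookupAccessesFor and describe are hypothetical names, and std::map is used in place of the real InstructionToAccess map so that the example compiles without LLVM.

```cpp
#include <cassert>
#include <forward_list>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for polly's MemoryAccess.
struct MemoryAccessStub {
  std::string Name;
  explicit MemoryAccessStub(std::string N) : Name(std::move(N)) {}
};

using InstId = int; // hypothetical stand-in for the access instruction
using AccessList = std::forward_list<MemoryAccessStub *>;

// Record New as the most recent access of Inst. With std::map, operator[]
// default-constructs an empty list on first use, so no count() check is
// needed in this sketch.
void addAccessToList(std::map<InstId, AccessList> &InstructionToAccess,
                     InstId Inst, MemoryAccessStub *New) {
  InstructionToAccess[Inst].push_front(New);
}

// Return all accesses recorded for Inst, or nullptr if there are none.
const AccessList *
lookupAccessesFor(const std::map<InstId, AccessList> &InstructionToAccess,
                  InstId Inst) {
  auto It = InstructionToAccess.find(Inst);
  return It == InstructionToAccess.end() ? nullptr : &It->second;
}

// Render a list as "B -> A" using ordinary forward iteration.
std::string describe(const AccessList &L) {
  std::string Out;
  for (const MemoryAccessStub *MA : L) {
    if (!Out.empty())
      Out += " -> ";
    Out += MA->Name;
  }
  return Out;
}
```

Compared with the intrusive link, the accesses no longer need to know about each other, and callers get standard iterators for walking all accesses of one instruction.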

I'll use some standard container but I won't change it today. Thanks!

Hi Johannes,

I just got fresh performance numbers for this patch. To me it looks as if it gives both nice performance and nice compile-time improvements:

polly-vs-pollyModelScalars.html (891 KB)

I saw a couple of errs() that you still left in and there is the standard-container thing, but both seem minor issues. Any plans to commit this patch soon?

Best,
Tobias

Closed by commit rL238070: Add scalar and phi code generation (authored by jdoerfert). May 22 2015, 4:48 PM

This revision was automatically updated to reflect the committed changes.

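Revision Contents

Path	Size
polly/trunk/include/polly/CodeGen/BlockGenerators.h	232 lines
polly/trunk/include/polly/CodeGen/IslNodeBuilder.h	30 lines
polly/trunk/include/polly/ScopInfo.h	39 lines
polly/trunk/lib/Analysis/ScopInfo.cpp	25 lines
polly/trunk/lib/CodeGen/BlockGenerators.cpp	565 lines
polly/trunk/lib/CodeGen/CodeGeneration.cpp	2 lines
polly/trunk/test/Isl/CodeGen/phi_condition_modeling_1.ll	59 lines
polly/trunk/test/Isl/CodeGen/phi_condition_modeling_2.ll	66 lines
polly/trunk/test/Isl/CodeGen/phi_conditional_simple_1.ll	72 lines
polly/trunk/test/Isl/CodeGen/phi_loop_carried_float.ll	67 lines
polly/trunk/test/Isl/CodeGen/phi_loop_carried_float_escape.ll	63 lines
polly/trunk/test/Isl/CodeGen/phi_scalar_simple_1.ll	93 lines
polly/trunk/test/Isl/CodeGen/phi_scalar_simple_2.ll	108 lines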

Diff 26361

polly/trunk/include/polly/CodeGen/BlockGenerators.h

[57 lines not shown]

/// @brief Return true iff @p V is an intrinsic that we ignore during code		/// @brief Return true iff @p V is an intrinsic that we ignore during code
/// generation.		/// generation.
bool isIgnoredIntrinsic(const llvm::Value *V);		bool isIgnoredIntrinsic(const llvm::Value *V);

/// @brief Generate a new basic block for a polyhedral statement.		/// @brief Generate a new basic block for a polyhedral statement.
class BlockGenerator {		class BlockGenerator {
public:		public:
		/// @brief Map types to resolve scalar dependences.
		///
		///@{

		/// @see The ScalarMap and PHIOpMap member.
		using ScalarAllocaMapTy = DenseMap<Instruction *, AllocaInst *>;

		/// @brief Simple vector of instructions to store escape users.
		using EscapeUserVectorTy = SmallVector<Instruction *, 4>;

		/// @brief Map type to resolve escaping users for scalar instructions.
		///
		/// @see The EscapeMap member.
		using EscapeUsersAllocaMapTy =
		DenseMap<Instruction *, std::pair<AllocaInst *, EscapeUserVectorTy>>;

		///@}

/// @brief Create a generator for basic blocks.		/// @brief Create a generator for basic blocks.
///		///
/// @param Builder The LLVM-IR Builder used to generate the statement. The		/// @param Builder The LLVM-IR Builder used to generate the statement. The
/// code is generated at the location, the Builder points		/// code is generated at the location, the Builder points
/// to.		/// to.
/// @param LI The loop info for the current function		/// @param LI The loop info for the current function
/// @param SE The scalar evolution info for the current function		/// @param SE The scalar evolution info for the current function
/// @param DT The dominator tree of this function.		/// @param DT The dominator tree of this function.
		/// @param ScalarMap Map from scalars to their demoted location.
		/// @param PHIOpMap Map from PHIs to their demoted operand location.
		/// @param EscapeMap Map from scalars to their escape users and locations.
/// @param ExprBuilder An expression builder to generate new access functions.		/// @param ExprBuilder An expression builder to generate new access functions.
BlockGenerator(PollyIRBuilder &Builder, LoopInfo &LI, ScalarEvolution &SE,		BlockGenerator(PollyIRBuilder &Builder, LoopInfo &LI, ScalarEvolution &SE,
DominatorTree &DT, IslExprBuilder *ExprBuilder = nullptr);		DominatorTree &DT, ScalarAllocaMapTy &ScalarMap,
		ScalarAllocaMapTy &PHIOpMap, EscapeUsersAllocaMapTy &EscapeMap,
		IslExprBuilder *ExprBuilder = nullptr);

/// @brief Copy the basic block.		/// @brief Copy the basic block.
///		///
/// This copies the entire basic block and updates references to old values		/// This copies the entire basic block and updates references to old values
/// with references to new values, as defined by GlobalMap.		/// with references to new values, as defined by GlobalMap.
///		///
/// @param Stmt The block statement to code generate.		/// @param Stmt The block statement to code generate.
/// @param GlobalMap A mapping from old values to their new values		/// @param GlobalMap A mapping from old values to their new values
/// (for values recalculated in the new ScoP, but not		/// (for values recalculated in the new ScoP, but not
/// within this basic block).		/// within this basic block).
/// @param LTS A map from old loops to new induction variables as SCEVs.		/// @param LTS A map from old loops to new induction variables as SCEVs.
void copyStmt(ScopStmt &Stmt, ValueMapT &GlobalMap, LoopToScevMapT &LTS);		void copyStmt(ScopStmt &Stmt, ValueMapT &GlobalMap, LoopToScevMapT &LTS);

		/// @brief Finalize the code generation for the SCoP @p S.
		///
		/// This will initialize and finalize the scalar variables we demoted during
		/// the code generation.
		///
		/// @see createScalarInitialization(Region &, ValueMapT &)
		/// @see createScalarFinalization(Region &)
		void finalizeSCoP(Scop &S, ValueMapT &VMap);

		/// @brief An empty destructor
		virtual ~BlockGenerator(){};

protected:		protected:
PollyIRBuilder &Builder;		PollyIRBuilder &Builder;
LoopInfo &LI;		LoopInfo &LI;
ScalarEvolution &SE;		ScalarEvolution &SE;
IslExprBuilder *ExprBuilder;		IslExprBuilder *ExprBuilder;

/// @brief The dominator tree of this function.		/// @brief The dominator tree of this function.
DominatorTree &DT;		DominatorTree &DT;

		/// @brief The entry block of the current function.
		BasicBlock *EntryBB;

		/// @brief Maps to resolve scalar dependences for PHI operands and scalars.
		///
		/// Usage example:
		///
		/// x1 = ... // x1 will be inserted in the ScalarMap and PhiOpMap.
		/// for (i=0...N) {
		/// x2 = phi(x1, add) // x2 will be inserted in the ScalarMap, x1 and
		/// // add are mapped in the PHIOpMap.
		/// add = x2 + A[i]; // add will be inserted in the ScalarMap and
		/// // the PhiOpMap.
		/// }
		/// print(x1) // x1 is mapped in the ScalarMap.
		/// print(x2) // x2 is mapped in the ScalarMap.
		/// print(add) // add is mapped in the ScalarMap.
		///
		///{

		/// The PHIOpMap is used to get the alloca to communicate a value to a PHI
		/// node, hence when the operand of a PHI is demoted the corresponding write
		/// access will use the PHIOpMap to look for the correct alloca. PHI nodes
		/// will then read that location in order to get the correct/current operand
		/// value.
		ScalarAllocaMapTy &PHIOpMap;

		/// The ScalarMap is used in __all__ other cases, thus always when a scalar
		/// variable is read/written and the write is not because the scalar is a PHI
		/// operand.
		ScalarAllocaMapTy &ScalarMap;
		///}

		/// @brief Map from instructions to their escape users as well as the alloca.
		EscapeUsersAllocaMapTy &EscapeMap;

/// @brief Split @p BB to create a new one we can use to clone @p BB in.		/// @brief Split @p BB to create a new one we can use to clone @p BB in.
BasicBlock *splitBB(BasicBlock *BB);		BasicBlock *splitBB(BasicBlock *BB);

/// @brief Copy the given basic block.		/// @brief Copy the given basic block.
///		///
/// @param Stmt The statement to code generate.		/// @param Stmt The statement to code generate.
/// @param BB The basic block to code generate.		/// @param BB The basic block to code generate.
/// @param BBMap A mapping from old values to their new values in this		/// @param BBMap A mapping from old values to their new values in this
[16 lines not shown]
/// block.		/// block.
/// @param GlobalMap A mapping from old values to their new values		/// @param GlobalMap A mapping from old values to their new values
/// (for values recalculated in the new ScoP, but not		/// (for values recalculated in the new ScoP, but not
/// within this basic block).		/// within this basic block).
/// @param LTS A map from old loops to new induction variables as SCEVs.		/// @param LTS A map from old loops to new induction variables as SCEVs.
void copyBB(ScopStmt &Stmt, BasicBlock *BB, BasicBlock *BBCopy,		void copyBB(ScopStmt &Stmt, BasicBlock *BB, BasicBlock *BBCopy,
ValueMapT &BBMap, ValueMapT &GlobalMap, LoopToScevMapT &LTS);		ValueMapT &BBMap, ValueMapT &GlobalMap, LoopToScevMapT &LTS);

		/// @brief Return the alloca for @p ScalarBase in @p Map.
		///
		/// If no alloca was mapped to @p ScalarBase in @p Map a new one is created
		/// and named after @p ScalarBase with the suffix @p NameExt.
		///
		/// @param ScalarBase The demoted scalar instruction.
		/// @param Map The map we should look for a mapped alloca instruction.
		/// @param NameExt The suffix we add to the name of a new created alloca.
		/// @param IsNew If set it will hold true iff the alloca was created.
		///
		/// @returns The alloca for @p ScalarBase in @p Map.
		AllocaInst *getOrCreateAlloca(Instruction *ScalarBase, ScalarAllocaMapTy &Map,
		const char *NameExt = ".s2a",
		bool *IsNew = nullptr);

		/// @brief Generate reload of scalars demoted to memory and needed by @p Inst.
		///
		/// @param Stmt The statement we generate code for.
		/// @param Inst The instruction that might need reloaded values.
		/// @param BBMap A mapping from old values to their new values in this block.
		virtual void generateScalarLoads(ScopStmt &Stmt, const Instruction *Inst,
		ValueMapT &BBMap);

		/// @brief Generate the scalar stores for the given statement.
		///
		/// After the statement @p Stmt was copied all inner-SCoP scalar dependences
		/// starting in @p Stmt (hence all scalar write accesses in @p Stmt) need to
		/// be demoted to memory.
		///
		/// @param Stmt The statement we generate code for.
		/// @param BB The basic block we generate code for.
		/// @param BBMap A mapping from old values to their new values in this block.
		/// @param GlobalMap A mapping for globally replaced values.
		virtual void generateScalarStores(ScopStmt &Stmt, BasicBlock *BB,
		ValueMapT &BBMAp, ValueMapT &GlobalMap);

		/// @brief Handle users of @p Inst outside the SCoP.
		///
		/// @param R The current SCoP region.
		/// @param Inst The current instruction we check.
		/// @param InstCopy The copy of the instruction @p Inst in the optimized SCoP.
		void handleOutsideUsers(const Region &R, Instruction *Inst, Value *InstCopy);

		/// @brief Initialize the memory of demoted scalars.
		///
		/// If a PHI node was demoted and one of its predecessor blocks was outside
		/// the SCoP we need to initialize the memory cell we demoted the PHI into
		/// with the value corresponding to that predecessor. As a SCoP is a
		/// __single__ entry region there is at most one such predecessor.
		void createScalarInitialization(Region &R, ValueMapT &VMap);

		/// @brief Promote the values of demoted scalars after the SCoP.
		///
		/// If a scalar value was used outside the SCoP we need to promote the value
		/// stored in the memory cell allocated for that scalar and combine it with
		/// the original value in the non-optimized SCoP.
		void createScalarFinalization(Region &R);

/// @brief Get the new version of a value.		/// @brief Get the new version of a value.
///		///
/// Given an old value, we first check if a new version of this value is		/// Given an old value, we first check if a new version of this value is
/// available in the BBMap or GlobalMap. In case it is not and the value can		/// available in the BBMap or GlobalMap. In case it is not and the value can
/// be recomputed using SCEV, we do so. If we can not recompute a value		/// be recomputed using SCEV, we do so. If we can not recompute a value
/// using SCEV, but we understand that the value is constant within the scop,		/// using SCEV, but we understand that the value is constant within the scop,
/// we return the old value. If the value can still not be derived, this		/// we return the old value. If the value can still not be derived, this
/// function will assert.		/// function will assert.
[39 lines not shown]
Value *generateScalarLoad(ScopStmt &Stmt, const LoadInst *load,		Value *generateScalarLoad(ScopStmt &Stmt, const LoadInst *load,
ValueMapT &BBMap, ValueMapT &GlobalMap,		ValueMapT &BBMap, ValueMapT &GlobalMap,
LoopToScevMapT &LTS);		LoopToScevMapT &LTS);

Value *generateScalarStore(ScopStmt &Stmt, const StoreInst *store,		Value *generateScalarStore(ScopStmt &Stmt, const StoreInst *store,
ValueMapT &BBMap, ValueMapT &GlobalMap,		ValueMapT &BBMap, ValueMapT &GlobalMap,
LoopToScevMapT &LTS);		LoopToScevMapT &LTS);

		/// @brief Copy a single PHI instruction.
		///
		/// The implementation in the BlockGenerator is trivial, however it allows
		/// subclasses to handle PHIs different.
		///
		/// @returns The nullptr as the BlockGenerator does not copy PHIs.
		virtual Value *copyPHIInstruction(ScopStmt &, const PHINode *, ValueMapT &,
		ValueMapT &, LoopToScevMapT &) {
		return nullptr;
		}

/// @brief Copy a single Instruction.		/// @brief Copy a single Instruction.
///		///
/// This copies a single Instruction and updates references to old values		/// This copies a single Instruction and updates references to old values
/// with references to new values, as defined by GlobalMap and BBMap.		/// with references to new values, as defined by GlobalMap and BBMap.
///		///
/// @param Stmt The statement to code generate.		/// @param Stmt The statement to code generate.
/// @param Inst The instruction to copy.		/// @param Inst The instruction to copy.
/// @param BBMap A mapping from old values to their new values		/// @param BBMap A mapping from old values to their new values
/// (for values recalculated within this basic block).		/// (for values recalculated within this basic block).
/// @param GlobalMap A mapping from old values to their new values		/// @param GlobalMap A mapping from old values to their new values
/// (for values recalculated in the new ScoP, but not		/// (for values recalculated in the new ScoP, but not
/// within this basic block).		/// within this basic block).
/// @param LTS A mapping from loops virtual canonical induction		/// @param LTS A mapping from loops virtual canonical induction
/// variable to their new values		/// variable to their new values
/// (for values recalculated in the new ScoP, but not		/// (for values recalculated in the new ScoP, but not
/// within this basic block).		/// within this basic block).
void copyInstruction(ScopStmt &Stmt, const Instruction *Inst,		void copyInstruction(ScopStmt &Stmt, const Instruction *Inst,
ValueMapT &BBMap, ValueMapT &GlobalMap,		ValueMapT &BBMap, ValueMapT &GlobalMap,
LoopToScevMapT &LTS);		LoopToScevMapT &LTS);

		/// @brief Helper to get the newest version of @p ScalarValue.
		///
		/// @param ScalarValue The original value needed.
		/// @param R The current SCoP region.
		/// @param ReloadMap The scalar map for demoted values.
		/// @param BBMap A mapping from old values to their new values
		/// (for values recalculated within this basic block).
		/// @param GlobalMap A mapping from old values to their new values
		/// (for values recalculated in the new ScoP, but not
		/// within this basic block).
		///
		/// @returns The newest version (e.g., reloaded) of the scalar value.
		Value *getNewScalarValue(Value *ScalarValue, const Region &R,
		ScalarAllocaMapTy &ReloadMap, ValueMapT &BBMap,
		ValueMapT &GlobalMap);
};		};

/// @brief Generate a new vector basic block for a polyhedral statement.		/// @brief Generate a new vector basic block for a polyhedral statement.
///		///
/// The only public function exposed is generate().		/// The only public function exposed is generate().
class VectorBlockGenerator : BlockGenerator {		class VectorBlockGenerator : BlockGenerator {
public:		public:
/// @brief Generate a new vector basic block for a ScoPStmt.		/// @brief Generate a new vector basic block for a ScoPStmt.
[156 lines not shown]
///		///
/// @param Stmt The statement to code generate.		/// @param Stmt The statement to code generate.
/// @param GlobalMap A mapping from old values to their new values		/// @param GlobalMap A mapping from old values to their new values
/// (for values recalculated in the new ScoP, but not		/// (for values recalculated in the new ScoP, but not
/// within this basic block).		/// within this basic block).
/// @param LTS A map from old loops to new induction variables as SCEVs.		/// @param LTS A map from old loops to new induction variables as SCEVs.
void copyStmt(ScopStmt &Stmt, ValueMapT &GlobalMap, LoopToScevMapT &LTS);		void copyStmt(ScopStmt &Stmt, ValueMapT &GlobalMap, LoopToScevMapT &LTS);

		/// @brief An empty destructor
		virtual ~RegionGenerator(){};

private:		private:
		/// @brief A map from old to new blocks in the region.
		DenseMap<BasicBlock *, BasicBlock *> BlockMap;

		/// @brief The "BBMaps" for the whole region (one for each block).
		DenseMap<BasicBlock *, ValueMapT> RegionMaps;

		/// @brief Mapping to remember PHI nodes that still need incoming values.
		using PHINodePairTy = std::pair<const PHINode *, PHINode *>;
		DenseMap<BasicBlock *, SmallVector<PHINodePairTy, 4>> IncompletePHINodeMap;

/// @brief Repair the dominance tree after we created a copy block for @p BB.		/// @brief Repair the dominance tree after we created a copy block for @p BB.
///		///
/// @returns The immediate dominator in the DT for @p BBCopy if in the region.		/// @returns The immediate dominator in the DT for @p BBCopy if in the region.
BasicBlock *repairDominance(BasicBlock *BB, BasicBlock *BBCopy,		BasicBlock *repairDominance(BasicBlock *BB, BasicBlock *BBCopy);
DenseMap<BasicBlock *, BasicBlock *> &BlockMap);
		/// @brief Add the new operand from the copy of @p IncomingBB to @p PHICopy.
		///
		/// @param Stmt The statement to code generate.
		/// @param PHI The original PHI we copy.
		/// @param PHICopy The copy of @p PHI.
		/// @param IncomingBB An incoming block of @p PHI.
		/// @param GlobalMap A mapping from old values to their new values
		/// (for values recalculated in the new ScoP, but not
		/// within this basic block).
		/// @param LTS A map from old loops to new induction variables as
		/// SCEVs.
		void addOperandToPHI(ScopStmt &Stmt, const PHINode *PHI, PHINode *PHICopy,
		BasicBlock *IncomingBB, ValueMapT &GlobalMap,
		LoopToScevMapT &LTS);

		/// @brief Generate reload of scalars demoted to memory and needed by @p Inst.
		///
		/// @param Stmt The statement we generate code for.
		/// @param Inst The instruction that might need reloaded values.
		/// @param BBMap A mapping from old values to their new values in this block.
		virtual void generateScalarLoads(ScopStmt &Stmt, const Instruction *Inst,
		ValueMapT &BBMap) override;

		/// @brief Generate the scalar stores for the given statement.
		///
		/// After the statement @p Stmt was copied all inner-SCoP scalar dependences
		/// starting in @p Stmt (hence all scalar write accesses in @p Stmt) need to
		/// be demoted to memory.
		///
		/// @param Stmt The statement we generate code for.
		/// @param BB The basic block we generate code for.
		/// @param BBMap A mapping from old values to their new values in this block.
		/// @param GlobalMap A mapping from old values to their new values
		/// (for values recalculated in the new ScoP, but not
		/// within this basic block).
		virtual void generateScalarStores(ScopStmt &Stmt, BasicBlock *BB,
		ValueMapT &BBMap,
		ValueMapT &GlobalMap) override;

		/// @brief Copy a single PHI instruction.
		///
		/// This copies a single PHI instruction and updates references to old values
		/// with references to new values, as defined by GlobalMap and BBMap.
		///
		/// @param Stmt The statement to code generate.
		/// @param PHI The PHI instruction to copy.
		/// @param BBMap A mapping from old values to their new values
		/// (for values recalculated within this basic block).
		/// @param GlobalMap A mapping from old values to their new values
		/// (for values recalculated in the new ScoP, but not
		/// within this basic block).
		/// @param LTS A map from old loops to new induction variables as SCEVs.
		///
		/// @returns The copied instruction or nullptr if no copy was made.
		virtual Value *copyPHIInstruction(ScopStmt &Stmt, const PHINode *Inst,
		ValueMapT &BBMap, ValueMapT &GlobalMap,
		LoopToScevMapT &LTS) override;
};		};
}		}
#endif		#endif

polly/trunk/include/polly/CodeGen/IslNodeBuilder.h


	class IslNodeBuilder {			class IslNodeBuilder {
	public:			public:
	IslNodeBuilder(PollyIRBuilder &Builder, ScopAnnotator &Annotator, Pass *P,			IslNodeBuilder(PollyIRBuilder &Builder, ScopAnnotator &Annotator, Pass *P,
	const DataLayout &DL, LoopInfo &LI, ScalarEvolution &SE,			const DataLayout &DL, LoopInfo &LI, ScalarEvolution &SE,
	DominatorTree &DT, Scop &S)			DominatorTree &DT, Scop &S)
	: S(S), Builder(Builder), Annotator(Annotator), Rewriter(SE, DL, "polly"),			: S(S), Builder(Builder), Annotator(Annotator), Rewriter(SE, DL, "polly"),
	ExprBuilder(Builder, IDToValue, Rewriter, DT, LI),			ExprBuilder(Builder, IDToValue, Rewriter, DT, LI),
	BlockGen(Builder, LI, SE, DT, &ExprBuilder), RegionGen(BlockGen), P(P),			BlockGen(Builder, LI, SE, DT, ScalarMap, PHIOpMap, EscapeMap,
	DL(DL), LI(LI), SE(SE), DT(DT) {}			&ExprBuilder),
				RegionGen(BlockGen), P(P), DL(DL), LI(LI), SE(SE), DT(DT) {}

	~IslNodeBuilder() {}			~IslNodeBuilder() {}

	void addParameters(__isl_take isl_set *Context);			void addParameters(__isl_take isl_set *Context);
	void create(__isl_take isl_ast_node *Node);			void create(__isl_take isl_ast_node *Node);

				/// @brief Finalize code generation for the SCoP @p S.
				///
				/// @see BlockGenerator::finalizeSCoP(Scop &S)
				void finalizeSCoP(Scop &S) { BlockGen.finalizeSCoP(S, ValueMap); }

	IslExprBuilder &getExprBuilder() { return ExprBuilder; }			IslExprBuilder &getExprBuilder() { return ExprBuilder; }

	private:			private:
	Scop &S;			Scop &S;
	PollyIRBuilder &Builder;			PollyIRBuilder &Builder;
	ScopAnnotator &Annotator;			ScopAnnotator &Annotator;

	/// @brief A SCEVExpander to create llvm values from SCEVs.			/// @brief A SCEVExpander to create llvm values from SCEVs.
	SCEVExpander Rewriter;			SCEVExpander Rewriter;

	IslExprBuilder ExprBuilder;			IslExprBuilder ExprBuilder;

				/// @brief Maps used by the block and region generator to demote scalars.
				///
				///@{

				/// @brief See BlockGenerator::ScalarMap.
				BlockGenerator::ScalarAllocaMapTy ScalarMap;

				/// @brief See BlockGenerator::PHIOpMap.
				BlockGenerator::ScalarAllocaMapTy PHIOpMap;

				/// @brief See BlockGenerator::EscapeMap.
				BlockGenerator::EscapeUsersAllocaMapTy EscapeMap;

				///@}

				/// @brief The generator used to copy a basic block.
	BlockGenerator BlockGen;			BlockGenerator BlockGen;

	/// @brief Generator for region statements.			/// @brief The generator used to copy a non-affine region.
	RegionGenerator RegionGen;			RegionGenerator RegionGen;

	Pass *const P;			Pass *const P;
	const DataLayout &DL;			const DataLayout &DL;
	LoopInfo &LI;			LoopInfo &LI;
	ScalarEvolution &SE;			ScalarEvolution &SE;
	DominatorTree &DT;			DominatorTree &DT;


polly/trunk/include/polly/ScopInfo.h

#ifndef POLLY_SCOP_INFO_H		#ifndef POLLY_SCOP_INFO_H
#define POLLY_SCOP_INFO_H		#define POLLY_SCOP_INFO_H

#include "polly/ScopDetection.h"		#include "polly/ScopDetection.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/Analysis/RegionPass.h"		#include "llvm/Analysis/RegionPass.h"
#include "isl/ctx.h"		#include "isl/ctx.h"

		#include <forward_list>

using namespace llvm;		using namespace llvm;

namespace llvm {		namespace llvm {
class Loop;		class Loop;
class LoopInfo;		class LoopInfo;
class PHINode;		class PHINode;
class ScalarEvolution;		class ScalarEvolution;
class SCEV;		class SCEV;
▲ Show 20 Lines • Show All 369 Lines • ▼ Show 20 Lines
/// @brief Statement of the Scop		/// @brief Statement of the Scop
///		///
/// A Scop statement represents an instruction in the Scop.		/// A Scop statement represents an instruction in the Scop.
///		///
/// It is further described by its iteration domain, its schedule and its data		/// It is further described by its iteration domain, its schedule and its data
/// accesses.		/// accesses.
/// At the moment every statement represents a single basic block of LLVM-IR.		/// At the moment every statement represents a single basic block of LLVM-IR.
class ScopStmt {		class ScopStmt {
//===-------------------------------------------------------------------===//		public:
		/// @brief List to hold all (scalar) memory accesses mapped to an instruction.
		using MemoryAccessList = std::forward_list<MemoryAccess>;

		private:
ScopStmt(const ScopStmt &) = delete;		ScopStmt(const ScopStmt &) = delete;
const ScopStmt &operator=(const ScopStmt &) = delete;		const ScopStmt &operator=(const ScopStmt &) = delete;

/// Polyhedral description		/// Polyhedral description
//@{		//@{

/// The Scop containing this ScopStmt		/// The Scop containing this ScopStmt
Scop &Parent;		Scop &Parent;
▲ Show 20 Lines • Show All 49 Lines • ▼ Show 20 Lines	private:
/// vector of B.		/// vector of B.
isl_map *Schedule;		isl_map *Schedule;

/// The memory accesses of this statement.		/// The memory accesses of this statement.
///		///
/// The only side effects of a statement are its memory accesses.		/// The only side effects of a statement are its memory accesses.
typedef SmallVector<MemoryAccess *, 8> MemoryAccessVec;		typedef SmallVector<MemoryAccess *, 8> MemoryAccessVec;
MemoryAccessVec MemAccs;		MemoryAccessVec MemAccs;
std::map<const Instruction *, MemoryAccess *> InstructionToAccess;
		/// @brief Mapping from instructions to (scalar) memory accesses.
		DenseMap<const Instruction *, MemoryAccessList *> InstructionToAccess;

//@}		//@}

/// @brief A SCoP statement represents either a basic block (affine/precise		/// @brief A SCoP statement represents either a basic block (affine/precise
/// case) or a whole region (non-affine case). Only one of the		/// case) or a whole region (non-affine case). Only one of the
/// following two members will therefore be set and indicate which		/// following two members will therefore be set and indicate which
/// kind of statement this is.		/// kind of statement this is.
///		///
▲ Show 20 Lines • Show All 135 Lines • ▼ Show 20 Lines	public:
///		///
/// @return The region represented by this ScopStmt, or null if the statement		/// @return The region represented by this ScopStmt, or null if the statement
/// represents a basic block.		/// represents a basic block.
Region *getRegion() const { return R; }		Region *getRegion() const { return R; }

/// @brief Return true if this statement represents a whole region.		/// @brief Return true if this statement represents a whole region.
bool isRegionStmt() const { return R != nullptr; }		bool isRegionStmt() const { return R != nullptr; }

		/// @brief Return the (scalar) memory accesses for @p Inst.
		const MemoryAccessList &getAccessesFor(const Instruction *Inst) const {
		MemoryAccessList *MAL = lookupAccessesFor(Inst);
		assert(MAL && "Cannot get memory accesses because they do not exist!");
		return *MAL;
		}

		/// @brief Return the (scalar) memory accesses for @p Inst if any.
		MemoryAccessList *lookupAccessesFor(const Instruction *Inst) const {
		auto It = InstructionToAccess.find(Inst);
		return It == InstructionToAccess.end() ? nullptr : It->getSecond();
		}

		/// @brief Return the __first__ (scalar) memory access for @p Inst.
const MemoryAccess &getAccessFor(const Instruction *Inst) const {		const MemoryAccess &getAccessFor(const Instruction *Inst) const {
MemoryAccess *A = lookupAccessFor(Inst);		MemoryAccess *MA = lookupAccessFor(Inst);
assert(A && "Cannot get memory access because it does not exist!");		assert(MA && "Cannot get memory access because it does not exist!");
return *A;		return *MA;
}		}

		/// @brief Return the __first__ (scalar) memory access for @p Inst if any.
MemoryAccess *lookupAccessFor(const Instruction *Inst) const {		MemoryAccess *lookupAccessFor(const Instruction *Inst) const {
std::map<const Instruction *, MemoryAccess *>::const_iterator at =		auto It = InstructionToAccess.find(Inst);
InstructionToAccess.find(Inst);		return It == InstructionToAccess.end() ? nullptr
return at == InstructionToAccess.end() ? NULL : at->second;		: &It->getSecond()->front();
}		}

void setBasicBlock(BasicBlock *Block) {		void setBasicBlock(BasicBlock *Block) {
// TODO: Handle the case where the statement is a region statement, thus		// TODO: Handle the case where the statement is a region statement, thus
// the entry block was split and needs to be changed in the region R.		// the entry block was split and needs to be changed in the region R.
assert(BB && "Cannot set a block for a region statement");		assert(BB && "Cannot set a block for a region statement");
BB = Block;		BB = Block;
}		}
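
The accessors above assume that `InstructionToAccess` owns one list of accesses per instruction, while `MemAccs` only keeps pointers into those lists. The following is a minimal standalone sketch of that ownership pattern, not Polly code: it uses standard-library containers in place of `DenseMap`, and `Access`, `Stmt`, `addAccess`, and `all` are hypothetical stand-ins introduced only for illustration.

```cpp
#include <cassert>
#include <forward_list>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for polly::MemoryAccess.
struct Access {
  int Id;
  bool IsScalar;
  Access(int Id, bool IsScalar) : Id(Id), IsScalar(IsScalar) {}
};

// Hypothetical stand-in for the parts of ScopStmt touched by this patch.
class Stmt {
public:
  using AccessList = std::forward_list<Access>;

  // Mirrors the emplace_front + push_back(&front()) sequence: the list owns
  // the access, the vector only observes it. forward_list nodes never move,
  // so the stored pointers stay valid when further accesses are added.
  void addAccess(const void *Inst, bool IsScalar) {
    AccessList &List = InstToAccesses[Inst];
    List.emplace_front(static_cast<int>(All.size()), IsScalar);
    All.push_back(&List.front());
  }

  // All accesses attached to Inst, or nullptr if there are none.
  const AccessList *lookupAccessesFor(const void *Inst) const {
    auto It = InstToAccesses.find(Inst);
    return It == InstToAccesses.end() ? nullptr : &It->second;
  }

  // The front access attached to Inst, or nullptr if there is none.
  const Access *lookupAccessFor(const void *Inst) const {
    const AccessList *List = lookupAccessesFor(Inst);
    return List ? &List->front() : nullptr;
  }

  // Like lookupAccessFor, but the access is required to exist.
  const Access &getAccessFor(const void *Inst) const {
    const Access *A = lookupAccessFor(Inst);
    assert(A && "Cannot get memory access because it does not exist!");
    return *A;
  }

  const std::vector<Access *> &all() const { return All; }

private:
  std::unordered_map<const void *, AccessList> InstToAccesses;
  std::vector<Access *> All;
};
```

In this sketch, as in the patch, new accesses are prepended, so the "first" access returned for an instruction is the one added most recently.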

polly/trunk/lib/Analysis/ScopInfo.cpp

#include "polly/ScopInfo.h"		#include "polly/ScopInfo.h"
#include "polly/Support/GICHelper.h"		#include "polly/Support/GICHelper.h"
#include "polly/Support/SCEVValidator.h"		#include "polly/Support/SCEVValidator.h"
#include "polly/Support/ScopHelper.h"		#include "polly/Support/ScopHelper.h"
#include "polly/TempScopInfo.h"		#include "polly/TempScopInfo.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/RegionIterator.h"		#include "llvm/Analysis/RegionIterator.h"
#include "llvm/Analysis/ScalarEvolutionExpressions.h"		#include "llvm/Analysis/ScalarEvolutionExpressions.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "isl/aff.h"		#include "isl/aff.h"
#include "isl/constraint.h"		#include "isl/constraint.h"
▲ Show 20 Lines • Show All 834 Lines • ▼ Show 20 Lines	for (auto &AccessPair : *AFS) {

Type *ElementType = getAccessInstType(AccessInst);		Type *ElementType = getAccessInstType(AccessInst);
const ScopArrayInfo *SAI = getParent()->getOrCreateScopArrayInfo(		const ScopArrayInfo *SAI = getParent()->getOrCreateScopArrayInfo(
Access.getBase(), ElementType, Access.Sizes);		Access.getBase(), ElementType, Access.Sizes);

if (isApproximated && Access.isWrite())		if (isApproximated && Access.isWrite())
Access.setMayWrite();		Access.setMayWrite();

MemAccs.push_back(		MemoryAccessList *&MAL = InstructionToAccess[AccessInst];
new MemoryAccess(Access, AccessInst, this, SAI, MemAccs.size()));		if (!MAL)
		MAL = new MemoryAccessList();
// We do not track locations for scalar memory accesses at the moment.		MAL->emplace_front(Access, AccessInst, this, SAI, MemAccs.size());
//		MemAccs.push_back(&MAL->front());
// We do not have a use for this information at the moment. If we need this
// at some point, the "instruction -> access" mapping needs to be enhanced
// as a single instruction could then possibly perform multiple accesses.
if (!Access.isScalar()) {
assert(!InstructionToAccess.count(AccessInst) &&
"Unexpected 1-to-N mapping on instruction to access map!");
InstructionToAccess[AccessInst] = MemAccs.back();
}
}		}
}		}

void ScopStmt::realignParams() {		void ScopStmt::realignParams() {
for (MemoryAccess *MA : *this)		for (MemoryAccess *MA : *this)
MA->realignParams();		MA->realignParams();

Domain = isl_set_align_params(Domain, Parent.getParamSpace());		Domain = isl_set_align_params(Domain, Parent.getParamSpace());
▲ Show 20 Lines • Show All 352 Lines • ▼ Show 20 Lines	__isl_give isl_space *ScopStmt::getDomainSpace() const {
return isl_set_get_space(Domain);		return isl_set_get_space(Domain);
}		}

__isl_give isl_id *ScopStmt::getDomainId() const {		__isl_give isl_id *ScopStmt::getDomainId() const {
return isl_set_get_tuple_id(Domain);		return isl_set_get_tuple_id(Domain);
}		}

ScopStmt::~ScopStmt() {		ScopStmt::~ScopStmt() {
while (!MemAccs.empty()) {		DeleteContainerSeconds(InstructionToAccess);
delete MemAccs.back();
MemAccs.pop_back();
}

isl_set_free(Domain);		isl_set_free(Domain);
isl_map_free(Schedule);		isl_map_free(Schedule);
}		}

void ScopStmt::print(raw_ostream &OS) const {		void ScopStmt::print(raw_ostream &OS) const {
OS << "\t" << getBaseName() << "\n";		OS << "\t" << getBaseName() << "\n";
OS.indent(12) << "Domain :=\n";		OS.indent(12) << "Domain :=\n";


polly/trunk/lib/CodeGen/BlockGenerators.cpp

Show First 20 Lines • Show All 75 Lines • ▼ Show 20 Lines	default:
break;		break;
}		}
}		}
return false;		return false;
}		}

BlockGenerator::BlockGenerator(PollyIRBuilder &B, LoopInfo &LI,		BlockGenerator::BlockGenerator(PollyIRBuilder &B, LoopInfo &LI,
ScalarEvolution &SE, DominatorTree &DT,		ScalarEvolution &SE, DominatorTree &DT,
		ScalarAllocaMapTy &ScalarMap,
		ScalarAllocaMapTy &PHIOpMap,
		EscapeUsersAllocaMapTy &EscapeMap,
IslExprBuilder *ExprBuilder)		IslExprBuilder *ExprBuilder)
: Builder(B), LI(LI), SE(SE), ExprBuilder(ExprBuilder), DT(DT) {}		: Builder(B), LI(LI), SE(SE), ExprBuilder(ExprBuilder), DT(DT),
		EntryBB(nullptr), PHIOpMap(PHIOpMap), ScalarMap(ScalarMap),
		EscapeMap(EscapeMap) {}

Value *BlockGenerator::getNewValue(ScopStmt &Stmt, const Value *Old,		Value *BlockGenerator::getNewValue(ScopStmt &Stmt, const Value *Old,
ValueMapT &BBMap, ValueMapT &GlobalMap,		ValueMapT &BBMap, ValueMapT &GlobalMap,
LoopToScevMapT &LTS, Loop *L) const {		LoopToScevMapT &LTS, Loop *L) const {
// We assume constants never change.		// We assume constants never change.
// This avoids map lookups for many calls to this function.		// This avoids map lookups for many calls to this function.
if (isa<Constant>(Old))		if (isa<Constant>(Old))
return const_cast<Value *>(Old);		return const_cast<Value *>(Old);
▲ Show 20 Lines • Show All 143 Lines • ▼ Show 20 Lines	Value *BlockGenerator::generateScalarStore(ScopStmt &Stmt,
Value *NewStore = Builder.CreateAlignedStore(ValueOperand, NewPointer,		Value *NewStore = Builder.CreateAlignedStore(ValueOperand, NewPointer,
Store->getAlignment());		Store->getAlignment());
return NewStore;		return NewStore;
}		}

void BlockGenerator::copyInstruction(ScopStmt &Stmt, const Instruction *Inst,		void BlockGenerator::copyInstruction(ScopStmt &Stmt, const Instruction *Inst,
ValueMapT &BBMap, ValueMapT &GlobalMap,		ValueMapT &BBMap, ValueMapT &GlobalMap,
LoopToScevMapT &LTS) {		LoopToScevMapT &LTS) {

		// First check for possible scalar dependences for this instruction.
		generateScalarLoads(Stmt, Inst, BBMap);

// Terminator instructions control the control flow. They are explicitly		// Terminator instructions control the control flow. They are explicitly
// expressed in the clast and do not need to be copied.		// expressed in the clast and do not need to be copied.
if (Inst->isTerminator())		if (Inst->isTerminator())
return;		return;

if (canSynthesize(Inst, &LI, &SE, &Stmt.getParent()->getRegion()))		Loop *L = getLoopForInst(Inst);
		if ((Stmt.isBlockStmt() || !Stmt.getRegion()->contains(L)) &&
		canSynthesize(Inst, &LI, &SE, &Stmt.getParent()->getRegion())) {
		Value *NewValue = getNewValue(Stmt, Inst, BBMap, GlobalMap, LTS, L);
		BBMap[Inst] = NewValue;
return;		return;
		}

if (const LoadInst *Load = dyn_cast<LoadInst>(Inst)) {		if (const LoadInst *Load = dyn_cast<LoadInst>(Inst)) {
Value *NewLoad = generateScalarLoad(Stmt, Load, BBMap, GlobalMap, LTS);		Value *NewLoad = generateScalarLoad(Stmt, Load, BBMap, GlobalMap, LTS);
// Compute NewLoad before its insertion in BBMap to make the insertion		// Compute NewLoad before its insertion in BBMap to make the insertion
// deterministic.		// deterministic.
BBMap[Load] = NewLoad;		BBMap[Load] = NewLoad;
return;		return;
}		}

if (const StoreInst *Store = dyn_cast<StoreInst>(Inst)) {		if (const StoreInst *Store = dyn_cast<StoreInst>(Inst)) {
Value *NewStore = generateScalarStore(Stmt, Store, BBMap, GlobalMap, LTS);		Value *NewStore = generateScalarStore(Stmt, Store, BBMap, GlobalMap, LTS);
// Compute NewStore before its insertion in BBMap to make the insertion		// Compute NewStore before its insertion in BBMap to make the insertion
// deterministic.		// deterministic.
BBMap[Store] = NewStore;		BBMap[Store] = NewStore;
return;		return;
}		}

		if (const PHINode *PHI = dyn_cast<PHINode>(Inst)) {
		copyPHIInstruction(Stmt, PHI, BBMap, GlobalMap, LTS);
		return;
		}

// Skip some special intrinsics for which we do not adjust the semantics to		// Skip some special intrinsics for which we do not adjust the semantics to
// the new schedule. All others are handled like every other instruction.		// the new schedule. All others are handled like every other instruction.
if (auto *IT = dyn_cast<IntrinsicInst>(Inst)) {		if (auto *IT = dyn_cast<IntrinsicInst>(Inst)) {
switch (IT->getIntrinsicID()) {		switch (IT->getIntrinsicID()) {
// Lifetime markers are ignored.		// Lifetime markers are ignored.
case llvm::Intrinsic::lifetime_start:		case llvm::Intrinsic::lifetime_start:
case llvm::Intrinsic::lifetime_end:		case llvm::Intrinsic::lifetime_end:
// Invariant markers are ignored.		// Invariant markers are ignored.
▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines	BasicBlock *BlockGenerator::copyBB(ScopStmt &Stmt, BasicBlock *BB,
copyBB(Stmt, BB, CopyBB, BBMap, GlobalMap, LTS);		copyBB(Stmt, BB, CopyBB, BBMap, GlobalMap, LTS);
return CopyBB;		return CopyBB;
}		}

void BlockGenerator::copyBB(ScopStmt &Stmt, BasicBlock *BB, BasicBlock *CopyBB,		void BlockGenerator::copyBB(ScopStmt &Stmt, BasicBlock *BB, BasicBlock *CopyBB,
ValueMapT &BBMap, ValueMapT &GlobalMap,		ValueMapT &BBMap, ValueMapT &GlobalMap,
LoopToScevMapT &LTS) {		LoopToScevMapT &LTS) {
Builder.SetInsertPoint(CopyBB->begin());		Builder.SetInsertPoint(CopyBB->begin());
		EntryBB = &CopyBB->getParent()->getEntryBlock();

for (Instruction &Inst : *BB)		for (Instruction &Inst : *BB)
copyInstruction(Stmt, &Inst, BBMap, GlobalMap, LTS);		copyInstruction(Stmt, &Inst, BBMap, GlobalMap, LTS);

		// After a basic block was copied store all scalars that escape this block
		// in their alloca. First the scalars that have dependences inside the SCoP,
		// then the ones that might escape the SCoP.
		generateScalarStores(Stmt, BB, BBMap, GlobalMap);

		const Region &R = Stmt.getParent()->getRegion();
		for (Instruction &Inst : *BB)
		handleOutsideUsers(R, &Inst, BBMap[&Inst]);
		}

		AllocaInst *BlockGenerator::getOrCreateAlloca(Instruction *ScalarBase,
		ScalarAllocaMapTy &Map,
		const char *NameExt,
		bool *IsNew) {

		// Check if an alloca was cached for the base instruction.
		AllocaInst *&Addr = Map[ScalarBase];

		// If needed indicate if it was found already or will be created.
		if (IsNew)
		*IsNew = (Addr == nullptr);

		// If no alloca was found create one and insert it in the entry block.
		if (!Addr) {
		auto *Ty = ScalarBase->getType();
		Addr = new AllocaInst(Ty, ScalarBase->getName() + NameExt);
		Addr->insertBefore(EntryBB->getFirstInsertionPt());
		}

		return Addr;
		}
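
The get-or-create logic of `getOrCreateAlloca` can be illustrated in isolation. The sketch below is a simplified model under stated assumptions, not the patch's implementation: `Slot` stands in for an `AllocaInst`, `SlotMap` for `ScalarAllocaMapTy`, and `getOrCreateSlot` with its `BaseName` parameter is a hypothetical helper. It shows why the optional `IsNew` flag matters: a second request through the same map returns the cached slot, while a different map yields a separate slot for the same base value.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an llvm::AllocaInst: just a named stack slot.
struct Slot {
  std::string Name;
};

// Hypothetical stand-in for ScalarAllocaMapTy.
using SlotMap = std::map<const void *, std::unique_ptr<Slot>>;

// Return the slot cached for Base in Map, creating it on first use.
// If IsNew is non-null, it reports whether the slot had to be created.
Slot *getOrCreateSlot(const void *Base, const std::string &BaseName,
                      SlotMap &Map, const char *NameExt,
                      bool *IsNew = nullptr) {
  // operator[] default-constructs an empty entry when none is cached yet.
  std::unique_ptr<Slot> &Cached = Map[Base];

  if (IsNew)
    *IsNew = (Cached == nullptr);

  // The name extension is only used when the slot is actually created.
  if (!Cached)
    Cached.reset(new Slot{BaseName + NameExt});

  return Cached.get();
}
```

Because the name extension only applies on creation, a slot first created as `.s2a` keeps that name when `handleOutsideUsers` later requests it from the same map with `.escape`.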

		void BlockGenerator::handleOutsideUsers(const Region &R, Instruction *Inst,
		Value *InstCopy) {
		BasicBlock *ExitBB = R.getExit();

		EscapeUserVectorTy EscapeUsers;
		for (User *U : Inst->users()) {

		// Non-instruction user will never escape.
		Instruction *UI = dyn_cast<Instruction>(U);
		if (!UI)
		continue;

		if (R.contains(UI) && ExitBB != UI->getParent())
		continue;

		EscapeUsers.push_back(UI);
		}

		// Exit if no escape uses were found.
		if (EscapeUsers.empty())
		return;

		// If there are escape users we get the alloca for this instruction and put
		// it in the EscapeMap for later finalization. However, if the alloca was not
		// created by an already handled scalar dependence we have to initialize it
		// also. Lastly, if the instruction was copied multiple times we already did
		// this and can exit.
		if (EscapeMap.count(Inst))
		return;

		// Get or create an escape alloca for this instruction.
		bool IsNew;
		AllocaInst *ScalarAddr =
		getOrCreateAlloca(Inst, ScalarMap, ".escape", &IsNew);

		// Remember that this instruction has escape uses and the escape alloca.
		EscapeMap[Inst] = std::make_pair(ScalarAddr, std::move(EscapeUsers));

		// If the escape alloca was just created store the instruction in there,
		// otherwise that happened already.
		if (IsNew) {
		assert(InstCopy && "Except PHIs every instruction should have a copy!");
		Builder.CreateStore(InstCopy, ScalarAddr);
		}
		}

		void BlockGenerator::generateScalarLoads(ScopStmt &Stmt,
		const Instruction *Inst,
		ValueMapT &BBMap) {

		// Iterate over all memory accesses for the given instruction and handle all
		// scalar reads.
		if (ScopStmt::MemoryAccessList *MAL = Stmt.lookupAccessesFor(Inst)) {
		for (MemoryAccess &MA : *MAL) {
		if (!MA.isScalar() || !MA.isRead())
		continue;

		Instruction *ScalarBase = cast<Instruction>(MA.getBaseAddr());
		Instruction *ScalarInst = MA.getAccessInstruction();

		PHINode *ScalarBasePHI = dyn_cast<PHINode>(ScalarBase);

		// This is either a common scalar use (second case) or the use of a phi
		// operand by the PHI node (first case).
		if (ScalarBasePHI == ScalarInst) {
		AllocaInst *PHIOpAddr =
		getOrCreateAlloca(ScalarBase, PHIOpMap, ".phiops");
		LoadInst *LI =
		Builder.CreateLoad(PHIOpAddr, PHIOpAddr->getName() + ".reload");
		BBMap[ScalarBase] = LI;
		} else {
		// For non-PHI operand uses we look up the alloca in the ScalarMap,
		// reload it and add the mapping to the ones in the current basic block.
		AllocaInst *ScalarAddr =
		getOrCreateAlloca(ScalarBase, ScalarMap, ".s2a");
		LoadInst *LI =
		Builder.CreateLoad(ScalarAddr, ScalarAddr->getName() + ".reload");
		BBMap[ScalarBase] = LI;
		}
		}
		}
		}

		Value *BlockGenerator::getNewScalarValue(Value *ScalarValue, const Region &R,
		ScalarAllocaMapTy &ReloadMap,
		ValueMapT &BBMap,
		ValueMapT &GlobalMap) {
		// If the value we want to store is an instruction we might have demoted it
		// in order to make it accessible here. In such a case a reload is
		// necessary. If it is no instruction it will always be a value that
		// dominates the current point and we can just use it. In total there are 4
		// options:
		// (1) The value is no instruction ==> use the value.
		// (2) The value is an instruction that was split out of the region prior to
		// code generation ==> use the instruction as it dominates the region.
		// (3) The value is an instruction:
		// (a) The value was defined in the current block, thus a copy is in
		// the BBMap ==> use the mapped value.
		// (b) The value was defined in a previous block, thus we demoted it
		// earlier ==> use the reloaded value.
		Instruction *ScalarValueInst = dyn_cast<Instruction>(ScalarValue);
		if (!ScalarValueInst)
		return ScalarValue;

		if (!R.contains(ScalarValueInst)) {
		if (Value *ScalarValueCopy = GlobalMap.lookup(ScalarValueInst))
		return /* Case (3a) */ ScalarValueCopy;
		else
		return /* Case 2 */ ScalarValue;
		}

		if (Value *ScalarValueCopy = BBMap.lookup(ScalarValueInst))
		return /* Case (3a) */ ScalarValueCopy;

		// Case (3b)
		assert(ReloadMap.count(ScalarValueInst) &&
		"ScalarInst not mapped in the block and not in the given reload map!");
		Value *ReloadAddr = ReloadMap[ScalarValueInst];
		ScalarValue =
		Builder.CreateLoad(ReloadAddr, ReloadAddr->getName() + ".reload");

		return ScalarValue;
		}

		void BlockGenerator::generateScalarStores(ScopStmt &Stmt, BasicBlock *BB,
		ValueMapT &BBMap,
		ValueMapT &GlobalMap) {
		const Region &R = Stmt.getParent()->getRegion();

		assert(Stmt.isBlockStmt() && BB == Stmt.getBasicBlock() &&
		"Region statements need to use the generateScalarStores() "
		"function in the RegionGenerator");

		// Set to remember a store to the phiops alloca of a PHINode. It is needed as
		// we might have multiple write accesses to the same PHI and while one is the
		// self write of the PHI (to the ScalarMap alloca) the other is the write to
		// the operand alloca (PHIOpMap).
		SmallPtrSet<PHINode *, 4> SeenPHIs;

		// Iterate over all accesses in the given statement.
		for (MemoryAccess *MA : Stmt) {

		// Skip non-scalar and read accesses.
		if (!MA->isScalar() || MA->isRead())
		continue;

		Instruction *ScalarBase = cast<Instruction>(MA->getBaseAddr());
		Instruction *ScalarInst = MA->getAccessInstruction();
		PHINode *ScalarBasePHI = dyn_cast<PHINode>(ScalarBase);

		// Get the alloca node for the base instruction and the value we want to
		// store. In total there are 4 options:
		// (1) The base is no PHI, hence it is a simple scalar def-use chain.
		// (2) The base is a PHI,
		// (a) and the write is caused by an operand in the block.
		// (b) and it is the PHI self write (same as case (1)).
		// (c) (2a) and (2b) are not distinguishable.
		// For case (1) and (2b) we get the alloca from the scalar map and the value
		// we want to store is initialized with the instruction attached to the
		// memory access. For case (2a) we get the alloca from the PHI operand map
		// and the value we want to store is initialized with the incoming value for
		// this block. The tricky case (2c) is when both (2a) and (2b) match. This
		// happens if the PHI operand is in the same block as the PHI. To handle
		// that we choose the alloca of (2a) first and (2b) for the next write
		// access to that PHI (there must be 2).
		Value *ScalarValue = nullptr;
		AllocaInst *ScalarAddr = nullptr;

		if (!ScalarBasePHI) {
		// Case (1)
		ScalarAddr = getOrCreateAlloca(ScalarBase, ScalarMap, ".s2a");
		ScalarValue = ScalarInst;
		} else {
		int PHIIdx = ScalarBasePHI->getBasicBlockIndex(BB);
		if (ScalarBasePHI != ScalarInst) {
		// Case (2a)
		assert(PHIIdx >= 0 && "Bad scalar write to PHI operand");
		SeenPHIs.insert(ScalarBasePHI);
		ScalarAddr = getOrCreateAlloca(ScalarBase, PHIOpMap, ".phiops");
		ScalarValue = ScalarBasePHI->getIncomingValue(PHIIdx);
		} else if (PHIIdx < 0) {
		// Case (2b)
		ScalarAddr = getOrCreateAlloca(ScalarBase, ScalarMap, ".s2a");
		ScalarValue = ScalarInst;
		} else {
		// Case (2c)
		if (SeenPHIs.insert(ScalarBasePHI).second) {
		// First access ==> same as (2a)
		ScalarAddr = getOrCreateAlloca(ScalarBase, PHIOpMap, ".phiops");
		ScalarValue = ScalarBasePHI->getIncomingValue(PHIIdx);
		} else {
		// Second access ==> same as (2b)
		ScalarAddr = getOrCreateAlloca(ScalarBase, ScalarMap, ".s2a");
		ScalarValue = ScalarInst;
		}
		}
		}

		ScalarValue =
		getNewScalarValue(ScalarValue, R, ScalarMap, BBMap, GlobalMap);
		Builder.CreateStore(ScalarValue, ScalarAddr);
		}
		}
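
The case distinction in `generateScalarStores` is easier to follow when separated from the IR manipulation. The sketch below is a simplified model, assuming each scalar write is summarized by a few flags; `ScalarWrite`, `Target`, and `classify` are hypothetical names introduced for illustration and do not appear in the patch. It only decides which map the alloca would come from (the scalar map or the PHI operand map), following cases (1), (2a), (2b), and (2c) from the comment above.

```cpp
#include <cassert>
#include <set>

// Which alloca map a scalar write is stored through.
enum class Target { ScalarSlot, PHIOperandSlot };

// Hypothetical, simplified description of one scalar write access.
struct ScalarWrite {
  const void *Base; // the value the access is attached to
  bool BaseIsPHI;   // is Base a PHI node?
  bool IsSelfWrite; // is the accessing instruction the PHI itself?
  int IncomingIdx;  // index of the current block among the PHI's
                    // incoming blocks, or -1 if it is not one of them
};

// Decide the target slot; SeenPHIs carries state across the accesses of
// one statement so the two writes of case (2c) can be told apart.
Target classify(const ScalarWrite &W, std::set<const void *> &SeenPHIs) {
  // Case (1): plain scalar def-use chain.
  if (!W.BaseIsPHI)
    return Target::ScalarSlot;

  // Case (2a): the write provides an operand of the PHI.
  if (!W.IsSelfWrite) {
    assert(W.IncomingIdx >= 0 && "Bad scalar write to PHI operand");
    SeenPHIs.insert(W.Base);
    return Target::PHIOperandSlot;
  }

  // Case (2b): the PHI's own write, and this block feeds no operand.
  if (W.IncomingIdx < 0)
    return Target::ScalarSlot;

  // Case (2c): ambiguous; the first access is treated as (2a), the
  // second access to the same PHI as (2b).
  return SeenPHIs.insert(W.Base).second ? Target::PHIOperandSlot
                                        : Target::ScalarSlot;
}
```

For a PHI whose operand is defined in the PHI's own block, two consecutive calls with the same description return the operand slot first and the scalar slot second.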

		void BlockGenerator::createScalarInitialization(Region &R,
		ValueMapT &GlobalMap) {
		// The split block __just before__ the region and optimized region.
		BasicBlock *SplitBB = R.getEnteringBlock();
		BranchInst *SplitBBTerm = cast<BranchInst>(SplitBB->getTerminator());
		assert(SplitBBTerm->getNumSuccessors() == 2 && "Bad region entering block!");

		// Get the start block of the __optimized__ region.
		BasicBlock *StartBB = SplitBBTerm->getSuccessor(0);
		if (StartBB == R.getEntry())
		StartBB = SplitBBTerm->getSuccessor(1);

		// For each PHI predecessor outside the region store the incoming operand
		// value prior to entering the optimized region.
		Builder.SetInsertPoint(StartBB->getTerminator());

		ScalarAllocaMapTy EmptyMap;
		for (const auto &PHIOpMapping : PHIOpMap) {
		const PHINode *PHI = cast<PHINode>(PHIOpMapping.getFirst());

		// Check if this PHI has the split block as predecessor (that is the only
		// possible predecessor outside the SCoP).
		int idx = PHI->getBasicBlockIndex(SplitBB);
		if (idx < 0)
		continue;

		Value *ScalarValue = PHI->getIncomingValue(idx);
		ScalarValue =
		getNewScalarValue(ScalarValue, R, EmptyMap, GlobalMap, GlobalMap);

		// If the split block is the predecessor initialize the PHI operator alloca.
		Builder.CreateStore(ScalarValue, PHIOpMapping.getSecond());
		}
		}

		void BlockGenerator::createScalarFinalization(Region &R) {
		// The exit block of the __unoptimized__ region.
		BasicBlock *ExitBB = R.getExitingBlock();
		// The merge block __just after__ the region and the optimized region.
		BasicBlock *MergeBB = R.getExit();

		// The exit block of the __optimized__ region.
		BasicBlock *OptExitBB = *(pred_begin(MergeBB));
		if (OptExitBB == ExitBB)
		OptExitBB = *(++pred_begin(MergeBB));

		Builder.SetInsertPoint(OptExitBB->getTerminator());
		for (const auto &EscapeMapping : EscapeMap) {
		// Extract the escaping instruction and the escaping users as well as the
		// alloca the instruction was demoted to.
		Instruction *EscapeInst = EscapeMapping.getFirst();
		const auto &EscapeMappingValue = EscapeMapping.getSecond();
		const EscapeUserVectorTy &EscapeUsers = EscapeMappingValue.second;
		AllocaInst *ScalarAddr = EscapeMappingValue.first;

		// Reload the demoted instruction in the optimized version of the SCoP.
		Instruction *EscapeInstReload =
		Builder.CreateLoad(ScalarAddr, EscapeInst->getName() + ".final_reload");

		// Create the merge PHI that merges the optimized and unoptimized version.
		PHINode *MergePHI = PHINode::Create(EscapeInst->getType(), 2,
		EscapeInst->getName() + ".merge");
		MergePHI->insertBefore(MergeBB->getFirstInsertionPt());

		// Add the respective values to the merge PHI.
		MergePHI->addIncoming(EscapeInstReload, OptExitBB);
		MergePHI->addIncoming(EscapeInst, ExitBB);

		// The information of scalar evolution about the escaping instruction needs
		// to be revoked so the new merged instruction will be used.
		if (SE.isSCEVable(EscapeInst->getType()))
		SE.forgetValue(EscapeInst);

		// Replace all uses of the demoted instruction with the merge PHI.
		for (Instruction *EUser : EscapeUsers)
		EUser->replaceUsesOfWith(EscapeInst, MergePHI);
		}
		}
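
The finalization step above can be read as two small decisions: pick the predecessor of the merge block that is not the exit of the original region, and let the merge PHI select between the final reload and the original value depending on which path was taken. The following is a minimal standalone sketch of that reading, not Polly code. Blocks are modelled as strings, and `pickOptExit` and `mergeEscaping` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: the merge block is assumed to have exactly two
// predecessors, the exit of the original region and the exit of the
// optimized region. Return the one that is not the original exit.
std::string pickOptExit(const std::vector<std::string> &MergePreds,
                        const std::string &ExitBB) {
  assert(MergePreds.size() == 2 && "model assumes exactly two predecessors");
  return MergePreds[0] == ExitBB ? MergePreds[1] : MergePreds[0];
}

// Hypothetical model of the merge PHI: the value visible after the merge
// block depends on the predecessor that control came from.
int mergeEscaping(const std::string &CameFrom, const std::string &OptExitBB,
                  int FinalReload, int OriginalValue) {
  return CameFrom == OptExitBB ? FinalReload : OriginalValue;
}
```

The two-predecessor assumption is what makes the `++pred_begin(MergeBB)` fallback in the patch sufficient; the sketch asserts it explicitly.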

		void BlockGenerator::finalizeSCoP(Scop &S, ValueMapT &GlobalMap) {
		createScalarInitialization(S.getRegion(), GlobalMap);
		createScalarFinalization(S.getRegion());
}		}

VectorBlockGenerator::VectorBlockGenerator(BlockGenerator &BlockGen,		VectorBlockGenerator::VectorBlockGenerator(BlockGenerator &BlockGen,
VectorValueMapT &GlobalMaps,		VectorValueMapT &GlobalMaps,
std::vector<LoopToScevMapT> &VLTS,		std::vector<LoopToScevMapT> &VLTS,
isl_map *Schedule)		isl_map *Schedule)
: BlockGenerator(BlockGen), GlobalMaps(GlobalMaps), VLTS(VLTS),		: BlockGenerator(BlockGen), GlobalMaps(GlobalMaps), VLTS(VLTS),
Schedule(Schedule) {		Schedule(Schedule) {
[338 lines not shown]	void VectorBlockGenerator::copyStmt(ScopStmt &Stmt) {
// appears once in every dimension of the scalarMap.		// appears once in every dimension of the scalarMap.
VectorValueMapT ScalarBlockMap(getVectorWidth());		VectorValueMapT ScalarBlockMap(getVectorWidth());
ValueMapT VectorBlockMap;		ValueMapT VectorBlockMap;

for (Instruction &Inst : *BB)		for (Instruction &Inst : *BB)
copyInstruction(Stmt, &Inst, VectorBlockMap, ScalarBlockMap);		copyInstruction(Stmt, &Inst, VectorBlockMap, ScalarBlockMap);
}		}

BasicBlock *RegionGenerator::repairDominance(		BasicBlock *RegionGenerator::repairDominance(BasicBlock *BB,
BasicBlock *BB, BasicBlock *BBCopy,		BasicBlock *BBCopy) {
DenseMap<BasicBlock *, BasicBlock *> &BlockMap) {

BasicBlock *BBIDom = DT.getNode(BB)->getIDom()->getBlock();		BasicBlock *BBIDom = DT.getNode(BB)->getIDom()->getBlock();
BasicBlock *BBCopyIDom = BlockMap.lookup(BBIDom);		BasicBlock *BBCopyIDom = BlockMap.lookup(BBIDom);

if (BBCopyIDom)		if (BBCopyIDom)
DT.changeImmediateDominator(BBCopy, BBCopyIDom);		DT.changeImmediateDominator(BBCopy, BBCopyIDom);

return BBCopyIDom;		return BBCopyIDom;
}		}

void RegionGenerator::copyStmt(ScopStmt &Stmt, ValueMapT &GlobalMap,		void RegionGenerator::copyStmt(ScopStmt &Stmt, ValueMapT &GlobalMap,
LoopToScevMapT &LTS) {		LoopToScevMapT &LTS) {
assert(Stmt.isRegionStmt() &&		assert(Stmt.isRegionStmt() &&
"Only region statements can be copied by the block generator");		"Only region statements can be copied by the block generator");

		// Forget all old mappings.
		BlockMap.clear();
		RegionMaps.clear();
		IncompletePHINodeMap.clear();

// The region represented by the statement.		// The region represented by the statement.
Region *R = Stmt.getRegion();		Region *R = Stmt.getRegion();

// The "BBMaps" for the whole region.		// Create a dedicated entry for the region where we can reload all demoted
DenseMap<BasicBlock *, ValueMapT> RegionMaps;		// inputs.
		BasicBlock *EntryBB = R->getEntry();
		BasicBlock *EntryBBCopy =
		SplitBlock(Builder.GetInsertBlock(), Builder.GetInsertPoint(), &DT, &LI);
		EntryBBCopy->setName("polly.stmt." + EntryBB->getName() + ".entry");
		Builder.SetInsertPoint(EntryBBCopy->begin());

// A map from old to new blocks in the region		for (auto PI = pred_begin(EntryBB), PE = pred_end(EntryBB); PI != PE; ++PI)
DenseMap<BasicBlock *, BasicBlock *> BlockMap;		if (!R->contains(*PI))
		BlockMap[*PI] = EntryBBCopy;

// Iterate over all blocks in the region in a breadth-first search.		// Iterate over all blocks in the region in a breadth-first search.
std::deque<BasicBlock *> Blocks;		std::deque<BasicBlock *> Blocks;
SmallPtrSet<BasicBlock *, 8> SeenBlocks;		SmallPtrSet<BasicBlock *, 8> SeenBlocks;
Blocks.push_back(R->getEntry());		Blocks.push_back(EntryBB);
SeenBlocks.insert(R->getEntry());		SeenBlocks.insert(EntryBB);

while (!Blocks.empty()) {		while (!Blocks.empty()) {
BasicBlock *BB = Blocks.front();		BasicBlock *BB = Blocks.front();
Blocks.pop_front();		Blocks.pop_front();

// First split the block and update dominance information.		// First split the block and update dominance information.
BasicBlock *BBCopy = splitBB(BB);		BasicBlock *BBCopy = splitBB(BB);
BasicBlock *BBCopyIDom = repairDominance(BB, BBCopy, BlockMap);		BasicBlock *BBCopyIDom = repairDominance(BB, BBCopy);

		// In order to remap PHI nodes we store also basic block mappings.
		BlockMap[BB] = BBCopy;

// Get the mapping for this block and initialize it with the mapping		// Get the mapping for this block and initialize it with the mapping
// available at its immediate dominator (in the new region).		// available at its immediate dominator (in the new region).
ValueMapT &RegionMap = RegionMaps[BBCopy];		ValueMapT &RegionMap = RegionMaps[BBCopy];
RegionMap = RegionMaps[BBCopyIDom];		RegionMap = RegionMaps[BBCopyIDom];

// Copy the block with the BlockGenerator.		// Copy the block with the BlockGenerator.
copyBB(Stmt, BB, BBCopy, RegionMap, GlobalMap, LTS);		copyBB(Stmt, BB, BBCopy, RegionMap, GlobalMap, LTS);

		// In order to remap PHI nodes we store also basic block mappings.
		BlockMap[BB] = BBCopy;

		// Add values to incomplete PHI nodes waiting for this block to be copied.
		for (const PHINodePairTy &PHINodePair : IncompletePHINodeMap[BB])
		addOperandToPHI(Stmt, PHINodePair.first, PHINodePair.second, BB,
		GlobalMap, LTS);
		IncompletePHINodeMap[BB].clear();

// And continue with new successors inside the region.		// And continue with new successors inside the region.
for (auto SI = succ_begin(BB), SE = succ_end(BB); SI != SE; SI++)		for (auto SI = succ_begin(BB), SE = succ_end(BB); SI != SE; SI++)
if (R->contains(*SI) && SeenBlocks.insert(*SI).second)		if (R->contains(*SI) && SeenBlocks.insert(*SI).second)
Blocks.push_back(*SI);		Blocks.push_back(*SI);

// In order to remap PHI nodes we store also basic block mappings.
BlockMap[BB] = BBCopy;
}		}

// Now create a new dedicated region exit block and add it to the region map.		// Now create a new dedicated region exit block and add it to the region map.
BasicBlock *ExitBBCopy =		BasicBlock *ExitBBCopy =
SplitBlock(Builder.GetInsertBlock(), Builder.GetInsertPoint(), &DT, &LI);		SplitBlock(Builder.GetInsertBlock(), Builder.GetInsertPoint(), &DT, &LI);
ExitBBCopy->setName("polly.stmt." + R->getExit()->getName() + ".as.exit");		ExitBBCopy->setName("polly.stmt." + R->getExit()->getName() + ".exit");
BlockMap[R->getExit()] = ExitBBCopy;		BlockMap[R->getExit()] = ExitBBCopy;

repairDominance(R->getExit(), ExitBBCopy, BlockMap);		repairDominance(R->getExit(), ExitBBCopy);

// As the block generator doesn't handle control flow we need to add the		// As the block generator doesn't handle control flow we need to add the
// region control flow by hand after all blocks have been copied.		// region control flow by hand after all blocks have been copied.
for (BasicBlock *BB : SeenBlocks) {		for (BasicBlock *BB : SeenBlocks) {

BranchInst *BI = cast<BranchInst>(BB->getTerminator());		BranchInst *BI = cast<BranchInst>(BB->getTerminator());

BasicBlock *BBCopy = BlockMap[BB];		BasicBlock *BBCopy = BlockMap[BB];
Instruction *BICopy = BBCopy->getTerminator();		Instruction *BICopy = BBCopy->getTerminator();

ValueMapT &RegionMap = RegionMaps[BBCopy];		ValueMapT &RegionMap = RegionMaps[BBCopy];
RegionMap.insert(BlockMap.begin(), BlockMap.end());		RegionMap.insert(BlockMap.begin(), BlockMap.end());

Builder.SetInsertPoint(BBCopy);		Builder.SetInsertPoint(BBCopy);
copyInstScalar(Stmt, BI, RegionMap, GlobalMap, LTS);		copyInstScalar(Stmt, BI, RegionMap, GlobalMap, LTS);
BICopy->eraseFromParent();		BICopy->eraseFromParent();
}		}

		// Add counting PHI nodes to all loops in the region that can be used as
		// replacement for SCEVs referring to the old loop.
		for (BasicBlock *BB : SeenBlocks) {
		Loop *L = LI.getLoopFor(BB);
		if (L == nullptr || L->getHeader() != BB)
		continue;

		BasicBlock *BBCopy = BlockMap[BB];
		Value *NullVal = Builder.getInt32(0);
		PHINode *LoopPHI =
		PHINode::Create(Builder.getInt32Ty(), 2, "polly.subregion.iv");
		Instruction *LoopPHIInc = BinaryOperator::CreateAdd(
		LoopPHI, Builder.getInt32(1), "polly.subregion.iv.inc");
		LoopPHI->insertBefore(BBCopy->begin());
		LoopPHIInc->insertBefore(BBCopy->getTerminator());

		for (auto *PredBB : make_range(pred_begin(BB), pred_end(BB))) {
		if (!R->contains(PredBB))
		continue;
		if (L->contains(PredBB))
		LoopPHI->addIncoming(LoopPHIInc, BlockMap[PredBB]);
		else
		LoopPHI->addIncoming(NullVal, BlockMap[PredBB]);
		}

		for (auto *PredBBCopy : make_range(pred_begin(BBCopy), pred_end(BBCopy)))
		if (LoopPHI->getBasicBlockIndex(PredBBCopy) < 0)
		LoopPHI->addIncoming(NullVal, PredBBCopy);

		LTS[L] = SE.getUnknown(LoopPHI);
		}

		// Add all mappings from the region to the global map so outside uses will use
		// the copied instructions.
		for (auto &BBMap : RegionMaps)
		GlobalMap.insert(BBMap.second.begin(), BBMap.second.end());

// Reset the old insert point for the build.		// Reset the old insert point for the build.
Builder.SetInsertPoint(ExitBBCopy->begin());		Builder.SetInsertPoint(ExitBBCopy->begin());
}		}
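
The block traversal in `copyStmt` is a breadth-first search that starts at the region entry and never follows an edge out of the region. A minimal standalone sketch of just that traversal, assuming blocks are plain strings and the CFG is a successor map; `regionBFS`, `Block` and `CFG` are hypothetical names and nothing here uses the LLVM API.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <set>
#include <string>
#include <vector>

using Block = std::string;
using CFG = std::map<Block, std::vector<Block>>; // block -> successors

// Visit every block of the region reachable from Entry in breadth-first
// order without leaving the region. The returned order is the order in
// which the blocks would be copied.
std::vector<Block> regionBFS(const CFG &Succs, const std::set<Block> &Region,
                             const Block &Entry) {
  std::vector<Block> Order;
  std::deque<Block> Worklist{Entry};
  std::set<Block> Seen{Entry};
  while (!Worklist.empty()) {
    Block BB = Worklist.front();
    Worklist.pop_front();
    Order.push_back(BB);
    auto It = Succs.find(BB);
    if (It == Succs.end())
      continue;
    // Only enqueue successors inside the region that were not seen before.
    for (const Block &Succ : It->second)
      if (Region.count(Succ) && Seen.insert(Succ).second)
        Worklist.push_back(Succ);
  }
  return Order;
}
```

Because a block can be visited before all of its predecessors (for example a loop header before its latch), PHI operands from not yet copied blocks have to be postponed, which is what `IncompletePHINodeMap` is for in the patch.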

		void RegionGenerator::generateScalarLoads(ScopStmt &Stmt,
		const Instruction *Inst,
		ValueMapT &BBMap) {

		// Inside a non-affine region PHI nodes are copied not demoted. Once the
		// phi is copied it will reload all inputs from outside the region, hence
		// we do not need to generate code for the read access of the operands of a
		// PHI.
		if (isa<PHINode>(Inst))
		return;

		return BlockGenerator::generateScalarLoads(Stmt, Inst, BBMap);
		}

		void RegionGenerator::generateScalarStores(ScopStmt &Stmt, BasicBlock *BB,
		ValueMapT &BBMap,
		ValueMapT &GlobalMap) {
		const Region &R = Stmt.getParent()->getRegion();

		Region *StmtR = Stmt.getRegion();
		assert(StmtR && "Block statements need to use the generateScalarStores() "
		"function in the BlockGenerator");

		BasicBlock *ExitBB = StmtR->getExit();

		// For region statements three kinds of scalar stores exist:
		// (1) A definition used by a non-phi instruction outside the region.
		// (2) A phi-instruction in the region entry.
		// (3) A write to a phi instruction in the region exit.
		// The last case is the tricky one since we do not know anymore which
		// predecessor of the exit needs to store the operand value that doesn't
		// have a definition in the region. Therefore, we have to check in each
		// block in the region if we should store the value or not.

		// Iterate over all accesses in the given statement.
		for (MemoryAccess *MA : Stmt) {

		// Skip non-scalar and read accesses.
		if (!MA->isScalar() || MA->isRead())
		continue;

		Instruction *ScalarBase = cast<Instruction>(MA->getBaseAddr());
		Instruction *ScalarInst = MA->getAccessInstruction();
		PHINode *ScalarBasePHI = dyn_cast<PHINode>(ScalarBase);

		Value *ScalarValue = nullptr;
		AllocaInst *ScalarAddr = nullptr;

		if (!ScalarBasePHI) {
		// Case (1)
		ScalarAddr = getOrCreateAlloca(ScalarBase, ScalarMap, ".s2a");
		ScalarValue = ScalarInst;
		} else if (ScalarBasePHI->getParent() != ExitBB) {
		// Case (2)
		assert(ScalarBasePHI->getParent() == StmtR->getEntry() &&
		"Bad PHI self write in non-affine region");
		assert(ScalarBase == ScalarInst &&
		"Bad PHI self write in non-affine region");
		ScalarAddr = getOrCreateAlloca(ScalarBase, ScalarMap, ".s2a");
		ScalarValue = ScalarInst;
		} else {
		int PHIIdx = ScalarBasePHI->getBasicBlockIndex(BB);
		// Skip accesses we will not handle in this basic block but in another one
		// in the statement region.
		if (PHIIdx < 0)
		continue;

		// Case (3)
		ScalarAddr = getOrCreateAlloca(ScalarBase, PHIOpMap, ".phiops");
		ScalarValue = ScalarBasePHI->getIncomingValue(PHIIdx);
		}

		ScalarValue =
		getNewScalarValue(ScalarValue, R, ScalarMap, BBMap, GlobalMap);
		Builder.CreateStore(ScalarValue, ScalarAddr);
		}
		}
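
The three-way case distinction in `generateScalarStores` can be isolated from the IR plumbing. The sketch below is a simplified standalone model, assuming a scalar write is described by three facts only; `ScalarWrite`, `StoreKind`, `classify` and `slotSuffix` are hypothetical names, and the alloca suffixes are the ones that appear in the patch.

```cpp
#include <cassert>
#include <string>

enum class StoreKind { Definition, EntryPHI, ExitPHIOperand, Skip };

// Hypothetical, simplified description of one scalar write access.
struct ScalarWrite {
  bool BaseIsPHI;            // is the base address a PHI node?
  std::string PHIParent;     // block holding the PHI (unused otherwise)
  bool HasIncomingFromBB;    // does the PHI have an operand for this block?
};

// Mirror the if / else-if / else chain of the patch.
StoreKind classify(const ScalarWrite &W, const std::string &ExitBB) {
  if (!W.BaseIsPHI)
    return StoreKind::Definition; // case (1)
  if (W.PHIParent != ExitBB)
    return StoreKind::EntryPHI; // case (2)
  // case (3): only the predecessor that feeds the exit PHI stores a value.
  return W.HasIncomingFromBB ? StoreKind::ExitPHIOperand : StoreKind::Skip;
}

// Cases (1) and (2) use the scalar slot, case (3) the PHI operand slot.
std::string slotSuffix(StoreKind K) {
  switch (K) {
  case StoreKind::Definition:
  case StoreKind::EntryPHI:
    return ".s2a";
  case StoreKind::ExitPHIOperand:
    return ".phiops";
  case StoreKind::Skip:
    break;
  }
  return "";
}
```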

		void RegionGenerator::addOperandToPHI(ScopStmt &Stmt, const PHINode *PHI,
		PHINode *PHICopy, BasicBlock *IncomingBB,
		ValueMapT &GlobalMap,
		LoopToScevMapT &LTS) {
		Region *StmtR = Stmt.getRegion();

		// If the incoming block was not yet copied mark this PHI as incomplete.
		// Once the block will be copied the incoming value will be added.
		BasicBlock *BBCopy = BlockMap[IncomingBB];
		if (!BBCopy) {
		assert(StmtR->contains(IncomingBB) &&
		"Bad incoming block for PHI in non-affine region");
		IncompletePHINodeMap[IncomingBB].push_back(std::make_pair(PHI, PHICopy));
		return;
		}

		Value *OpCopy = nullptr;
		if (StmtR->contains(IncomingBB)) {
		assert(RegionMaps.count(BBCopy) &&
		"Incoming PHI block did not have a BBMap");
		ValueMapT &BBCopyMap = RegionMaps[BBCopy];

		Value *Op = PHI->getIncomingValueForBlock(IncomingBB);
		OpCopy =
		getNewValue(Stmt, Op, BBCopyMap, GlobalMap, LTS, getLoopForInst(PHI));
		} else {

		if (PHICopy->getBasicBlockIndex(BBCopy) >= 0)
		return;

		AllocaInst *PHIOpAddr =
		getOrCreateAlloca(const_cast<PHINode *>(PHI), PHIOpMap, ".phiops");
		OpCopy = new LoadInst(PHIOpAddr, PHIOpAddr->getName() + ".reload",
		BlockMap[IncomingBB]->getTerminator());
		}

		assert(OpCopy && "Incoming PHI value was not copied properly");
		assert(BBCopy && "Incoming PHI block was not copied properly");
		PHICopy->addIncoming(OpCopy, BBCopy);
		}
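
The interplay between `addOperandToPHI` and `IncompletePHINodeMap` is a deferred-completion pattern: an operand whose incoming block has no copy yet is parked and added once that block is copied. A minimal standalone sketch of the pattern, assuming values and blocks are strings; `PHIModel`, `Copier`, `addOperand` and `blockCopied` are hypothetical names, and the `p_` and `polly.stmt.` prefixes are used only to make copies recognizable.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A copied PHI is modelled as a list of (value, incoming block copy) pairs.
struct PHIModel {
  std::vector<std::pair<std::string, std::string>> Incoming;
};

struct Copier {
  std::map<std::string, std::string> BlockMap; // old block -> copy
  // old block -> PHI operands waiting for that block to be copied
  std::map<std::string, std::vector<std::pair<PHIModel *, std::string>>>
      Incomplete;

  // Add an operand now, or postpone it if the incoming block has no copy.
  void addOperand(PHIModel &PHI, const std::string &Value,
                  const std::string &IncomingBB) {
    auto It = BlockMap.find(IncomingBB);
    if (It == BlockMap.end()) {
      Incomplete[IncomingBB].push_back({&PHI, Value});
      return;
    }
    PHI.Incoming.push_back({"p_" + Value, It->second});
  }

  // Record the copy of BB, then complete every PHI that waited for it.
  void blockCopied(const std::string &BB) {
    BlockMap[BB] = "polly.stmt." + BB;
    auto Waiting = std::move(Incomplete[BB]);
    Incomplete.erase(BB);
    for (auto &Entry : Waiting)
      addOperand(*Entry.first, Entry.second, BB);
  }
};
```

The sketch leaves out the second branch of the patch, where an operand coming from outside the statement region is reloaded from the `.phiops` alloca instead of being remapped.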

		Value *RegionGenerator::copyPHIInstruction(ScopStmt &Stmt, const PHINode *PHI,
		ValueMapT &BBMap,
		ValueMapT &GlobalMap,
		LoopToScevMapT &LTS) {
		unsigned NumIncoming = PHI->getNumIncomingValues();
		PHINode *PHICopy =
		Builder.CreatePHI(PHI->getType(), NumIncoming, "polly." + PHI->getName());
		PHICopy->moveBefore(PHICopy->getParent()->getFirstNonPHI());
		BBMap[PHI] = PHICopy;

		for (unsigned u = 0; u < NumIncoming; u++)
		addOperandToPHI(Stmt, PHI, PHICopy, PHI->getIncomingBlock(u), GlobalMap,
		LTS);
		return PHICopy;
		}

polly/trunk/lib/CodeGen/CodeGeneration.cpp

[125 lines not shown]	bool runOnScop(Scop &S) override {
Builder.SetInsertPoint(SplitBlock->getTerminator());		Builder.SetInsertPoint(SplitBlock->getTerminator());
NodeBuilder.addParameters(S.getContext());		NodeBuilder.addParameters(S.getContext());
Value *RTC = buildRTC(Builder, NodeBuilder.getExprBuilder());		Value *RTC = buildRTC(Builder, NodeBuilder.getExprBuilder());
SplitBlock->getTerminator()->setOperand(0, RTC);		SplitBlock->getTerminator()->setOperand(0, RTC);
Builder.SetInsertPoint(StartBlock->begin());		Builder.SetInsertPoint(StartBlock->begin());

NodeBuilder.create(AstRoot);		NodeBuilder.create(AstRoot);

		NodeBuilder.finalizeSCoP(S);

assert(!verifyGeneratedFunction(S, *EnteringBB->getParent()) &&		assert(!verifyGeneratedFunction(S, *EnteringBB->getParent()) &&
"Verification of generated function failed");		"Verification of generated function failed");
return true;		return true;
}		}

void printScop(raw_ostream &, Scop &) const override {}		void printScop(raw_ostream &, Scop &) const override {}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
[40 lines not shown]

polly/trunk/test/Isl/CodeGen/phi_condition_modeling_1.ll

				; RUN: opt %loadPolly -S -polly-no-early-exit -polly-detect-unprofitable -polly-model-phi-nodes -polly-codegen < %s | FileCheck %s
				;
				; void f(int *A, int c, int N) {
				; int tmp;
				; for (int i = 0; i < N; i++) {
				; if (i > c)
				; tmp = 3;
				; else
				; tmp = 5;
				; A[i] = tmp;
				; }
				; }
				;
				; CHECK-LABEL: bb:
				; CHECK: %tmp.0.phiops = alloca i32
				; CHECK-LABEL: polly.stmt.bb8:
				; CHECK: %tmp.0.phiops.reload = load i32, i32* %tmp.0.phiops
				; CHECK: store i32 %tmp.0.phiops.reload, i32*
				; CHECK-LABEL: polly.stmt.bb6:
				; CHECK: store i32 3, i32* %tmp.0.phiops
				; CHECK-LABEL: polly.stmt.bb7:
				; CHECK: store i32 5, i32* %tmp.0.phiops

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define void @f(i32* %A, i32 %c, i32 %N) {
				bb:
				%tmp = sext i32 %N to i64
				%tmp1 = sext i32 %c to i64
				br label %bb2

				bb2: ; preds = %bb10, %bb
				%indvars.iv = phi i64 [ %indvars.iv.next, %bb10 ], [ 0, %bb ]
				%tmp3 = icmp slt i64 %indvars.iv, %tmp
				br i1 %tmp3, label %bb4, label %bb11

				bb4: ; preds = %bb2
				%tmp5 = icmp sgt i64 %indvars.iv, %tmp1
				br i1 %tmp5, label %bb6, label %bb7

				bb6: ; preds = %bb4
				br label %bb8

				bb7: ; preds = %bb4
				br label %bb8

				bb8: ; preds = %bb7, %bb6
				%tmp.0 = phi i32 [ 3, %bb6 ], [ 5, %bb7 ]
				%tmp9 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv
				store i32 %tmp.0, i32* %tmp9, align 4
				br label %bb10

				bb10: ; preds = %bb8
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %bb2

				bb11: ; preds = %bb2
				ret void
				}
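
Read at the source level, the CHECK lines of this test describe the following transformation: each predecessor of `bb8` stores its PHI operand into a stack slot, and `bb8` reloads the slot instead of using a PHI. The sketch below is a hand-written C++ model of that reading, not compiler output. `f` follows the C source in the test header; `fDemoted` and the variable `tmp_phiops` (standing in for `%tmp.0.phiops`) are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Reference version, following the C source quoted in the test header.
void f(std::vector<int> &A, int c) {
  int N = static_cast<int>(A.size());
  for (int i = 0; i < N; i++) {
    int tmp;
    if (i > c)
      tmp = 3;
    else
      tmp = 5;
    A[i] = tmp;
  }
}

// Demoted version: the PHI in bb8 is replaced by a store in each
// predecessor and a reload in bb8.
void fDemoted(std::vector<int> &A, int c) {
  int tmp_phiops = 0; // models the alloca in the entry block
  int N = static_cast<int>(A.size());
  for (int i = 0; i < N; i++) {
    if (i > c)
      tmp_phiops = 3; // polly.stmt.bb6: store i32 3
    else
      tmp_phiops = 5; // polly.stmt.bb7: store i32 5
    int reload = tmp_phiops; // polly.stmt.bb8: %tmp.0.phiops.reload
    A[i] = reload;
  }
}
```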

polly/trunk/test/Isl/CodeGen/phi_condition_modeling_2.ll

				; RUN: opt %loadPolly -S -polly-no-early-exit -polly-detect-unprofitable -polly-model-phi-nodes -disable-polly-intra-scop-scalar-to-array -polly-codegen < %s | FileCheck %s
				;
				; void f(int *A, int c, int N) {
				; int tmp;
				; for (int i = 0; i < N; i++) {
				; if (i > c)
				; tmp = 3;
				; else
				; tmp = 5;
				; A[i] = tmp;
				; }
				; }
				;
				; CHECK-LABEL: bb:
				; CHECK-DAG: %tmp.0.s2a = alloca i32
				; CHECK-DAG: %tmp.0.phiops = alloca i32
				; CHECK-LABEL: polly.stmt.bb8:
				; CHECK: %tmp.0.phiops.reload = load i32, i32* %tmp.0.phiops
				; CHECK: store i32 %tmp.0.phiops.reload, i32* %tmp.0.s2a
				; CHECK-LABEL: polly.stmt.bb8b:
				; CHECK: %tmp.0.s2a.reload = load i32, i32* %tmp.0.s2a
				; CHECK: store i32 %tmp.0.s2a.reload,
				; CHECK-LABEL: polly.stmt.bb6:
				; CHECK: store i32 3, i32* %tmp.0.phiops
				; CHECK-LABEL: polly.stmt.bb7:
				; CHECK: store i32 5, i32* %tmp.0.phiops

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define void @f(i32* %A, i32 %c, i32 %N) {
				bb:
				%tmp = sext i32 %N to i64
				%tmp1 = sext i32 %c to i64
				br label %bb2

				bb2: ; preds = %bb10, %bb
				%indvars.iv = phi i64 [ %indvars.iv.next, %bb10 ], [ 0, %bb ]
				%tmp3 = icmp slt i64 %indvars.iv, %tmp
				br i1 %tmp3, label %bb4, label %bb11

				bb4: ; preds = %bb2
				%tmp5 = icmp sgt i64 %indvars.iv, %tmp1
				br i1 %tmp5, label %bb6, label %bb7

				bb6: ; preds = %bb4
				br label %bb8

				bb7: ; preds = %bb4
				br label %bb8

				bb8: ; preds = %bb7, %bb6
				%tmp.0 = phi i32 [ 3, %bb6 ], [ 5, %bb7 ]
				br label %bb8b

				bb8b:
				%tmp9 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv
				store i32 %tmp.0, i32* %tmp9, align 4
				br label %bb10

				bb10: ; preds = %bb8
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %bb2

				bb11: ; preds = %bb2
				ret void
				}

polly/trunk/test/Isl/CodeGen/phi_conditional_simple_1.ll

				; RUN: opt %loadPolly -analyze -polly-ast -polly-no-early-exit -polly-detect-unprofitable -polly-model-phi-nodes < %s | FileCheck %s --check-prefix=AST
				; RUN: opt %loadPolly -S -polly-no-early-exit -polly-detect-unprofitable -polly-model-phi-nodes -polly-codegen < %s | FileCheck %s
				;
				; void jd(int *A, int c) {
				; for (int i = 0; i < 1024; i++) {
				; if (c)
				; A[i] = 1;
				; else
				; A[i] = 2;
				; }
				; }

				; AST: for (int c0 = 0; c0 <= 1023; c0 += 1) {
				; AST: if (c <= -1) {
				; AST: Stmt_if_then(c0);
				; AST: } else if (c >= 1) {
				; AST: Stmt_if_then(c0);
				; AST: } else
				; AST: Stmt_if_else(c0);
				; AST: Stmt_if_end(c0);
				; AST: }
				;
				; CHECK-LABEL: entry:
				; CHECK-NEXT: %phi.phiops = alloca i32
				; CHECK-LABEL: polly.stmt.if.end:
				; CHECK-NEXT: %phi.phiops.reload = load i32, i32* %phi.phiops
				; CHECK-NEXT: %scevgep
				; CHECK-NEXT: store i32 %phi.phiops.reload, i32*
				; CHECK-LABEL: polly.stmt.if.then:
				; CHECK-NEXT: store i32 1, i32* %phi.phiops
				; CHECK-NEXT: br label %polly.merge{{[.]?}}
				; CHECK-LABEL: polly.stmt.if.then{{.}}:
				; CHECK-NEXT: store i32 1, i32* %phi.phiops
				; CHECK-NEXT: br label %polly.merge{{[.]?}}
				; CHECK-LABEL: polly.stmt.if.else:
				; CHECK-NEXT: store i32 2, i32* %phi.phiops
				; CHECK-NEXT: br label %polly.merge{{[.]?}}
				;
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define void @jd(i32* %A, i32 %c) {
				entry:
				br label %for.cond

				for.cond:
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.inc ], [ 0, %entry ]
				%exitcond = icmp ne i64 %indvars.iv, 1024
				br i1 %exitcond, label %for.body, label %for.end

				for.body:
				%tobool = icmp eq i32 %c, 0
				br i1 %tobool, label %if.else, label %if.then

				if.then:
				br label %if.end

				if.else:
				br label %if.end

				if.end:
				%phi = phi i32 [ 1, %if.then], [ 2, %if.else ]
				%arrayidx = getelementptr inbounds i32, i32* %A, i64 %indvars.iv
				store i32 %phi, i32* %arrayidx, align 4
				br label %for.inc

				for.inc:
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %for.cond

				for.end:
				ret void
				}

polly/trunk/test/Isl/CodeGen/phi_loop_carried_float.ll

				; RUN: opt %loadPolly -S -polly-no-early-exit -polly-detect-unprofitable -polly-model-phi-nodes -disable-polly-intra-scop-scalar-to-array -polly-codegen < %s | FileCheck %s
				;
				; float f(float *A, int N) {
				; float tmp = 0;
				; for (int i = 0; i < N; i++)
				; tmp += A[i];
				; }
				;
				; CHECK: bb:
				; CHECK-NOT: %tmp7{{[.*]}} = alloca float
				; CHECK-DAG: %tmp.0.s2a = alloca float
				; CHECK-NOT: %tmp7{{[.*]}} = alloca float
				; CHECK-DAG: %tmp.0.phiops = alloca float
				; CHECK-NOT: %tmp7{{[.*]}} = alloca float
				;
				; CHECK: polly.merge_new_and_old:
				; CHECK-NEXT: ret
				;
				; CHECK: polly.start:
				; CHECK-NEXT: store float 0.000000e+00, float* %tmp.0.phiops

				; CHECK: polly.merge:
				; CHECK-NEXT: br label %polly.merge_new_and_old

				; CHECK: polly.stmt.bb1{{[0-9]*}}:
				; CHECK-NEXT: %tmp.0.phiops.reload[[R1:[0-9]*]] = load float, float* %tmp.0.phiops
				; CHECK: store float %tmp.0.phiops.reload[[R1]], float* %tmp.0.s2a

				; CHECK: polly.stmt.bb1{{[0-9]*}}:
				; CHECK-NEXT: %tmp.0.phiops.reload[[R2:[0-9]*]] = load float, float* %tmp.0.phiops
				; CHECK: store float %tmp.0.phiops.reload[[R2]], float* %tmp.0.s2a

				; CHECK: polly.stmt.bb4: ; preds = %polly.then3
				; CHECK: %tmp[[R5:[0-9]*]]_p_scalar_ = load float, float* %scevgep, align 4, !alias.scope !0, !noalias !2
				; CHECK: %tmp.0.s2a.reload[[R3:[0-9]*]] = load float, float* %tmp.0.s2a
				; CHECK: %p_tmp[[R4:[0-9]*]] = fadd float %tmp.0.s2a.reload[[R3]], %tmp[[R5]]_p_scalar_
				; CHECK: store float %p_tmp[[R4]], float* %tmp.0.phiops

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define void @f(float* %A, i32 %N) {
				bb:
				%tmp = sext i32 %N to i64
				br label %bb1

				bb1: ; preds = %bb4, %bb
				%indvars.iv = phi i64 [ %indvars.iv.next, %bb4 ], [ 0, %bb ]
				%tmp.0 = phi float [ 0.000000e+00, %bb ], [ %tmp7, %bb4 ]
				%tmp2 = icmp slt i64 %indvars.iv, %tmp
				br i1 %tmp2, label %bb3, label %bb8

				bb3: ; preds = %bb1
				br label %bb4

				bb4: ; preds = %bb3
				%tmp5 = getelementptr inbounds float, float* %A, i64 %indvars.iv
				%tmp6 = load float, float* %tmp5, align 4
				%tmp7 = fadd float %tmp.0, %tmp6
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %bb1

				bb8: ; preds = %bb1
				br label %exit

				exit:
				ret void
				}

polly/trunk/test/Isl/CodeGen/phi_loop_carried_float_escape.ll

				; RUN: opt %loadPolly -S -polly-no-early-exit -polly-detect-unprofitable -polly-model-phi-nodes -disable-polly-intra-scop-scalar-to-array -polly-codegen < %s | FileCheck %s
				;
				; float f(float *A, int N) {
				; float tmp = 0;
				; for (int i = 0; i < N; i++)
				; tmp += A[i];
				; return tmp;
				; }
				;
				; CHECK: polly.merge_new_and_old:
				; CHECK-NEXT: %tmp.0.merge = phi float [ %tmp.0.final_reload, %polly.merge ], [ %tmp.0, %bb8 ]
				; CHECK-NEXT: ret float %tmp.0.merge
				;
				; CHECK: polly.start:
				; CHECK-NEXT: store float 0.000000e+00, float* %tmp.0.phiops

				; CHECK: polly.merge:
				; CHECK-NEXT: %tmp.0.final_reload = load float, float* %tmp.0.s2a
				; CHECK-NEXT: br label %polly.merge_new_and_old

				; CHECK: polly.stmt.bb1{{[0-9]*}}:
				; CHECK-NEXT: %tmp.0.phiops.reload[[R1:[0-9]*]] = load float, float* %tmp.0.phiops
				; CHECK: store float %tmp.0.phiops.reload[[R1]], float* %tmp.0.s2a

				; CHECK: polly.stmt.bb1{{[0-9]*}}:
				; CHECK-NEXT: %tmp.0.phiops.reload[[R2:[0-9]*]] = load float, float* %tmp.0.phiops
				; CHECK: store float %tmp.0.phiops.reload[[R2]], float* %tmp.0.s2a

				; CHECK: polly.stmt.bb4: ; preds = %polly.then3
				; CHECK: %tmp[[R5:[0-9]*]]_p_scalar_ = load float, float* %scevgep, align 4, !alias.scope !0, !noalias !2
				; CHECK: %tmp.0.s2a.reload[[R3:[0-9]*]] = load float, float* %tmp.0.s2a
				; CHECK: %p_tmp[[R4:[0-9]*]] = fadd float %tmp.0.s2a.reload[[R3]], %tmp[[R5]]_p_scalar_
				; CHECK: store float %p_tmp[[R4]], float* %tmp.0.phiops

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define float @f(float* %A, i32 %N) {
				bb:
				%tmp = sext i32 %N to i64
				br label %bb1

				bb1: ; preds = %bb4, %bb
				%indvars.iv = phi i64 [ %indvars.iv.next, %bb4 ], [ 0, %bb ]
				%tmp.0 = phi float [ 0.000000e+00, %bb ], [ %tmp7, %bb4 ]
				%tmp2 = icmp slt i64 %indvars.iv, %tmp
				br i1 %tmp2, label %bb3, label %bb8

				bb3: ; preds = %bb1
				br label %bb4

				bb4: ; preds = %bb3
				%tmp5 = getelementptr inbounds float, float* %A, i64 %indvars.iv
				%tmp6 = load float, float* %tmp5, align 4
				%tmp7 = fadd float %tmp.0, %tmp6
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %bb1

				bb8: ; preds = %bb1
				br label %exit

				exit:
				ret float %tmp.0
				}
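
For the loop-carried case, the CHECK lines suggest a two-slot scheme: the PHI operands travel through `%tmp.0.phiops`, the PHI's own value lives in `%tmp.0.s2a`, and the value that escapes the loop is reloaded from `%tmp.0.s2a` after the optimized region. The sketch below is a simplified source-level model of that scheme under those assumptions. It does not reproduce the exact block layout of the generated IR, and `sum`, `sumDemoted`, `phiops` and `s2a` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Reference version, following the C source quoted in the test header.
float sum(const std::vector<float> &A) {
  float tmp = 0;
  for (float V : A)
    tmp += V;
  return tmp;
}

// Demoted version using two slots instead of a loop-carried PHI.
float sumDemoted(const std::vector<float> &A) {
  float phiops = 0.0f; // polly.start: store float 0.0 to %tmp.0.phiops
  float s2a = 0.0f;    // models the %tmp.0.s2a alloca
  int N = static_cast<int>(A.size());
  for (int i = 0;; ++i) {
    s2a = phiops; // polly.stmt.bb1: reload phiops, store to s2a
    if (i >= N)
      break;
    phiops = s2a + A[i]; // polly.stmt.bb4: fadd, store to phiops
  }
  return s2a; // polly.merge: %tmp.0.final_reload
}
```

The header copy runs once more than the loop body, so the final reload sees the value of the last iteration, which is what the escaping use after the loop expects.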

polly/trunk/test/Isl/CodeGen/phi_scalar_simple_1.ll

				; RUN: opt %loadPolly -S -polly-detect-unprofitable -polly-model-phi-nodes -disable-polly-intra-scop-scalar-to-array -polly-no-early-exit -polly-codegen < %s | FileCheck %s
				;
				; int jd(int *restrict A, int x, int N) {
				; for (int i = 1; i < N; i++)
				; for (int j = 3; j < N; j++)
				; x += A[i];
				; return x;
				; }
				;
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define i32 @jd(i32* noalias %A, i32 %x, i32 %N) {
				entry:
				; CHECK-LABEL: entry:
				; CHECK-DAG: %x.addr.1.lcssa.s2a = alloca i32
				; CHECK-DAG: %x.addr.1.lcssa.phiops = alloca i32
				; CHECK-DAG: %x.addr.1.s2a = alloca i32
				; CHECK-DAG: %x.addr.1.phiops = alloca i32
				; CHECK-DAG: %x.addr.0.s2a = alloca i32
				; CHECK-DAG: %x.addr.0.phiops = alloca i32
				%tmp = sext i32 %N to i64
				br label %for.cond

				; CHECK-LABEL: polly.merge_new_and_old:
				; CHECK: %x.addr.0.merge = phi i32 [ %x.addr.0.final_reload, %polly.merge ], [ %x.addr.0, %for.cond ]
				; CHECK: ret i32 %x.addr.0.merge

				; CHECK-LABEL: polly.start:
				; CHECK-NEXT: store i32 %x, i32* %x.addr.0.phiops

				; CHECK-LABEL: polly.merge:
				; CHECK: %x.addr.0.final_reload = load i32, i32* %x.addr.0.s2a

				for.cond: ; preds = %for.inc4, %entry
				; CHECK-LABEL: polly.stmt.for.cond{{[0-9]*}}:
				; CHECK: %x.addr.0.phiops.reload[[R1:[0-9]*]] = load i32, i32* %x.addr.0.phiops
				; CHECK: store i32 %x.addr.0.phiops.reload[[R1]], i32* %x.addr.0.s2a
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.inc4 ], [ 1, %entry ]
				%x.addr.0 = phi i32 [ %x, %entry ], [ %x.addr.1.lcssa, %for.inc4 ]
				%cmp = icmp slt i64 %indvars.iv, %tmp
				br i1 %cmp, label %for.body, label %for.end6

				; CHECK-LABEL: polly.stmt.for.cond{{[0-9]*}}:
				; CHECK: %x.addr.0.phiops.reload[[R1:[0-9]*]] = load i32, i32* %x.addr.0.phiops
				; CHECK: store i32 %x.addr.0.phiops.reload[[R1]], i32* %x.addr.0.s2a

				for.body: ; preds = %for.cond
				; CHECK-LABEL: polly.stmt.for.body:
				; CHECK: %x.addr.0.s2a.reload[[R2:[0-9]*]] = load i32, i32* %x.addr.0.s2a
				; CHECK: store i32 %x.addr.0.s2a.reload[[R2]], i32* %x.addr.1.phiops
				br label %for.cond1

				for.end: ; preds = %for.cond1
				; CHECK-LABEL: polly.stmt.for.end:
				; CHECK-NEXT: %x.addr.1.lcssa.phiops.reload = load i32, i32* %x.addr.1.lcssa.phiops
				; CHECK-NEXT: store i32 %x.addr.1.lcssa.phiops.reload, i32* %x.addr.1.lcssa.s2a[[R4:[0-9]*]]
				%x.addr.1.lcssa = phi i32 [ %x.addr.1, %for.cond1 ]
				br label %for.inc4

				for.inc4: ; preds = %for.end
				; CHECK-LABEL: polly.stmt.for.inc4:
				; CHECK: %x.addr.1.lcssa.s2a.reload[[R5:[0-9]*]] = load i32, i32* %x.addr.1.lcssa.s2a[[R4]]
				; CHECK: store i32 %x.addr.1.lcssa.s2a.reload[[R5]], i32* %x.addr.0.phiops
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %for.cond

				for.cond1: ; preds = %for.inc, %for.body
				; CHECK-LABEL: polly.stmt.for.cond1:
				; CHECK: %x.addr.1.phiops.reload = load i32, i32* %x.addr.1.phiops
				; CHECK: store i32 %x.addr.1.phiops.reload, i32* %x.addr.1.s2a[[R6:[0-9]*]]
				; CHECK: store i32 %x.addr.1.phiops.reload, i32* %x.addr.1.lcssa.phiops
				%x.addr.1 = phi i32 [ %x.addr.0, %for.body ], [ %add, %for.inc ]
				%j.0 = phi i32 [ 3, %for.body ], [ %inc, %for.inc ]
				%exitcond = icmp ne i32 %j.0, %N
				br i1 %exitcond, label %for.body3, label %for.end

				for.body3: ; preds = %for.cond1
				br label %for.inc

				for.inc: ; preds = %for.body3
				; CHECK-LABEL: polly.stmt.for.inc:
				; CHECK: %x.addr.1.s2a.reload[[R3:[0-9]*]] = load i32, i32* %x.addr.1.s2a
				; CHECK: %p_add = add nsw i32 %x.addr.1.s2a.reload[[R3]], %tmp1_p_scalar_
				; CHECK: store i32 %p_add, i32* %x.addr.1.phiops
				%arrayidx = getelementptr inbounds i32, i32* %A, i64 %indvars.iv
				%tmp1 = load i32, i32* %arrayidx, align 4
				%add = add nsw i32 %x.addr.1, %tmp1
				%inc = add nsw i32 %j.0, 1
				br label %for.cond1

				for.end6: ; preds = %for.cond
				ret i32 %x.addr.0
				}

polly/trunk/test/Isl/CodeGen/phi_scalar_simple_2.ll

				; RUN: opt %loadPolly -S -polly-detect-unprofitable -polly-model-phi-nodes -disable-polly-intra-scop-scalar-to-array -polly-no-early-exit -polly-codegen < %s | FileCheck %s
				;
				; int jd(int *restrict A, int x, int N, int c) {
				; for (int i = 0; i < N; i++)
				; for (int j = 0; j < N; j++)
				; if (i < c)
				; x += A[i];
				; return x;
				; }
				;
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define i32 @jd(i32* noalias %A, i32 %x, i32 %N, i32 %c) {
				entry:
				; CHECK-LABEL: entry:
				; CHECK-DAG: %x.addr.2.s2a = alloca i32
				; CHECK-DAG: %x.addr.2.phiops = alloca i32
				; CHECK-DAG: %x.addr.1.s2a = alloca i32
				; CHECK-DAG: %x.addr.1.phiops = alloca i32
				; CHECK-DAG: %x.addr.0.s2a = alloca i32
				; CHECK-DAG: %x.addr.0.phiops = alloca i32
				%tmp = sext i32 %N to i64
				%tmp1 = sext i32 %c to i64
				br label %for.cond

				; CHECK-LABEL: polly.merge_new_and_old:
				; CHECK: %x.addr.0.merge = phi i32 [ %x.addr.0.final_reload, %polly.merge ], [ %x.addr.0, %for.cond ]
				; CHECK: ret i32 %x.addr.0.merge

				; CHECK-LABEL: polly.start:
				; CHECK-NEXT: store i32 %x, i32* %x.addr.0.phiops

				; CHECK-LABEL: polly.merge:
				; CHECK: %x.addr.0.final_reload = load i32, i32* %x.addr.0.s2a

				for.cond: ; preds = %for.inc5, %entry
				; CHECK-LABEL: polly.stmt.for.cond{{[0-9]*}}:
				; CHECK: %x.addr.0.phiops.reload[[R1:[0-9]*]] = load i32, i32* %x.addr.0.phiops
				; CHECK: store i32 %x.addr.0.phiops.reload[[R1]], i32* %x.addr.0.s2a
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.inc5 ], [ 0, %entry ]
				%x.addr.0 = phi i32 [ %x, %entry ], [ %x.addr.1, %for.inc5 ]
				%cmp = icmp slt i64 %indvars.iv, %tmp
				br i1 %cmp, label %for.body, label %for.end7

				; CHECK-LABEL: polly.stmt.for.cond{{[0-9]*}}:
				; CHECK: %x.addr.0.phiops.reload[[R1:[0-9]*]] = load i32, i32* %x.addr.0.phiops
				; CHECK: store i32 %x.addr.0.phiops.reload[[R1]], i32* %x.addr.0.s2a

				for.body: ; preds = %for.cond
				; CHECK-LABEL: polly.stmt.for.body:
				; CHECK: %x.addr.0.s2a.reload[[R2:[0-9]*]] = load i32, i32* %x.addr.0.s2a
				; CHECK: store i32 %x.addr.0.s2a.reload[[R2]], i32* %x.addr.1.phiops
				br label %for.cond1

				for.inc5: ; preds = %for.end
				; CHECK-LABEL: polly.stmt.for.inc5:
				; CHECK: %x.addr.1.s2a.reload[[R5:[0-9]*]] = load i32, i32* %x.addr.1.s2a
				; CHECK: store i32 %x.addr.1.s2a.reload[[R5]], i32* %x.addr.0.phiops
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %for.cond

				for.cond1: ; preds = %for.inc, %for.body
				; CHECK-LABEL: polly.stmt.for.cond1:
				; CHECK: %x.addr.1.phiops.reload = load i32, i32* %x.addr.1.phiops
				; CHECK: store i32 %x.addr.1.phiops.reload, i32* %x.addr.1.s2a
				%x.addr.1 = phi i32 [ %x.addr.0, %for.body ], [ %x.addr.2, %for.inc ]
				%j.0 = phi i32 [ 0, %for.body ], [ %inc, %for.inc ]
				%exitcond = icmp ne i32 %j.0, %N
				br i1 %exitcond, label %for.body3, label %for.end

				for.body3: ; preds = %for.cond1
				; CHECK-LABEL: polly.stmt.for.body3:
				; CHECK: %x.addr.1.s2a.reload = load i32, i32* %x.addr.1.s2a
				; CHECK: store i32 %x.addr.1.s2a.reload, i32* %x.addr.2.phiops
				%cmp4 = icmp slt i64 %indvars.iv, %tmp1
				br i1 %cmp4, label %if.then, label %if.end

				if.end: ; preds = %if.then, %for.body3
				; CHECK-LABEL: polly.stmt.if.end:
				; CHECK: %x.addr.2.phiops.reload = load i32, i32* %x.addr.2.phiops
				; CHECK: store i32 %x.addr.2.phiops.reload, i32* %x.addr.2.s2a
				%x.addr.2 = phi i32 [ %add, %if.then ], [ %x.addr.1, %for.body3 ]
				br label %for.inc

				for.inc: ; preds = %if.end
				; CHECK-LABEL: polly.stmt.for.inc:
				; CHECK: %x.addr.2.s2a.reload[[R3:[0-9]*]] = load i32, i32* %x.addr.2.s2a
				; CHECK: store i32 %x.addr.2.s2a.reload[[R3]], i32* %x.addr.1.phiops
				%inc = add nsw i32 %j.0, 1
				br label %for.cond1

				if.then: ; preds = %for.body3
				; CHECK-LABEL: polly.stmt.if.then:
				; CHECK: %x.addr.1.s2a.reload[[R5:[0-9]*]] = load i32, i32* %x.addr.1.s2a
				; CHECK: %p_add = add nsw i32 %x.addr.1.s2a.reload[[R5]], %tmp2_p_scalar_
				; CHECK: store i32 %p_add, i32* %x.addr.2.phiops
				%arrayidx = getelementptr inbounds i32, i32* %A, i64 %indvars.iv
				%tmp2 = load i32, i32* %arrayidx, align 4
				%add = add nsw i32 %x.addr.1, %tmp2
				br label %if.end

				for.end: ; preds = %for.cond1
				br label %for.inc5

				for.end7: ; preds = %for.cond
				ret i32 %x.addr.0
				}
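The CHECK lines in this test follow one pattern: each PHI gets a `.phiops` slot that its predecessors store into, and a `.s2a` slot that the PHI block fills from `.phiops` and that later users reload. The sketch below mimics that pattern in plain C++ so the data flow is easier to follow. It is only an illustration: the loop nest is inferred from the IR above, `jd_ref` and `jd_slots` are hypothetical names, and the local variables merely stand in for the allocas; this is not the code Polly emits.

```cpp
#include <cassert>

// Reference version: the loop nest that the IR of @jd appears to encode.
int jd_ref(const int *A, int x, int N, int c) {
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      if (i < c)
        x += A[i];
  return x;
}

// Same computation with each PHI replaced by two memory slots.
// Comments name the basic block whose CHECK lines each statement mirrors.
int jd_slots(const int *A, int x, int N, int c) {
  int x0_phiops, x0_s2a; // stand-ins for %x.addr.0.phiops / %x.addr.0.s2a
  int x1_phiops, x1_s2a; // stand-ins for %x.addr.1.phiops / %x.addr.1.s2a
  int x2_phiops, x2_s2a; // stand-ins for %x.addr.2.phiops / %x.addr.2.s2a

  x0_phiops = x;                   // polly.start
  for (int i = 0;; i++) {
    x0_s2a = x0_phiops;            // for.cond: PHI %x.addr.0
    if (!(i < N))
      break;
    x1_phiops = x0_s2a;            // for.body
    for (int j = 0;; j++) {
      x1_s2a = x1_phiops;          // for.cond1: PHI %x.addr.1
      if (j == N)
        break;
      x2_phiops = x1_s2a;          // for.body3
      if (i < c)
        x2_phiops = x1_s2a + A[i]; // if.then
      x2_s2a = x2_phiops;          // if.end: PHI %x.addr.2
      x1_phiops = x2_s2a;          // for.inc
    }
    x0_phiops = x1_s2a;            // for.inc5
  }
  return x0_s2a;                   // polly.merge: final reload
}
```

Both functions should return the same value for any non-negative `N`, which is a quick way to convince oneself that writing the incoming value in every predecessor and reading it back in the PHI block preserves the PHI semantics.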

Revision Contents

Diff 26361

polly/trunk/include/polly/CodeGen/BlockGenerators.h

polly/trunk/include/polly/CodeGen/IslNodeBuilder.h

polly/trunk/include/polly/ScopInfo.h

polly/trunk/lib/Analysis/ScopInfo.cpp

polly/trunk/lib/CodeGen/BlockGenerators.cpp

polly/trunk/lib/CodeGen/CodeGeneration.cpp

polly/trunk/test/Isl/CodeGen/phi_condition_modeling_1.ll

polly/trunk/test/Isl/CodeGen/phi_condition_modeling_2.ll

polly/trunk/test/Isl/CodeGen/phi_conditional_simple_1.ll

polly/trunk/test/Isl/CodeGen/phi_loop_carried_float.ll

polly/trunk/test/Isl/CodeGen/phi_loop_carried_float_escape.ll

polly/trunk/test/Isl/CodeGen/phi_scalar_simple_1.ll

polly/trunk/test/Isl/CodeGen/phi_scalar_simple_2.ll