This is an archive of the discontinued LLVM Phabricator instance.


Differential D60056

Hoist/sink malloc/free's in LICM.
Needs Review · Public

Authored by nicholas on Mar 31 2019, 10:19 PM.

Details

Reviewers

asbirlea
george.burgess.iv
reames

Summary

Adds a new method on Instruction that answers the query of whether the given Instruction might allocate or free memory. Unlike builtin info which answers whether they are an allocation/deallocation call, this method indicates whether allocation behaviour is possible.

Adds a new analysis on loops called LoopAllocationInfo. This is tightly tied to the internal implementation of LICM, but is exposed in a public API because the relevant parts of LICM's externals, hoistRegion and sinkRegion, are also a public API.

One potentially surprising aspect of this optimization is that it is correct even if the loop exits in the middle, after the malloc and before the free. As long as there is only one live malloc at a time, there is no visible difference in behaviour. It is sufficient to show that we've found free() calls which cover every possible path to the loop backedge (the other option is that the code had a pre-existing double-free, which is undefined behaviour). The loop may thus exit with the malloc either allocated or freed, and we track which loop exit blocks should have frees added to them and which ones should not.
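A source-level sketch of the shape of this rewrite may help. It is only an illustration: LICM works on LLVM IR, and the names `allocInt`, `sumPerIteration` and `sumHoisted` are hypothetical. The early exit deliberately leaves the allocation live in both versions, mirroring the "exit with the malloc still allocated" case, so no free is added on that path.

```cpp
#include <cstdlib>

// Allocate one int, aborting on failure so the sketch never dereferences null.
static int *allocInt() {
  int *P = static_cast<int *>(std::malloc(sizeof(int)));
  if (!P)
    std::abort();
  return P;
}

// Before: one malloc/free pair per iteration, with an exit between them.
int sumPerIteration(const int *Data, int N, int StopAt) {
  int Sum = 0;
  for (int I = 0; I < N; ++I) {
    int *Tmp = allocInt();
    *Tmp = Data[I];
    if (*Tmp == StopAt)
      return Sum; // exit while the allocation is live (leaks one block)
    Sum += *Tmp;
    std::free(Tmp);
  }
  return Sum; // exit reached with the allocation already freed
}

// After: the allocation is hoisted and the free is sunk to the exit that was
// previously reached in the "freed" state. The mid-loop exit gets no free.
int sumHoisted(const int *Data, int N, int StopAt) {
  int Sum = 0;
  int *Tmp = allocInt(); // hoisted out of the loop
  for (int I = 0; I < N; ++I) {
    *Tmp = Data[I];
    if (*Tmp == StopAt)
      return Sum; // still leaks exactly one block, as before
    Sum += *Tmp;
  }
  std::free(Tmp); // free inserted in this exit block
  return Sum;
}
```

Both functions return the same value for the same inputs, and at most one block is live at any time in either version.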

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 29885
Build 29884: arc lint + arc unit

Event Timeline

nicholas created this revision. Mar 31 2019, 10:19 PM

Herald added a project: Restricted Project. Mar 31 2019, 10:19 PM

Herald added subscribers: llvm-commits, asbirlea, jfb and 2 others.

Harbormaster completed remote builds in B29885: Diff 193042. Mar 31 2019, 10:21 PM

nicholas added reviewers: asbirlea, george.burgess.iv, reames. Mar 31 2019, 10:22 PM

A couple of quick comments.
Would you mind splitting the cleanups and bug fix (which will be straightforward to approve) and rebase this on top of that?

llvm/include/llvm/Analysis/LoopAllocationInfo.h
29	Could you include the comment from the description here? "This analysis is tightly coupled to the internal implementation of the LICM transform."
llvm/test/Transforms/LICM/allocs.ll
2	Could you also add a RUN line with -enable-mssa-loop-dependency?

nicholas added parent revisions: D60084: [NFC] Remove dead parameter "FreeInLoop", fix some typos and trailing whitespace., D60085: Add an optional list of blocks to avoid when looking for a path in isPotentiallyReachable.. Apr 1 2019, 11:55 AM

nicholas updated this revision to Diff 193147. Apr 1 2019, 12:00 PM

nicholas edited the summary of this revision.

nicholas updated this revision to Diff 193152. Apr 1 2019, 12:33 PM

nicholas marked an inline comment as done.

I've only skimmed through the high level comments so far, so if any of this is addressed inline, just say so.

I was wondering why you phrased this as a combination of hoisting and sinking as opposed to promotion. We have the scalar promotion path, and the basic transform you're doing feels a lot like promotion. I suspect that many of the legality aspects will be common.

llvm/include/llvm/IR/Instruction.h
539	These feel like they need a bit more specification. In particular, are there any expectations around how new memory is returned? Is a deallocation routine allowed to free a random pointer read from a global? (i.e. which set of locations are we talking about freeing and allocating?)
llvm/lib/IR/Instruction.cpp
562	Err, using a blacklist here feels really dangerous.

I've only skimmed through the high level comments so far, so if any of this is addressed inline, just say so.

I was wondering why you phrased this as a combination of hoisting and sinking as opposed to promotion. We have the scalar promotion path, and the basic transform you're doing feels a lot like promotion. I suspect that many of the legality aspects will be common.

I didn't frame it as promotion simply because I view promotion as a transform from one thing to another (such as heap to stack, stack to scalar) whereas this optimization was just hoisting out the allocation and sinking the matching deallocations.

The safety checks we need are "is there a path from this malloc back to this malloc without passing through one of the frees we identified?" and "for each exit block, can we see that all paths either pass through one-or-more frees or that all paths do not pass through any frees?". We have no need of MemorySSA or AliasSets like the memory-to-scalar promotion does.
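The first of these checks can be sketched as a reachability query with an exclusion set, in the spirit of the ExclusionSet parameter this patch adds to isPotentiallyReachable. This is a minimal sketch on a toy CFG at block granularity; `ToyCFG`, `reachableAvoiding` and `freesCoverBackedge` are hypothetical names, not LLVM APIs.

```cpp
#include <set>
#include <vector>

// Toy CFG: block I has successors Succs[I]. Hypothetical stand-in for LLVM's CFG.
using ToyCFG = std::vector<std::vector<int>>;

// Returns true if some path of length >= 1 leads from From to To without
// entering any block in Excluded.
bool reachableAvoiding(const ToyCFG &Succs, int From, int To,
                       const std::set<int> &Excluded) {
  std::vector<int> Worklist(Succs[From].begin(), Succs[From].end());
  std::set<int> Visited;
  while (!Worklist.empty()) {
    int BB = Worklist.back();
    Worklist.pop_back();
    if (Excluded.count(BB) || !Visited.insert(BB).second)
      continue; // blocked by a free, or already explored
    if (BB == To)
      return true;
    for (int S : Succs[BB])
      Worklist.push_back(S);
  }
  return false;
}

// The malloc block is a candidate only if it cannot reach itself again
// without first passing through one of its matching free blocks.
bool freesCoverBackedge(const ToyCFG &Succs, int MallocBB,
                        const std::set<int> &FreeBBs) {
  return !reachableAvoiding(Succs, MallocBB, MallocBB, FreeBBs);
}
```

For a diamond-shaped loop body where only one arm frees, the query finds the path through the other arm back to the malloc and rejects the candidate.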

Comparing this code with LICM's promotion does reveal one bug. We might try to insert into catchswitch blocks.

Should I pull it out of hoisting and sinking and into a post-pass like LICM's scalar promotion? Initially I had imagined that it would participate in hoist and sink's queries of whether "are all arguments to this instruction loop invariant (outside the loop)": as LICM hoists instructions iteratively, it would hoist malloc like any other instruction. In practice that couldn't happen because the hoist is deferred until sink time. We do need the malloc hoist to occur after hoistRegion in order to make the malloc size argument invariant. So, maybe there's no reason not to place it after hoist and sink? We currently piggy-back on hoist and sink's linear scans, but doing them again is merely a constant factor. It would eliminate the need to expose LoopAllocationInfo as a public interface, which is a benefit.

llvm/include/llvm/IR/Instruction.h
539	Acknowledged. These exist to help us maintain a C++-language invariant that we may not increase the total amount of heap memory in use, so, malloc+free+malloc may not be transformed into malloc+malloc+free. We simply need to avoid moving memory allocations around each other (though we can sort a continuous block of memory allocations if there's no frees in between). As such, there's no consideration of how the allocated memory would be returned or which memory gets deallocated (similar to how `mayReadOrWriteMemory` doesn't). We also don't really talk about what happens when an intrinsic allocates and deallocates internally, I think ultimately whether the intrinsic wants to be treated as something we can move other allocation/deallocation operations around should be left to the intrinsic author. It may not matter to the Objective-C runtime for instance. (But what about Objective-C++?) I think this will end up getting resolved when we address your comments on llvm/lib/IR/Instruction.cpp:562.
llvm/lib/IR/Instruction.cpp
562	Uhh, you aren't suggesting that I actually list them all, right? The reason I put it here in Instruction instead of hidden away in LICM/LoopAllocationInfo is to make it somewhat clearer that this is a global property that you'll need to update when adding an instruction or intrinsic. How about if we update Intrinsics.td to include a Mallocs and Frees (similar to Throws) IntrinsicProperty? That makes it opt-in when declaring the intrinsic? (Speaking of which, why Throws and not IntrThrows? Should mine be IntrMallocs/IntrFrees or Mallocs/Frees?) Also, can a readnone function malloc? Is there any way we could make this stricter?
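The heap-usage invariant mentioned in the reply on Instruction.h:539 can be made concrete with a small model that tracks the peak number of live bytes over a sequence of heap events. This is an illustrative sketch only; `HeapTrace` and `peakLiveBytes` are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

// One heap event per entry: positive = bytes allocated, negative = bytes freed.
using HeapTrace = std::vector<long>;

// Peak number of live bytes over the whole trace.
long peakLiveBytes(const HeapTrace &Trace) {
  long Live = 0, Peak = 0;
  for (long Delta : Trace) {
    Live += Delta;
    Peak = std::max(Peak, Live);
  }
  return Peak;
}
```

With 64-byte blocks, the trace malloc+free+malloc is `{64, -64, 64}` and peaks at 64 bytes, while the reordered malloc+malloc+free is `{64, 64, -64}` and peaks at 128 bytes, which is the increase the transform must not introduce.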

In D60056#1450884, @nicholas wrote:

Should I pull it out of hoisting and sinking and into a post-pass like LICM's scalar promotion? Initially I had imagined that it would participate in hoist and sink's queries of whether "are all arguments to this instruction loop invariant (outside the loop)": as LICM hoists instructions iteratively, it would hoist malloc like any other instruction. In practice that couldn't happen because the hoist is deferred until sink time.

Er, I'm mistaken. It does, because sinkRegion happens first and doesn't call isLoopInvariant, then hoistRegion happens second and that uses isLoopInvariant. So in theory if you have something like ptrtoint of the malloc (maybe an alignment check?), we could hoist that out of the loop too. So it does interlace with the rest of hoistRegion, and that's important for optimization power. I'll add a test to that effect shortly.

nicholas updated this revision to Diff 193226. Apr 1 2019, 7:46 PM

Fixed creation of invalid IR when the loop exit block in which we need to insert a free call is a catchswitch block.

reames added inline comments. Apr 5 2019, 2:52 PM

llvm/lib/Analysis/LoopAllocationInfo.cpp
93	As discussed offline, bug example: for (int i = 0; i < N; i++) { throw_if(requested_size > TOO_BIG); a = malloc(requested_size); free(a); }
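One reading of this bug example, written out as a sketch: if the allocation is hoisted above a check that may throw, an unwind on the first iteration leaves behind an allocation that the original loop never made. The counter `LiveAllocs` and the helpers `countedMalloc`, `countedFree`, `throwIf`, `original` and `hoistedIncorrectly` are hypothetical and exist only to make the difference observable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>

static int LiveAllocs = 0; // hypothetical counter of outstanding allocations

void *countedMalloc(std::size_t Size) {
  ++LiveAllocs;
  return std::malloc(Size);
}

void countedFree(void *P) {
  --LiveAllocs;
  std::free(P);
}

void throwIf(bool Cond) {
  if (Cond)
    throw std::length_error("requested size too big");
}

// Original loop: the check runs before any allocation is made.
void original(int N, std::size_t Size, std::size_t TooBig) {
  for (int I = 0; I < N; ++I) {
    throwIf(Size > TooBig);
    void *A = countedMalloc(Size);
    countedFree(A);
  }
}

// Incorrectly hoisted: the allocation now precedes the check, so an unwind
// leaks a block (of the oversized request) that the original never allocated.
void hoistedIncorrectly(int N, std::size_t Size, std::size_t TooBig) {
  void *A = countedMalloc(Size);
  for (int I = 0; I < N; ++I)
    throwIf(Size > TooBig);
  countedFree(A);
}
```

When the check never fires the two versions behave identically; the difference appears only on the throwing path.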

Other items discussed offline:

the instruction usage feels too generic with too few uses. I suggested (but will not require) either making them computed properties of an intrinsic, or finding other uses.
my comment about using a blacklist is clearly wrong. You'd suggested using an attribute in Intrinsics.td. I was fine with that, but it felt like possible overkill.

llvm/include/llvm/Analysis/LoopAllocationInfo.h
92	It really looks like you have a named tuple hiding here. AllocationInfoEntry?
llvm/lib/Transforms/Scalar/LICM.cpp
857	As debated offline, naming here is problematic. addDeallocations -> locationsNeedingFrees? newAllocPlacement? toSink -> freesForMalloc (key piece: avoid action names). A comment would help as well. Pull out the cast to CallInst.
860	You might wish to either a) union the source locations, or b) restrict this to non -g See applyMergedLocation on Instruction.

nicholas updated this revision to Diff 194639. Apr 10 2019, 11:21 PM

nicholas marked 9 inline comments as done.

minor comments on interface, LICM part, and tests. I got interrupted and need to come back and finish reviewing the analysis implementation.

llvm/include/llvm/Analysis/LoopAllocationInfo.h
78	either "may ... ?" or "return true if ...". (i.e., make a question a question)
83	Should we assert that CI is a malloc within the specified loop?
107	Not sure what the last part of the comment means. Reword?
llvm/lib/Transforms/Scalar/LICM.cpp
535	What about invokes? (Use callbase)
llvm/test/Transforms/LICM/allocs.ll
568	This test is odd. I don't see anything preventing us from reordering the volatile load and malloc. Was this an attempt to test for a side exit? If so, simply using an unknown call before the malloc would be preferred.

nicholas marked 6 inline comments as done. Apr 11 2019, 12:40 PM

nicholas added inline comments.

llvm/include/llvm/Analysis/LoopAllocationInfo.h
78	Done, used "return true iff".
83	We don't presently store the right member variables to assert that, and I don't really want to add DEBUG-only member variables. The condition on this function is that mayHoist(CI) must be true. I've added an assert that CI is in entries, which is the condition under which this function would have UB.
llvm/lib/Analysis/LoopAllocationInfo.cpp
93	I've updated this to use isGuaranteedToTransferExecutionToSuccessor on the instructions leading up to the allocation. The "throw_if" example couldn't happen because any opaque function might also malloc or free, so the test @test16 uses a volatile load instead.
llvm/lib/Transforms/Scalar/LICM.cpp
497	Rename this to something clearer, such as "ScanForFrees".
535	Done. This loop skips the terminator, so it can't be an invoke. Which is, indeed, a miscompile bug. Fixed above where we initialize ScanForFrees and added @test19. Changed it to CallBase here anyways.
857	Is picking the first one out of toSink/freesForMalloc a problem? The order depends on the use-list ordering. I don't know what LLVM's current rules on that are, I know that there's https://llvm.org/docs/LangRef.html#use-list-order-directives . Is it considered a bug to have an optimization whose result depends on use list order?
857	Is it possible to have an operand bundle on a malloc or free call? In CloneInstructionInExitBlock, LICM updates the funclet operand bundle for the new location in the CFG.
llvm/test/Transforms/LICM/allocs.ll
568	This test shows the difference from using isGuaranteedToTransferExecutionToSuccessor in analyzeLoop. Before that change, we would have hoisted the malloc above the volatile load. The same test written with a function would not have shown any difference simply because there is no way to create a function that might-throw but can't-malloc/free (without adding new attributes to LLVM). It would also just be @test3 again. I think the new behaviour tested for here is correct. Suppose you have a system where out of memory terminates the program, and the volatile store is triggering some external action (moves the robot arm). Suppose the programmer is using volatile to make sure the external system is in the safe state before doing the malloc that may terminate, and wants the malloc to complete before beginning any more operations. The compiler should not reorder these actions.
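The candidate scan implied by these replies, and by the example in the LoopAllocationInfo.h header comment, can be sketched on a toy instruction model. This is an assumption-laden illustration rather than the patch's code: `ToyInst` and `findCandidates` are hypothetical, and `AlwaysFallsThrough` merely stands in for the answer isGuaranteedToTransferExecutionToSuccessor would give.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy instruction model; not LLVM's Instruction class.
struct ToyInst {
  bool IsMalloc;            // a recognized allocation call
  bool SizeIsLoopInvariant; // only meaningful when IsMalloc is true
  bool MayAllocOrFree;      // e.g. an opaque call
  bool AlwaysFallsThrough;  // stand-in for guaranteed transfer to successor
};

// Scan from the top of the loop and return the indices of mallocs that remain
// hoisting candidates. The scan stops at the first non-malloc instruction that
// might allocate, free, or fail to reach its successor.
std::vector<std::size_t> findCandidates(const std::vector<ToyInst> &Insts) {
  std::vector<std::size_t> Candidates;
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    const ToyInst &Inst = Insts[I];
    if (Inst.IsMalloc) {
      if (Inst.SizeIsLoopInvariant)
        Candidates.push_back(I);
      continue; // a malloc itself does not end the scan
    }
    if (Inst.MayAllocOrFree || !Inst.AlwaysFallsThrough)
      break;
  }
  return Candidates;
}
```

On the sequence from the header comment (invariant malloc, variant malloc, invariant malloc, opaque call, invariant malloc) the sketch keeps the first and third mallocs. Modelling the volatile load as an instruction that does not always fall through rejects a malloc that follows it.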

nicholas updated this revision to Diff 194733. Apr 11 2019, 12:40 PM

nicholas marked an inline comment as done.

This update adds a failing test @test21 which demonstrates a miscompile. While we check the block with the free in it for potentially-malloc'ing instructions, we don't examine all the possible paths from where that free was pre-transform to the new location of the free post-transform.

It's not immediately clear to me how to fix that efficiently. It's possible to label each loop block with whether it contains a potentially-malloc'ing instruction, then find all paths from frees to exit blocks which will have frees added to them and do not pass through any backedge or back through the header. We can simplify this path scan, knowing that all exits reachable after the free call are exits where we will insert a free.
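A conservative version of the labelling idea above could look like the following sketch. It is not from the patch: `BlockGraph` and `mallocBetweenFreeAndExit` are hypothetical, the model is block-granular, and it flags any potentially-allocating block reachable from the free before the header or an exit is reached, which is stricter than tracking only the paths that end at an exit.

```cpp
#include <set>
#include <vector>

// Toy CFG: block I has successors Succs[I]. Hypothetical, not LLVM's CFG.
using BlockGraph = std::vector<std::vector<int>>;

// Returns true if a block that may allocate is reachable from FreeBB without
// passing through the loop header or an exit block. MayMalloc[I] labels
// block I as containing a potentially-allocating instruction.
bool mallocBetweenFreeAndExit(const BlockGraph &Succs, int FreeBB, int Header,
                              const std::set<int> &ExitBBs,
                              const std::vector<bool> &MayMalloc) {
  std::vector<int> Worklist(Succs[FreeBB].begin(), Succs[FreeBB].end());
  std::set<int> Visited;
  while (!Worklist.empty()) {
    int BB = Worklist.back();
    Worklist.pop_back();
    // Stop at the header (backedge taken) and at exits (free is re-inserted).
    if (BB == Header || ExitBBs.count(BB) || !Visited.insert(BB).second)
      continue;
    if (MayMalloc[BB])
      return true; // the sunk free would move across this allocation
    for (int S : Succs[BB])
      Worklist.push_back(S);
  }
  return false;
}
```

Each block is visited at most once, so the scan is linear in the size of the loop for each free.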

Assumed inactive, cleaning out review list, please re-add if needed.

Revision Contents

Path	Size
llvm/include/llvm/Analysis/CFG.h	25 lines
llvm/include/llvm/Analysis/LoopAllocationInfo.h	102 lines
llvm/include/llvm/IR/Instruction.h	6 lines
llvm/include/llvm/Transforms/Utils/LoopUtils.h	7 lines
llvm/lib/Analysis/BasicAliasAnalysis.cpp	2 lines
llvm/lib/Analysis/CFG.cpp	72 lines
llvm/lib/Analysis/CMakeLists.txt	1 line
llvm/lib/Analysis/CaptureTracking.cpp	4 lines
llvm/lib/Analysis/LoopAllocationInfo.cpp	235 lines
llvm/lib/CodeGen/DwarfEHPrepare.cpp	2 lines
llvm/lib/IR/Instruction.cpp	40 lines
llvm/lib/Transforms/Scalar/LICM.cpp	86 lines
llvm/test/Transforms/LICM/allocs.ll	300 lines
llvm/unittests/Analysis/CFGTest.cpp	140 lines

Diff 193042

llvm/include/llvm/Analysis/CFG.h

	[41 lines not shown]

	/// Return true if the specified edge is a critical edge. Critical edges are			/// Return true if the specified edge is a critical edge. Critical edges are
	/// edges from a block with multiple successors to a block with multiple			/// edges from a block with multiple successors to a block with multiple
	/// predecessors.			/// predecessors.
	///			///
	bool isCriticalEdge(const Instruction *TI, unsigned SuccNum,			bool isCriticalEdge(const Instruction *TI, unsigned SuccNum,
	bool AllowIdenticalEdges = false);			bool AllowIdenticalEdges = false);

	/// Determine whether instruction 'To' is reachable from 'From',			/// Determine whether instruction 'To' is reachable from 'From', without passing
	/// returning true if uncertain.			/// through any blocks in ExclusionSet, returning true if uncertain.
	///			///
	/// Determine whether there is a path from From to To within a single function.			/// Determine whether there is a path from From to To within a single function.
	/// Returns false only if we can prove that once 'From' has been executed then			/// Returns false only if we can prove that once 'From' has been executed then
	/// 'To' can not be executed. Conservatively returns true.			/// 'To' can not be executed. Conservatively returns true.
	///			///
	/// This function is linear with respect to the number of blocks in the CFG,			/// This function is linear with respect to the number of blocks in the CFG,
	/// walking down successors from From to reach To, with a fixed threshold.			/// walking down successors from From to reach To, with a fixed threshold.
	/// Using DT or LI allows us to answer more quickly. LI reduces the cost of			/// Using DT or LI allows us to answer more quickly. LI reduces the cost of
	/// an entire loop of any number of blocks to be the same as the cost of a			/// an entire loop of any number of blocks to be the same as the cost of a
	/// single block. DT reduces the cost by allowing the search to terminate when			/// single block. DT reduces the cost by allowing the search to terminate when
	/// we find a block that dominates the block containing 'To'. DT is most useful			/// we find a block that dominates the block containing 'To'. DT is most useful
	/// on branchy code but not loops, and LI is most useful on code with loops but			/// on branchy code but not loops, and LI is most useful on code with loops but
	/// does not help on branchy code outside loops.			/// does not help on branchy code outside loops.
	bool isPotentiallyReachable(const Instruction *From, const Instruction *To,			bool isPotentiallyReachable(
	const DominatorTree *DT = nullptr,			const Instruction *From, const Instruction *To,
	const LoopInfo *LI = nullptr);			const SmallPtrSetImpl<BasicBlock *> *ExclusionSet = nullptr,
				const DominatorTree *DT = nullptr, const LoopInfo *LI = nullptr);

	/// Determine whether block 'To' is reachable from 'From', returning			/// Determine whether block 'To' is reachable from 'From', returning
	/// true if uncertain.			/// true if uncertain.
	///			///
	/// Determine whether there is a path from From to To within a single function.			/// Determine whether there is a path from From to To within a single function.
	/// Returns false only if we can prove that once 'From' has been reached then			/// Returns false only if we can prove that once 'From' has been reached then
	/// 'To' can not be executed. Conservatively returns true.			/// 'To' can not be executed. Conservatively returns true.
	bool isPotentiallyReachable(const BasicBlock *From, const BasicBlock *To,			bool isPotentiallyReachable(const BasicBlock *From, const BasicBlock *To,
	const DominatorTree *DT = nullptr,			const DominatorTree *DT = nullptr,
	const LoopInfo *LI = nullptr);			const LoopInfo *LI = nullptr);

	/// Determine whether there is at least one path from a block in			/// Determine whether there is at least one path from a block in
	/// 'Worklist' to 'StopBB', returning true if uncertain.			/// 'Worklist' to 'StopBB', returning true if uncertain.
	///			///
	/// Determine whether there is a path from at least one block in Worklist to			/// Determine whether there is a path from at least one block in Worklist to
	/// StopBB within a single function. Returns false only if we can prove that			/// StopBB within a single function. Returns false only if we can prove that
	/// once any block in 'Worklist' has been reached then 'StopBB' can not be			/// once any block in 'Worklist' has been reached then 'StopBB' can not be
	/// executed. Conservatively returns true.			/// executed. Conservatively returns true.
	bool isPotentiallyReachableFromMany(SmallVectorImpl<BasicBlock *> &Worklist,			bool isPotentiallyReachableFromMany(SmallVectorImpl<BasicBlock *> &Worklist,
	BasicBlock *StopBB,			BasicBlock *StopBB,
	const DominatorTree *DT = nullptr,			const DominatorTree *DT = nullptr,
	const LoopInfo *LI = nullptr);			const LoopInfo *LI = nullptr);

				/// Determine whether there is at least one path from a block in
				/// 'Worklist' to 'StopBB' without passing through any blocks in
				/// 'ExclusionSet', returning true if uncertain.
				///
				/// Determine whether there is a path from at least one block in Worklist to
				/// StopBB within a single function without passing through any of the blocks
				/// in 'ExclusionSet'. Returns false only if we can prove that once any block
				/// in 'Worklist' has been reached then 'StopBB' can not be executed.
				/// Conservatively returns true.
				bool isPotentiallyReachableFromMany(
				SmallVectorImpl<BasicBlock *> &Worklist, BasicBlock *StopBB,
				const SmallPtrSetImpl<BasicBlock *> *ExclusionSet,
				const DominatorTree *DT = nullptr, const LoopInfo *LI = nullptr);

	/// Return true if the control flow in \p RPOTraversal is irreducible.			/// Return true if the control flow in \p RPOTraversal is irreducible.
	///			///
	/// This is a generic implementation to detect CFG irreducibility based on loop			/// This is a generic implementation to detect CFG irreducibility based on loop
	/// info analysis. It can be used for any kind of CFG (Loop, MachineLoop,			/// info analysis. It can be used for any kind of CFG (Loop, MachineLoop,
	/// Function, MachineFunction, etc.) by providing an RPO traversal (\p			/// Function, MachineFunction, etc.) by providing an RPO traversal (\p
	/// RPOTraversal) and the loop info analysis (\p LI) of the CFG. This utility			/// RPOTraversal) and the loop info analysis (\p LI) of the CFG. This utility
	/// function is only recommended when loop info analysis is available. If loop			/// function is only recommended when loop info analysis is available. If loop
	/// info analysis isn't available, please, don't compute it explicitly for this			/// info analysis isn't available, please, don't compute it explicitly for this
	[61 lines not shown]

llvm/include/llvm/Analysis/LoopAllocationInfo.h

This file was added.

				//===- LoopAllocationInfo.h - memory allocations inside loops ---*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_ANALYSIS_LOOPALLOCATIONINFO_H
				#define LLVM_ANALYSIS_LOOPALLOCATIONINFO_H

				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/DenseSet.h"
				#include "llvm/ADT/SmallVector.h"

				namespace llvm {

				class BasicBlock;
				class CallInst;
				class DominatorTree;
				class Loop;
				class LoopInfo;
				class TargetLibraryInfo;

				/// Track ptr = allocfn to one or more freefn(ptr) relationships within the
				/// loop.
				///
				/// This analysis stores information about allocation and deallocation calls
				/// where the pointer is only live within the loop.
				///
				/// An allocation must be both hoisted and sunk, or neither. LICM performs
				/// sinking first, then hoisting. As sinking and hoisting is performed, more
				/// instructions will appear to be loop invariant, making more opportunities.
				/// For this reason we split the analysis into two pieces:
				/// 1. for a given pointer that is only live in the loop, retain a list of its
				/// matching deallocating instructions.
				/// 2. during sinking, if we would sink the deallocation, don't, but inform
				/// the LoopAllocationInfo. If all deallocations would have been sunk, and
				/// hoistSink would hoist the allocation, perform both the hoist and sink
				/// the various deallocation instructions.
				///
				/// We don't need guaranteed execution of any deallocation function during
				/// unwinding. Before this transformation, an unwind mid-loop would leak one
				/// pointer that was allocated at the start of the loop. After this
				/// transformation, it would leak the one pointer allocated before the loop
				/// began. There is no difference visible to the caller.
				///
				/// We do have to ensure that the maximum amount of allocated memory is not
				/// raised by this transformation. To ensure this, we stop looking for
				/// allocations after seeing an opaque function call. For example:
				/// loop_top:
				/// %ptr1 = call i8* @malloc(i32 %loop_invariant1) ;; candidate
				/// %ptr2 = call i8* @malloc(i32 %loop_variant2) ;; not candidate
				/// %ptr3 = call i8* @malloc(i32 %loop_invariant3) ;; candidate
				/// call void @opaque() ;; stop scanning
				/// %ptr5 = call i8* @malloc(i32 %loop_invariant5) ;; not candidate
				///
				/// The same idea applies in reverse order to @free calls at the loop exits:
				/// loop_exiting:
				/// call void @free(%ptr1) ;; not candidate
				/// call void @opaque() ;; stop scanning
				/// call void @free(%ptr2) ;; %ptr2 is candidate
				/// ;; start here and scan upwards
				class LoopAllocationInfo {
				public:
				void analyzeLoop(LoopInfo *LI, DominatorTree *DT, TargetLibraryInfo *TLI,
				Loop *CurLoop);

				// sinkRegion has determined that this deallocation call may be sunk.
				void addSafeToSink(const CallInst *CI);

				// hoistRegion would like to know whether the given allocation call may be
				// hoisted.
				bool mayHoist(const CallInst *CI) const;

				// Retrieve the list of free calls that should be removed from the loop.
				const SmallVector<CallInst *, 16> &toSink(const CallInst *CI);

				// Retrieve the list of loop exit blocks which must have a free inserted.
				// This is a subset or equal to the loops exit blocks.
				const SmallVector<BasicBlock *, 16> &addDeallocations(const CallInst *CI);

				// Returns whether the loop has any potentially-hoistable allocations.
				bool empty() const { return Allocations.empty(); }

				private:
				// The n-th index into these vectors finds the matching allocation and
				// deallocations.
				SmallVector<const CallInst *, 4> Allocations;
				SmallVector<SmallVector<CallInst *, 16>, 4> Deallocations;
				SmallVector<SmallVector<BasicBlock *, 16>, 4> ExitBlocksToAddDeallocationsTo;

				// Look up the index into the above vector given a call.
				DenseMap<const CallInst *, size_t> AllocationLookup;
				DenseMap<const CallInst *, size_t> DeallocationLookup;

				DenseSet<const CallInst *> SafeToSink;
				};

				} // namespace llvm

				#endif

llvm/include/llvm/IR/Instruction.h

[530 lines not shown]
public:
/// Return true if this instruction may read memory.		/// Return true if this instruction may read memory.
bool mayReadFromMemory() const;		bool mayReadFromMemory() const;

/// Return true if this instruction may read or write memory.		/// Return true if this instruction may read or write memory.
bool mayReadOrWriteMemory() const {		bool mayReadOrWriteMemory() const {
return mayReadFromMemory() \|\| mayWriteToMemory();		return mayReadFromMemory() \|\| mayWriteToMemory();
}		}

		/// Return true if this instruction may allocate heap memory.
		bool mayAllocateMemory() const;

		/// Return true if this instruction may deallocate heap memory.
		bool mayDeallocateMemory() const;

/// Return true if this instruction has an AtomicOrdering of unordered or		/// Return true if this instruction has an AtomicOrdering of unordered or
/// higher.		/// higher.
bool isAtomic() const;		bool isAtomic() const;

/// Return true if this atomic instruction loads from memory.		/// Return true if this atomic instruction loads from memory.
bool hasAtomicLoad() const;		bool hasAtomicLoad() const;

/// Return true if this atomic instruction stores to memory.		/// Return true if this atomic instruction stores to memory.
[232 lines not shown]

llvm/include/llvm/Transforms/Utils/LoopUtils.h

	[33 lines not shown]

	namespace llvm {			namespace llvm {

	class AliasSet;			class AliasSet;
	class AliasSetTracker;			class AliasSetTracker;
	class BasicBlock;			class BasicBlock;
	class DataLayout;			class DataLayout;
	class Loop;			class Loop;
				class LoopAllocationInfo;
	class LoopInfo;			class LoopInfo;
	class MemoryAccess;			class MemoryAccess;
	class MemorySSAUpdater;			class MemorySSAUpdater;
	class OptimizationRemarkEmitter;			class OptimizationRemarkEmitter;
	class PredicatedScalarEvolution;			class PredicatedScalarEvolution;
	class PredIteratorCache;			class PredIteratorCache;
	class ScalarEvolution;			class ScalarEvolution;
	class SCEV;			class SCEV;
	[57 lines not shown]
	/// uses before definitions, allowing us to sink a loop body in one pass without			/// uses before definitions, allowing us to sink a loop body in one pass without
	/// iteration. Takes DomTreeNode, AliasAnalysis, LoopInfo, DominatorTree,			/// iteration. Takes DomTreeNode, AliasAnalysis, LoopInfo, DominatorTree,
	/// DataLayout, TargetLibraryInfo, Loop, AliasSet information for all			/// DataLayout, TargetLibraryInfo, Loop, AliasSet information for all
	/// instructions of the loop and loop safety information as			/// instructions of the loop and loop safety information as
	/// arguments. Diagnostics is emitted via \p ORE. It returns changed status.			/// arguments. Diagnostics is emitted via \p ORE. It returns changed status.
	bool sinkRegion(DomTreeNode *, AliasAnalysis *, LoopInfo *, DominatorTree *,			bool sinkRegion(DomTreeNode *, AliasAnalysis *, LoopInfo *, DominatorTree *,
	TargetLibraryInfo *, TargetTransformInfo *, Loop *,			TargetLibraryInfo *, TargetTransformInfo *, Loop *,
	AliasSetTracker *, MemorySSAUpdater *, ICFLoopSafetyInfo *,			AliasSetTracker *, MemorySSAUpdater *, ICFLoopSafetyInfo *,
	bool, int &, OptimizationRemarkEmitter *);			LoopAllocationInfo *, bool, int &, OptimizationRemarkEmitter *);

	/// Walk the specified region of the CFG (defined by all blocks			/// Walk the specified region of the CFG (defined by all blocks
	/// dominated by the specified block, and that are in the current loop) in depth			/// dominated by the specified block, and that are in the current loop) in depth
	/// first order w.r.t the DominatorTree. This allows us to visit definitions			/// first order w.r.t the DominatorTree. This allows us to visit definitions
	/// before uses, allowing us to hoist a loop body in one pass without iteration.			/// before uses, allowing us to hoist a loop body in one pass without iteration.
	/// Takes DomTreeNode, AliasAnalysis, LoopInfo, DominatorTree, DataLayout,			/// Takes DomTreeNode, AliasAnalysis, LoopInfo, DominatorTree, DataLayout,
	/// TargetLibraryInfo, Loop, AliasSet information for all instructions of the			/// TargetLibraryInfo, Loop, AliasSet information for all instructions of the
	/// loop and loop safety information as arguments. Diagnostics is emitted via \p			/// loop and loop safety information as arguments. Diagnostics is emitted via \p
	/// ORE. It returns changed status.			/// ORE. It returns changed status.
	bool hoistRegion(DomTreeNode *, AliasAnalysis *, LoopInfo *, DominatorTree *,			bool hoistRegion(DomTreeNode *, AliasAnalysis *, LoopInfo *, DominatorTree *,
	TargetLibraryInfo *, Loop *, AliasSetTracker *,			TargetLibraryInfo *, Loop *, AliasSetTracker *,
	MemorySSAUpdater *, ICFLoopSafetyInfo *, bool, int &,			MemorySSAUpdater *, ICFLoopSafetyInfo *, LoopAllocationInfo *,
	OptimizationRemarkEmitter *);			bool, int &, OptimizationRemarkEmitter *);

	/// This function deletes dead loops. The caller of this function needs to			/// This function deletes dead loops. The caller of this function needs to
	/// guarantee that the loop is in fact dead.			/// guarantee that the loop is in fact dead.
	/// The function requires a bunch of prerequisites to be present:			/// The function requires a bunch of prerequisites to be present:
	/// - The loop needs to be in LCSSA form			/// - The loop needs to be in LCSSA form
	/// - The loop needs to have a Preheader			/// - The loop needs to have a Preheader
	/// - A unique dedicated exit block must exist			/// - A unique dedicated exit block must exist
	///			///

llvm/lib/Analysis/BasicAliasAnalysis.cpp

	bool BasicAAResult::isValueEqualInPotentialCycles(const Value *V,

if (VisitedPhiBBs.size() > MaxNumPhiBBsValueReachabilityCheck)		if (VisitedPhiBBs.size() > MaxNumPhiBBsValueReachabilityCheck)
return false;		return false;

// Make sure that the visited phis cannot reach the Value. This ensures that		// Make sure that the visited phis cannot reach the Value. This ensures that
// the Values cannot come from different iterations of a potential cycle the		// the Values cannot come from different iterations of a potential cycle the
// phi nodes could be involved in.		// phi nodes could be involved in.
for (auto *P : VisitedPhiBBs)		for (auto *P : VisitedPhiBBs)
if (isPotentiallyReachable(&P->front(), Inst, DT, LI))		if (isPotentiallyReachable(&P->front(), Inst, nullptr, DT, LI))
return false;		return false;

return true;		return true;
}		}

/// Computes the symbolic difference between two de-composed GEPs.		/// Computes the symbolic difference between two de-composed GEPs.
///		///
/// Dest and Src are the variable indices from two decomposed GetElementPtr		/// Dest and Src are the variable indices from two decomposed GetElementPtr

llvm/lib/Analysis/CFG.cpp

//===-- CFG.cpp - BasicBlock analysis --------------------------------------==//		//===-- CFG.cpp - BasicBlock analysis --------------------------------------==//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This family of functions performs analyses on basic blocks, and instructions		// This family of functions performs analyses on basic blocks, and instructions
// contained within basic blocks.		// contained within basic blocks.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Analysis/CFG.h"		#include "llvm/Analysis/CFG.h"
		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallSet.h"		#include "llvm/ADT/SmallSet.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"

using namespace llvm;		using namespace llvm;

/// FindFunctionBackedges - Analyze the specified function to find all of the		/// FindFunctionBackedges - Analyze the specified function to find all of the
/// loop backedges in the function and return them. This is a relatively cheap		/// loop backedges in the function and return them. This is a relatively cheap
	static const Loop *getOutermostLoop(const LoopInfo *LI, const BasicBlock *BB) {
const Loop *L = LI->getLoopFor(BB);		const Loop *L = LI->getLoopFor(BB);
if (L) {		if (L) {
while (const Loop *Parent = L->getParentLoop())		while (const Loop *Parent = L->getParentLoop())
L = Parent;		L = Parent;
}		}
return L;		return L;
}		}

// True if there is a loop which contains both BB1 and BB2.
static bool loopContainsBoth(const LoopInfo *LI,
const BasicBlock *BB1, const BasicBlock *BB2) {
const Loop *L1 = getOutermostLoop(LI, BB1);
const Loop *L2 = getOutermostLoop(LI, BB2);
return L1 != nullptr && L1 == L2;
}

bool llvm::isPotentiallyReachableFromMany(		bool llvm::isPotentiallyReachableFromMany(
SmallVectorImpl<BasicBlock *> &Worklist, BasicBlock *StopBB,		SmallVectorImpl<BasicBlock *> &Worklist, BasicBlock *StopBB,
const DominatorTree *DT, const LoopInfo *LI) {		const SmallPtrSetImpl<BasicBlock *> *ExclusionSet, const DominatorTree *DT,
		const LoopInfo *LI) {
// When the stop block is unreachable, it's dominated from everywhere,		// When the stop block is unreachable, it's dominated from everywhere,
// regardless of whether there's a path between the two blocks.		// regardless of whether there's a path between the two blocks.
if (DT && !DT->isReachableFromEntry(StopBB))		if (DT && !DT->isReachableFromEntry(StopBB))
DT = nullptr;		DT = nullptr;

		// We can't skip directly from a block that dominates the stop block if the
		// exclusion block is potentially in between.
		if (ExclusionSet && !ExclusionSet->empty())
		DT = nullptr;

		// Normally any block in a loop is reachable from any other block in a loop,
		// however excluded blocks might partition the body of a loop to make that
		// untrue.
		SmallPtrSet<const Loop *, 8> LoopsWithHoles;
		if (LI && ExclusionSet) {
		for (auto BB : *ExclusionSet) {
		if (const Loop *L = getOutermostLoop(LI, BB))
		LoopsWithHoles.insert(L);
		}
		}

		const Loop *StopLoop = LI ? getOutermostLoop(LI, StopBB) : nullptr;

// Limit the number of blocks we visit. The goal is to avoid run-away compile		// Limit the number of blocks we visit. The goal is to avoid run-away compile
// times on large CFGs without hampering sensible code. Arbitrarily chosen.		// times on large CFGs without hampering sensible code. Arbitrarily chosen.
unsigned Limit = 32;		unsigned Limit = 32;
SmallPtrSet<const BasicBlock*, 32> Visited;		SmallPtrSet<const BasicBlock*, 32> Visited;
do {		do {
BasicBlock *BB = Worklist.pop_back_val();		BasicBlock *BB = Worklist.pop_back_val();
if (!Visited.insert(BB).second)		if (!Visited.insert(BB).second)
continue;		continue;
if (BB == StopBB)		if (BB == StopBB)
return true;		return true;
		if (ExclusionSet && ExclusionSet->count(BB))
		continue;
if (DT && DT->dominates(BB, StopBB))		if (DT && DT->dominates(BB, StopBB))
return true;		return true;
if (LI && loopContainsBoth(LI, BB, StopBB))
		const Loop *Outer = nullptr;
		if (LI) {
		Outer = getOutermostLoop(LI, BB);
		if (LoopsWithHoles.count(Outer))
		Outer = nullptr;
		if (StopLoop && Outer == StopLoop)
return true;		return true;
		}

if (!--Limit) {		if (!--Limit) {
// We haven't been able to prove it one way or the other. Conservatively		// We haven't been able to prove it one way or the other. Conservatively
// answer true -- that there is potentially a path.		// answer true -- that there is potentially a path.
return true;		return true;
}		}

if (const Loop *Outer = LI ? getOutermostLoop(LI, BB) : nullptr) {		if (Outer) {
// All blocks in a single loop are reachable from all other blocks. From		// All blocks in a single loop are reachable from all other blocks. From
// any of these blocks, we can skip directly to the exits of the loop,		// any of these blocks, we can skip directly to the exits of the loop,
// ignoring any other blocks inside the loop body.		// ignoring any other blocks inside the loop body.
Outer->getExitBlocks(Worklist);		Outer->getExitBlocks(Worklist);
} else {		} else {
Worklist.append(succ_begin(BB), succ_end(BB));		Worklist.append(succ_begin(BB), succ_end(BB));
}		}
} while (!Worklist.empty());		} while (!Worklist.empty());

// We have exhausted all possible paths and are certain that 'To' can not be		// We have exhausted all possible paths and are certain that 'To' can not be
// reached from 'From'.		// reached from 'From'.
return false;		return false;
}		}

bool llvm::isPotentiallyReachable(const BasicBlock *A, const BasicBlock *B,		bool llvm::isPotentiallyReachable(const BasicBlock *A, const BasicBlock *B,
const DominatorTree *DT, const LoopInfo *LI) {		const DominatorTree *DT, const LoopInfo *LI) {
assert(A->getParent() == B->getParent() &&		assert(A->getParent() == B->getParent() &&
"This analysis is function-local!");		"This analysis is function-local!");

SmallVector<BasicBlock*, 32> Worklist;		SmallVector<BasicBlock*, 32> Worklist;
Worklist.push_back(const_cast<BasicBlock*>(A));		Worklist.push_back(const_cast<BasicBlock*>(A));

return isPotentiallyReachableFromMany(Worklist, const_cast<BasicBlock *>(B),		return isPotentiallyReachableFromMany(Worklist, const_cast<BasicBlock *>(B),
DT, LI);		nullptr, DT, LI);
}		}

bool llvm::isPotentiallyReachable(const Instruction *A, const Instruction *B,		bool llvm::isPotentiallyReachable(
const DominatorTree *DT, const LoopInfo *LI) {		const Instruction *A, const Instruction *B,
		const SmallPtrSetImpl<BasicBlock *> *ExclusionSet, const DominatorTree *DT,
		const LoopInfo *LI) {
assert(A->getParent()->getParent() == B->getParent()->getParent() &&		assert(A->getParent()->getParent() == B->getParent()->getParent() &&
"This analysis is function-local!");		"This analysis is function-local!");

SmallVector<BasicBlock*, 32> Worklist;		SmallVector<BasicBlock*, 32> Worklist;

if (A->getParent() == B->getParent()) {		if (A->getParent() == B->getParent()) {
// The same block case is special because it's the only time we're looking		// The same block case is special because it's the only time we're looking
// within a single block to see which instruction comes first. Once we		// within a single block to see which instruction comes first. Once we
[25 lines not shown]
if (Worklist.empty()) {		if (Worklist.empty()) {
// We've proven that there's no path!		// We've proven that there's no path!
return false;		return false;
}		}
} else {		} else {
Worklist.push_back(const_cast<BasicBlock*>(A->getParent()));		Worklist.push_back(const_cast<BasicBlock*>(A->getParent()));
}		}

if (A->getParent() == &A->getParent()->getParent()->getEntryBlock())		if (DT) {
		if (DT->isReachableFromEntry(A->getParent()) !=
		DT->isReachableFromEntry(B->getParent()))
		return false;
		if (!ExclusionSet || ExclusionSet->empty()) {
		if (A->getParent() == &A->getParent()->getParent()->getEntryBlock() &&
		DT->isReachableFromEntry(B->getParent()))
return true;		return true;
if (B->getParent() == &A->getParent()->getParent()->getEntryBlock())		if (B->getParent() == &A->getParent()->getParent()->getEntryBlock() &&
		DT->isReachableFromEntry(A->getParent()))
return false;		return false;
		}
		}

return isPotentiallyReachableFromMany(		return isPotentiallyReachableFromMany(
Worklist, const_cast<BasicBlock *>(B->getParent()), DT, LI);		Worklist, const_cast<BasicBlock *>(B->getParent()), ExclusionSet, DT, LI);
}		}
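The effect of the new exclusion-set parameter can be illustrated on a toy graph. The following is a minimal sketch under stated assumptions, not the patch's implementation: `BlockId`, `Cfg`, and `potentiallyReachable` are hypothetical names, blocks are plain integers, and the dominator-tree and loop shortcuts are omitted. It keeps the two behaviours visible in the diff above: an excluded block stops the search along that path, and hitting the visit limit conservatively answers "reachable".

```cpp
#include <cassert>
#include <set>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for basic blocks and a CFG; not LLVM types.
using BlockId = int;
using Cfg = std::unordered_map<BlockId, std::vector<BlockId>>;

// Returns true if Stop may be reachable from any block in Worklist without
// passing through a block in Exclusion. As in the diff, the stop-block test
// comes before the exclusion test, and exhausting Limit answers true.
bool potentiallyReachable(const Cfg &G, std::vector<BlockId> Worklist,
                          BlockId Stop,
                          const std::set<BlockId> *Exclusion = nullptr,
                          unsigned Limit = 32) {
  assert(Limit > 0 && "visit limit must be positive");
  std::set<BlockId> Visited;
  while (!Worklist.empty()) {
    BlockId BB = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(BB).second)
      continue; // already explored
    if (BB == Stop)
      return true;
    if (Exclusion && Exclusion->count(BB))
      continue; // paths through excluded blocks do not count
    if (--Limit == 0)
      return true; // could not prove it either way; be conservative
    auto It = G.find(BB);
    if (It != G.end())
      Worklist.insert(Worklist.end(), It->second.begin(), It->second.end());
  }
  return false; // every path was explored and none reached Stop
}
```

On a diamond-shaped graph (0 branches to 1 and 2, both of which reach 3), excluding only one arm still leaves 3 reachable, excluding both arms makes it unreachable, and a visit limit of 1 falls back to the conservative answer.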

llvm/lib/Analysis/CMakeLists.txt

	add_llvm_library(LLVMAnalysis
LazyBranchProbabilityInfo.cpp		LazyBranchProbabilityInfo.cpp
LazyBlockFrequencyInfo.cpp		LazyBlockFrequencyInfo.cpp
LazyCallGraph.cpp		LazyCallGraph.cpp
LazyValueInfo.cpp		LazyValueInfo.cpp
LegacyDivergenceAnalysis.cpp		LegacyDivergenceAnalysis.cpp
Lint.cpp		Lint.cpp
Loads.cpp		Loads.cpp
LoopAccessAnalysis.cpp		LoopAccessAnalysis.cpp
		LoopAllocationInfo.cpp
LoopAnalysisManager.cpp		LoopAnalysisManager.cpp
LoopUnrollAnalyzer.cpp		LoopUnrollAnalyzer.cpp
LoopInfo.cpp		LoopInfo.cpp
LoopPass.cpp		LoopPass.cpp
MemDepPrinter.cpp		MemDepPrinter.cpp
MemDerefPrinter.cpp		MemDerefPrinter.cpp
MemoryBuiltins.cpp		MemoryBuiltins.cpp
MemoryDependenceAnalysis.cpp		MemoryDependenceAnalysis.cpp

llvm/lib/Analysis/CaptureTracking.cpp

	bool isSafeToPrune(Instruction *I) {
// (1) BB is an entry block or have no successors.		// (1) BB is an entry block or have no successors.
// (2) There's no path coming back through BB successors.		// (2) There's no path coming back through BB successors.
if (BB == &BB->getParent()->getEntryBlock() ||		if (BB == &BB->getParent()->getEntryBlock() ||
!BB->getTerminator()->getNumSuccessors())		!BB->getTerminator()->getNumSuccessors())
return true;		return true;

SmallVector<BasicBlock*, 32> Worklist;		SmallVector<BasicBlock*, 32> Worklist;
Worklist.append(succ_begin(BB), succ_end(BB));		Worklist.append(succ_begin(BB), succ_end(BB));
return !isPotentiallyReachableFromMany(Worklist, BB, DT);		return !isPotentiallyReachableFromMany(Worklist, BB, nullptr, DT);
}		}

// If the value is defined in the same basic block as use and BeforeHere,		// If the value is defined in the same basic block as use and BeforeHere,
// there is no need to explore the use if BeforeHere dominates use.		// there is no need to explore the use if BeforeHere dominates use.
// Check whether there is a path from I to BeforeHere.		// Check whether there is a path from I to BeforeHere.
if (BeforeHere != I && DT->dominates(BeforeHere, I) &&		if (BeforeHere != I && DT->dominates(BeforeHere, I) &&
!isPotentiallyReachable(I, BeforeHere, DT))		!isPotentiallyReachable(I, BeforeHere, nullptr, DT))
return true;		return true;

return false;		return false;
}		}

bool shouldExplore(const Use *U) override {		bool shouldExplore(const Use *U) override {
Instruction *I = cast<Instruction>(U->getUser());		Instruction *I = cast<Instruction>(U->getUser());


llvm/lib/Analysis/LoopAllocationInfo.cpp

This file was added.

				//===- LoopAllocationInfo.cpp - memory allocations inside loops -----------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Analysis/LoopAllocationInfo.h"

				#include "llvm/ADT/BitVector.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/DenseSet.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/Analysis/CFG.h"
				#include "llvm/Analysis/LoopInfo.h"
				#include "llvm/Analysis/MemoryBuiltins.h"
				#include "llvm/IR/BasicBlock.h"
				#include "llvm/IR/CFG.h"
				#include "llvm/IR/Dominators.h"
				#include "llvm/IR/Instruction.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/Support/Casting.h"

				namespace llvm {

				/// Track ptr = allocfn to one or more freefn(ptr) relationships within the
				/// loop.
				///
				/// This analysis stores information about allocation and deallocation calls
				/// where the pointer is only live within the loop.
				///
				/// An allocation must be both hoisted and sunk, or neither. LICM performs
				/// sinking first, then hoisting. As sinking and hoisting is performed, more
				/// instructions will appear to be loop invariant, making more opportunities.
				/// For this reason we split the analysis into two pieces:
				/// 1. for a given pointer that is only live in the loop, retain a list of its
				/// matching deallocating instructions.
				/// 2. during sinking, if we would sink the deallocation, don't, but inform
				/// the LoopAllocationInfo. If all deallocations would have been sunk, and
				/// hoistSink would hoist the allocation, perform both the hoist and sink
				/// the various deallocation instructions.
				///
				/// We don't need guaranteed execution of any deallocation function during
				/// unwinding. Before this transformation, an unwind mid-loop would leak one
				/// pointer that was allocated at the start of the loop. After this
				/// transformation, it would leak the one pointer allocated before the loop
				/// began. There is no difference visible to the caller.
				///
				/// We do have to ensure that the maximum amount of allocated memory is not
				/// raised by this transformation. To ensure this, we stop looking for
				/// allocations after seeing an opaque function call. For example:
				/// loop_top:
				/// %ptr1 = call i8* @malloc(i32 %loop_invariant1) ;; candidate
				/// %ptr2 = call i8* @malloc(i32 %loop_variant2) ;; not candidate
				/// %ptr3 = call i8* @malloc(i32 %loop_invariant3) ;; candidate
				/// call void @opaque() ;; stop scanning
				/// %ptr5 = call i8* @malloc(i32 %loop_invariant5) ;; not candidate
				///
				/// The same idea applies in reverse order to @free calls at the loop exits:
				/// loop_exiting:
				/// call void @free(%ptr1) ;; not candidate
				/// call void @opaque() ;; stop scanning
				/// call void @free(%ptr2) ;; %ptr2 is candidate
				/// ;; start here and scan upwards
				void LoopAllocationInfo::analyzeLoop(LoopInfo *LI, DominatorTree *DT,
				TargetLibraryInfo *TLI, Loop *CurLoop) {
				assert(LI != nullptr && DT != nullptr && CurLoop != nullptr &&
				"Unexpected input to LoopAllocationInfo::analyzeLoop.");
				if (!TLI)
				return;

				SmallVector<BasicBlock *, 32> LatchBlocks;
				CurLoop->getLoopLatches(LatchBlocks);
				if (LatchBlocks.empty())
				return;

				SmallVector<BasicBlock *, 32> ExitBlocks;
				CurLoop->getExitBlocks(ExitBlocks);
				if (ExitBlocks.empty())
				return;

				// Unlike hoistRegion, we only scan the header block. If the allocation is in
				// a later block, it must be behind some sort of branch, which implies that
				// it's conditional on something (or else another optimization would've
				// removed the unconditional branch earlier).
				BasicBlock *HeaderBB = CurLoop->getHeader();
				SmallPtrSet<BasicBlock *, 1> HeaderBlock;
				HeaderBlock.insert(HeaderBB);
				for (BasicBlock::iterator II = HeaderBB->begin(), E = HeaderBB->end();
				II != E; ++II) {
				if (!II->mayDeallocateMemory() && !II->mayAllocateMemory())
				continue;
				reames: As discussed offline, bug example:
				    for (int i = 0; i < N; i++) {
				      throw_if(requested_size > TOO_BIG);
				      a = malloc(requested_size);
				      free(a);
				    }
				nicholas (author): I've updated this to use isGuaranteedToTransferExecutionToSuccessor on the instructions leading up to the allocation. The "throw_if" example couldn't happen because any opaque function might also malloc or free, so the test @test16 uses a volatile load instead.
				if (!llvm::isAllocationFn(&*II, TLI))
				break;
				auto CI = cast<CallInst>(II);

				// Find places in the loop where the pointer is certainly freed. The places
				// we identify must free this allocation, but there may be any number of
				// other places that could free our pointer which we miss.
				//
				// We use this to ensure that the allocation is freed before we go around
				// any loop latch.
				SmallVector<CallInst *, 16> FreeCalls;
				constexpr size_t UseVisitThreshold = 20;
				SmallVector<Instruction *, UseVisitThreshold> Worklist;
				SmallPtrSet<Instruction *, UseVisitThreshold> Visited;
				auto Enqueue = [&](const Instruction *I) {
				for (const Use &U : I->uses()) {
				Instruction *User = cast<Instruction>(U.getUser());
				if (!CurLoop->contains(User))
				continue;
				if (!Visited.insert(User).second)
				continue;
				if (Visited.size() == UseVisitThreshold)
				return;
				Worklist.push_back(User);
				}
				};
				Enqueue(CI);
				do {
				Instruction *I = Worklist.pop_back_val();
				if (isa<BitCastInst>(I)) {
				Enqueue(I);
				} else if (auto *GEP = dyn_cast<GetElementPtrInst>(I)) {
				if (GEP->hasAllZeroIndices())
				Enqueue(GEP);
				} else if (auto *PN = dyn_cast<PHINode>(I)) {
				if (PN->hasConstantOrUndefValue())
				Enqueue(I);
				} else if (llvm::isFreeCall(I, TLI)) {
				FreeCalls.push_back(cast<CallInst>(I));
				}
				} while (!Worklist.empty() && Visited.size() < UseVisitThreshold);

				if (FreeCalls.empty() || Visited.size() >= UseVisitThreshold)
				continue;

				SmallPtrSet<BasicBlock *, 32> FreeCallBBs;
				for (auto FreeCall : FreeCalls)
				FreeCallBBs.insert(FreeCall->getParent());

				bool DoNotTransform = false;

				// Check that the frees we found cover all the latches.
				for (BasicBlock *LatchBB : LatchBlocks) {
				if (FreeCallBBs.count(LatchBB))
				continue;
				SmallVector<BasicBlock *, 32> HeaderWorklist;
				HeaderWorklist.push_back(HeaderBB);
				if (llvm::isPotentiallyReachableFromMany(HeaderWorklist, LatchBB,
				&FreeCallBBs, DT, LI)) {
				DoNotTransform = true;
				break;
				}
				}
				if (DoNotTransform)
				continue;

				// Examine each exit block and determine whether there was a free which is
				// unconditionally executed on the way to the exit block.
				//
				// We use two queries, is there a path from any free to this exit without
				// going back to the header (if not, do not insert a free), and is there a
				// path from the header to this exit without going through any
				// frees (if not, we must insert a free). Both of these queries can return
				// true, in which case the deallocation is conditional or there are other
				// deallocations we can not see, and we do not perform the transform.
				SmallVector<BasicBlock *, 16> InsertDeallocation;
				for (auto ExitBB : ExitBlocks) {
				SmallVector<BasicBlock *, 32> HeaderWorklist, FreeCallWorklist;
				HeaderWorklist.push_back(HeaderBB);
				FreeCallWorklist.insert(FreeCallWorklist.end(), FreeCallBBs.begin(),
				FreeCallBBs.end());
				bool A = llvm::isPotentiallyReachableFromMany(HeaderWorklist, ExitBB,
				&FreeCallBBs, DT, LI);
				bool B = FreeCallBBs.count(HeaderBB) ||
				llvm::isPotentiallyReachableFromMany(FreeCallWorklist, ExitBB,
				&HeaderBlock, DT, LI);
				assert((A || B) && "loop exit blocks not reachable from loop");
				if (A && B) {
				// We can't tell whether we pass through a free on the way to this exit
				// block.
				DoNotTransform = true;
				break;
				}
				if (B)
				InsertDeallocation.push_back(ExitBB);
				}
				if (DoNotTransform)
				continue;

				AllocationLookup[CI] = Allocations.size();
				for (const CallInst *FreeCall : FreeCalls)
				DeallocationLookup[FreeCall] = Allocations.size();
				Allocations.push_back(CI);
				Deallocations.emplace_back(std::move(FreeCalls));
				ExitBlocksToAddDeallocationsTo.emplace_back(std::move(InsertDeallocation));
				}
				}

				// sinkRegion has determined that this deallocation call may be sunk.
				void LoopAllocationInfo::addSafeToSink(const CallInst *CI) {
				if (DeallocationLookup.count(CI))
				SafeToSink.insert(CI);
				}

				// hoistRegion would like to know whether the given allocation call may be
				// hoisted.
				bool LoopAllocationInfo::mayHoist(const CallInst *CI) const {
				auto it = AllocationLookup.find(CI);
				if (it == AllocationLookup.end())
				return false;
				size_t index = it->second;
				assert(index < Deallocations.size());
				for (const CallInst *FreeCall : Deallocations[index]) {
				if (!SafeToSink.count(FreeCall))
				return false;
				}
				return true;
				}

				const SmallVector<CallInst *, 16> &
				LoopAllocationInfo::toSink(const CallInst *CI) {
				assert(AllocationLookup.find(CI) != AllocationLookup.end());
				return Deallocations[AllocationLookup[CI]];
				}

				const SmallVector<BasicBlock *, 16> &
				LoopAllocationInfo::addDeallocations(const CallInst *CI) {
				assert(AllocationLookup.find(CI) != AllocationLookup.end());
				return ExitBlocksToAddDeallocationsTo[AllocationLookup[CI]];
				}

				} // namespace llvm
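The per-exit classification in `analyzeLoop` (queries A and B above) can be sketched on a toy CFG. This is an illustration under stated assumptions, not the patch's code: `Block`, `Graph`, `reachableAvoiding`, `ExitAction`, and `classifyExit` are hypothetical names, blocks are integers, and the search is a plain worklist walk without a visit limit.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-ins for basic blocks and a CFG; not LLVM types.
using Block = int;
using Graph = std::map<Block, std::vector<Block>>;

// True if Stop is reachable from any start block without passing through an
// excluded block. Reaching Stop is tested before the exclusion check.
bool reachableAvoiding(const Graph &G, std::vector<Block> Worklist, Block Stop,
                       const std::set<Block> &Excluded) {
  std::set<Block> Visited;
  while (!Worklist.empty()) {
    Block BB = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(BB).second)
      continue;
    if (BB == Stop)
      return true;
    if (Excluded.count(BB))
      continue;
    auto It = G.find(BB);
    if (It != G.end())
      Worklist.insert(Worklist.end(), It->second.begin(), It->second.end());
  }
  return false;
}

enum class ExitAction { LeaveAlone, InsertFree, Ambiguous };

// A: the exit can be reached from the header without passing a free.
// B: the exit can be reached from a free without returning to the header.
ExitAction classifyExit(const Graph &G, Block Header,
                        const std::set<Block> &FreeBlocks, Block Exit) {
  bool A = reachableAvoiding(G, {Header}, Exit, FreeBlocks);
  bool B = FreeBlocks.count(Header) != 0 ||
           reachableAvoiding(
               G, std::vector<Block>(FreeBlocks.begin(), FreeBlocks.end()),
               Exit, {Header});
  assert((A || B) && "exit block not reachable from the loop");
  if (A && B)
    return ExitAction::Ambiguous; // cannot tell whether a free was executed
  return B ? ExitAction::InsertFree : ExitAction::LeaveAlone;
}
```

With a loop whose only free sits in the latch, the exit taken from the latch is classified as needing an inserted free, while an early exit taken before the latch is left alone because the pointer was still allocated there in the original program. If the free sits on only one arm of a branch inside the loop, the exit is ambiguous and the transform would be abandoned.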

llvm/lib/CodeGen/DwarfEHPrepare.cpp

	/// unreachable and then simplify those blocks.			/// unreachable and then simplify those blocks.
	size_t DwarfEHPrepare::pruneUnreachableResumes(			size_t DwarfEHPrepare::pruneUnreachableResumes(
	Function &Fn, SmallVectorImpl<ResumeInst *> &Resumes,			Function &Fn, SmallVectorImpl<ResumeInst *> &Resumes,
	SmallVectorImpl<LandingPadInst *> &CleanupLPads) {			SmallVectorImpl<LandingPadInst *> &CleanupLPads) {
	BitVector ResumeReachable(Resumes.size());			BitVector ResumeReachable(Resumes.size());
	size_t ResumeIndex = 0;			size_t ResumeIndex = 0;
	for (auto *RI : Resumes) {			for (auto *RI : Resumes) {
	for (auto *LP : CleanupLPads) {			for (auto *LP : CleanupLPads) {
	if (isPotentiallyReachable(LP, RI, DT)) {			if (isPotentiallyReachable(LP, RI, nullptr, DT)) {
	ResumeReachable.set(ResumeIndex);			ResumeReachable.set(ResumeIndex);
	break;			break;
	}			}
	}			}
	++ResumeIndex;			++ResumeIndex;
	}			}

	// If everything is reachable, there is no change.			// If everything is reachable, there is no change.

llvm/lib/IR/Instruction.cpp

	bool Instruction::mayWriteToMemory() const {
case Instruction::Invoke:		case Instruction::Invoke:
case Instruction::CallBr:		case Instruction::CallBr:
return !cast<CallBase>(this)->onlyReadsMemory();		return !cast<CallBase>(this)->onlyReadsMemory();
case Instruction::Load:		case Instruction::Load:
return !cast<LoadInst>(this)->isUnordered();		return !cast<LoadInst>(this)->isUnordered();
}		}
}		}

		bool Instruction::mayAllocateMemory() const {
		switch (getOpcode()) {
		default:
		return false;
		case Instruction::CallBr:
		return true;
		case Instruction::Call: {
		auto II = dyn_cast<IntrinsicInst>(this);
		if (!II)
		return true;
		switch (II->getIntrinsicID()) {
		default:
		reames: Err, using a blacklist here feels really dangerous.
		nicholas (author): Uhh, you aren't suggesting that I actually list them all, right? The reason I put it here in Instruction instead of hidden away in LICM/LoopAllocationInfo is to make it somewhat clearer that this is a global property that you'll need to update when adding an instruction or intrinsic. How about if we update Intrinsics.td to include Mallocs and Frees IntrinsicProperty entries (similar to Throws)? That would make it opt-in when declaring the intrinsic. (Speaking of which, why Throws and not IntrThrows? Should mine be IntrMallocs/IntrFrees or Mallocs/Frees?) Also, can a readnone function malloc? Is there any way we could make this stricter?
		return false;
		case Intrinsic::objc_autoreleasePoolPush:
		return true;
		}
		}
		}
		}

		bool Instruction::mayDeallocateMemory() const {
		switch (getOpcode()) {
		default:
		return false;
		case Instruction::CallBr:
		return true;
		case Instruction::Call: {
		auto II = dyn_cast<IntrinsicInst>(this);
		if (!II)
		return true;
		switch (II->getIntrinsicID()) {
		default:
		return false;
		case Intrinsic::objc_autoreleasePoolPop:
		return true;
		}
		}
		}
		}

bool Instruction::isAtomic() const {		bool Instruction::isAtomic() const {
switch (getOpcode()) {		switch (getOpcode()) {
default:		default:
return false;		return false;
case Instruction::AtomicCmpXchg:		case Instruction::AtomicCmpXchg:
case Instruction::AtomicRMW:		case Instruction::AtomicRMW:
case Instruction::Fence:		case Instruction::Fence:
return true;		return true;
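The alternative raised in the inline discussion, declaring allocation behaviour as a property on each intrinsic rather than listing intrinsics inside `mayAllocateMemory`, could look roughly like the sketch below. Everything here is hypothetical: `Op`, `Inst`, `kMallocs`, and `kFrees` are illustrative stand-ins, not LLVM types or actual Intrinsics.td properties, and the thread does not settle on this design.

```cpp
#include <cassert>

// Hypothetical model, not LLVM code.
enum class Op { Add, Load, Call, Intrinsic };

// Property bits an intrinsic would declare at its definition site.
constexpr unsigned kNone = 0;
constexpr unsigned kMallocs = 1u << 0;
constexpr unsigned kFrees = 1u << 1;

struct Inst {
  Op Opcode;
  unsigned Props; // only meaningful when Opcode == Op::Intrinsic
};

bool mayAllocateMemory(const Inst &I) {
  assert((I.Opcode == Op::Intrinsic || I.Props == kNone) &&
         "property bits only apply to intrinsics");
  switch (I.Opcode) {
  case Op::Call:
    return true; // an opaque call may do anything
  case Op::Intrinsic:
    return (I.Props & kMallocs) != 0; // the declaration decides
  default:
    return false;
  }
}

bool mayDeallocateMemory(const Inst &I) {
  assert((I.Opcode == Op::Intrinsic || I.Props == kNone) &&
         "property bits only apply to intrinsics");
  switch (I.Opcode) {
  case Op::Call:
    return true;
  case Op::Intrinsic:
    return (I.Props & kFrees) != 0;
  default:
    return false;
  }
}
```

The query functions no longer name any particular intrinsic, so adding an allocating intrinsic means setting a bit where it is declared. The default for an undeclared intrinsic is still "does not allocate", so this sketch changes where the knowledge lives rather than making the default conservative.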

llvm/lib/Transforms/Scalar/LICM.cpp

#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/AliasSetTracker.h"		#include "llvm/Analysis/AliasSetTracker.h"
#include "llvm/Analysis/BasicAliasAnalysis.h"		#include "llvm/Analysis/BasicAliasAnalysis.h"
#include "llvm/Analysis/CaptureTracking.h"		#include "llvm/Analysis/CaptureTracking.h"
#include "llvm/Analysis/ConstantFolding.h"		#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/Analysis/GlobalsModRef.h"		#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/GuardUtils.h"		#include "llvm/Analysis/GuardUtils.h"
#include "llvm/Analysis/Loads.h"		#include "llvm/Analysis/Loads.h"
		#include "llvm/Analysis/LoopAllocationInfo.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/LoopIterator.h"		#include "llvm/Analysis/LoopIterator.h"
#include "llvm/Analysis/LoopPass.h"		#include "llvm/Analysis/LoopPass.h"
#include "llvm/Analysis/MemoryBuiltins.h"		#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/MemorySSA.h"		#include "llvm/Analysis/MemorySSA.h"
#include "llvm/Analysis/MemorySSAUpdater.h"		#include "llvm/Analysis/MemorySSAUpdater.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"		#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/Analysis/ScalarEvolution.h"		#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/Analysis/ScalarEvolutionAliasAnalysis.h"		#include "llvm/Analysis/ScalarEvolutionAliasAnalysis.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/CFG.h"		#include "llvm/IR/CFG.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"		#include "llvm/IR/Metadata.h"
#include "llvm/IR/PatternMatch.h"		#include "llvm/IR/PatternMatch.h"
#include "llvm/IR/PredIteratorCache.h"		#include "llvm/IR/PredIteratorCache.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
[... 67 lines not shown ...]
static bool isNotUsedOrFreeInLoop(const Instruction &I, const Loop *CurLoop,		static bool isNotUsedOrFreeInLoop(const Instruction &I, const Loop *CurLoop,
const LoopSafetyInfo *SafetyInfo,		const LoopSafetyInfo *SafetyInfo,
TargetTransformInfo *TTI, bool &FreeInLoop);		TargetTransformInfo *TTI, bool &FreeInLoop);
static void hoist(Instruction &I, const DominatorTree *DT, const Loop *CurLoop,		static void hoist(Instruction &I, const DominatorTree *DT, const Loop *CurLoop,
BasicBlock *Dest, ICFLoopSafetyInfo *SafetyInfo,		BasicBlock *Dest, ICFLoopSafetyInfo *SafetyInfo,
MemorySSAUpdater *MSSAU, OptimizationRemarkEmitter *ORE);		MemorySSAUpdater *MSSAU, OptimizationRemarkEmitter *ORE);
static bool sink(Instruction &I, LoopInfo *LI, DominatorTree *DT,		static bool sink(Instruction &I, LoopInfo *LI, DominatorTree *DT,
const Loop *CurLoop, ICFLoopSafetyInfo *SafetyInfo,		const Loop *CurLoop, ICFLoopSafetyInfo *SafetyInfo,
MemorySSAUpdater *MSSAU, OptimizationRemarkEmitter *ORE,		MemorySSAUpdater *MSSAU, OptimizationRemarkEmitter *ORE);
bool FreeInLoop);
static bool isSafeToExecuteUnconditionally(Instruction &Inst,		static bool isSafeToExecuteUnconditionally(Instruction &Inst,
const DominatorTree *DT,		const DominatorTree *DT,
const Loop *CurLoop,		const Loop *CurLoop,
const LoopSafetyInfo *SafetyInfo,		const LoopSafetyInfo *SafetyInfo,
OptimizationRemarkEmitter *ORE,		OptimizationRemarkEmitter *ORE,
const Instruction *CtxI = nullptr);		const Instruction *CtxI = nullptr);
static bool pointerInvalidatedByLoop(MemoryLocation MemLoc,		static bool pointerInvalidatedByLoop(MemoryLocation MemLoc,
AliasSetTracker *CurAST, Loop *CurLoop,		AliasSetTracker *CurAST, Loop *CurLoop,
[... 186 lines not shown ...]	bool LoopInvariantCodeMotion::runOnLoop(

// Get the preheader block to move instructions into...		// Get the preheader block to move instructions into...
BasicBlock *Preheader = L->getLoopPreheader();		BasicBlock *Preheader = L->getLoopPreheader();

// Compute loop safety information.		// Compute loop safety information.
ICFLoopSafetyInfo SafetyInfo(DT);		ICFLoopSafetyInfo SafetyInfo(DT);
SafetyInfo.computeLoopSafetyInfo(L);		SafetyInfo.computeLoopSafetyInfo(L);

		// Scan for matching malloc/free pairs.
		LoopAllocationInfo LAI;
		if (L->hasDedicatedExits() && Preheader)
		LAI.analyzeLoop(LI, DT, TLI, L);

// We want to visit all of the instructions in this loop... that are not parts		// We want to visit all of the instructions in this loop... that are not parts
// of our subloops (they have already had their invariants hoisted out of		// of our subloops (they have already had their invariants hoisted out of
// their loop, into this loop, so there is no need to process the BODIES of		// their loop, into this loop, so there is no need to process the BODIES of
// the subloops).		// the subloops).
//		//
// Traverse the body of the loop in depth first order on the dominator tree so		// Traverse the body of the loop in depth first order on the dominator tree so
// that we are guaranteed to see definitions before we see uses. This allows		// that we are guaranteed to see definitions before we see uses. This allows
// us to sink instructions in one pass, without iteration. After sinking		// us to sink instructions in one pass, without iteration. After sinking
// instructions, we perform another pass to hoist them out of the loop.		// instructions, we perform another pass to hoist them out of the loop.
//		//
if (L->hasDedicatedExits())		if (L->hasDedicatedExits())
Changed |= sinkRegion(DT->getNode(L->getHeader()), AA, LI, DT, TLI, TTI, L,		Changed |= sinkRegion(DT->getNode(L->getHeader()), AA, LI, DT, TLI, TTI, L,
CurAST.get(), MSSAU.get(), &SafetyInfo,		CurAST.get(), MSSAU.get(), &SafetyInfo, &LAI,
NoOfMemAccTooLarge, LicmMssaOptCounter, ORE);		NoOfMemAccTooLarge, LicmMssaOptCounter, ORE);
if (Preheader)		if (Preheader)
Changed |= hoistRegion(DT->getNode(L->getHeader()), AA, LI, DT, TLI, L,		Changed |= hoistRegion(DT->getNode(L->getHeader()), AA, LI, DT, TLI, L,
CurAST.get(), MSSAU.get(), &SafetyInfo,		CurAST.get(), MSSAU.get(), &SafetyInfo, &LAI,
NoOfMemAccTooLarge, LicmMssaOptCounter, ORE);		NoOfMemAccTooLarge, LicmMssaOptCounter, ORE);

// Now that all loop invariants have been removed from the loop, promote any		// Now that all loop invariants have been removed from the loop, promote any
// memory references to scalars that we can.		// memory references to scalars that we can.
// Don't sink stores from loops without dedicated block exits. Exits		// Don't sink stores from loops without dedicated block exits. Exits
// containing indirect branches are not transformed by loop simplify,		// containing indirect branches are not transformed by loop simplify,
// make sure we catch that. An additional load may be generated in the		// make sure we catch that. An additional load may be generated in the
// preheader for SSA updater, so also avoid sinking when no preheader		// preheader for SSA updater, so also avoid sinking when no preheader
[... 88 lines not shown ...]
/// the specified block, and that are in the current loop) in reverse depth		/// the specified block, and that are in the current loop) in reverse depth
/// first order w.r.t the DominatorTree. This allows us to visit uses before		/// first order w.r.t the DominatorTree. This allows us to visit uses before
/// definitions, allowing us to sink a loop body in one pass without iteration.		/// definitions, allowing us to sink a loop body in one pass without iteration.
///		///
bool llvm::sinkRegion(DomTreeNode *N, AliasAnalysis *AA, LoopInfo *LI,		bool llvm::sinkRegion(DomTreeNode *N, AliasAnalysis *AA, LoopInfo *LI,
DominatorTree *DT, TargetLibraryInfo *TLI,		DominatorTree *DT, TargetLibraryInfo *TLI,
TargetTransformInfo *TTI, Loop *CurLoop,		TargetTransformInfo *TTI, Loop *CurLoop,
AliasSetTracker *CurAST, MemorySSAUpdater *MSSAU,		AliasSetTracker *CurAST, MemorySSAUpdater *MSSAU,
ICFLoopSafetyInfo *SafetyInfo, bool NoOfMemAccTooLarge,		ICFLoopSafetyInfo *SafetyInfo, LoopAllocationInfo *LAI,
int &LicmMssaOptCounter, OptimizationRemarkEmitter *ORE) {		bool NoOfMemAccTooLarge, int &LicmMssaOptCounter,
		OptimizationRemarkEmitter *ORE) {

// Verify inputs.		// Verify inputs.
assert(N != nullptr && AA != nullptr && LI != nullptr && DT != nullptr &&		assert(N != nullptr && AA != nullptr && LI != nullptr && DT != nullptr &&
CurLoop != nullptr && SafetyInfo != nullptr &&		CurLoop != nullptr && SafetyInfo != nullptr && LAI != nullptr &&
"Unexpected input to sinkRegion.");		"Unexpected input to sinkRegion.");
assert(((CurAST != nullptr) ^ (MSSAU != nullptr)) &&		assert(((CurAST != nullptr) ^ (MSSAU != nullptr)) &&
"Either AliasSetTracker or MemorySSA should be initialized.");		"Either AliasSetTracker or MemorySSA should be initialized.");

// We want to visit children before parents. We will enque all the parents		// We want to visit children before parents. We will enque all the parents
// before their children in the worklist and process the worklist in reverse		// before their children in the worklist and process the worklist in reverse
// order.		// order.
SmallVector<DomTreeNode *, 16> Worklist = collectChildrenInLoop(N, CurLoop);		SmallVector<DomTreeNode *, 16> Worklist = collectChildrenInLoop(N, CurLoop);

bool Changed = false;		bool Changed = false;
for (DomTreeNode *DTN : reverse(Worklist)) {		for (DomTreeNode *DTN : reverse(Worklist)) {
BasicBlock *BB = DTN->getBlock();		BasicBlock *BB = DTN->getBlock();
// Only need to process the contents of this block if it is not part of a		// Only need to process the contents of this block if it is not part of a
// subloop (which would already have been processed).		// subloop (which would already have been processed).
if (inSubLoop(BB, CurLoop, LI))		if (inSubLoop(BB, CurLoop, LI))
continue;		continue;

		bool PassedOpaqueFunction = LAI->empty();
		nicholas (author; marked done): Rename this to something clearer, such as "ScanForFrees".
for (BasicBlock::iterator II = BB->end(); II != BB->begin();) {		for (BasicBlock::iterator II = BB->end(); II != BB->begin();) {
Instruction &I = *--II;		Instruction &I = *--II;

// If the instruction is dead, we would try to sink it because it isn't		// If the instruction is dead, we would try to sink it because it isn't
// used in the loop, instead, just delete it.		// used in the loop, instead, just delete it.
if (isInstructionTriviallyDead(&I, TLI)) {		if (isInstructionTriviallyDead(&I, TLI)) {
LLVM_DEBUG(dbgs() << "LICM deleting dead inst: " << I << '\n');		LLVM_DEBUG(dbgs() << "LICM deleting dead inst: " << I << '\n');
salvageDebugInfo(I);		salvageDebugInfo(I);
++II;		++II;
eraseInstruction(I, *SafetyInfo, CurAST, MSSAU);		eraseInstruction(I, *SafetyInfo, CurAST, MSSAU);
Changed = true;		Changed = true;
continue;		continue;
}		}

// Check to see if we can sink this instruction to the exit blocks		// Check to see if we can sink this instruction to the exit blocks
// of the loop. We can do this if the all users of the instruction are		// of the loop. We can do this if the all users of the instruction are
// outside of the loop. In this case, it doesn't even matter if the		// outside of the loop. In this case, it doesn't even matter if the
// operands of the instruction are loop invariant.		// operands of the instruction are loop invariant.
//		//
bool FreeInLoop = false;		bool FreeInLoop = false;
if (isNotUsedOrFreeInLoop(I, CurLoop, SafetyInfo, TTI, FreeInLoop) &&		if (isNotUsedOrFreeInLoop(I, CurLoop, SafetyInfo, TTI, FreeInLoop) &&
canSinkOrHoistInst(I, AA, DT, CurLoop, CurAST, MSSAU, true,		canSinkOrHoistInst(I, AA, DT, CurLoop, CurAST, MSSAU, true,
NoOfMemAccTooLarge, &LicmMssaOptCounter, ORE) &&		NoOfMemAccTooLarge, &LicmMssaOptCounter, ORE) &&
!I.mayHaveSideEffects()) {		!I.mayHaveSideEffects()) {
if (sink(I, LI, DT, CurLoop, SafetyInfo, MSSAU, ORE, FreeInLoop)) {		if (sink(I, LI, DT, CurLoop, SafetyInfo, MSSAU, ORE)) {
if (!FreeInLoop) {		if (!FreeInLoop) {
++II;		++II;
eraseInstruction(I, *SafetyInfo, CurAST, MSSAU);		eraseInstruction(I, *SafetyInfo, CurAST, MSSAU);
}		}
Changed = true;		Changed = true;
}		}
}		}

		if (!PassedOpaqueFunction) {
		if (llvm::isFreeCall(&I, TLI)) {
		LAI->addSafeToSink(cast<CallInst>(&I));
		} else if (I.mayAllocateMemory()) {
		PassedOpaqueFunction = true;
		reames (marked not done): What about invokes? (Use CallBase.)
		nicholas (author; marked done): Done. This loop skips the terminator, so it never sees an invoke, which is indeed a miscompile bug. Fixed above where we initialize ScanForFrees, and added @test19. Changed it to CallBase here anyway.
		}
		}
}		}
}		}
if (MSSAU && VerifyMemorySSA)		if (MSSAU && VerifyMemorySSA)
MSSAU->getMemorySSA()->verifyMemorySSA();		MSSAU->getMemorySSA()->verifyMemorySSA();
return Changed;		return Changed;
}		}
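
The backward scan added to sinkRegion above can be modelled outside LLVM. The following is a minimal sketch, not the patch's code: `Kind` and `freesSafeToSink` are hypothetical names, and treating every opaque call (including malloc itself) as "may allocate" is an assumption about what `mayAllocateMemory()` reports. It only illustrates that frees are collected from the end of a block until the first instruction that might allocate.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical classification of an instruction; not an LLVM type.
enum class Kind { Free, MayAllocate, Other };

// Walk a block from its last instruction to its first and return the indices
// of the free calls seen before the first instruction that may allocate.
// StartBlocked plays the role of initialising the flag from LAI->empty().
std::vector<std::size_t> freesSafeToSink(const std::vector<Kind> &Block,
                                         bool StartBlocked = false) {
  std::vector<std::size_t> Safe;
  bool PassedOpaqueFunction = StartBlocked;
  for (std::size_t Idx = Block.size(); Idx != 0 && !PassedOpaqueFunction;) {
    --Idx;
    if (Block[Idx] == Kind::Free)
      Safe.push_back(Idx); // a free with nothing allocating after it
    else if (Block[Idx] == Kind::MayAllocate)
      PassedOpaqueFunction = true; // stop collecting frees in this block
  }
  return Safe;
}
```

Under these assumptions, a block shaped like @test1 (malloc, use, free, branch) yields the single free, while a block shaped like @test2 (malloc, use, free, mightmalloc, branch) yields nothing, because the scan meets the possibly allocating call first.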

namespace {		namespace {
[... 228 lines not shown ...]
/// Walk the specified region of the CFG (defined by all blocks dominated by		/// Walk the specified region of the CFG (defined by all blocks dominated by
/// the specified block, and that are in the current loop) in depth first		/// the specified block, and that are in the current loop) in depth first
/// order w.r.t the DominatorTree. This allows us to visit definitions before		/// order w.r.t the DominatorTree. This allows us to visit definitions before
/// uses, allowing us to hoist a loop body in one pass without iteration.		/// uses, allowing us to hoist a loop body in one pass without iteration.
///		///
bool llvm::hoistRegion(DomTreeNode *N, AliasAnalysis *AA, LoopInfo *LI,		bool llvm::hoistRegion(DomTreeNode *N, AliasAnalysis *AA, LoopInfo *LI,
DominatorTree *DT, TargetLibraryInfo *TLI, Loop *CurLoop,		DominatorTree *DT, TargetLibraryInfo *TLI, Loop *CurLoop,
AliasSetTracker *CurAST, MemorySSAUpdater *MSSAU,		AliasSetTracker *CurAST, MemorySSAUpdater *MSSAU,
ICFLoopSafetyInfo *SafetyInfo, bool NoOfMemAccTooLarge,		ICFLoopSafetyInfo *SafetyInfo, LoopAllocationInfo *LAI,
int &LicmMssaOptCounter,		bool NoOfMemAccTooLarge, int &LicmMssaOptCounter,
OptimizationRemarkEmitter *ORE) {		OptimizationRemarkEmitter *ORE) {
// Verify inputs.		// Verify inputs.
assert(N != nullptr && AA != nullptr && LI != nullptr && DT != nullptr &&		assert(N != nullptr && AA != nullptr && LI != nullptr && DT != nullptr &&
CurLoop != nullptr && SafetyInfo != nullptr &&		CurLoop != nullptr && SafetyInfo != nullptr && LAI != nullptr &&
"Unexpected input to hoistRegion.");		"Unexpected input to hoistRegion.");
assert(((CurAST != nullptr) ^ (MSSAU != nullptr)) &&		assert(((CurAST != nullptr) ^ (MSSAU != nullptr)) &&
"Either AliasSetTracker or MemorySSA should be initialized.");		"Either AliasSetTracker or MemorySSA should be initialized.");

ControlFlowHoister CFH(LI, DT, CurLoop, MSSAU);		ControlFlowHoister CFH(LI, DT, CurLoop, MSSAU);

// Keep track of instructions that have been hoisted, as they may need to be		// Keep track of instructions that have been hoisted, as they may need to be
// re-hoisted if they end up not dominating all of their uses.		// re-hoisted if they end up not dominating all of their uses.
[... 44 lines not shown ...]	for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E;) {
CurLoop->getLoopPreheader()->getTerminator())) {		CurLoop->getLoopPreheader()->getTerminator())) {
hoist(I, DT, CurLoop, CFH.getOrCreateHoistedBlock(BB), SafetyInfo,		hoist(I, DT, CurLoop, CFH.getOrCreateHoistedBlock(BB), SafetyInfo,
MSSAU, ORE);		MSSAU, ORE);
HoistedInstructions.push_back(&I);		HoistedInstructions.push_back(&I);
Changed = true;		Changed = true;
continue;		continue;
}		}

		// If it's a malloc/free pair, try hoisting it out to the preheader. We
		// don't need to worry about intervening unwinding function calls in this
		// case, so we can be more aggressive than for generic instructions.
		if (CurLoop->hasLoopInvariantOperands(&I) &&
		llvm::isAllocationFn(&I, TLI) && LAI->mayHoist(cast<CallInst>(&I))) {
		IRBuilder<>(&I).CreateLifetimeStart(&I);
		hoist(I, DT, CurLoop, CFH.getOrCreateHoistedBlock(BB), SafetyInfo,
		MSSAU, ORE);
		HoistedInstructions.push_back(&I);
		Instruction *Pattern = LAI->toSink(cast<CallInst>(&I))[0];
		reames (marked done): As debated offline, the naming here is problematic.
		- addDeallocations -> locationsNeedingFrees? newAllocPlacement?
		- toSink -> freesForMalloc (key piece: avoid action names)
		A comment would help as well. Pull out the cast to CallInst.
		nicholas (author; marked not done): Is picking the first one out of toSink/freesForMalloc a problem? The order depends on the use-list ordering. I don't know what LLVM's current rules on that are, though I know about the use-list order directives (see https://llvm.org/docs/LangRef.html#use-list-order-directives). Is it considered a bug to have an optimization whose result depends on use-list order?
		nicholas (author; marked not done): Is it possible to have an operand bundle on a malloc or free call? In CloneInstructionInExitBlock, LICM updates the funclet operand bundle for the new location in the CFG.
		for (auto ExitBlock : LAI->addDeallocations(cast<CallInst>(&I))) {
		Instruction *Clone = Pattern->clone();
		Clone->dropUnknownNonDebugMetadata();
		reames (marked done): You might wish to either a) union the source locations, or b) restrict this to non -g builds. See applyMergedLocation on Instruction.
		ExitBlock->getInstList().insert(ExitBlock->getFirstInsertionPt(),
		Clone);
		}

		for (auto FreeCall : LAI->toSink(cast<CallInst>(&I))) {
		IRBuilder<>(FreeCall).CreateLifetimeEnd(&I);
		eraseInstruction(*FreeCall, *SafetyInfo, CurAST, MSSAU);
		}

		Changed = true;
		continue;
		}

// Attempt to remove floating point division out of the loop by		// Attempt to remove floating point division out of the loop by
// converting it to a reciprocal multiplication.		// converting it to a reciprocal multiplication.
if (I.getOpcode() == Instruction::FDiv &&		if (I.getOpcode() == Instruction::FDiv &&
CurLoop->isLoopInvariant(I.getOperand(1)) &&		CurLoop->isLoopInvariant(I.getOperand(1)) &&
I.hasAllowReciprocal()) {		I.hasAllowReciprocal()) {
auto Divisor = I.getOperand(1);		auto Divisor = I.getOperand(1);
auto One = llvm::ConstantFP::get(Divisor->getType(), 1.0);		auto One = llvm::ConstantFP::get(Divisor->getType(), 1.0);
auto ReciprocalDivisor = BinaryOperator::CreateFDiv(One, Divisor);		auto ReciprocalDivisor = BinaryOperator::CreateFDiv(One, Divisor);
[... 153 lines not shown ...]	static bool isLoadInvariantInLoop(LoadInst *LI, DominatorTree *DT,
}		}

return false;		return false;
}		}

namespace {		namespace {
/// Return true if-and-only-if we know how to (mechanically) both hoist and		/// Return true if-and-only-if we know how to (mechanically) both hoist and
/// sink a given instruction out of a loop. Does not address legality		/// sink a given instruction out of a loop. Does not address legality
/// concerns such as aliasing or speculation safety.		/// concerns such as aliasing or speculation safety.
bool isHoistableAndSinkableInst(Instruction &I) {		bool isHoistableAndSinkableInst(Instruction &I) {
// Only these instructions are hoistable/sinkable.		// Only these instructions are hoistable/sinkable.
return (isa<LoadInst>(I) || isa<StoreInst>(I) ||		return (isa<LoadInst>(I) || isa<StoreInst>(I) || isa<CallInst>(I) ||
isa<CallInst>(I) || isa<FenceInst>(I) ||		isa<FenceInst>(I) || isa<BinaryOperator>(I) || isa<CastInst>(I) ||
isa<BinaryOperator>(I) || isa<CastInst>(I) ||		isa<SelectInst>(I) || isa<GetElementPtrInst>(I) || isa<CmpInst>(I) ||
isa<SelectInst>(I) || isa<GetElementPtrInst>(I) ||		isa<InsertElementInst>(I) || isa<ExtractElementInst>(I) ||
isa<CmpInst>(I) || isa<InsertElementInst>(I) ||		isa<ShuffleVectorInst>(I) || isa<ExtractValueInst>(I) ||
isa<ExtractElementInst>(I) || isa<ShuffleVectorInst>(I) ||		isa<InsertValueInst>(I));
isa<ExtractValueInst>(I) || isa<InsertValueInst>(I));
}		}
/// Return true if all of the alias sets within this AST are known not to		/// Return true if all of the alias sets within this AST are known not to
/// contain a Mod, or if MSSA knows thare are no MemoryDefs in the loop.		/// contain a Mod, or if MSSA knows thare are no MemoryDefs in the loop.
bool isReadOnly(AliasSetTracker *CurAST, const MemorySSAUpdater *MSSAU,		bool isReadOnly(AliasSetTracker *CurAST, const MemorySSAUpdater *MSSAU,
const Loop *L) {		const Loop *L) {
if (CurAST) {		if (CurAST) {
for (AliasSet &AS : *CurAST) {		for (AliasSet &AS : *CurAST) {
if (!AS.isForwardingAliasSet() && AS.isMod()) {		if (!AS.isForwardingAliasSet() && AS.isMod()) {
[... 293 lines not shown ...]	static Instruction *CloneInstructionInExitBlock(
Instruction &I, BasicBlock &ExitBlock, PHINode &PN, const LoopInfo *LI,		Instruction &I, BasicBlock &ExitBlock, PHINode &PN, const LoopInfo *LI,
const LoopSafetyInfo *SafetyInfo, MemorySSAUpdater *MSSAU) {		const LoopSafetyInfo *SafetyInfo, MemorySSAUpdater *MSSAU) {
Instruction *New;		Instruction *New;
if (auto *CI = dyn_cast<CallInst>(&I)) {		if (auto *CI = dyn_cast<CallInst>(&I)) {
const auto &BlockColors = SafetyInfo->getBlockColors();		const auto &BlockColors = SafetyInfo->getBlockColors();

// Sinking call-sites need to be handled differently from other		// Sinking call-sites need to be handled differently from other
// instructions. The cloned call-site needs a funclet bundle operand		// instructions. The cloned call-site needs a funclet bundle operand
// appropriate for it's location in the CFG.		// appropriate for its location in the CFG.
SmallVector<OperandBundleDef, 1> OpBundles;		SmallVector<OperandBundleDef, 1> OpBundles;
for (unsigned BundleIdx = 0, BundleEnd = CI->getNumOperandBundles();		for (unsigned BundleIdx = 0, BundleEnd = CI->getNumOperandBundles();
BundleIdx != BundleEnd; ++BundleIdx) {		BundleIdx != BundleEnd; ++BundleIdx) {
OperandBundleUse Bundle = CI->getOperandBundleAt(BundleIdx);		OperandBundleUse Bundle = CI->getOperandBundleAt(BundleIdx);
if (Bundle.getTagID() == LLVMContext::OB_funclet)		if (Bundle.getTagID() == LLVMContext::OB_funclet)
continue;		continue;

OpBundles.emplace_back(Bundle);		OpBundles.emplace_back(Bundle);
[... 181 lines not shown ...]

/// When an instruction is found to only be used outside of the loop, this		/// When an instruction is found to only be used outside of the loop, this
/// function moves it to the exit blocks and patches up SSA form as needed.		/// function moves it to the exit blocks and patches up SSA form as needed.
/// This method is guaranteed to remove the original instruction from its		/// This method is guaranteed to remove the original instruction from its
/// position, and may either delete it or move it to outside of the loop.		/// position, and may either delete it or move it to outside of the loop.
///		///
static bool sink(Instruction &I, LoopInfo *LI, DominatorTree *DT,		static bool sink(Instruction &I, LoopInfo *LI, DominatorTree *DT,
const Loop *CurLoop, ICFLoopSafetyInfo *SafetyInfo,		const Loop *CurLoop, ICFLoopSafetyInfo *SafetyInfo,
MemorySSAUpdater *MSSAU, OptimizationRemarkEmitter *ORE,		MemorySSAUpdater *MSSAU, OptimizationRemarkEmitter *ORE) {
bool FreeInLoop) {
LLVM_DEBUG(dbgs() << "LICM sinking instruction: " << I << "\n");		LLVM_DEBUG(dbgs() << "LICM sinking instruction: " << I << "\n");
ORE->emit([&]() {		ORE->emit([&]() {
return OptimizationRemark(DEBUG_TYPE, "InstSunk", &I)		return OptimizationRemark(DEBUG_TYPE, "InstSunk", &I)
<< "sinking " << ore::NV("Inst", &I);		<< "sinking " << ore::NV("Inst", &I);
});		});
bool Changed = false;		bool Changed = false;
if (isa<LoadInst>(I))		if (isa<LoadInst>(I))
++NumMovedLoads;		++NumMovedLoads;
else if (isa<CallInst>(I))		else if (isa<CallInst>(I))
++NumMovedCalls;		++NumMovedCalls;
++NumSunk;		++NumSunk;

// Iterate over users to be ready for actual sinking. Replace users via		// Iterate over users to be ready for actual sinking. Replace users via
// unrechable blocks with undef and make all user PHIs trivially replcable.		// unreachable blocks with undef and make all user PHIs trivially replaceable.
SmallPtrSet<Instruction *, 8> VisitedUsers;		SmallPtrSet<Instruction *, 8> VisitedUsers;
for (Value::user_iterator UI = I.user_begin(), UE = I.user_end(); UI != UE;) {		for (Value::user_iterator UI = I.user_begin(), UE = I.user_end(); UI != UE;) {
auto *User = cast<Instruction>(*UI);		auto *User = cast<Instruction>(*UI);
Use &U = UI.getUse();		Use &U = UI.getUse();
++UI;		++UI;

if (VisitedUsers.count(User) || CurLoop->contains(User))		if (VisitedUsers.count(User) || CurLoop->contains(User))
continue;		continue;
[... 720 lines not shown ...]

llvm/test/Transforms/LICM/allocs.ll

This file was added.

				; RUN: opt -S -licm < %s | FileCheck %s

				asbirlea (marked done): Could you also add a RUN line with -enable-mssa-loop-dependency?
				declare i8* @malloc(i32)
				declare void @free(i8* nocapture)

				declare void @use(i8*)
				declare i1 @expr()

				define void @test1(i32 %loop_invariant) {
				; CHECK-LABEL: @test1
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				; CHECK: @use
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				declare void @mightmalloc()

				define void @test2(i32 %loop_invariant) {
				; CHECK-LABEL: @test2
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				; CHECK: @mightmalloc
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				call void @mightmalloc()
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: @free
				ret void
				}

				define void @test3(i32 %loop_invariant) {
				; CHECK-LABEL: @test3
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @mightmalloc
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				call void @mightmalloc()
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: @free
				ret void
				}

				define void @test4(i32 %loop_invariant) {
				; CHECK-LABEL: @test4
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK: @llvm.lifetime.start
				; CHECK: @llvm.lifetime.start
				; CHECK: @use
				; CHECK: @use
				; CHECK: @use
				; CHECK: @llvm.lifetime.end
				; CHECK: @llvm.lifetime.end
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%ptr1 = call i8* @malloc(i32 %loop_invariant)
				%ptr2 = call i8* @malloc(i32 %loop_invariant)
				%ptr3 = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr1)
				call void @use(i8* %ptr2)
				call void @use(i8* %ptr3)
				call void @free(i8* %ptr1)
				call void @free(i8* %ptr2)
				call void @free(i8* %ptr3)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				; CHECK: @free
				; CHECK: @free
				ret void
				}

				define void @test5(i32 %loop_invariant) {
				; CHECK-LABEL: @test5
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK: @llvm.lifetime.start
				; CHECK: @llvm.lifetime.start
				; CHECK: @use
				; CHECK: @use
				; CHECK: @use
				; CHECK: @llvm.lifetime.end
				; CHECK: @llvm.lifetime.end
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%ptr3 = call i8* @malloc(i32 %loop_invariant)
				%ptr2 = call i8* @malloc(i32 %loop_invariant)
				%ptr1 = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr1)
				call void @use(i8* %ptr2)
				call void @use(i8* %ptr3)
				call void @free(i8* %ptr1)
				call void @free(i8* %ptr2)
				call void @free(i8* %ptr3)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				; CHECK: @free
				; CHECK: @free
				ret void
				}

				define void @test6(i32 %loop_invariant) {
				; CHECK-LABEL: @test6
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @llvm.lifetime.start
				; CHECK: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK: @use
				; CHECK: @use
				; CHECK: @use
				; CHECK: @llvm.lifetime.end
				; CHECK: @free
				; CHECK: @llvm.lifetime.end
				%i = phi i32 [%loop_invariant, %header], [%i.1, %loop]
				%ptr1 = call i8* @malloc(i32 %loop_invariant)
				%ptr2 = call i8* @malloc(i32 %i)
				%ptr3 = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr1)
				call void @use(i8* %ptr2)
				call void @use(i8* %ptr3)
				call void @free(i8* %ptr1)
				call void @free(i8* %ptr2)
				call void @free(i8* %ptr3)
				%i.1 = add i32 %i, 1
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				; CHECK: @free
				; CHECK-NOT: @free
				; CHECK: ret void
				ret void
				}

				define void @test7(i32 %loop_invariant) {
				; CHECK-LABEL: @test7
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				; CHECK: @use
				; CHECK-NOT: @free
				; CHECK-NOT: @llvm.lifetime.end
				%i = phi i32 [%loop_invariant, %header], [%i.1, %loop.2], [%i.1, %loop.3]
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				%i.1 = add i32 %i, 1
				%cmp = icmp eq i32 %i.1, 123
				br i1 %cmp, label %loop.2, label %loop.3

				loop.2:
				; CHECK-LABEL: loop.2:
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%continue.1 = call i1 @expr()
				call void @free(i8* %ptr)
				br i1 %continue.1, label %loop, label %loopexit

				loop.3:
				; CHECK-LABEL: loop.3:
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%continue.2 = call i1 @expr()
				call void @free(i8* %ptr)
				br i1 %continue.2, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				declare void @assert_fail()

				define void @test8(i32 %loop_invariant) {
				; CHECK-LABEL: @test8
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				%ptr = call i8* @malloc(i32 %loop_invariant)
				%fail = call i1 @expr()
				br i1 %fail, label %assert_fail, label %assert_pass

				assert_fail:
				call void @assert_fail()
				unreachable

				assert_pass:
				; CHECK-LABEL: assert_pass:
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}
				reamesUnsubmitted Not Done Reply Inline Actions This test is odd. I don't see anything preventing us from reordering the volatile load and malloc. Was this an attempt to test for a side exit? If so, simply using an unknown call before the malloc would be preferred. reames: This test is odd. I don't see anything preventing us from reordering the volatile load and…
				nicholas (author, inline comment, Done): This test shows the difference from using isGuaranteedToTransferExecution in analyzeLoop. Before that change, we would have hoisted the malloc above the volatile load. The same test written with a function would not have shown any difference, simply because there is no way to create a function that might throw but cannot malloc/free (without adding new attributes to LLVM). It would also just be @test3 again.

				I think the new behaviour tested for here is correct. Suppose you have a system where out of memory terminates the program, and the volatile store is triggering some external action (moves the robot arm). Suppose the programmer is using volatile to make sure the external system is in the safe state before doing the malloc that may terminate, and wants the malloc to complete before beginning any more operations. The compiler should not reorder these actions.
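The ordering concern in the reply above can be pictured at the source level. The sketch below is hypothetical C++, not code from the patch: `arm_in_safe_state` stands in for a memory-mapped control register, `run_rounds` is an illustrative name, and the abort-on-out-of-memory policy is an assumption about the target system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a memory-mapped control register.
volatile int arm_in_safe_state = 0;

// Runs `iters` rounds and returns how many completed.
int run_rounds(std::size_t bytes, int iters) {
  int completed = 0;
  for (int i = 0; i < iters; ++i) {
    arm_in_safe_state = 1;           // volatile store: park the arm first
    void *buf = std::malloc(bytes);  // assumed policy: OOM ends the program
    if (buf == nullptr)
      std::abort();
    arm_in_safe_state = 0;           // only now begin the next operation
    std::free(buf);
    ++completed;
  }
  return completed;
}
```

If the allocation were moved above the first volatile store, a terminating out-of-memory condition could occur before the arm was parked, which is the reordering the reply argues the compiler should not perform.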

llvm/unittests/Analysis/CFGTest.cpp

//===- CFGTest.cpp - CFG tests --------------------------------------------===//		//===- CFGTest.cpp - CFG tests --------------------------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Analysis/CFG.h"		#include "llvm/Analysis/CFG.h"
		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/AsmParser/Parser.h"		#include "llvm/AsmParser/Parser.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/InstIterator.h"		#include "llvm/IR/InstIterator.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/LegacyPassManager.h"		#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
[... 34 lines not shown; context: for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I) { ...]
else if (I->getName() == "B")		else if (I->getName() == "B")
B = &*I;		B = &*I;
}		}
}		}
if (A == nullptr)		if (A == nullptr)
report_fatal_error("@test must have an instruction %A");		report_fatal_error("@test must have an instruction %A");
if (B == nullptr)		if (B == nullptr)
report_fatal_error("@test must have an instruction %B");		report_fatal_error("@test must have an instruction %B");

		assert(ExclusionSet.empty());
		for (auto I = F->begin(), E = F->end(); I != E; ++I) {
		if (I->hasName() && I->getName().startswith("excluded"))
		ExclusionSet.insert(&*I);
		}
}		}

void ExpectPath(bool ExpectedResult) {		void ExpectPath(bool ExpectedResult) {
static char ID;		static char ID;
class IsPotentiallyReachableTestPass : public FunctionPass {		class IsPotentiallyReachableTestPass : public FunctionPass {
public:		public:
IsPotentiallyReachableTestPass(bool ExpectedResult,		IsPotentiallyReachableTestPass(bool ExpectedResult, Instruction *A,
Instruction *A, Instruction *B)		Instruction *B,
: FunctionPass(ID), ExpectedResult(ExpectedResult), A(A), B(B) {}		SmallPtrSet<BasicBlock *, 4> ExclusionSet)
		: FunctionPass(ID), ExpectedResult(ExpectedResult), A(A), B(B),
		ExclusionSet(ExclusionSet) {}

static int initialize() {		static int initialize() {
PassInfo *PI = new PassInfo("isPotentiallyReachable testing pass",		PassInfo *PI = new PassInfo("isPotentiallyReachable testing pass", "",
"", &ID, nullptr, true, true);		&ID, nullptr, true, true);
PassRegistry::getPassRegistry()->registerPass(*PI, false);		PassRegistry::getPassRegistry()->registerPass(*PI, false);
initializeLoopInfoWrapperPassPass(*PassRegistry::getPassRegistry());		initializeLoopInfoWrapperPassPass(*PassRegistry::getPassRegistry());
initializeDominatorTreeWrapperPassPass(		initializeDominatorTreeWrapperPassPass(
*PassRegistry::getPassRegistry());		*PassRegistry::getPassRegistry());
return 0;		return 0;
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.setPreservesAll();		AU.setPreservesAll();
AU.addRequired<LoopInfoWrapperPass>();		AU.addRequired<LoopInfoWrapperPass>();
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
}		}

bool runOnFunction(Function &F) override {		bool runOnFunction(Function &F) override {
if (!F.hasName() || F.getName() != "test")		if (!F.hasName() || F.getName() != "test")
return false;		return false;

LoopInfo *LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();		LoopInfo *LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
DominatorTree *DT =		DominatorTree *DT =
&getAnalysis<DominatorTreeWrapperPass>().getDomTree();		&getAnalysis<DominatorTreeWrapperPass>().getDomTree();
EXPECT_EQ(isPotentiallyReachable(A, B, nullptr, nullptr),		EXPECT_EQ(isPotentiallyReachable(A, B, &ExclusionSet, nullptr, nullptr),
		ExpectedResult);
		EXPECT_EQ(isPotentiallyReachable(A, B, &ExclusionSet, DT, nullptr),
		ExpectedResult);
		EXPECT_EQ(isPotentiallyReachable(A, B, &ExclusionSet, nullptr, LI),
		ExpectedResult);
		EXPECT_EQ(isPotentiallyReachable(A, B, &ExclusionSet, DT, LI),
ExpectedResult);		ExpectedResult);
EXPECT_EQ(isPotentiallyReachable(A, B, DT, nullptr), ExpectedResult);
EXPECT_EQ(isPotentiallyReachable(A, B, nullptr, LI), ExpectedResult);
EXPECT_EQ(isPotentiallyReachable(A, B, DT, LI), ExpectedResult);
return false;		return false;
}		}
bool ExpectedResult;		bool ExpectedResult;
Instruction *A, *B;		Instruction *A, *B;
		SmallPtrSet<BasicBlock *, 4> ExclusionSet;
};		};

static int initialize = IsPotentiallyReachableTestPass::initialize();		static int initialize = IsPotentiallyReachableTestPass::initialize();
(void)initialize;		(void)initialize;

IsPotentiallyReachableTestPass *P =		IsPotentiallyReachableTestPass *P =
new IsPotentiallyReachableTestPass(ExpectedResult, A, B);		new IsPotentiallyReachableTestPass(ExpectedResult, A, B, ExclusionSet);
legacy::PassManager PM;		legacy::PassManager PM;
PM.add(P);		PM.add(P);
PM.run(*M);		PM.run(*M);
}		}

LLVMContext Context;		LLVMContext Context;
std::unique_ptr<Module> M;		std::unique_ptr<Module> M;
Instruction *A, *B;		Instruction *A, *B;
		SmallPtrSet<BasicBlock *, 4> ExclusionSet;
};		};

}		}

TEST_F(IsPotentiallyReachableTest, SameBlockNoPath) {		TEST_F(IsPotentiallyReachableTest, SameBlockNoPath) {
ParseAssembly(		ParseAssembly(
"define void @test() {\n"		"define void @test() {\n"
"entry:\n"		"entry:\n"
[... 255 lines not shown; context: TEST_F(IsPotentiallyReachableTest, ModifyTest) { ...]

succ_iterator S = succ_begin(&*++M->getFunction("test")->begin());		succ_iterator S = succ_begin(&*++M->getFunction("test")->begin());
BasicBlock *OldBB = S[0];		BasicBlock *OldBB = S[0];
S[0] = S[1];		S[0] = S[1];
ExpectPath(false);		ExpectPath(false);
S[0] = OldBB;		S[0] = OldBB;
ExpectPath(true);		ExpectPath(true);
}		}

		TEST_F(IsPotentiallyReachableTest, UnreachableFromEntryTest) {
		ParseAssembly("define void @test() {\n"
		"entry:\n"
		" %A = bitcast i8 undef to i8\n"
		" ret void\n"
		"not.reachable:\n"
		" %B = bitcast i8 undef to i8\n"
		" ret void\n"
		"}");
		ExpectPath(false);
		}

		TEST_F(IsPotentiallyReachableTest, UnreachableBlocksTest1) {
		ParseAssembly("define void @test() {\n"
		"entry:\n"
		" ret void\n"
		"not.reachable.1:\n"
		" %A = bitcast i8 undef to i8\n"
		" br label %not.reachable.2\n"
		"not.reachable.2:\n"
		" %B = bitcast i8 undef to i8\n"
		" ret void\n"
		"}");
		ExpectPath(true);
		}

		TEST_F(IsPotentiallyReachableTest, UnreachableBlocksTest2) {
		ParseAssembly("define void @test() {\n"
		"entry:\n"
		" ret void\n"
		"not.reachable.1:\n"
		" %B = bitcast i8 undef to i8\n"
		" br label %not.reachable.2\n"
		"not.reachable.2:\n"
		" %A = bitcast i8 undef to i8\n"
		" ret void\n"
		"}");
		ExpectPath(false);
		}

		TEST_F(IsPotentiallyReachableTest, SimpleExclusionTest) {
		ParseAssembly("define void @test() {\n"
		"entry:\n"
		" %A = bitcast i8 undef to i8\n"
		" br label %excluded\n"
		"excluded:\n"
		" br label %exit\n"
		"exit:\n"
		" %B = bitcast i8 undef to i8\n"
		" ret void\n"
		"}");
		ExpectPath(false);
		}

		TEST_F(IsPotentiallyReachableTest, DiamondExcludedTest) {
		ParseAssembly("declare i1 @switch()\n"
		"\n"
		"define void @test() {\n"
		"entry:\n"
		" %x = call i1 @switch()\n"
		" %A = bitcast i8 undef to i8\n"
		" br i1 %x, label %excluded.1, label %excluded.2\n"
		"excluded.1:\n"
		" br label %exit\n"
		"excluded.2:\n"
		" br label %exit\n"
		"exit:\n"
		" %B = bitcast i8 undef to i8\n"
		" ret void\n"
		"}");
		ExpectPath(false);
		}

		TEST_F(IsPotentiallyReachableTest, DiamondOneSideExcludedTest) {
		ParseAssembly("declare i1 @switch()\n"
		"\n"
		"define void @test() {\n"
		"entry:\n"
		" %x = call i1 @switch()\n"
		" %A = bitcast i8 undef to i8\n"
		" br i1 %x, label %excluded, label %diamond\n"
		"excluded:\n"
		" br label %exit\n"
		"diamond:\n"
		" br label %exit\n"
		"exit:\n"
		" %B = bitcast i8 undef to i8\n"
		" ret void\n"
		"}");
		ExpectPath(true);
		}
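The three exclusion tests above (SimpleExclusionTest, DiamondExcludedTest, DiamondOneSideExcludedTest) all ask whether some path from %A's block to %B's block avoids every excluded block. A minimal sketch of that query on a toy graph follows. It is not LLVM's implementation of isPotentiallyReachable: `Graph` and `reachableAvoiding` are hypothetical names, blocks are plain integers, and the sketch assumes the source and target are distinct blocks.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <vector>

// Toy CFG: Succs[i] lists the successors of block i.
using Graph = std::vector<std::vector<int>>;

// Returns true if some path from From to To passes through no block in
// Excluded. Breadth-first search that never expands an excluded block.
bool reachableAvoiding(const Graph &Succs, int From, int To,
                       const std::set<int> &Excluded) {
  std::vector<bool> Seen(Succs.size(), false);
  std::queue<int> Work;
  Work.push(From);
  Seen[From] = true;
  while (!Work.empty()) {
    int BB = Work.front();
    Work.pop();
    for (int S : Succs[BB]) {
      if (S == To)
        return true;               // reached the target block
      if (Excluded.count(S) || Seen[S])
        continue;                  // do not walk through excluded blocks
      Seen[S] = true;
      Work.push(S);
    }
  }
  return false;
}
```

With blocks numbered entry = 0 and exit last, a straight line 0 -> 1 -> 2 with block 1 excluded has no path, a diamond with both arms excluded has no path, and a diamond with only one arm excluded still has one, matching the `ExpectPath` values in the tests.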

Diff 193042
