This is an archive of the discontinued LLVM Phabricator instance.

Differential D60056

Hoist/sink malloc/free's in LICM.
Needs Review · Public

Authored by nicholas on Mar 31 2019, 10:19 PM.

Details

Reviewers

asbirlea
george.burgess.iv
reames

Summary

Hoist/sink malloc/free's in LICM.

Adds a new method on Instruction that answers the query of whether the given Instruction might allocate or free memory. Unlike builtin info which answers whether they are an allocation/deallocation call, this method indicates whether allocation behaviour is possible.

Adds a new analysis on loops called LoopAllocationInfo. This is tightly tied to the internal implementation of LICM, but is exposed in a public API because the relevant parts of LICM's externals, hoistRegion and sinkRegion, are also a public API.

One potentially surprising aspect of this optimization is that it is correct even if the loop exits in the middle, after the malloc and before the free. As long as there is only one live malloc at a time, there is no visible difference in behaviour. It is sufficient to show that we've found free() calls which cover every possible path to the loop backedge (the other option is that the code had a pre-existing double-free, which is undefined behaviour). The loop may thus exit with the malloc either allocated or freed, and we track which loop exit blocks should have frees added to them and which ones should not.
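As a rough source-level illustration of the effect described in this summary, the sketch below shows a loop before and after a hand-applied hoist of the allocation and sink of the free. The function names `sumBefore` and `sumAfter` are made up for this example, the loop is written in do-while form so that the hoisted allocation only runs when the body would have run, and allocation failure is treated as an early exit in both versions. It is not output of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Before: one malloc/free pair per iteration.
int sumBefore(const int *v, std::size_t n, std::size_t bytes) {
  assert(n >= 1 && bytes >= 1);
  int sum = 0;
  std::size_t i = 0;
  do {
    unsigned char *scratch = static_cast<unsigned char *>(std::malloc(bytes));
    if (!scratch)
      return -1; // allocation failed: nothing to free on this exit
    scratch[0] = static_cast<unsigned char>(v[i]);
    sum += scratch[0];
    std::free(scratch);
    if (sum > 100)
      break; // this exit is only reached after the free
  } while (++i < n);
  return sum;
}

// After (by hand): the allocation sits above the loop, the in-loop free is
// gone, and one free covers the exits that used to be reached after a free.
int sumAfter(const int *v, std::size_t n, std::size_t bytes) {
  assert(n >= 1 && bytes >= 1);
  unsigned char *scratch = static_cast<unsigned char *>(std::malloc(bytes));
  if (!scratch)
    return -1;
  int sum = 0;
  std::size_t i = 0;
  do {
    scratch[0] = static_cast<unsigned char>(v[i]);
    sum += scratch[0];
    if (sum > 100)
      break;
  } while (++i < n);
  std::free(scratch);
  return sum;
}
```

Both loop exits here are reached with the pointer already freed in the original, so both receive the sunk free. An exit taken between the malloc and the free would be left without one.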

Diff Detail

Event Timeline

nicholas created this revision. Mar 31 2019, 10:19 PM

Herald added a project: Restricted Project. Mar 31 2019, 10:19 PM

Herald added subscribers: llvm-commits, asbirlea, jfb and 2 others.

Harbormaster completed remote builds in B29885: Diff 193042. Mar 31 2019, 10:21 PM

nicholas added reviewers: asbirlea, george.burgess.iv, reames. Mar 31 2019, 10:22 PM

A couple of quick comments.
Would you mind splitting the cleanups and bug fix (which will be straightforward to approve) and rebase this on top of that?

llvm/include/llvm/Analysis/LoopAllocationInfo.h
30	Could you include the comment from the description here? "This analysis is tightly coupled to the internal implementation of the LICM transform."
llvm/test/Transforms/LICM/allocs.ll
3	Could you also add a RUN line with -enable-mssa-loop-dependency?

nicholas added parent revisions: D60084: [NFC] Remove dead parameter "FreeInLoop", fix some typos and trailing whitespace., D60085: Add an optional list of blocks to avoid when looking for a path in isPotentiallyReachable. Apr 1 2019, 11:55 AM

nicholas updated this revision to Diff 193147. Apr 1 2019, 12:00 PM

nicholas edited the summary of this revision.

nicholas updated this revision to Diff 193152. Apr 1 2019, 12:33 PM

nicholas marked an inline comment as done.

I've only skimmed through the high level comments so far, so if any of this is addressed inline, just say so.

I was wondering why you phrased this as a combination of hoisting and sinking as opposed to promotion. We have the scalar promotion path, and the basic transform you're doing feels a lot like promotion. I suspect that many of the legality aspects will be common.

llvm/include/llvm/IR/Instruction.h
539 ↗	(On Diff #193152)	These feel like they need a bit more specification. In particular, are there any expectations around how new memory is returned? Is a deallocation routine allowed to free a random pointer read from a global? (i.e. which set of locations are we talking about freeing and allocating?)
llvm/lib/IR/Instruction.cpp
562 ↗	(On Diff #193152)	Err, using a blacklist here feels really dangerous.

I've only skimmed through the high level comments so far, so if any of this is addressed inline, just say so.

I was wondering why you phrased this as a combination of hoisting and sinking as opposed to promotion. We have the scalar promotion path, and the basic transform you're doing feels a lot like promotion. I suspect that many of the legality aspects will be common.

I didn't frame it as promotion simply because I view promotion as a transform from one thing to another (such as heap to stack, stack to scalar) whereas this optimization was just hoisting out the allocation and sinking the matching deallocations.

The safety checks we need are "is there a path from this malloc back to this malloc without passing through one of the frees we identified?" and "for each exit block, can we see that all paths either pass through one-or-more frees or that all paths do not pass through any frees?". We have no need of MemorySSA or AliasSets like the memory-to-scalar promotion does.
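The first of these checks is a reachability question with a set of blocks to avoid. Here is a minimal sketch on a toy CFG, assuming block-level granularity (a block either contains one of the identified frees or it does not). `ToyCFG` and `hasPathAvoiding` are names invented for this illustration and are not part of the patch or of LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <set>
#include <vector>

// Successor lists indexed by block number (a toy stand-in for a CFG).
using ToyCFG = std::vector<std::vector<int>>;

// Returns true if some path of one or more edges leads from `from` to `to`
// without passing through any block listed in `avoid`.
bool hasPathAvoiding(const ToyCFG &succ, int from, int to,
                     const std::set<int> &avoid) {
  assert(from >= 0 && static_cast<std::size_t>(from) < succ.size());
  std::vector<bool> seen(succ.size(), false);
  std::queue<int> work;
  for (int s : succ[from])
    work.push(s);
  while (!work.empty()) {
    int b = work.front();
    work.pop();
    if (b == to)
      return true;
    if (avoid.count(b) || seen[b])
      continue; // blocked by a free, or already explored
    seen[b] = true;
    for (int s : succ[b])
      work.push(s);
  }
  return false;
}
```

With the malloc in block 0 and the query `hasPathAvoiding(cfg, 0, 0, frees)`, a `true` answer means some iteration can reach the malloc again without freeing, so the pair must stay in the loop.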

Comparing this code with LICM's promotion does reveal one bug. We might try to insert into catchswitch blocks.

Should I pull it out of hoisting and sinking and into a post-pass like LICM's scalar promotion? Initially I had imagined that it would participate in hoist and sink's queries of whether "are all arguments to this instruction loop invariant (outside the loop)", so that as LICM hoists instructions iteratively, it would hoist malloc like any other instruction. In practice that couldn't happen because the hoist is deferred until sink time. We do need the malloc hoist to occur after hoistRegion in order to make the malloc size argument invariant. So, maybe there's no reason not to place it after hoist and sink? We currently piggy-back on hoist and sink's linear scans, but doing them again is merely a constant factor. It would eliminate the need to expose LoopAllocationInfo as a public interface, which is a benefit.

llvm/include/llvm/IR/Instruction.h
539 ↗	(On Diff #193152)	Acknowledged. These exist to help us maintain a C++-language invariant that we may not increase the total amount of heap memory in use, so, malloc+free+malloc may not be transformed into malloc+malloc+free. We simply need to avoid moving memory allocations around each other (though we can sort a continuous block of memory allocations if there's no frees in between). As such, there's no consideration of how the allocated memory would be returned or which memory gets deallocated (similar to how `mayReadOrWriteMemory` doesn't). We also don't really talk about what happens when an intrinsic allocates and deallocates internally, I think ultimately whether the intrinsic wants to be treated as something we can move other allocation/deallocation operations around should be left to the intrinsic author. It may not matter to the Objective-C runtime for instance. (But what about Objective-C++?) I think this will end up getting resolved when we address your comments on llvm/lib/IR/Instruction.cpp:562.
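The invariant in the reply above can be checked on paper by tracking how many allocations are live at once. A minimal sketch, assuming every allocation has the same size so that counting blocks stands in for counting bytes (`HeapOp` and `peakLiveAllocations` are hypothetical names for this example):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class HeapOp { Malloc, Free };

// Returns the largest number of simultaneously live allocations seen while
// replaying `ops` in order.
int peakLiveAllocations(const std::vector<HeapOp> &ops) {
  int live = 0;
  int peak = 0;
  for (HeapOp op : ops) {
    if (op == HeapOp::Malloc) {
      ++live;
      peak = std::max(peak, live);
    } else {
      assert(live > 0 && "free without a live allocation");
      --live;
    }
  }
  return peak;
}
```

Replaying malloc+free+malloc gives a peak of one live block, while malloc+malloc+free gives two, which is why the second ordering is not an acceptable rewrite of the first.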
llvm/lib/IR/Instruction.cpp
562 ↗	(On Diff #193152)	Uhh, you aren't suggesting that I actually list them all, right? The reason I put it here in Instruction instead of hidden away in LICM/LoopAllocationInfo is to make it somewhat clearer that this is a global property that you'll need to update when adding an instruction or intrinsic. How about if we update Intrinsics.td to include a Mallocs and Frees (similar to Throws) IntrinsicProperty? That makes it opt-in when declaring the intrinsic? (Speaking of which, why Throws and not IntrThrows? Should mine be IntrMallocs/IntrFrees or Mallocs/Frees?) Also, can a readnone function malloc? Is there any way we could make this stricter?

In D60056#1450884, @nicholas wrote:

Should I pull it out of hoisting and sinking and into a post-pass like LICM's scalar promotion? Initially I had imagined that it would participate in hoist and sink's queries of whether "are all arguments to this instruction loop invariant (outside the loop)", so that as LICM hoists instructions iteratively, it would hoist malloc like any other instruction. In practice that couldn't happen because the hoist is deferred until sink time.

Er, I'm mistaken. It does, because sinkRegion happens first and doesn't call isLoopInvariant, then hoistRegion happens second and that uses isLoopInvariant. So in theory if you have something like ptrtoint of the malloc (maybe an alignment check?), we could hoist that out of the loop too. So it does interlace with the rest of hoistRegion, and that's important for optimization power. I'll add a test to that effect shortly.
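The sink-then-hoist ordering implies a two-phase protocol: the sinking walk only records which frees could be sunk, and the later hoisting walk asks whether the whole pair may move. The toy model below sketches that bookkeeping with integer ids in place of `CallInst` pointers. `ToyAllocationInfo` and `addEntry` are invented for illustration, and the `mayHoist` here checks only the "every tracked free was sinkable" condition, which is an assumption about one part of the real check.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

class ToyAllocationInfo {
public:
  // Register an allocation together with the frees matched to it.
  void addEntry(int alloc, std::vector<int> frees) {
    assert(!Entries.count(alloc) && "allocation registered twice");
    for (int f : frees)
      FreeToAlloc[f] = alloc;
    Entries[alloc] = std::move(frees);
  }

  // Sinking phase: remember that this free could be sunk, but do not move it.
  void addSafeToSink(int call) {
    if (FreeToAlloc.count(call))
      SafeToSink.insert(call);
  }

  // Hoisting phase: the pair may move only if every tracked free was sinkable.
  bool mayHoist(int alloc) const {
    auto it = Entries.find(alloc);
    if (it == Entries.end())
      return false;
    for (int f : it->second)
      if (!SafeToSink.count(f))
        return false;
    return true;
  }

private:
  std::map<int, std::vector<int>> Entries;
  std::map<int, int> FreeToAlloc;
  std::set<int> SafeToSink;
};
```

Deferring the decision this way keeps the allocation in place until the hoisting walk reaches it, so instructions that depend on the pointer can be hoisted in the same walk.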

nicholas updated this revision to Diff 193226. Apr 1 2019, 7:46 PM

Fixed creation of invalid IR when the loop exit block that needs a free call inserted is a catchswitch block.

reames added inline comments. Apr 5 2019, 2:52 PM

llvm/lib/Analysis/LoopAllocationInfo.cpp
93	As discussed offline, bug example: for (int i = 0; i < N; i++) { throw_if(requested_size > TOO_BIG); a = malloc(requested_size); free(a); }
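To make the bug example concrete, the sketch below counts allocator calls for the loop as written and for a version where the allocation has been hoisted without checking that it was guaranteed to execute. The allocator is a stand-in (`fakeMalloc`, `fakeFree` and the fixed `Arena` are invented for this example so that nothing leaks when the check throws), and `hoistedTooEarly` is a hand-written guess at the shape of the miscompile, not actual output.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

int AllocCalls = 0; // how many times the stand-in allocator ran
char Arena[64];     // single fake block; enough for this demonstration

void *fakeMalloc(std::size_t n) {
  assert(n <= sizeof(Arena));
  ++AllocCalls;
  return Arena;
}

void fakeFree(void *) {}

void throwIf(bool cond) {
  if (cond)
    throw std::runtime_error("requested size too big");
}

// The loop from the comment: the check runs before every allocation.
void original(int n, std::size_t size, std::size_t tooBig) {
  for (int i = 0; i < n; ++i) {
    throwIf(size > tooBig);
    void *a = fakeMalloc(size);
    fakeFree(a);
  }
}

// An unguarded hoist: the allocation now runs even when the first check throws.
void hoistedTooEarly(int n, std::size_t size, std::size_t tooBig) {
  if (n <= 0)
    return;
  void *a = fakeMalloc(size);
  for (int i = 0; i < n; ++i)
    throwIf(size > tooBig);
  fakeFree(a);
}

// Runs f and reports how many allocator calls happened before it returned or
// threw.
template <typename Fn> int allocCallsDuring(Fn f) {
  AllocCalls = 0;
  try {
    f();
  } catch (const std::runtime_error &) {
  }
  return AllocCalls;
}
```

When the size exceeds the limit, the original loop never reaches the allocator, while the hoisted version calls it once with the oversized request.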

Other items discussed offline:

the instruction usage feels too generic with too few uses. I suggested (but will not require) either making them computed properties of an intrinsic, or finding other uses.
my comment about using a blacklist is clearly wrong. You'd suggested using an attribute in Intrinsics.td. I was fine with that, but it felt like possible overkill.

llvm/include/llvm/Analysis/LoopAllocationInfo.h
92	It really looks like you have a named tuple hiding here. AllocationInfoEntry?
llvm/lib/Transforms/Scalar/LICM.cpp
878	As debated offline, naming here is problematic. addDeallocations -> locationsNeedingFrees? newAllocPlacement? toSink -> freesForMalloc (key piece: avoid action names). A comment would help as well. Pull out the cast to CallInst.
881	You might wish to either a) union the source locations, or b) restrict this to non -g. See applyMergedLocation on Instruction.

nicholas updated this revision to Diff 194639. Apr 10 2019, 11:21 PM

nicholas marked 9 inline comments as done.

minor comments on interface, LICM part, and tests. I got interrupted and need to come back and finish reviewing the analysis implementation.

llvm/include/llvm/Analysis/LoopAllocationInfo.h
78	either "may ... ?" or "return true if ...". (i.e., make a question a question)
83	Should we assert that CI is a malloc within the specified loop?
107	Not sure what the last part of the comment means. Reword?
llvm/lib/Transforms/Scalar/LICM.cpp
551	What about invokes? (Use callbase)
llvm/test/Transforms/LICM/allocs.ll
568	This test is odd. I don't see anything preventing us from reordering the volatile load and malloc. Was this an attempt to test for a side exit? If so, simply using an unknown call before the malloc would be preferred.

nicholas marked 6 inline comments as done. Apr 11 2019, 12:40 PM

nicholas added inline comments.

llvm/include/llvm/Analysis/LoopAllocationInfo.h
78	Done, used "return true iff".
83	We don't presently store the right member variables to assert that, and I don't really want to add DEBUG-only member variables. The condition on this function is that mayHoist(CI) must be true. I've added an assert that CI is in entries, which is the condition under which this function would have UB.
llvm/lib/Analysis/LoopAllocationInfo.cpp
93	I've updated this to use isGuaranteedToTransferExecutionToSuccessor on the instructions leading up to the allocation. The "throw_if" example couldn't happen because any opaque function might also malloc or free, so the test @test16 uses a volatile load instead.
llvm/lib/Transforms/Scalar/LICM.cpp
512	Rename this to something clearer, such as "ScanForFrees".
551	Done. This loop skips the terminator, so it can't be an invoke. Which is, indeed, a miscompile bug. Fixed above where we initialize ScanForFrees and added @test19. Changed it to CallBase here anyways.
878	Is picking the first one out of toSink/freesForMalloc a problem? The order depends on the use-list ordering. I don't know what LLVM's current rules on that are, I know that there's https://llvm.org/docs/LangRef.html#use-list-order-directives . Is it considered a bug to have an optimization whose result depends on use list order?
878	Is it possible to have an operand bundle on a malloc or free call? In CloneInstructionInExitBlock, LICM updates the funclet operand bundle for the new location in the CFG.
llvm/test/Transforms/LICM/allocs.ll
568	This test shows the difference from using isGuaranteedToTransferExecutionToSuccessor in analyzeLoop. Before that change, we would have hoisted the malloc above the volatile load. The same test written with a function would not have shown any difference simply because there is no way to create a function that might-throw but can't-malloc/free (without adding new attributes to LLVM). It would also just be @test3 again. I think the new behaviour tested for here is correct. Suppose you have a system where out of memory terminates the program, and the volatile store is triggering some external action (moves the robot arm). Suppose the programmer is using volatile to make sure the external system is in the safe state before doing the malloc that may terminate, and wants the malloc to complete before beginning any more operations. The compiler should not reorder these actions.

nicholas updated this revision to Diff 194733. Apr 11 2019, 12:40 PM

nicholas marked an inline comment as done.

This update adds a failing test @test21 which demonstrates a miscompile. While we check the block with the free in it for potentially-malloc'ing instructions, we don't examine all the possible paths from where that free was pre-transform to the new location of the free post-transform.

It's not immediately clear to me how to fix that efficiently. It's possible to label each loop block with whether it contains a potentially-malloc'ing instruction, then find all paths from frees to exit blocks which will have frees added to them and do not pass through any backedge or back through the header. We can simplify this path scan, knowing that all exits reachable after the free call are exits where we will insert a free.
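One conservative way to approximate the scan described above is sketched below: starting from the successors of the block holding the free, walk forward, stop at the header and at exit blocks, and reject the transform if any block on the way is labelled as possibly allocating. The names `ToyCFG` and `mayAllocBeforeSunkFree` are hypothetical, instructions after the free inside its own block are ignored for brevity, and the walk is stricter than the path-exact scan the comment proposes.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Successor lists indexed by block number (a toy stand-in for a CFG).
using ToyCFG = std::vector<std::vector<int>>;

// Returns true if control leaving `freeBlock` can enter a block flagged in
// `mayAlloc` before it reaches an exit block or returns to `header`.
bool mayAllocBeforeSunkFree(const ToyCFG &succ, int header, int freeBlock,
                            const std::set<int> &exits,
                            const std::vector<bool> &mayAlloc) {
  assert(mayAlloc.size() == succ.size());
  std::vector<bool> seen(succ.size(), false);
  std::vector<int> stack(succ[freeBlock].begin(), succ[freeBlock].end());
  while (!stack.empty()) {
    int b = stack.back();
    stack.pop_back();
    if (b == header || exits.count(b) || seen[b])
      continue; // left the region of interest, or already visited
    seen[b] = true;
    if (mayAlloc[b])
      return true;
    stack.insert(stack.end(), succ[b].begin(), succ[b].end());
  }
  return false;
}
```

The per-block labels can be computed once per loop, so each query costs at most one linear walk over the loop body.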

Assumed inactive, cleaning out review list, please re-add if needed.

Revision Contents

Path	Size
llvm/include/llvm/Analysis/LoopAllocationInfo.h	154 lines
llvm/include/llvm/Transforms/Utils/LoopUtils.h	7 lines
llvm/lib/Analysis/CMakeLists.txt	1 line
llvm/lib/Analysis/LoopAllocationInfo.cpp	247 lines
llvm/lib/Transforms/Scalar/LICM.cpp	82 lines
llvm/test/Transforms/LICM/allocs.ll	750 lines

Diff 196980

llvm/include/llvm/Analysis/LoopAllocationInfo.h

This file was added.

				//===- LoopAllocationInfo.h - memory allocations inside loops ---*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_ANALYSIS_LOOPALLOCATIONINFO_H
				#define LLVM_ANALYSIS_LOOPALLOCATIONINFO_H

				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/DenseSet.h"
				#include "llvm/ADT/SmallVector.h"

				namespace llvm {

				class BasicBlock;
				class CallInst;
				class DominatorTree;
				class Loop;
				class LoopInfo;
				class TargetLibraryInfo;

				/// Track %ptr = allocfn to one or more freefn(%ptr) relationships within the
				/// loop.
				///
				/// This analysis stores information about allocation and deallocation calls
				/// where the pointer is only live within the loop. Its design is tightly
				/// coupled to the internal implementation of LICM, but is exposed in a public
				/// API because the llvm::hoistRegion and llvm::sinkRegion functions in LICM are
				/// also exposed.
				///
				/// An allocation and its corresponding deallocations must be both hoisted and
				/// sunk, or neither. LICM performs sinking first, then hoisting. As sinking and
				/// hoisting is performed, more instructions will appear to be loop invariant,
				/// creating more hoisting opportunities. For this reason we split the analysis
				/// into two pieces:
				/// 1. for a given pointer that is only live in the loop, retain a list of its
				/// matching deallocating instructions.
				/// 2. during sinking, if we would sink the deallocation, don't, but inform
				/// the LoopAllocationInfo. If all deallocations would have been sunk, and
				/// hoistSink would hoist the allocation, perform both the hoist and sink
				/// the various deallocation instructions.
				///
				/// There's two subtleties here. First, we don't need guaranteed execution of
				/// any particular deallocation function during unwinding. Before this
				/// transformation, an unwind mid-loop would leak one pointer that was allocated
				/// at the start of the loop. After this transformation, it would leak the one
				/// pointer allocated before the loop began. There is no difference visible to
				/// the caller.
				///
				/// Second, we don't need to identify every deallocation in order for the
				/// transformation to be safe. We need to know that all backedges (technically,
				/// all edges back to the header block, which may be a superset of backedges in
				/// the event of unnatural loop cycles) are only reached by passing through some
				/// deallocation. Those deallocations are guaranteed to be sunk if/when we hoist
				/// the allocation. Other functions may also happen to free the pointer, but we
				/// would have no way of knowing that. If the function frees and we go back to
				/// the header block, the code has undefined behaviour due to a double-free. For
				/// it to have well-defined behaviour after freeing, control must pass to a loop
				/// exit block.
				///
				/// Every exit block is categorized into blocks where the pointer is certainly
				/// freed by one of the deallocations we've already identified, or is certainly
				/// not freed by any of the deallocations we've identified. In the former case
				/// we will sink a free into that exit block. An opaque function that
				/// deallocates fits into the latter category, and we won't insert a free call
				/// in the block. In the remaining case where there are multiple paths to an
				/// exit block some of which free the pointer and others that don't pass through
				/// an identified free, we abort the transform.
				///
				/// We also have to ensure that the maximum amount of allocated memory is not
				/// raised by this transformation. To ensure this, we stop looking for
				/// allocations after seeing an opaque function call. For example:
				/// loop_top:
				/// %ptr1 = call i8* @malloc(i32 %loop_invariant1) ;; candidate
				/// %ptr2 = call i8* @malloc(i32 %loop_variant2) ;; not candidate
				/// %ptr3 = call i8* @malloc(i32 %loop_invariant3) ;; candidate
				/// call void @opaque() ;; stop scanning
				/// %ptr5 = call i8* @malloc(i32 %loop_invariant5) ;; not candidate
				///
				/// The same idea applies in reverse order to @free calls at the loop exits:
				/// loop_exiting:
				/// call void @free(%ptr1) ;; not candidate
				/// call void @opaque() ;; stop scanning
				/// call void @free(%ptr2) ;; %ptr2 is candidate
				/// ;; start here and scan upwards
				class LoopAllocationInfo {
				public:
				void analyzeLoop(LoopInfo *LI, DominatorTree *DT, TargetLibraryInfo *TLI,
				Loop *CurLoop);

				// LICM has determined that this deallocation call may be sunk.
				void addSafeToSink(const CallInst *CI) {
				if (DeallocationToAllocation.count(CI))
				SafeToSink.insert(CI);
				}

				// Given an allocation call, return true iff all the safety conditions are
				// met to hoist this allocation and sink its deallocations.
				bool mayHoist(const CallInst *CI) const;

				// Retrieve the list of free calls in the loop matching a given allocation.
				const SmallVector<CallInst *, 16> &FreesForMalloc(CallInst *CI) const {
				assert(Entries.count(CI));
				return Entries.find(CI)->second.Deallocations;
				}

				// Retrieve the list of loop exit blocks that are only entered after the
				// allocation has been deallocated. All other loop exit blocks are reached
				// with the allocation still allocated.
				const SmallVector<BasicBlock *, 16> &
				ExitBlocksWithPointerFreed(CallInst *CI) const {
				assert(Entries.count(CI));
				return Entries.find(CI)->second.ExitBlocksToAddDeallocationsTo;
				}

				// Returns whether the loop has any potentially-hoistable allocations.
				bool empty() const { return Entries.empty(); }

				private:
				struct AllocationInfoEntry {
				AllocationInfoEntry(
				SmallVector<CallInst *, 16> &&Deallocations,
				SmallVector<BasicBlock *, 16> &&ExitBlocksToAddDeallocationsTo)
				: Deallocations(Deallocations),
				ExitBlocksToAddDeallocationsTo(ExitBlocksToAddDeallocationsTo) {}

				// Instructions that deallocate this pointer, but may or may not be
				// executed.
				SmallVector<CallInst *, 16> Deallocations;

				// Loop exit blocks which can only be entered with the pointer already
				// freed.
				SmallVector<BasicBlock *, 16> ExitBlocksToAddDeallocationsTo;
				};

				// Store deallocation info for a given allocation. Keyed on the allocation
				// call.
				DenseMap<const CallInst *, AllocationInfoEntry> Entries;

				// Look up allocation matching a given deallocation.
				DenseMap<const CallInst *, const CallInst *> DeallocationToAllocation;

				// Deallocations that LICM has determined are safe to sink. They can be sunk
				// only if the whole collection of Deallocations in an entry are sunk and its
				// matching allocation is hoisted.
				DenseSet<const CallInst *> SafeToSink;
				};

				} // namespace llvm

				#endif
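
The header-block scan described in the class comment can be pictured with a toy instruction list: walk the header from the top, collect mallocs whose size is loop invariant, and stop at the first opaque call. The sketch below assumes that reading of the example in the comment. `ToyInst`, `Kind` and `candidateAllocations` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Malloc, OpaqueCall, Other };

struct ToyInst {
  std::string name;
  Kind kind;
  bool sizeIsLoopInvariant; // only meaningful for Kind::Malloc
};

// Returns the names of the mallocs that would be considered for hoisting:
// those with a loop-invariant size that appear before any opaque call.
std::vector<std::string>
candidateAllocations(const std::vector<ToyInst> &header) {
  std::vector<std::string> out;
  for (const ToyInst &I : header) {
    assert(!I.name.empty());
    if (I.kind == Kind::OpaqueCall)
      break; // an opaque call may allocate or free: stop scanning
    if (I.kind == Kind::Malloc && I.sizeIsLoopInvariant)
      out.push_back(I.name);
  }
  return out;
}
```

Fed the five instructions from the comment's `loop_top` example, this keeps `ptr1` and `ptr3` and drops `ptr2` and `ptr5`.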

llvm/include/llvm/Transforms/Utils/LoopUtils.h

[33 lines not shown]

  namespace llvm {

  class AliasSet;
  class AliasSetTracker;
  class BasicBlock;
  class DataLayout;
  class Loop;
+ class LoopAllocationInfo;
  class LoopInfo;
  class MemoryAccess;
  class MemorySSAUpdater;
  class OptimizationRemarkEmitter;
  class PredicatedScalarEvolution;
  class PredIteratorCache;
  class ScalarEvolution;
  class SCEV;
[64 lines not shown]
  /// uses before definitions, allowing us to sink a loop body in one pass without
  /// iteration. Takes DomTreeNode, AliasAnalysis, LoopInfo, DominatorTree,
  /// DataLayout, TargetLibraryInfo, Loop, AliasSet information for all
  /// instructions of the loop and loop safety information as
  /// arguments. Diagnostics is emitted via \p ORE. It returns changed status.
  bool sinkRegion(DomTreeNode *, AliasAnalysis *, LoopInfo *, DominatorTree *,
  TargetLibraryInfo *, TargetTransformInfo *, Loop *,
  AliasSetTracker *, MemorySSAUpdater *, ICFLoopSafetyInfo *,
- SinkAndHoistLICMFlags &, OptimizationRemarkEmitter *);
+ LoopAllocationInfo *, SinkAndHoistLICMFlags &,
+ OptimizationRemarkEmitter *);

  /// Walk the specified region of the CFG (defined by all blocks
  /// dominated by the specified block, and that are in the current loop) in depth
  /// first order w.r.t the DominatorTree. This allows us to visit definitions
  /// before uses, allowing us to hoist a loop body in one pass without iteration.
  /// Takes DomTreeNode, AliasAnalysis, LoopInfo, DominatorTree, DataLayout,
  /// TargetLibraryInfo, Loop, AliasSet information for all instructions of the
  /// loop and loop safety information as arguments. Diagnostics is emitted via \p
  /// ORE. It returns changed status.
  bool hoistRegion(DomTreeNode *, AliasAnalysis *, LoopInfo *, DominatorTree *,
  TargetLibraryInfo *, Loop *, AliasSetTracker *,
  MemorySSAUpdater *, ICFLoopSafetyInfo *,
- SinkAndHoistLICMFlags &, OptimizationRemarkEmitter *);
+ LoopAllocationInfo *, SinkAndHoistLICMFlags &,
+ OptimizationRemarkEmitter *);

  /// This function deletes dead loops. The caller of this function needs to
  /// guarantee that the loop is infact dead.
  /// The function requires a bunch or prerequisites to be present:
  /// - The loop needs to be in LCSSA form
  /// - The loop needs to have a Preheader
  /// - A unique dedicated exit block must exist
  ///
[211 lines not shown]

llvm/lib/Analysis/CMakeLists.txt

[44 lines not shown; context: add_llvm_library(LLVMAnalysis]
  LazyBranchProbabilityInfo.cpp
  LazyBlockFrequencyInfo.cpp
  LazyCallGraph.cpp
  LazyValueInfo.cpp
  LegacyDivergenceAnalysis.cpp
  Lint.cpp
  Loads.cpp
  LoopAccessAnalysis.cpp
+ LoopAllocationInfo.cpp
  LoopAnalysisManager.cpp
  LoopUnrollAnalyzer.cpp
  LoopInfo.cpp
  LoopPass.cpp
  MemDepPrinter.cpp
  MemDerefPrinter.cpp
  MemoryBuiltins.cpp
  MemoryDependenceAnalysis.cpp
[44 lines not shown]

llvm/lib/Analysis/LoopAllocationInfo.cpp

This file was added.

				//===- LoopAllocationInfo.cpp - memory allocations inside loops -----------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Analysis/LoopAllocationInfo.h"

				#include "llvm/ADT/BitVector.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/DenseSet.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/Analysis/CFG.h"
				#include "llvm/Analysis/LoopInfo.h"
				#include "llvm/Analysis/MemoryBuiltins.h"
				#include "llvm/Analysis/ValueTracking.h"
				#include "llvm/IR/BasicBlock.h"
				#include "llvm/IR/CFG.h"
				#include "llvm/IR/Dominators.h"
				#include "llvm/IR/Instruction.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/PatternMatch.h"
				#include "llvm/Support/Casting.h"

				namespace llvm {

				/// Track ptr = allocfn to one or more freefn(ptr) relationships within the
				/// loop.
				///
				/// This analysis stores information about allocation and deallocation calls
				/// where the pointer is only live within the loop.
				///
				/// An allocation must be both hoisted and sunk, or neither. LICM performs
				/// sinking first, then hoisting. As sinking and hoisting is performed, more
				/// instructions will appear to be loop invariant, making more opportunities.
				/// For this reason we split the analysis into two pieces:
				/// 1. for a given pointer that is only live in the loop, retain a list of its
				/// matching deallocating instructions.
				/// 2. during sinking, if we would sink the deallocation, don't, but inform
				/// the LoopAllocationInfo. If all deallocations would have been sunk, and
				/// hoistSink would hoist the allocation, perform both the hoist and sink
				/// the various deallocation instructions.
				///
				/// We don't need guaranteed execution of any deallocation function during
				/// unwinding. Before this transformation, an unwind mid-loop would leak one
				/// pointer that was allocated at the start of the loop. After this
				/// transformation, it would leak the one pointer allocated before the loop
				/// began. There is no difference visible to the caller.
				///
				/// We do have to ensure that the maximum amount of allocated memory is not
				/// raised by this transformation. To ensure this, we stop looking for
				/// allocations after seeing an opaque function call. For example:
				/// loop_top:
				/// %ptr1 = call i8* @malloc(i32 %loop_invariant1) ;; candidate
				/// %ptr2 = call i8* @malloc(i32 %loop_variant2) ;; not candidate
				/// %ptr3 = call i8* @malloc(i32 %loop_invariant3) ;; candidate
				/// call void @opaque() ;; stop scanning
				/// %ptr5 = call i8* @malloc(i32 %loop_invariant5) ;; not candidate
				///
				/// The same idea applies in reverse order to @free calls at the loop exits:
				/// loop_exiting:
				/// call void @free(%ptr1) ;; not candidate
				/// call void @opaque() ;; stop scanning
				/// call void @free(%ptr2) ;; %ptr2 is candidate
				/// ;; start here and scan upwards
				void LoopAllocationInfo::analyzeLoop(LoopInfo *LI, DominatorTree *DT,
				TargetLibraryInfo *TLI, Loop *CurLoop) {
				assert(LI != nullptr && DT != nullptr && CurLoop != nullptr &&
				"Unexpected input to LoopAllocationInfo::analyzeLoop.");
				if (!TLI)
				return;

				SmallVector<BasicBlock *, 32> LatchBlocks;
				CurLoop->getLoopLatches(LatchBlocks);
				if (LatchBlocks.empty())
				return;

				SmallVector<BasicBlock *, 32> ExitBlocks;
				CurLoop->getUniqueExitBlocks(ExitBlocks);
				if (ExitBlocks.empty())
				return;

				// Unlike hoistRegion, we only scan the header block. If the allocation is in
				// a later block, it must be behind some sort of branch, which implies that
				// it's conditional on something (or else another optimization would've
				// removed the unconditional branch earlier).
				BasicBlock *HeaderBB = CurLoop->getHeader();
				SmallPtrSet<BasicBlock *, 1> HeaderBlock;
				HeaderBlock.insert(HeaderBB);
				for (BasicBlock::iterator I = HeaderBB->begin(), E = HeaderBB->end(); I != E;
				++I) {
				reames (Done): As discussed offline, bug example: for (int i = 0; i < N; i++) { throw_if(requested_size > TOO_BIG); a = malloc(requested_size); free(a); }
				nicholas (Author, Done): I've updated this to use isGuaranteedToTransferExecutionToSuccessor on the instructions leading up to the allocation. The "throw_if" example couldn't happen because any opaque function might also malloc or free, so the test @test16 uses a volatile load instead.
				// Find a block of allocations that are guaranteed to execute, ignoring the
				// fact that allocation functions are themselves able to exit the program or
				// unwind.

				// There are no allocation functions which are guaranteed to continue to
				// execute the next instruction.
				if (isGuaranteedToTransferExecutionToSuccessor(&*I)) {
				// These two intrinsics increase and decrease heap size respectively.
				// Don't hoist over an allocating intrinsic in order to ensure that any
				// out-of-memory conditions will occur on the same instruction, in case
				// there are differences in how the instructions handle OOM. Don't hoist
				// over a deallocating intrinsic to avoid increasing the maximum heap
				// allocated.
				if (match(&*I, PatternMatch::m_Intrinsic<
				Intrinsic::objc_autoreleasePoolPush>()) ||
				match(
				&*I,
				PatternMatch::m_Intrinsic<Intrinsic::objc_autoreleasePoolPop>()))
				break;

				continue;
				}

				if (!llvm::isAllocationFn(&*I, TLI))
				break;
				auto CI = cast<CallInst>(I);

				// Find places in the loop where the pointer is certainly freed. The places
				// we identify must free this allocation, but there may be any number of
				// other places that could free our pointer which we miss.
				//
				// We use this to ensure that the allocation is freed before we go around
				// any loop latch.
				SmallVector<CallInst *, 16> FreeCalls;
				constexpr size_t UseVisitThreshold = 20;
				SmallVector<Instruction *, UseVisitThreshold> Worklist;
				SmallPtrSet<Instruction *, UseVisitThreshold> Visited;
				auto Enqueue = [&](const Instruction *I) {
				for (const Use &U : I->uses()) {
				Instruction *User = cast<Instruction>(U.getUser());
				if (!CurLoop->contains(User))
				continue;
				if (!Visited.insert(User).second)
				continue;
				if (Visited.size() == UseVisitThreshold)
				return;
				Worklist.push_back(User);
				}
				};
				Enqueue(CI);
				do {
				Instruction *I = Worklist.pop_back_val();
				if (isa<BitCastInst>(I)) {
				Enqueue(I);
				} else if (auto *GEP = dyn_cast<GetElementPtrInst>(I)) {
				if (GEP->hasAllZeroIndices())
				Enqueue(GEP);
				} else if (auto *PN = dyn_cast<PHINode>(I)) {
				if (PN->hasConstantOrUndefValue())
				Enqueue(I);
				} else if (llvm::isFreeCall(I, TLI)) {
				FreeCalls.push_back(cast<CallInst>(I));
				}
				} while (!Worklist.empty() && Visited.size() < UseVisitThreshold);

				if (FreeCalls.empty())
				continue;

				SmallPtrSet<BasicBlock *, 32> FreeCallBBs;
				for (auto FreeCall : FreeCalls)
				FreeCallBBs.insert(FreeCall->getParent());

				bool DoNotTransform = false;

				// Check that the frees we found cover all the latches.
				for (BasicBlock *LatchBB : LatchBlocks) {
				if (FreeCallBBs.count(LatchBB))
				continue;
				SmallVector<BasicBlock *, 32> HeaderWorklist;
				HeaderWorklist.push_back(HeaderBB);
				if (llvm::isPotentiallyReachableFromMany(HeaderWorklist, LatchBB,
				&FreeCallBBs, DT, LI)) {
				DoNotTransform = true;
				break;
				}
				}
				if (DoNotTransform)
				continue;

				// Determine whether each exit block is reached with the pointer freed or
				// not. If it can be reached with the pointer conditionally freed, we abort
				// the transform.
				SmallVector<BasicBlock *, 16> InsertDeallocation;
				for (auto ExitBB : ExitBlocks) {
				SmallVector<BasicBlock *, 32> HeaderWorklist, FreeCallWorklist;
				HeaderWorklist.push_back(HeaderBB);
				FreeCallWorklist.insert(FreeCallWorklist.end(), FreeCallBBs.begin(),
				FreeCallBBs.end());
				// Starting at the header, is there a path to ExitBB without passing
				// through any frees?
				bool ReachExitBBWithoutFree = llvm::isPotentiallyReachableFromMany(
				HeaderWorklist, ExitBB, &FreeCallBBs, DT, LI);
				// Starting from each of the frees, is there a path to ExitBB without
				// passing back through the header, or is this a single-block loop where
				// the malloc and free are both in the header?
				bool ReachExitBBWithFree =
				FreeCallBBs.count(HeaderBB) ||
				llvm::isPotentiallyReachableFromMany(FreeCallWorklist, ExitBB,
				&HeaderBlock, DT, LI);
				assert((ReachExitBBWithoutFree || ReachExitBBWithFree) &&
				"loop exit blocks not reachable from loop");
				if (ReachExitBBWithoutFree && ReachExitBBWithFree) {
				// There exist paths to this exit BB where the allocation is freed and
				// paths where it is not.
				DoNotTransform = true;
				break;
				}
				if (ReachExitBBWithFree) {
				// Some blocks, such as those with a catchswitch instruction, can't
				// have a free call inserted in them. Give up transforming the
				// allocation in that case.
				if (ExitBB->getFirstInsertionPt() == ExitBB->end()) {
				DoNotTransform = true;
				break;
				}
				// We'll need to insert a free-call into this exit BB if we hoist the
				// allocation out of the loop.
				InsertDeallocation.push_back(ExitBB);
				}
				}
				if (DoNotTransform)
				continue;

				for (const CallInst *FreeCall : FreeCalls)
				DeallocationToAllocation[FreeCall] = CI;
				Entries.try_emplace(CI, std::move(FreeCalls),
				std::move(InsertDeallocation));
				}
				}

				// Given an allocation call, are all the safety conditions met to hoist this
				// allocation and sink its deallocations?
				bool LoopAllocationInfo::mayHoist(const CallInst *CI) const {
				auto it = Entries.find(CI);
				if (it == Entries.end())
				return false;
				for (const CallInst *FreeCall : it->second.Deallocations) {
				if (!SafeToSink.count(FreeCall))
				return false;
				}
				return true;
				}

				} // namespace llvm
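
The latch-coverage check in analyzeLoop above asks a plain graph question: can a latch be reached from the header without entering any block that contains a matching free? The following is a minimal sketch of that question outside of LLVM, not the patch's implementation. `Graph`, `reachableAvoiding`, and `freesCoverLatch` are hypothetical names, blocks are plain integers, and the sketch works at whole-block granularity, so it assumes the ordering of malloc and free inside a single block is checked elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <set>
#include <vector>

// Hypothetical stand-in for a CFG: Succs[B] lists the successors of block B.
using Graph = std::vector<std::vector<int>>;

// Returns true if Target can be reached from Start along a path that never
// enters a block in Blocked (Start and Target themselves included).
bool reachableAvoiding(const Graph &Succs, int Start, int Target,
                       const std::set<int> &Blocked) {
  assert(Start >= 0 && static_cast<std::size_t>(Start) < Succs.size());
  if (Blocked.count(Start) || Blocked.count(Target))
    return false;
  std::vector<bool> Seen(Succs.size(), false);
  std::queue<int> Work;
  Work.push(Start);
  Seen[Start] = true;
  while (!Work.empty()) {
    int B = Work.front();
    Work.pop();
    if (B == Target)
      return true;
    for (int S : Succs[B]) {
      // Never walk into a block that frees the pointer.
      if (Seen[S] || Blocked.count(S))
        continue;
      Seen[S] = true;
      Work.push(S);
    }
  }
  return false;
}

// The frees "cover" a latch when no free-less path leads from the header to it.
bool freesCoverLatch(const Graph &Succs, int Header, int Latch,
                     const std::set<int> &FreeBlocks) {
  return !reachableAvoiding(Succs, Header, Latch, FreeBlocks);
}
```

For a diamond-shaped loop body (header 0 branching to blocks 1 and 2, both joining at latch 3), frees in both arms cover the latch, while a free in only one arm does not, which is the case where the transform has to be abandoned.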

llvm/lib/Transforms/Scalar/LICM.cpp

[... 34 lines not shown ...]
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/AliasSetTracker.h"		#include "llvm/Analysis/AliasSetTracker.h"
#include "llvm/Analysis/BasicAliasAnalysis.h"		#include "llvm/Analysis/BasicAliasAnalysis.h"
#include "llvm/Analysis/CaptureTracking.h"		#include "llvm/Analysis/CaptureTracking.h"
#include "llvm/Analysis/ConstantFolding.h"		#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/Analysis/GlobalsModRef.h"		#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/GuardUtils.h"		#include "llvm/Analysis/GuardUtils.h"
#include "llvm/Analysis/Loads.h"		#include "llvm/Analysis/Loads.h"
		#include "llvm/Analysis/LoopAllocationInfo.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/LoopIterator.h"		#include "llvm/Analysis/LoopIterator.h"
#include "llvm/Analysis/LoopPass.h"		#include "llvm/Analysis/LoopPass.h"
#include "llvm/Analysis/MemoryBuiltins.h"		#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/MemorySSA.h"		#include "llvm/Analysis/MemorySSA.h"
#include "llvm/Analysis/MemorySSAUpdater.h"		#include "llvm/Analysis/MemorySSAUpdater.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"		#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/Analysis/ScalarEvolution.h"		#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/Analysis/ScalarEvolutionAliasAnalysis.h"		#include "llvm/Analysis/ScalarEvolutionAliasAnalysis.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/CFG.h"		#include "llvm/IR/CFG.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DebugInfoMetadata.h"		#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"		#include "llvm/IR/Metadata.h"
#include "llvm/IR/PatternMatch.h"		#include "llvm/IR/PatternMatch.h"
#include "llvm/IR/PredIteratorCache.h"		#include "llvm/IR/PredIteratorCache.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
[... 283 lines not shown ...]	bool LoopInvariantCodeMotion::runOnLoop(

// Get the preheader block to move instructions into...		// Get the preheader block to move instructions into...
BasicBlock *Preheader = L->getLoopPreheader();		BasicBlock *Preheader = L->getLoopPreheader();

// Compute loop safety information.		// Compute loop safety information.
ICFLoopSafetyInfo SafetyInfo(DT);		ICFLoopSafetyInfo SafetyInfo(DT);
SafetyInfo.computeLoopSafetyInfo(L);		SafetyInfo.computeLoopSafetyInfo(L);

		// Scan for matching malloc/free pairs.
		LoopAllocationInfo LAI;
		if (L->hasDedicatedExits() && Preheader)
		LAI.analyzeLoop(LI, DT, TLI, L);

// We want to visit all of the instructions in this loop... that are not parts		// We want to visit all of the instructions in this loop... that are not parts
// of our subloops (they have already had their invariants hoisted out of		// of our subloops (they have already had their invariants hoisted out of
// their loop, into this loop, so there is no need to process the BODIES of		// their loop, into this loop, so there is no need to process the BODIES of
// the subloops).		// the subloops).
//		//
// Traverse the body of the loop in depth first order on the dominator tree so		// Traverse the body of the loop in depth first order on the dominator tree so
// that we are guaranteed to see definitions before we see uses. This allows		// that we are guaranteed to see definitions before we see uses. This allows
// us to sink instructions in one pass, without iteration. After sinking		// us to sink instructions in one pass, without iteration. After sinking
// instructions, we perform another pass to hoist them out of the loop.		// instructions, we perform another pass to hoist them out of the loop.
SinkAndHoistLICMFlags Flags = {NoOfMemAccTooLarge, LicmMssaOptCounter,		SinkAndHoistLICMFlags Flags = {NoOfMemAccTooLarge, LicmMssaOptCounter,
LicmMssaOptCap, LicmMssaNoAccForPromotionCap};		LicmMssaOptCap, LicmMssaNoAccForPromotionCap};
if (L->hasDedicatedExits())		if (L->hasDedicatedExits())
Changed |= sinkRegion(DT->getNode(L->getHeader()), AA, LI, DT, TLI, TTI, L,		Changed |= sinkRegion(DT->getNode(L->getHeader()), AA, LI, DT, TLI, TTI, L,
CurAST.get(), MSSAU.get(), &SafetyInfo, Flags, ORE);		CurAST.get(), MSSAU.get(), &SafetyInfo, &LAI, Flags,
		ORE);
if (Preheader)		if (Preheader)
Changed |= hoistRegion(DT->getNode(L->getHeader()), AA, LI, DT, TLI, L,		Changed |= hoistRegion(DT->getNode(L->getHeader()), AA, LI, DT, TLI, L,
CurAST.get(), MSSAU.get(), &SafetyInfo, Flags, ORE);		CurAST.get(), MSSAU.get(), &SafetyInfo, &LAI, Flags,
		ORE);

// Now that all loop invariants have been removed from the loop, promote any		// Now that all loop invariants have been removed from the loop, promote any
// memory references to scalars that we can.		// memory references to scalars that we can.
// Don't sink stores from loops without dedicated block exits. Exits		// Don't sink stores from loops without dedicated block exits. Exits
// containing indirect branches are not transformed by loop simplify,		// containing indirect branches are not transformed by loop simplify,
// make sure we catch that. An additional load may be generated in the		// make sure we catch that. An additional load may be generated in the
// preheader for SSA updater, so also avoid sinking when no preheader		// preheader for SSA updater, so also avoid sinking when no preheader
// is available.		// is available.
[... 87 lines not shown ...]
/// the specified block, and that are in the current loop) in reverse depth		/// the specified block, and that are in the current loop) in reverse depth
/// first order w.r.t the DominatorTree. This allows us to visit uses before		/// first order w.r.t the DominatorTree. This allows us to visit uses before
/// definitions, allowing us to sink a loop body in one pass without iteration.		/// definitions, allowing us to sink a loop body in one pass without iteration.
///		///
bool llvm::sinkRegion(DomTreeNode *N, AliasAnalysis *AA, LoopInfo *LI,		bool llvm::sinkRegion(DomTreeNode *N, AliasAnalysis *AA, LoopInfo *LI,
DominatorTree *DT, TargetLibraryInfo *TLI,		DominatorTree *DT, TargetLibraryInfo *TLI,
TargetTransformInfo *TTI, Loop *CurLoop,		TargetTransformInfo *TTI, Loop *CurLoop,
AliasSetTracker *CurAST, MemorySSAUpdater *MSSAU,		AliasSetTracker *CurAST, MemorySSAUpdater *MSSAU,
ICFLoopSafetyInfo *SafetyInfo,		ICFLoopSafetyInfo *SafetyInfo, LoopAllocationInfo *LAI,
SinkAndHoistLICMFlags &Flags,		SinkAndHoistLICMFlags &Flags,
OptimizationRemarkEmitter *ORE) {		OptimizationRemarkEmitter *ORE) {

// Verify inputs.		// Verify inputs.
assert(N != nullptr && AA != nullptr && LI != nullptr && DT != nullptr &&		assert(N != nullptr && AA != nullptr && LI != nullptr && DT != nullptr &&
CurLoop != nullptr && SafetyInfo != nullptr &&		CurLoop != nullptr && SafetyInfo != nullptr && LAI != nullptr &&
"Unexpected input to sinkRegion.");		"Unexpected input to sinkRegion.");
assert(((CurAST != nullptr) ^ (MSSAU != nullptr)) &&		assert(((CurAST != nullptr) ^ (MSSAU != nullptr)) &&
"Either AliasSetTracker or MemorySSA should be initialized.");		"Either AliasSetTracker or MemorySSA should be initialized.");

// We want to visit children before parents. We will enque all the parents		// We want to visit children before parents. We will enque all the parents
// before their children in the worklist and process the worklist in reverse		// before their children in the worklist and process the worklist in reverse
// order.		// order.
SmallVector<DomTreeNode *, 16> Worklist = collectChildrenInLoop(N, CurLoop);		SmallVector<DomTreeNode *, 16> Worklist = collectChildrenInLoop(N, CurLoop);

bool Changed = false;		bool Changed = false;
for (DomTreeNode *DTN : reverse(Worklist)) {		for (DomTreeNode *DTN : reverse(Worklist)) {
BasicBlock *BB = DTN->getBlock();		BasicBlock *BB = DTN->getBlock();
// Only need to process the contents of this block if it is not part of a		// Only need to process the contents of this block if it is not part of a
// subloop (which would already have been processed).		// subloop (which would already have been processed).
if (inSubLoop(BB, CurLoop, LI))		if (inSubLoop(BB, CurLoop, LI))
continue;		continue;

		bool ScanForFrees = !LAI->empty() && !isa<CallBase>(BB->getTerminator());
		nicholas (Author, Done): Rename this to something clearer, such as "ScanForFrees".

for (BasicBlock::iterator II = BB->end(); II != BB->begin();) {		for (BasicBlock::iterator II = BB->end(); II != BB->begin();) {
Instruction &I = *--II;		Instruction &I = *--II;

// If the instruction is dead, we would try to sink it because it isn't		// If the instruction is dead, we would try to sink it because it isn't
// used in the loop, instead, just delete it.		// used in the loop, instead, just delete it.
if (isInstructionTriviallyDead(&I, TLI)) {		if (isInstructionTriviallyDead(&I, TLI)) {
LLVM_DEBUG(dbgs() << "LICM deleting dead inst: " << I << '\n');		LLVM_DEBUG(dbgs() << "LICM deleting dead inst: " << I << '\n');
salvageDebugInfo(I);		salvageDebugInfo(I);
[... 16 lines not shown ...]	for (BasicBlock::iterator II = BB->end(); II != BB->begin();) {
if (sink(I, LI, DT, CurLoop, SafetyInfo, MSSAU, ORE)) {		if (sink(I, LI, DT, CurLoop, SafetyInfo, MSSAU, ORE)) {
if (!FreeInLoop) {		if (!FreeInLoop) {
++II;		++II;
eraseInstruction(I, *SafetyInfo, CurAST, MSSAU);		eraseInstruction(I, *SafetyInfo, CurAST, MSSAU);
}		}
Changed = true;		Changed = true;
}		}
}		}

		if (ScanForFrees) {
		if (llvm::isFreeCall(&I, TLI)) {
		LAI->addSafeToSink(cast<CallInst>(&I));
		} else if (isa<CallBase>(I) &&
		(!isa<IntrinsicInst>(I) ||
		reames (Not Done): What about invokes? (Use callbase)
		nicholas (Author, Done): Done. This loop skips the terminator, so it can't be an invoke. Which is, indeed, a miscompile bug. Fixed above where we initialize ScanForFrees and added @test19. Changed it to CallBase here anyways.
		match(&I, PatternMatch::m_Intrinsic<
		Intrinsic::objc_autoreleasePoolPush>()))) {
		// This call might allocate memory. Don't move frees which are before
		// it to after it.
		ScanForFrees = false;
		}
		}
}		}
}		}
if (MSSAU && VerifyMemorySSA)		if (MSSAU && VerifyMemorySSA)
MSSAU->getMemorySSA()->verifyMemorySSA();		MSSAU->getMemorySSA()->verifyMemorySSA();
return Changed;		return Changed;
}		}

namespace {		namespace {
[... 228 lines not shown ...]
/// Walk the specified region of the CFG (defined by all blocks dominated by		/// Walk the specified region of the CFG (defined by all blocks dominated by
/// the specified block, and that are in the current loop) in depth first		/// the specified block, and that are in the current loop) in depth first
/// order w.r.t the DominatorTree. This allows us to visit definitions before		/// order w.r.t the DominatorTree. This allows us to visit definitions before
/// uses, allowing us to hoist a loop body in one pass without iteration.		/// uses, allowing us to hoist a loop body in one pass without iteration.
///		///
bool llvm::hoistRegion(DomTreeNode *N, AliasAnalysis *AA, LoopInfo *LI,		bool llvm::hoistRegion(DomTreeNode *N, AliasAnalysis *AA, LoopInfo *LI,
DominatorTree *DT, TargetLibraryInfo *TLI, Loop *CurLoop,		DominatorTree *DT, TargetLibraryInfo *TLI, Loop *CurLoop,
AliasSetTracker *CurAST, MemorySSAUpdater *MSSAU,		AliasSetTracker *CurAST, MemorySSAUpdater *MSSAU,
ICFLoopSafetyInfo *SafetyInfo,		ICFLoopSafetyInfo *SafetyInfo, LoopAllocationInfo *LAI,
SinkAndHoistLICMFlags &Flags,		SinkAndHoistLICMFlags &Flags,
OptimizationRemarkEmitter *ORE) {		OptimizationRemarkEmitter *ORE) {
// Verify inputs.		// Verify inputs.
assert(N != nullptr && AA != nullptr && LI != nullptr && DT != nullptr &&		assert(N != nullptr && AA != nullptr && LI != nullptr && DT != nullptr &&
CurLoop != nullptr && SafetyInfo != nullptr &&		CurLoop != nullptr && SafetyInfo != nullptr && LAI != nullptr &&
"Unexpected input to hoistRegion.");		"Unexpected input to hoistRegion.");
assert(((CurAST != nullptr) ^ (MSSAU != nullptr)) &&		assert(((CurAST != nullptr) ^ (MSSAU != nullptr)) &&
"Either AliasSetTracker or MemorySSA should be initialized.");		"Either AliasSetTracker or MemorySSA should be initialized.");

ControlFlowHoister CFH(LI, DT, CurLoop, MSSAU);		ControlFlowHoister CFH(LI, DT, CurLoop, MSSAU);

// Keep track of instructions that have been hoisted, as they may need to be		// Keep track of instructions that have been hoisted, as they may need to be
// re-hoisted if they end up not dominating all of their uses.		// re-hoisted if they end up not dominating all of their uses.
[... 44 lines not shown ...]	for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E;) {
CurLoop->getLoopPreheader()->getTerminator())) {		CurLoop->getLoopPreheader()->getTerminator())) {
hoist(I, DT, CurLoop, CFH.getOrCreateHoistedBlock(BB), SafetyInfo,		hoist(I, DT, CurLoop, CFH.getOrCreateHoistedBlock(BB), SafetyInfo,
MSSAU, ORE);		MSSAU, ORE);
HoistedInstructions.push_back(&I);		HoistedInstructions.push_back(&I);
Changed = true;		Changed = true;
continue;		continue;
}		}

		// If it's a malloc/free pair, try hoisting it out to the preheader. We
		// don't need to worry about intervening unwinding function calls in this
		// case, so we can be more aggressive than for generic instructions.
		if (CurLoop->hasLoopInvariantOperands(&I) &&
		llvm::isAllocationFn(&I, TLI) && LAI->mayHoist(cast<CallInst>(&I))) {
		IRBuilder<>(&I).CreateLifetimeStart(&I);
		hoist(I, DT, CurLoop, CFH.getOrCreateHoistedBlock(BB), SafetyInfo,
		MSSAU, ORE);
		HoistedInstructions.push_back(&I);
		auto CI = cast<CallInst>(&I);
		reames (Done): As debated offline, naming here is problematic. addDeallocations -> locationsNeedingFrees? newAllocPlacement? toSink -> freesForMalloc (key piece: avoid action names). A comment would help as well. Pull out the cast to CallInst.
		nicholas (Author, Not Done): Is picking the first one out of toSink/freesForMalloc a problem? The order depends on the use-list ordering. I don't know what LLVM's current rules on that are, I know that there's https://llvm.org/docs/LangRef.html#use-list-order-directives . Is it considered a bug to have an optimization whose result depends on use list order?
		nicholas (Author, Not Done): Is it possible to have an operand bundle on a malloc or free call? In CloneInstructionInExitBlock, LICM updates the funclet operand bundle for the new location in the CFG.

		// One deallocation will be cloned into all the necessary exit blocks
		// then the deallocations will be deleted.
		reames (Done): You might wish to either a) union the source locations, or b) restrict this to non -g. See applyMergedLocation on Instruction.
		CallInst *Pattern = LAI->FreesForMalloc(CI)[0];
		Pattern->dropUnknownNonDebugMetadata();

		// In case it is freeing a copy of the pointer not available in all
		// exits of the loop.
		Pattern->setArgOperand(0, CI);

		// Merge debug locations. We don't track which frees have paths to which
		// exit blocks, so we merge all the debug info to put on all the frees
		// being placed in the exit blocks.
		for (auto FreeCall : LAI->FreesForMalloc(CI)) {
		Pattern->applyMergedLocation(Pattern->getDebugLoc(),
		FreeCall->getDebugLoc());
		}

		// Insert frees into the loop exit blocks where we know the pointer
		// would have been freed.
		for (auto ExitBlock : LAI->ExitBlocksWithPointerFreed(CI)) {
		Pattern->clone()->insertBefore(&*ExitBlock->getFirstInsertionPt());
		}

		// Replace the original frees with a @lifetime.end marker.
		for (auto FreeCall : LAI->FreesForMalloc(CI)) {
		IRBuilder<>(FreeCall).CreateLifetimeEnd(&I);
		eraseInstruction(*FreeCall, *SafetyInfo, CurAST, MSSAU);
		}

		Changed = true;
		continue;
		}

// Attempt to remove floating point division out of the loop by		// Attempt to remove floating point division out of the loop by
// converting it to a reciprocal multiplication.		// converting it to a reciprocal multiplication.
if (I.getOpcode() == Instruction::FDiv &&		if (I.getOpcode() == Instruction::FDiv &&
CurLoop->isLoopInvariant(I.getOperand(1)) &&		CurLoop->isLoopInvariant(I.getOperand(1)) &&
I.hasAllowReciprocal()) {		I.hasAllowReciprocal()) {
auto Divisor = I.getOperand(1);		auto Divisor = I.getOperand(1);
auto One = llvm::ConstantFP::get(Divisor->getType(), 1.0);		auto One = llvm::ConstantFP::get(Divisor->getType(), 1.0);
auto ReciprocalDivisor = BinaryOperator::CreateFDiv(One, Divisor);		auto ReciprocalDivisor = BinaryOperator::CreateFDiv(One, Divisor);
[... 1,370 lines not shown ...]	static bool pointerInvalidatedByLoop(MemoryLocation MemLoc,
// Don't look at nested loops.		// Don't look at nested loops.
if (CurLoop->begin() != CurLoop->end())		if (CurLoop->begin() != CurLoop->end())
return true;		return true;

int N = 0;		int N = 0;
for (BasicBlock *BB : CurLoop->getBlocks())		for (BasicBlock *BB : CurLoop->getBlocks())
for (Instruction &I : *BB) {		for (Instruction &I : *BB) {
if (N >= LICMN2Theshold) {		if (N >= LICMN2Theshold) {
LLVM_DEBUG(dbgs() << "Alasing N2 threshold exhausted for "		LLVM_DEBUG(dbgs() << "Aliasing N2 threshold exhausted for "
<< *(MemLoc.Ptr) << "\n");		<< *(MemLoc.Ptr) << "\n");
return true;		return true;
}		}
N++;		N++;
auto Res = AA->getModRefInfo(&I, MemLoc);		auto Res = AA->getModRefInfo(&I, MemLoc);
if (isModSet(Res)) {		if (isModSet(Res)) {
LLVM_DEBUG(dbgs() << "Aliasing failed on " << I << " for "		LLVM_DEBUG(dbgs() << "Aliasing failed on " << I << " for "
<< *(MemLoc.Ptr) << "\n");		<< *(MemLoc.Ptr) << "\n");
[... 29 lines not shown ...]
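
At the source level, the effect of the new malloc/free branch in hoistRegion can be pictured roughly as follows. This is a hand-written illustration under assumptions, not output of the patch: `sumPerIteration` and `sumHoisted` are hypothetical names, the allocation size is loop invariant and nonzero, and a do-while loop is used so that the allocation sits in a block that runs on every entry to the loop, matching the header-only scan in analyzeLoop.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Before: one malloc and one free on every iteration.
int sumPerIteration(int Iterations, std::size_t Size) {
  assert(Size > 0 && "sketch assumes a nonzero, loop-invariant size");
  int Sum = 0;
  int I = 0;
  do {
    unsigned char *Buf = static_cast<unsigned char *>(std::malloc(Size));
    if (!Buf)
      return -1;
    std::memset(Buf, I, Size); // stand-in for the @use call in the tests
    Sum += Buf[0];
    std::free(Buf);
  } while (++I < Iterations);
  return Sum;
}

// After: the allocation is hoisted in front of the loop and a single free is
// placed on the loop exit. The buffer is simply reused between iterations.
int sumHoisted(int Iterations, std::size_t Size) {
  assert(Size > 0 && "sketch assumes a nonzero, loop-invariant size");
  unsigned char *Buf = static_cast<unsigned char *>(std::malloc(Size));
  if (!Buf)
    return -1;
  int Sum = 0;
  int I = 0;
  do {
    std::memset(Buf, I, Size);
    Sum += Buf[0];
  } while (++I < Iterations);
  std::free(Buf);
  return Sum;
}
```

Both functions compute the same value. The second one holds at most one live buffer at a time, just like the first, which is the property the patch relies on when it argues that peak heap usage does not grow.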

llvm/test/Transforms/LICM/allocs.ll

This file was added.

				; RUN: opt -S -licm < %s | FileCheck %s
				; RUN: opt -S -enable-mssa-loop-dependency -licm < %s | FileCheck %s

				asbirlea (Done): Could you also add a RUN line with -enable-mssa-loop-dependency?
				declare i8* @malloc(i32) nounwind
				declare void @free(i8* nocapture) nounwind

				declare void @use(i8*)
				declare i1 @expr()

				define void @test1(i32 %loop_invariant) {
				; CHECK-LABEL: @test1
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				; CHECK: @use
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				declare void @mightmalloc()

				define void @test2(i32 %loop_invariant) {
				; CHECK-LABEL: @test2
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				; CHECK: @mightmalloc
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				call void @mightmalloc()
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: @free
				ret void
				}

				define void @test3(i32 %loop_invariant) {
				; CHECK-LABEL: @test3
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @mightmalloc
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				call void @mightmalloc()
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: @free
				ret void
				}

				define void @test4(i32 %loop_invariant) {
				; CHECK-LABEL: @test4
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK: @llvm.lifetime.start
				; CHECK: @llvm.lifetime.start
				; CHECK: @use
				; CHECK: @use
				; CHECK: @use
				; CHECK: @llvm.lifetime.end
				; CHECK: @llvm.lifetime.end
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%ptr1 = call i8* @malloc(i32 %loop_invariant)
				%ptr2 = call i8* @malloc(i32 %loop_invariant)
				%ptr3 = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr1)
				call void @use(i8* %ptr2)
				call void @use(i8* %ptr3)
				call void @free(i8* %ptr1)
				call void @free(i8* %ptr2)
				call void @free(i8* %ptr3)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				; CHECK: @free
				; CHECK: @free
				ret void
				}

				define void @test5(i32 %loop_invariant) {
				; CHECK-LABEL: @test5
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK: @llvm.lifetime.start
				; CHECK: @llvm.lifetime.start
				; CHECK: @use
				; CHECK: @use
				; CHECK: @use
				; CHECK: @llvm.lifetime.end
				; CHECK: @llvm.lifetime.end
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%ptr3 = call i8* @malloc(i32 %loop_invariant)
				%ptr2 = call i8* @malloc(i32 %loop_invariant)
				%ptr1 = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr1)
				call void @use(i8* %ptr2)
				call void @use(i8* %ptr3)
				call void @free(i8* %ptr1)
				call void @free(i8* %ptr2)
				call void @free(i8* %ptr3)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				; CHECK: @free
				; CHECK: @free
				ret void
				}

				define void @test6(i32 %loop_invariant) {
				; CHECK-LABEL: @test6
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: @malloc
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @llvm.lifetime.start
				; CHECK: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK: @use
				; CHECK: @use
				; CHECK: @use
				; CHECK: @llvm.lifetime.end
				; CHECK: @free
				; CHECK: @llvm.lifetime.end
				%i = phi i32 [%loop_invariant, %header], [%i.1, %loop]
				%ptr1 = call i8* @malloc(i32 %loop_invariant)
				%ptr2 = call i8* @malloc(i32 %i)
				%ptr3 = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr1)
				call void @use(i8* %ptr2)
				call void @use(i8* %ptr3)
				call void @free(i8* %ptr1)
				call void @free(i8* %ptr2)
				call void @free(i8* %ptr3)
				%i.1 = add i32 %i, 1
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				; CHECK: @free
				; CHECK-NOT: @free
				; CHECK: ret void
				ret void
				}

				define void @test7(i32 %loop_invariant) {
				; CHECK-LABEL: @test7
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				; CHECK: @use
				; CHECK-NOT: @free
				; CHECK-NOT: @llvm.lifetime.end
				%i = phi i32 [%loop_invariant, %header], [%i.1, %loop.2], [%i.1, %loop.3]
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				%i.1 = add i32 %i, 1
				%cmp = icmp eq i32 %i.1, 123
				br i1 %cmp, label %loop.2, label %loop.3

				loop.2:
				; CHECK-LABEL: loop.2:
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%continue.1 = call i1 @expr()
				call void @free(i8* %ptr)
				br i1 %continue.1, label %loop, label %loopexit

				loop.3:
				; CHECK-LABEL: loop.3:
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%continue.2 = call i1 @expr()
				call void @free(i8* %ptr)
				br i1 %continue.2, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				declare void @assert_fail()

				define void @test8(i32 %loop_invariant) {
				; CHECK-LABEL: @test8
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				%ptr = call i8* @malloc(i32 %loop_invariant)
				%fail = call i1 @expr()
				br i1 %fail, label %assert_fail, label %assert_pass

				assert_fail:
				call void @assert_fail()
				unreachable

				assert_pass:
				; CHECK-LABEL: assert_pass:
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				declare void @usei64(i64)

				define void @test9(i32 %loop_invariant) {
				; CHECK-LABEL: @test9
				header:
				; CHECK-LABEL: header:
				; CHECK: add
				; CHECK: @malloc
				; CHECK: ptrtoint
				; CHECK: and
				; CHECK-NOT: usei64
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				; CHECK: usei64
				%size = add i32 %loop_invariant, 1
				%ptr = call i8* @malloc(i32 %size)
				%ptrbits = ptrtoint i8* %ptr to i64
				%aligncheck = and i64 %ptrbits, 4
				call void @usei64(i64 %aligncheck)
				%fail = call i1 @expr()
				br i1 %fail, label %assert_fail, label %assert_pass

				assert_fail:
				call void @assert_fail()
				unreachable

				assert_pass:
				; CHECK-LABEL: assert_pass:
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				declare x86_stdcallcc void @_CxxThrowException(i8*, i8*)

				declare i32 @__CxxFrameHandler3(...)

				define void @test10(i32 %loop_invariant) personality i32 (...)* @__CxxFrameHandler3 {
				; CHECK-LABEL: @test10
				header:
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				invoke void @_CxxThrowException(i8* null, i8* null)
				to label %continue unwind label %catch.dispatch

				continue:
				%take.backedge = call i1 @expr()
				br i1 %take.backedge, label %loop, label %loopexit

				catch.dispatch:
				%cs = catchswitch within none [label %catch] unwind to caller

				catch: ; preds = %catch.dispatch
				%cp = catchpad within %cs [i8* null, i32 64, i8* null]
				catchret from %cp to label %loopexit

				loopexit:
				ret void
				}

				define void @test11(i32 %loop_invariant) personality i32 (...)* @__CxxFrameHandler3 {
				; CHECK-LABEL: @test11
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				; CHECK: @use
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				invoke void @_CxxThrowException(i8* null, i8* null)
				to label %continue unwind label %catch.dispatch

				continue:
				; CHECK-LABEL: continue:
				; CHECK-NOT: free
				; CHECK: lifetime.end
				; CHECK-NOT: free
				%take.backedge = call i1 @expr()
				call void @free(i8* %ptr)
				br i1 %take.backedge, label %loop, label %loopexit

				catch.dispatch:
				; CHECK-LABEL: catch.dispatch:
				; CHECK-NEXT: catchswitch
				%cs = catchswitch within none [label %catch] unwind to caller

				catch: ; preds = %catch.dispatch
				%cp = catchpad within %cs [i8* null, i32 64, i8* null]
				catchret from %cp to label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				declare float @llvm.log.f32(float %Val)

				define void @test12(i32 %loop_invariant) {
				; CHECK-LABEL: @test12
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				; CHECK: @use
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				call float @llvm.log.f32(float 0.0)
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call float @llvm.log.f32(float 0.0)
				call void @free(i8* %ptr)
				call float @llvm.log.f32(float 0.0)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				declare void @llvm.objc.autoreleasePoolPop(i8*)
				declare i8* @llvm.objc.autoreleasePoolPush()

				define void @test13(i32 %loop_invariant) {
				; CHECK-LABEL: @test13
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				%managed_ptr = call i8* @llvm.objc.autoreleasePoolPush()
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: @free
				ret void
				}

				define void @test14(i32 %loop_invariant) {
				; CHECK-LABEL: @test14
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				%managed_ptr = call i8* @llvm.objc.autoreleasePoolPush()
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: @free
				ret void
				}

				declare i8 @expr8()

				define void @test15(i32 %loop_invariant) {
				; CHECK-LABEL: @test15
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				%Val = call i8 @expr8()
				switch i8 %Val, label %b1 [ i8 2, label %b2
				i8 3, label %b3 ]

				b1:
				call void @usei64(i64 1)
				%a = call i1 @expr()
				call void @free(i8* %ptr), !dbg !6
				br i1 %a, label %loop, label %loopexit

				b2:
				call void @usei64(i64 2)
				%b = call i1 @expr()
				call void @free(i8* %ptr), !dbg !7
				br i1 %b, label %loop, label %loopexit

				b3:
				call void @usei64(i64 3)
				%c = call i1 @expr()
				call void @free(i8* %ptr), !dbg !8
				br i1 %c, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free{{.*}}, !dbg [[test15MergedLoc:![0-9]+]]
				ret void
				}

				define void @test16(i32 %loop_invariant, i8* %vptr) {
				; CHECK-LABEL: @test16
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				%Val = load volatile i8, i8* %vptr
				%ptr = call i8* @malloc(i32 %loop_invariant)
				reames (inline comment, Not Done): This test is odd. I don't see anything preventing us from reordering the volatile load and malloc. Was this an attempt to test for a side exit? If so, simply using an unknown call before the malloc would be preferred.
				nicholas (author, Done): This test shows the difference from using isGuaranteedToTransferExecution in analyzeLoop. Before that change, we would have hoisted the malloc above the volatile load. The same test written with a function call would not have shown any difference, simply because there is no way to create a function that might throw but cannot malloc or free (without adding new attributes to LLVM). It would also just be @test3 again. I think the new behaviour tested for here is correct. Suppose you have a system where running out of memory terminates the program, and the volatile store triggers some external action (it moves the robot arm). Suppose the programmer is using volatile to make sure the external system is in the safe state before doing the malloc that may terminate, and wants the malloc to complete before beginning any more operations. The compiler should not reorder these actions.
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: @free
				ret void
				}
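The ordering concern in the inline discussion on @test16 can be sketched at the source level. This is only an illustration in C, not part of the patch: `record`, `logged_malloc`, `read_port`, `per_iteration`, `hoisted` and `run` are hypothetical helpers that log the order of the volatile access ('V') and the allocation ('M'), so the effect of hoisting the malloc above the volatile access becomes visible.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical event log: 'V' = volatile access, 'M' = malloc call. */
static char event_log[64];
static size_t event_count;

static void record(char event) {
    if (event_count < sizeof event_log - 1) {
        event_log[event_count++] = event;
        event_log[event_count] = '\0';
    }
}

static void *logged_malloc(size_t size) {
    record('M');
    return malloc(size);
}

static unsigned char read_port(volatile unsigned char *port) {
    record('V');
    return *port;
}

/* Shape of @test16: volatile access, then malloc/use/free, every iteration. */
static void per_iteration(volatile unsigned char *port, size_t size, int trips) {
    int i = 0;
    do {
        (void)read_port(port);
        void *ptr = logged_malloc(size);
        if (ptr) memset(ptr, 0, size); /* stands in for @use */
        free(ptr);
    } while (++i < trips);
}

/* What hoisting would produce: malloc now precedes the first volatile access. */
static void hoisted(volatile unsigned char *port, size_t size, int trips) {
    void *ptr = logged_malloc(size);
    int i = 0;
    do {
        (void)read_port(port);
        if (ptr) memset(ptr, 0, size); /* stands in for @use */
    } while (++i < trips);
    free(ptr);
}

/* Runs one loop variant for two trips and returns the recorded event order. */
static const char *run(void (*loop)(volatile unsigned char *, size_t, int)) {
    static volatile unsigned char port;
    event_count = 0;
    event_log[0] = '\0';
    loop(&port, 16, 2);
    return event_log;
}
```

With two trips, `run(per_iteration)` logs the volatile access first, while `run(hoisted)` logs the allocation first. If the allocation can terminate the program, that change in order is observable, which is the situation the test is meant to rule out.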

				define void @test17(i32 %loop_invariant) {
				; CHECK-LABEL: @test17
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: bitcast
				; CHECK: getelementptr
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				%ptr = call i8* @malloc(i32 %loop_invariant)
				%ptr2 = bitcast i8* %ptr to i8*
				%ptr3 = getelementptr i8, i8* %ptr2, i32 0
				br label %nextloopblock

				nextloopblock:
				%ptr4 = phi i8* [%ptr3, %loop]
				call void @free(i8* %ptr4)
				br i1 undef, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK: @free
				ret void
				}

				define void @test18(i32 %loop_invariant) {
				; CHECK-LABEL: @test18
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				%ptr = call i8* @malloc(i32 %loop_invariant)
				%continue = call i1 @expr()
				%branch = call i1 @expr()
				br i1 %branch, label %nextbb1, label %nextbb2

				nextbb1:
				%ptr2 = bitcast i8* %ptr to i8*
				call void @free(i8* %ptr2)
				br label %nextloopblock

				nextbb2:
				call void @free(i8* %ptr)
				br label %nextloopblock

				nextloopblock:
				br i1 %continue, label %loop, label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: undef
				; CHECK: @free
				ret void
				}

				declare void @might_malloc_or_throw()

				define void @test19(i32 %loop_invariant) personality i32 (...)* @__CxxFrameHandler3 {
				; CHECK-LABEL: @test19
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop
				br label %loop

				loop:
				; CHECK-LABEL: loop:
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				; CHECK: @might_malloc_or_throw
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				invoke void @might_malloc_or_throw() to label %loop unwind label %loopexit

				loopexit:
				; CHECK-LABEL: loopexit:
				; CHECK-NOT: @free
				%lpad = landingpad { i8*, i32 } catch i8* null
				ret void
				}

				define void @test20(i32 %loop_invariant) {
				; CHECK-LABEL: @test20
				header:
				; CHECK-LABEL: header:
				; CHECK: @malloc
				; CHECK: br label %loop.header
				br label %loop.header

				loop.header:
				; CHECK-LABEL: loop.header:
				; CHECK-NOT: @malloc
				; CHECK: @llvm.lifetime.start
				; CHECK-NOT: @malloc
				; CHECK: @use
				; CHECK-NOT: @free
				; CHECK: @llvm.lifetime.end
				; CHECK-NOT: @free
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				br i1 undef, label %loop.header, label %loop2

				loop2:
				br i1 undef, label %loop.header, label %loop.exit

				loop.exit:
				; CHECK-LABEL: loop.exit:
				; CHECK: @free
				ret void
				}

				define void @test21(i32 %loop_invariant) {
				; CHECK-LABEL: @test21
				header:
				; CHECK-LABEL: header:
				; CHECK-NOT: @malloc
				; CHECK: br label %loop.header
				br label %loop.header

				loop.header:
				; CHECK-LABEL: loop.header:
				; CHECK: @malloc
				; CHECK: @use
				; CHECK: @free
				%ptr = call i8* @malloc(i32 %loop_invariant)
				call void @use(i8* %ptr)
				call void @free(i8* %ptr)
				br i1 undef, label %loop.header, label %loop2

				loop2:
				; CHECK-LABEL: loop2:
				; CHECK: call i1 @expr
				; This call after @free may increase total heap memory and therefore must
				; stay after the free call.
				%A = call i1 @expr()
				br i1 %A, label %loop.header, label %loop.exit

				loop.exit:
				; CHECK-LABEL: loop.exit:
				; CHECK-NOT: @free
				ret void
				}
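The comment in @test21 about total heap memory can be illustrated with a small accounting model. This is a simplified sketch in C under stated assumptions, not the analysis from the patch: no real allocation happens, `track_alloc`, `track_free` and `expr_like` are hypothetical helpers, and the opaque @expr call is assumed to need some scratch heap memory of its own. The model also ignores that @expr only runs on the path through loop2.

```c
#include <stddef.h>

/* Hypothetical heap accounting; no real allocation is performed. */
static size_t live_bytes;
static size_t peak_bytes;

static void track_alloc(size_t size) {
    live_bytes += size;
    if (live_bytes > peak_bytes) peak_bytes = live_bytes;
}

static void track_free(size_t size) { live_bytes -= size; }

/* Stand-in for @expr: an opaque call that temporarily needs heap memory. */
static void expr_like(size_t scratch) {
    track_alloc(scratch);
    track_free(scratch);
}

/* Shape of @test21: the buffer is freed before the opaque call runs. */
static size_t peak_original(size_t buf, size_t scratch, int trips) {
    live_bytes = peak_bytes = 0;
    int i = 0;
    do {
        track_alloc(buf);
        track_free(buf);
        expr_like(scratch);
    } while (++i < trips);
    return peak_bytes;
}

/* If the free were sunk to the loop exit, the buffer would stay live
   across the opaque call and the two allocations would overlap. */
static size_t peak_if_sunk(size_t buf, size_t scratch, int trips) {
    live_bytes = peak_bytes = 0;
    track_alloc(buf);
    int i = 0;
    do {
        expr_like(scratch);
    } while (++i < trips);
    track_free(buf);
    return peak_bytes;
}
```

For a 100-byte buffer and 80 bytes of scratch, the original shape peaks at the larger of the two sizes, while the sunk shape peaks at their sum. That higher peak is why the test expects the malloc and free to stay inside the loop.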

				!llvm.module.flags = !{!0, !1, !2}
				!llvm.dbg.cu = !{!3}

				; CHECK: [[test15MergedLoc]] = !DILocation(line: 0
				!0 = !{i32 2, !"Dwarf Version", i32 4}
				!1 = !{i32 2, !"Debug Info Version", i32 3}
				!2 = !{i32 1, !"PIC Level", i32 2}
				!3 = distinct !DICompileUnit(language: DW_LANG_C99, file: !4)
				!4 = !DIFile(filename: "allocs.ll", directory: "test/Transforms/LICM")
				!5 = distinct !DISubprogram(name: "test/Transforms/LICM/allocs.ll", unit: !3)
				!6 = !DILocation(line: 1, column: 10, scope: !5)
				!7 = !DILocation(line: 2, column: 20, scope: !5)
				!8 = !DILocation(line: 3, column: 30, scope: !5)

Revision Contents

Diff 196980

llvm/include/llvm/Analysis/LoopAllocationInfo.h

llvm/include/llvm/Transforms/Utils/LoopUtils.h

llvm/lib/Analysis/CMakeLists.txt

llvm/lib/Analysis/LoopAllocationInfo.cpp

llvm/lib/Transforms/Scalar/LICM.cpp

llvm/test/Transforms/LICM/allocs.ll
