This is an archive of the discontinued LLVM Phabricator instance.

Differential D18670

LoopUnroll: some small fixes/tweaks to make it more useful for partial unrolling
ClosedPublic

Authored by escha on Mar 31 2016, 1:02 PM.

Details

Reviewers

mzolotukhin
resistor

Summary

I wanted to be able to do some partial unrolling on my target, as well as limit the unroll counts for full unrolling. This meant the behavior I wanted (which seemed quite reasonable to ask for!) looks like this:

If doing full unrolling, use threshold X, and don't go over A iterations unrolled.
If we don't meet the requirements for full unrolling, use threshold Y and don't go over A iterations unrolled. Also, make sure to make an unroll count that divides evenly into the loop count.
Don't do runtime unrolling.

Unfortunately, I ran into three problems, which this patch fixes (comments welcome if there are any better ways to fix them).

1. There's no way to limit the number of iterations for full unrolling -- only for partial/runtime unrolling. So I added that.
2. A bug in partial unrolling causes it to not reduce the count to be modulo-tripcount if the PartialThreshold is already met. So I fixed that. I'm not sure if this bug can trigger without change 1), though.
3. A bug in partial unrolling causes it to ignore MaxCount, even though MaxCount says it applies to everything but full unrolling. So I fixed that.

(Use-case: our target, a GPU, can [in TTI] roughly estimate the number of high-latency operations, like loads and texture reads, and make reasonable judgements as to how much unrolling is reasonable given that number. But to do that, we need to be able to put a cap on full unrolling separate from the overall cost threshold.)
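As a rough illustration of the behavior requested in the summary, the sketch below sets the relevant knobs on a local stand-in struct. It is not the real `TargetTransformInfo::UnrollingPreferences`, and `applyTargetPrefs` is a hypothetical helper. `X`, `Y` and `A` are the placeholders from the summary, and the default values are the ones shown in the diff.

```cpp
#include <cassert>
#include <climits>

// Local stand-in for the fields of UnrollingPreferences that the summary
// mentions. This is an illustration only, not the LLVM struct itself.
struct UnrollPrefsSketch {
  unsigned Threshold = 150;               // cost threshold for full unrolling
  unsigned PartialThreshold = 150;        // cost threshold for partial unrolling
  unsigned MaxCount = UINT_MAX;           // cap for partial/runtime unrolling
  unsigned FullUnrollMaxCount = UINT_MAX; // cap added by this patch
  bool Partial = false;
  bool Runtime = false;
};

// Hypothetical target hook: full unrolling uses threshold X, partial unrolling
// uses threshold Y, both are capped at A iterations, and runtime unrolling
// stays off.
inline void applyTargetPrefs(UnrollPrefsSketch &UP, unsigned X, unsigned Y,
                             unsigned A) {
  assert(A != 0 && "this sketch expects a nonzero cap");
  UP.Threshold = X;
  UP.PartialThreshold = Y;
  UP.FullUnrollMaxCount = A;
  UP.MaxCount = A;
  UP.Partial = true;
  UP.Runtime = false;
}
```

In an actual target these assignments would presumably live in the target's `getUnrollingPreferences()` override, with `A` derived from whatever latency estimate the target computes.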

Diff Detail

Repository: rL LLVM

Event Timeline

escha updated this revision to Diff 52268. Mar 31 2016, 1:02 PM

escha retitled this revision from to LoopUnroll: some small fixes/tweaks to make it more useful for partial unrolling.

escha updated this object.

escha added reviewers: mzolotukhin, resistor.

escha set the repository for this revision to rL LLVM.

escha added a subscriber: llvm-commits.

Herald added a subscriber: mzolotukhin. Mar 31 2016, 1:02 PM

The updates to the partial unrolling logic look reasonable to me... I just don't understand the part about limiting full unroll, and it seems it's not limited in this patch.

lib/Transforms/Scalar/LoopUnrollPass.cpp
Line 110: Where's FullUnrollMaxCount used? If I understand your description correctly, full unroll and partial unroll are limited by the same max unroll count ('A iterations') but different thresholds ('threshold X' and 'threshold Y')? I don't quite understand the term 'put a cap on full unrolling'... If we can't unroll by compile-time-known tripcount, it becomes partial unrolling, right? getUnrollingPreferences() in TTI can update UP.Count, UP.MaxCount and UP.PartialThreshold to bound partial unrolling.

FullUnrollMaxCount isn't used anywhere because it's a TTI option used by our out of tree target.

Count is a force-unroll option; it doesn't have anything to do with this, as far as I know.

MaxCount is what I would want, but it *specifically does not apply to full unrolling*, as specified in the documentation. I want to limit the number of iterations for a full unroll.

PartialThreshold, again, has nothing to do with full unrolls.

I don't quite understand the term 'put a cap on full unrolling'... If we can't unroll by compile-time-known tripcount, it becomes partial unrolling, right?

Yes, exactly. If the compile-time known tripcount is greater than FullUnrollMaxCount, it will fall back to partial unrolling instead, because it cannot do a full unroll.
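The fallback described here can be sketched as a small standalone model. It assumes the count-selection code quoted in this review plus the `std::min` clamp from the patch's diff. The function names are hypothetical, `DefaultRuntimeCount` stands in for `DefaultUnrollRuntimeCount`, and the `Count < TripCount` condition for the partial case is an assumption, since that line is collapsed in the diff view.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

enum class UnrollKind { Full, Partial, Runtime };

// Models the initial count selection: an explicit UP.Count wins, otherwise the
// trip count (or a default for unknown trip counts) is used, and the result is
// then clamped by FullUnrollMaxCount as the patch does.
unsigned selectInitialCount(unsigned UPCount, unsigned TripCount,
                            unsigned FullUnrollMaxCount,
                            unsigned DefaultRuntimeCount) {
  unsigned Count = UPCount;
  if (Count == 0)
    Count = TripCount == 0 ? DefaultRuntimeCount : TripCount;
  if (TripCount && Count > TripCount)
    Count = TripCount;
  return std::min(Count, FullUnrollMaxCount);
}

// A count equal to a known trip count is a full unroll; a smaller count is a
// partial unroll (assumed condition); an unknown trip count means runtime.
UnrollKind classifyUnroll(unsigned TripCount, unsigned Count) {
  if (TripCount && Count == TripCount)
    return UnrollKind::Full;
  if (TripCount && Count < TripCount)
    return UnrollKind::Partial;
  return UnrollKind::Runtime;
}
```

With a trip count of 9 and a cap of 6, the clamped count no longer equals the trip count, so the loop is classified as a partial unroll candidate rather than a full one.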

In D18670#389000, @escha wrote:

FullUnrollMaxCount isn't used anywhere because it's a TTI option used by our out of tree target.

If it's not used by target-independent unroll pass, perhaps you can put it in your TTI?

Count is a force-unroll option; it doesn't have anything to do with this, as far as I know.
...make reasonable judgements as to how much unrolling is reasonable given that number

That's what UP.Count is intended to do: let the target suggest an unroll count it deems reasonable.

MaxCount is what I would want, but it *specifically does not apply to full unrolling*, as specified in the documentation. I want to limit the number of iterations for a full unroll.

PartialThreshold, again, has nothing to do with full unrolls.

unsigned Count = UP.Count;
bool CountSetExplicitly = Count != 0;                                                                   
// Use a heuristic count if we didn't set anything explicitly.                                          
if (!CountSetExplicitly)
  Count = TripCount == 0 ? DefaultUnrollRuntimeCount : TripCount;
if (TripCount && Count > TripCount)
  Count = TripCount;

If Count, derived from UP.Count, (again, suggested by the target), is smaller than TripCount, it becomes partial unrolling and PartialThreshold/MaxCount applies automatically.

I'd add TripCount to UP so TTI::getUnrollingPreferences() can knowingly setup partial unrolling.

I don't quite understand the term 'put a cap on full unrolling'... If we can't unroll by compile-time-known tripcount, it becomes partial unrolling, right?

Yes, exactly. If the compile-time known tripcount is greater than FullUnrollMaxCount, it will fall back to partial unrolling instead, because it cannot do a full unroll.

It seems partial unrolling is preferred over full unrolling for your target. I would set it up in getUnrollingPreferences()... unless it's too early for the TTI to estimate how much unrolling is suitable during LoopUnrollPass.

Ugh, my mistake, I missed one hunk when uploading the diff; the hunk where the variable was used! My mistake.

That's what UP.Count is intended to do: let the target suggest an unroll count it deems reasonable.

I thought UP.Count overrides Threshold, causing it to unroll loops even if their LoopSize is greater than the Threshold? If that's not true, then I guess that part of this patch is unnecessary, but it didn't seem to work that way when I tried.

It seems partial unrolling is preferred over full unrolling for your target. I would set it up in getUnrollingPreferences()... unless it's too early for the TTI to estimate how much unrolling is suitable during LoopUnrollPass.

No, full unrolling is strongly preferred. However, in some cases, full unrolling is too costly, either due to register usage (which our TTI calculates) or due to Threshold (which LoopUnrollPass calculates), and we'd rather partial unroll than do nothing at all.

Going by the code, Count is emphatically not what I want:

if (!AllowPartial && !CountSetExplicitly) {
  DEBUG(dbgs() << "  will not try to unroll partially because "
               << "-unroll-allow-partial not given\n");
  return false;
}
if (!AllowRuntime && !CountSetExplicitly) {
  DEBUG(dbgs() << "  will not try to unroll loop with runtime trip count "
               << "-unroll-runtime not given\n");
  return false;
}

It forces an unroll of Count, even if you haven't enabled Runtime unrolling and don't want it.

Hi,

In general the idea looks good to me, one remark is inline.

Michael

PS: Please upload the patch with full-context.

lib/Transforms/Scalar/LoopUnrollPass.cpp
Lines 639–644: I think we shouldn't do this if `CountSetExplicitly` is true.

In D18670#389043, @escha wrote:

That's what UP.Count is intended to do: let the target suggest an unroll count it deems reasonable.

I thought UP.Count overrides Threshold, causing it to unroll loops even if their LoopSize is greater than the Threshold? If that's not true, then I guess that part of this patch is unnecessary, but it didn't seem to work that way when I tried.

It seems partial unrolling is preferred over full unrolling for your target. I would set it up in getUnrollingPreferences()... unless it's too early for the TTI to estimate how much unrolling is suitable during LoopUnrollPass.

No, full unrolling is strongly preferred. However, in some cases, full unrolling is too costly, either due to register usage (which our TTI calculates) or due to Threshold (which LoopUnrollPass calculates), and we'd rather partial unroll than do nothing at all.

Count is still bound by UP.Threshold in canUnrollCompletely(), which decides full or partial unrolling. Setting UP.Threshold and UP.PartialThreshold but not UP.Count in TTI can limit full unroll as well.

In D18670#389046, @escha wrote:
Going by the code, Count is emphatically not what I want:
if (!AllowPartial && !CountSetExplicitly) {
  DEBUG(dbgs() << "  will not try to unroll partially because "
               << "-unroll-allow-partial not given\n");
  return false;
}
if (!AllowRuntime && !CountSetExplicitly) {
  DEBUG(dbgs() << "  will not try to unroll loop with runtime trip count "
               << "-unroll-runtime not given\n");
  return false;
}
It forces an unroll of Count, even if you haven't enabled Runtime unrolling and don't want it.

You are right about the runtime unrolling part :-)

Can you also add a test case?

lib/Transforms/Scalar/LoopUnrollPass.cpp
Line 570: Now I understand what you're trying to do...

evstupac added a subscriber: evstupac. Mar 31 2016, 10:50 PM

What sort of test case should I do for this (which behavior are you looking to test)? Should I make a commandline option so that this can easily be tested via 'opt'?

Nvm... I just realized FullUnrollMaxCount is set by your out of tree TTI and only that target needs to cap full unroll for now.

The test might be that given a loop with a TripCount=9, we unroll it with a factor of 3 (instead of 2 or 4)? Or something along these lines. Would it be possible, would it make sense?

I don't *think* it's possible for that problem to happen under normal circumstances without FullUnrollMaxCount being set. Normally, the only way we can fail to do full unrolling is if the full unroll cost is greater than Threshold. In this case, we'll go try partial unrolling, and we'll also be above the partial threshold (since a partial threshold higher than Threshold doesn't make sense), and it'll take the path that makes the count modulo-correct.

But if we set FullUnrollMaxCount, it's possible to reach this code with UnrolledSize <= PartialThreshold and a Count that isn't modulo TripCount, which leads to the bug.

In D18670#389705, @escha wrote:

In this case, we'll go try partial unrolling, and we'll also be above the partial threshold (since a partial threshold higher than Threshold doesn't make sense), and it'll take the path that makes the count modulo-correct.

Say we have a loop with TripCount 9 and size 6; then the fully unrolled size is ~9 * (6 - 2) + 2 = 38 (it could be less, but not dramatically).
If we unroll the loop by 3, then the unrolled size is 3 * (6 - 2) = 12. With Threshold = PartialThreshold = 20 we are able to do partial unrolling and will fail with full.

Yes, I understand that, but what I mean is, the bug won't trigger because we *will* get to the path that corrects Count because UnrolledSize > UP.PartialThreshold with Count = 9, and then it'll lower Count to 3. So we won't end up with a Count that isn't evenly dividing into 9.
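The arithmetic in this exchange can be checked with a short sketch. It assumes the size estimate `Count * (LoopSize - 2) + 2` used in the example above and the count-reduction expression that appears in the patch. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Size estimate used in the discussion: every unrolled copy contributes
// LoopSize - 2 instructions, plus 2 that are kept once.
unsigned estimateUnrolledSize(unsigned LoopSize, unsigned Count) {
  assert(LoopSize > 2 && "the estimate divides by LoopSize - 2 elsewhere");
  return Count * (LoopSize - 2) + 2;
}

// Count reduction for the over-threshold path: pick the largest count that
// fits under PartialThreshold, then step down until it divides TripCount.
unsigned reduceCountToThreshold(unsigned LoopSize, unsigned TripCount,
                                unsigned PartialThreshold) {
  assert(LoopSize > 2);
  unsigned Count = (std::max(PartialThreshold, 3u) - 2) / (LoopSize - 2);
  while (Count != 0 && TripCount % Count != 0)
    --Count;
  return Count;
}
```

For TripCount 9, LoopSize 6 and a threshold of 20, the full unroll estimate is 38, which exceeds the threshold. The reduction first yields (20 - 2) / 4 = 4 and then steps down to 3, the largest value that divides 9 evenly.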

Thanks, now I get it.
For your target there is a case when full unroll fails and then partial fails as well. And when the patch is applied, partial unroll is successful. Right? So you can create a target test.
It is really more convenient to look at code with maximum context:

svn diff --diff-cmd=diff -x -U999999

Adding full context.

For your target there is a case when full unroll fails and then partial fails as well. And when the patch is applied, partial unroll is successful. Right? So you can create a target test.

Partial unroll didn't fail -- rather, it did the wrong thing. For example, suppose the loop count was 9, and the FullUnrollMaxCount was 6. The process would go like this:

1. tripcount = 9, count = 9
2. count = 6, because of FullUnrollMaxCount.
3. count != tripcount, so full unroll can't be done; do partial instead.
4. we're within the partial threshold, so skip the logic for that. do a partial unroll with count 6. (this is where we totally skip the logic to fix up the count)

Now we get a 6-wide partial unroll with a fixup loop at the end, which is totally not what we wanted.
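The scenario above can be reproduced with a standalone model of the partial-unroll count logic before and after the change in the diff. This is a sketch only: `PartialParams` and `partialUnrollCount` are hypothetical names, the model assumes the partial path has already been chosen, and the size estimate `Count * (LoopSize - 2) + 2` is an assumption taken from the discussion rather than from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Hypothetical bundle of the inputs that matter for the partial-unroll path.
struct PartialParams {
  unsigned TripCount;
  unsigned LoopSize;
  unsigned PartialThreshold;
  unsigned MaxCount;
  unsigned FullUnrollMaxCount;
};

// Returns the partial unroll count. With Patched == false the count is only
// fixed up when the unrolled size exceeds PartialThreshold (old behavior).
// With Patched == true, MaxCount and the divisibility fix-up always apply.
unsigned partialUnrollCount(const PartialParams &P, bool Patched) {
  assert(P.TripCount != 0 && P.LoopSize > 2);
  unsigned Count = std::min(P.TripCount, P.FullUnrollMaxCount);
  unsigned UnrolledSize = Count * (P.LoopSize - 2) + 2;
  bool OverThreshold = UnrolledSize > P.PartialThreshold;

  if (!Patched && !OverThreshold)
    return Count; // old code skipped the fix-up entirely in this case

  if (OverThreshold)
    Count = (std::max(P.PartialThreshold, 3u) - 2) / (P.LoopSize - 2);
  if (Patched)
    Count = std::min(Count, P.MaxCount);
  while (Count != 0 && P.TripCount % Count != 0)
    --Count;
  return Count;
}
```

For a trip count of 9, a cap of 6 and a generous partial threshold, the old logic keeps the count at 6 (which needs a remainder loop), while the patched logic steps down through 5 and 4 to 3, which divides 9 evenly.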

Also, it's an out of tree target, so we can't possibly add an in-tree test. But I figure any target that wanted to limit the unroll count would have the same problem.

In D18670#389914, @escha wrote:

FullUnrollMaxCount was 6.

But now in LLVM trunk there is no way to make FullUnrollMaxCount 6. Right?
To make the introduction of FullUnrollMaxCount reasonable, there should be a path in LLVM trunk where FullUnrollMaxCount can become something other than UINT_MAX. That will resolve the issue with the test as well.

In D18670#389959, @evstupac wrote:

But now in LLVM trunk there is no way to make FullUnrollMaxCount 6. Right?
To make the introduction of FullUnrollMaxCount reasonable, there should be a path in LLVM trunk where FullUnrollMaxCount can become something other than UINT_MAX. That will resolve the issue with the test as well.

Do you have a concrete solution to suggest here? Because locking the infrastructure to only the feature set needed by upstreamed targets seems shortsighted.

Do you have a concrete solution to suggest here?

Maybe just introduce a command-line option for UP.FullUnrollMaxCount then?

Ironically it looks like MaxCount itself doesn't even have a commandline option either... should I introduce one for both? :/ it feels wrong to bloat the commandline options like this, I guess

In D18670#389991, @escha wrote:

It feels wrong to bloat the commandline options like this, I guess

I wouldn't worry too much about that. This only affects the command line of opt, not of clang, so it's not impacting an end-user tool.

In D18670#389991, @escha wrote:

Ironically it looks like MaxCount itself doesn't even have a commandline option either... should I introduce one for both? :/ it feels wrong to bloat the commandline options like this, I guess

static cl::opt<unsigned>                                                                                  
UnrollCount("unroll-count", cl::Hidden,                                                                   
  cl::desc("Use this unroll count for all loops including those with "                                    
           "unroll_count pragma values, for testing purposes"));

Is it reasonable to have one cmd option, similar to UnrollCount above, to curb both MaxCount and FullUnrollMaxCount?

The only target modifying MaxCount is AMDGPU, but it modifies it to UINT_MAX.
So all checks for exceeding MaxCount are not supposed to hit.

Personally I like bounding by Threshold more than by Count. So I'd rather separate the full and partial unroll thresholds in the patch. However, there could be something target-specific that I'm missing.

So is it possible to reuse MaxCount to bound full unrolling? That way we can introduce an option only for MaxCount?

Added commandline options and a test.

I am slightly afraid to silently modify the behavior of MaxCount to affect full unrolling (when it didn't before) because out of tree users may be using it.

Poking this thread.

LGTM!

Michael

test/Transforms/LoopUnroll/partial-unroll-maxcount.ll
Lines 19–22: I think in most of the tests we have CHECK-lines before the test. Could you please move it there and add CHECK-LABEL in the beginning (in case we add more tests in future)?

This revision is now accepted and ready to land. Apr 4 2016, 12:44 PM

escha closed this revision. Apr 6 2016, 10:03 AM

Revision Contents

Path                                                    Size
include/llvm/Analysis/TargetTransformInfo.h             4 lines
lib/Transforms/Scalar/LoopUnrollPass.cpp                23 lines
test/Transforms/LoopUnroll/partial-unroll-maxcount.ll   22 lines

Diff 52438

include/llvm/Analysis/TargetTransformInfo.h

@@ ... @@ struct UnrollingPreferences {
   /// transformation will select an unrolling factor based on the current cost
   /// threshold and other factors.
   unsigned Count;
   // Set the maximum unrolling factor. The unrolling factor may be selected
   // using the appropriate cost threshold, but may not exceed this number
   // (set to UINT_MAX to disable). This does not apply in cases where the
   // loop is being fully unrolled.
   unsigned MaxCount;
+  /// Set the maximum unrolling factor for full unrolling. Like MaxCount, but
+  /// applies even if full unrolling is selected. This allows a target to fall
+  /// back to Partial unrolling if full unrolling is above FullUnrollMaxCount.
+  unsigned FullUnrollMaxCount;
   /// Allow partial unrolling (unrolling of loops to expand the size of the
   /// loop body, not only to eliminate small constant-trip-count loops).
   bool Partial;
   /// Allow runtime unrolling (unrolling of loops to expand the size of the
   /// loop body even when the number of loop iterations is not known at
   /// compile time).
   bool Runtime;
   /// Allow emitting expensive instructions (such as divisions) when computing

lib/Transforms/Scalar/LoopUnrollPass.cpp

@@ ... @@ static cl::opt<unsigned> UnrollMaxIterationsCountToAnalyze(
     cl::desc("Don't allow loop unrolling to simulate more than this number of"
              "iterations when checking full unroll profitability"));

 static cl::opt<unsigned>
 UnrollCount("unroll-count", cl::Hidden,
   cl::desc("Use this unroll count for all loops including those with "
            "unroll_count pragma values, for testing purposes"));

+static cl::opt<unsigned>
+UnrollMaxCount("unroll-max-count", cl::Hidden,
+  cl::desc("Set the max unroll count for partial and runtime unrolling, for"
+           "testing purposes"));
+
+static cl::opt<unsigned>
+UnrollFullMaxCount("unroll-full-max-count", cl::Hidden,
+  cl::desc("Set the max unroll count for full unrolling, for testing purposes"));
+
 static cl::opt<bool>
 UnrollAllowPartial("unroll-allow-partial", cl::Hidden,
   cl::desc("Allows loops to be partially unrolled until "
            "-unroll-threshold loop size is reached."));

 static cl::opt<bool>
 UnrollRuntime("unroll-runtime", cl::ZeroOrMore, cl::Hidden,
   cl::desc("Unroll loops with run-time trip counts"));
@@ ... @@
 /// flags, TTI overrides, pragmas, and user specified parameters.
 static TargetTransformInfo::UnrollingPreferences gatherUnrollingPreferences(
     Loop *L, const TargetTransformInfo &TTI, Optional<unsigned> UserThreshold,
     Optional<unsigned> UserCount, Optional<bool> UserAllowPartial,
     Optional<bool> UserRuntime, unsigned PragmaCount, bool PragmaFullUnroll,
     bool PragmaEnableUnroll, unsigned TripCount) {
   TargetTransformInfo::UnrollingPreferences UP;

   // Set up the defaults
   UP.Threshold = 150;
   UP.PercentDynamicCostSavedThreshold = 20;
   UP.DynamicCostSavingsDiscount = 2000;
   UP.OptSizeThreshold = 50;
   UP.PartialThreshold = UP.Threshold;
   UP.PartialOptSizeThreshold = UP.OptSizeThreshold;
   UP.Count = 0;
   UP.MaxCount = UINT_MAX;
+  UP.FullUnrollMaxCount = UINT_MAX;
   UP.Partial = false;
   UP.Runtime = false;
   UP.AllowExpensiveTripCount = false;

   // Override with any target specific settings
   TTI.getUnrollingPreferences(L, UP);

   // Apply size attributes
@@ ... @@ static TargetTransformInfo::UnrollingPreferences gatherUnrollingPreferences(
   }
   if (UnrollPercentDynamicCostSavedThreshold.getNumOccurrences() > 0)
     UP.PercentDynamicCostSavedThreshold =
         UnrollPercentDynamicCostSavedThreshold;
   if (UnrollDynamicCostSavingsDiscount.getNumOccurrences() > 0)
     UP.DynamicCostSavingsDiscount = UnrollDynamicCostSavingsDiscount;
   if (UnrollCount.getNumOccurrences() > 0)
     UP.Count = UnrollCount;
+  if (UnrollMaxCount.getNumOccurrences() > 0)
+    UP.MaxCount = UnrollMaxCount;
+  if (UnrollFullMaxCount.getNumOccurrences() > 0)
+    UP.FullUnrollMaxCount = UnrollFullMaxCount;
   if (UnrollAllowPartial.getNumOccurrences() > 0)
     UP.Partial = UnrollAllowPartial;
   if (UnrollRuntime.getNumOccurrences() > 0)
     UP.Runtime = UnrollRuntime;

   // Apply user values provided by argument
   if (UserThreshold.hasValue()) {
     UP.Threshold = *UserThreshold;
@@ ... @@ static bool tryToUnrollLoop(Loop *L, DominatorTree &DT, LoopInfo *LI,
   // block for the trip count estimation.
   BasicBlock *ExitingBlock = L->getLoopLatch();
   if (!ExitingBlock || !L->isLoopExiting(ExitingBlock))
     ExitingBlock = L->getExitingBlock();
   if (ExitingBlock) {
     TripCount = SE->getSmallConstantTripCount(L, ExitingBlock);
     TripMultiple = SE->getSmallConstantTripMultiple(L, ExitingBlock);
   }

   TargetTransformInfo::UnrollingPreferences UP = gatherUnrollingPreferences(
       L, TTI, ProvidedThreshold, ProvidedCount, ProvidedAllowPartial,
       ProvidedRuntime, PragmaCount, PragmaFullUnroll, PragmaEnableUnroll,
       TripCount);

   unsigned Count = UP.Count;
   bool CountSetExplicitly = Count != 0;
   // Use a heuristic count if we didn't set anything explicitly.
   if (!CountSetExplicitly)
     Count = TripCount == 0 ? DefaultUnrollRuntimeCount : TripCount;
   if (TripCount && Count > TripCount)
     Count = TripCount;
+  Count = std::min(Count, UP.FullUnrollMaxCount);

   unsigned NumInlineCandidates;
   bool NotDuplicatable;
   bool Convergent;
   unsigned LoopSize = ApproximateLoopSize(
       L, NumInlineCandidates, NotDuplicatable, Convergent, TTI, &AC);
   DEBUG(dbgs() << " Loop Size = " << LoopSize << "\n");

@@ ... @@ if (TripCount && Count == TripCount) {
     Unrolling = Partial;
   } else {
     Unrolling = Runtime;
   }

   // Reduce count based on the type of unrolling and the threshold values.
   unsigned OriginalCount = Count;
   bool AllowRuntime = PragmaEnableUnroll || (PragmaCount > 0) || UP.Runtime;
   // Don't unroll a runtime trip count loop with unroll full pragma.
   if (HasRuntimeUnrollDisablePragma(L) || PragmaFullUnroll) {
     AllowRuntime = false;
   }
   bool DecreasedCountDueToConvergence = false;
   if (Unrolling == Partial) {
     bool AllowPartial = PragmaEnableUnroll || UP.Partial;
     if (!AllowPartial && !CountSetExplicitly) {
       DEBUG(dbgs() << "  will not try to unroll partially because "
                    << "-unroll-allow-partial not given\n");
       return false;
     }
-    if (UP.PartialThreshold != NoThreshold &&
-        UnrolledSize > UP.PartialThreshold) {
+    if (UP.PartialThreshold != NoThreshold) {
       // Reduce unroll count to be modulo of TripCount for partial unrolling.
-      Count = (std::max(UP.PartialThreshold, 3u) - 2) / (LoopSize - 2);
+      if (UnrolledSize > UP.PartialThreshold)
+        Count = (std::max(UP.PartialThreshold, 3u) - 2) / (LoopSize - 2);
+      if (Count > UP.MaxCount)
+        Count = UP.MaxCount;
       while (Count != 0 && TripCount % Count != 0)
         Count--;
     }
   } else if (Unrolling == Runtime) {
     if (!AllowRuntime && !CountSetExplicitly) {
       DEBUG(dbgs() << "  will not try to unroll loop with runtime trip count "
                    << "-unroll-runtime not given\n");
       return false;

test/Transforms/LoopUnroll/partial-unroll-maxcount.ll

; RUN: opt < %s -S -loop-unroll -unroll-allow-partial -unroll-full-max-count=6 | FileCheck %s

; Check that we properly round down to an unroll-by-3 instead of emitting a partial unroll by
; 6.
define void @unroll_opt_for_size() nounwind optsize {
entry:
  br label %loop

loop:
  %iv = phi i32 [ 0, %entry ], [ %inc, %loop ]
  %inc = add i32 %iv, 1
  %exitcnd = icmp uge i32 %inc, 9
  br i1 %exitcnd, label %exit, label %loop

exit:
  ret void
}

; CHECK: add
; CHECK-NEXT: add
; CHECK-NEXT: add
; CHECK-NEXT: icmp