This is an archive of the discontinued LLVM Phabricator instance.

Differential D20482

[LoopUnroll] Enable advanced unrolling analysis by default.
ClosedPublic

Authored by mzolotukhin on May 20 2016, 12:33 PM.


Details

Reviewers

chandlerc
hfinkel

Commits

rGbe080fc51d77: [LoopUnroll] Enable advanced unrolling analysis by default.
rL270478: [LoopUnroll] Enable advanced unrolling analysis by default.

Summary

This patch turns on LoopUnrollAnalyzer by default. To mitigate compile
time regressions, I chose very conservative thresholds for now. Later we
can make them more aggressive, but that might require being smarter about
which loops we optimize. E.g. currently the biggest issue is that
with more aggressive thresholds we unroll many cold loops, which
increases compile time for no performance benefit (performance of those
loops is improved, but it doesn't matter since they are cold).

Test results for compile time (using 4 samples to reduce noise):

MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes                  5.19%
SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect   4.19%
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow           3.39%
MultiSource/Applications/JM/lencod/lencod                        1.47%
MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1           -6.06%

I didn't see any performance changes in the testsuite, but it improves
some internal tests.

Diff Detail

Repository: rL LLVM

Event Timeline

mzolotukhin updated this revision to Diff 57969. May 20 2016, 12:33 PM

mzolotukhin retitled this revision from to [LoopUnroll] Enable advanced unrolling analysis by default..

mzolotukhin updated this object.

mzolotukhin added reviewers: hfinkel, chandlerc.

mzolotukhin added a subscriber: llvm-commits.

Herald added a subscriber: mzolotukhin. May 20 2016, 12:33 PM

LGTM.

Maybe work on a patch to use profile info to adjust the thresholds? Once we have that, I think we can help benchmarking a range of thresholds and produce some data about where the sweet spot lies.

This revision is now accepted and ready to land. May 20 2016, 1:09 PM

Hi Chandler,

Thanks for the LGTM. I'll commit the patch on Monday, as I won't be able to watch the bots this weekend.

As for using profiling info: it does look like the best way to handle this. I also think it would make sense to use it more extensively everywhere, not only in loop unrolling. That way we can save compile time by skipping expensive optimizations on code that doesn't matter, and spend more time optimizing the relevant pieces of the code. For now these are just general ideas, but I hope to get to something more concrete soon.

Thanks,
Michael

Closed by commit rL270478: [LoopUnroll] Enable advanced unrolling analysis by default. (authored by mzolotukhin). May 23 2016, 12:16 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path (Size)

llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp (6 lines)
llvm/trunk/test/Transforms/LoopUnroll/partial-unroll-const-bounds.ll (2 lines)

Diff 58131

llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp

 #define DEBUG_TYPE "loop-unroll"

 static cl::opt<unsigned>
     UnrollThreshold("unroll-threshold", cl::Hidden,
                     cl::desc("The baseline cost threshold for loop unrolling"));

 static cl::opt<unsigned> UnrollPercentDynamicCostSavedThreshold(
-    "unroll-percent-dynamic-cost-saved-threshold", cl::Hidden,
+    "unroll-percent-dynamic-cost-saved-threshold", cl::init(50), cl::Hidden,
     cl::desc("The percentage of estimated dynamic cost which must be saved by "
              "unrolling to allow unrolling up to the max threshold."));

 static cl::opt<unsigned> UnrollDynamicCostSavingsDiscount(
-    "unroll-dynamic-cost-savings-discount", cl::Hidden,
+    "unroll-dynamic-cost-savings-discount", cl::init(100), cl::Hidden,
     cl::desc("This is the amount discounted from the total unroll cost when "
              "the unrolled form has a high dynamic cost savings (triggered by "
              "the '-unroll-perecent-dynamic-cost-saved-threshold' flag)."));

 static cl::opt<unsigned> UnrollMaxIterationsCountToAnalyze(
-    "unroll-max-iteration-count-to-analyze", cl::init(0), cl::Hidden,
+    "unroll-max-iteration-count-to-analyze", cl::init(10), cl::Hidden,
     cl::desc("Don't allow loop unrolling to simulate more than this number of"
              "iterations when checking full unroll profitability"));

 static cl::opt<unsigned> UnrollCount(
     "unroll-count", cl::Hidden,
     cl::desc("Use this unroll count for all loops including those with "
              "unroll_count pragma values, for testing purposes"));
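
To make the effect of the three new defaults easier to follow, here is a simplified, standalone model of how such knobs could interact when deciding on a full unroll. It is a sketch based only on the option descriptions above, not the code in LoopUnrollPass.cpp. The struct `UnrollKnobs`, the helper `shouldAnalyze`, and the exact arithmetic are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the three defaults set by this patch.
struct UnrollKnobs {
  unsigned PercentDynamicCostSavedThreshold = 50; // cl::init(50)
  unsigned DynamicCostSavingsDiscount = 100;      // cl::init(100)
  unsigned MaxIterationsCountToAnalyze = 10;      // cl::init(10), previously 0
};

// Assumed gate: a limit of 0 turns the analysis off, and loops with more
// iterations than the limit are not simulated at all.
bool shouldAnalyze(unsigned TripCount, const UnrollKnobs &K) {
  return K.MaxIterationsCountToAnalyze != 0 &&
         TripCount <= K.MaxIterationsCountToAnalyze;
}

// Assumed decision: unroll if the unrolled cost fits the baseline threshold,
// or if enough dynamic cost is saved and the cost fits after the discount.
bool canUnrollCompletely(uint64_t UnrolledCost, uint64_t RolledDynamicCost,
                         unsigned Threshold, const UnrollKnobs &K) {
  if (UnrolledCost <= Threshold)
    return true;

  assert(RolledDynamicCost > 0 && RolledDynamicCost >= UnrolledCost &&
         "the model expects unrolling never to increase dynamic cost");

  // Percentage of the rolled loop's dynamic cost that unrolling removes.
  uint64_t PercentSaved =
      (RolledDynamicCost - UnrolledCost) * 100 / RolledDynamicCost;
  if (PercentSaved < K.PercentDynamicCostSavedThreshold)
    return false;

  // Apply the discount using signed arithmetic so it cannot wrap around.
  int64_t Discounted = static_cast<int64_t>(UnrolledCost) -
                       static_cast<int64_t>(K.DynamicCostSavingsDiscount);
  return Discounted <= static_cast<int64_t>(Threshold);
}
```

Under this model, with a baseline threshold of 20, an unrolled cost of 100 against a rolled dynamic cost of 400 saves 75% and is accepted once the discount of 100 is applied. Setting the discount to 0 makes the second path equivalent to the baseline check, which is one plausible reading of why the test below passes `-unroll-dynamic-cost-savings-discount=0`.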

llvm/trunk/test/Transforms/LoopUnroll/partial-unroll-const-bounds.ll

-; RUN: opt < %s -S -unroll-threshold=20 -loop-unroll -unroll-allow-partial -unroll-runtime | FileCheck %s
+; RUN: opt < %s -S -unroll-threshold=20 -loop-unroll -unroll-allow-partial -unroll-runtime -unroll-dynamic-cost-savings-discount=0 | FileCheck %s

 ; The Loop TripCount is 9. However unroll factors 3 or 9 exceed given threshold.
 ; The test checks that we choose a smaller, power-of-two, unroll count and do not give up on unrolling.

 ; CHECK: for.body:
 ; CHECK: store
 ; CHECK: for.body.1:
 ; CHECK: store