This is an archive of the discontinued LLVM Phabricator instance.


Differential D119148

[LoopUnroll] Always respect user unroll pragma
Closed · Public

Authored by Whitney on Feb 7 2022, 8:45 AM.


Details

Reviewers

Meinersbur
reames
nikic
lebedev.ri
eliben
fhahn

Commits

rG80304c5f88f0: [LoopUnroll] Always respect user unroll pragma

Summary

IMO, when a user provides an unroll pragma, the compiler should always respect it.
It is not clear to me why the loop unroll pass currently requires the unrolled loop size to stay below PragmaUnrollThreshold.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time       Test
60,060 ms  x64 debian > AddressSanitizer-x86_64-linux.TestCases::scariness_score_test.cpp
60,140 ms  x64 debian > Clang.CodeGen/RISCV/rvv-intrinsics::vloxseg.c
60,270 ms  x64 debian > Clang.CodeGen/RISCV/rvv-intrinsics::vluxseg.c
60,190 ms  x64 debian > Clang.CodeGen/RISCV/rvv-intrinsics-overloaded::vluxseg.c
60,040 ms  x64 debian > MLIR.Examples/standalone::test.toy

Event Timeline

Whitney created this revision. · Feb 7 2022, 8:45 AM

Herald added subscribers: zzheng, hiraditya. · Feb 7 2022, 8:45 AM

Whitney requested review of this revision. · Feb 7 2022, 8:45 AM

Herald added a project: Restricted Project. · Feb 7 2022, 8:45 AM

Herald added a subscriber: llvm-commits.

Whitney updated this revision to Diff 406502. · Feb 7 2022, 9:29 AM

Harbormaster completed remote builds in B148018: Diff 406502. · Feb 7 2022, 11:27 AM

Meinersbur added reviewers: reames, nikic, lebedev.ri, eliben, fhahn. · Feb 8 2022, 3:09 PM

Added some reviewers for discussion.

The max unroll threshold for pragmas was already present from the very beginning: rGff9032459976cf309eda7a12a4b9c2deb4fb9d2b

ping

I think the limit is there just to prevent the compiler from exploding. If you try to unroll too much, you're going to end up with effectively infinite compile time and/or run out of memory.

I would expect it's unlikely to trigger the limit in practice; do you have a practical case where it matters?
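
To put rough numbers on the size concern, here is a minimal sketch, assuming the unrolled-size estimate grows linearly as (LoopSize - BEInsns) * Count + BEInsns. The names estimateUnrolledSize and fitsUnderThreshold, the sample loop size of 8, and the 16 * 1024 threshold are illustrative assumptions, not values quoted in this review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the pass's size estimate: the loop body minus the
// backedge instructions is replicated Count times, and the backedge
// instructions are counted once.
uint64_t estimateUnrolledSize(unsigned LoopSize, unsigned BEInsns,
                              unsigned Count) {
  assert(LoopSize >= BEInsns && "loop size must include backedge instructions");
  return static_cast<uint64_t>(LoopSize - BEInsns) * Count + BEInsns;
}

// Assumed pragma threshold, for illustration only.
constexpr uint64_t AssumedPragmaThreshold = 16 * 1024;

// True if the estimated size would pass a "size < threshold" check.
bool fitsUnderThreshold(unsigned LoopSize, unsigned BEInsns, unsigned Count) {
  return estimateUnrolledSize(LoopSize, BEInsns, Count) <
         AssumedPragmaThreshold;
}
```

With these assumed numbers, a count of 64 gives (8 - 2) * 64 + 2 = 386 instructions, while a trip count of 1000000 gives 6000002, which is the kind of growth the limit guards against.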

In D119148#3330896, @efriedma wrote:

> I think the limit is there just to prevent the compiler from exploding. If you try to unroll too much, you're going to end up with effectively infinite compile time and/or run out of memory.
>
> I would expect it's unlikely to trigger the limit in practice; do you have a practical case where it matters?

We had a change that made certain kinds of instructions return InstructionCost::getMax() from getUserCost(), because we wanted to prevent optimizations like unrolling from happening for loops containing those instructions. (It has since been reverted.)
We were surprised to see that loops with a user pragma were also affected. IMO it is the user's responsibility to add pragmas that make sense.
IIUC, if there is a user pragma, we should always respect it. I don't feel too strongly about it; I mostly want to understand to what degree we want to respect user pragmas.
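
One way to picture the interaction described here is the toy model below. It assumes per-instruction costs are summed with saturation and that the pragma path compares the resulting size against a threshold. The names MaxCost, saturatingAdd, toyLoopSize, and passesSizeCheck are hypothetical and do not reflect how InstructionCost or the unroll pass are actually implemented.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Toy cost type: a cost equal to MaxCost plays the role of a "maximal" cost.
constexpr uint64_t MaxCost = std::numeric_limits<uint64_t>::max();

// Adds two costs, clamping at MaxCost instead of wrapping around.
uint64_t saturatingAdd(uint64_t A, uint64_t B) {
  return (A > MaxCost - B) ? MaxCost : A + B;
}

// Sums per-instruction costs into a loop size estimate.
uint64_t toyLoopSize(const std::vector<uint64_t> &Costs) {
  uint64_t Size = 0;
  for (uint64_t C : Costs)
    Size = saturatingAdd(Size, C);
  return Size;
}

// A pre-patch style check: the pragma is honored only if the size estimate
// stays below the threshold.
bool passesSizeCheck(const std::vector<uint64_t> &Costs, uint64_t Threshold) {
  return toyLoopSize(Costs) < Threshold;
}
```

In this model a single instruction with maximal cost pins the loop size at MaxCost, so the size check fails for any finite threshold, whether or not the loop carries a pragma.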

Unless @efriedma has further concerns, LGTM.

Review comments from D4090 and D4147 suggested much larger limits anyway.

This revision is now accepted and ready to land. · Mar 29 2022, 6:37 PM

Herald added a project: Restricted Project. · Mar 29 2022, 6:37 PM

PragmaUnrollThreshold has other uses; are you intentionally leaving those alone?

I'm not confident it makes sense to "respect the user" here; nobody really benefits from crashing the compiler. But I don't have a strong preference here; "#pragma unroll" is rare enough that I don't expect much practical impact either way.

In D119148#3417337, @efriedma wrote:

> PragmaUnrollThreshold has other uses; are you intentionally leaving those alone?

Yes, it is still possible that shouldPragmaUnroll returns None, e.g., when (UP.AllowRemainder || (TripMultiple % PInfo.PragmaCount == 0)) is false.
And we want to be more aggressive with unrolling limits when given an unrolling pragma.

// If the loop has an unrolling pragma, we want to be more aggressive with
// unrolling limits. Set thresholds to at least the PragmaUnrollThreshold
// value which is larger than the default limits.

Thanks for your input.
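
The case described above, where the pragma path declines and the later heuristics take over, can be sketched as follows. This is a simplified model of the pragma priorities as they read in Diff 406502, not the pass itself: PragmaSketch and pragmaUnrollCount are hypothetical names, std::optional stands in for llvm::Optional, and the first-priority user unroll count is left out.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified inputs; the real pass reads these from PragmaInfo,
// the unrolling preferences, and trip count analysis.
struct PragmaSketch {
  unsigned PragmaCount = 0;      // N from an unroll-count pragma, 0 if absent
  bool PragmaFullUnroll = false; // full unrolling was requested
};

// Pragma priorities without the size check against the pragma threshold.
// The remainder condition still applies to an explicit count.
std::optional<unsigned> pragmaUnrollCount(const PragmaSketch &P,
                                          bool AllowRemainder,
                                          unsigned TripMultiple,
                                          unsigned TripCount) {
  if (P.PragmaCount > 0) {
    if (AllowRemainder || TripMultiple % P.PragmaCount == 0)
      return P.PragmaCount;
  }
  if (P.PragmaFullUnroll && TripCount != 0)
    return TripCount;
  // No pragma decision: fall through to the non-pragma heuristics.
  return std::nullopt;
}
```

For a requested count of 4 with a trip multiple of 6 and remainders disallowed, the sketch returns std::nullopt. That is the situation in which raising the thresholds to at least PragmaUnrollThreshold still has an effect.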

Oh, I see, you still want to use the PragmaUnrollThreshold limit if the user tries to full-unroll a loop that can't be fully unrolled, or something like that. That makes sense.

This revision was landed with ongoing or failed builds. · Apr 11 2022, 11:33 AM

Closed by commit rG80304c5f88f0: [LoopUnroll] Always respect user unroll pragma (authored by Whitney).

This revision was automatically updated to reflect the committed changes.

Whitney added a commit: rG80304c5f88f0: [LoopUnroll] Always respect user unroll pragma.

Revision Contents

Path                                               Size
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp      6 lines
llvm/test/Transforms/LoopUnroll/unroll-pragmas.ll  27 lines

Diff 406502

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp

Hunk context: shouldPragmaUnroll(Loop *L, const PragmaInfo &PInfo,

   if (PInfo.UserUnrollCount) {
     if (UP.AllowRemainder &&
         UCE.getUnrolledLoopSize(UP, (unsigned)UnrollCount) < UP.Threshold)
       return (unsigned)UnrollCount;
   }

   // 2nd priority is unroll count set by pragma.
   if (PInfo.PragmaCount > 0) {
-    if ((UP.AllowRemainder || (TripMultiple % PInfo.PragmaCount == 0)) &&
-        UCE.getUnrolledLoopSize(UP, PInfo.PragmaCount) < PragmaUnrollThreshold)
+    if ((UP.AllowRemainder || (TripMultiple % PInfo.PragmaCount == 0)))
       return PInfo.PragmaCount;
   }

   if (PInfo.PragmaFullUnroll && TripCount != 0) {
-    if (UCE.getUnrolledLoopSize(UP, TripCount) < PragmaUnrollThreshold)
     return TripCount;
   }
   // if didn't return until here, should continue to other priorties
   return None;
 }

 static Optional<unsigned> shouldFullUnroll(
     Loop *L, const TargetTransformInfo &TTI, DominatorTree &DT,
     ScalarEvolution &SE, const SmallPtrSetImpl<const Value *> &EphValues,

llvm/test/Transforms/LoopUnroll/unroll-pragmas.ll

Hunk context: for.body: ; preds = %for.body, %entry

   br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !10

 for.end: ; preds = %for.body
   ret void
 }
 !10 = !{!10, !11}
 !11 = !{!"llvm.loop.unroll.count", i32 1}

-; #pragma clang loop unroll(full)
-; Loop has very high loop count (1 million) and full unrolling was requested.
-; Loop should unrolled up to the pragma threshold, but not completely.
-;
-; CHECK-LABEL: @unroll_1M(
-; CHECK: store i32
-; CHECK: store i32
-; CHECK: br i1
-define void @unroll_1M(i32* nocapture %a, i32 %b) {
-entry:
-  br label %for.body
-
-for.body: ; preds = %for.body, %entry
-  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
-  %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
-  %0 = load i32, i32* %arrayidx, align 4
-  %inc = add nsw i32 %0, 1
-  store i32 %inc, i32* %arrayidx, align 4
-  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
-  %exitcond = icmp eq i64 %indvars.iv.next, 1000000
-  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !12
-
-for.end: ; preds = %for.body
-  ret void
-}
-!12 = !{!12, !4}
-
 ; #pragma clang loop unroll(enable)
 ; Loop should be fully unrolled.
 ;
 ; CHECK-LABEL: @loop64_with_enable(
 ; CHECK-NOT: br i1
 define void @loop64_with_enable(i32* nocapture %a) {
 entry:
   br label %for.body