Download Raw Diff

Details

Reviewers

sdesmalen
kmclaughlin
craig.topper
MattDevereau
paulwalker-arm
nikic
fhahn
dmgreen

Commits

rG905083f3c1a6: [LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombine

Summary

In the LTO pipeline we run InstCombine after LICM, which is
different to what we normally do without LTO. This has the
effect of undoing all the great work done by LICM to reduce
the cost of the loop when it hoists the fdiv out and replaces
it with fmul. When InstCombine runs after LICM it puts the
fdiv straight back which, on AArch64 at least, is darn
expensive. You can observe this problem in the SPEC2017
benchmark parest if you build with "-Ofast -flto" and the
loop-vectoriser uses an unroll factor of 1, which is what
often happens when tail-folding is enabled.

This is also a problem for scalar loops, or indeed any loop
where there is only one use of the preheader fdiv result in
the loop.

See InstCombinerImpl::visitFMul for the code that sinks the fdiv.

I've attempted to fix this by adding another LICM pass for Full
LTO after InstCombine. The alternative is to stop InstCombine
from sinking the fdiv into loops. See D87479 for a previous
discussion on this issue.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

david-arm created this revision.Feb 9 2023, 1:12 AM

Herald added a project: Restricted Project. · View Herald TranscriptFeb 9 2023, 1:12 AM

Herald added subscribers: ormris, StephenFan, asbirlea and 4 others. · View Herald Transcript

david-arm requested review of this revision.Feb 9 2023, 1:12 AM

Herald added a project: Restricted Project. · View Herald TranscriptFeb 9 2023, 1:12 AM

Herald added a subscriber: llvm-commits. · View Herald Transcript

david-arm edited the summary of this revision. (Show Details)Feb 9 2023, 1:21 AM

Harbormaster completed remote builds in B212750: Diff 496042.Feb 9 2023, 2:03 AM

I've added another variation I encountered to D87479 which another run of LICM will not fix.

In D143631#4119689, @craig.topper wrote:

I've added another variation I encountered to D87479 which another run of LICM will not fix.

Would that case be fixed by disabling sinking in InstCombine? Maybe it's finally time to just drop sinking in InstCombine and deal with any fallout.

At least as a first check, compile-time sees a net improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d87468e56c221555be7a1c1550036af5b7034896&to=a5691dd50d4098dec45571fcdfdc5403826a07de&stat=instructions:u

That change should also allow us to close a pile of longstanding InstCombine bugs caused by a conflict between iterative sinking and InstCombine's infinite-loop-threshold.

I get ~60 regression test failures with that patch, but regenerating the first 5 or so tests that I tried to update does not show any real regressions.

In D143631#4123030, @spatel wrote:

In D143631#4119689, @craig.topper wrote:

I've added another variation I encountered to D87479 which another run of LICM will not fix.

Would that case be fixed by disabling sinking in InstCombine? Maybe it's finally time to just drop sinking in InstCombine and deal with any fallout.

At least as a first check, compile-time sees a net improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d87468e56c221555be7a1c1550036af5b7034896&to=a5691dd50d4098dec45571fcdfdc5403826a07de&stat=instructions:u

That change should also allow us to close a pile of longstanding InstCombine bugs caused by a conflict between iterative sinking and InstCombine's infinite-loop-threshold.

I get ~60 regression test failures with that patch, but regenerating the first 5 or so tests that I tried to update does not show any real regressions.

I don't think my new variation as anything to do with the sinking code in InstCombine. It's being caused by the same transform D87479 touched. That code creates an fmul followed by fdiv at the location of the original fmul. Even if the original fdiv was not in the same basic block or loop as the fmul. LICM will hoist a reciprocal out later, but that leaves an extra fmul.

In D143631#4123098, @craig.topper wrote:

In D143631#4123030, @spatel wrote:

In D143631#4119689, @craig.topper wrote:

I've added another variation I encountered to D87479 which another run of LICM will not fix.

Would that case be fixed by disabling sinking in InstCombine? Maybe it's finally time to just drop sinking in InstCombine and deal with any fallout.

At least as a first check, compile-time sees a net improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d87468e56c221555be7a1c1550036af5b7034896&to=a5691dd50d4098dec45571fcdfdc5403826a07de&stat=instructions:u

That change should also allow us to close a pile of longstanding InstCombine bugs caused by a conflict between iterative sinking and InstCombine's infinite-loop-threshold.

I get ~60 regression test failures with that patch, but regenerating the first 5 or so tests that I tried to update does not show any real regressions.

I don't think my new variation as anything to do with the sinking code in InstCombine. It's being caused by the same transform D87479 touched. That code creates an fmul followed by fdiv at the location of the original fmul. Even if the original fdiv was not in the same basic block or loop as the fmul. LICM will hoist a reciprocal out later, but that leaves an extra fmul.

Ah - I see. I'll comment on the other patch then, so I don't clutter up this review about general sinking in InstCombine.

Hi @spatel @craig.topper, I did think about trying to change InstCombine to avoid sinking, but given resistance to this approach in the past I wasn't sure if this was the right approach. However, certainly it's unfortunate that we run InstCombine so many times in the LTO pipeline that we end up with this problem. I agree that adding another LICM pass at the end increases compilation time too, which isn't great. I think ultimately we have to decide between 1) increased compilation time without changing InstCombine, 2) reduce compilation time and increase performance by not sinking the fdiv in InstCombine. I'm not particularly attached to either approach, but I definitely do want to fix it. :)

Depending on the LICM pass being run after InstCombine feels a bit like a game of cat and mouse, where the former hoists out the reciprocal and the latter restores things to the canonical form. I worry that guarding this ordering with just a test is a bit fragile.

There also isn't much precedent for changing InstCombine to not do the simplification, as I only spotted one cases in InstCombine (InstCombinerImpl::visitGEPOfGEP) where it looked at LoopInfo to see if something is loop invariant. Additionally, if having 1/x outside the loop is the canonical form, it still relies on LICM being run first and for LoopInfo to be available in InstCombine (which in my understanding is optional, depending on the passes run before it).

So should this functionality perhaps be moved out of LICM into a more limited pass that is run just before SelectionDAG, or perhaps be made part of CodeGenPrepare?

In D143631#4125429, @sdesmalen wrote:

Depending on the LICM pass being run after InstCombine feels a bit like a game of cat and mouse, where the former hoists out the reciprocal and the latter restores things to the canonical form. I worry that guarding this ordering with just a test is a bit fragile.

Agree.

There also isn't much precedent for changing InstCombine to not do the simplification, as I only spotted one cases in InstCombine (InstCombinerImpl::visitGEPOfGEP) where it looked at LoopInfo to see if something is loop invariant. Additionally, if having 1/x outside the loop is the canonical form, it still relies on LICM being run first and for LoopInfo to be available in InstCombine (which in my understanding is optional, depending on the passes run before it).

Agree again. But we do make some concessions about canonical form when we know an instruction really is going to be expensive on all targets (any type of div/rem).

So should this functionality perhaps be moved out of LICM into a more limited pass that is run just before SelectionDAG, or perhaps be made part of CodeGenPrepare?

CGP is an option, but that's also potentially not reliable (supposed to eventually go away with the switch to GISel?). A refinement of D87479 doesn't seem that bad to me after all. It's a hack, but there's no ideal fix in sight.

In D143631#4126856, @spatel wrote:

In D143631#4125429, @sdesmalen wrote:

Depending on the LICM pass being run after InstCombine feels a bit like a game of cat and mouse, where the former hoists out the reciprocal and the latter restores things to the canonical form. I worry that guarding this ordering with just a test is a bit fragile.

Agree.

There also isn't much precedent for changing InstCombine to not do the simplification, as I only spotted one cases in InstCombine (InstCombinerImpl::visitGEPOfGEP) where it looked at LoopInfo to see if something is loop invariant. Additionally, if having 1/x outside the loop is the canonical form, it still relies on LICM being run first and for LoopInfo to be available in InstCombine (which in my understanding is optional, depending on the passes run before it).

Agree again. But we do make some concessions about canonical form when we know an instruction really is going to be expensive on all targets (any type of div/rem).

So should this functionality perhaps be moved out of LICM into a more limited pass that is run just before SelectionDAG, or perhaps be made part of CodeGenPrepare?

CGP is an option, but that's also potentially not reliable (supposed to eventually go away with the switch to GISel?). A refinement of D87479 doesn't seem that bad to me after all. It's a hack, but there's no ideal fix in sight.

Do you have a proposed refinement for D87479? Turns out LI is only used by InstCombine if its available already in the pipeline so I don't know if it's safe to use that instead of the basic block check.

In D143631#4126874, @craig.topper wrote:

Do you have a proposed refinement for D87479? Turns out LI is only used by InstCombine if its available already in the pipeline so I don't know if it's safe to use that instead of the basic block check.

Yes, using LoopInfo was what I was thinking. I just made a rough draft of that, and it handles the motivating case here, and I think it would work on the other examples we've raised too. I can clean it up, test some more, and post it.

In D143631#4126901, @spatel wrote:

In D143631#4126874, @craig.topper wrote:

Do you have a proposed refinement for D87479? Turns out LI is only used by InstCombine if its available already in the pipeline so I don't know if it's safe to use that instead of the basic block check.

Yes, using LoopInfo was what I was thinking. I just made a rough draft of that, and it handles the motivating case here, and I think it would work on the other examples we've raised too. I can clean it up, test some more, and post it.

I don't think it it will work on the case I raised since IR LICM can't fix it so the damage will happen the first time InstCombine runs without LoopInfo. But I hope I'm wrong.

In D143631#4126961, @craig.topper wrote:

In D143631#4126901, @spatel wrote:

In D143631#4126874, @craig.topper wrote:

Do you have a proposed refinement for D87479? Turns out LI is only used by InstCombine if its available already in the pipeline so I don't know if it's safe to use that instead of the basic block check.

Yes, using LoopInfo was what I was thinking. I just made a rough draft of that, and it handles the motivating case here, and I think it would work on the other examples we've raised too. I can clean it up, test some more, and post it.

I don't think it it will work on the case I raised since IR LICM can't fix it so the damage will happen the first time InstCombine runs without LoopInfo. But I hope I'm wrong.

You're correct - it seems that LICM isn't able to do this in general:

loop:
  m1 = loopInvariant1 * x;
  m2 = loopInvariant2 * m1;
-->
loopInvariantMul = loopInvariant1 * loopInvariant2;
loop:
  m = x * loopInvariantMul;

I don't know much about LICM - should it be able to do that transform? Or is some other pass responsible for the math reassociation?
This doesn't seem to be fdiv-specific. Here's an example with fadd/fmul; there should only be 1 invariant fadd in the loop:
https://godbolt.org/z/xrs9xfGrf

spatel mentioned this in D144045: [InstCombine] Avoid sinking fdiv into a loop.Feb 14 2023, 2:06 PM

nikic mentioned this in D143726: [LICM] Simplify (X < A && X < B) into (X < MIN(A, B)) if MIN(A, B) is loop-invariant.Feb 14 2023, 11:29 PM

spatel mentioned this in D144274: [InstCombine] use loop info when running the pass after loop vectorization.Mar 3 2023, 12:43 PM

david-arm mentioned this in D152366: [LoopVectorize] Allow inner loop runtime checks to be hoisted above an outer loop.Jun 29 2023, 5:43 AM

Rebase

Gentle ping! After discussions on an alternative approach (D144045), I think this solution is the correct way forward. It would be great if we could fix this issue for LLVM 17!

nikic added inline comments.Jun 29 2023, 6:59 AM

llvm/lib/Passes/PassBuilderPipelines.cpp
1247–1248	It would be better to move these InstCombine + LICM passes out of the if and drop the InstCombine below.

Harbormaster completed remote builds in B242071: Diff 535766.Jun 29 2023, 7:36 AM

Restructured the passes according to review comments.

david-arm marked an inline comment as done.Jul 3 2023, 9:11 AM

Harbormaster completed remote builds in B242819: Diff 536791.Jul 3 2023, 11:49 AM

Looks good but the title could do with rewording because the patch doesn't actually change instcombine.

This revision is now accepted and ready to land.Jul 3 2023, 12:31 PM

Compile-time: http://llvm-compile-time-tracker.com/compare.php?from=6c5ba7cddfa2210e9ddb12ab6f016b84db9a8b23&to=ef44d3f482b284ee11a5335d8c714f7eceb3dea7&stat=instructions:u

Post-link numbers are:

kc.link 	 	30147M 	30326M (+0.59%)
sqlite3.link 	 	40914M 	41303M (+0.95%)
consumer-typeset.link 	36935M 	37405M (+1.27%)
bullet.link 	 	27582M 	27914M (+1.20%)
tramp3d-v4.link 	137504M 139271M (+1.29%)
pairlocalalign.link 	19476M 	19686M (+1.08%)
clamscan.link 	 	56907M 	57439M (+0.93%)
lencod.link 	 	102321M 102803M (+0.47%)
SPASS.link 	 	51517M 	52669M (+2.24%)
7zip-benchmark.link 	106003M 105851M (-0.14%)

Despite that, I'm on board with this change.

llvm/lib/Passes/PassBuilderPipelines.cpp
1261	UseBlockFrequencyInfo should stay false.

In D143631#4469455, @paulwalker-arm wrote:

Looks good but the title could do with rewording because the patch doesn't actually change instcombine.

Hopefully the new title is better now!

Harbormaster completed remote builds in B243236: Diff 537382.Jul 5 2023, 10:16 AM

This revision was landed with ongoing or failed builds.Jul 7 2023, 4:06 AM

Closed by commit rG905083f3c1a6: [LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombine (authored by david-arm). · Explain Why

This revision was automatically updated to reflect the committed changes.

david-arm added a commit: rG905083f3c1a6: [LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombine.

Diff 538071

llvm/lib/Passes/PassBuilderPipelines.cpp

Show First 20 Lines • Show All 1,238 Lines • ▼ Show 20 Lines	FPM.addPass(LoopUnrollPass(LoopUnrollOptions(
PTO.ForgetAllSCEVInLoopUnroll)));		PTO.ForgetAllSCEVInLoopUnroll)));
FPM.addPass(WarnMissedTransformationsPass());		FPM.addPass(WarnMissedTransformationsPass());
// Now that we are done with loop unrolling, be it either by LoopVectorizer,		// Now that we are done with loop unrolling, be it either by LoopVectorizer,
// or LoopUnroll passes, some variable-offset GEP's into alloca's could have		// or LoopUnroll passes, some variable-offset GEP's into alloca's could have
// become constant-offset, thus enabling SROA and alloca promotion. Do so.		// become constant-offset, thus enabling SROA and alloca promotion. Do so.
// NOTE: we are very late in the pipeline, and we don't have any LICM		// NOTE: we are very late in the pipeline, and we don't have any LICM
// or SimplifyCFG passes scheduled after us, that would cleanup		// or SimplifyCFG passes scheduled after us, that would cleanup
// the CFG mess this may created if allowed to modify CFG, so forbid that.		// the CFG mess this may created if allowed to modify CFG, so forbid that.
FPM.addPass(SROAPass(SROAOptions::PreserveCFG));		FPM.addPass(SROAPass(SROAOptions::PreserveCFG));
		}
		nikicUnsubmitted Done Reply Inline Actions It would be better to move these InstCombine + LICM passes out of the if and drop the InstCombine below. nikic: It would be better to move these InstCombine + LICM passes out of the if and drop the…

FPM.addPass(InstCombinePass());		FPM.addPass(InstCombinePass());

		// This is needed for two reasons:
		// 1. It works around problems that instcombine introduces, such as sinking
		// expensive FP divides into loops containing multiplications using the
		// divide result.
		// 2. It helps to clean up some loop-invariant code created by the loop
		// unroll pass when IsFullLTO=false.
FPM.addPass(createFunctionToLoopPassAdaptor(		FPM.addPass(createFunctionToLoopPassAdaptor(
LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap,		LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap,
/AllowSpeculation=/true),		/AllowSpeculation=/true),
/UseMemorySSA=/true, /UseBlockFrequencyInfo=/false));		/UseMemorySSA=/true, /UseBlockFrequencyInfo=/false));
		nikicUnsubmitted Done Reply Inline Actions UseBlockFrequencyInfo should stay false. nikic: UseBlockFrequencyInfo should stay false.
}

// Now that we've vectorized and unrolled loops, we may have more refined		// Now that we've vectorized and unrolled loops, we may have more refined
// alignment information, try to re-derive it here.		// alignment information, try to re-derive it here.
FPM.addPass(AlignmentFromAssumptionsPass());		FPM.addPass(AlignmentFromAssumptionsPass());

if (IsFullLTO)
FPM.addPass(InstCombinePass());
}		}

ModulePassManager		ModulePassManager
PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,		PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
ThinOrFullLTOPhase LTOPhase) {		ThinOrFullLTOPhase LTOPhase) {
const bool LTOPreLink = isLTOPreLink(LTOPhase);		const bool LTOPreLink = isLTOPreLink(LTOPhase);
ModulePassManager MPM;		ModulePassManager MPM;

▲ Show 20 Lines • Show All 775 Lines • Show Last 20 Lines

llvm/test/Other/new-pm-lto-defaults.ll

	Show First 20 Lines • Show All 122 Lines • ▼ Show 20 Lines
	; CHECK-O23SZ-NEXT: Running pass: SimplifyCFGPass on foo			; CHECK-O23SZ-NEXT: Running pass: SimplifyCFGPass on foo
	; CHECK-O23SZ-NEXT: Running pass: SCCPPass on foo			; CHECK-O23SZ-NEXT: Running pass: SCCPPass on foo
	; CHECK-O23SZ-NEXT: Running pass: InstCombinePass on foo			; CHECK-O23SZ-NEXT: Running pass: InstCombinePass on foo
	; CHECK-O23SZ-NEXT: Running pass: BDCEPass on foo			; CHECK-O23SZ-NEXT: Running pass: BDCEPass on foo
	; CHECK-O2-NEXT: Running pass: SLPVectorizerPass on foo			; CHECK-O2-NEXT: Running pass: SLPVectorizerPass on foo
	; CHECK-O3-NEXT: Running pass: SLPVectorizerPass on foo			; CHECK-O3-NEXT: Running pass: SLPVectorizerPass on foo
	; CHECK-OS-NEXT: Running pass: SLPVectorizerPass on foo			; CHECK-OS-NEXT: Running pass: SLPVectorizerPass on foo
	; CHECK-O23SZ-NEXT: Running pass: VectorCombinePass on foo			; CHECK-O23SZ-NEXT: Running pass: VectorCombinePass on foo
	; CHECK-O23SZ-NEXT: Running pass: AlignmentFromAssumptionsPass on foo
	; CHECK-O23SZ-NEXT: Running pass: InstCombinePass on foo			; CHECK-O23SZ-NEXT: Running pass: InstCombinePass on foo
				; CHECK-O23SZ-NEXT: Running pass: LoopSimplifyPass
				; CHECK-O23SZ-NEXT: Running pass: LCSSAPass
				; CHECK-O23SZ-NEXT: Running pass: LICMPass
				; CHECK-O23SZ-NEXT: Running pass: AlignmentFromAssumptionsPass on foo
	; CHECK-EP-Peephole-NEXT: Running pass: NoOpFunctionPass on foo			; CHECK-EP-Peephole-NEXT: Running pass: NoOpFunctionPass on foo
	; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass on foo			; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass on foo
	; CHECK-O23SZ-NEXT: Running pass: LowerTypeTestsPass			; CHECK-O23SZ-NEXT: Running pass: LowerTypeTestsPass
	; CHECK-O-NEXT: Running pass: LowerTypeTestsPass			; CHECK-O-NEXT: Running pass: LowerTypeTestsPass
	; CHECK-O23SZ-NEXT: Running pass: LoopSink			; CHECK-O23SZ-NEXT: Running pass: LoopSink
	; CHECK-O23SZ-NEXT: Running pass: DivRemPairs			; CHECK-O23SZ-NEXT: Running pass: DivRemPairs
	; CHECK-O23SZ-NEXT: Running pass: SimplifyCFGPass			; CHECK-O23SZ-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O23SZ-NEXT: Running pass: EliminateAvailableExternallyPass			; CHECK-O23SZ-NEXT: Running pass: EliminateAvailableExternallyPass
	Show All 35 Lines

llvm/test/Transforms/PhaseOrdering/lto-licm.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -passes='lto<O3>' -S < %s \| FileCheck %s			; RUN: opt -passes='lto<O3>' -S < %s \| FileCheck %s

	define void @hoist_fdiv(ptr %a, float %b) {			define void @hoist_fdiv(ptr %a, float %b) {
	; CHECK-LABEL: @hoist_fdiv(			; CHECK-LABEL: @hoist_fdiv(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.]] = fdiv fast float 1.000000e+00, [[B:%.]]
	; CHECK-NEXT: br label [[FOR_COND:%.*]]			; CHECK-NEXT: br label [[FOR_COND:%.*]]
	; CHECK: for.cond:			; CHECK: for.cond:
	; CHECK-NEXT: [[I_0:%.]] = phi i32 [ 0, [[ENTRY:%.]] ], [ [[INC:%.]], [[FOR_INC:%.]] ]			; CHECK-NEXT: [[I_0:%.]] = phi i32 [ 0, [[ENTRY:%.]] ], [ [[INC:%.]], [[FOR_INC:%.]] ]
	; CHECK-NEXT: [[CMP_NOT:%.*]] = icmp eq i32 [[I_0]], 1024			; CHECK-NEXT: [[CMP_NOT:%.*]] = icmp eq i32 [[I_0]], 1024
	; CHECK-NEXT: br i1 [[CMP_NOT]], label [[FOR_END:%.*]], label [[FOR_INC]]			; CHECK-NEXT: br i1 [[CMP_NOT]], label [[FOR_END:%.*]], label [[FOR_INC]]
	; CHECK: for.inc:			; CHECK: for.inc:
	; CHECK-NEXT: [[IDXPROM:%.*]] = zext i32 [[I_0]] to i64			; CHECK-NEXT: [[IDXPROM:%.*]] = zext i32 [[I_0]] to i64
	; CHECK-NEXT: [[ARRAYIDX:%.]] = getelementptr inbounds float, ptr [[A:%.]], i64 [[IDXPROM]]			; CHECK-NEXT: [[ARRAYIDX:%.]] = getelementptr inbounds float, ptr [[A:%.]], i64 [[IDXPROM]]
	; CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[ARRAYIDX]], align 4			; CHECK-NEXT: [[TMP1:%.*]] = load float, ptr [[ARRAYIDX]], align 4
	; CHECK-NEXT: [[TMP1:%.]] = fdiv fast float [[TMP0]], [[B:%.]]			; CHECK-NEXT: [[TMP2:%.*]] = fmul fast float [[TMP1]], [[TMP0]]
	; CHECK-NEXT: store float [[TMP1]], ptr [[ARRAYIDX]], align 4			; CHECK-NEXT: store float [[TMP2]], ptr [[ARRAYIDX]], align 4
	; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_0]], 1			; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_0]], 1
	; CHECK-NEXT: br label [[FOR_COND]]			; CHECK-NEXT: br label [[FOR_COND]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.cond			br label %for.cond

	Show All 23 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombine
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 538071

llvm/lib/Passes/PassBuilderPipelines.cpp

llvm/test/Other/new-pm-lto-defaults.ll

llvm/test/Transforms/PhaseOrdering/lto-licm.ll

This is an archive of the discontinued LLVM Phabricator instance.

[LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombineClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 538071

llvm/lib/Passes/PassBuilderPipelines.cpp

llvm/test/Other/new-pm-lto-defaults.ll

llvm/test/Transforms/PhaseOrdering/lto-licm.ll

[LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombine
ClosedPublic