This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorizer] give more advice in remark about failure to vectorize call
Closed · Public

Authored by spatel on Jan 10 2019, 9:31 AM.


Details

Reviewers

hfinkel
Ayal
efriedma

Commits

rG7d65fe5cd551: [LoopVectorizer] give more advice in remark about failure to vectorize call
rL351010: [LoopVectorizer] give more advice in remark about failure to vectorize call

Summary

Something like this is requested by:
https://bugs.llvm.org/show_bug.cgi?id=40265
...and it seems like a common enough case that we should acknowledge it. Not sure if this crosses the line for wordiness in an optimization remark though.
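
For context, a loop of roughly the following shape is the kind of case this remark targets. This is an illustrative sketch, not the test case from the bug report, and `sqrtAll` is a hypothetical name. On targets where math-errno is the default, clang is expected to keep `std::sqrt` as a call to the `sqrtf` library function (it may have to set `errno`), and the vectorizer then reports that the call cannot be vectorized.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical example: element-wise square root over a float array.
// With math-errno semantics the sqrt stays a library call, which is
// what blocks loop vectorization in the scenario discussed here.
void sqrtAll(float *out, const float *in, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = std::sqrt(in[i]);
}
```

Building with something like `clang++ -O2 -Rpass-analysis=loop-vectorize` should surface the remark for this loop; per the discussion in this review, `-fno-math-errno` alone was enough to lift the limitation in the motivating report.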

Event Timeline

spatel created this revision. Jan 10 2019, 9:31 AM

Herald added a subscriber: mcrosier. Jan 10 2019, 9:31 AM

Can we check whether the function could be vectorized if fast math were enabled, so we only show the advice when it's relevant?

"relaxing the floating-point model" is a little confusing... can we explicitly say "consider turning on fast math" or something like that?

Patch updated:

Try to distinguish a vectorizable libcall from an arbitrary call (I don't see an exact mapping, but "hasOptimizedCodeGen()" looks close).
Add tests to show that we correctly differentiate the 2 cases.

hfinkel added inline comments. Jan 10 2019, 6:05 PM

lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
727	I'd prefer it say "fast-math mode" instead of just "fast-math". It would be nice if we could also point users to -fno-math-errno, as that might fix this problem for them and they might not be able to use -ffast-math for the whole translation unit. Now, we already have a problem in the vectorizer because it has a lot of optimization remarks that mention Clang-specific things (flags, pragmas, etc.). The intent of the optimization-remark design was that the frontend callback handler would handle such cases by adding frontend-specific information in the frontend (and not have it embedded here). That didn't happen, and while we should clean this up, in the meantime we might just make the problem incrementally worse and mention flags here too: "try compiling with -fno-math-errno or -ffast-math".
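
The design intent described in this comment (the optimizer emits frontend-neutral remarks, and the frontend's handler adds its own flag names) could look roughly like the sketch below. Every name in it (`AdviceKind`, `GenericRemark`, `renderForClangLikeFrontend`) is hypothetical; this is not LLVM's remark API, only an illustration of the split of responsibilities.

```cpp
#include <cassert>
#include <string>

// Hypothetical machine-readable hint attached by the optimizer in place
// of frontend-specific flag names.
enum class AdviceKind { None, RelaxMathErrno };

// Hypothetical frontend-neutral remark as the optimizer might emit it.
struct GenericRemark {
  std::string Message; // text that mentions no particular frontend
  AdviceKind Advice;   // what kind of user action could help
};

// What a clang-like frontend handler might append for its own users.
// A different frontend would map the same hint to its own options.
std::string renderForClangLikeFrontend(const GenericRemark &R) {
  switch (R.Advice) {
  case AdviceKind::RelaxMathErrno:
    return R.Message + "; try compiling with -fno-math-errno or -ffast-math";
  case AdviceKind::None:
    break;
  }
  return R.Message;
}
```

The point of the split is that the flag spelling lives in one frontend-owned place, so the optimizer's text stays valid for non-clang invocations.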

spatel marked an inline comment as done. Jan 11 2019, 6:12 AM

spatel added inline comments.

lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
727	Yes, I was trying to avoid clang-specific language here. And in the motivating bug report, you're exactly right -- we only needed -fno-math-errno to overcome the limitation (that's why I was using the likely too vague "relaxed FP" vocabulary in the previous rev).

Patch updated:

Added an FP-type constraint to the mathlib check (no point suggesting FP flags if it's not an FP call).
Changed remark text to include clang-specific flags (and suggest/hope that users can translate those to their actual front-end options if this isn't a clang-based invocation).
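
The decision described in the two patch updates (recognized library function, "optimized codegen" as a rough proxy for a math libcall, and a floating-point type constraint) can be sketched in isolation as follows. `CallFacts`, `isLikelyMathLibCall`, and `remarkFor` are hypothetical stand-ins rather than LLVM types or functions, and the remark wording is adapted from the review discussion, not quoted from the committed patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified summary of what the legality check inspects
// on a call instruction; not an LLVM type.
struct CallFacts {
  bool HasKnownCallee;       // direct call with a resolvable callee
  bool ReturnsFloatingPoint; // the FP-type constraint from the update
  bool IsRecognizedLibFunc;  // the library info recognizes the name
  bool HasOptimizedCodeGen;  // rough proxy for "math library call"
};

// True only when advice about FP flags is likely to be relevant.
bool isLikelyMathLibCall(const CallFacts &C) {
  return C.HasKnownCallee && C.ReturnsFloatingPoint &&
         C.IsRecognizedLibFunc && C.HasOptimizedCodeGen;
}

// Pick the remark text; arbitrary calls keep the generic message.
std::string remarkFor(const CallFacts &C) {
  if (isLikelyMathLibCall(C))
    return "library call cannot be vectorized; "
           "try compiling with -fno-math-errno or -ffast-math";
  return "call instruction cannot be vectorized";
}
```

For example, `remarkFor({true, true, true, true})` yields the flag advice, while a call that does not return a floating-point value, or one the library info does not recognize, falls back to the generic message.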

LGTM

This revision is now accepted and ready to land. Jan 11 2019, 8:38 AM

This LGTM too, just adding mtcw wondering if these extra checks for more accurate reporting are worth placing under allowExtraAnalysis(); and/or if TLI->isFunctionVectorizable() shouldn't be the one informing the cause of its failure when returning false.

In D56551#1354518, @Ayal wrote:

This LGTM too, just adding mtcw wondering if these extra checks for more accurate reporting are worth placing under allowExtraAnalysis(); and/or if TLI->isFunctionVectorizable() shouldn't be the one informing the cause of its failure when returning false.

Those are good questions/comments. I'm not too familiar with the code organization here, but I'll add that to the 'TODO' comment for now, so we don't lose it.

Closed by commit rL351010: [LoopVectorizer] give more advice in remark about failure to vectorize call (authored by spatel). Jan 12 2019, 7:32 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
Size: 4 lines

Diff 181077

lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

[709 earlier lines not shown; enclosing scope: for (Instruction &I : *BB) {]

       // * Have a mapping to an IR intrinsic.
       // * Have a vector version available.
       auto *CI = dyn_cast<CallInst>(&I);
       if (CI && !getVectorIntrinsicIDForCall(CI, TLI) &&
           !isa<DbgInfoIntrinsic>(CI) &&
           !(CI->getCalledFunction() && TLI &&
             TLI->isFunctionVectorizable(CI->getCalledFunction()->getName()))) {
         ORE->emit(createMissedAnalysis("CantVectorizeCall", CI)
-                  << "call instruction cannot be vectorized");
+                  << "call instruction cannot be vectorized "
+                     "(if this is a math library call, consider relaxing the "
+                     "floating-point model)");
         LLVM_DEBUG(
             dbgs() << "LV: Found a non-intrinsic, non-libfunc callsite.\n");
         return false;
       }

       // Intrinsics such as powi,cttz and ctlz are legal to vectorize if the
       // second argument is the same (i.e. loop invariant)
       if (CI && hasVectorInstrinsicScalarOpd(
                     getVectorIntrinsicIDForCall(CI, TLI), 1)) {
         auto *SE = PSE.getSE();
         if (!SE->isLoopInvariant(PSE.getSCEV(CI->getOperand(1)), TheLoop)) {
           ORE->emit(createMissedAnalysis("CantVectorizeIntrinsic", CI)
                     << "intrinsic instruction cannot be vectorized");
           LLVM_DEBUG(dbgs()
                      << "LV: Found unvectorizable intrinsic " << *CI << "\n");

[459 later lines not shown]