This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Avoid calculating expensive spill cost when it is not required
Needs ReviewPublic

Authored by dtemirbulatov on Sep 28 2021, 6:04 AM.

Details

Summary

Calculating the spill cost is expensive and we are looking to all instructions among scalars of VectorizableTree in the region to find CallInst instance. But we can avoid that and look for call instructions while extending the scheduler region instead. I measured, for example, while building SPEC FP 2006(C and C++ buildable) we have a ~2% ratio of CallInst present in all cost estimations calculation for vectorizable kernels. This change invokes getSpillCost() when requires.

This is a part https://reviews.llvm.org/D57779 change [SLP] Add support for throttling.

Diff Detail

Event Timeline

dtemirbulatov created this revision.Sep 28 2021, 6:04 AM
dtemirbulatov requested review of this revision.Sep 28 2021, 6:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 28 2021, 6:04 AM
dtemirbulatov edited the summary of this revision. (Show Details)Sep 28 2021, 6:18 AM
dtemirbulatov edited the summary of this revision. (Show Details)Sep 28 2021, 6:27 AM
ABataev added inline comments.Sep 28 2021, 6:35 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
2296–2297

Not sure it is a good idea to add tree here. Maybe just add an internal flag and update R->NoCallInst after scheduling the region, reading the flag from BlockScheduling? or return as one of the results of BS.tryScheduleBundle.

ABataev added inline comments.Sep 28 2021, 6:38 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
2296–2297

Also, maybe it is worth trying to change BoUpSLP::getSpillCost too? For example, we can try to calculate NumCalls in the scheduling function and avoid doing this in the function BoUpSLP::getSpillCost.

dtemirbulatov edited the summary of this revision. (Show Details)Sep 28 2021, 2:19 PM

Rebased, fixed remark.

dtemirbulatov marked an inline comment as done.Dec 6 2021, 4:56 AM
dtemirbulatov added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
2296–2297

I think the implementation is too complex, here in getSpillCost we are calculating the number of calls between two given instructions and as we might change a tree entry the state to gather this might become even more complicated.

ABataev added inline comments.Dec 6 2021, 8:46 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5536–5547

Hmm, you're using NoCalls here but it is always false. Currently NoCallInst is always false.

5546

Better to make it (!NoCallInst || getSpillCost() == 0) && ....

Rebased, fixed remarks.

dtemirbulatov marked 2 inline comments as done.Dec 8 2021, 11:16 PM
ABataev added inline comments.Dec 9 2021, 5:41 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5536–5543
bool NoCallInst = all_of(BlocksSchedules, [](const decltype(BlocksSchedules)::value_type &BS) {return BSIter.second->NoCalls; }));
7282

lifetime intrinsics are also kind of DbgInfoIntrinsic? Or llvm.assume? Better to check for all these kinds of intrinsics.

Addressed remarks.

Missed updating if statement at 5370, fixed.

ABataev added inline comments.Dec 13 2021, 5:27 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5370

auto *CI

5371–5372

There is special function IntrinsicInst::isAssumeLikeIntrinsic(), use it here

7282

auto *CI

7283

There is special function IntrinsicInst::isAssumeLikeIntrinsic(), use it here

Rebased, fixed remarks.

dtemirbulatov marked 2 inline comments as done.Dec 24 2021, 2:47 PM
dtemirbulatov added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5371–5372

We already dynamically cast to CallInst and instanceof AssumeInst is way cheaper.

7283

We already dynamically cast to CallInst and instanceof to AssumeInst is way cheaper.

ABataev added inline comments.Dec 24 2021, 2:52 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5371–5372

intrinsicinst::isAssumeLikeIntrinsic() is way more universal and easier to maintain.

nikic added a subscriber: nikic.Dec 24 2021, 3:10 PM
nikic added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5370

I think it would be best to simply skip all intrinsics here. Intrinsics being expanded inline rather than producing libcalls is a good default assumption if you don't want to model this in detail.

Rebase, I decided to stick with isAssumeLikeIntrinsic() and avoid just checking for any intrinsic since there might be special embedded targets where intrinsic might end up in an actual call to a function.

Rebased, fix formatting. Ping.

ABataev added inline comments.Feb 16 2022, 1:15 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5374–5376

I believe isAssumeLikeIntrinsic() covers all the checks here, no?

dtemirbulatov marked an inline comment as done.Feb 17 2022, 10:20 AM
dtemirbulatov added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5374–5376

Ah, Correct.

dtemirbulatov marked an inline comment as done.

Addressed remark.

Can we have tests? It shall reduce the cost for some of the assume-like intrinsics with this patch.