This is an archive of the discontinued LLVM Phabricator instance.

Differential D47023

Limit the number of phis in intptr/ptrint folding
ClosedPublic

Authored by davidxl on May 17 2018, 12:05 PM.

Details

Reviewers

mzolotukhin

Commits

rGbc471c39eed3: Add a limit for phi folding instcombine
rL332653: Add a limit for phi folding instcombine

Summary

In rare cases where a basic block contains thousands of single-use phis, the phi folding simplification can take an excessive amount of time. Add an option to limit the number of phis examined. Mostly NFC.
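
To illustrate why a cap helps, here is a rough cost model, not LLVM code. It assumes that every phi in the block reaches the matching loop and never finds a match, so each phi scans the block's phi list from the beginning. The function name `countPhiVisits` and its parameters are hypothetical; only the `NumPhis > MaxNumPhis` comparison is taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cost model (not part of LLVM): counts how many phi nodes are
// inspected when each of `numPhis` phis scans the block's phi list from the
// start and gives up once more than `maxNumPhis` entries have been visited.
// Worst case is assumed: no scan ever finds a matching phi.
std::uint64_t countPhiVisits(std::uint64_t numPhis, std::uint64_t maxNumPhis) {
  std::uint64_t visits = 0;
  for (std::uint64_t phi = 0; phi < numPhis; ++phi) {
    std::uint64_t seen = 0;
    for (std::uint64_t other = 0; other < numPhis; ++other, ++seen) {
      if (seen > maxNumPhis)
        break; // same shape as `if (NumPhis > MaxNumPhis) return nullptr;`
      ++visits;
    }
  }
  // Sanity check: the cap can only reduce the work, never add to it.
  assert(visits <= numPhis * numPhis);
  return visits;
}
```

Under these assumptions, 20000 phis cost 400,000,000 visits when the cap is not smaller than the phi count, and 10,260,000 visits with the default limit of 512. The model covers only this one loop, so it does not predict overall compile time.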

Diff Detail

Repository: rL LLVM

Event Timeline

davidxl created this revision. May 17 2018, 12:05 PM

Do you mind adding a TODO note describing the problem and another way to fix it? Otherwise, LGTM.

Thanks,
Michael

This revision is now accepted and ready to land. May 17 2018, 12:13 PM

Done. Added comment.

Closed by commit rL332653: Add a limit for phi folding instcombine (authored by davidxl). May 17 2018, 12:27 PM

This revision was automatically updated to reflect the committed changes.

Hi David,

Unfortunately, this change doesn't help the testcase we have, so there must be other issues as well. Do you have other ideas for how it can be fixed (the script I sent earlier still exposes the issue)? Instruments shows that most of the time is currently spent in Use::getUser().

Thanks,
Michael

My profile shows that in the 20000-phi case, the cycles are cut in half with the limit.

The remaining problem is in llvm::BasicBlock::getFirstNonPHI.

I guess a short-term fix would be to have a customized version of getFirstNonPHI that returns 'end' if the limit is reached. You can check whether that works for you.

thanks,

David
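
A minimal sketch of the bounded lookup suggested above, written against a toy instruction list rather than LLVM's `BasicBlock`. The type `ToyInst` and the function `firstNonPhiOrEnd` are hypothetical names introduced for illustration; the real change would have to work with LLVM's iterators.

```cpp
#include <cassert> // used by the checks in the note below
#include <cstddef>
#include <vector>

// Toy stand-in for an instruction; this is not an LLVM type.
struct ToyInst {
  bool isPhi;
};

// Hypothetical helper: returns the index of the first non-phi instruction.
// If more than `limit` leading phis are seen before one is found, it stops
// walking and returns block.size(), which plays the role of 'end'.
std::size_t firstNonPhiOrEnd(const std::vector<ToyInst> &block,
                             std::size_t limit) {
  std::size_t numPhis = 0;
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (!block[i].isPhi)
      return i;            // normal case: found the first non-phi
    if (++numPhis > limit)
      return block.size(); // too many phis: stop walking, report 'end'
  }
  return block.size();     // block consists only of phis
}
```

For a block of two phis followed by one other instruction, `firstNonPhiOrEnd(block, 2)` returns 2, while `firstNonPhiOrEnd(block, 1)` returns 3, the 'end' position. The walk is cut short after `limit + 1` phis, which is the point of the suggestion.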

I thought about this transformation more, and I no longer think that we even need to move it to aggressive-instcombine (or another FunctionPass). What we need is to just change it from top-down to bottom-up: i.e. to start looking not from phi-nodes, but rather from inttoptr instructions. That is, the algorithm would look like:

visitIntToPtr(Instruction &I) {
  Value *Def = I.getOperand(0);
  if (!Def->hasOneUse())
    return;
  // Simple case without a phi. It is probably already handled somewhere
  // else, but it is included here for completeness.
  if (auto *P2I = dyn_cast<PtrToIntInst>(Def)) {
    I.replaceAllUsesWith(P2I->getOperand(0));
    return;
  }
  // Interesting case where we have a phi node.
  if (isa<PHINode>(Def)) {
    if (all operands are PtrToInt with a single use) {
      NewPHI = RewritePHI();
      I.replaceAllUsesWith(NewPHI);
    }
  }
}

What do you think? Would it work?

PS: Internally we worked around the slowdown, so it's not pressing on us anymore.

Michael

This might work and existing test cases should be able to make sure the
behavior is not regressed.

David
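
The bottom-up idea discussed above can be sketched on a toy value graph. This is not LLVM code: `Kind`, `Node`, `Fold`, and `classifyIntToPtr` are hypothetical names, and the sketch only decides which case applies, leaving the actual rewrite out. It assumes use counts are tracked in a plain `numUses` field.

```cpp
#include <cassert>
#include <vector>

// Toy IR, introduced only for this illustration.
enum class Kind { Other, PtrToInt, IntToPtr, Phi };

struct Node {
  Kind kind;
  std::vector<Node *> operands;
  int numUses;
};

// Which of the cases from the pseudocode applies to an inttoptr node.
enum class Fold { None, DirectPointer, PointerPhi };

// Starts from the inttoptr (bottom-up) and inspects only its single operand,
// so no scan over all phis of the block is needed.
Fold classifyIntToPtr(const Node &I) {
  assert(I.kind == Kind::IntToPtr && I.operands.size() == 1);
  const Node *def = I.operands[0];
  if (def->numUses != 1)
    return Fold::None;
  if (def->kind == Kind::PtrToInt)
    return Fold::DirectPointer; // inttoptr(ptrtoint(p)): reuse p
  if (def->kind == Kind::Phi) {
    if (def->operands.empty())
      return Fold::None;
    // Every incoming value must be a single-use ptrtoint.
    for (const Node *in : def->operands)
      if (in->kind != Kind::PtrToInt || in->numUses != 1)
        return Fold::None;
    return Fold::PointerPhi; // the phi could be rewritten as a pointer phi
  }
  return Fold::None;
}
```

The work per inttoptr is proportional to the number of incoming values of one phi, rather than to the number of phis in the block. A real implementation would also need to check that the pointer types match before replacing uses.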

Revision Contents

Path: llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp
Size: 10 lines

Diff 147373

llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp

@@ 17 lines not shown @@
 #include "llvm/Analysis/Utils/Local.h"
 #include "llvm/Analysis/ValueTracking.h"
 #include "llvm/IR/PatternMatch.h"
 using namespace llvm;
 using namespace llvm::PatternMatch;

 #define DEBUG_TYPE "instcombine"

+static cl::opt<unsigned>
+MaxNumPhis("instcombine-max-num-phis", cl::init(512),
+           cl::desc("Maximum number phis to handle in intptr/ptrint folding"));
+
 /// The PHI arguments will be folded into a single operation with a PHI node
 /// as input. The debug location of the single operation will be the merged
 /// locations of the original PHI node arguments.
 void InstCombiner::PHIArgMergedDebugLoc(Instruction *Inst, PHINode &PN) {
   auto *FirstInst = cast<Instruction>(PN.getIncomingValue(0));
   Inst->setDebugLoc(FirstInst->getDebugLoc());
   // We do not expect a CallInst here, otherwise, N-way merging of DebugLoc
   // will be inefficient.
@@ 137 lines not shown @@ for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
     AvailablePtrVals.emplace_back(LoadI);
   }

   // Now search for a matching PHI
   auto *BB = PN.getParent();
   assert(AvailablePtrVals.size() == PN.getNumIncomingValues() &&
          "Not enough available ptr typed incoming values");
   PHINode *MatchingPtrPHI = nullptr;
+  unsigned NumPhis = 0;
   for (auto II = BB->begin(), EI = BasicBlock::iterator(BB->getFirstNonPHI());
-       II != EI; II++) {
+       II != EI; II++, NumPhis++) {
+    // FIXME: consider handling this in AggressiveInstCombine
+    if (NumPhis > MaxNumPhis)
+      return nullptr;
     PHINode *PtrPHI = dyn_cast<PHINode>(II);
     if (!PtrPHI || PtrPHI == &PN || PtrPHI->getType() != IntToPtr->getType())
       continue;
     MatchingPtrPHI = PtrPHI;
     for (unsigned i = 0; i != PtrPHI->getNumIncomingValues(); ++i) {
       if (AvailablePtrVals[i] !=
           PtrPHI->getIncomingValueForBlock(PN.getIncomingBlock(i))) {
         MatchingPtrPHI = nullptr;
@@ 1,064 lines not shown @@