In some tests I found forgetLoop() in ScalarEvolution can be pretty slow.
Specifically in one test I measured performance and I found that UnrollLoop took 17% of total compile time with a good 10% of the 17% being SCEV::forgetLoop().
The code seems to be churning quite a bit of instructions in some worse case scenario. The fact also that the algorithm is recursive and that we don't reuse the data structures with the child loops don't help.
I refactored it to be iterative which enables us to reuse the data structures for computing the children loops and I made it such that the "Visited" set is shared between the processing of all the sub loops .
I'm not sure if that is correct because I'm not a SCEV expert and I would like feedback on that. At a naive look it seems like that if you processed an instruction already I don't see why , if you end up encountering it again while processing a child loop, you should process it again, as it doesn't seem the information of being part of a specific loop is used for "forgetMemoizedResults()" or "eraseValueFromMap()"
These changes pretty much halve the time taken by the method in my case which is still not ideal but it's much better. It passes all the LLVM and Clang make tests like this.
I also experimented in adding this:
if (It != ValueExprMap.end()) { eraseValueFromMap(It->first); forgetMemoizedResults(It->second); if (PHINode *PN = dyn_cast<PHINode>(I)) ConstantEvolutionLoopExitValue.erase(PN); - } + } else + continue;
Which basically makes the algorithm to not continue following the def-use chain if we couldn't find a value in the ValueExprMap. I'm not sure this change is correct either, but it is worth noting that with this change the time of the method goes to 0.1% compile time in my case (from 10% originally) and still passes all the LLVM and Clang make tests.
I tried this change only from intuition though. Does anybody know if it is actually valid to do so?