Hello,
the proposed patch aims to reduce the stack memory consumption of the InstCombine function "isOnlyCopiedFromConstantGlobal() ", that in certain conditions could overflow the stack because of excessive recursiveness.
For example, in a case like this:
%0 = alloca [50025 x i32], align 4
%1 = getelementptr inbounds [50025 x i32]* %0, i64 0, i64 0
store i32 0, i32* %1
%2 = getelementptr inbounds i32* %1, i64 1
store i32 1, i32* %2
%3 = getelementptr inbounds i32* %2, i64 1
store i32 2, i32* %3
%4 = getelementptr inbounds i32* %3, i64 1
store i32 3, i32* %4
%5 = getelementptr inbounds i32* %4, i64 1
store i32 4, i32* %5
%6 = getelementptr inbounds i32* %5, i64 1
store i32 5, i32* %6
...
This piece of code crashes llvm when trying to apply instcombine on desktop. On embedded devices this could happen with a much lower limit of recursiveness.
Some instructions (getelementptr and bitcasts) make the function recursively call itself on their uses, which is what makes the example above consume so much stack (it becomes a recursive depth-first tree visit with a very big depth).
The patch changes the algorithm to be semantically equivalent, but iterative instead of recursive and the visiting order to be from a depth-first visit to a breadth-first visit (visit all the instructions of the current level before the ones of the next one).
Now if a lot of memory is required a heap allocation is done instead of the the stack allocation, avoiding the possible crash.
Please, help review this patch :)
Why rename 'U' to 'Use'? 'Use' already a type, so this is a bit confusing. If anything, this loop should iterate over users, not uses.