We abort building vectorizable trees in some cases (e.g., if the maximum recursion depth is reached or the region size is too large). If this happens for a reduction, we can be left with a root entry that needs to be gathered. For these cases, we need to make sure we actually set VectorizedValue to the resulting vector.
This patch ensures we properly set VectorizedValue and it also ensures the insertelement sequence generated for the gathers is inserted at the correct location (after the last instruction in the bundle). Please let me know what you think.
Reference: https://llvm.org/bugs/show_bug.cgi?id=28330
Just to make sure I understand this correctly - you're running from Leader, inclusive, so that if VL[0] is the last instruction in program order, you'll end up with LastInst == Leader, right?
If so, it's probably worth a comment - or just initialize LastInst to Leader instead of null.
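To make the question concrete, here is a minimal standalone sketch of the scan as I understand it. It is not the patch's code: the block is modeled as a vector of integer ids, and `Inst` and `findLastInBundle` are hypothetical names used only for illustration. It assumes VL is non-empty and VL[0] is in the block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction: an opaque integer id.
using Inst = int;

// Returns the index in Block of the last bundle member found when scanning
// from the leader (VL[0]) to the end of the block.
// Assumption: VL is non-empty and VL[0] appears in Block.
std::size_t findLastInBundle(const std::vector<Inst> &Block,
                             const std::vector<Inst> &VL) {
  auto LeaderIt = std::find(Block.begin(), Block.end(), VL.front());
  assert(LeaderIt != Block.end() && "leader must be in the block");
  std::size_t Leader = static_cast<std::size_t>(LeaderIt - Block.begin());

  // Initialize to the leader itself, so that a leader which is already last
  // in program order yields LastInst == Leader without a special case.
  std::size_t LastInst = Leader;
  for (std::size_t I = Leader + 1; I < Block.size(); ++I)
    if (std::find(VL.begin(), VL.end(), Block[I]) != VL.end())
      LastInst = I;
  return LastInst;
}
```

With LastInst initialized to the leader, the loop can start one past it and the "VL[0] is last" case needs no extra handling.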
Also, won't we get worst-case quadratic behavior here if VL[0] is consistently last (so we end up running to the end of the block for each bundle)?
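A rough way to see the cost, as a sketch under simplifying assumptions rather than a measurement: suppose a block of N instructions has one bundle whose leader sits at each position, and every leader is the last member of its bundle, so each scan runs from the leader to the end of the block. `worstCaseScanSteps` is a hypothetical helper that just counts the instructions visited.

```cpp
#include <cassert>
#include <cstddef>

// Counts instructions visited when, for each leader position P in a block of
// BlockSize instructions, the scan runs from P (inclusive) to the block end.
// Assumption: one bundle per position, each with its leader last in the bundle.
std::size_t worstCaseScanSteps(std::size_t BlockSize) {
  std::size_t Steps = 0;
  for (std::size_t Leader = 0; Leader < BlockSize; ++Leader)
    Steps += BlockSize - Leader; // scan covers [Leader, BlockSize)
  return Steps;
}
```

Under these assumptions the total is N(N+1)/2, so doubling the block size roughly quadruples the work.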