In the memory dependence checking module of loop vectorization, the algorithm collects memory access candidates from AliasSetTracker in AccessAnalysis::processMemAccesses, and then checks the memory dependences against one another in MemoryDepChecker::isDependent.
The memory accesses are unique in AliasSetTracker, but a single memory access in AliasSetTracker may map to multiple entries in 'PtrAccessSet Accesses' of AccessAnalysis, which can cover both 'read' and 'write'. Originally, the algorithm checked only the 'write' entry in Accesses whenever a 'write' existed, by using the statement "bool IsWrite = S.count(MemAccessInfo(Ptr, true));". This is incorrect: any read access to the same pointer was ignored, and as a result some RAW and WAR dependences were missed.
The attached test case exposes a loop body like the one below:
{code}
// loop body
... = a[i]        (1)
... = a[i+1]      (2)
.......
a[i+1] = ....     (3)
a[i] = ...        (4)
{code}
If we ignore the two reads, the dependence between (1) and (3) cannot be captured after vectorization, and the loop ends up being vectorized incorrectly.
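To make the hazard concrete, here is a minimal sketch that runs a loop of this shape once in scalar order and once the way a width-2 vectorization would execute it. The right-hand sides (`a[i] + a[i+1]`, `10 * i`, `100 * i`), the output array `b`, and the function names are assumptions made up for illustration; they are not taken from the attached test case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: statements (1)-(4) run in program order, one iteration
// at a time. `a` must hold at least n + 1 elements.
std::vector<int> runScalar(std::vector<int> a, std::size_t n) {
  std::vector<int> b(n);
  for (std::size_t i = 0; i < n; ++i) {
    int r1 = a[i];                        // (1)
    int r2 = a[i + 1];                    // (2)
    b[i] = r1 + r2;
    a[i + 1] = 10 * static_cast<int>(i);  // (3)
    a[i] = 100 * static_cast<int>(i);     // (4)
  }
  return b;
}

// Hypothetical width-2 vectorization: each statement is executed for both
// lanes (i and i + 1) before the next statement starts. n must be even.
std::vector<int> runVectorized(std::vector<int> a, std::size_t n) {
  std::vector<int> b(n);
  for (std::size_t i = 0; i + 1 < n; i += 2) {
    int r1[2] = {a[i], a[i + 1]};          // (1) for both lanes
    int r2[2] = {a[i + 1], a[i + 2]};      // (2) for both lanes
    b[i] = r1[0] + r2[0];
    b[i + 1] = r1[1] + r2[1];
    a[i + 1] = 10 * static_cast<int>(i);   // (3) lane 0
    a[i + 2] = 10 * static_cast<int>(i + 1);  // (3) lane 1
    a[i] = 100 * static_cast<int>(i);      // (4) lane 0
    a[i + 1] = 100 * static_cast<int>(i + 1); // (4) lane 1
  }
  return b;
}
```

In scalar order, read (1) in iteration i+1 sees the value stored by (3) in iteration i. In the vectorized version, lane 1 performs that read before lane 0 has stored, so the two functions return different `b` arrays for the same input.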
The fix in this patch simply inserts a new loop to find all entries in Accesses. Since the loop skips most of the other memory accesses by checking the Value pointer at the very beginning of each iteration, I think it will not visibly increase compile time.
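The difference between the two lookups can be sketched roughly as follows. This is a simplified stand-alone model, not the code in the patch: `AccessInfo`, `AccessSet`, `collectOld` and `collectAll` are hypothetical names, and an `int` id stands in for the Value pointer.

```cpp
#include <set>
#include <utility>
#include <vector>

// Simplified stand-in for an access record: (pointer id, is-write flag).
using AccessInfo = std::pair<int, bool>;
using AccessSet = std::set<AccessInfo>;

// Old behaviour (modelled): produce a single entry per pointer. If a write
// entry exists, the read entry for the same pointer is dropped.
// Assumes Ptr has at least one entry in S.
std::vector<AccessInfo> collectOld(const AccessSet &S, int Ptr) {
  bool IsWrite = S.count(AccessInfo(Ptr, true)) != 0;
  return {AccessInfo(Ptr, IsWrite)};
}

// Sketch of the fix: walk the whole set and keep every entry for Ptr,
// so both the read and the write are reported when both exist.
std::vector<AccessInfo> collectAll(const AccessSet &S, int Ptr) {
  std::vector<AccessInfo> Out;
  for (const AccessInfo &A : S) {
    if (A.first != Ptr)
      continue;  // cheap early skip for unrelated pointers
    Out.push_back(A);
  }
  return Out;
}
```

For a pointer that is both read and written, `collectOld` yields only the write entry, while `collectAll` yields the read and the write, which is what the dependence check needs in order to see RAW and WAR pairs.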
Thanks,
-Jiangning