This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorizer] Fixed a bug in memory conflict run-time check
ClosedPublic

Authored by delena on Aug 4 2016, 1:15 PM.

Details

Summary

The bug is in Low <-> High boundaries calculation. The High boundary should be "last memory access pointer + element size".

(It may be "last memory access pointer + element size -1", since we use "<=" in bounds condition)

Diff Detail

Repository
rL LLVM

Event Timeline

delena updated this revision to Diff 66844.Aug 4 2016, 1:15 PM
delena retitled this revision from to [LoopVectorizer] Fixed a bug in memory conflict run-time check.
delena updated this object.
delena added reviewers: mkuper, anemet, sbaranga, Ayal.
delena set the repository for this revision to rL LLVM.
delena added a subscriber: llvm-commits.
Ayal added inline comments.Aug 4 2016, 4:04 PM
../lib/Analysis/LoopAccessAnalysis.cpp
202

Worth commenting that this is needed when pointers are not naturally aligned (as in your "*(reinterpret_cast<long long*>(op)) = *(reinterpret_cast<const long long*>(src))" case of i8*'s discussed re: D20789), or a pair of pointers point to elements of distinct sizes.

Put differently, we can do w/o adding this size to ScEnd if all pointers point to naturally aligned data of same size (as in number-of-memchecks.ll below).

(It may be "last memory access pointer + element size -1", since we use "<=" in bounds condition)

So, want to subtract 1 from EltSize?

anemet edited edge metadata.Aug 4 2016, 10:25 PM

Did you forget to add the test?

../lib/Analysis/LoopAccessAnalysis.cpp
202

or use < instead of <=. I remember wondering about why we were using <=

I would also be good to finally have the full mathematical formula in comment somewhere.

delena added a comment.Aug 5 2016, 4:02 AM

Did you forget to add the test?

No, I didn't. I fixed the existing lit tests that explore the change. Do you want one more lit test?
The test case that triggered the bug will not be vectorized without the second patch https://reviews.llvm.org/D20789.

Did you forget to add the test?

No, I didn't. I fixed the existing lit tests that explore the change. Do you want one more lit test?
The test case that triggered the bug will not be vectorized without the second patch https://reviews.llvm.org/D20789.

Yes, please. I would certainly like to have the testcase included in the patch that triggers the bug with an explanation why. You should be able to construct a testcase by checking the memcheck part of the output of LAA.

Once the other bug is fixed you can add an additional RUN line for -loop-vectorize and then check the compares directly.

delena updated this revision to Diff 67147.Aug 8 2016, 5:57 AM
delena edited edge metadata.
  1. Added a test that triggers the bug.
  2. Added a comment that explains how the bounds are being calculated.
  3. Changed condition from ( abs(p1 - p2) >= bound ) to ( abs(p1 - p2) > bound ).
anemet added inline comments.Aug 9 2016, 4:25 PM
../lib/Analysis/LoopAccessAnalysis.cpp
205

Touched is way too loose, how about EltSizeSCEV or something.

I also like Ayal's idea about the comment talking about mentioning misaligned here.

You also need to change the comment in the LAA.h:

/// Holds the pointer value at the end of the loop.
const SCEV *End;

This should say something about a holding a pointer one past the last byte of the last element accessed.

1888–1898

Can you please include the formula in a comment here?

../test/Analysis/LoopAccessAnalysis/memcheck-off-by-one-error.ll
4

Elaborate more how exactly this was failing before. You could for example explain it in terms of the actual pointer values you were debugging with.

delena updated this revision to Diff 67509.Aug 10 2016, 5:18 AM
delena marked 3 inline comments as done.

Added more explanations to the code according to Adam's remarks.

Ayal added inline comments.Aug 10 2016, 2:16 PM
../include/llvm/Analysis/LoopAccessAnalysis.h
337–338

Instead of "Holds the pointer value at the beginning of the loop." suggest:

"Holds the smallest byte address accessed by the pointer throughout all iterations of the loop."

340–341

Instead of "Holds the pointer one past the last byte of the last element accessed." suggest:

"Holds the largest byte address accessed by the pointer throughout all iterations of the loop, plus 1."

../lib/Analysis/LoopAccessAnalysis.cpp
163

can add:

"which is clearly the complement of having the intervals disjoint/ordered:
= !((P2.Start >= P1.End) || (P1.Start >= P2.End))"

166

Suggest to rephrase as follows (think of accessing doubles and floats, where the first float overlaps the "2nd half" of the last double):

"The addition of SizeOfElt may be omitted when all accesses share the same element size and are all naturally aligned. In this case '<=' should be used instead of '<'."

196

while we're here fixing comments: "but the we" >> "but we"

1889

"[A|B].End points to the last accessed byte, plus one."

1893

Instead of "Since one of the "bounds" is always true, the conflict happens when the both are true:" suggest:

"At-least one of the "bounds" is always true; conflict happens when the other is too."

delena updated this revision to Diff 67679.Aug 11 2016, 5:56 AM
delena marked 3 inline comments as done.

Added/fixed some comments.

May I commit this patch? Thank you.

delena updated this revision to Diff 67888.Aug 12 2016, 1:15 PM
delena marked 2 inline comments as done.

Changed comments in the header file.

I am happy with this now with the comments addressed as long as Ayal also likes it. Thanks for the fix!

../lib/Analysis/LoopAccessAnalysis.cpp
165–167

I am not sure there is value in saying whether it can be omitted. Just say why we need it.

1890–1891

Add it that the code actually checks the inverse of this, i.e. B.Start < A.End && A.Start < B.End.

This revision was automatically updated to reflect the committed changes.