This is an archive of the discontinued LLVM Phabricator instance.

[LV] Add remarks that explicitly mention error handling in candidate loops
Needs ReviewPublic

Authored by jhuber6 on Feb 10 2021, 4:04 PM.

Details

Summary

Some loops contain assertions or other methods of error handling that the user might not be aware of. For example, if the access operator for a custom matrix class had bounds checking. The vectorizer's remarks should be more specific if a loop was not deemed legal for this reason. This patch adds a remark that specifically mentions how to prevent assertions from preventing vectorization, and a more generic response for any control flow that will not return.

Diff Detail

Event Timeline

jhuber6 created this revision.Feb 10 2021, 4:04 PM
jhuber6 requested review of this revision.Feb 10 2021, 4:04 PM
fhahn added a comment.Feb 11 2021, 1:35 AM

Thanks for working on improving the remarks! Unfortunately we have to be careful to keep them generic with respect to the frontend. I don't think we should be using a function name string to detect using assert and include other implementation specific parts (like NDEBUG) in the message. For example, the user could define their own __assert_fail function or the IR could be coming from a Rust or Swift frontend.

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
1085

In this case, the loop *technically* does not contain the unreachable, right? One of the exit blocks does. I am not sure how to best phrase it, but it might be worth adjusting the message below reportVectorizationFailure("The loop must have a unique exit block", to include the information that some of the exit blocks end in unreachable

Unfortunately I don't think the user can do much with this information.

Thanks for working on improving the remarks! Unfortunately we have to be careful to keep them generic with respect to the frontend. I don't think we should be using a function name string to detect using assert and include other implementation specific parts (like NDEBUG) in the message. For example, the user could define their own __assert_fail function or the IR could be coming from a Rust or Swift frontend.

The motivation to handle __assert_fail specifically as a secondary case is mainly because it has a simple solution we can put in the message. It might be sufficient to only check for unreachable instructions because the diagnostics message should point to the offending assertion and leave it to the user to know how to turn them off.

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
1085

Yes, the loop shouldn't contain an unreachable instruction, but the error handling code will be inside the loop and will create a new exit block to handle the error. This is just a specialization of failing because of multiple exit blocks. The main benefit of checking it separately here is that the diagnostics will point to the expression that caused it. Some of these situations have simple solutions, such as recompiling without assertions or doing manual loop unswitching.

Also. should the remarks be prefaced with "loop control flow is not understood by vectorizer." In this case it will print another remark saying the same anyway.

Thanks for working on improving the remarks! Unfortunately we have to be careful to keep them generic with respect to the frontend. I don't think we should be using a function name string to detect using assert and include other implementation specific parts (like NDEBUG) in the message. For example, the user could define their own __assert_fail function or the IR could be coming from a Rust or Swift frontend.

I'd argue, whatever the frontend or producer is, something called assert_fail could be reasonable attributed to "an assertion". If it is implicit or not is a different story but it tells the user more than just: "No unique exit block". Especially if the code doesn't have an obvious side exit (but only an implicit or explicit assertion).

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
1085

Unfortunately I don't think the user can do much with this information.

I don't believe that. I think the key point is that such unreachable exit blocks are often "hidden" from the user. Calling exit, abort, assert, or having implicit range checks, doesn't look like a break or return on the source level. Once we tell people what the problem with the control flow is in terms that actually map to the source, e.g., error handling, assertions, etc. they can act on it. IMHO, "no unique exit block" is not tangible if you have:

for (int i = 0; i < N; ++i) {
  A[i] = ...;
}

with T& decltype(A)::operator[](int i) { assert(i < A.size()); return data[i]; }

In this case, the loop *technically* does not contain the unreachable, right?

Right. We could say something along the lines of:

"Loop exit block contains control flow that does not return, e.g., error handling"

Thanks for working on improving the remarks! Unfortunately we have to be careful to keep them generic with respect to the frontend. I don't think we should be using a function name string to detect using assert and include other implementation specific parts (like NDEBUG) in the message. For example, the user could define their own __assert_fail function or the IR could be coming from a Rust or Swift frontend.

I'd argue, whatever the frontend or producer is, something called assert_fail could be reasonable attributed to "an assertion". If it is implicit or not is a different story but it tells the user more than just: "No unique exit block". Especially if the code doesn't have an obvious side exit (but only an implicit or explicit assertion).

I think it's great you are working on making the remarks more user-friendly!

I'd just like to make sure it works well across a range of frontends. Assuming __assert_fail is an assertion in C/C++ is probably fine (unlikely to have many cases where it is not an assertion), but I am still reluctant to encode such frontend-specific logic here. I think a key question we should think about is how other frontends can opt-in to the same remark. I don't have a real suggestion for that at the moment, but if there's anything we can do other than hard-coding assertion function names for x languages here, that would be great.

Another interesting question is how to best present more context information to the user about remarks. Some of those remarks could benefit from slightly more context/information than can reasonably fit into the remark itself and some way to link to slightly more in-depth documentation would IMO be very valuable. I think such a place would also be slightly better suited to go into frontend/language specific details as well. We could also try to 'guess' the language (based on the file from !dbg) and point to language-specific info, if appropriate (e.g. in the assertion case)

fhahn added a comment.Feb 16 2021, 1:49 PM

(I'm adding Francis & Adam, because they've been working on remarks a lot and perhaps have additional insight/suggestions)

Thanks for working on improving the remarks! Unfortunately we have to be careful to keep them generic with respect to the frontend. I don't think we should be using a function name string to detect using assert and include other implementation specific parts (like NDEBUG) in the message. For example, the user could define their own __assert_fail function or the IR could be coming from a Rust or Swift frontend.

I'd argue, whatever the frontend or producer is, something called assert_fail could be reasonable attributed to "an assertion". If it is implicit or not is a different story but it tells the user more than just: "No unique exit block". Especially if the code doesn't have an obvious side exit (but only an implicit or explicit assertion).

I think it's great you are working on making the remarks more user-friendly!

I'd just like to make sure it works well across a range of frontends. Assuming __assert_fail is an assertion in C/C++ is probably fine (unlikely to have many cases where it is not an assertion), but I am still reluctant to encode such frontend-specific logic here. I think a key question we should think about is how other frontends can opt-in to the same remark. I don't have a real suggestion for that at the moment, but if there's anything we can do other than hard-coding assertion function names for x languages here, that would be great.

I don't think Joseph or I are in love with the function name check ;). It's unlikely (IMHO) to cause problems though. I'd love to see a solution that uses a gerneric mechanism, the new annotations (D88645) maybe? So to say a frontend should mark the unreachable/or call before it as "assertion" and here we go.

Another interesting question is how to best present more context information to the user about remarks. Some of those remarks could benefit from slightly more context/information than can reasonably fit into the remark itself and some way to link to slightly more in-depth documentation would IMO be very valuable. I think such a place would also be slightly better suited to go into frontend/language specific details as well. We could also try to 'guess' the language (based on the file from !dbg) and point to language-specific info, if appropriate (e.g. in the assertion case)

I agree we need a documentation. For OpenMP we started by giving the remark an id [OMPXXX] and the describing it online: https://openmp.llvm.org/docs/remarks/OptimizationRemarks.html
I'd be really happy if we had a centralized remark webpage. I'd like to see example there, and more explanation. Ways to remedy the situation, etc.

jhuber6 updated this revision to Diff 325033.Feb 19 2021, 10:45 AM

Removing check for assert to keep the pass agnostic to the front-end and cleaning up the test file.

ormris removed a subscriber: ormris.Jun 3 2021, 10:54 AM

Should I just abandon this? Try to clear out my open revisions.