This is an archive of the discontinued LLVM Phabricator instance.

[LV] Print remark when loop cannot be vectorized due to invalid costs.
ClosedPublic

Authored by sdesmalen on Jul 12 2021, 4:35 AM.

Details

Summary

This patch emits remarks for instructions that have invalid costs for
a given set of vectorization factors. Some example output:

t.c:4:19: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load
    dst[i] = sinf(src[i]);
                  ^
t.c:4:14: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1, vscale x 2, vscale x 4): call to llvm.sin.f32
    dst[i] = sinf(src[i]);
             ^
t.c:4:12: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): store
    dst[i] = sinf(src[i]);
           ^

Diff Detail

Event Timeline

sdesmalen created this revision.Jul 12 2021, 4:35 AM
sdesmalen requested review of this revision.Jul 12 2021, 4:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2021, 4:35 AM
Matt added a subscriber: Matt.Jul 12 2021, 7:13 AM
kmclaughlin accepted this revision.Jul 12 2021, 10:16 AM

This looks good to me, I just have one question on the changes to selectVectorizationFactor

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6143

nit: should InvalidCosts[0].first be used as the location of the remark from reportVectorizationInfo, since there could be multiple instructions found in the loop with invalid costs?

This revision is now accepted and ready to land.Jul 12 2021, 10:16 AM
fhahn added inline comments.Jul 12 2021, 1:34 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1680

Could you extend the doc-comment here to include the new argument?

6134

Ideally remarks should be easy to understand for end-users, who do not know about LLVM IR (they specifically are not only targeted at LLVM developers). Instead of printing IR directly, it might be more user-friendly to just say something like call to function sin instead of the plain LLVM IR call. Not sure if we have such pretty-printing anywhere else already though.

Including the file/line info where the instruction appears might be a good first step.

llvm/test/Transforms/LoopVectorize/AArch64/scalable-call.ll
100

could you add debug locations to the test and also check that the correct locations are used for the remark?

sdesmalen updated this revision to Diff 358282.Jul 13 2021, 8:51 AM
sdesmalen edited the summary of this revision. (Show Details)
  • Added debug-info to test, and updated comment for expectedCost
  • Changed the patch to emit individual remarks, with their corresponding debug-info, instead of a single remark where this info is all compiled together.

e.g.

t.c:4:19: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load

dst[i] = sinf(src[i]);
              ^

t.c:4:14: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1, vscale x 2, vscale x 4): call to llvm.sin.f32

dst[i] = sinf(src[i]);
         ^

t.c:4:12: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): store

dst[i] = sinf(src[i]);
       ^
sdesmalen marked 4 inline comments as done.Jul 13 2021, 8:52 AM
sdesmalen added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6134

@dmgreen also asked a question about that on https://reviews.llvm.org/D105473#2861460
There is precedent for printing IR directly, although I'm not sure if that's necessarily desirable.

What I've done now is:

  1. Prints the opcode and in case of a call, also the name of the function.
  2. Emit one remark for each function that has invalid cost and pass the individual Instruction so that the remark mechanism can take the file/line/column info from it.
6143

Fair point! The updated patch should address this now. It now emits a remark with corresponding line/column info for each instruction with invalid cost.

fhahn accepted this revision.Jul 13 2021, 12:05 PM

Great! LGTM

llvm/test/Transforms/LoopVectorize/AArch64/scalable-call.ll
3

you should be able to pass the input file directly to FileCheck, no need for cat.

kmclaughlin accepted this revision.Jul 14 2021, 5:52 AM
This revision was landed with ongoing or failed builds.Jul 14 2021, 9:12 AM
This revision was automatically updated to reflect the committed changes.
sdesmalen marked 2 inline comments as done.
iii added a subscriber: iii.Jul 15 2021, 3:39 AM

llvm-clang-x86_64-expensive-checks-debian (https://lab.llvm.org/buildbot/#/builders/16/builds/13741) fails with:

<<<<<<
            1: remark: <unknown>:0:0: UserVF ignored because of invalid costs. 
            2: remark: t.c:3:10: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load 
next:102'0                                                                                                      X error: no match found
            3: remark: t.c:3:20: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): call to llvm.sin.f32 
next:102'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:102'1             ?                                                                                                          possible intended match

The bot attributed this to my changes (https://reviews.llvm.org/D105629), but they have nothing to do with aarch64 or vectorization, and this test passes locally for me.

@sdesmalen could you take a look please?

llvm-clang-x86_64-expensive-checks-debian (https://lab.llvm.org/buildbot/#/builders/16/builds/13741) fails with:

<<<<<<
            1: remark: <unknown>:0:0: UserVF ignored because of invalid costs. 
            2: remark: t.c:3:10: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load 
next:102'0                                                                                                      X error: no match found
            3: remark: t.c:3:20: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): call to llvm.sin.f32 
next:102'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:102'1             ?                                                                                                          possible intended match

The bot attributed this to my changes (https://reviews.llvm.org/D105629), but they have nothing to do with aarch64 or vectorization, and this test passes locally for me.

@sdesmalen could you take a look please?

Hi @iii thanks for pointing it out, I already started looking into this.

thakis added a subscriber: thakis.Jul 15 2021, 6:06 AM

Hi, it looks like you landed a determinism fix in dc7bdc1e7121693df112f2fdb11cc6b88580ba4b. Thanks!

However, http://45.33.8.238/win/41940/summary.html is one commit after dc7bdc1e7121693df112f2fdb11cc6b88580ba4b and still has the test failing. Looks like the fix isn't complete.

thakis added inline comments.Jul 15 2021, 6:08 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6115

…maybe this needs to be a stable_sort() instead?

sdesmalen added inline comments.Jul 15 2021, 6:51 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6115

Doh! The latest issue seems caused by the change in rGdc7bdc1e7121693df112f2fdb11cc6b88580ba4b :

const Instruction *AI = A.first, *BI = B.first;
if (AI->getParent() != BI->getParent())
  return AI->getParent() < BI->getParent();   // pointer comparison

I considered writing:

if (AI->getParent() != BI->getParent())
  return false;

So that AI < BI == false && BI < AI == false and that stable_sort would keep it in place, but this doesn't work in practice.
I didn't think of a more suitable way to order the blocks (the dominator tree is not part of CostModel, so I also can't use that).

Any tips?

thakis added inline comments.Jul 15 2021, 7:15 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6115

Just the usual "if it takes a while to figure out, revert for now to keep the tree working", sorry.