This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Handle invariant GEPs in isUniformAfterVectorization.
ClosedPublic

Authored by fhahn on Feb 20 2023, 3:08 PM.

Details

Summary

This fixes a crash caused by legal treating a scalable GEP as invariant,
but isUniformAfterVectorization does not handle GEPs.

Fixes https://github.com/llvm/llvm-project/issues/60831.

Diff Detail

Event Timeline

fhahn created this revision.Feb 20 2023, 3:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2023, 3:08 PM
fhahn requested review of this revision.Feb 20 2023, 3:08 PM
fhahn updated this revision to Diff 499174.Feb 21 2023, 7:37 AM

Generate check lines for test.

reames added a subscriber: reames.Feb 21 2023, 8:33 AM

I think there's a deeper logic issue here. I'm pretty sure this is a bug I introduced, so, sorry!

LoopAccessInfo::isUniform uses SCEV's notion of loop invariant. This is both things which are currently loop invariant, but also things whose computation is loop invariant (e.g. uniform).

As you noticed, VPReplicateRecipe::execute uses only whether the recipe was defined. This is essentially a proxy for whether an instructions *placement* was loop invariant. This explicitly *does not* reason about computation. (I think... this part of things I'm less confident in.)

I think the basic issue here is that a Recipe can correspond to the isUniform Value without itself being uniform-per-part or UniformAfterVectorization. So, I think your change is patching over a bug, not fixing one.

kartcq added a subscriber: kartcq.Feb 24 2023, 3:26 AM

This patch does not solve the issue completely.
But I don't have expertise enough to comment on what needs to be fixed.

A simplified test case which still breaks the Loop vectorizer even after applying this candidate patch is as follows

test_loop2.ll

target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

define dso_local fastcc void @_test_loop2(i64 %init, i64 %n, i8* %store_dest) unnamed_addr #0 {
    br label %1.
  critedge5:                                       ; preds = %1
    ret void
  1:                                               ; preds = %0, %1
    %2 = phi i64 [ %7, %1 ], [ %init, %0 ]
    %3 = sub nsw i64 %n, %2
    %4 = trunc i64 %n to i8
    %5 = add i64 %2, %3
    %6 = getelementptr i8, i8* %store_dest, i64 %5
    store i8 %4, i8* %6, align 1
    %7 = add nsw i64 %2, 1
    %8 = icmp sle i64 %7, 1000
    br i1 %8, label %1, label %.critedge5
}

attributes #0 = { "target-features"="+crc,+crypto,+fp-armv8,+neon,+sve,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a" }

Issue an be reproduced by command

opt -passes=loop-vectorize test_loop2.ll
fhahn added a comment.May 19 2023, 1:08 PM

I think there's a deeper logic issue here. I'm pretty sure this is a bug I introduced, so, sorry!

LoopAccessInfo::isUniform uses SCEV's notion of loop invariant. This is both things which are currently loop invariant, but also things whose computation is loop invariant (e.g. uniform).

As you noticed, VPReplicateRecipe::execute uses only whether the recipe was defined. This is essentially a proxy for whether an instructions *placement* was loop invariant. This explicitly *does not* reason about computation. (I think... this part of things I'm less confident in.)

At the moment, isUniformAfterVectorization reasons about more than just placement by using information about uniform operands, but relies on uniforms being represented as uniform VPReplicateRecipes. I think the underlying issue is that we don't classify the GEP as uniform correctly. I put up D150991 to use Legal::isUniform to bring the reasoning during classification in line with isUniformMemOp which is used in other places.

This patch does not solve the issue completely.
But I don't have expertise enough to comment on what needs to be fixed.

A simplified test case which still breaks the Loop vectorizer even after applying this candidate patch is as follows

Thanks, I added the new test in 07e5f57df4bf

Ayal accepted this revision.May 23 2023, 1:48 PM

This looks good to me. @reames, @kartcq - ok with you too?

llvm/lib/Transforms/Vectorize/VPlan.h
2983

Propagating uniformity forwards on-demand from operands is fine, for GEPs or other non-phi recipes, where not cached as a result of SCEV/Divergence Analysis.

Should this function still be "inline"?

The "AfterVectorization" suffix should be dropped if favor of "onlyFirstLaneUsed()" above (independent of this patch).

This revision is now accepted and ready to land.May 23 2023, 1:48 PM

I'm fine with this.

fhahn marked an inline comment as done.May 30 2023, 8:17 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/VPlan.h
2983

Will move into VPlan.cpp now that this has grown and rename accordingly.