This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorizer] NFCI: Calculate register usage based on TLI.getTypeLegalizationCost.
ClosedPublic

Authored by sdesmalen on Nov 9 2020, 2:25 AM.

Details

Summary

This is more accurate than dividing the bitwidth based on the element count by the
maximum register size, as it can just reuse whatever has been calculated for
legalization of these types.

This change is also necessary when calculating register usage for scalable vectors, where
the legalization of these types cannot be done based on the widest register size, because
that does not take the 'vscale' component into account.

Diff Detail

Event Timeline

sdesmalen created this revision.Nov 9 2020, 2:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 9 2020, 2:25 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
sdesmalen requested review of this revision.Nov 9 2020, 2:25 AM
SjoerdMeijer added inline comments.Nov 9 2020, 7:17 AM
llvm/include/llvm/CodeGen/BasicTTIImpl.h
297

If it's only about this, perhaps better not to create yet another TTI hook? Perhaps just query getTypeLegalizationCost directly in the LV?

sdesmalen added inline comments.Nov 9 2020, 7:21 AM
llvm/include/llvm/CodeGen/BasicTTIImpl.h
297

I wasn't sure if this was by design, but TLI is currently not a member of the LoopVectorizer. I'm happy to add it, but I thought it had to go through TTI so that target lowering is abstracted from the more high-level loop-vectorization pass.

SjoerdMeijer added inline comments.Nov 9 2020, 7:38 AM
llvm/include/llvm/CodeGen/BasicTTIImpl.h
297

Ah sorry, looking at it closer, I think you're right, calling this from the LV would be a layering violation....

SjoerdMeijer added inline comments.Nov 9 2020, 7:40 AM
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
260

Perhaps 0 is not a very good default?

sdesmalen updated this revision to Diff 303899.Nov 9 2020, 9:16 AM

Changed default for getRegUsageForType to return 1 instead of 0.

sdesmalen marked 3 inline comments as done.Nov 9 2020, 9:16 AM
sdesmalen added inline comments.
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
260

Good point!

SjoerdMeijer accepted this revision.Nov 10 2020, 1:26 AM

Looks like a good change to me.

This revision is now accepted and ready to land.Nov 10 2020, 1:26 AM
This revision was automatically updated to reflect the committed changes.
sdesmalen marked an inline comment as done.

Hi,

I hit an assertion with this patch:
opt: ../lib/IR/Type.cpp:620: static llvm::FixedVectorType *llvm::FixedVectorType::get(llvm::Type *, unsigned int): Assertion `isValidElementType(ElementType) && "Element type of a VectorType must " "be an integer, floating point, or " "pointer type."' failed.

Reproduce with
opt -loop-vectorize -mtriple x86_64 -o /dev/null extractvalue-crash.ll

with the input

; RUN: opt -loop-vectorize -mtriple x86_64 -o /dev/null %s
%rec6 = type { i16 }

; Don't crash on trying to vectorize an extractvalue.

@b = global %rec6 zeroinitializer

define void @f1() {
  %_tmp2 = icmp ne i16 0, 0
  br i1 %_tmp2, label %bb1.preheader, label %bb3

bb1.preheader:                                    ; preds = %0
  %_tmp4 = load %rec6, %rec6* @b
  br label %bb1

bb1:                                              ; preds = %bb1.preheader, %bb1
  %_tmp81 = phi i16 [ 0, %bb1.preheader ], [ %_tmp7, %bb1 ]
  %_tmp4.fca.0.extract = extractvalue %rec6 %_tmp4, 0
  %_tmp7 = sub i16 %_tmp81, 1
  %_tmp9 = icmp ne i16 %_tmp7, 0
  br i1 %_tmp9, label %bb1, label %bb3.loopexit

bb3.loopexit:                                     ; preds = %bb1
  %_tmp7.lcssa = phi i16 [ %_tmp7, %bb1 ]
  br label %bb3

bb3:                                              ; preds = %bb3.loopexit, %0
  ret void
}

Hi,

I hit an assertion with this patch:
opt: ../lib/IR/Type.cpp:620: static llvm::FixedVectorType *llvm::FixedVectorType::get(llvm::Type *, unsigned int): Assertion `isValidElementType(ElementType) && "Element type of a VectorType must " "be an integer, floating point, or " "pointer type."' failed.

Reproduce with
opt -loop-vectorize -mtriple x86_64 -o /dev/null extractvalue-crash.ll

with the input

; RUN: opt -loop-vectorize -mtriple x86_64 -o /dev/null %s
%rec6 = type { i16 }

; Don't crash on trying to vectorize an extractvalue.

@b = global %rec6 zeroinitializer

define void @f1() {
  %_tmp2 = icmp ne i16 0, 0
  br i1 %_tmp2, label %bb1.preheader, label %bb3

bb1.preheader:                                    ; preds = %0
  %_tmp4 = load %rec6, %rec6* @b
  br label %bb1

bb1:                                              ; preds = %bb1.preheader, %bb1
  %_tmp81 = phi i16 [ 0, %bb1.preheader ], [ %_tmp7, %bb1 ]
  %_tmp4.fca.0.extract = extractvalue %rec6 %_tmp4, 0
  %_tmp7 = sub i16 %_tmp81, 1
  %_tmp9 = icmp ne i16 %_tmp7, 0
  br i1 %_tmp9, label %bb1, label %bb3.loopexit

bb3.loopexit:                                     ; preds = %bb1
  %_tmp7.lcssa = phi i16 [ %_tmp7, %bb1 ]
  br label %bb3

bb3:                                              ; preds = %bb3.loopexit, %0
  ret void
}

Thanks for reporting @uabelho, I'll revert the patch for now and fix the issue!

sdesmalen added inline comments.Nov 16 2020, 1:08 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5788

The example provided by @uabelho fails for the load instruction (with a struct as result type) which is outside the loop body and isn't actually vectorized, so counting it as a vector register usage seems overly conservative. The extractvalue instruction in the body itself is vectorized, which is fine, since the extracted value from the aggregate is a valid element type for a vector.

I suggest addressing the issue with:

-  if (Ty->isTokenTy())
+  if (Ty->isTokenTy() || !VectorType::isValidElementType(Ty))
      return 0U;

@uabelho @SjoerdMeijer are you happy with me re-landing the patch with the above change?

uabelho added inline comments.Nov 16 2020, 11:50 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5788

I've verified that the above change fixes the problem and in the limited testing I've done of it I haven't seen any new problems. Thanks!

@SjoerdMeijer ?

RKSimon added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5778–5779

@sdesmalen I'm seeing unused variable (written but not used) warnings for MaxSafeDepDist - can this be removed?

sdesmalen added inline comments.Nov 19 2020, 9:43 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5778–5779

Thanks for letting me know! I've addressed it in rG41c9f4c1cea7314c0a6a32f2e59dab60cd575c1f