This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Add ScalarizationOverheadBuilder helper to track vector extractions
Changes PlannedPublic

Authored by RKSimon on Sep 25 2022, 8:44 AM.

Details

Summary

Instead of accumulating all extraction costs separately and then adjusting for repeated subvector extractions, this patch collects all the extractions and then converts to calls to getScalarizationOverhead to improve the accuracy of the costs.

I'm not entirely satisfied with the getExtractWithExtendCost handling yet - this still just adds all the getExtractWithExtendCost costs together - it really needs to be replaced with a "getScalarizationOverheadWithExtend", but that will require further refactoring first.

This replaces my initial attempt in D124769.

Diff Detail

Event Timeline

RKSimon created this revision.Sep 25 2022, 8:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2022, 8:44 AM
RKSimon requested review of this revision.Sep 25 2022, 8:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2022, 8:44 AM

Note that this is a comment and not a review. Your usage of the anonymous namespace violates LLVM coding style. Maybe hoist it into its own header?

Nice catch! I was definitely on auto-pilot there!

Actually, I'm not sure if I should alter it - although the style guide recommends keeping just the declarations inside the namespace ("make anonymous namespaces as small as possible"), SLP appears to always keep the implementations there as well. @ABataev any preference?

Actually, I'm not sure if I should alter it - although the style guide recommends keeping just the declarations inside the namespace ("make anonymous namespaces as small as possible"), SLP appears to always keep the implementations there as well. @ABataev any preference?

It is local and small enough, so I believe we can keep it as is.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5788

Class?

5834–5839

Expand auto

ABataev added inline comments.Sep 26 2022, 5:45 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
6119

Does it support extract subvector?

RKSimon added inline comments.Sep 26 2022, 6:19 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
6119

getScalarizationOverhead should handle it there - however the getExtractWithExtendCost case doesn't handle subvectors yet - I really want to replace getExtractWithExtendCost with a getScalarizationOverhead equivalent tbh as this is the only user of it.

RKSimon updated this revision to Diff 462888.Sep 26 2022, 6:22 AM
RKSimon edited the summary of this revision. (Show Details)

use class instead of struct and replace auto with explicit types

RKSimon marked 2 inline comments as done.Sep 26 2022, 6:22 AM
ABataev added inline comments.Sep 26 2022, 6:26 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5786–5787

///

5789

///

5793

///

5817–5818

///

5828

///

5835

///

RKSimon updated this revision to Diff 463170.Sep 27 2022, 3:25 AM

Fix comments

@ABataev Would you prefer I include a refactor of getExtractWithExtendCost to getScalarizationOverheadWithExtend inside this patch ?

ABataev accepted this revision.Sep 27 2022, 4:25 AM

LG

@ABataev Would you prefer I include a refactor of getExtractWithExtendCost to getScalarizationOverheadWithExtend inside this patch ?

Better to implement it in a separate patch, I assume

This revision is now accepted and ready to land.Sep 27 2022, 4:25 AM
This revision was landed with ongoing or failed builds.Sep 27 2022, 6:52 AM
This revision was automatically updated to reflect the committed changes.
ayzhao added a subscriber: ayzhao.Sep 27 2022, 5:51 PM

FYI This is causing Chrome builds to fail with the error

clang-cl: /usr/local/google/home/ayzhao/src/llvm_tmpfs/llvm-project/llvm/include/llvm/ADT/APInt.h:1281: void llvm::APInt::setBit(unsigned int): Assertion `BitPosition < BitWidth && "BitPosition out of range"' failed.

https://crbug.com/1368841#c5

Cheers - I can see whats happening (but not actually why....) - I'll put in a workaround fix, but if you can create a reduced test case that'd be awesome.

I hit the crash as well. Bugpointed repro:

opt -passes="slp-vectorizer" bbi-74127.ll -o /dev/null

Thanks @uabelho my testing suggests rG926ccfef032d206dcbcdf74ca1e3a9ebf4d1be45 fixed the issue, I'll see whether the fix can be improved before I commit the regression test.

Thanks @uabelho my testing suggests rG926ccfef032d206dcbcdf74ca1e3a9ebf4d1be45 fixed the issue, I'll see whether the fix can be improved before I commit the regression test.

Yes I've also verified that the crash disappears with the fix.

Thanks @uabelho my testing suggests rG926ccfef032d206dcbcdf74ca1e3a9ebf4d1be45 fixed the issue, I'll see whether the fix can be improved before I commit the regression test.

Yes I've also verified that the crash disappears with the fix.

+1, I can also verify that the crash disappears on Chrome builds with this fix. Thanks!

fhahn added a subscriber: fhahn.Sep 29 2022, 9:55 AM

Just a heads up that I am seeing regressions on AArch64 with this patch, caused by additional vectorization. I'll provide a reproducer soon.

Just a heads up that I am seeing regressions on AArch64 with this patch, caused by additional vectorization. I'll provide a reproducer soon.

Cheers - I'm expecting it'll probably turn out to be a cost model issue somewhere.

fhahn added a comment.Sep 29 2022, 2:27 PM

Here's an example where this patch enables additional vectorization that harms performance: https://llvm.godbolt.org/z/KE1jv5sKs

I think there are the following issues that make this unprofitable:

  1. vector version cannot use FMADD while scalar code can, so the vector code doesn't save that many math instructions (this is a known issue but with the patch it seems to trigger more frequently)
  2. the vector code lengths the dependency chains in the loop, limiting out-of-order execution.

Thanks - I can see the problem, and this is going to take a bit of work to fix but I'm about to go on PTO... So I'm going to revert these changes for now and will come back to it soon. I've added test coverage for the cases people found so to not make the same mistakes again.

RKSimon reopened this revision.Sep 30 2022, 3:23 AM
This revision is now accepted and ready to land.Sep 30 2022, 3:23 AM
RKSimon planned changes to this revision.Sep 30 2022, 3:23 AM
Matt added a subscriber: Matt.Oct 14 2022, 10:59 AM