This is an archive of the discontinued LLVM Phabricator instance.

[SLP]Add final resize to ShuffleCostEstimator::finalize member function and basic add member functions.
ClosedPublic

Authored by ABataev on Apr 13 2023, 4:42 PM.

Details

Summary

Implemented the reshuffling in the finalize member function and added basic
support for the add member functions, which are used during vector build.

Part of D110978

Event Timeline

ABataev created this revision. Apr 13 2023, 4:42 PM
Herald added a project: Restricted Project. Apr 13 2023, 4:42 PM
ABataev requested review of this revision. Apr 13 2023, 4:42 PM
Herald added a project: Restricted Project. Apr 13 2023, 4:42 PM
RKSimon added inline comments. Apr 14 2023, 2:37 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp, line 7058

This looks very similar to the addMask helper - can we use that (possibly with a modification to skip the TermValue logic)?
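
For readers unfamiliar with the idea behind the comment above, composing two shuffle masks can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the addMask helper from SLPVectorizer.cpp: the names composeMasks and kPoison are hypothetical, it uses std::vector instead of LLVM containers, and it leaves out the TermValue logic mentioned in the comment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a "no element selected" mask entry.
constexpr int kPoison = -1;

// Compose two single-source shuffle masks: applying `mask` first and then
// `subMask` selects the same lanes as applying the returned mask once.
// Illustration only; this is not LLVM's addMask.
std::vector<int> composeMasks(const std::vector<int> &mask,
                              const std::vector<int> &subMask) {
  if (subMask.empty())
    return mask;    // nothing to apply on top
  if (mask.empty())
    return subMask; // an empty first mask is treated as "no shuffle yet"
  std::vector<int> result(subMask.size(), kPoison);
  for (std::size_t i = 0; i < subMask.size(); ++i) {
    int idx = subMask[i];
    // Poison or out-of-range lanes stay poison in the result.
    if (idx < 0 || static_cast<std::size_t>(idx) >= mask.size())
      continue;
    result[i] = mask[idx];
  }
  return result;
}
```

With this sketch, composeMasks({2, 0, 1, 3}, {1, 1, kPoison, 3}) yields {0, 0, kPoison, 3}: each lane of the second mask is looked up through the first, and poison lanes are carried through unchanged.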

ABataev updated this revision to Diff 513618. Apr 14 2023, 8:53 AM

Address comments

RKSimon accepted this revision. Apr 16 2023, 1:59 PM

LGTM

This revision is now accepted and ready to land. Apr 16 2023, 1:59 PM
jonpa added a subscriber: jonpa. Apr 18 2023, 11:32 AM

Seems this gave me a segfault on SPEC.

opt -mtriple=s390x-unknown-linux -mcpu=z15 -o - -S -O3 ./bugpoint-reduced-simplified.bc

Sorry, I see now that you already reverted it.

eaeltsin added a subscriber: eaeltsin. Edited Apr 21 2023, 2:33 AM

Hi,

This was relanded and caused crashes when compiling numpy for aarch64.

Reproducer:

typedef long double npy_longdouble;
typedef struct {
  npy_longdouble real, imag;
} npy_clongdouble;
enum { NPY_CDOUBLE } min_scalar_type_num_valueptr;
int min_scalar_type_num_type_num;
int min_scalar_type_num() {
  npy_clongdouble value = *(npy_clongdouble *)min_scalar_type_num_valueptr;
  if (value.real > -3.4e38 && value.real < 3.4e38 && value.imag > 3.4e38 &&
      value.imag < 3.4e38)
    if (value.real > -1.7e308 && value.real < 1.7e308 && value.imag > 1.7e308 &&
        value.imag < 1.7e308)
      return min_scalar_type_num_type_num;
  return 0;
}

Command:

clang -cc1 -triple aarch64-unknown-linux-gnu -emit-obj -O3 -vectorize-slp t4.c

Output:

Stack dump:
0.	Program arguments: clang -cc1 -triple aarch64-unknown-linux-gnu -emit-obj -O3 -vectorize-slp t4.c
1.	<eof> parser at end of file
2.	Optimizer
 #0 0x000055c62dade00e llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (clang+0x82de00e)
 #1 0x000055c62dadc149 llvm::sys::RunSignalHandlers() (clang+0x82dc149)
 #2 0x000055c62dade6ad SignalHandler(int) (clang+0x82de6ad)
 #3 0x00007f5ba4b791c0 __restore_rt (/usr/grte/v5/lib64/libpthread.so.0+0x151c0)
 #4 0x000055c62d203d3d llvm::slpvectorizer::BoUpSLP::ShuffleCostEstimator::finalize(llvm::ArrayRef<int>) (clang+0x7a03d3d)
 #5 0x000055c62d1ffd0a llvm::slpvectorizer::BoUpSLP::getEntryCost(llvm::slpvectorizer::BoUpSLP::TreeEntry const*, llvm::ArrayRef<llvm::Value*>) (clang+0x79ffd0a)
 #6 0x000055c62d206064 llvm::slpvectorizer::BoUpSLP::getTreeCost(llvm::ArrayRef<llvm::Value*>) (clang+0x7a06064)
 #7 0x000055c62d237be1 (anonymous namespace)::HorizontalReduction::tryToReduce(llvm::slpvectorizer::BoUpSLP&, llvm::TargetTransformInfo*, llvm::TargetLibraryInfo const&) (clang+0x7a37be1)
 #8 0x000055c62d223fb2 llvm::SLPVectorizerPass::vectorizeHorReduction(llvm::PHINode*, llvm::Value*, llvm::BasicBlock*, llvm::slpvectorizer::BoUpSLP&, llvm::TargetTransformInfo*, llvm::SmallVectorImpl<llvm::WeakTrackingVH>&) (clang+0x7a23fb2)
 #9 0x000055c62d22417a llvm::SLPVectorizerPass::vectorizeRootInstruction(llvm::PHINode*, llvm::Value*, llvm::BasicBlock*, llvm::slpvectorizer::BoUpSLP&, llvm::TargetTransformInfo*) (clang+0x7a2417a)
#10 0x000055c62d21ce5a llvm::SLPVectorizerPass::vectorizeChainsInBlock(llvm::BasicBlock*, llvm::slpvectorizer::BoUpSLP&) (clang+0x7a1ce5a)
#11 0x000055c62d21ae15 llvm::SLPVectorizerPass::runImpl(llvm::Function&, llvm::ScalarEvolution*, llvm::TargetTransformInfo*, llvm::TargetLibraryInfo*, llvm::AAResults*, llvm::LoopInfo*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::DemandedBits*, llvm::OptimizationRemarkEmitter*) (clang+0x7a1ae15)
#12 0x000055c62d21a746 llvm::SLPVectorizerPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (clang+0x7a1a746)
#13 0x000055c62c353a72 llvm::detail::PassModel<llvm::Function, llvm::SLPVectorizerPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (clang+0x6b53a72)
#14 0x000055c62d9478b5 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (clang+0x81478b5)
#15 0x000055c6295354b2 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (clang+0x3d354b2)
#16 0x000055c62d949c09 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (clang+0x8149c09)
#17 0x000055c629537552 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (clang+0x3d37552)
#18 0x000055c62d946d95 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (clang+0x8146d95)
#19 0x000055c62952cfad (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::__u::unique_ptr<llvm::raw_pwrite_stream, std::__u::default_delete<llvm::raw_pwrite_stream>>&, std::__u::unique_ptr<llvm::ToolOutputFile, std::__u::default_delete<llvm::ToolOutputFile>>&) (clang+0x3d2cfad)
#20 0x000055c629525ca7 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::__u::unique_ptr<llvm::raw_pwrite_stream, std::__u::default_delete<llvm::raw_pwrite_stream>>) (clang+0x3d25ca7)
#21 0x000055c629523816 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (clang+0x3d23816)
#22 0x000055c62a2742c6 clang::ParseAST(clang::Sema&, bool, bool) (clang+0x4a742c6)
#23 0x000055c629fe159a clang::FrontendAction::Execute() (clang+0x47e159a)
#24 0x000055c629f50f64 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (clang+0x4750f64)
#25 0x000055c6291c85ce clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (clang+0x39c85ce)
#26 0x000055c6291bd16f cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (clang+0x39bd16f)
#27 0x000055c6291b9ea4 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) (clang+0x39b9ea4)
#28 0x000055c6291b8a9f clang_main(int, char**, llvm::ToolContext const&) (clang+0x39b8a9f)
#29 0x000055c6291b5d84 main (clang+0x39b5d84)
#30 0x00007f5ba4a0c633 __libc_start_main (libc.so.6+0x61633)
#31 0x000055c6291b5cea _start (clang+0x39b5cea)

Hi, this should be fixed in 403bd583a8cc1f041430ff1b236ab296a2acdc85.

fhahn added a subscriber: fhahn. May 22 2023, 8:32 AM

Looks like this is causing a crash reported in https://github.com/llvm/llvm-project/issues/62665