Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0).
Patch originally authored by @paulwalker-arm
Differential D106376
[SVE] Add support for folding for select + masked loads DylanFleming-arm on Jul 20 2021, 9:20 AM. Authored by
Details Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0). Patch originally authored by @paulwalker-arm
Diff Detail
Event TimelineComment Actions Hi @DylanFleming-arm, could you update the commit message to be more descriptive please and explain the purpose of the patch? You can mention at the bottom that it was originally written by @paulwalker-arm. Comment Actions LGTM!
Comment Actions I forgot to check this patch with all targets before putting it for review Comment Actions This appears to have inserted an obscure failure into Halide code generation, but only for certain vector architectures (e.g. x86-64 with AVX or AVX2, but not SSE4.1). We end up assert-failing: InstructionSimplify.cpp:1981: llvm::Value* simplifyLogicOfAddSub(llvm::Value*, llvm::Value*, llvm::Instruction::BinaryOps): Assertion `Op0->getType() == Op1->getType() && "Mismatched binop types"' failed. where the two binops are %483 = icmp slt i32 %482, 0 %481 = icmp sle <8 x i32> %478, %480 (I'm working on a repro case to provide, but so far I'm hampered by the fact that the assertion failure occurs when trying to produce a useful .ll file -- I'm posting this as a heads-up in case the short description rings any bells.) Comment Actions I did recently encounter a bug from binop matching here https://reviews.llvm.org/D105978 I don't think these cases are related, but it might be worth having a quick check just to rule it out for definite? I've had a brief look, through the repo, this could maybe be caused by the SimplifyAndInst call on line 3239, I believe that'll eventually lead to calling simplifyLogicOfAddSub. (although I haven't tested that myself since I'm not on my work laptop) Apologies for the somewhat lacking response, it's almost 1am here, but hopefully that can be somewhat useful. I can have a more indepth look into it when I start work tomorrow and see if I can find anything else that'd be of more use. Let me know if you get it fixed, i'd be interested to see the cause. Comment Actions Here's a repro case that is quite large, but hopefully useful: Uncompress and run: /path/to/llvm/bin/opt foo.ll -S -passes='default<O3>' --verify-each With this change removed, I get a transformed .ll out, as expected. With this change in place, I get something like: Assertion failed: (Op0->getType() == Op1->getType() && "Mismatched binop types"), function simplifyLogicOfAddSub, file /Users/srj/GitHub/llvm-project/13/llvm/lib/Analysis/InstructionSimplify.cpp, line 1983. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /Users/srj/llvm-13-install/bin/opt /tmp/foo.ll -S -passes=default<O3> --verify-each Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it): 0 opt 0x0000000106dddf87 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 39 1 opt 0x0000000106ddcd58 llvm::sys::RunSignalHandlers() + 248 2 opt 0x0000000106dde5c0 SignalHandler(int) + 272 3 libsystem_platform.dylib 0x00007fff203b6d7d _sigtramp + 29 4 libsystem_platform.dylib 0x00007ffeead70e20 _sigtramp + 18446744072813781184 5 libsystem_c.dylib 0x00007fff202c6406 abort + 125 6 libsystem_c.dylib 0x00007fff202c57d8 err + 0 7 opt 0x0000000107c245d3 simplifyLogicOfAddSub(llvm::Value*, llvm::Value*, llvm::Instruction::BinaryOps) (.cold.2) + 35 8 opt 0x0000000105d4bf07 simplifyLogicOfAddSub(llvm::Value*, llvm::Value*, llvm::Instruction::BinaryOps) + 231 9 opt 0x0000000105d3a404 SimplifyAndInst(llvm::Value*, llvm::Value*, llvm::SimplifyQuery const&, unsigned int) + 388 10 opt 0x00000001068888a9 llvm::InstCombinerImpl::visitSelectInst(llvm::SelectInst&) + 25145 11 opt 0x00000001067e0ae5 llvm::InstCombinerImpl::run() + 2021 12 opt 0x00000001067e296e combineInstructionsOverFunction(llvm::Function&, llvm::InstCombineWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, llvm::ProfileSummaryInfo*, unsigned int, llvm::LoopInfo*) + 894 13 opt 0x00000001067e22a0 llvm::InstCombinePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) + 1040 14 opt 0x00000001070b26f2 llvm::detail::PassModel<llvm::Function, llvm::InstCombinePass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) + 18 15 opt 0x000000010658387c llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) + 524 16 opt 0x00000001054844a2 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) + 18 17 opt 0x0000000106588548 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 664 18 opt 0x0000000104e9c512 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 18 19 opt 0x000000010658245e llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 574 20 opt 0x0000000104e9363c llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) + 12236 21 opt 0x0000000104ea7d6f main + 15775 22 libdyld.dylib 0x00007fff2038cf3d start + 1 Abort trap: 6 Comment Actions I'm looking into it now, hopefully I'll have a fix for it soon. Thanks for letting me know! |
Maybe worth noting here that you intentionally aren't checking the load has one use.