This is an archive of the discontinued LLVM Phabricator instance.

SROA: Enhance speculateSelectInstLoads
ClosedPublic

Authored by cdevadas on Jul 23 2021, 7:33 AM.

Diff Detail

Event Timeline

cdevadas created this revision.Jul 23 2021, 7:33 AM
cdevadas requested review of this revision.Jul 23 2021, 7:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2021, 7:33 AM

@lebedev.ri @chandlerc could one of you take a look?
Much appreciate it.

arsenm added inline comments.Aug 3 2021, 1:33 PM
llvm/lib/Transforms/Scalar/SROA.cpp
1342–1346

I think other places that handle this use a user worklist and don't have the one use restriction for the bitcast (e.g. PointerReplacer::collectUsers)

llvm/test/Transforms/SROA/select-load.ll
35

Should have testcase with multiple users for the bitcast

cdevadas added inline comments.Aug 5 2021, 10:14 AM
llvm/lib/Transforms/Scalar/SROA.cpp
1342–1346

For the transformation to happen, If there are multiple users of Bitcast, they should all be load instructions. That will make the IR non-canonical. The one-use restriction makes sense I guess.

cdevadas updated this revision to Diff 364549.Aug 5 2021, 10:59 AM

Simplified the patch.
'BC->replaceAllUsesWith' isn't required.

arsenm accepted this revision.Aug 6 2021, 11:58 AM
This revision is now accepted and ready to land.Aug 6 2021, 11:58 AM
lebedev.ri added inline comments.Aug 6 2021, 12:03 PM
llvm/lib/Transforms/Scalar/SROA.cpp
1342–1346

That will make the IR non-canonical. The one-use restriction makes sense I guess.

Could you expand on that?

cdevadas added inline comments.Aug 6 2021, 8:46 PM
llvm/lib/Transforms/Scalar/SROA.cpp
1342–1346

If there are multiple loads from the same addr (bitcasted ptr here), they would have been optimized into a unified load before reaching SROA. Otherwise, it makes the IR less canonical.

lebedev.ri added inline comments.Aug 6 2021, 11:18 PM
llvm/lib/Transforms/Scalar/SROA.cpp
1342–1346

So, should this loop only handle a single non-bitcasted load too?
I don't see why we should allow an arbitrary number of loads and bitcasts, but only a single load of bitcast.

cdevadas added inline comments.Aug 7 2021, 12:35 AM
llvm/lib/Transforms/Scalar/SROA.cpp
1342–1346

Yes. I think so.

This revision was landed with ongoing or failed builds.Aug 7 2021, 6:10 AM
This revision was automatically updated to reflect the committed changes.

This patch causes an assertion failure while building the MIPS Linux kernel:

$ make -skj"$(nproc)" ARCH=mips CROSS_COMPILE=mips-linux-gnu- LLVM=1 LLVM_IAS=1 O=build/mips distclean malta_defconfig all
...
clang: /home/nathan/cbl/github/tc-build/llvm-project/llvm/lib/IR/Instructions.cpp:1452: llvm::LoadInst::LoadInst(llvm::Type *, llvm::Value *, const llvm::Twine &, bool, llvm::Align, llvm::AtomicOrdering, SyncScope::ID, llvm::Instruction *): Assertion `cast<PointerType>(Ptr->getType())->(10+ results) [2558/515101]
s(Ty)' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang -Qunused-arguments -fmacro-prefix-map=/home/nathan/cbl/src/linux/= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=im
plicit-int -Werror=return-type -Wno-format-security -std=gnu89 --target=mips-linux-gnu -integrated-as -Werror=unknown-warning-option -mno-check-zero-division -mabi=32 -G 0 -mno-abicalls -fno-pic -pipe -msoft-float -ffreestanding -EL -fno-stack-check -march=mips32r2 -fno-asynchronous-unwind-tables -fno-delete-null
-pointer-checks -Wno-frame-address -Wno-address-of-packed-member -O2 -Wframe-larger-than=1024 -fstack-protector-strong -Wno-format-invalid-specifier -Wno-gnu -mno-global-merge -Wno-unused-but-set-variable -Wno-unused-const-variable -fomit-frame-pointer -fno-stack-clash-protection -Wdeclaration-after-statement -Wv
la -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=date-time -Werror=incompatible-pointer-types -Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-pointer-to-enum-cast -Wno-tautological-constant-out-of-range-compare -Wa,-msoft-float -Wa,--trap -
nostdinc -isystem /home/nathan/cbl/github/tc-build/build/llvm/stage1/lib/clang/14.0.0/include -I/home/nathan/cbl/src/linux/arch/mips/include -I./arch/mips/include/generated -I/home/nathan/cbl/src/linux/include -I./include -I/home/nathan/cbl/src/linux/arch/mips/include/uapi -I./arch/mips/include/generated/uapi -I/
home/nathan/cbl/src/linux/include/uapi -I./include/generated/uapi -include /home/nathan/cbl/src/linux/include/linux/compiler-version.h -include /home/nathan/cbl/src/linux/include/linux/kconfig.h -include /home/nathan/cbl/src/linux/include/linux/compiler_types.h -D__KERNEL__ -DVMLINUX_LOAD_ADDRESS=0xffffffff801000
00 -DLINKER_LOAD_ADDRESS=0x80100000 -DDATAOFFSET=0 -DGAS_HAS_SET_HARDFLOAT -DTOOLCHAIN_SUPPORTS_VIRT -I/home/nathan/cbl/src/linux/arch/mips/include/asm/mach-malta -I/home/nathan/cbl/src/linux/arch/mips/include/asm/mach-generic -I /home/nathan/cbl/src/linux/net/ipv6 -I ./net/ipv6 -DKBUILD_MODFILE=\"net/ipv6/ipv6\"
 -DKBUILD_BASENAME=\"ip6_fib\" -DKBUILD_MODNAME=\"ipv6\" -D__KBUILD_MODNAME=kmod_ipv6 -c -Wp,-MMD,net/ipv6/.ip6_fib.o.d -fcolor-diagnostics -o net/ipv6/ip6_fib.o /home/nathan/cbl/src/linux/net/ipv6/ip6_fib.c
1.      <eof> parser at end of file
2.      Optimizer
 #0 0x0000000002cc1913 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2cc1913)
 #1 0x0000000002cbf73e llvm::sys::RunSignalHandlers() (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2cbf73e)
 #2 0x0000000002c4b583 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0
 #3 0x0000000002c4b6fe CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #4 0x00007fa50259d870 __restore_rt sigaction.c:0:0
 #5 0x00007fa502057d22 raise (/usr/lib/libc.so.6+0x3cd22)
 #6 0x00007fa502041862 abort (/usr/lib/libc.so.6+0x26862)
 #7 0x00007fa502041747 _nl_load_domain.cold loadmsgcat.c:0:0
 #8 0x00007fa502050616 (/usr/lib/libc.so.6+0x35616)
 #9 0x00000000025943be llvm::LoadInst::LoadInst(llvm::Type*, llvm::Value*, llvm::Twine const&, bool, llvm::Align, llvm::AtomicOrdering, unsigned char, llvm::Instruction*) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x25943be)
#10 0x0000000002594222 llvm::LoadInst::LoadInst(llvm::Type*, llvm::Value*, llvm::Twine const&, bool, llvm::Align, llvm::Instruction*) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2594222)
#11 0x0000000001bbbab8 llvm::IRBuilderBase::CreateAlignedLoad(llvm::Type*, llvm::Value*, llvm::MaybeAlign, bool, llvm::Twine const&) X86ISelLowering.cpp:0:0
#12 0x0000000002bcad15 llvm::SROA::runOnAlloca(llvm::AllocaInst&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2bcad15)
#13 0x0000000002bcbd8f llvm::SROA::runImpl(llvm::Function&, llvm::DominatorTree&, llvm::AssumptionCache&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2bcbd8f)
#14 0x0000000002bccd2d llvm::SROA::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2bccd2d)
#15 0x0000000003e2dd2d llvm::detail::PassModel<llvm::Function, llvm::SROA, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) PassBuilder.cpp:0:0
#16 0x00000000025f0fc1 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x25f0fc1)
#17 0x00000000033d098d llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) BackendUtil.cpp:0:0
#18 0x0000000002658ac4 llvm::CGSCCToFunctionPassAdaptor::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2658ac4)
#19 0x0000000003e325fd llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::CGSCCToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::Laz
yCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) PassBuilder.cpp:0:0
#20 0x0000000002653061 llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, ll
vm::CGSCCUpdateResult&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2653061)
#21 0x00000000027527bd llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm
::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) Inliner.cpp:0:0
#22 0x0000000002656c39 llvm::DevirtSCCRepeatedPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2656c39)
#23 0x0000000002752afd llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::DevirtSCCRepeatedPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCall
Graph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) Inliner.cpp:0:0
#24 0x00000000026551d5 llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x26551d5)
#25 0x000000000275295d llvm::detail::PassModel<llvm::Module, llvm::ModuleToPostOrderCGSCCPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) Inliner.cpp:0:0
#26 0x00000000025efd04 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x25efd04)
#27 0x000000000274f061 llvm::ModuleInlinerWrapperPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x274f061)
#28 0x0000000003e328ad llvm::detail::PassModel<llvm::Module, llvm::ModuleInlinerWrapperPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) PassBuilder.cpp:0:0
#29 0x00000000025efd04 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x25efd04)
#30 0x00000000033c6e25 (anonymous namespace)::EmitAssemblyHelper::EmitAssemblyWithNewPassManager(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) BackendUtil.cpp:0:0
#31 0x00000000033c161c clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llv
m::raw_pwrite_stream> >) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x33c161c)
#32 0x0000000003873430 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) CodeGenAction.cpp:0:0
#33 0x0000000003f907a4 clang::ParseAST(clang::Sema&, bool, bool) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x3f907a4)
#34 0x00000000037c13b0 clang::FrontendAction::Execute() (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x37c13b0)
#35 0x000000000372f6ff clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x372f6ff)
#36 0x000000000386d4b7 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x386d4b7)
#37 0x00000000019bb244 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x19bb244)
#38 0x00000000019b8d9d ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#39 0x00000000035c9d52 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::$_1>(long) Job.cpp:0:0
#40 0x0000000002c4b497 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x2c4b497)
#41 0x00000000035c98b7 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x35c98b7)
#42 0x0000000003591ee8 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x3591ee8)
#43 0x00000000035921b7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) const (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x35921b7)
#44 0x00000000035aa551 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x35aa551)
#45 0x00000000019b8666 main (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x19b8666)
#46 0x00007fa502042b25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#47 0x00000000019b5bae _start (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x19b5bae)
clang-14: error: clang frontend command failed with exit code 134 (use -v to see invocation)
ClangBuiltLinux clang version 14.0.0 (https://github.com/llvm/llvm-project c5c3cdb9c92895a63993cee70d2dd776ff9519c3)
Target: mipsel-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/nathan/cbl/github/tc-build/build/llvm/stage1/bin
clang-14: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-14: note: diagnostic msg: /tmp/ip6_fib-23421b.c
clang-14: note: diagnostic msg: /tmp/ip6_fib-23421b.sh
clang-14: note: diagnostic msg:

********************
...

A simplifier reproducer that is not MIPS specific:

$ cat ip6_fib.i
fib6_node_lookup_daddr;
addr_bit_set(void *);
struct lookup_args {
  int offset;
  int *addr
} fib6_node_lookup_1(struct lookup_args *args) {
  addr_bit_set(args->addr);
  ipv6_prefix_equal(args->addr);
}
fib6_node_lookup() {
  struct lookup_args args[] = {{}, {}};
  fib6_node_lookup_1(fib6_node_lookup_daddr ? args : args + 1);
}

$ clang -O2 -c -o /dev/null ip6_fib.i
...
clang: /home/nathan/cbl/github/tc-build/llvm-project/llvm/lib/IR/Instructions.cpp:1452: llvm::LoadInst::LoadInst(llvm::Type *, llvm::Value *, const llvm::Twine &, bool, llvm::Align, llvm::AtomicOrdering, SyncScope::ID, llvm::Instruction *): Assertion `cast<PointerType>(Ptr->getType())->isOpaqueOrPointeeTypeMatche
s(Ty)' failed.
...

I reverted the patch for now.
Missed the case when both bitcasted and non-bitcasted loads can exist at the same time.
In my patch, I should have retained the original TV and FV values rather than overwriting them with the new bitcast instructions. They are required for the non-bitcasted loads.
The loop for select users makes sense. I will upload a new patch.

; opt -S -sroa bug.ll -o out.ll

%struct.lookup_args = type { i32, i32* }

define void @arg_lookup(i1 %cmp){
entry:
  %args = alloca [2 x %struct.lookup_args], align 16
  %ptr = bitcast [2 x %struct.lookup_args]* %args to i8*
  %arraydecay = getelementptr inbounds [2 x %struct.lookup_args], [2 x %struct.lookup_args]* %args, i64 0, i64 0
  %add.ptr = getelementptr inbounds [2 x %struct.lookup_args], [2 x %struct.lookup_args]* %args, i64 0, i64 1
  %cond = select i1 %cmp, %struct.lookup_args* %add.ptr, %struct.lookup_args* %arraydecay
  %addr.i = getelementptr inbounds %struct.lookup_args, %struct.lookup_args* %cond, i64 0, i32 1
  %bcast.i8 = bitcast i32** %addr.i to i8**
  %addr.i8 = load i8*, i8** %bcast.i8, align 8
  call void @foo_i8(i8* %addr.i8)
  %addr.i32 = load i32*, i32** %addr.i, align 8
  call void @foo_i32 (i32* %addr.i32)
  ret void
}

declare void @foo_i8(i8*)
declare void @foo_i32(i32*)
cdevadas reopened this revision.Aug 9 2021, 11:04 PM
This revision is now accepted and ready to land.Aug 9 2021, 11:04 PM
cdevadas updated this revision to Diff 365367.Aug 9 2021, 11:35 PM

Handled the case when multiple loads of select (both bitcasted and direct) exist at the same time.

cdevadas requested review of this revision.Aug 9 2021, 11:36 PM
arsenm accepted this revision.Aug 11 2021, 6:28 AM
This revision is now accepted and ready to land.Aug 11 2021, 6:28 AM
This revision was landed with ongoing or failed builds.Aug 11 2021, 8:07 PM
This revision was automatically updated to reflect the committed changes.