Page MenuHomePhabricator

[InstCombine] Fix miscompile on GEP+load to icmp fold (PR45210)
AcceptedPublic

Authored by hyeongyukim on Mar 28 2021, 9:00 PM.

Details

Summary

As noted in PR45210: https://bugs.llvm.org/show_bug.cgi?id=45210
...the bug is triggered as Eli say when sext(idx) * ElementSize overflows.

   // assume that GV is an array of 4-byte elements
   GEP = gep GV, 0, Idx // this is accessing Idx * 4
   L = load GEP
   ICI = icmp eq L, value
 =>
   ICI = icmp eq Idx, NewIdx

The foldCmpLoadFromIndexedGlobal function simplifies GEP+load operation to icmp.
And there is a problem because Idx * ElementSize can overflow.

Let's assume that the wanted value is at offset 0.
Then, there are actually four possible values for Idx to match offset 0: 0x00..00, 0x40..00, 0x80..00, 0xC0..00.
We should return true for all these values, but currently, the new icmp only returns true for 0x00..00.

This problem can be solved by masking off (trailing zeros of ElementSize) bits from Idx.

   ...
 =>
   Idx' = and Idx, 0x3F..FF
   ICI = icmp eq Idx', NewIdx

Diff Detail

Unit TestsFailed

TimeTest
2,250 msx64 debian > libarcher.races::lock-unrelated.c
Script: -- : 'RUN: at line 13'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/lock-unrelated.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/deflake.bash /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/lock-unrelated.c

Event Timeline

hyeongyukim created this revision.Mar 28 2021, 9:00 PM
hyeongyukim requested review of this revision.Mar 28 2021, 9:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 28 2021, 9:00 PM
aqjune added a subscriber: aqjune.Apr 3 2021, 11:08 AM
aqjune added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
289

This is to avoid creating redundant expressions, right?
Could you elaborate a bit more with an example?

llvm/test/Transforms/InstCombine/load-cmp.ll
64–65

Could you leave a comment explaining why this is 32767?
Same for test10_struct_arr_noinbounds case below

I added some comments to clarify my codes.

Add simple examples at the comment.

hyeongyukim edited the summary of this revision. (Show Details)Apr 14 2021, 1:11 AM
hyeongyukim edited the summary of this revision. (Show Details)
hyeongyukim edited the summary of this revision. (Show Details)Apr 14 2021, 1:14 AM
hyeongyukim edited the summary of this revision. (Show Details)Apr 14 2021, 4:18 AM

The patch was a bit complicated because it included the sext case, so I uploaded a patch that did not take that case into account. The sext case was for removing redundant operations, so the current patch is still correct.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
289

Yes, I changed my comment.

aqjune added inline comments.Apr 14 2021, 4:53 AM
llvm/test/Transforms/InstCombine/load-cmp.ll
305

Okay, the mask '268435455' is 0x0F..FF and it makes sense because the size of struct %Foo is 16.

efriedma added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
293

Using "-1" like this won't sign-extend correctly if anyone ever uses a pointer type wider than 64 bits.

There are a few ways to solve this; you can use ConstantInt::getSigned(), or APInt::getLowBitsSet().

Mask was modified to work properly even if the pointer size is bigger than 64bit.

This revision is now accepted and ready to land.Fri, May 28, 6:30 PM
This revision was automatically updated to reflect the committed changes.

This change breaks building the Linux kernel:

$ make -skj"$(nproc)" LLVM=1 LLVM_IAS=1 O=build/x86_64 distclean allmodconfig drivers/net/phy/phy-core.o
fatal error: error in backend: Instruction Combining seems stuck in an infinite loop after 100 iterations.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang -Qunused-arguments -fmacro-prefix-map=/home/nathan/cbl/src/linux-next/= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu89 -Werror=unknown-warning-option -integrated-as -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -fcf-protection=none -m64 -mno-80387 -mstack-alignment=8 -mtune=generic -mno-red-zone -mcmodel=kernel -Wno-sign-compare -fno-asynchronous-unwind-tables -mretpoline-external-thunk -fno-delete-null-pointer-checks -Wno-frame-address -Wno-address-of-packed-member -O2 -Wframe-larger-than=2048 -fstack-protector-strong -Wno-format-invalid-specifier -Wno-gnu -mno-global-merge -Wno-unused-const-variable -ftrivial-auto-var-init=pattern -fno-stack-clash-protection -pg -mfentry -falign-functions=64 -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=date-time -Werror=incompatible-pointer-types -Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-pointer-to-enum-cast -Wno-tautological-constant-out-of-range-compare -fprofile-generate -fsanitize=kernel-address -mllvm -asan-mapping-offset=0xdffffc0000000000 -mllvm -asan-globals=1 -mllvm -asan-instrumentation-with-call-threshold=0 -mllvm -asan-stack=0 --param asan-instrument-allocas=1 -fsanitize=array-bounds -fsanitize=shift -fsanitize=integer-divide-by-zero -fsanitize=object-size -fsanitize=bool -fsanitize=enum -fsanitize-coverage=trace-pc -fsanitize-coverage=trace-cmp -nostdinc -isystem /home/nathan/cbl/github/tc-build/build/llvm/stage1/lib/clang/13.0.0/include -I/home/nathan/cbl/src/linux-next/arch/x86/include -I./arch/x86/include/generated -I/home/nathan/cbl/src/linux-next/include -I./include -I/home/nathan/cbl/src/linux-next/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/nathan/cbl/src/linux-next/include/uapi -I./include/generated/uapi -include /home/nathan/cbl/src/linux-next/include/linux/compiler-version.h -include /home/nathan/cbl/src/linux-next/include/linux/kconfig.h -include /home/nathan/cbl/src/linux-next/include/linux/compiler_types.h -D__KERNEL__ -DCONFIG_X86_X32_ABI -DCC_USING_NOP_MCOUNT -DCC_USING_FENTRY -I /home/nathan/cbl/src/linux-next/drivers/net/phy -I ./drivers/net/phy -DMODULE -DKBUILD_BASENAME=\"phy_core\" -DKBUILD_MODNAME=\"libphy\" -D__KBUILD_MODNAME=kmod_libphy -c -Wp,-MMD,drivers/net/phy/.phy-core.o.d -fcolor-diagnostics -o drivers/net/phy/phy-core.o /home/nathan/cbl/src/linux-next/drivers/net/phy/phy-core.c
1.	<eof> parser at end of file
2.	Optimizer
 #0 0x00000000029502e3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x29502e3)
 #1 0x000000000294e12e llvm::sys::RunSignalHandlers() (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x294e12e)
 #2 0x00000000028dbb13 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0
 #3 0x00000000028dba8f llvm::CrashRecoveryContext::HandleExit(int) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x28dba8f)
 #4 0x000000000294a5b7 llvm::sys::Process::Exit(int, bool) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x294a5b7)
 #5 0x00000000017bfbc0 llvm::DenseMapBase<llvm::DenseMap<llvm::AliasSetTracker::ASTCallbackVH, llvm::AliasSet::PointerRec*, llvm::AliasSetTracker::ASTCallbackVHDenseMapInfo, llvm::detail::DenseMapPair<llvm::AliasSetTracker::ASTCallbackVH, llvm::AliasSet::PointerRec*> >, llvm::AliasSetTracker::ASTCallbackVH, llvm::AliasSet::PointerRec*, llvm::AliasSetTracker::ASTCallbackVHDenseMapInfo, llvm::detail::DenseMapPair<llvm::AliasSetTracker::ASTCallbackVH, llvm::AliasSet::PointerRec*> >::destroyAll() cc1_main.cpp:0:0
 #6 0x00000000028e03c2 llvm::report_fatal_error(llvm::Twine const&, bool) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x28e03c2)
 #7 0x00000000024a0a79 (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x24a0a79)
 #8 0x000000000249ec7a llvm::InstCombinePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x249ec7a)
 #9 0x0000000003a67abd llvm::detail::PassModel<llvm::Function, llvm::InstCombinePass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) PassBuilder.cpp:0:0
#10 0x00000000022d5831 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x22d5831)
#11 0x000000000302d82d llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) BackendUtil.cpp:0:0
#12 0x00000000022d8e54 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x22d8e54)
#13 0x000000000302f4ed llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) BackendUtil.cpp:0:0
#14 0x00000000022d4564 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x22d4564)
#15 0x0000000003023e7c (anonymous namespace)::EmitAssemblyHelper::EmitAssemblyWithNewPassManager(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) BackendUtil.cpp:0:0
#16 0x000000000301e7fa clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x301e7fa)
#17 0x00000000034b21f0 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) CodeGenAction.cpp:0:0
#18 0x0000000003bd2fe4 clang::ParseAST(clang::Sema&, bool, bool) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x3bd2fe4)
#19 0x00000000034081a0 clang::FrontendAction::Execute() (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x34081a0)
#20 0x000000000337c00f clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x337c00f)
#21 0x00000000034ac2d7 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x34ac2d7)
#22 0x00000000017bf893 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x17bf893)
#23 0x00000000017bd3dd ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#24 0x0000000003220b12 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::$_1>(long) Job.cpp:0:0
#25 0x00000000028dba27 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x28dba27)
#26 0x0000000003220677 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x3220677)
#27 0x00000000031e8958 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x31e8958)
#28 0x00000000031e8c27 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) const (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x31e8c27)
#29 0x0000000003201411 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x3201411)
#30 0x00000000017bcca6 main (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x17bcca6)
#31 0x00007fb232feeb25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#32 0x00000000017ba04e _start (/home/nathan/cbl/github/tc-build/build/llvm/stage1/bin/clang+0x17ba04e)
clang-13: error: clang frontend command failed with exit code 70 (use -v to see invocation)
ClangBuiltLinux clang version 13.0.0 (https://github.com/llvm/llvm-project 5c9fe816e3b6b9cdbf75758f2744a45f97c489f0)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/nathan/cbl/github/tc-build/build/llvm/stage1/bin
clang-13: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-13: note: diagnostic msg: /tmp/phy-core-a7405b.c
clang-13: note: diagnostic msg: /tmp/phy-core-a7405b.sh
clang-13: note: diagnostic msg: 

********************

A reduced reproducer:

enum {
  ETHTOOL_LINK_MODE_1000baseT_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseT_Full_BIT,
  ETHTOOL_LINK_MODE_2500baseX_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseKX4_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseKR_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseR_FEC_BIT,
  ETHTOOL_LINK_MODE_20000baseMLD2_Full_BIT,
  ETHTOOL_LINK_MODE_20000baseKR2_Full_BIT,
  ETHTOOL_LINK_MODE_40000baseKR4_Full_BIT,
  ETHTOOL_LINK_MODE_40000baseCR4_Full_BIT,
  ETHTOOL_LINK_MODE_40000baseSR4_Full_BIT,
  ETHTOOL_LINK_MODE_40000baseLR4_Full_BIT,
  ETHTOOL_LINK_MODE_56000baseKR4_Full_BIT,
  ETHTOOL_LINK_MODE_56000baseCR4_Full_BIT,
  ETHTOOL_LINK_MODE_56000baseSR4_Full_BIT,
  ETHTOOL_LINK_MODE_56000baseLR4_Full_BIT,
  ETHTOOL_LINK_MODE_25000baseCR_Full_BIT,
  ETHTOOL_LINK_MODE_25000baseKR_Full_BIT,
  ETHTOOL_LINK_MODE_25000baseSR_Full_BIT,
  ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT,
  ETHTOOL_LINK_MODE_50000baseKR2_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT,
  ETHTOOL_LINK_MODE_50000baseSR2_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseCR_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseSR_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseLR_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseLRM_Full_BIT,
  ETHTOOL_LINK_MODE_10000baseER_Full_BIT,
  ETHTOOL_LINK_MODE_2500baseT_Full_BIT,
  ETHTOOL_LINK_MODE_5000baseT_Full_BIT,
  ETHTOOL_LINK_MODE_50000baseKR_Full_BIT,
  ETHTOOL_LINK_MODE_50000baseSR_Full_BIT,
  ETHTOOL_LINK_MODE_50000baseCR_Full_BIT,
  ETHTOOL_LINK_MODE_50000baseLR_ER_FR_Full_BIT,
  ETHTOOL_LINK_MODE_50000baseDR_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseKR2_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseSR2_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseCR2_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseLR2_ER2_FR2_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseDR2_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseKR4_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseSR4_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseLR4_ER4_FR4_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseDR4_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseCR4_Full_BIT,
  ETHTOOL_LINK_MODE_400000baseKR8_Full_BIT,
  ETHTOOL_LINK_MODE_400000baseLR8_ER8_FR8_Full_BIT,
  ETHTOOL_LINK_MODE_400000baseCR8_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseKR_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseSR_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseLR_ER_FR_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseCR_Full_BIT,
  ETHTOOL_LINK_MODE_100000baseDR_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseKR2_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseSR2_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseLR2_ER2_FR2_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseDR2_Full_BIT,
  ETHTOOL_LINK_MODE_200000baseCR2_Full_BIT,
  ETHTOOL_LINK_MODE_400000baseKR4_Full_BIT,
  ETHTOOL_LINK_MODE_400000baseSR4_Full_BIT,
  ETHTOOL_LINK_MODE_400000baseLR4_ER4_FR4_Full_BIT,
  ETHTOOL_LINK_MODE_400000baseDR4_Full_BIT
};
struct {
  char duplex;
  char bit
} const settings[] = {{ETHTOOL_LINK_MODE_400000baseCR8_Full_BIT},
                      {ETHTOOL_LINK_MODE_400000baseKR8_Full_BIT},
                      {ETHTOOL_LINK_MODE_400000baseLR8_ER8_FR8_Full_BIT},
                      {ETHTOOL_LINK_MODE_400000baseKR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_400000baseLR4_ER4_FR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_400000baseDR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_400000baseSR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseCR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseKR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseLR4_ER4_FR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseDR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseSR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseCR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseKR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseLR2_ER2_FR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseDR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_200000baseSR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseCR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseKR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseLR2_ER2_FR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseDR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseSR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseCR_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseKR_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseLR_ER_FR_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseDR_Full_BIT},
                      {ETHTOOL_LINK_MODE_100000baseSR_Full_BIT},
                      {ETHTOOL_LINK_MODE_56000baseCR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_56000baseKR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_56000baseLR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_56000baseSR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_50000baseKR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_50000baseSR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_50000baseCR_Full_BIT},
                      {ETHTOOL_LINK_MODE_50000baseKR_Full_BIT},
                      {ETHTOOL_LINK_MODE_50000baseLR_ER_FR_Full_BIT},
                      {ETHTOOL_LINK_MODE_50000baseDR_Full_BIT},
                      {ETHTOOL_LINK_MODE_50000baseSR_Full_BIT},
                      {ETHTOOL_LINK_MODE_40000baseCR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_40000baseKR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_40000baseLR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_40000baseSR4_Full_BIT},
                      {ETHTOOL_LINK_MODE_25000baseCR_Full_BIT},
                      {ETHTOOL_LINK_MODE_25000baseKR_Full_BIT},
                      {ETHTOOL_LINK_MODE_25000baseSR_Full_BIT},
                      {ETHTOOL_LINK_MODE_20000baseKR2_Full_BIT},
                      {ETHTOOL_LINK_MODE_20000baseMLD2_Full_BIT},
                      {ETHTOOL_LINK_MODE_10000baseCR_Full_BIT},
                      {ETHTOOL_LINK_MODE_10000baseER_Full_BIT},
                      {ETHTOOL_LINK_MODE_10000baseKR_Full_BIT},
                      {ETHTOOL_LINK_MODE_10000baseKX4_Full_BIT},
                      {ETHTOOL_LINK_MODE_10000baseLR_Full_BIT},
                      {ETHTOOL_LINK_MODE_10000baseLRM_Full_BIT},
                      {ETHTOOL_LINK_MODE_10000baseR_FEC_BIT},
                      {ETHTOOL_LINK_MODE_10000baseSR_Full_BIT},
                      {ETHTOOL_LINK_MODE_10000baseT_Full_BIT},
                      {ETHTOOL_LINK_MODE_5000baseT_Full_BIT},
                      {ETHTOOL_LINK_MODE_2500baseT_Full_BIT},
                      {},
                      {},
                      {},
                      {},
                      {},
                      {},
                      {},
                      {},
                      {1},
                      {}};
phy_resolve_min_speed_fdx_only, phy_speed_down_core_i;
phy_speed_down_core() {
  while (settings[phy_speed_down_core_i].duplex)
    ;
}
$ clang -O2 -c -o /dev/null phy-core.i
...

$ clang -O2 -fno-strict-overflow -c -o /dev/null phy-core.i
...
fatal error: error in backend: Instruction Combining seems stuck in an infinite loop after 100 iterations.
...

The full preprocessed file and interestingness test can be viewed here: https://github.com/nathanchance/creduce-files/tree/56fb73d02320a11971ca3f8a88b52e13805e20c8/D99481

Reduced .ll:

$ opt -instcombine -S -o - reduce.ll
LLVM ERROR: Instruction Combining seems stuck in an infinite loop after 100 iterations.
...

Oh, hmm, didn't realize the transform could still fail at that point. Need to fix the code to avoid creating the mask instruction if the transform fails. @nathanchance, feel free to revert in the meantime.

On a side-note, looking at the code more closely, probably should check if Idx is too narrow, so we don't accidentally mask the high bits of the array indexes. Not sure if that happens in practice, though. GEPs with indexes narrower than i16 don't usually show up.

mdchen added subscribers: kuhar, mdchen.EditedMon, May 31, 6:37 PM

Reduced .ll:

$ opt -instcombine -S -o - reduce.ll
LLVM ERROR: Instruction Combining seems stuck in an infinite loop after 100 iterations.
...

@kuhar @spatel @lebedev.ri
I tested the case locally and saw the iteration number is infinity(the number 1001 is because I didn't notice MaxIterations before)exactly 1001. I've met a similar problem before, which also required 1001 iterations(but I can only reproduce it in our local codebase), which I don't know a coincidence or something resulted from an internal algorithm issue. I think this is a good chance to discuss if the threshold and fatal behavior introduced in https://reviews.llvm.org/D71673.

@efriedma thank you, I have reverted it in e6b086bef2c0597ed14b2aefbb3f6d74b94fd49e so our CI does not go red.

Reduced .ll:

$ opt -instcombine -S -o - reduce.ll
LLVM ERROR: Instruction Combining seems stuck in an infinite loop after 100 iterations.
...

@kuhar @spatel @lebedev.ri
I tested the case locally and saw the iteration number is exactly 1001. I've met a similar problem before, which also required 1001 iterations(but I can only reproduce it in our local codebase), which I don't know a coincidence or something resulted from an internal algorithm issue. I think this is a good chance to discuss if the threshold and fatal behavior introduced in https://reviews.llvm.org/D71673.

It fails even if i bump those limits to 0.1M iterations.
This is clearly an infinite combine cycle.

@lebedev.ri

It fails even if i bump those limits to 0.1M iterations.
This is clearly an infinite combine cycle.

If that's the case, isn't it more reasonable to replace the fatal_error with assert so that only developers sense that?

aqjune added a comment.Thu, Jun 3, 1:05 AM

I investigated a bit and the reason was as follows.

The reason was that foldCmpLoadFromIndexedGlobal can simply give up folding load+cmp, causing the new instructions created by these lines to be dummy ones:

if (countTrailingZeros(ElementSize) != 0) {
  Value *Mask = ConstantInt::getSigned(Idx->getType(), -1);
  Mask = Builder.CreateLShr(Mask, countTrailingZeros(ElementSize));
  Idx = Builder.CreateAnd(Idx, Mask);
}

In InstCombine's next iteration, they are found as a dead code. They are removed & InstCombine proceeds to the next iteration because there was something to remove.

Then, foldCmpLoadFromIndexedGlobal is called again, creating the dummy instructions, then dead code is removed, and so on.

To avoid this, the lshr+and should be created only when they are necessary.

hyeongyukim reopened this revision.Sun, Jun 6, 6:51 AM

I modified the code to prevent infinite combine cycle

This revision is now accepted and ready to land.Sun, Jun 6, 6:51 AM

Avoid creating an meaningless instruction that causes an infinte loop.

Reduced .ll:

$ opt -instcombine -S -o - reduce.ll
LLVM ERROR: Instruction Combining seems stuck in an infinite loop after 100 iterations.
...

@kuhar @spatel @lebedev.ri
I tested the case locally and saw the iteration number is exactly 1001. I've met a similar problem before, which also required 1001 iterations(but I can only reproduce it in our local codebase), which I don't know a coincidence or something resulted from an internal algorithm issue. I think this is a good chance to discuss if the threshold and fatal behavior introduced in https://reviews.llvm.org/D71673.

It fails even if i bump those limits to 0.1M iterations.
This is clearly an infinite combine cycle.

I fixed this problem. Could anyone can review it again?

This change breaks building the Linux kernel:

Thanks Nathan for the repro!
FYI: this means this function is likely being miscompiled and it may have security implications if the variable indexing into the array can be changed by user input.

I don't think the revised change really solves the problem. The reason you end up with dead instructions is the "return nullptr;" at the end of the function, around line 402. The new code doesn't seem related to that. If we narrow the cases where the generate the lshr+and, the infloop might trigger less frequently, but it doesn't really solve the general problem.

Apply mask to Idx only if icmp(GEP) folds.

I don't think the revised change really solves the problem. The reason you end up with dead instructions is the "return nullptr;" at the end of the function, around line 402. The new code doesn't seem related to that. If we narrow the cases where the generate the lshr+and, the infloop might trigger less frequently, but it doesn't really solve the general problem.

You're right.
I changed my code to add lshr+and only if those two instructions are meaningful.
Could you review my code again?

That approach looks better. But I think you missed a use of Idx in the magic_cst path?

Add a mask to Idx in case of magic_cst.

efriedma added inline comments.Sat, Jun 12, 10:46 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
373

This doesn't look right; it's outside the "if (Ty) {" that actually guards the transform.

Corrected the wrong location of the Mask.

Update unit test.