This is an archive of the discontinued LLVM Phabricator instance.

[LSV] Fix the ContextInst for computeKnownBits.
ClosedPublic

Authored by jlebar on May 28 2023, 7:56 AM.

Details

Summary

Previously we used the later of GEPA or GEPB. This is hacky because
really we should be using the later of the two load/store instructions
being considered. But also it's flat-out incorrect, because GEPA and
GEPB might be in different BBs, in which case we cannot ask which one
comes last (assertion failure,
https://reviews.llvm.org/D149893#4378332).

Fixed, now we use the correct context instruction.

Diff Detail

Event Timeline

jlebar created this revision.May 28 2023, 7:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2023, 7:56 AM
jlebar requested review of this revision.May 28 2023, 7:56 AM

thanks for the very rapid fix, i will try it out , and should know in about 1.5 hours. will update here.
in tghe meantime if you think its fine , land it.

jlebar updated this revision to Diff 526332.May 28 2023, 8:01 AM

Fix a typo while we're here.

in the meantime if you think its fine , land it.

OK, will do. I think this is demonstrably more correct than it was before. :) Hopefully we don't have two bugs here.

This revision was not accepted when it landed; it landed in state Needs Review.May 28 2023, 8:03 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

looks like there is a 2nd assert :
llvm/include/llvm/ADT/APInt.h:1039: bool llvm::APInt::operator==(const llvm::APInt&) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed.

Hahaha all right. I am headed out for the day, but will look when I'm back.

thanks for looking into it. enjoy rest of your day

thanks for looking into it. enjoy rest of your day

I'm not seeing how this can happen. :(

I see two places where I call APInt::operator==:

  • bool AreContiguous = It->OffsetFromLeader == PrevReadEnd;
  • if (IOffset == ChainElemOffset ...)

In the first case, the two APInts are from the OffsetFromLeader values from the same Chain. There are two places where a ChainElem is created (search for ChainElem{), and the OffsetFromLeader values explicitly have the same bitwidth.

In the second case, IOffset and ChainElemOffset come from the ChainOffsets hashtable, which is created using the ChainElem::OffsetFromLeader field, so here too they should have the same bitwidth.

Perhaps the assertion is happening in another file (still is probably my bug!), or perhaps I'm missing the call to APInt::operator==. Or perhaps I'm misunderstanding my invariants somehow.

A reproducer, or at least a stacktrace, would be very helpful.

Sorry for the trouble, and thanks for helping me hunt this down.

Stack dump:
0. Program arguments: /opt/rocm-5.6.0-12074/llvm/bin/clang-17 -cc1 -triple amdgcn-amd-amdhsa -aux-triple x86_64-unknown-linux-gnu -emit-obj -disable-free -clear-ast-before-backend -main-file-name large_types.cu -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=none -fno-rounding-math -mconstructor-aliases -aux-target-cpu x86-64 -fcuda-is-device -mllvm -amdgpu-internalize-symbols -fcuda-allow-variadic-functions -fvisibility=hidden -fapply-global-visibility-to-externs -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/hip.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/ocml.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/ockl.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/oclc_daz_opt_off.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/oclc_unsafe_math_off.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/oclc_finite_only_off.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/oclc_wavefrontsize64_on.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/oclc_isa_version_906.bc -mlink-builtin-bitcode /opt/rocm-5.6.0-12074/amdgcn/bitcode/oclc_abi_version_400.bc -target-cpu gfx906 -target-feature -xnack -debugger-tuning=gdb -resource-dir /opt/rocm-5.6.0-12074/llvm/lib/clang/17.0.0 -dependency-file CMakeFiles/test_thrust_exclusive_large_types.dir/large_types.cu.o.d -MT testing/async/exclusive_scan/CMakeFiles/test_thrust_exclusive_large_types.dir/large_types.cu.o -sys-header-deps -canonical-system-headers -internal-isystem /opt/rocm-5.6.0-12074/llvm/lib/clang/17.0.0/include/cuda_wrappers -idirafter /opt/rocm-5.6.0-12074/include -include clang_hip_runtime_wrapper.h -c-isystem /opt/rocm-5.6.0-12074/llvm/include/gpu-none-llvm -isystem /opt/rocm-5.6.0-12074/include -isystem /jenkins/workspace/compiler-psdb-amd-stg-open_4/Libs/rocThrust/testing -D USE_PROF_API=1 -D HIP_PLATFORM_AMD=1 -D HIP_PLATFORM_HCC=1 -I /jenkins/workspace/compiler-psdb-amd-stg-open_4/Libs/rocThrust/build/release/thrust/include -I /jenkins/workspace/compiler-psdb-amd-stg-open_4/Libs/rocThrust/thrust/.. -D NDEBUG -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward -internal-isystem /opt/rocm-5.6.0-12074/llvm/lib/clang/17.0.0/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../x86_64-linux-gnu/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /opt/rocm-5.6.0-12074/llvm/lib/clang/17.0.0/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../x86_64-linux-gnu/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -Wno-unused-command-line-argument -Wall -Wextra -std=c++14 -fdeprecated-macro -fno-autolink -fdebug-compilation-dir=/jenkins/workspace/compiler-psdb-amd-stg-open_4/Libs/rocThrust/build/release/testing/async/exclusive_scan -ferror-limit 19 -pthread -fhip-new-launch-api -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -vectorize-loops -vectorize-slp -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -cuid=c55cff91efbd2d1d -fcuda-allow-variadic-functions -faddrsig -DGCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/large_types-gfx906:xnack--f0faeb.o -x hip /jenkins/workspace/compiler-psdb-amd-stg-open_4/Libs/rocThrust/testing/async/exclusive_sca

  1. <eof> parser at end of file
  2. Code generation
  3. Running pass 'CallGraph Pass Manager' on module '/jenkins/workspace/compiler-psdb-amd-stg-open_4/Libs/rocThrust/testing/async/exclusive_scan/large_types.cu'.
  4. Running pass 'GPU Load and Store Vectorizer' on function '@_ZN7rocprim6detail20lookback_scan_kernelILb1ENS0_19default_scan_configILj0E11FixedVectorIiLj8EEEEN6thrust6detail15normal_iteratorINS6_10device_ptrIKS4_EEEENS8_INS9_IS4_EEEENS6_7maximumIvEES4_NS0_19lookback_scan_stateIS4_Lb1ELb0EEEEEvT1_T2_mT4_T3_T5_jNS0_16ordered_block_idIjEEPNS0_18input_value_traitsISL_E10value_typeEST_bb'
#0 0x0000563c3056d52f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x26d652f)
#1 0x0000563c3056ac54 SignalHandler(int) Signals.cpp:0:0
#2 0x00007f02c29a3420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
#3 0x00007f02c244000b raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300b)
#4 0x00007f02c241f859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22859)
#5 0x00007f02c241f729 (/lib/x86_64-linux-gnu/libc.so.6+0x22729)
#6 0x00007f02c2430fd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
#7 0x0000563c2f1d5fca (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x133efca)
#8 0x0000563c307592b3 (anonymous namespace)::Vectorizer::getConstantOffset(llvm::Value*, llvm::Value*, llvm::Instruction*, unsigned int) LoadStoreVectorizer.cpp:0:0
#9 0x0000563c30759ca6 (anonymous namespace)::Vectorizer::gatherChains(llvm::ArrayRef<llvm::Instruction*>) LoadStoreVectorizer.cpp:0:0

#10 0x0000563c3075d5b8 (anonymous namespace)::Vectorizer::run() LoadStoreVectorizer.cpp:0:0
#11 0x0000563c30760362 (anonymous namespace)::LoadStoreVectorizerLegacyPass::runOnFunction(llvm::Function&) (.part.0) LoadStoreVectorizer.cpp:0:0
#12 0x0000563c2fecea81 llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x2037a81)
#13 0x0000563c2f4fa8b7 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) CallGraphSCCPass.cpp:0:0
#14 0x0000563c2fecf542 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x2038542)
#15 0x0000563c307ea1c6 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x29531c6)
#16 0x0000563c31809912 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x3972912)
#17 0x0000563c32989bd9 clang::ParseAST(clang::Sema&, bool, bool) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x4af2bd9)
#18 0x0000563c318082b8 clang::CodeGenAction::ExecuteAction() (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x39712b8)
#19 0x0000563c3104c789 clang::FrontendAction::Execute() (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x31b5789)
#20 0x0000563c30fcfb16 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x3138b16)
#21 0x0000563c311312e7 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0x329a2e7)
#22 0x0000563c2eb78a04 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0xce1a04)
#23 0x0000563c2eb748aa ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#24 0x0000563c2eb75d2c clang_main(int, char**, llvm::ToolContext const&) (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0xcded2c)
#25 0x0000563c2eac0a45 main (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0xc29a45)
#26 0x00007f02c2421083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#27 0x0000563c2eb6ef3e _start (/opt/rocm-5.6.0-12074/llvm/bin/clang-17+0xcd7f3e)

I still have no idea how this could be happening, but I hypothesize 50/50 chance https://reviews.llvm.org/D151640 will fix it for you. I have not been able to write a testcase. :(

subsequent testing has shown following assert in our builds of qmcpack and spec cpu2017 538.imagick.

llvm/lib/IR/Constants.cpp:3070: llvm::APInt llvm::ConstantDataSequential::getElementAsAPInt(unsigned int) const: Assertion `isa<IntegerType>(getElementType()) && "Accessor can only be used when element is an integer"' failed.

looking for a small reproducer ...

small test case
llc < small.ll

small test case
llc < small.ll

This is working on my machine at revision f81f32adc9a8 (that's the newest I have access to at the moment), and also it doesn't seem to invoke LSV.

LMK if I'm holding it wrong?

sorry, i meant to try my trunk build before posting it, forgot. you can disregard, its on my end.
thx for confiorming,

subsequent merge cleaned up our downstream issue. all good.