Following the nvptx approach, this patch uses the complex function
definitions from complex_cmath.h. With this patch, OvO passes
23 of 34 complex mathematical test cases.
Repository: rG LLVM Github Monorepo
Event Timeline
Inline comment on clang/lib/Headers/openmp_wrappers/complex, line 48:
Is the ifdef nvptx necessary? I thought the bug with variant only affected amdgpu. Looks like it might be an artifact of using a single end declare for both architectures.
Also possible that adding amdgcn to the arch list by nvptx64 will work here.
I'm seeing crashes in this area when running OvO, e.g.
0. $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_float_complex_float_complex_float.cpp
- $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_float_complex_float_complex_float.cpp:17:30: current parser token ')'
- $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_float_complex_float_complex_float.cpp:11:16: parsing function body 'test_pow'
- $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_float_complex_float_complex_float.cpp:11:16: in compound statement ('{}')
- $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_float_complex_float_complex_float.cpp:16:4: in compound statement ('{}')
- $HOME/llvm-install/lib/clang/14.0.0/include/openmp_wrappers/complex_cmath.h:158:1: instantiating function definition 'std::pow[device={arch(amdgcn, nvptx, nvptx64)}, implementation={extension(match_any, allow_templates)}]<float, float>'
The crash is somewhere in the name mangler, which might mean we're better off with two separate variant clauses after all.
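For readers without the OvO sources at hand, here is a reduced sketch of the kind of test that leads to this instantiation. It assumes the test calls std::pow on two std::complex<float> values inside a target region; the function name test_pow_sketch and the map clauses are hypothetical and not taken from the OvO source. Built without -fopenmp, the pragma is ignored and the body runs on the host.

```cpp
#include <cmath>
#include <complex>

// Hypothetical reduced test; the real OvO source is not reproduced here.
// With -fopenmp and an offload target, the call below is what makes the
// device compilation instantiate std::pow for complex<float> arguments.
std::complex<float> test_pow_sketch(std::complex<float> base,
                                    std::complex<float> exponent) {
  std::complex<float> result;
#pragma omp target map(to : base, exponent) map(from : result)
  { result = std::pow(base, exponent); }
  return result;
}
```

Whether this reduced form hits the same assertion on amdgcn has not been checked here; it only illustrates the call pattern named in the stack dump.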
Hey Jon, sorry for the late reply. I cannot reproduce this issue on nvptx, so it seems to occur only on amdgcn. Would it be better to fix the name mangling issue instead? Or, in the meantime, I could add an #ifdef around it as a temporary fix. Suggestions?
Let's change to the uglier #ifdef for now, and add it to the short list of things that aren't quite right in declare variant
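A minimal sketch of what the #ifdef form could look like, assuming one declare variant block per architecture, each guarded by the compiler's predefined target macro (__AMDGCN__ or __NVPTX__). The function norm_sketch is a hypothetical stand-in for the definitions pulled in from complex_cmath.h, and the selector is copied from the one shown in the crash output. On a host build neither macro is defined, so only the base definition is compiled.

```cpp
#include <complex>

// Base definition; norm_sketch is a hypothetical name used for illustration.
template <typename T> T norm_sketch(const std::complex<T> &z) {
  return z.real() * z.real() + z.imag() * z.imag();
}

#ifdef __AMDGCN__
// Variant block seen only when compiling for amdgcn.
#pragma omp begin declare variant match(                                      \
    device = {arch(amdgcn)},                                                  \
    implementation = {extension(match_any, allow_templates)})
template <typename T> T norm_sketch(const std::complex<T> &z) {
  return z.real() * z.real() + z.imag() * z.imag();
}
#pragma omp end declare variant
#endif

#ifdef __NVPTX__
// Variant block seen only when compiling for nvptx.
#pragma omp begin declare variant match(                                      \
    device = {arch(nvptx, nvptx64)},                                          \
    implementation = {extension(match_any, allow_templates)})
template <typename T> T norm_sketch(const std::complex<T> &z) {
  return z.real() * z.real() + z.imag() * z.imag();
}
#pragma omp end declare variant
#endif
```

The split keeps each variant clause tied to a single architecture list, at the cost of duplicating the block. This sketch does not establish whether the split avoids the mangler assertion.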
Compiling OvO tests for amdgpu, e.g.
cd
rm -rf OvO
git clone --depth 1 https://github.com/TApplencourt/OvO.git
cd OvO
export OMP_TARGET_OFFLOAD=mandatory
export CXX="$HOME/llvm-install/bin/clang++"
export CXXFLAGS="-fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -lm --rocm-device-lib-path=$HOME/llvm-install/amdgcn/bitcode -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906"
export LD_LIBRARY_PATH=$HOME/llvm-install/lib
./ovo.sh run test_src/cpp/mathematical_function/
./ovo.sh report --summary
Leads to some passing, but also some failing to compile:
clang-13: $HOME/llvm-project/clang/lib/AST/ItaniumMangle.cpp:5200: bool (anonymous namespace)::TemplateArgManglingInfo::needExactType(unsigned int, const clang::TemplateArgument &): Assertion `ParamIdx < ResolvedTemplate->getTemplateParameters()->size() && "no parameter for argument"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: $HOME/llvm-install/bin/clang-13 -cc1 -triple amdgcn-amd-amdhsa -aux-triple x86_64-unknown-linux-gnu -emit-llvm-bc -emit-llvm-uselists -disable-free -main-file-name pow_complex_double_complex_double_complex_double.cpp -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=all -fno-rounding-math -target-cpu gfx906 -fcuda-is-device -mlink-builtin-bitcode $HOME/llvm-install/lib/libomptarget-amdgcn-gfx906.bc -debugger-tuning=gdb -resource-dir $HOME/llvm-install/lib/clang/13.0.0 -internal-isystem $HOME/llvm-install/lib/clang/13.0.0/include/openmp_wrappers -include __clang_openmp_device_functions.h -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/x86_64-linux-gnu/c++/10 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/backward -internal-isystem $HOME/llvm-install/lib/clang/13.0.0/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../x86_64-linux-gnu/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem $HOME/llvm-install/lib/clang/13.0.0/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../x86_64-linux-gnu/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -fdebug-compilation-dir=$HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex -ferror-limit 19 -fopenmp -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fopenmp-is-device -fopenmp-host-ir-file-path /tmp/pow_complex_double_complex_double_complex_double-2d4f01.bc -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/pow_complex_double_complex_double_complex_double-698dd9.bc -x c++ $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_double_complex_double_complex_double.cpp
1. $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_double_complex_double_complex_double.cpp:17:30: current parser token ')'
2. $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_double_complex_double_complex_double.cpp:11:16: parsing function body 'test_pow'
3. $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_double_complex_double_complex_double.cpp:11:16: in compound statement ('{}')
4. $HOME/OvO/test_src/cpp/mathematical_function/cpp11-complex/pow_complex_double_complex_double_complex_double.cpp:16:4: in compound statement ('{}')
5. $HOME/llvm-install/lib/clang/13.0.0/include/openmp_wrappers/complex_cmath.h:158:1: instantiating function definition 'std::pow[device={arch(amdgcn, nvptx, nvptx64)}, implementation={extension(match_any, allow_templates)}]<double, double>'
6. $HOME/llvm-install/lib/clang/13.0.0/include/openmp_wrappers/complex_cmath.h:158:1: LLVM IR generation of declaration 'std::pow[device={arch(amdgcn, nvptx, nvptx64)}, implementation={extension(match_any, allow_templates)}]'
7. $HOME/llvm-install/lib/clang/13.0.0/include/openmp_wrappers/complex_cmath.h:158:1: Mangling declaration 'std::pow[device={arch(amdgcn, nvptx, nvptx64)}, implementation={extension(match_any, allow_templates)}]'
#0 0x0000000002bf92d3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) ($HOME/llvm-install/bin/clang-13+0x2bf92d3)
#1 0x0000000002bf701e llvm::sys::RunSignalHandlers() ($HOME/llvm-install/bin/clang-13+0x2bf701e)
#2 0x0000000002bf965f SignalHandler(int) Signals.cpp:0:0
#3 0x00007f0853e96140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140)
#4 0x00007f085396ece1 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#5 0x00007f0853958537 abort ./stdlib/abort.c:81:7
#6 0x00007f085395840f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8
#7 0x00007f085395840f _nl_load_domain ./intl/loadmsgcat.c:970:34
#8 0x00007f0853967662 (/lib/x86_64-linux-gnu/libc.so.6+0x34662)
#9 0x00000000052f232e (anonymous namespace)::TemplateArgManglingInfo::needExactType(unsigned int, clang::TemplateArgument const&) ItaniumMangle.cpp:0:0
#10 0x00000000052f61c5 (anonymous namespace)::CXXNameMangler::mangleTemplateArgs(clang::TemplateName, clang::TemplateArgumentList const&) ItaniumMangle.cpp:0:0
#11 0x00000000052e2516 (anonymous namespace)::CXXNameMangler::mangleNameWithAbiTags(clang::GlobalDecl, llvm::SmallVector<llvm::StringRef, 4u> const*) ItaniumMangle.cpp:0:0
#12 0x00000000052e04e7 (anonymous namespace)::CXXNameMangler::mangleFunctionEncoding(clang::GlobalDecl) ItaniumMangle.cpp:0:0
#13 0x00000000052ddddb (anonymous namespace)::ItaniumMangleContextImpl::mangleCXXName(clang::GlobalDecl, llvm::raw_ostream&) ItaniumMangle.cpp:0:0
#14 0x0000000002f69e0c getMangledNameImpl[abi:cxx11](clang::CodeGen::CodeGenModule&, clang::GlobalDecl, clang::NamedDecl const*, bool) CodeGenModule.cpp:0:0
#15 0x0000000002f627cd clang::CodeGen::CodeGenModule::getMangledName(clang::GlobalDecl) ($HOME/llvm-install/bin/clang-13+0x2f627cd)
#16 0x0000000003279259 clang::CodeGen::CGOpenMPRuntime::emitTargetFunctions(clang::GlobalDecl) ($HOME/llvm-install/bin/clang-13+0x3279259)
#17 0x0000000002f76a00 clang::CodeGen::CodeGenModule::EmitGlobal(clang::GlobalDecl) ($HOME/llvm-install/bin/clang-13+0x2f76a00)
#18 0x0000000002f7d923 clang::CodeGen::CodeGenModule::EmitTopLevelDecl(clang::Decl*) ($HOME/llvm-install/bin/clang-13+0x2f7d923)
#19 0x0000000003ba6a00 (anonymous namespace)::CodeGeneratorImpl::HandleTopLevelDecl(clang::DeclGroupRef) ModuleBuilder.cpp:0:0
#20 0x0000000003ba3616 clang::BackendConsumer::HandleTopLevelDecl(clang::DeclGroupRef) ($HOME/llvm-install/bin/clang-13+0x3ba3616)
#21 0x0000000004dc3e39 clang::Sema::InstantiateFunctionDefinition(clang::SourceLocation, clang::FunctionDecl*, bool, bool, bool) ($HOME/llvm-install/bin/clang-13+0x4dc3e39)
#22 0x0000000004dada97 clang::Sema::InstantiateAttrs(clang::MultiLevelTemplateArgumentList const&, clang::Decl const*, clang::Decl*, llvm::SmallVector<clang::Sema::LateInstantiatedAttribute, 16u>*, clang::LocalInstantiationScope*) ($HOME/llvm-install/bin/clang-13+0x4dada97)
... etc
(that's on release/13.x, but originally noticed on trunk)
Even with the declare variant blocks separated using #ifdefs, the error is still there, so I don't think we have a workaround for this.