Enable opaque pointers by default in clang, which can be disabled either via cc1 option or a cmake flag.
Details
- Reviewers
aeubanks - Group Reviewers
Restricted Project - Commits
- rG702d5de4380b: [Clang] Enable opaque pointers by default
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
actually I took a look at the failing CI and repro'd it locally with this change
cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_PROJECTS='clang;lld' -DLLVM_ENABLE_RUNTIMES='compiler-rt' -DLLVM_ENABLE_LLD=ON ../../llvm
$ bin/clang -fsanitize=memory -m64 -gline-tables-only ../../compiler-rt/test/msan/use-after-dtor.cpp -fno-sanitize-memory-use-after-dtor -o - -S -emit-llvm -Xclang -disable-llvm-passes
the debug location for main's return instruction is line 39 with typed pointers and line 50 with opaque pointers.
I think bugpoint reduces debug info (since a few years ago?) but I don't remember for sure. @aprantl, any (other) thoughts on how to dig into this?
Here's a reduced test case:
int main() { unsigned long buf; unsigned long buf2; return 0; // Comment }
With command build/bin/clang -O1 -gline-tables-only test.cpp -o - -S -emit-llvm -Xclang -disable-llvm-passes. The -O1 is relevant to enable placement of lifetime markers. It looks like we end up inheriting the location of the lifetime marker (at the end of the function) when opaque pointers are used.
I've fixed an issue with a return value optimization in https://github.com/llvm/llvm-project/commit/692a147bf4339380ccfea7418cbedcea540cfaef -- and that also seems to fix the debuginfo problem. Someone might want to look into the debuginfo aspect more closely, because that optimization probably shouldn't be impacting debuginfo...
I've also adjusted a few more compiler-rt tests to add -no-opaque-pointers in https://github.com/llvm/llvm-project/commit/3876cd10ae202110fc02e494cd0799ce6120aa9b and https://github.com/llvm/llvm-project/commit/7d2a1b6de4205dde613e83078314c13438f46578.
Let's see if that resolved everything...
https://lab.llvm.org/buildbot/#/builders/74/builds/10288/steps/14/logs/stdio
FAILED: tools/clang/lib/Tooling/ASTNodeAPI.json cd /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan_track_origins/tools/clang/lib/Tooling && /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan_track_origins/bin/clang-ast-dump --skip-processing=0 -I /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan_track_origins/lib/clang/15.0.0/include -I /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/include -I /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan_track_origins/tools/clang/include -I /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan_track_origins/include -I /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include -I /usr/include --json-output-path /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan_track_origins/tools/clang/lib/Tooling/ASTNodeAPI.json ==28753==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5633f83f12d7 in llvm::sys::path::const_iterator::operator++() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Support/Path.cpp:243:3 #1 0x5633fd56090a in __distance<llvm::sys::path::const_iterator> /b/sanitizer-x86_64-linux-bootstrap-msan/build/libcxx_build_msan_track_origins/include/c++/v1/__iterator/distance.h:34:31 #2 0x5633fd56090a in distance<llvm::sys::path::const_iterator> /b/sanitizer-x86_64-linux-bootstrap-msan/build/libcxx_build_msan_track_origins/include/c++/v1/__iterator/distance.h:52:12 #3 0x5633fd56090a in void llvm::SmallVectorImpl<llvm::StringRef>::append<llvm::sys::path::const_iterator, void>(llvm::sys::path::const_iterator, llvm::sys::path::const_iterator) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:669:27 #4 0x5633fd542d4a in SmallVector<llvm::sys::path::const_iterator, void> /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1202:11 #5 0x5633fd542d4a in clang::Preprocessor::HandleHeaderIncludeOrImport(clang::SourceLocation, clang::Token&, clang::Token&, clang::SourceLocation, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:2230:32 #6 0x5633fd525cd5 in clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:1843:17 #7 0x5633fd5283df in clang::Preprocessor::HandleDirective(clang::Token&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:1106:14 #8 0x5633fd46e9cf in clang::Lexer::LexTokenInternal(clang::Token&, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/Lexer.cpp:4091:7 #9 0x5633fd46698b in clang::Lexer::Lex(clang::Token&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/Lexer.cpp:3307:24 #10 0x5633fd6384d1 in clang::Preprocessor::Lex(clang::Token&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/Preprocessor.cpp:898:33 #11 0x5633fa9114d8 in ConsumeToken /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/include/clang/Parse/Parser.h:490:8 #12 0x5633fa9114d8 in clang::Parser::Initialize() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:565:3 #13 0x5633fa8fb313 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:155:7 #14 0x5633f8193fd5 in ASTSrcLocGenerationAction::ExecuteAction() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Tooling/DumpTool/ClangSrcLocDump.cpp:49:31 #15 0x5633fa11770f in clang::FrontendAction::Execute() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1032:8 #16 0x5633f9ea7cb2 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1036:33 #17 0x5633f81905b3 in main /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Tooling/DumpTool/ClangSrcLocDump.cpp:150:12 #18 0x7fcd163b909a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) (BuildId: eb6a5dd378d22b1e695984462a799cd4c81cdc22) #19 0x5633f80e0ef9 in _start (/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan_track_origins/bin/clang-ast-dump+0xc7aef9) Uninitialized value was stored to memory at #0 0x5633f810bfea in __msan_memcpy.part.426 /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1605:3 #1 0x5633fd5608dd in operator!= /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/ADT/iterator.h:181:51 #2 0x5633fd5608dd in __distance<llvm::sys::path::const_iterator> /b/sanitizer-x86_64-linux-bootstrap-msan/build/libcxx_build_msan_track_origins/include/c++/v1/__iterator/distance.h:34:20 #3 0x5633fd5608dd in distance<llvm::sys::path::const_iterator> /b/sanitizer-x86_64-linux-bootstrap-msan/build/libcxx_build_msan_track_origins/include/c++/v1/__iterator/distance.h:52:12 #4 0x5633fd5608dd in void llvm::SmallVectorImpl<llvm::StringRef>::append<llvm::sys::path::const_iterator, void>(llvm::sys::path::const_iterator, llvm::sys::path::const_iterator) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:669:27 #5 0x5633fd542d4a in SmallVector<llvm::sys::path::const_iterator, void> /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1202:11 #6 0x5633fd542d4a in clang::Preprocessor::HandleHeaderIncludeOrImport(clang::SourceLocation, clang::Token&, clang::Token&, clang::SourceLocation, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:2230:32 #7 0x5633fd525cd5 in clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:1843:17 #8 0x5633fd5283df in clang::Preprocessor::HandleDirective(clang::Token&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:1106:14 #9 0x5633fd46e9cf in clang::Lexer::LexTokenInternal(clang::Token&, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/Lexer.cpp:4091:7 #10 0x5633fd46698b in clang::Lexer::Lex(clang::Token&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/Lexer.cpp:3307:24 #11 0x5633fd6384d1 in clang::Preprocessor::Lex(clang::Token&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/Preprocessor.cpp:898:33 #12 0x5633fa9114d8 in ConsumeToken /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/include/clang/Parse/Parser.h:490:8 #13 0x5633fa9114d8 in clang::Parser::Initialize() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:565:3 #14 0x5633fa8fb313 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:155:7 #15 0x5633f8193fd5 in ASTSrcLocGenerationAction::ExecuteAction() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Tooling/DumpTool/ClangSrcLocDump.cpp:49:31 #16 0x5633fa11770f in clang::FrontendAction::Execute() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1032:8 #17 0x5633f9ea7cb2 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1036:33 #18 0x5633f81905b3 in main /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Tooling/DumpTool/ClangSrcLocDump.cpp:150:12 #19 0x7fcd163b909a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) (BuildId: eb6a5dd378d22b1e695984462a799cd4c81cdc22) Memory was marked as uninitialized #0 0x5633f8135f3b in __sanitizer_dtor_callback /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:940:5 #1 0x5633f8363a10 in ~SmallVector /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1190:3 #2 0x5633f8363a10 in (anonymous namespace)::RealFileSystem::status(llvm::Twine const&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Support/VirtualFileSystem.cpp:313:1 #3 0x5633f8336b6f in llvm::vfs::OverlayFileSystem::status(llvm::Twine const&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Support/VirtualFileSystem.cpp:433:36 #4 0x5633f959f0c4 in clang::FileSystemStatCache::get(llvm::StringRef, llvm::vfs::Status&, bool, std::__1::unique_ptr<llvm::vfs::File, std::__1::default_delete<llvm::vfs::File> >*, clang::FileSystemStatCache*, llvm::vfs::FileSystem&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Basic/FileSystemStatCache.cpp:47:55 #5 0x5633f958b840 in clang::FileManager::getStatValue(llvm::StringRef, llvm::vfs::Status&, bool, std::__1::unique_ptr<llvm::vfs::File, std::__1::default_delete<llvm::vfs::File> >*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Basic/FileManager.cpp:576:12 #6 0x5633f958d2b4 in clang::FileManager::getFileRef(llvm::StringRef, bool, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Basic/FileManager.cpp:256:20 #7 0x5633f958c404 in clang::FileManager::getFile(llvm::StringRef, bool, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Basic/FileManager.cpp:198:17 #8 0x5633fd43a57d in (anonymous namespace)::InitHeaderSearch::AddUnmappedPath(llvm::Twine const&, clang::frontend::IncludeDirGroup, bool, llvm::Optional<unsigned int>) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/InitHeaderSearch.cpp:177:22 #9 0x5633fd431ec9 in clang::ApplyHeaderSearchOptions(clang::HeaderSearch&, clang::HeaderSearchOptions const&, clang::LangOptions const&, llvm::Triple const&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/InitHeaderSearch.cpp:665:12 #10 0x5633f9e933cd in clang::CompilerInstance::createPreprocessor(clang::TranslationUnitKind) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:484:3 #11 0x5633fa109b5e in clang::FrontendAction::BeginSourceFile(clang::CompilerInstance&, clang::FrontendInputFile const&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:794:8 #12 0x5633f9ea7c59 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1035:13 #13 0x5633f81905b3 in main /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Tooling/DumpTool/ClangSrcLocDump.cpp:150:12 #14 0x7fcd163b909a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) (BuildId: eb6a5dd378d22b1e695984462a799cd4c81cdc22) SUMMARY: MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Support/Path.cpp:243:3 in llvm::sys::path::const_iterator::operator++() Exiting
This triggered (or exposed) assertion failures:
$ cat inlinecost.cpp typedef char a; typedef unsigned b; typedef int c; typedef struct { b d[]; } e; e f(void *) {} typedef void *fptr; fptr g; template <typename, typename, typename, typename> void h(fptr i) { int (*j)(const void *) = (int (*)(const void *))i; j(g); } template <typename, typename> void k(b, b, b, const char *) { h<long, long, c, int>(fptr(f)); } template void k<c, a>(b, b, b, const char *); $ clang -target x86_64-linux-gnu -c inlinecost.cpp -O2 clang: ../lib/Analysis/InlineCost.cpp:2595: llvm::InlineResult {anonymous}::CallAnalyzer::analyze(): Assertion `CAI != CandidateCall.arg_end()' failed.
@mstorsjo Thanks for the report, the issue should be fixed by https://github.com/llvm/llvm-project/commit/8d5c8d57c637d898094af323d1888ea5a3364f8c.
@vitalybuka Are there any instructions on how to reproduce failures from this buildbot? It seems like this needs more than a simple bootstrap build due to the need for instrumented libcxx?
Okay, I managed to reproduce this using the instructions from https://github.com/google/sanitizers/wiki/MemorySanitizerBootstrappingClang.
Reduced to these two variants for -passes=msan:
target triple = "x86_64-unknown-linux-gnu" define void @test(i8* %p, i32* byval(i32) %p2) { %p2.i8 = bitcast i32* %p2 to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* %p, i8* %p2.i8, i64 4, i1 false) ret void } declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)
target triple = "x86_64-unknown-linux-gnu" define void @test(ptr %p, ptr byval(i32) %p2) { call void @llvm.memcpy.p0.p0.i64(i8* %p, i8* %p2, i64 4, i1 false) ret void } declare void @llvm.memcpy.p0.p0.i64(i8*, i8*, i64, i1)
The second one does not initialize the shadow for the byval argument.
With typed pointers, this happens because a bitcast is present, which will attempt to fetch the shadow (https://github.com/llvm/llvm-project/blob/e810d558093cff40caaa1aff24d289c76c59916d/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp#L2050). While a plain memcpy does not attempt to fetch the shadow (https://github.com/llvm/llvm-project/blob/e810d558093cff40caaa1aff24d289c76c59916d/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp#L2586).
Awesome, thanks! Unfortunately, there seems to be more issues:
$ cat lazycallgraph.cpp typedef char a; typedef unsigned b; typedef int c; namespace { typedef void *d; void e(void *) {} void *f; template <typename, typename, typename, typename> void g(d, d, d, d, d h, d i) { int (*j)(const void *) = (int (*)(const void *))h; c (*k)(int) = (c(*)(int))i; int l = j(f); k(l); } } // namespace template <typename, typename> void m(b, b, b, const char *) { d n, o(&n); g<long, short, c, int>(d(), d(), o, d(), d(e), n); } template void m<c, a>(b, b, b, const char *); $ clang -target x86_64-linux-gnu -c lazycallgraph.cpp -O3
https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild
For this bot buildbot_fast.sh -> buildbot_bootstrap_msan.sh
it's quite slow, so you may want to skip some unnecessary parts:
#!/usr/bin/env bash set -x set -e set -u HERE="$(cd $(dirname $0) && pwd)" . ${HERE}/buildbot_functions.sh ROOT=`pwd` PLATFORM=`uname` export PATH="/usr/local/bin:$PATH" LLVM=$ROOT/llvm CMAKE_COMMON_OPTIONS="-GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=OFF" #clobber buildbot_update # Stage 1 build_stage1_clang # Stage 2 / Memory Sanitizer #build_stage2_msan #check_stage2_msan # Stage 3 / MemorySanitizer #build_stage3_msan #check_stage3_msan #if [[ -v msan_FAILED ]]; then # Stage 2 / MemoryWithOriginsSanitizer build_stage2_msan_track_origins check_stage2_msan_track_origins # Stage 3 / MemoryWithOriginsSanitizer #build_stage3_msan_track_origins #check_stage3_msan_track_origins #fi #cleanup $STAGE1_DIR
@mstorsjo Thanks! I've reduced this to a crash in -argpromotion:
efine void @caller() { call i32 @callee(ptr null) ret void } define internal void @callee(ptr %p) { ret void }
Similar issue with function type mismatch.
@nikic, it seems breaking the clang-ppc64le-rhel build
htps://lab.llvm.org/buildbot/#/builders/57/builds/16776
Can you please take a look?
We've spotted some breakages caused by this patch within the llvm test suite when built for AArch64-SVE. I've got https://reviews.llvm.org/D123670 as a WIP fix.
Thanks! I think the rest of my regularly build tested code now builds fine after that.
Hi,
I noticed this produces broken code:
clang -cc1 -triple amdgcn-- -emit-llvm -o - -x c op.c
with op.c being
void bar(); void foo() { bar(); }
The result is
define dso_local void @foo() #0 { entry: call void @bar(i32 noundef 42) ret void } declare void @bar(...) #1
and feeding it to e.g. opt we get
build-all-builtins/bin/opt: <stdin>:9:13: error: invalid forward reference to function 'bar' with wrong type: expected 'void (...)*' but was 'void ()*' call void @bar() ^
Probably the same for several targets.
@uabelho The IR is correct, but requires using opt -opaque-pointers explicitly. Normally, opaque pointer mode is automatically enabled, but there is no explicit mention of ptr in the IR. Not sure if we can do anything to improve that before the default on the opt side is switched.
Oh, didn't realize this. Yes we'll need to start passing -opaque-pointers to opt (and llc I suppose) explicitly then.
Thanks!
We noticed a new crash that still repros at head.
clang -target armv7a-linux-gnueabihf -march=armv8a -mthumb -c -O2 file.cc
llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10135: llvm::Value *llvm::VPTransformState::get(llvm::VPValue *, unsigned int): Assertion `(isa<VPWidenIntOrFpInductionRecipe>(Def->getDef()) || isa<VPScalarIVStepsRecipe>(Def->getDef())) && "unexpected recipe found to be invariant"' failed.
C struct b; struct c { using a = b *; }; struct d { d(c::a); }; struct g; struct i { typedef g *a; }; struct g { typedef i::a e; e f; } * h; struct o { typedef g ::e e; }; struct p; struct H { typedef g &a; }; template <typename> struct q; template <typename bq> struct q<bq *> { static bq *bu(H::a r) { g &j = r; h = &j; return h; } }; struct s : g { e cf() { return q<e>::bu(*this); } }; struct t { s ck; }; template <class cg> struct K { typedef cg bf; typedef typename bf::ce ce; struct cs { cs(bf) {} t cp; } co; K() : co(bf()) { typename ce::e k = co.cp.ck.cf(); k->f = k; } }; struct L { typedef K<p> a; }; struct p { typedef o ce; }; struct b : L::a {}; struct M { int l; using m = b; d n{new m[l]}; }; void u() { new M; }
$ cat /tmp/a.ll target triple = "thumbv8-unknown-linux-gnueabihf" define void @zot() { bb: br label %bb1 bb1: ; preds = %bb1, %bb %tmp = phi ptr [ null, %bb ], [ %tmp2, %bb1 ] store ptr %tmp, ptr %tmp, align 4 %tmp2 = getelementptr inbounds ptr, ptr %tmp, i32 1 %tmp3 = icmp eq ptr %tmp2, null br i1 %tmp3, label %bb4, label %bb1 bb4: ; preds = %bb1 ret void } $ opt -passes=loop-vectorize /tmp/a.ll -disable-output # crash
With
$ ./bin/clang -cc1 -emit-llvm BBI-68673.c
the IR contains the following
@a2_ldm_i64 = global [4 x [18 x { i64, i64 }]] zeroinitializer, align 16 %arrayidx = getelementptr inbounds [4 x [18 x { i64, i64 }]], ptr @a2_ldm_i64, i64 0, i64 %idxprom %.real = load volatile i64, ptr getelementptr inbounds ([18 x { i64, i64 }], ptr @a2_ldm_i64, i64 0, i64 1), align 16
but I would expect both GEPs to have the same first argument type (as the global symbol)
If I on the other hand compile with
$ ./bin/clang -cc1 -emit-llvm -no-opaque-pointers BBI-68673.c
I instead get
@a2_ldm_i64 = global [4 x [18 x { i64, i64 }]] zeroinitializer, align 16 %arrayidx = getelementptr inbounds [4 x [18 x { i64, i64 }]], [4 x [18 x { i64, i64 }]]* @a2_ldm_i64, i64 0, i64 %idxprom %.real = load volatile i64, i64* getelementptr inbounds ([4 x [18 x { i64, i64 }]], [4 x [18 x { i64, i64 }]]* @a2_ldm_i64, i64 0, i64 0, i64 1, i32 0), align 16
which looks more consistent.
Any idea what is going on here?
@markus Without tracing through it in detail, I'd guess that without opaque pointers this creates two getelementptr constant expressions that get folded together. With opaque pointers, the first one (which would be a zero-index GEP) is omitted, and only the second one is left. ConstantFolding will later canonicalize the GEP source type, but this only happens when InstCombine runs, while you're looking at unoptimized IR.
Are you encountering some particular issue relating to this? For well-written optimizations, the exact GEP representation shouldn't matter either way.
Yes, that appears to have been the case. Thanks for explaining.
Are you encountering some particular issue relating to this? For well-written optimizations, the exact GEP representation shouldn't matter either way.
One of our target specific passes ran into problems with this as it made some assumptions that no longer hold with opaque pointers.
Hi, unfortunately for some reason it doesn't play well with coroutines HALO. There is regression seen on Gor's Nishanov classical code snippet: https://godbolt.org/z/PKMxqq4Gr I'm checking IR to find out more.
We have run into a slight performance degrading issue with our downstream target. The situation is that we relay on the "consthoist" pass with option "-consthoist-gep=1" to factor out the common parts of GEP expresseions to reduce the number of symbol references.
For example with the old typed pointers we hade just before that pass
store i16 100, i16* getelementptr inbounds ([8 x i16], [8 x i16]* @extUeValidatedTps, i16 0, i16 0), align 1 store i16 101, i16* getelementptr inbounds ([8 x i16], [8 x i16]* @extUeValidatedTps, i16 0, i16 1), align 1 store i16 102, i16* getelementptr inbounds ([8 x i16], [8 x i16]* @extUeValidatedTps, i16 0, i16 2), align 1
resulting in that the first GEP was factored out (with the other two based on it) so in total we had one symbol reference.
Now with opaque pointers we instead have
store i16 100, ptr @extUeValidatedTps, align 1 store i16 101, ptr getelementptr inbounds ([8 x i16], ptr @extUeValidatedTps, i16 0, i16 1), align 1 store i16 102, ptr getelementptr inbounds ([8 x i16], ptr @extUeValidatedTps, i16 0, i16 2), align 1
again resulting in that the first GEP is factored out with the remaining GEPs based on it with offset adjusted. The result of this is that we have two symbol references.
So if this pass is to remain functionally equivalent then maybe ConstantHoisting.cpp should be taught to treat ptr @extUeValidatedTps the same way as zero indexed GEP of that symbol?
Sounds reasonable in general -- though isn't this a pre-existing problem you'd see if you simply had multiple loads from the same global (without any GEP)? Looking at the current ConstantHoist code, it doesn't seem to really consider the case where a global symbol address can be expensive, it only handles hoisting of large integer values.
Indeed, multiple (GEP-less) loads from the same symbol gives the same problem. At least if they are spread out across the CFG, otherwise isel deals with it. Not sure what is the right way to deal with this. Maybe we will try making some modifications to ConstantHoist.
Just one more thing regarding this:
It's (kind of) related to GEPs as well. From CoroElide perspective the thing is that we cannot collect coro.subfn.addrs associated with coro.begin intrinsics in processCoroId.
Normally, for each coro.begin we traverse its users list and save found coro.subfn.addr. That explains why elision works fine when coro.begin is directly used by coro.subfn.addr:
%35 = call noalias nonnull i8* @llvm.coro.begin(token %31, i8* %34) %40 = call i8* @llvm.coro.subfn.addr(i8* nonnull %35, i8 0)
With opaque pointers IR reaching CoroElide has intermediate GEPs so coro.subfn.addr is not present on coro.begin's list and DestroyAddrs are not collected:
%22 = call noalias nonnull ptr @llvm.coro.begin(token %18, ptr %21) %__promise.reload.addr.i36 = getelementptr inbounds %_Z3addIiE9generatorIT_ERS2_S1_.Frame, ptr %22, i64 0, i32 2 %23 = getelementptr inbounds i8, ptr %__promise.reload.addr.i36, i64 -16 %24 = call ptr @llvm.coro.subfn.addr(ptr nonnull %23, i8 0)
Thanks! I'm not familiar with the coroutine aspect here, but clearly there is a big optimization fail here: https://llvm.godbolt.org/z/GPG5df9xT We should be folding these two GEPs together, at least for the case of constant offsets. Probably we should canonicalize to i8 GEPs in InstCombine.
@yurai007 I've put up https://reviews.llvm.org/D124459 to fix this optimization failure.
@nikic: I can confirm that patch fix regression and makes coroutines snippet great again. Thanks!