This helps reduce the size of the encoded C++ type strings in the binary.
This is enabled by default only on Darwin, but can be enabled/disabled via command line options.
rdar://63288571
Differential D96816
[ObjC] Encode pointers to C++ classes as "^v" if the encoded string would otherwise include template specialization types ahatanak on Feb 16 2021, 2:11 PM. Authored by
Details This helps reduce the size of the encoded C++ type strings in the binary. This is enabled by default only on Darwin, but can be enabled/disabled via command line options. rdar://63288571
Diff Detail
Event TimelineComment Actions This patch focuses only on reducing the encoded size of classes with template specialization, but if the information about pointed-to types isn't used at all, clang can always emit "^v" for C++ pointer types. Or it might make more sense to base the decision on whether the type is POD rather than on whether the string includes template specialization. Comment Actions You might want to test nested classes of class templates. I don't know if we encode nested classes in any reasonable way in the first place. I'm generally okay with being aggressive about stripping this kind of information from encodings, since encodings are not good enough to do type reflection in ObjC++ anyway. If you did decide you wanted to preserve more information from the encoding, though, you could consider only applying this at the innermost level that mentions a template specialization type, rather than the outermost. That is, the new logic seems to not expand the X in X* if X recursively has a field that's a pointer to a template specialization, rather than expanding X and then not expanding the type of that one field. Definitely not something I think you need to do now, though. Comment Actions I'm broadly in favour of this change, but with the GNUstep ABIs this is an ABI-breaking change and so should not be on by default. The type encoding leaks into the ABI in two ways:
I'm quite happy with this to be opt-in for non-Apple runtimes and opt-out for Apple runtimes. Comment Actions Since this is just in pointee types, I suspect this is fine for dispatch outside of targets (like that one old Cray) where sizeof(void*) != sizeof(some_struct*), which I don't believe we generally support. But yeah, if this affects GNU-runtime ivar mangling, we need a way to not disturb this for that case.
Comment Actions
Comment Actions Very cool change, I wanted something like this for a long time :) It seems to break Index/preamble-reparse-changed-module.m on some platforms: Does the failure make sense to you? Comment Actions I don't know why the test is failing. It doesn't fail on my machine. This is also broken: http://green.lab.llvm.org/green/job/lldb-cmake/29029/ I'm not sure if it's related to the one you saw. Comment Actions I ssh'd into one of the failing boxes and ran the failing command without piping it into filecheck. Here's the output: thakis@dotc:~/src/llvm-project$ env CINDEXTEST_EDITING=1 CINDEXTEST_EXECUTE_COMMAND="cp /usr/local/google/home/thakis/src/llvm-project/clang/test/Index/Inputs/preamble-reparse-changed-module/new-head.h /usr/local/google/home/thakis/src/llvm-project/out/gn/obj/clang/test/Index/Output/preamble-reparse-changed-module.m.tmp/mod/head.h" CINDEXTEST_EXECUTE_AFTER_TRIAL=1 /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/c-index-test -test-load-source-reparse 3 local /usr/local/google/home/thakis/src/llvm-project/clang/test/Index/preamble-reparse-changed-module.m -I /usr/local/google/home/thakis/src/llvm-project/out/gn/obj/clang/test/Index/Output/preamble-reparse-changed-module.m.tmp -I /usr/local/google/home/thakis/src/llvm-project/out/gn/obj/clang/test/Index/Output/preamble-reparse-changed-module.m.tmp/mod -fmodules -fmodules-cache-path=/usr/local/google/home/thakis/src/llvm-project/out/gn/obj/clang/test/Index/Output/preamble-reparse-changed-module.m.tmp/mcp c-index-test: ../../llvm/include/llvm/ADT/SmallVector.h:281: llvm::SmallVectorTemplateCommon::const_reference llvm::SmallVectorTemplateCommon<unsigned long, void>::operator[](llvm::SmallVectorTemplateCommon::size_type) const [T = unsigned long]: Assertion `idx < size()' failed. libclang: crash detected during parsing: { 'source_filename' : '(null)' 'command_line_args' : ['clang', '/usr/local/google/home/thakis/src/llvm-project/clang/test/Index/preamble-reparse-changed-module.m', '-I', '/usr/local/google/home/thakis/src/llvm-project/out/gn/obj/clang/test/Index/Output/preamble-reparse-changed-module.m.tmp', '-I', '/usr/local/google/home/thakis/src/llvm-project/out/gn/obj/clang/test/Index/Output/preamble-reparse-changed-module.m.tmp/mod', '-fmodules', '-fmodules-cache-path=/usr/local/google/home/thakis/src/llvm-project/out/gn/obj/clang/test/Index/Output/preamble-reparse-changed-module.m.tmp/mcp'], 'unsaved_files' : [], 'options' : 13, } Unable to load translation unit! Failure: libclang crashed Do you build without assertions locally maybe? Comment Actions #3 0x00007ffff78ee662 in __GI___assert_fail (assertion=0x90165c "idx < size()", file=0xc60efe "../../llvm/include/llvm/ADT/SmallVector.h", line=281, function=0xa8780d "llvm::SmallVectorTemplateCommon::const_reference llvm::SmallVectorTemplateCommon<unsigned long, void>::operator[](llvm::SmallVectorTemplateCommon::size_type) const [T = unsigned long]") at assert.c:101 #4 0x000000000444401a in clang::ASTReader::ParseLanguageOptions(llvm::SmallVector<unsigned long, 64u> const&, bool, clang::ASTReaderListener&, bool) () #5 0x000000000443eb62 in clang::ASTReader::ReadOptionsBlock(llvm::BitstreamCursor&, unsigned int, bool, clang::ASTReaderListener&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) () #6 0x00000000044474c5 in clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int) () #7 0x000000000444a901 in clang::ASTReader::ReadASTCore(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, clang::serialization::ModuleFile*, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, long, long, clang::ASTFileSignature, unsigned int) () #8 0x00000000044584f8 in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, llvm::SmallVectorImpl<clang::ASTReader::ImportedSubmodule>*) () #9 0x00000000042c07a5 in clang::CompilerInstance::findOrCompileModuleAndReadAST(llvm::StringRef, clang::SourceLocation, clang::SourceLocation, bool) () #10 0x00000000042c4650 in clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<std::pair<clang::IdentifierInfo*, clang::SourceLocation> >, clang::Module::NameVisibilityKind, bool) () #11 0x00000000049713c6 in clang::Preprocessor::LexAfterModuleImport(clang::Token&) () #12 0x0000000004972afb in clang::Preprocessor::Lex(clang::Token&) () #13 0x0000000004e6421d in clang::Parser::ParseModuleName(clang::SourceLocation, llvm::SmallVectorImpl<std::pair<clang::IdentifierInfo*, clang::SourceLocation> >&, bool) () #14 0x0000000004e5c9f5 in clang::Parser::ParseModuleImport(clang::SourceLocation) () #15 0x0000000004ed2e23 in clang::Parser::ParseObjCAtDirectives(clang::Parser::ParsedAttributesWithRange&) () #16 0x0000000004e5d802 in clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) () #17 0x0000000004e5b7e6 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, bool) () #18 0x0000000004e5af77 in clang::Parser::ParseFirstTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&) () #19 0x0000000004e5632d in clang::ParseAST(clang::Sema&, bool, bool) () #20 0x000000000434e3a3 in clang::FrontendAction::Execute() () #21 0x0000000004261cc6 in clang::ASTUnit::Parse(std::shared_ptr<clang::PCHContainerOperations>, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) () #22 0x0000000004265d8d in clang::ASTUnit::LoadFromCompilerInvocation(std::shared_ptr<clang::PCHContainerOperations>, unsigned int, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) () #23 0x0000000004267090 in clang::ASTUnit::LoadFromCommandLine(char const**, char const**, std::shared_ptr<clang::PCHContainerOperations>, llvm::IntrusiveRefCntPtr<clang::DiagnosticsEngine>, llvm::StringRef, bool, clang::CaptureDiagsKind, llvm::ArrayRef<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, llvm::MemoryBuffer*> >, bool, unsigned int, clang::TranslationUnitKind, bool, bool, bool, clang::SkipFunctionBodiesScope, bool, bool, bool, bool, llvm::Optional<llvm::StringRef>, std::unique_ptr<clang::ASTUnit, std::default_delete<clang::ASTUnit> >*, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) () #24 0x00000000045c7b05 in void llvm::function_ref<void ()>::callback_fn<clang_parseTranslationUnit2FullArgv::$_1>(long) () #25 0x000000000464d955 in llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) () #26 0x000000000464db04 in RunSafelyOnThread_Dispatch(void*) () Maybe the problem is that the test reuses a serialized module from an earlier run, and this changes serialization stuff without bumping the serialization version? Comment Actions See maybe also https://reviews.llvm.org/D73202 . Do we need to bump serialization::VERSION_MINOR ? Comment Actions I gave that a try in http://reviews.llvm.org/rG063236646849564094f5fcfc947ad36dba0efedb . Looks like it did the trick: http://45.33.8.238/linux/39740/step_7.txt Some other test is also broken, but the breakage caused by this change is fixed in that build. So the failure happened with LLVM_APPEND_VC_REV=NO. Comment Actions I still don't understand why the test was failing, but did we have to bump the version because a new LangOpt (EncodeCXXClassTemplateSpec) was added? Comment Actions Here are the RUN lines: // RUN: mkdir -p %t/mod // RUN: touch %t/empty.h // RUN: cp %S/Inputs/preamble-reparse-changed-module/module.modulemap %t/mod // RUN: cp %S/Inputs/preamble-reparse-changed-module/head.h %t/mod // RUN: env CINDEXTEST_EDITING=1 CINDEXTEST_EXECUTE_COMMAND="cp %S/Inputs/preamble-reparse-changed-module/new-head.h %t/mod/head.h" CINDEXTEST_EXECUTE_AFTER_TRIAL=1 \ // RUN: c-index-test -test-load-source-reparse 3 local %s -I %t -I %t/mod -fmodules -fmodules-cache-path=%t/mcp 2>&1 | FileCheck %s It wouldn't have failed in a clean build. The test is reusing the module cache path %t/mcp from the previous run. I think most implicit modules tests make themselves robust to incremental builds by adding something like this before the first use of the cache: // RUN: rm -rf %t/mcp Comment Actions Right; see also description of https://reviews.llvm.org/D73202 However, while that makes the test more robust, the test kind of found a real bug that actual users could've hit (...but only uses who build with LLVM_APPEND_VC_REV=NO), since users usually don't manually manage cache dirs. Comment Actions I managed to reproduce the failure by building everything with -DLLVM_APPEND_VC_REV=NO. What I found was that any BENIGN_LANGOPT added to LangOptions.def would cause the test to fail since CompilerInvocation::getModuleHash doesn't use the option to compute the hash if it's a BENIGN_LANGOPT. The test was added a month ago, so I think my commit was the first to cause the failure. If I use LANGOPT instead of BENIGN_LANGOPT, the test passes. The description of BENIGN_LANGOPT says "for options that don't affect the construction of the AST in any way" and since the option can change the AST (e.g., the value of unsigned length = sizeof(@encode(S)); will differ depending on whether the member for the option is set or not), I think I should have used LANGOPT instead of BENIGN_LANGOPT. Since this is something that is easy to overlook, should there be a comment to remind people to update the version whenever they add a BENIGN_LANGOPT? |
I think there's some magic in Options.td for defining the thing and its no- variant together. I don't think you want the f prefix in the name either if it's an f-option.