This is an archive of the discontinued LLVM Phabricator instance.

Check for null before using TUScope
Needs ReviewPublic

Authored by zturner on Apr 4 2017, 8:49 PM.



To be honest I don't really understand anything about this code. I encountered a situation when running include-what-you-use against LLVM where TUScope was null here, triggering a segfault. Some surrounding comments and code suggests that this variable is specific to Objective C, but I don't really grok whether null is indicative here of an earlier problem or whether it should actually be handled. There are other places in this file where it is checked for null, so there is at least some precedent for it in Sema.

Diff Detail

Event Timeline

zturner created this revision.Apr 4 2017, 8:49 PM
kimgr added a subscriber: kimgr.Apr 5 2017, 1:04 AM
kimgr added a comment.Apr 8 2017, 1:47 AM

For some context, the backtrace when this happens is:

       include-what-you-use.exe!clang::Scope::getEntity() Line 313	C++
	include-what-you-use.exe!clang::Sema::PushOnScopeChains(clang::NamedDecl * D=0x07ed05b0, clang::Scope * S=0x00000000, bool AddToContext=true) Line 1302	C++
	include-what-you-use.exe!clang::Sema::LazilyCreateBuiltin(clang::IdentifierInfo * II=0x008087f0, unsigned int ID=0x00000114, clang::Scope * S=0x00000000, bool ForRedeclaration=false, clang::SourceLocation Loc={...}) Line 1910	C++
	include-what-you-use.exe!LookupBuiltin(clang::Sema & S={...}, clang::LookupResult & R={...}) Line 699	C++
	include-what-you-use.exe!LookupDirect(clang::Sema & S={...}, clang::LookupResult & R={...}, const clang::DeclContext * DC=0x007c1a14) Line 850	C++
	include-what-you-use.exe!CppNamespaceLookup(clang::Sema & S={...}, clang::LookupResult & R={...}, clang::ASTContext & Context={...}, clang::DeclContext * NS=0x007c1a14, `anonymous-namespace'::UnqualUsingDirectiveSet & UDirs={...}) Line 931	C++
	include-what-you-use.exe!clang::Sema::CppLookupName(clang::LookupResult & R={...}, clang::Scope * S=0x0078c870) Line 1310	C++
	include-what-you-use.exe!clang::Sema::LookupName(clang::LookupResult & R={...}, clang::Scope * S=0x079fcd10, bool AllowBuiltinCreation=false) Line 1798	C++
	include-what-you-use.exe!clang::Sema::getTypeName(const clang::IdentifierInfo & II={...}, clang::SourceLocation NameLoc={...}, clang::Scope * S=0x079fcd10, clang::CXXScopeSpec * SS=0x0449db94, bool isClassName=false, bool HasTrailingDot=false, clang::OpaquePtr<clang::QualType> ObjectTypePtr={...}, bool IsCtorOrDtorName=false, bool WantNontrivialTypeSourceInfo=true, bool IsClassTemplateDeductionContext=true, clang::IdentifierInfo * * CorrectedII=0x00000000) Line 331	C++
	include-what-you-use.exe!clang::Parser::TryAnnotateTypeOrScopeTokenAfterScopeSpec(clang::CXXScopeSpec & SS={...}, bool IsNewScope=true) Line 1727	C++
	include-what-you-use.exe!clang::Parser::TryAnnotateTypeOrScopeToken() Line 1717	C++
	include-what-you-use.exe!clang::Parser::ParseCastExpression(bool isUnaryExpression=false, bool isAddressOfOperand=false, bool & NotCastExpr=false, clang::Parser::TypeCastState isTypeCast=NotTypeCast, bool isVectorLiteral=false) Line 876	C++
	include-what-you-use.exe!clang::Parser::ParseCastExpression(bool isUnaryExpression=false, bool isAddressOfOperand=false, clang::Parser::TypeCastState isTypeCast=NotTypeCast, bool isVectorLiteral=false) Line 484	C++
	include-what-you-use.exe!clang::Parser::ParseAssignmentExpression(clang::Parser::TypeCastState isTypeCast=NotTypeCast) Line 171	C++
	include-what-you-use.exe!clang::Parser::ParseExpression(clang::Parser::TypeCastState isTypeCast=NotTypeCast) Line 121	C++
	include-what-you-use.exe!clang::Parser::ParseReturnStatement() Line 1905	C++
	include-what-you-use.exe!clang::Parser::ParseStatementOrDeclarationAfterAttributes(llvm::SmallVector<clang::Stmt *,32> & Stmts={...}, clang::Parser::AllowedConstructsKind Allowed=ACK_Any, clang::SourceLocation * TrailingElseLoc=0x00000000, clang::Parser::ParsedAttributesWithRange & Attrs={...}) Line 265	C++
	include-what-you-use.exe!clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt *,32> & Stmts={...}, clang::Parser::AllowedConstructsKind Allowed=ACK_Any, clang::SourceLocation * TrailingElseLoc=0x00000000) Line 113	C++
	include-what-you-use.exe!clang::Parser::ParseCompoundStatementBody(bool isStmtExpr=false) Line 996	C++
	include-what-you-use.exe!clang::Parser::ParseFunctionStatementBody(clang::Decl * Decl=0x07e84ce0, clang::Parser::ParseScope & BodyScope={...}) Line 1965	C++
	include-what-you-use.exe!clang::Parser::ParseLateTemplatedFuncDef(clang::LateParsedTemplate & LPT={...}) Line 1418	C++
	include-what-you-use.exe!clang::Parser::LateTemplateParserCallback(void * P=0x0077d438, clang::LateParsedTemplate & LPT={...}) Line 1339	C++
	include-what-you-use.exe!include_what_you_use::IwyuAstConsumer::ParseFunctionTemplates(clang::TranslationUnitDecl * decl=0x007c1a00) Line 3558	C++
	include-what-you-use.exe!include_what_you_use::IwyuAstConsumer::HandleTranslationUnit(clang::ASTContext & context={...}) Line 3503	C++
	include-what-you-use.exe!clang::ParseAST(clang::Sema & S={...}, bool PrintStats=false, bool SkipFunctionBodies=false) Line 159	C++
	include-what-you-use.exe!clang::ASTFrontendAction::ExecuteAction() Line 611	C++
	include-what-you-use.exe!clang::FrontendAction::Execute() Line 512	C++
	include-what-you-use.exe!clang::CompilerInstance::ExecuteAction(clang::FrontendAction & Act={...}) Line 970	C++

This is triggered when we invoke a hacky hook in Sema to force instantiation of late-parsed function templates:

void ParseFunctionTemplates(TranslationUnitDecl* decl) {
  set<FunctionDecl*> late_parsed_decls = GetLateParsedFunctionDecls(decl);
  clang::Sema& sema = compiler()->getSema();

  // If we have any late-parsed functions, make sure the
  // -fdelayed-template-parsing flag is on. Otherwise we don't know where
  // they came from.
  CHECK_((compiler()->getLangOpts().DelayedTemplateParsing ||
          late_parsed_decls.empty()) &&
         "Should not have late-parsed decls without "

  for (const FunctionDecl* fd : late_parsed_decls) {

    if (CanIgnoreLocation(GetLocation(fd)))

    // Force parsing and AST building of the yet-uninstantiated function
    // template body.
    clang::LateParsedTemplate* lpt = sema.LateParsedTemplateMap[fd].get();
    sema.LateTemplateParser(sema.OpaqueParser, *lpt);

That LateTemplateParser call is a little crazy, of course, but it's the only way as far as I know to force-instantiate under -fdelayed-template-parsing.

Maybe this gives some context to judge whether the null-check is the right fix for the crash.

kimgr added a comment.Jun 11 2017, 1:39 PM

We'd love to see this addressed, either in our code or in Clang. But we're not sure what to do on our end, so... a gentle ping for help!

kimgr added a comment.Jun 29 2017, 5:12 AM

I did some more debugging today. This happens when we attempt to analyze llvm/Support/MathExtras.h, and only for the function templates that use Microsoft intrinsics (e.g. _BitScanForward in TrailingZerosCounter<T>.) So there's something in the parsing of builtins/intrinsics that requires TUScope to be non-null.

Can we seed Sema with a valid TUScope before invoking LateTemplateParser, and if so, how? Or is this because we invoke the parser multiple times? I'm guessing Clang is already done parsing when we invoke the late template parsing.

Grateful for any ideas here.

kimgr added a comment.Jul 2 2017, 3:53 AM

only for the function templates that use Microsoft intrinsics (e.g. _BitScanForward in TrailingZerosCounter<T>.)
So there's something in the parsing of builtins/intrinsics that requires TUScope to be non-null.

For posterity, this was misdiagnosed on my part. It turns out the pp conditionals in MathExtras.h select the GCC-style intrinsics for Clang, even in MS compat mode (because _MSC_VER is consistently checked *after* __has_builtin and friends).

So the problem is really with GCC builtins inside function templates. Here's a minimal repro:

template <class T>
int myctz(unsigned int value) {
 return __builtin_ctz(value);
kimgr added a comment.Jun 7 2021, 1:32 PM

This was eventually fixed in IWYU. I believe the problem is/was:

  • After code is parsed and the AST is built, Sema resets its TUScope member to null
  • We use Sema to lookup and define default constructors before traversing the AST using a RAV
  • Some yet-not-fully-identified constructs cause Sema to need a TUScope for this (something to do with LazilyCreateBuiltin, but I haven't been able to construct a reduced testcase)

So in, we explicitly re-point Sema::TUScope to Sema::getCurScope(), which seems to have fixed the crash.

This revision can be closed.