This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Fix crash in bugprone-bad-signal-to-kill-thread clang-tidy check.
ClosedPublic

Authored by ArcsinX on Aug 6 2020, 3:06 AM.

Details

Summary

Inside clangd, clang-tidy checks don't see preprocessor events in the preamble.
This leads to Token::PtrData == nullptr for tokens that the macro is defined to.
E.g. #define SIGTERM 15:

  • Token::Kind == tok::numeric_constant (Token::isLiteral() == true)
  • Token::UintData == 2
  • Token::PtrData == nullptr

As the result of this, bugprone-bad-signal-to-kill-thread check crashes at null-dereference inside clangd.

Diff Detail

Event Timeline

ArcsinX created this revision.Aug 6 2020, 3:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 6 2020, 3:06 AM
ArcsinX requested review of this revision.Aug 6 2020, 3:06 AM
hokein accepted this revision.Aug 6 2020, 6:52 AM

Thanks, this is a nice catch.

Looks like NotNullTerminatedResultCheck.cpp also has this pattern, we may want to fix that as well.

clang-tools-extra/clangd/unittests/DiagnosticsTests.cpp
441

ClangTidyBadSignalToKillThread doesn't seen ti provide much information about what is testcase testing. Maybe NoLiteralDataInMacroToken?

This revision is now accepted and ready to land.Aug 6 2020, 6:52 AM
ArcsinX updated this revision to Diff 283600.Aug 6 2020, 7:17 AM

Test rename: ClangTidyBadSignalToKillThread => ClangTidyNoLiteralDataInMacroToken

ArcsinX added inline comments.Aug 6 2020, 7:23 AM
clang-tools-extra/clangd/unittests/DiagnosticsTests.cpp
441

Thanks, renamed. But I want to keep ClangTidy prefix because the problem related with clang-tidy limitations inside clangd.

Looks like NotNullTerminatedResultCheck.cpp also has this pattern, we may want to fix that as well.

Yes, you are right. I will fix this in the next patch.

This revision was landed with ongoing or failed builds.Aug 6 2020, 11:46 AM
This revision was automatically updated to reflect the committed changes.

Looks like NotNullTerminatedResultCheck.cpp also has this pattern, we may want to fix that as well.

Yes, you are right. I will fix this in the next patch.

@hokein Sorry for bother you. Seems bugprone-not-null-terminated-result has the same 3 problems as bugprone-bad-signal-to-kill-thread:

  • #undef __STDC_WANT_LIB_EXT1__
  • #define __STDC_WANT_LIB_EXT1__ ((unsigned)1)
  • Token::PtrData == nullptr inside clangd

Do you expect separate 3 patches for this (as we did for bugprone-bad-signal-to-kill-thread) ?

Looks like NotNullTerminatedResultCheck.cpp also has this pattern, we may want to fix that as well.

Yes, you are right. I will fix this in the next patch.

@hokein Sorry for bother you. Seems bugprone-not-null-terminated-result has the same 3 problems as bugprone-bad-signal-to-kill-thread:

  • #undef __STDC_WANT_LIB_EXT1__
  • #define __STDC_WANT_LIB_EXT1__ ((unsigned)1)
  • Token::PtrData == nullptr inside clangd

Do you expect separate 3 patches for this (as we did for bugprone-bad-signal-to-kill-thread) ?

yeah, that would be nice. Thanks for digging into this.

Thinking more about this,

Inside clangd, clang-tidy checks don't see preprocessor events in the preamble.
This leads to Token::PtrData == nullptr for tokens that the macro is defined to.
E.g. #define SIGTERM 15:

Token::Kind == tok::numeric_constant (Token::isLiteral() == true)
Token::UintData == 2
Token::PtrData == nullptr

The token is in a pretty-broken state. Do you know why the UintData is set to 2, I suppose this is the length of the macro definition (15). If the PtrData is nullptr, I'd expect the UintData is 0.

Thinking more about this,

Inside clangd, clang-tidy checks don't see preprocessor events in the preamble.
This leads to Token::PtrData == nullptr for tokens that the macro is defined to.
E.g. #define SIGTERM 15:

Token::Kind == tok::numeric_constant (Token::isLiteral() == true)
Token::UintData == 2
Token::PtrData == nullptr

The token is in a pretty-broken state. Do you know why the UintData is set to 2, I suppose this is the length of the macro definition (15). If the PtrData is nullptr, I'd expect the UintData is 0.

I think it's here:

Token ASTReader::ReadToken(ModuleFile &F, const RecordDataImpl &Record,
                           unsigned &Idx) {
  Token Tok;
  Tok.startToken();
  Tok.setLocation(ReadSourceLocation(F, Record, Idx));
  Tok.setLength(Record[Idx++]);
  if (IdentifierInfo *II = getLocalIdentifier(F, Record[Idx++]))
    Tok.setIdentifierInfo(II);
  Tok.setKind((tok::TokenKind)Record[Idx++]);
  Tok.setFlag((Token::TokenFlags)Record[Idx++]);
  return Tok;
}

So, we set Token::UintData via Token::setLength() at preamble read, but do not set Token::PtrData without preprocessor events.

hokein added a comment.Aug 7 2020, 5:54 AM

Thinking more about this,

Inside clangd, clang-tidy checks don't see preprocessor events in the preamble.
This leads to Token::PtrData == nullptr for tokens that the macro is defined to.
E.g. #define SIGTERM 15:

Token::Kind == tok::numeric_constant (Token::isLiteral() == true)
Token::UintData == 2
Token::PtrData == nullptr

The token is in a pretty-broken state. Do you know why the UintData is set to 2, I suppose this is the length of the macro definition (15). If the PtrData is nullptr, I'd expect the UintData is 0.

I think it's here:

Token ASTReader::ReadToken(ModuleFile &F, const RecordDataImpl &Record,
                           unsigned &Idx) {
  Token Tok;
  Tok.startToken();
  Tok.setLocation(ReadSourceLocation(F, Record, Idx));
  Tok.setLength(Record[Idx++]);
  if (IdentifierInfo *II = getLocalIdentifier(F, Record[Idx++]))
    Tok.setIdentifierInfo(II);
  Tok.setKind((tok::TokenKind)Record[Idx++]);
  Tok.setFlag((Token::TokenFlags)Record[Idx++]);
  return Tok;
}

So, we set Token::UintData via Token::setLength() at preamble read, but do not set Token::PtrData without preprocessor events.

I see, thanks! I think this is a missing feature in AST serialization, see https://github.com/llvm/llvm-project/blob/master/clang/lib/Serialization/ASTWriter.cpp#L4260-L4261.