This is an archive of the discontinued LLVM Phabricator instance.

[CodeView] Saturate values bigger than supported by APInt.
ClosedPublic

Authored by mizvekov on Jul 1 2021, 4:06 PM.

Details

Summary

This fixes an assert firing when compiling code which involves 128 bit
integrals.

This would trigger runtime checks similar to this:

Assertion failed: getMinSignedBits() <= 64 && "Too many bits for int64_t", file llvm/include/llvm/ADT/APInt.h, line 1646

To get around this, we just saturate those big values.

Diff Detail

Event Timeline

mizvekov created this revision.Jul 1 2021, 4:06 PM
mizvekov requested review of this revision.Jul 1 2021, 4:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2021, 4:06 PM

(I don't know the mechanism so will defer to rnk.)

MaskRay resigned from this revision.Jul 5 2021, 10:50 AM
rnk added a comment.Jul 7 2021, 10:01 AM

Please add a test case for this.

llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
3167

This avoids the crash, but I believe it will produce corrupt codeview records. I think it would be preferable to saturate the APSInt to the maximum representable encoded integer value in CodeViewRecordIO in the streaming and writing cases. There is prior art for this: long strings are truncated, for example.

mizvekov updated this revision to Diff 360643.Jul 21 2021, 4:43 PM
mizvekov edited the summary of this revision. (Show Details)
  • Fixes more issues.
  • Add test cases.
  • Add some FIXME notes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 21 2021, 4:43 PM
mizvekov retitled this revision from [CodeView] Skip emitting values bigger than supported by APInt. to [CodeView] Saturate values bigger than supported by APInt..Jul 21 2021, 4:44 PM
mizvekov updated this revision to Diff 360645.Jul 21 2021, 4:47 PM
  • Really git add the test case this time.

I decided it would be faster and more convenient to respond in the form of a code review for clang, so here it is: D106585

Assuming that goes through, go ahead and rebase onto that after it lands.

clang/test/CodeGenCXX/debug-info-codeview-int128.cpp
1 ↗(On Diff #360648)

FWIW this actually isn't codeview specific. The existing code crashes if you compile for DWARF too:
https://gcc.godbolt.org/z/3ox8c87Y4

To channel @dblaikie , this test should be split up: ensure that clang produces the right LLVM IR, then add a codeview specific LLVM IR test case. We prefer to test the smallest testable unit.

llvm/lib/DebugInfo/CodeView/CodeViewRecordIO.cpp
201

You can use Value.isSingleWord() for a simpler condition.

205

This can be Value.getLimitedValue().

279

I think this particular method is uncovered because of a separate bug. We always consider enumerators to be unsigned:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp#L2120

I'm not sure if that's a bug or if it is intended, but that's why this is relatively uncovered.

llvm/lib/IR/DIBuilder.cpp
249

Since DIEnumerator actually stores an APInt, can you add an overload that accepts an APSInt and call that from clang instead? The implementation will have to decompose the sign bit and slice out the APInt portion, but that seems like a better way of handling this.

I noticed the APInt change to DIEnumerator is new as of D62475, so April 2020. +@MaskRay @LemonBoy

mizvekov updated this revision to Diff 361527.Jul 25 2021, 1:06 PM
mizvekov marked an inline comment as done.
  • rebase on top of D106585
  • Reimplement test based on IR
mizvekov updated this revision to Diff 361543.Jul 25 2021, 3:03 PM
mizvekov marked 4 inline comments as done.
  • Expose and use APInt::isSingleWord.
  • Use ApInt::getLimitedValue.
rnk accepted this revision.Jul 26 2021, 12:30 PM

lgtm with minor comment adjustments

llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
3167

I don't think we need a TODO here. If Microsoft ever adds an encoding for large integers, the code change will be elsewhere, and in the process of testing, this size 10 array is bounds checked during the writing process, so we don't need this comment to remind us to change this code.

llvm/test/DebugInfo/COFF/integer-128.ll
23

(no AI) I see, in the future we should make this use the "streaming" output style so this comes out as .short .quad directives.

This revision is now accepted and ready to land.Jul 26 2021, 12:30 PM
mizvekov added inline comments.Jul 26 2021, 1:03 PM
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
3167

I agree the comment does not belong here, but I think we have encodings already, both for the value and the type.

CodeViewTypes.def:

CV_TYPE(LF_OCTWORD, 0x8017)
CV_TYPE(LF_UOCTWORD, 0x8018)

CodeViewDebug.cpp:

case dwarf::DW_ATE_signed:
  switch (ByteSize) {
  case 1:  STK = SimpleTypeKind::SignedCharacter; break;
  case 2:  STK = SimpleTypeKind::Int16Short;      break;
  case 4:  STK = SimpleTypeKind::Int32;           break;
  case 8:  STK = SimpleTypeKind::Int64Quad;       break;
  case 16: STK = SimpleTypeKind::Int128Oct;       break; // <-------
  }

Unless it would be risky to try to use them, if for example the Microsoft tooling would have trouble digesting it?

FWIW I would have considered adding this support, but then I found all those other problems, and time is a bit short until release branching.

This revision was landed with ongoing or failed builds.Jul 26 2021, 1:15 PM
This revision was automatically updated to reflect the committed changes.

I received this report from one of the bots: https://lab.llvm.org/buildbot/#/builders/5/builds/9633/steps/13/logs/stdio

`
FAIL: Clang :: CodeGenCXX/debug-info-enum-i128.cpp (5356 of 77963)
******************** TEST 'Clang :: CodeGenCXX/debug-info-enum-i128.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/clang -cc1 -internal-isystem /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/lib/clang/13.0.0/include -nostdsysteminc /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/CodeGenCXX/debug-info-enum-i128.cpp -triple x86_64-windows-msvc -gcodeview -debug-info-kind=limited -emit-llvm -o - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/CodeGenCXX/debug-info-enum-i128.cpp
: 'RUN: at line 2';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/clang -cc1 -internal-isystem /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/lib/clang/13.0.0/include -nostdsysteminc /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/CodeGenCXX/debug-info-enum-i128.cpp -triple x86_64-linux-gnu -debug-info-kind=limited -emit-llvm -o - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/CodeGenCXX/debug-info-enum-i128.cpp
--
Exit Code: 1
Command Output (stderr):
--
=================================================================
==22208==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x72240dd in operator new[](unsigned long) /b/s/w/ir/cache/builder/src/third_party/llvm/compiler-rt/lib/asan/asan_new_delete.cpp:98:3
    #1 0xe21510a in getMemory /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/APInt.cpp:46:10
    #2 0xe21510a in llvm::APInt::initSlowCase(llvm::APInt const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/APInt.cpp:87:12
    #3 0x164a460b in APInt /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/APInt.h:321:7
    #4 0x164a460b in APSInt /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/APSInt.h:22:22
    #5 0x164a460b in EnumConstantDecl /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/include/clang/AST/Decl.h:3048:62
    #6 0x164a460b in clang::EnumConstantDecl::Create(clang::ASTContext&, clang::EnumDecl*, clang::SourceLocation, clang::IdentifierInfo*, clang::QualType, clang::Expr*, llvm::APSInt const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/AST/Decl.cpp:4916:22
    #7 0x14605da3 in clang::Sema::CheckEnumConstant(clang::EnumDecl*, clang::EnumConstantDecl*, clang::SourceLocation, clang::IdentifierInfo*, clang::Expr*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaDecl.cpp:17905:10
    #8 0x14607d80 in clang::Sema::ActOnEnumConstant(clang::Scope*, clang::Decl*, clang::Decl*, clang::SourceLocation, clang::IdentifierInfo*, clang::ParsedAttributesView const&, clang::SourceLocation, clang::Expr*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaDecl.cpp:17970:5
    #9 0x13ead361 in clang::Parser::ParseEnumBody(clang::SourceLocation, clang::Decl*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4855:35
    #10 0x13e9f97b in clang::Parser::ParseEnumSpecifier(clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo const&, clang::AccessSpecifier, clang::Parser::DeclSpecContext) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4774:5
    #11 0x13e7e7e3 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo const&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:3999:7
    #12 0x13d47ec6 in clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributesWithRange&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:1045:3
    #13 0x13d47183 in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributesWithRange&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:1153:12
    #14 0x13d4305f in clang::Parser::ParseExternalDeclaration(clang::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:975:12
    #15 0x13d3bcb8 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:720:12
    #16 0x13d22d2c in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseAST.cpp:158:20
    #17 0x10592b4c in clang::CodeGenAction::ExecuteAction() /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:1044:30
    #18 0x102ed574 in clang::FrontendAction::Execute() /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:951:8
    #19 0x101044b0 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:974:33
    #20 0x105808e0 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:278:25
    #21 0x7237834 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/cc1_main.cpp:246:15
    #22 0x723059f in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:338:12
    #23 0x722e6f4 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:415:12
    #24 0x7fc40b7e809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
SUMMARY: AddressSanitizer: 32 byte(s) leaked in 2 allocation(s).
`

The error is happening on the test created for D106585 though.

rnk added a comment.Jul 26 2021, 3:42 PM

Yes, I filed https://bugs.llvm.org/show_bug.cgi?id=51221 for it. I plan to disable the test under asan for now.

Yes, I filed https://bugs.llvm.org/show_bug.cgi?id=51221 for it. I plan to disable the test under asan for now.

:-(

Thanks for taking care of it though!