Page MenuHomePhabricator

[DebugInfo] Enforce implicit constraints on `distinct` MDNodes
Needs ReviewPublic

Authored by scott.linder on Jun 23 2021, 5:12 PM.

Details

Summary

Add UNIQUED and DISTINCT properties in Metadata.def and use them to
implement restrictions on the distinct property of MDNodes:

  • DIExpression can currently be parsed from IR or read from bitcode as distinct, but this property is silently dropped when printing to IR. This causes accepted IR to fail to round-trip. As DIExpression appears inline at each use in the canonical form of IR, it cannot actually be distinct anyway, as there is no syntax to describe it.
  • Similarly, DIArgList is conceptually always uniqued. It is currently restricted to only appearing in contexts where there is no syntax for distinct, but for consistency it is treated equivalently to DIExpression in this patch.
  • DICompileUnit is already restricted to always being distinct, but along with adding general support for the inverse restriction I went ahead and described this in Metadata.def and updated the parser to be general. Future nodes which have this restriction can share this support.

The new UNIQUED property applies to DIExpression and DIArgList, and
forbids them to be distinct. It also implies they are canonically
printed inline at each use, rather than via MDNode ID.

The new DISTINCT property applies to DICompileUnit, and requires it to
be distinct.

A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function. For example, we would forbid:

!named = !{!0}
!0 = !DIExpression()

Instead we would only accept the equivalent inlined version:

!named = !{!DIExpression()}

This essentially removes the ability to create a distinct DIExpression
by construction, as there is no syntax for distinct inline. If this
patch is accepted as-is, the result would be that the non-canonical
version is accepted, but the following would be an error and produce a diagnostic:

!named = !{!0}
; error: 'distinct' not allowed for !DIExpression()
!0 = distinct !DIExpression()

Also update some documentation to consistently use the inline syntax for
DIExpression, and to describe the restrictions on distinct for nodes
where applicable.

Diff Detail

Unit TestsFailed

TimeTest
3,260 msx64 debian > libarcher.critical::critical.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/critical.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/critical.c
3,380 msx64 debian > libarcher.critical::lock-nested.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock-nested.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock-nested.c
3,460 msx64 debian > libarcher.critical::lock.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock.c
3,780 msx64 debian > libarcher.parallel::parallel-simple2.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/parallel/parallel-simple2.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-simple2.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-simple2.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-simple2.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/parallel/parallel-simple2.c
3,620 msx64 debian > libarcher.races::critical-unrelated.c
Script: -- : 'RUN: at line 13'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/critical-unrelated.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/deflake.bash /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/critical-unrelated.c
View Full Test Results (22 Failed)

Event Timeline

scott.linder created this revision.Jun 23 2021, 5:12 PM
scott.linder requested review of this revision.Jun 23 2021, 5:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2021, 5:12 PM

This is my attempt to address a bug with distinct DIExpressions being accepted by the parser, but not round-tripping.

In general it seemed like it would be nice to have high level properties of the MDNode types all in Metadata.def, and it also seemed nice to make the "inlined" property depend on the fact that a node is always uniqued, as printing a distinct node inline seems impossible (or at least misleading) unless we invent some new syntax for it.

I got all the way to the MIR parser changes and then wasn't sure why e.g. DILocation is handled specially. I stopped before pushing the changes through there in order to get feedback on the general approach first; any comments are welcomed!

scott.linder edited the summary of this revision. (Show Details)Jun 23 2021, 5:17 PM
t-tye added inline comments.Jun 23 2021, 6:50 PM
llvm/include/llvm/IR/Metadata.def
21

Seems uniqued nodes now are always rendered inline and the other nodes are always rendered using IDs.

May be worth explaining that here since it is now describing these different cases.

77–103

Use single space for consistency with comment above. Same for following comment,

llvm/lib/AsmParser/LLParser.cpp
752

So are all UNIQUED nodes are now allowed to be inline?

scott.linder added inline comments.Jun 24 2021, 11:41 AM
llvm/include/llvm/IR/Metadata.def
21

Yes, I agree explaining that here would be helpful. I think I decided to combine the two concepts (always-uniqued, and printed-inline) early on when drafting the patch, and by the end I never actually wrote it down anywhere :^)

An alternative is to split the two up, but it seemed nice to have one consistent property that unifies printed-inline nodes. If anyone suspects this will not work out longer term I can also split up the concepts, although I would think at the very least we would want printed-inline to always imply always-uniqued.

scott.linder added inline comments.Jun 24 2021, 12:13 PM
llvm/lib/AsmParser/LLParser.cpp
752

Yes, that is the intent. In addition, all UNIQUED nodes are canonically printed inline.

scott.linder edited the summary of this revision. (Show Details)

Address feedback

scott.linder marked 3 inline comments as done.Jun 24 2021, 12:18 PM

I tried to describe the fact that UNIQUED implies printed-inline, both in Metadata.def and the commit message. Let me know what you think about the wording.

t-tye accepted this revision.Jun 24 2021, 2:03 PM

LGTM Thanks for documenting the differences.

This revision is now accepted and ready to land.Jun 24 2021, 2:03 PM

Implement inline case in AssemblyWriter::printNamedMDNode based on UNIQUED-ness.

I'm not sure my overall approach to this patch is actually the best way to
proceed. The biggest issue is that DIArgList is more particular about where it
can appear, so most of this generic support for UNIQUED nodes is just dead code
for DIArgList. It still seems like having a lot of if (isa<Foo>) ...;
littered around the IR parser/printer, bitcode reader/writer, MIR
reader/writer, etc. to implement inline nodes is more error prone, and we still
need to fix the bug with distinct either way.

Does anyone have thoughts on what approach to take?

StephenTozer accepted this revision.Jun 25 2021, 6:12 AM

Good patch, this is a good way of generifying the logic for inline metadata - and improving the documentation at the same time, which is a big plus!

I'm not sure my overall approach to this patch is actually the best way to
proceed. The biggest issue is that DIArgList is more particular about where it
can appear, so most of this generic support for UNIQUED nodes is just dead code
for DIArgList.

I think this is mostly unavoidable as-is; this patch does nicely take care of the inline printing/parsing logic, trying to also simplify the function-local aspects would be out of scope. In that regard, the class that DIArgList shares its behaviour with is ValueAsMetadata/LocalAsMetadata. Unfortunately there is no easy way to generify that logic, as ValueAsMetadata is a unique case since it is printed as i32 %value rather than !ClassName(...). And I don't think I'd quite call the generic support "dead code" for DIArgList - it may be an error in all cases except when parsed as a function operand, but it still does need to be handled, and if some change in future affects how we print/parse inline metadata then that change must still apply equally to both DIExpression and DIArgList, which this patch facilitates.

A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function.

Personally, I could go either way with this. Currently, DIExpression does not actually use function state in any way, but will still only ever be used in the context of a function. In particular, I'm not sure what value there could ever be in using a DIExpression in a named metadata node. It seems that in principle it would be better suited to being explicitly function-local metadata, but I'm not sure if it's worth putting in the effort to make the change when doing so doesn't grant us much benefit and also breaks parsing for old IR that contains ID'd DIExpressions.

Assert based on UNIQUED/DISTINCT-ness in ModuleBitcodeWriter

This revision was landed with ongoing or failed builds.Jun 28 2021, 2:20 PM
This revision was automatically updated to reflect the committed changes.

Hi,

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.
E.g if I create foo.bc with a clang binary built from commit 8cd35ad854ab (this patch) with:

clang -save-temps foo.c -S -g

and then try to run opt built from aad87328fab (the commit before this patch) with

opt foo.bc -S > /dev/null

it results in

opt: ../lib/Bitcode/Reader/MetadataLoader.cpp:366: (anonymous namespace)::(anonymous namespace)::PlaceholderQueue::~PlaceholderQueue(): Assertion `empty() && "PlaceholderQueue hasn't been flushed before being destroyed"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build-all-builtins/bin/opt foo.bc -S
 #0 0x0000000002a11623 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all-builtins/bin/opt+0x2a11623)
 #1 0x0000000002a0f2de llvm::sys::RunSignalHandlers() (build-all-builtins/bin/opt+0x2a0f2de)
 #2 0x0000000002a119a6 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f2c97079630 __restore_rt sigaction.c:0:0
 #4 0x00007f2c947ac387 raise (/lib64/libc.so.6+0x36387)
 #5 0x00007f2c947ada78 abort (/lib64/libc.so.6+0x37a78)
 #6 0x00007f2c947a51a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #7 0x00007f2c947a5252 (/lib64/libc.so.6+0x2f252)
 #8 0x0000000003398a64 llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) (build-all-builtins/bin/opt+0x3398a64)
 #9 0x00000000033a19ac llvm::MetadataLoader::parseMetadata(bool) (build-all-builtins/bin/opt+0x33a19ac)
#10 0x00000000033730a8 (anonymous namespace)::BitcodeReader::parseFunctionBody(llvm::Function*) BitcodeReader.cpp:0:0
#11 0x0000000003370f7c (anonymous namespace)::BitcodeReader::materialize(llvm::GlobalValue*) BitcodeReader.cpp:0:0
#12 0x00000000033720aa (anonymous namespace)::BitcodeReader::materializeModule() BitcodeReader.cpp:0:0
#13 0x000000000221fc8a llvm::Module::materializeAll() (build-all-builtins/bin/opt+0x221fc8a)
#14 0x000000000336acb1 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336acb1)
#15 0x000000000336ef60 llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336ef60)
#16 0x00000000024521a0 llvm::parseIR(llvm::MemoryBufferRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x24521a0)
#17 0x000000000245262e llvm::parseIRFile(llvm::StringRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x245262e)
#18 0x0000000000782acb main (build-all-builtins/bin/opt+0x782acb)
#19 0x00007f2c94798555 __libc_start_main (/lib64/libc.so.6+0x22555)
#20 0x000000000076f800 _start (build-all-builtins/bin/opt+0x76f800)
Abort

I suppose this is not expected?

That looks bad — best to revert while that's investigated.

That looks bad — best to revert while that's investigated.

This also breaks lang/cpp/member-and-local-vars-with-same-name/TestMembersAndLocalsWithSameName.py in the LLDB test suite:

Assertion failed: ((FromValue || !(isa<LocalAsMetadata>(MD) || isa<DIArgList>(MD))) && "Unexpected function-local metadata outside of value argument"), function WriteAsOperandInternal, file AsmWriter.cpp, line 2519.

Given that it's Friday, I'm going to go ahead and revert this to get our bot back to green.

Hi,

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.
E.g if I create foo.bc with a clang binary built from commit 8cd35ad854ab (this patch) with:

clang -save-temps foo.c -S -g

and then try to run opt built from aad87328fab (the commit before this patch) with

opt foo.bc -S > /dev/null

it results in

opt: ../lib/Bitcode/Reader/MetadataLoader.cpp:366: (anonymous namespace)::(anonymous namespace)::PlaceholderQueue::~PlaceholderQueue(): Assertion `empty() && "PlaceholderQueue hasn't been flushed before being destroyed"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build-all-builtins/bin/opt foo.bc -S
 #0 0x0000000002a11623 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all-builtins/bin/opt+0x2a11623)
 #1 0x0000000002a0f2de llvm::sys::RunSignalHandlers() (build-all-builtins/bin/opt+0x2a0f2de)
 #2 0x0000000002a119a6 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f2c97079630 __restore_rt sigaction.c:0:0
 #4 0x00007f2c947ac387 raise (/lib64/libc.so.6+0x36387)
 #5 0x00007f2c947ada78 abort (/lib64/libc.so.6+0x37a78)
 #6 0x00007f2c947a51a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #7 0x00007f2c947a5252 (/lib64/libc.so.6+0x2f252)
 #8 0x0000000003398a64 llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) (build-all-builtins/bin/opt+0x3398a64)
 #9 0x00000000033a19ac llvm::MetadataLoader::parseMetadata(bool) (build-all-builtins/bin/opt+0x33a19ac)
#10 0x00000000033730a8 (anonymous namespace)::BitcodeReader::parseFunctionBody(llvm::Function*) BitcodeReader.cpp:0:0
#11 0x0000000003370f7c (anonymous namespace)::BitcodeReader::materialize(llvm::GlobalValue*) BitcodeReader.cpp:0:0
#12 0x00000000033720aa (anonymous namespace)::BitcodeReader::materializeModule() BitcodeReader.cpp:0:0
#13 0x000000000221fc8a llvm::Module::materializeAll() (build-all-builtins/bin/opt+0x221fc8a)
#14 0x000000000336acb1 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336acb1)
#15 0x000000000336ef60 llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336ef60)
#16 0x00000000024521a0 llvm::parseIR(llvm::MemoryBufferRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x24521a0)
#17 0x000000000245262e llvm::parseIRFile(llvm::StringRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x245262e)
#18 0x0000000000782acb main (build-all-builtins/bin/opt+0x782acb)
#19 0x00007f2c94798555 __libc_start_main (/lib64/libc.so.6+0x22555)
#20 0x000000000076f800 _start (build-all-builtins/bin/opt+0x76f800)
Abort

I suppose this is not expected?

Thank you for letting me know, and for the reproducer! Curiously this is just on the return error(...) path in upgradeDIExpression, which I just didn't test (as I'm not sure how to in the current LLVM infrastructure). I assume this is because normally bitcode error returns occur after most of the bitcode file has already been read, so there are no pending placeholders, but I need to learn some more about the bitcode parser in general to be sure I understand the issue. My first thought is that we should just flush placeholders on an error, as we are about to just bail anyway.

That looks bad — best to revert while that's investigated.

This also breaks lang/cpp/member-and-local-vars-with-same-name/TestMembersAndLocalsWithSameName.py in the LLDB test suite:

Assertion failed: ((FromValue || !(isa<LocalAsMetadata>(MD) || isa<DIArgList>(MD))) && "Unexpected function-local metadata outside of value argument"), function WriteAsOperandInternal, file AsmWriter.cpp, line 2519.

Given that it's Friday, I'm going to go ahead and revert this to get our bot back to green.

Thank you for reverting, I was out Friday+Monday. I thought I was safe to move the assert that is triggering here up further in the function body to make it stronger, but evidently I overlooked some cases. I'll identify them and either fix them elsewhere or restore the assert (possibly with a comment describing the rationale). Thanks again!

Hi,

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.
E.g if I create foo.bc with a clang binary built from commit 8cd35ad854ab (this patch) with:

clang -save-temps foo.c -S -g

and then try to run opt built from aad87328fab (the commit before this patch) with

opt foo.bc -S > /dev/null

it results in

opt: ../lib/Bitcode/Reader/MetadataLoader.cpp:366: (anonymous namespace)::(anonymous namespace)::PlaceholderQueue::~PlaceholderQueue(): Assertion `empty() && "PlaceholderQueue hasn't been flushed before being destroyed"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build-all-builtins/bin/opt foo.bc -S
 #0 0x0000000002a11623 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all-builtins/bin/opt+0x2a11623)
 #1 0x0000000002a0f2de llvm::sys::RunSignalHandlers() (build-all-builtins/bin/opt+0x2a0f2de)
 #2 0x0000000002a119a6 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f2c97079630 __restore_rt sigaction.c:0:0
 #4 0x00007f2c947ac387 raise (/lib64/libc.so.6+0x36387)
 #5 0x00007f2c947ada78 abort (/lib64/libc.so.6+0x37a78)
 #6 0x00007f2c947a51a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #7 0x00007f2c947a5252 (/lib64/libc.so.6+0x2f252)
 #8 0x0000000003398a64 llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) (build-all-builtins/bin/opt+0x3398a64)
 #9 0x00000000033a19ac llvm::MetadataLoader::parseMetadata(bool) (build-all-builtins/bin/opt+0x33a19ac)
#10 0x00000000033730a8 (anonymous namespace)::BitcodeReader::parseFunctionBody(llvm::Function*) BitcodeReader.cpp:0:0
#11 0x0000000003370f7c (anonymous namespace)::BitcodeReader::materialize(llvm::GlobalValue*) BitcodeReader.cpp:0:0
#12 0x00000000033720aa (anonymous namespace)::BitcodeReader::materializeModule() BitcodeReader.cpp:0:0
#13 0x000000000221fc8a llvm::Module::materializeAll() (build-all-builtins/bin/opt+0x221fc8a)
#14 0x000000000336acb1 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336acb1)
#15 0x000000000336ef60 llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336ef60)
#16 0x00000000024521a0 llvm::parseIR(llvm::MemoryBufferRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x24521a0)
#17 0x000000000245262e llvm::parseIRFile(llvm::StringRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x245262e)
#18 0x0000000000782acb main (build-all-builtins/bin/opt+0x782acb)
#19 0x00007f2c94798555 __libc_start_main (/lib64/libc.so.6+0x22555)
#20 0x000000000076f800 _start (build-all-builtins/bin/opt+0x76f800)
Abort

I suppose this is not expected?

This seems to be a long-standing bug with the bitcode reader, where every return error("Invalid record") in llvm::MetadataLoader::MetadataLoaderImpl::parseOneMetadata is untested code that actually just triggers an assert in PlaceholderQueue::~PlaceholderQueue. For example, the last version bump for DIExpression has the same issue:

user@host:~/llvm-project/main:main$ git switch --detach ffc498dfcc8d67e6e9acaebb690377f674457ab4
Updating files: 100% (101031/101031), done.
HEAD is now at ffc498dfcc8d Align definition of DW_OP_plus with DWARF spec [3/3]
user@host:~/llvm-project/main:(ffc498dfcc8d...)$ ninja -C release/ clang
ninja: Entering directory `release/'
...
[2569/2569] Creating executable symlink bin/clang
user@host:~/llvm-project/main:(ffc498dfcc8d...)$ echo 'void f(int x) {}' | release/bin/clang -x c - -c -emit-llvm -o- -g >
../f.bc
user@host:~/llvm-project/main:(ffc498dfcc8d...)$ git switch --detach HEAD^
Previous HEAD position was ffc498dfcc8d Align definition of DW_OP_plus with DWARF spec [3/3]
HEAD is now at 3f250380d4bd Corrected some comment typos; NFC.
user@host:~/llvm-project/main:(3f250380d4bd...)$ ninja -C release/ opt
ninja: Entering directory `release/'
...
[149/149] Linking CXX executable bin/opt
user@host:~/llvm-project/main:(3f250380d4bd...)$ release/bin/opt ../f.bc -S
opt: /home/slinder1/llvm-project/main/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:363: (anonymous namespace)::(anonymous namespace):
:PlaceholderQueue::~PlaceholderQueue(): Assertion `empty() && "PlaceholderQueue hasn't been flushed before being destroyed"' failed
.
#0 0x0000000002aab1b4 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
#1 0x0000000002aab4f6 SignalHandler(int) Signals.cpp:0:0
#2 0x00007f56a9e753c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
#3 0x00007f56a970718b raise /build/glibc-eX1tMB/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#4 0x00007f56a96e6859 abort /build/glibc-eX1tMB/glibc-2.31/stdlib/abort.c:81:7
#5 0x00007f56a96e6729 get_sysdep_segment_value /build/glibc-eX1tMB/glibc-2.31/intl/loadmsgcat.c:509:8
#6 0x00007f56a96e6729 _nl_load_domain /build/glibc-eX1tMB/glibc-2.31/intl/loadmsgcat.c:970:34
#7 0x00007f56a96f7f36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
#8 0x0000000002f8e298 llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) (release/bin/opt+0x2f8e298)
#9 0x0000000002f94dcc llvm::MetadataLoader::parseMetadata(bool) (release/bin/opt+0x2f94dcc)
#10 0x0000000002f6ecc6 (anonymous namespace)::BitcodeReader::parseFunctionBody(llvm::Function*) BitcodeReader.cpp:0:0
#11 0x0000000002f6cf29 (anonymous namespace)::BitcodeReader::materialize(llvm::GlobalValue*) BitcodeReader.cpp:0:0
#12 0x0000000002f6dc1a (anonymous namespace)::BitcodeReader::materializeModule() BitcodeReader.cpp:0:0
#13 0x00000000025a7dda llvm::Module::materializeAll() (release/bin/opt+0x25a7dda)
#14 0x0000000002f66dd3 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool) (release/bin/opt+0x2f66dd3)
#15 0x0000000002f6a519 llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&) (release/bin/opt+0x2f6a519)
#16 0x00000000026c7cf1 llvm::parseIR(llvm::MemoryBufferRef, llvm::SMDiagnostic&, llvm::LLVMContext&) (release/bin/opt+0x26c7cf1)
#17 0x00000000026c81e0 llvm::parseIRFile(llvm::StringRef, llvm::SMDiagnostic&, llvm::LLVMContext&) (release/bin/opt+0x26c81e0)
#18 0x00000000014d65fc main (release/bin/opt+0x14d65fc)
#19 0x00007f56a96e80b3 __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:342:3
#20 0x00000000014cc5ae _start (release/bin/opt+0x14cc5ae)
Stack dump:
0.      Program arguments: release/bin/opt ../f.bc -S
Aborted (core dumped)
user@host:~/llvm-project/main:(3f250380d4bd...)$

Again, my guess is that we just want to not assert in this case, and "clear" the PlaceholderQueue as none of the Metadata we have parsed will be used.

The issue of testing newly-encoded bitcode against previous versions of the loader seems like a trickier issue, though. In fact, it may be tricky to test the fix for this assert at all, as I assume it requires manipulating bitcode files to make them invalid in some way that would trigger these return("Invalid record")s.

Any thoughts on where to go from here, and whether this patch should be blocked on these fixes?

That looks bad — best to revert while that's investigated.

Oops... I had misread this comment:

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.

I don't think we have any guarantees about reading new bitcode with old readers. There's some machinery for a high-level version check to get a nice error (checking MODULE_CODE_VERSION) but I think that's it. (I thought this was saying the new reader couldn't read the old bitcode, in which case there'd be a missing bitcode upgrade.)

Any thoughts on where to go from here, and whether this patch should be blocked on these fixes?

Assuming IIUC now, the patch can probably land as-is... but probably makes sense to investigate the LLDB issue first. @JDevlieghere, could the failing LLDB test somehow be reading a new bitcode file with an old reader, or is that something different?

That looks bad — best to revert while that's investigated.

Oops... I had misread this comment:

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.

I don't think we have any guarantees about reading new bitcode with old readers. There's some machinery for a high-level version check to get a nice error (checking MODULE_CODE_VERSION) but I think that's it. (I thought this was saying the new reader couldn't read the old bitcode, in which case there'd be a missing bitcode upgrade.)

Hm, I often do this and I can't remember it failing before. E.g. if I run into a crash in opt, I usually save the bc file and bisect into old versions to see with what opt version the crash starts appearing. This usually works very well so I was surprised and considered it a bug now when it didn't work.

Any thoughts on where to go from here, and whether this patch should be blocked on these fixes?

Assuming IIUC now, the patch can probably land as-is... but probably makes sense to investigate the LLDB issue first. @JDevlieghere, could the failing LLDB test somehow be reading a new bitcode file with an old reader, or is that something different?

I've seen the same crash in our testing and then there was no "new bitcode with old reader" stuff involved, it was just a normal run crashing in somewhere llc (for our out of tree target).

scott.linder added inline comments.Thu, Jul 22, 2:59 PM
llvm/lib/IR/AsmWriter.cpp
4702

I think I've identified the source of the bug causing the assert on GreenDragon: I didn't intend to remove this FromValue=true argument. I believe the assert can still be hoisted up, so I'll post an updated patch which just restores this line.

scott.linder reopened this revision.Fri, Jul 23, 11:04 AM
This revision is now accepted and ready to land.Fri, Jul 23, 11:04 AM

Fix bug in printMetadataImpl

scott.linder requested review of this revision.Fri, Jul 23, 11:05 AM