This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Enforce implicit constraints on `distinct` MDNodes
Needs ReviewPublic

Authored by scott.linder on Jun 23 2021, 5:12 PM.

Details

Summary

Add UNIQUED and DISTINCT properties in Metadata.def and use them to
implement restrictions on the distinct property of MDNodes:

  • DIExpression can currently be parsed from IR or read from bitcode as distinct, but this property is silently dropped when printing to IR. This causes accepted IR to fail to round-trip. As DIExpression appears inline at each use in the canonical form of IR, it cannot actually be distinct anyway, as there is no syntax to describe it.
  • DICompileUnit is already restricted to always being distinct, but along with adding general support for the inverse restriction I went ahead and described this in Metadata.def and updated the parser to be general. Future nodes which have this restriction can share this support.

The new UNIQUED property applies to DIExpression, and forbids it to be
distinct. It also implies it are canonically printed inline at each
use, rather than via MDNode ID.

The new DISTINCT property applies to DICompileUnit, and requires it to
be distinct.

A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function. For example, we would forbid:

!named = !{!0}
!0 = !DIExpression()

Instead we would only accept the equivalent inlined version:

!named = !{!DIExpression()}

This essentially removes the ability to create a distinct DIExpression
by construction, as there is no syntax for distinct inline. If this
patch is accepted as-is, the result would be that the non-canonical
version is accepted, but the following would be an error and produce a diagnostic:

!named = !{!0}
; error: 'distinct' not allowed for !DIExpression()
!0 = distinct !DIExpression()

Also update some documentation to consistently use the inline syntax for
DIExpression, and to describe the restrictions on distinct for nodes
where applicable.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
t-tye accepted this revision.Jun 24 2021, 2:03 PM

LGTM Thanks for documenting the differences.

This revision is now accepted and ready to land.Jun 24 2021, 2:03 PM

Implement inline case in AssemblyWriter::printNamedMDNode based on UNIQUED-ness.

I'm not sure my overall approach to this patch is actually the best way to
proceed. The biggest issue is that DIArgList is more particular about where it
can appear, so most of this generic support for UNIQUED nodes is just dead code
for DIArgList. It still seems like having a lot of if (isa<Foo>) ...;
littered around the IR parser/printer, bitcode reader/writer, MIR
reader/writer, etc. to implement inline nodes is more error prone, and we still
need to fix the bug with distinct either way.

Does anyone have thoughts on what approach to take?

StephenTozer accepted this revision.Jun 25 2021, 6:12 AM

Good patch, this is a good way of generifying the logic for inline metadata - and improving the documentation at the same time, which is a big plus!

I'm not sure my overall approach to this patch is actually the best way to
proceed. The biggest issue is that DIArgList is more particular about where it
can appear, so most of this generic support for UNIQUED nodes is just dead code
for DIArgList.

I think this is mostly unavoidable as-is; this patch does nicely take care of the inline printing/parsing logic, trying to also simplify the function-local aspects would be out of scope. In that regard, the class that DIArgList shares its behaviour with is ValueAsMetadata/LocalAsMetadata. Unfortunately there is no easy way to generify that logic, as ValueAsMetadata is a unique case since it is printed as i32 %value rather than !ClassName(...). And I don't think I'd quite call the generic support "dead code" for DIArgList - it may be an error in all cases except when parsed as a function operand, but it still does need to be handled, and if some change in future affects how we print/parse inline metadata then that change must still apply equally to both DIExpression and DIArgList, which this patch facilitates.

A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function.

Personally, I could go either way with this. Currently, DIExpression does not actually use function state in any way, but will still only ever be used in the context of a function. In particular, I'm not sure what value there could ever be in using a DIExpression in a named metadata node. It seems that in principle it would be better suited to being explicitly function-local metadata, but I'm not sure if it's worth putting in the effort to make the change when doing so doesn't grant us much benefit and also breaks parsing for old IR that contains ID'd DIExpressions.

Assert based on UNIQUED/DISTINCT-ness in ModuleBitcodeWriter

This revision was landed with ongoing or failed builds.Jun 28 2021, 2:20 PM
This revision was automatically updated to reflect the committed changes.

Hi,

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.
E.g if I create foo.bc with a clang binary built from commit 8cd35ad854ab (this patch) with:

clang -save-temps foo.c -S -g

and then try to run opt built from aad87328fab (the commit before this patch) with

opt foo.bc -S > /dev/null

it results in

opt: ../lib/Bitcode/Reader/MetadataLoader.cpp:366: (anonymous namespace)::(anonymous namespace)::PlaceholderQueue::~PlaceholderQueue(): Assertion `empty() && "PlaceholderQueue hasn't been flushed before being destroyed"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build-all-builtins/bin/opt foo.bc -S
 #0 0x0000000002a11623 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all-builtins/bin/opt+0x2a11623)
 #1 0x0000000002a0f2de llvm::sys::RunSignalHandlers() (build-all-builtins/bin/opt+0x2a0f2de)
 #2 0x0000000002a119a6 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f2c97079630 __restore_rt sigaction.c:0:0
 #4 0x00007f2c947ac387 raise (/lib64/libc.so.6+0x36387)
 #5 0x00007f2c947ada78 abort (/lib64/libc.so.6+0x37a78)
 #6 0x00007f2c947a51a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #7 0x00007f2c947a5252 (/lib64/libc.so.6+0x2f252)
 #8 0x0000000003398a64 llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) (build-all-builtins/bin/opt+0x3398a64)
 #9 0x00000000033a19ac llvm::MetadataLoader::parseMetadata(bool) (build-all-builtins/bin/opt+0x33a19ac)
#10 0x00000000033730a8 (anonymous namespace)::BitcodeReader::parseFunctionBody(llvm::Function*) BitcodeReader.cpp:0:0
#11 0x0000000003370f7c (anonymous namespace)::BitcodeReader::materialize(llvm::GlobalValue*) BitcodeReader.cpp:0:0
#12 0x00000000033720aa (anonymous namespace)::BitcodeReader::materializeModule() BitcodeReader.cpp:0:0
#13 0x000000000221fc8a llvm::Module::materializeAll() (build-all-builtins/bin/opt+0x221fc8a)
#14 0x000000000336acb1 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336acb1)
#15 0x000000000336ef60 llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336ef60)
#16 0x00000000024521a0 llvm::parseIR(llvm::MemoryBufferRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x24521a0)
#17 0x000000000245262e llvm::parseIRFile(llvm::StringRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x245262e)
#18 0x0000000000782acb main (build-all-builtins/bin/opt+0x782acb)
#19 0x00007f2c94798555 __libc_start_main (/lib64/libc.so.6+0x22555)
#20 0x000000000076f800 _start (build-all-builtins/bin/opt+0x76f800)
Abort

I suppose this is not expected?

That looks bad โ€” best to revert while that's investigated.

That looks bad โ€” best to revert while that's investigated.

This also breaks lang/cpp/member-and-local-vars-with-same-name/TestMembersAndLocalsWithSameName.py in the LLDB test suite:

Assertion failed: ((FromValue || !(isa<LocalAsMetadata>(MD) || isa<DIArgList>(MD))) && "Unexpected function-local metadata outside of value argument"), function WriteAsOperandInternal, file AsmWriter.cpp, line 2519.

Given that it's Friday, I'm going to go ahead and revert this to get our bot back to green.

Hi,

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.
E.g if I create foo.bc with a clang binary built from commit 8cd35ad854ab (this patch) with:

clang -save-temps foo.c -S -g

and then try to run opt built from aad87328fab (the commit before this patch) with

opt foo.bc -S > /dev/null

it results in

opt: ../lib/Bitcode/Reader/MetadataLoader.cpp:366: (anonymous namespace)::(anonymous namespace)::PlaceholderQueue::~PlaceholderQueue(): Assertion `empty() && "PlaceholderQueue hasn't been flushed before being destroyed"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build-all-builtins/bin/opt foo.bc -S
 #0 0x0000000002a11623 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all-builtins/bin/opt+0x2a11623)
 #1 0x0000000002a0f2de llvm::sys::RunSignalHandlers() (build-all-builtins/bin/opt+0x2a0f2de)
 #2 0x0000000002a119a6 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f2c97079630 __restore_rt sigaction.c:0:0
 #4 0x00007f2c947ac387 raise (/lib64/libc.so.6+0x36387)
 #5 0x00007f2c947ada78 abort (/lib64/libc.so.6+0x37a78)
 #6 0x00007f2c947a51a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #7 0x00007f2c947a5252 (/lib64/libc.so.6+0x2f252)
 #8 0x0000000003398a64 llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) (build-all-builtins/bin/opt+0x3398a64)
 #9 0x00000000033a19ac llvm::MetadataLoader::parseMetadata(bool) (build-all-builtins/bin/opt+0x33a19ac)
#10 0x00000000033730a8 (anonymous namespace)::BitcodeReader::parseFunctionBody(llvm::Function*) BitcodeReader.cpp:0:0
#11 0x0000000003370f7c (anonymous namespace)::BitcodeReader::materialize(llvm::GlobalValue*) BitcodeReader.cpp:0:0
#12 0x00000000033720aa (anonymous namespace)::BitcodeReader::materializeModule() BitcodeReader.cpp:0:0
#13 0x000000000221fc8a llvm::Module::materializeAll() (build-all-builtins/bin/opt+0x221fc8a)
#14 0x000000000336acb1 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336acb1)
#15 0x000000000336ef60 llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336ef60)
#16 0x00000000024521a0 llvm::parseIR(llvm::MemoryBufferRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x24521a0)
#17 0x000000000245262e llvm::parseIRFile(llvm::StringRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x245262e)
#18 0x0000000000782acb main (build-all-builtins/bin/opt+0x782acb)
#19 0x00007f2c94798555 __libc_start_main (/lib64/libc.so.6+0x22555)
#20 0x000000000076f800 _start (build-all-builtins/bin/opt+0x76f800)
Abort

I suppose this is not expected?

Thank you for letting me know, and for the reproducer! Curiously this is just on the return error(...) path in upgradeDIExpression, which I just didn't test (as I'm not sure how to in the current LLVM infrastructure). I assume this is because normally bitcode error returns occur after most of the bitcode file has already been read, so there are no pending placeholders, but I need to learn some more about the bitcode parser in general to be sure I understand the issue. My first thought is that we should just flush placeholders on an error, as we are about to just bail anyway.

That looks bad โ€” best to revert while that's investigated.

This also breaks lang/cpp/member-and-local-vars-with-same-name/TestMembersAndLocalsWithSameName.py in the LLDB test suite:

Assertion failed: ((FromValue || !(isa<LocalAsMetadata>(MD) || isa<DIArgList>(MD))) && "Unexpected function-local metadata outside of value argument"), function WriteAsOperandInternal, file AsmWriter.cpp, line 2519.

Given that it's Friday, I'm going to go ahead and revert this to get our bot back to green.

Thank you for reverting, I was out Friday+Monday. I thought I was safe to move the assert that is triggering here up further in the function body to make it stronger, but evidently I overlooked some cases. I'll identify them and either fix them elsewhere or restore the assert (possibly with a comment describing the rationale). Thanks again!

Hi,

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.
E.g if I create foo.bc with a clang binary built from commit 8cd35ad854ab (this patch) with:

clang -save-temps foo.c -S -g

and then try to run opt built from aad87328fab (the commit before this patch) with

opt foo.bc -S > /dev/null

it results in

opt: ../lib/Bitcode/Reader/MetadataLoader.cpp:366: (anonymous namespace)::(anonymous namespace)::PlaceholderQueue::~PlaceholderQueue(): Assertion `empty() && "PlaceholderQueue hasn't been flushed before being destroyed"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build-all-builtins/bin/opt foo.bc -S
 #0 0x0000000002a11623 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all-builtins/bin/opt+0x2a11623)
 #1 0x0000000002a0f2de llvm::sys::RunSignalHandlers() (build-all-builtins/bin/opt+0x2a0f2de)
 #2 0x0000000002a119a6 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f2c97079630 __restore_rt sigaction.c:0:0
 #4 0x00007f2c947ac387 raise (/lib64/libc.so.6+0x36387)
 #5 0x00007f2c947ada78 abort (/lib64/libc.so.6+0x37a78)
 #6 0x00007f2c947a51a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #7 0x00007f2c947a5252 (/lib64/libc.so.6+0x2f252)
 #8 0x0000000003398a64 llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) (build-all-builtins/bin/opt+0x3398a64)
 #9 0x00000000033a19ac llvm::MetadataLoader::parseMetadata(bool) (build-all-builtins/bin/opt+0x33a19ac)
#10 0x00000000033730a8 (anonymous namespace)::BitcodeReader::parseFunctionBody(llvm::Function*) BitcodeReader.cpp:0:0
#11 0x0000000003370f7c (anonymous namespace)::BitcodeReader::materialize(llvm::GlobalValue*) BitcodeReader.cpp:0:0
#12 0x00000000033720aa (anonymous namespace)::BitcodeReader::materializeModule() BitcodeReader.cpp:0:0
#13 0x000000000221fc8a llvm::Module::materializeAll() (build-all-builtins/bin/opt+0x221fc8a)
#14 0x000000000336acb1 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336acb1)
#15 0x000000000336ef60 llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x336ef60)
#16 0x00000000024521a0 llvm::parseIR(llvm::MemoryBufferRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x24521a0)
#17 0x000000000245262e llvm::parseIRFile(llvm::StringRef, llvm::SMDiagnostic&, llvm::LLVMContext&, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (llvm::StringRef)>) (build-all-builtins/bin/opt+0x245262e)
#18 0x0000000000782acb main (build-all-builtins/bin/opt+0x782acb)
#19 0x00007f2c94798555 __libc_start_main (/lib64/libc.so.6+0x22555)
#20 0x000000000076f800 _start (build-all-builtins/bin/opt+0x76f800)
Abort

I suppose this is not expected?

This seems to be a long-standing bug with the bitcode reader, where every return error("Invalid record") in llvm::MetadataLoader::MetadataLoaderImpl::parseOneMetadata is untested code that actually just triggers an assert in PlaceholderQueue::~PlaceholderQueue. For example, the last version bump for DIExpression has the same issue:

user@host:~/llvm-project/main:main$ git switch --detach ffc498dfcc8d67e6e9acaebb690377f674457ab4
Updating files: 100% (101031/101031), done.
HEAD is now at ffc498dfcc8d Align definition of DW_OP_plus with DWARF spec [3/3]
user@host:~/llvm-project/main:(ffc498dfcc8d...)$ ninja -C release/ clang
ninja: Entering directory `release/'
...
[2569/2569] Creating executable symlink bin/clang
user@host:~/llvm-project/main:(ffc498dfcc8d...)$ echo 'void f(int x) {}' | release/bin/clang -x c - -c -emit-llvm -o- -g >
../f.bc
user@host:~/llvm-project/main:(ffc498dfcc8d...)$ git switch --detach HEAD^
Previous HEAD position was ffc498dfcc8d Align definition of DW_OP_plus with DWARF spec [3/3]
HEAD is now at 3f250380d4bd Corrected some comment typos; NFC.
user@host:~/llvm-project/main:(3f250380d4bd...)$ ninja -C release/ opt
ninja: Entering directory `release/'
...
[149/149] Linking CXX executable bin/opt
user@host:~/llvm-project/main:(3f250380d4bd...)$ release/bin/opt ../f.bc -S
opt: /home/slinder1/llvm-project/main/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:363: (anonymous namespace)::(anonymous namespace):
:PlaceholderQueue::~PlaceholderQueue(): Assertion `empty() && "PlaceholderQueue hasn't been flushed before being destroyed"' failed
.
#0 0x0000000002aab1b4 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
#1 0x0000000002aab4f6 SignalHandler(int) Signals.cpp:0:0
#2 0x00007f56a9e753c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
#3 0x00007f56a970718b raise /build/glibc-eX1tMB/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#4 0x00007f56a96e6859 abort /build/glibc-eX1tMB/glibc-2.31/stdlib/abort.c:81:7
#5 0x00007f56a96e6729 get_sysdep_segment_value /build/glibc-eX1tMB/glibc-2.31/intl/loadmsgcat.c:509:8
#6 0x00007f56a96e6729 _nl_load_domain /build/glibc-eX1tMB/glibc-2.31/intl/loadmsgcat.c:970:34
#7 0x00007f56a96f7f36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
#8 0x0000000002f8e298 llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) (release/bin/opt+0x2f8e298)
#9 0x0000000002f94dcc llvm::MetadataLoader::parseMetadata(bool) (release/bin/opt+0x2f94dcc)
#10 0x0000000002f6ecc6 (anonymous namespace)::BitcodeReader::parseFunctionBody(llvm::Function*) BitcodeReader.cpp:0:0
#11 0x0000000002f6cf29 (anonymous namespace)::BitcodeReader::materialize(llvm::GlobalValue*) BitcodeReader.cpp:0:0
#12 0x0000000002f6dc1a (anonymous namespace)::BitcodeReader::materializeModule() BitcodeReader.cpp:0:0
#13 0x00000000025a7dda llvm::Module::materializeAll() (release/bin/opt+0x25a7dda)
#14 0x0000000002f66dd3 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool) (release/bin/opt+0x2f66dd3)
#15 0x0000000002f6a519 llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&) (release/bin/opt+0x2f6a519)
#16 0x00000000026c7cf1 llvm::parseIR(llvm::MemoryBufferRef, llvm::SMDiagnostic&, llvm::LLVMContext&) (release/bin/opt+0x26c7cf1)
#17 0x00000000026c81e0 llvm::parseIRFile(llvm::StringRef, llvm::SMDiagnostic&, llvm::LLVMContext&) (release/bin/opt+0x26c81e0)
#18 0x00000000014d65fc main (release/bin/opt+0x14d65fc)
#19 0x00007f56a96e80b3 __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:342:3
#20 0x00000000014cc5ae _start (release/bin/opt+0x14cc5ae)
Stack dump:
0.      Program arguments: release/bin/opt ../f.bc -S
Aborted (core dumped)
user@host:~/llvm-project/main:(3f250380d4bd...)$

Again, my guess is that we just want to not assert in this case, and "clear" the PlaceholderQueue as none of the Metadata we have parsed will be used.

The issue of testing newly-encoded bitcode against previous versions of the loader seems like a trickier issue, though. In fact, it may be tricky to test the fix for this assert at all, as I assume it requires manipulating bitcode files to make them invalid in some way that would trigger these return("Invalid record")s.

Any thoughts on where to go from here, and whether this patch should be blocked on these fixes?

That looks bad โ€” best to revert while that's investigated.

Oops... I had misread this comment:

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.

I don't think we have any guarantees about reading new bitcode with old readers. There's some machinery for a high-level version check to get a nice error (checking MODULE_CODE_VERSION) but I think that's it. (I thought this was saying the new reader couldn't read the old bitcode, in which case there'd be a missing bitcode upgrade.)

Any thoughts on where to go from here, and whether this patch should be blocked on these fixes?

Assuming IIUC now, the patch can probably land as-is... but probably makes sense to investigate the LLDB issue first. @JDevlieghere, could the failing LLDB test somehow be reading a new bitcode file with an old reader, or is that something different?

That looks bad โ€” best to revert while that's investigated.

Oops... I had misread this comment:

We've noticed that this patch changes things so bcfiles created with the patch cannot be parsed with binaries built before it.

I don't think we have any guarantees about reading new bitcode with old readers. There's some machinery for a high-level version check to get a nice error (checking MODULE_CODE_VERSION) but I think that's it. (I thought this was saying the new reader couldn't read the old bitcode, in which case there'd be a missing bitcode upgrade.)

Hm, I often do this and I can't remember it failing before. E.g. if I run into a crash in opt, I usually save the bc file and bisect into old versions to see with what opt version the crash starts appearing. This usually works very well so I was surprised and considered it a bug now when it didn't work.

Any thoughts on where to go from here, and whether this patch should be blocked on these fixes?

Assuming IIUC now, the patch can probably land as-is... but probably makes sense to investigate the LLDB issue first. @JDevlieghere, could the failing LLDB test somehow be reading a new bitcode file with an old reader, or is that something different?

I've seen the same crash in our testing and then there was no "new bitcode with old reader" stuff involved, it was just a normal run crashing in somewhere llc (for our out of tree target).

scott.linder added inline comments.Jul 22 2021, 2:59 PM
llvm/lib/IR/AsmWriter.cpp
4737โ€“4745

I think I've identified the source of the bug causing the assert on GreenDragon: I didn't intend to remove this FromValue=true argument. I believe the assert can still be hoisted up, so I'll post an updated patch which just restores this line.

scott.linder reopened this revision.Jul 23 2021, 11:04 AM
This revision is now accepted and ready to land.Jul 23 2021, 11:04 AM

Fix bug in printMetadataImpl

scott.linder requested review of this revision.Jul 23 2021, 11:05 AM

Eliminate the getDistinct method entirely from DIExpression and DIArgList. I
went about this by splitting up the DEFINE_MDNODE_* macros so each MDNode
class can use an appropriate version based on their distinct/uniqued
properties.

Also fix up the end of DebugInfoMetadata.h to #undef all of the macros.

Rebase

I believe the only pending issue is resolved, so I plan to land this tomorrow
if nobody has any objections.

This revision was not accepted when it landed; it landed in state Needs Review.Nov 9 2021, 10:19 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

hi, this is causing crashes

$ build/rel/bin/opt -passes='thinlto-pre-link<O2>' /tmp/a.ll -o /dev/null
opt: ../../llvm/include/llvm/IR/Metadata.def:180: void (anonymous namespace)::ModuleBitcodeWriter::writeMetadataRecords(ArrayRef<const llvm::Metadata *>, SmallVectorImpl<uint64_t> &, std::vector<
unsigned int> *, std::vector<uint64_t> *): Assertion `!N->isDistinct() && "Expected non-distinct " "DIArgList"' failed.                                                                            
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.                                                                                                              
Stack dump:                                                                                                                                                                                        
0.      Program arguments: build/rel/bin/opt -passes=thinlto-pre-link<O2> /tmp/a.ll -o /dev/null                                                                                                   
 #0 0x0000000001f37a13 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:565:13             
 #1 0x0000000001f3588e llvm::sys::RunSignalHandlers() /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Signals.cpp:98:18                                        
 #2 0x0000000001f37d7f SignalHandler(int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:407:1                                               
 #3 0x00007f2003b8e8e0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x138e0)                                                                                                                
 #4 0x00007f2003668e71 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:50:1                                                                                                                      
 #5 0x00007f2003652536 abort ./stdlib/abort.c:81:7                                                                                                                                                 
 #6 0x00007f200365241f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8                                                                                                                          
 #7 0x00007f200365241f _nl_load_domain ./intl/loadmsgcat.c:970:34                                                                                                                                  
 #8 0x00007f20036617f2 (/lib/x86_64-linux-gnu/libc.so.6+0x357f2)                                                                                                                                   
 #9 0x000000000168e506 (anonymous namespace)::ModuleBitcodeWriter::writeMetadataRecords(llvm::ArrayRef<llvm::Metadata const*>, llvm::SmallVectorImpl<unsigned long>&, std::vector<unsigned int, std::allocator<unsigned int> >*, std::vector<unsigned long, std::allocator<unsigned long> >*) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/IR/Metadata.def:0:0
#10 0x000000000167df15 writeFunctionMetadata /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:2303:3
#11 0x000000000167df15 writeFunction /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:3351:3
#12 0x000000000167df15 (anonymous namespace)::ModuleBitcodeWriter::write() /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:4448:7
#13 0x0000000001677d32 ~ModuleBitcodeWriterBase /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:155:7
#14 0x0000000001677d32 llvm::BitcodeWriter::writeModule(llvm::Module const&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:4626:1
#15 0x00000000016813cb llvm::WriteBitcodeToFile(llvm::Module const&, llvm::raw_ostream&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:4653:10

(I tried reducing as much as possible but there's still a lot of debug info)

hi, this is causing crashes

$ build/rel/bin/opt -passes='thinlto-pre-link<O2>' /tmp/a.ll -o /dev/null
opt: ../../llvm/include/llvm/IR/Metadata.def:180: void (anonymous namespace)::ModuleBitcodeWriter::writeMetadataRecords(ArrayRef<const llvm::Metadata *>, SmallVectorImpl<uint64_t> &, std::vector<
unsigned int> *, std::vector<uint64_t> *): Assertion `!N->isDistinct() && "Expected non-distinct " "DIArgList"' failed.                                                                            
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.                                                                                                              
Stack dump:                                                                                                                                                                                        
0.      Program arguments: build/rel/bin/opt -passes=thinlto-pre-link<O2> /tmp/a.ll -o /dev/null                                                                                                   
 #0 0x0000000001f37a13 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:565:13             
 #1 0x0000000001f3588e llvm::sys::RunSignalHandlers() /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Signals.cpp:98:18                                        
 #2 0x0000000001f37d7f SignalHandler(int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:407:1                                               
 #3 0x00007f2003b8e8e0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x138e0)                                                                                                                
 #4 0x00007f2003668e71 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:50:1                                                                                                                      
 #5 0x00007f2003652536 abort ./stdlib/abort.c:81:7                                                                                                                                                 
 #6 0x00007f200365241f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8                                                                                                                          
 #7 0x00007f200365241f _nl_load_domain ./intl/loadmsgcat.c:970:34                                                                                                                                  
 #8 0x00007f20036617f2 (/lib/x86_64-linux-gnu/libc.so.6+0x357f2)                                                                                                                                   
 #9 0x000000000168e506 (anonymous namespace)::ModuleBitcodeWriter::writeMetadataRecords(llvm::ArrayRef<llvm::Metadata const*>, llvm::SmallVectorImpl<unsigned long>&, std::vector<unsigned int, std::allocator<unsigned int> >*, std::vector<unsigned long, std::allocator<unsigned long> >*) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/IR/Metadata.def:0:0
#10 0x000000000167df15 writeFunctionMetadata /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:2303:3
#11 0x000000000167df15 writeFunction /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:3351:3
#12 0x000000000167df15 (anonymous namespace)::ModuleBitcodeWriter::write() /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:4448:7
#13 0x0000000001677d32 ~ModuleBitcodeWriterBase /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:155:7
#14 0x0000000001677d32 llvm::BitcodeWriter::writeModule(llvm::Module const&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:4626:1
#15 0x00000000016813cb llvm::WriteBitcodeToFile(llvm::Module const&, llvm::raw_ostream&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:4653:10

(I tried reducing as much as possible but there's still a lot of debug info)

Thank you for the reproducer! It seems like one of my fundamental assumptions about how LLVM handles metadata was faulty.

I assumed that uniqued nodes couldn't "become distinct", but they evidently can due to things like RAUW. I'll try to get a better grasp on what the rules actually are and then update the review.

scott.linder reopened this revision.Nov 12 2021, 2:36 PM
scott.linder updated this revision to Diff 386957.EditedNov 12 2021, 2:37 PM

Update DIArgList::handleChangedOperand to propagate in-place changes which
would require the DIArgList become distinct up to the owning MetadataAsValue.

This feels a bit odd, but so does allowing DIArgList to be distinct as an
implementation detail, while never serializing that fact (and so making a
round-trip lossy). Semantically this may be OK, but unless we have a good
reason not to I feel like finding a way to ensure these "inline" nodes are
always uniqued will make the relationship between IR/bitcode and the in-memory
representation more obvious.

Update DIArgList::handleChangedOperand to propagate in-place changes which
would require the DIArgList become distinct up to the owning MetadataAsValue.

This feels a bit odd, but so does allowing DIArgList to be distinct as an
implementation detail, while never serializing that fact (and so making a
round-trip lossy). Semantically this may be OK, but unless we have a good
reason not to I feel like finding a way to ensure these "inline" nodes are
always uniqued will make the relationship between IR/bitcode and the in-memory
representation more obvious.

I'm adding some more reviewers from those involved in https://reviews.llvm.org/D108968, as I'm essentially extending the changes made there.

Update DIArgList::handleChangedOperand to propagate in-place changes which
would require the DIArgList become distinct up to the owning MetadataAsValue.

This feels a bit odd, but so does allowing DIArgList to be distinct as an
implementation detail, while never serializing that fact (and so making a
round-trip lossy).

Seems reasonable to try to fix this.

Semantically this may be OK, but unless we have a good
reason not to I feel like finding a way to ensure these "inline" nodes are
always uniqued will make the relationship between IR/bitcode and the in-memory
representation more obvious.

The tricky constraint is that (most) references to Metadata are not in any use-list.

llvm/lib/IR/DebugInfoMetadata.cpp
1672โ€“1673

I'm sorry; I'm lacking some context here; I'd appreciate a bit of hand-holding.

I see how this updates the MetadataAsValue reference.

Can you walk me through in way way it's enforced that DIArgList is only referenced by MetadataAsValue? It sounds like textual IR has something, but can you confirm? What about the C++ API? The verifier? And what about historical bitcode... i.e., what can be found there?

  • If there are holes in enforcement somehow, what's going to happen to the dangling references to this node?
  • Can/should we delete this node now? And/or somehow mark it as dead, so that ValueAsMetadata will look through it?
scott.linder added inline comments.Nov 15 2021, 3:51 PM
llvm/lib/IR/DebugInfoMetadata.cpp
1672โ€“1673

Can you walk me through in way way it's enforced that DIArgList is only referenced by MetadataAsValue?

I don't know that it is enforced currently, but I believe it was an intended restriction. The LLVM IR langref entry for DIArgList currently says (emphasis mine):

Because a DIArgList may refer to local values within a function, it must only be used as a function argument, must always be inlined, and cannot appear in named metadata.

The only way I can interpret the bold portion is to make it equivalent to "it must only be used through a MetadataAsValue", as that is the only means of using metadata as a function argument.

It sounds like textual IR has something, but can you confirm? What about the C++ API? The verifier? And what about historical bitcode... i.e., what can be found there?

Would it be appropriate to just add as a verifier check? No matter how the IR is produced it would then fail to verify, so we would cover any current or future APIs. For bitcode I've implemented an upgrade path to force IsDistinct=false when loading old bitcode.

If there are holes in enforcement somehow, what's going to happen to the dangling references to this node?
Can/should we delete this node now? And/or somehow mark it as dead, so that ValueAsMetadata will look through it?

I need to look closer at the lifetime of metadata nodes, it seems clear there is more to do here I'm just not sure exactly what that is.

StephenTozer added inline comments.Nov 16 2021, 3:34 AM
llvm/lib/IR/DebugInfoMetadata.cpp
1672โ€“1673

Can you walk me through in way way it's enforced that DIArgList is only referenced by MetadataAsValue? It sounds like textual IR has something, but can you confirm? What about the C++ API? The verifier? And what about historical bitcode... i.e., what can be found there?

In textual IR parsing, a DIArgList cannot be parsed without a reference to PerFunctionState, which is provided for metadata exclusively when parsing the contents of MetadataAsValue. There is no explicit assertion to this effect, and there is also no assertion in Bitcode parsing or in the verifier either - this should probably be added! Although I don't think there should be any existing cases of this, as the Bitcode writer can also only write DIArgLists as part of a MetadataAsValue and this has been true since the metadata was added, so the only possible scenario would be someone manually editing a bitcode file I think.

StephenTozer added inline comments.Nov 16 2021, 4:03 AM
llvm/lib/IR/DebugInfoMetadata.cpp
1672โ€“1673

Just to clarify, because it's slightly unclear in my last comment: I believe that the right move is to just add a verifier check. It doesn't really make sense for a DIArgList to exist outside of a MetadataAsValue in any circumstance, as it may refer to local function values and so must exist as a Value within a function, and the only way for that to happen is by being wrapped in a MetadataAsValue. As such, I don't think we need to put assertions all over the place to catch this, only to express it in the verifier as a rule of IR that will likely never be hit.

The equivalent assertion for LocalAsMetadata, which must be wrapped by MetadataAsValue for the same reason, is in Verifier::visitMDNode. I've also noticed while checking this that Verifier::visitMetadataAsValue does not call Verifier::visitDIArgList if it contains one, which should also be adjusted (this doesn't need to be a part of this patch).

dexonsmith added inline comments.Nov 16 2021, 6:36 PM
llvm/lib/IR/DebugInfoMetadata.cpp
1672โ€“1673

Relying on the verifier is probably fine. I suspect you could add verifier coverage in a unit test (i.e., DIArgList in an invalid spot and then run the verifier), maybe in llvm/unittests/IR/MetadataTest.cpp or DebugInfoTest.cpp. It looks like that's missing for LocalAsMetadata (my fault I'm sure).

Still, IMO an assertion in MDOperand that the argument is valid to reference (i.e., neither LocalAsMetadata nor DIArgList) would be nice to have. Maybe Metadata could grow:

bool isValidOperand() const { return Kind != ... };

and then MDOperand::reset could check:

assert((!MD || MD->isValidMDOperand()) && "...");

If there are holes in enforcement somehow, what's going to happen to the dangling references to this node?
Can/should we delete this node now? And/or somehow mark it as dead, so that ValueAsMetadata will look through it?

I need to look closer at the lifetime of metadata nodes, it seems clear there is more to do here I'm just not sure exactly what that is.

Looking at the example of LocalAsMetadata, since DIArgList seems similar, ValueAsMetadata::handleRAUW() deletes the node after RAUW'ing it. It looks like DIArgList doesn't inherit from ReplaceableMetadataImpl so it can't RAUW itself, but maybe it should?

Add enforcement of the always-owned-by-MetadataAsValue property of DIArgList to
the metadata verifier. I'm not sure this is sufficient, but at least it is a
start.

Fixed the handling of the uniquify()!=this case in
DIArgList::handleChangedOperand to assert if the above property doesn't hold.

Add unittests for both changes above; the reproducers for this in IR rely on
specific sequences of optimizations, and would be much more fragile.

Is this moving in the right direction? Does the restriction on DIArgList seem
reasonable, and if so does it need more explicit documentation or more thorough
enforcement?

Add enforcement of the always-owned-by-MetadataAsValue property of DIArgList to
the metadata verifier. I'm not sure this is sufficient, but at least it is a
start.

Fixed the handling of the uniquify()!=this case in
DIArgList::handleChangedOperand to assert if the above property doesn't hold.

Add unittests for both changes above; the reproducers for this in IR rely on
specific sequences of optimizations, and would be much more fragile.

Is this moving in the right direction? Does the restriction on DIArgList seem
reasonable, and if so does it need more explicit documentation or more thorough
enforcement?

I pushed this before looking at the intervening comments, so the last bit is a bit redundant; it sounds like the restriction on DIArgList was intended and makes sense, so I think the remaining question is whether the Verifier check is sufficient.

Is this moving in the right direction? Does the restriction on DIArgList seem
reasonable, and if so does it need more explicit documentation or more thorough
enforcement?

I pushed this before looking at the intervening comments, so the last bit is a bit redundant; it sounds like the restriction on DIArgList was intended and makes sense, so I think the remaining question is whether the Verifier check is sufficient.

It seems like DIArgList is very similar to LocalAsMetadata. I feel like it'd be less error-prone for it to use/inherit-from ReplaceableMetadataImpl, just like ValueAsMetadata does. Then it has the machinery to RAUW/delete itself when necessary.

(Moreover, given that it doesn't have any operands, I find it suspicious that it inherits (albeit indirectly) from MDNode. It should probably be a direct subclass of Metadata. This would mean it couldn't inherit from DINode which might complicate some debug info code; if you're going to clean that up I suggest doing it later as a separate patch.)

Is this moving in the right direction? Does the restriction on DIArgList seem
reasonable, and if so does it need more explicit documentation or more thorough
enforcement?

I pushed this before looking at the intervening comments, so the last bit is a bit redundant; it sounds like the restriction on DIArgList was intended and makes sense, so I think the remaining question is whether the Verifier check is sufficient.

It seems like DIArgList is very similar to LocalAsMetadata. I feel like it'd be less error-prone for it to use/inherit-from ReplaceableMetadataImpl, just like ValueAsMetadata does. Then it has the machinery to RAUW/delete itself when necessary.

(Moreover, given that it doesn't have any operands, I find it suspicious that it inherits (albeit indirectly) from MDNode. It should probably be a direct subclass of Metadata. This would mean it couldn't inherit from DINode which might complicate some debug info code; if you're going to clean that up I suggest doing it later as a separate patch.)

I looked into deriving DIArgList from ReplaceableMetadataImpl, and it does get awkward quickly that it is an MDNode. It seems like there is a fair amount of code around tracking/RAUW which relies on MDNodes being eventually "resolveable". The result is that I don't think I can inherit from ReplaceableMetadataImpl without also changing the inheritence from MDNode to Metadata as you suggest.

I'll look at what effect that has on any debug-info related code, but I do worry that no matter what there will be a fair amount of special-casing.

I'm pushing the (rough) patch which tries to make the switch from inheriting
DIArglist from MDNode to ReplaceableMetadataImpl.

The biggest issue I see with a change along these lines is that the
!DIArgList syntax becomes misleading. If we really want this, I think it
implies a bit more work, namely generalizing this to something closer to a
ValuesAsMetadata (plural), with a syntax closer to:

metadata {i32 1, i64 %x, float %f}

I think this doesn't add any ambiguity to the syntax, but this is just my
initial guess at what might make sense.

I'm not sure I can commit to this work, but I would still like to propose making
the UNIQUED/DISTINCT change for DIExpression and DICompileUnit. Would it be
reasonable for me to just remove DIArgList from my original patch, as it has
proven to be special enough not to generalize how I had hoped?

scott.linder edited the summary of this revision. (Show Details)

I went ahead with just leaving DIArgList alone, let me know what you think

I'm pushing the (rough) patch which tries to make the switch from inheriting
DIArglist from MDNode to ReplaceableMetadataImpl.

The biggest issue I see with a change along these lines is that the
!DIArgList syntax becomes misleading.

I don't see what's misleading about it. ! is the indicator that "metadata follows", and DIArgList is the name of this special kind of metadata.

TBH, I find it odd that there'd be something inheriting from MDNode that is as special as this....

If we really want this, I think it
implies a bit more work, namely generalizing this to something closer to a
ValuesAsMetadata (plural), with a syntax closer to:

metadata {i32 1, i64 %x, float %f}

I think this doesn't add any ambiguity to the syntax, but this is just my
initial guess at what might make sense.

This looks awkwardly similar to generic MDNode syntax (just with the distinction that generic metadata is not allowed to reference local variables). I don't think generic syntax is better than naming the type here. If you still disagree, can you explain why?

I'm not sure I can commit to this work, but I would still like to propose making
the UNIQUED/DISTINCT change for DIExpression and DICompileUnit. Would it be
reasonable for me to just remove DIArgList from my original patch, as it has
proven to be special enough not to generalize how I had hoped?

IIUC, DIExpression is data-driven, so it's reasonable for it always to be uniqued. Originally it was not going to inherit from MDNode, but IIRC it was useful as an incremental step for the transition from all generic metadata. At some point there was a FIXME in the code to reparent directly under Metadata but probably that was dropped eventually. I still think that'd be a reasonable thing to do.

If DICompileUnit already has this property, seems to me like that part could be separated into an NFC change to make it easier to review what's actually changing.

Seems reasonable to handle DIArgList separately from the other two. It'd be a shame to drop it entirely; it seems like there are some fundamental modelling problems with DIArgList that would be valuable to iron out.

llvm/lib/Bitcode/Reader/MetadataLoader.cpp
559

I don't think you need to change upgradeDIExpression.

1144

Seems odd that this is declared here. Would be nice at some point to make it local to each case below that needs it in an NFC change.

2003

I think you should replace this line with a comment, maybe something like:

// DIExpression is never distinct. Old bitcode uses the 0-bit for that
// but it's now ignored.

You can drop the assignment to IsDistinct entirely; it's not interesting here anymore.

2008โ€“2011

I don't think you need this change.

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
2019

I'm not sure we need a new version.

llvm/lib/CodeGen/MIRParser/MIParser.cpp
1187

I'm not sure what you mean by this comment. What is "this", and why should a metadata kind being uniqued-by-content influence it?

2331

I'm not sure what you mean by this comment. What is "this", and why should a metadata kind being uniqued-by-content influence it?

llvm/lib/IR/AsmWriter.cpp
1232โ€“1233

I don't think nodes being uniqued-by-content should force them to be inlined everywhere. Those seem like independent properties to me.

llvm/lib/IR/DebugInfoMetadata.cpp
1075

Can this be Storage == Uniqued?

llvm/lib/IR/Verifier.cpp
936

Nit: if we add a newline here, should be in a separate NFC commit since there's no other change around.

Is this moving in the right direction? Does the restriction on DIArgList seem
reasonable, and if so does it need more explicit documentation or more thorough
enforcement?

I pushed this before looking at the intervening comments, so the last bit is a bit redundant; it sounds like the restriction on DIArgList was intended and makes sense, so I think the remaining question is whether the Verifier check is sufficient.

It seems like DIArgList is very similar to LocalAsMetadata. I feel like it'd be less error-prone for it to use/inherit-from ReplaceableMetadataImpl, just like ValueAsMetadata does. Then it has the machinery to RAUW/delete itself when necessary.

(Moreover, given that it doesn't have any operands, I find it suspicious that it inherits (albeit indirectly) from MDNode. It should probably be a direct subclass of Metadata. This would mean it couldn't inherit from DINode which might complicate some debug info code; if you're going to clean that up I suggest doing it later as a separate patch.)

I looked into deriving DIArgList from ReplaceableMetadataImpl, and it does get awkward quickly that it is an MDNode. It seems like there is a fair amount of code around tracking/RAUW which relies on MDNodes being eventually "resolveable". The result is that I don't think I can inherit from ReplaceableMetadataImpl without also changing the inheritence from MDNode to Metadata as you suggest.

I'll look at what effect that has on any debug-info related code, but I do worry that no matter what there will be a fair amount of special-casing.

I'm pushing the (rough) patch which tries to make the switch from inheriting
DIArglist from MDNode to ReplaceableMetadataImpl.

The biggest issue I see with a change along these lines is that the
!DIArgList syntax becomes misleading.

I don't see what's misleading about it. ! is the indicator that "metadata follows", and DIArgList is the name of this special kind of metadata.

TBH, I find it odd that there'd be something inheriting from MDNode that is as special as this....

It seemed like a useful invariant that the syntax !<id> always implied an MDNode, in the same way that !"..." means MDString, although I guess this is just an implementation detail.

It may just be a naming thing, though; we say in the langref that there are only metadata "strings" and metadata "nodes", but we already have metadata "ValueAsMetadata"s too. If we were to hoist up DIArgList as well how would we want to describe it? By saying we have: strings, nodes, ValueAsMetadatas, and DIArgLists?

If we really want this, I think it
implies a bit more work, namely generalizing this to something closer to a
ValuesAsMetadata (plural), with a syntax closer to:

metadata {i32 1, i64 %x, float %f}

I think this doesn't add any ambiguity to the syntax, but this is just my
initial guess at what might make sense.

This looks awkwardly similar to generic MDNode syntax (just with the distinction that generic metadata is not allowed to reference local variables). I don't think generic syntax is better than naming the type here. If you still disagree, can you explain why?

That was my intention, because I don't see why this needs to be DI-centric. My guess was wrong either way, though, because metadata { already signifies the start of a structure type IIUC.

I was just trying to suggest that all DIArgList currently does is stand in for a function-local metadata tuple, so just making one of those and using that seems more straightforward.

I'm not sure I can commit to this work, but I would still like to propose making
the UNIQUED/DISTINCT change for DIExpression and DICompileUnit. Would it be
reasonable for me to just remove DIArgList from my original patch, as it has
proven to be special enough not to generalize how I had hoped?

IIUC, DIExpression is data-driven, so it's reasonable for it always to be uniqued. Originally it was not going to inherit from MDNode, but IIRC it was useful as an incremental step for the transition from all generic metadata. At some point there was a FIXME in the code to reparent directly under Metadata but probably that was dropped eventually. I still think that'd be a reasonable thing to do.

That FIXME is still present, and if that is true then maybe my whole conception of the class hierarchy and what to call various parts is just not lining up with reality. As a smoke test for whether I'm on the same page, would you expect DIExpression to no longer be referred to as a "node" in documentation, as it would no longer derive from MDNode?

If DICompileUnit already has this property, seems to me like that part could be separated into an NFC change to make it easier to review what's actually changing.

Sure; just to confirm, you mean split this into two patches, one NFC patch adding HANDLE_MDNODE_LEAF_DISTINCT, and another adding HANDLE_MDNODE_LEAF_UNIQUED?

Seems reasonable to handle DIArgList separately from the other two. It'd be a shame to drop it entirely; it seems like there are some fundamental modelling problems with DIArgList that would be valuable to iron out.

Yes, I agree that at the very least the fact that DIArgList can transiently become distinct, but then not round-trip this property through IR, probably deserves attention. I am not particularly confident that the way I first though about remedying it really lines up with how metadata is currently designed, but I would be interested in learning enough to contribute a fix.

I'll wait to hear if I follow how you would prefer the patch be split up before I incorporate your other feedback.

llvm/lib/Bitcode/Reader/MetadataLoader.cpp
2003

OK, fair enough; so the bit just becomes "don't care" going forward, which naturally means old bitcode is forward compatible

llvm/lib/CodeGen/MIRParser/MIParser.cpp
1187

"this" is trying to get at "this code which supports DIExpression appearing inline in MIR"; I can try to fix up the wording some to be more clear

The property I was aiming for with the whole patch was that "inlined" metadata is never distinct. This seemed like a logical "rule" that would explain the lack of syntax for distinct nodes inline.

2331

ditto

llvm/lib/IR/DebugInfoMetadata.cpp
1075

Yes, I'll make this change.

llvm/lib/IR/Verifier.cpp
936

Woops, I just didn't notice this before pushing the diff. I'll just remove this change from the patch.

I was just trying to suggest that all DIArgList currently does is stand in for a function-local metadata tuple, so just making one of those and using that seems more straightforward.

Ah, that's an interesting point; thanks for clarifying!

Stepping back a bit...

Before the big metadata / debug info refactor, there was support for metadata nodes owned by a function. These nodes were allowed to point at function-local values. The only uses of these metadata nodes were for direct reference from debug info intrinsics. I removed them entirely to simplify the model, and left behind LocalAsMetadata.

I've since realized that much of this "global" metadata is really tied to a specific function. Three examples:

  • Inlined DILocations were made distinct in order to defeat uniquing across different function definitions.
  • DISubprogram have become function attachments; and it's illegal to have the same one attached in multiple places. It's effectively owned by the function; and it needs to be cloned when a function is cloned.
  • Outside of debug info, there are optimization-related metadata that describe function-local constructs, and that defeat uniquing by having a "root" node that self-references and/or is marked distinct. (IIRC, loop metadata is an example?)

I think we should bring back local metadata; moreover, there really seem to be three scopes for metadata:

  • Context: pure / immutable constants; cannot change / cannot be distinct. Only reference other "context" constants (any referenced LLVM values are also pure constants (i32 0 okay, but i32* @global is not)). Should be shared across all users of an LLVMContext and owned by the LLVMContext. Since they are immutable, never "cloned" by cloneFunctionInto / cloneModule / related APIs.
  • Module: module-specific. May have identity and be mutable (distinct). If it is uniqued / "constant", then it is at "module" scope because it (transitively) references something else with "module" scope (e.g., a global value or distinct metadata). Can reference LLVM global values but not local values. These are effectively owned by a specific Module. Cloning a module would require making new copies of these, but cloning a function would not.
  • Local: function-specific. May have identity and be mutable (distinct). If it is uniqued / "constant", then it is at "local" scope because it (transitively) references something else with "local" scope. Cloning a function would require making new copies of these. Effectively owned by the function/global value that it's associated with.

In this model, it seems like DIArgList falls under "local"; and you're suggesting it's just a stand-in for not having a local "MDNode", and not really semantically tied to debug info or argument lists. If that's the case, then it seems they're only required to be inline right now because there's no syntax to describe them otherwise. Syntax could look like:

define void @f(i32 %a, i32 %b) {
  call @llvm.some.intrinsic(!%1)
  ; Some local metadata. Can reference LocalAsMetadata.
  ; Cloned when the function is cloned.
  !%1 = metadata !{i32 %a, i32 %b, !%2}
  !%2 = distinct metadata !{!1}
}

; Global metadata. Cloned when a module is cloned.
!1 = distinct metadata !{!2, ptr @f}
!2 = metadata !{!1, !3}

; Context metadata. Pure constants.
!3 = metadata !{!4, !5}
!4 = metadata !{i32 1, ptr null}
!5 = metadata !{}

Stepping back again, I think there are three kinds of identity for metadata:

  • Distinct: not in any uniquing table (e.g., MDNode marked distinct)
  • Uniqued (always): in a uniquing table; either immutable (e.g., "Context" scope / DIExpression) or has permanent RAUW support to handle changes (e.g., ValueAsMetadata)
  • Uniqued (best-effort): findable via uniquing tables, but operands might change and lacks permanent RAUW support; degrades to "distinct" if an operand changes (e.g., MDNode without distinct)

The current design for metadata optimizes for having most metadata either "distinct" or "best-effort-uniqued". Permanent RAUW support is expensive (in this design).

Our recent discussions about DIArgList point toward making it always-uniqued; since the operands can change, this requires permanent RAUW support. I wonder if, instead, best-effort-uniqued would be sufficient... if only there were syntax for it. WDYT?

Sure; just to confirm, you mean split this into two patches, one NFC patch adding HANDLE_MDNODE_LEAF_DISTINCT, and another adding HANDLE_MDNODE_LEAF_UNIQUED?

Yes.

For the former, maybe "ALWAYS_DISTINCT" would be more clear? Either way, this seems like a well-defined concept.

For the latter, I'm not quite convinced yet. The name doesn't seem right, since most nodes are stored/de-duped in uniquing tables (maybe "ALWAYS_UNIQUED" would work?). But:

  • I don't think "always-uniqued" should imply things like "printed inline in textual IR"
  • I'm not sure should have MDNodes that have permanent RAUW support (since it's expensive)
    • Can we get away with a "local" MDNode that is best-effort-uniqued?
llvm/lib/CodeGen/MIRParser/MIParser.cpp
1187

We could support all nodes here for parsing, if we think that's useful... I'm not sure if there's a good reason to only have DIExpression here...

that would explain the lack of syntax for distinct nodes inline.

IIRC, the historical lack of syntax (in .ll at least, not sure about .mir) is because it didn't seem useful/important. When MDNodes are distinct, it is because they either need identity *a priori* (in which case, probably there are multiple references to them and they MUST be out-of-line) or there was a uniquing table collision for best-effort-uniqued metadata (expected to be rare). Anything that was always-uniqued had first class syntax.