This is an archive of the discontinued LLVM Phabricator instance.

[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)
ClosedPublic

Authored by dzhidzhoev on Feb 14 2023, 6:28 AM.

Details

Summary

RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

Similar to imported declarations, the patch tracks function-local types in
DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
the aforementioned metadata change and provided a support of function-local
types scoped within a lexical block.

The patch assumes that DICompileUnit's 'enums field' no longer tracks local
types and DwarfDebug would assert if any locally-scoped types get placed there.

Diff Detail

Event Timeline

krisb created this revision.Feb 14 2023, 6:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2023, 6:28 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
krisb edited the summary of this revision. (Show Details)
krisb added reviewers: dblaikie, jmmartinez.
krisb added projects: Restricted Project, debug-info.
krisb published this revision for review.Feb 17 2023, 6:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 17 2023, 6:33 AM
ykhatav added a subscriber: ykhatav.Mar 3 2023, 8:56 AM

Just a few minor comments. Everything else seems good to me.

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
698

I cannot find the link to PR55680. Would you mind sharing it?

You could also reference local-type-as-template-parameter.ll, your test depicts the issue very clearly.

700–701

NIT: You could use find to avoid searching in LexicalBlockDIEs twice.

712–714

NIT: You could use insert/try_emplace instead of using count and operator[]. The assertion would become something like:

auto Inserted = getAbstractScopeDIEs().try_emplace(DS, ScopeDIE);
assert(Inserted.second && "Abstract DIE for this scope exists!");
return ScopeDIE;
krisb marked 2 inline comments as done.Mar 13 2023, 5:34 AM
krisb added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
698

I cannot find the link to PR55680. Would you mind sharing it?

You could also reference local-type-as-template-parameter.ll, your test depicts the issue very clearly.

Here is the link to the issue https://github.com/llvm/llvm-project/issues/55680.
Mentioned the test in the comment. Thank you!

712–714

I'd slightly prefer to keep the original code as it looks a bit more readable to me (in your example, there should be another line to void out unused Inserted variable, otherwise it'd cause a warning).
Additional count() call in the assert() doesn't seem too expensive to me, but if you still think try_emplace() is better, I'll change to it.

krisb updated this revision to Diff 504608.Mar 13 2023, 5:34 AM
krisb marked an inline comment as done.

Apply review comments and rebase.

jmmartinez accepted this revision.Mar 20 2023, 5:22 AM
jmmartinez added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
712–714

Ok for me. If the use of count was outside the assert I would have argued against.

This revision is now accepted and ready to land.Mar 20 2023, 5:22 AM
dblaikie added inline comments.Mar 20 2023, 5:49 PM
llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
712–714

FWIW I'd lean towards avoiding the extra lookup in the +Asserts build (by using emplace or try_emplace, etc - yeah, with the extra void cast for the non-asserts-unused variable), but wouldn't insist on it.

dzhidzhoev retitled this revision from [DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (5/7) to [DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7).May 15 2023, 5:38 AM
dzhidzhoev updated this revision to Diff 529100.Jun 6 2023, 5:13 PM

Added AutoUpdater, moving function-local types belonging to DICompileUnit to a corresponding DISubprogram.

This revision was landed with ongoing or failed builds.Jun 19 2023, 7:42 AM
This revision was automatically updated to reflect the committed changes.

This change triggers failed asserts when compiling code for at least arm and aarch64. It is reproducible with this reduced testcase:

$ cat repro.c
typedef long long a;
typedef int b();
int c, d;
long e, f;
short g, j;
void *h;
short i[];
char k;
a l, m, n;
void o();
int p();
void r(b u) {
  struct {
    a q;
    a s, t
  } a;
  e || p(c, d);
  f = l = a.s;
  for (; a.s;)
    if (p(c, a, &a, a))
      a.t &&a.q &&p(a, k);
  if (g)
    u(i, m, n, 0, h);
  f = a.s;
  for (; a.s;)
    a.t &&a.q &&p(c, a, k);
  if (g && j)
    o(g);
}
int v(short *, long long, long long, int, void *) { r(v); } 
$ clang -target aarch64-linux-gnu -c repro.c -g -O2
clang: ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2334: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction*): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.
hans added a subscriber: hans.Jun 20 2023, 3:40 AM

We're seeing assertion failures in Chromium too. Reproducer for x86_64 Linux here: https://bugs.chromium.org/p/chromium/issues/detail?id=1456288#c2

Thank you for reports! I've reverted it.

hoy added a subscriber: hoy.Jun 20 2023, 12:02 PM

Hello, I'm seeing a build failure that seems related to this patch. I'm seeing the patch has been relanded and reverted a couple times and not sure where it is right now. Can you please confirm if the failure is related and fixed? Thanks.

2023-06-20T05:19:14.441-07:00] Stderr: PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
[2023-06-20T05:19:14.441-07:00] Stack dump:
[2023-06-20T05:19:14.441-07:00] 1. Running pass 'Function Pass Manager' on module 'buck-out/v2/gen/fbcode/c475d21d0247c4b4/f3/share/udaf/aggregators/utils/__f3_udaf_aggregation_utils__/__objects__/HybridHistAdditiveNoiseDPState.cpp.o'.
[2023-06-20T05:19:14.441-07:00] 2. Running pass 'X86 Assembly Printer' on function '@_ZN8facebook2f37xstream11aggregation5utils15HybridTopKStateIlE4bumpERKll'
[2023-06-20T05:19:14.441-07:00] #0 0x00000000055513d5 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) llvm-project/llvm/lib/Support/Unix/Signals.inc:602:13
[2023-06-20T05:19:14.441-07:00] #1 0x00000000055517d6 llvm::sys::RunSignalHandlers() llvm-project/llvm/lib/Support/Signals.cpp:103:5
[2023-06-20T05:19:14.441-07:00] #2 0x00000000055517d6 SignalHandler(int) llvm-project/llvm/lib/Support/Unix/Signals.inc:403:3
[2023-06-20T05:19:14.441-07:00] #3 0x00007f028da4459f glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:8:0
[2023-06-20T05:19:14.441-07:00] #4 0x0000000007c7411e llvm::DILexicalBlockFile::classof(llvm::Metadata const*) llvm-project/llvm/include/llvm/IR/DebugInfoMetadata.h:2271:32
[2023-06-20T05:19:14.441-07:00] #5 0x0000000007c7411e llvm::isa_impl<llvm::DILexicalBlockFile, llvm::DILocalScope, void>::doit(llvm::DILocalScope const&) llvm-project/llvm/include/llvm/Support/Casting.h:64:53
[2023-06-20T05:19:14.441-07:00] #6 0x0000000007c7411e llvm::isa_impl_cl<llvm::DILexicalBlockFile, llvm::DILocalScope const*>::doit(llvm::DILocalScope const*) llvm-project/llvm/include/llvm/Support/Casting.h:110:12
[2023-06-20T05:19:14.441-07:00] #7 0x0000000007c7411e llvm::isa_impl_wrap<llvm::DILexicalBlockFile, llvm::DILocalScope const*, llvm::DILocalScope const*>::doit(llvm::DILocalScope const* const&) llvm-project/llvm/include/llvm/Support/Casting.h:137:12
[2023-06-20T05:19:14.441-07:00] #8 0x0000000007c7411e llvm::isa_impl_wrap<llvm::DILexicalBlockFile, llvm::DILocalScope const* const, llvm::DILocalScope const*>::doit(llvm::DILocalScope const* const&) llvm-project/llvm/include/llvm/Support/Casting.h:127:12
[2023-06-20T05:19:14.441-07:00] #9 0x0000000007c7411e llvm::CastIsPossible<llvm::DILexicalBlockFile, llvm::DILocalScope const*, void>::isPossible(llvm::DILocalScope const* const&) llvm-project/llvm/include/llvm/Support/Casting.h:255:12
[2023-06-20T05:19:14.441-07:00] #10 0x0000000007c7411e llvm::CastInfo<llvm::DILexicalBlockFile, llvm::DILocalScope const*, void>::doCastIfPossible(llvm::DILocalScope const* const&) llvm-project/llvm/include/llvm/Support/Casting.h:493:10
[2023-06-20T05:19:14.441-07:00] #11 0x0000000007c7411e decltype(auto) llvm::dyn_cast<llvm::DILexicalBlockFile, llvm::DILocalScope const>(llvm::DILocalScope const*) llvm-project/llvm/include/llvm/Support/Casting.h:663:10
[2023-06-20T05:19:14.441-07:00] #12 0x0000000007c7411e llvm::DILocalScope::getNonLexicalBlockFileScope() const llvm-project/llvm/lib/IR/DebugInfoMetadata.cpp:1037:20
[2023-06-20T05:19:14.441-07:00] #13 0x0000000007c7411e getRetainedNodeScope(llvm::MDNode const*) llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:1549:33
[2023-06-20T05:19:14.441-07:00] #14 0x0000000007c7411e llvm::DwarfDebug::endFunctionImpl(llvm::MachineFunction const*) llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2327:24
[2023-06-20T05:19:14.441-07:00] #15 0x0000000007f45860 llvm::DebugHandlerBase::endFunction(llvm::MachineFunction const*) llvm-project/llvm/lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp:411:5
[2023-06-20T05:19:14.441-07:00] #16 0x0000000007af45e7 llvm::AsmPrinter::emitFunctionBody() llvm-project/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1894:17
[2023-06-20T05:19:14.441-07:00] #17 0x0000000007af30ca llvm::X86AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) llvm-project/llvm/lib/Target/X86/X86AsmPrinter.cpp:86:3
[2023-06-20T05:19:14.441-07:00] #18 0x00000000078612f5 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:92:13
[2023-06-20T05:19:14.441-07:00] #19 0x0000000007860981 llvm::FPPassManager::runOnFunction(llvm::Function&) llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1455:27
[2023-06-20T05:19:14.441-07:00] #20 0x0000000007860078 llvm::FPPassManager::runOnModule(llvm::Module&) llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1514:16
[2023-06-20T05:19:14.441-07:00] #21 0x0000000007de90b4 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1583:27
[2023-06-20T05:19:14.441-07:00] #22 0x0000000007de90b4 llvm::legacy::PassManagerImpl::run(llvm::Module&) llvm-project/llvm/lib/IR/LegacyPassManager.cpp:538:44
[2023-06-20T05:19:14.441-07:00] #23 0x0000000005dd8d5a codegen(llvm::lto::Config const&, llvm::TargetMachine*, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex const&) llvm-project/llvm/lib/LTO/LTOBackend.cpp:415:17

Hello, I'm seeing a build failure that seems related to this patch. I'm seeing the patch has been relanded and reverted a couple times and not sure where it is right now. Can you please confirm if the failure is related and fixed? Thanks.

Hello! Currently, the commit is reverted (in commit 6bea8331f9e09ba94a225c65becd4224a1a473af). Does the build failure still occur with it?
Could you provide a reproducing code for the crash?

hoy added a comment.Jun 21 2023, 9:26 AM

Hello, I'm seeing a build failure that seems related to this patch. I'm seeing the patch has been relanded and reverted a couple times and not sure where it is right now. Can you please confirm if the failure is related and fixed? Thanks.

Hello! Currently, the commit is reverted (in commit 6bea8331f9e09ba94a225c65becd4224a1a473af). Does the build failure still occur with it?
Could you provide a reproducing code for the crash?

Thanks for the information. As of today the build failure does not exist. It only happened yesterday.

dzhidzhoev reopened this revision.Jul 27 2023, 4:47 AM
This revision is now accepted and ready to land.Jul 27 2023, 4:47 AM
This revision was landed with ongoing or failed builds.Sep 26 2023, 12:07 PM
This revision was automatically updated to reflect the committed changes.
hans added a comment.Sep 29 2023, 5:24 AM

We're hitting an assert after this change:

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331:
virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *):
Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms &&
"getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.

See https://bugs.chromium.org/p/chromium/issues/detail?id=1487910#c3 for a reproducer.

I'll revert until it can be investigated/fixed.

We're hitting an assert after this change:

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331:
virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *):
Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms &&
"getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.

See https://bugs.chromium.org/p/chromium/issues/detail?id=1487910#c3 for a reproducer.

I'll revert until it can be investigated/fixed.

Thank you! Added a fix for that https://github.com/llvm/llvm-project/pull/68986.

aeubanks added a subscriber: aeubanks.EditedNov 7 2023, 8:11 AM

Heads up, we're seeing issues from this in Chrome again, will try to post a reduced repro (https://crbug.com/1500022)

The reduced repro I found crashed very far back, so I'm guessing this patch caused bad debug info which triggers an existing issue that doesn't handle bad debug info.

../../llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp:110: void llvm::DwarfFile::addScopeVariable(LexicalScope *, DbgVariable *): Assertion 'Ret.second' failed.

is the crash btw

Is there any issue with mixing IR built with LLVMContext::enableDebugTypeODRUniquing() and without? I'm suspecting that ThinLTO is mixing Rust and Clang IR which set that differently.

another observation, opt complained about some debug info due to setting LLVMContext::enableDebugTypeODRUniquing() and stripped it away, but llc happily accepted the debug info and crashed with the assertion above when emitting assembly.

Heads up, we're seeing issues from this in Chrome again, will try to post a reduced repro (https://crbug.com/1500022)

The reduced repro I found crashed very far back, so I'm guessing this patch caused bad debug info which triggers an existing issue that doesn't handle bad debug info.

Could you provide a reproducer for that crash?

Is there any issue with mixing IR built with LLVMContext::enableDebugTypeODRUniquing() and without? I'm suspecting that ThinLTO is mixing Rust and Clang IR which set that differently.

Apparently, it's an issue with ThinLTO that might be caused by a change in MetadataLoader https://github.com/llvm/llvm-project/pull/68986.
Is there a way to get a repro like this one https://bugs.chromium.org/p/chromium/issues/detail?id=1487910#c3 if it's not too complicated?

Is there any issue with mixing IR built with LLVMContext::enableDebugTypeODRUniquing() and without? I'm suspecting that ThinLTO is mixing Rust and Clang IR which set that differently.

Apparently, it's an issue with ThinLTO that might be caused by a change in MetadataLoader https://github.com/llvm/llvm-project/pull/68986.
Is there a way to get a repro like this one https://bugs.chromium.org/p/chromium/issues/detail?id=1487910#c3 if it's not too complicated?

I've shared the repro tar with your commit email address, sorry it's so big. I tried linking again with the revert and now it looks like the debug info verifier is firing instead. Do you also need a repro tar with everything built with the revert?

jmorse added subscribers: StephenTozer, jmorse.EditedNov 13 2023, 4:11 AM

Hi,

Just to note that we've been seeing LTO crashes as a result of rG3b449bd46a11a (now reverted on trunk), which @StephenTozer has kindly reduced down to:

https://gist.github.com/jmorse/b0248c3c9f9195487ffd7c7431a8d15e

Which triggers the assertion: DwarfDebug.cpp:2335: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed. If you revert-the-revert rG6beddd668 that is.

I haven't familiarised myself with this patch series (while greatly appreciating that it exists!), so perhaps this is already obvious but: This particular assertion is a check that no additional lexical scopes are discovered during DWARF emission that weren't found during the earlier building of the lexical-scopes-map, which enumerates all scopes/inlining-chains for all instructions' DebugLocs. If any more unexpectedly appear after that, I believe there's a risk that a container for lexical scopes gets reallocated, causing random crashes. I see there are now types in the retainedNodes field for DISubprograms with "scope" fields (EDIT: !53 in the reproducer), I imagine that the discovery of those lexical scopes which weren't reachable from instructions might be causing the assertion.

Hi,

Just to note that we've been seeing LTO crashes as a result of rG3b449bd46a11a (now reverted on trunk), which @StephenTozer has kindly reduced down to:

https://gist.github.com/jmorse/b0248c3c9f9195487ffd7c7431a8d15e

Which triggers the assertion: DwarfDebug.cpp:2335: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed. If you revert-the-revert rG6beddd668 that is.

I haven't familiarised myself with this patch series (while greatly appreciating that it exists!), so perhaps this is already obvious but: This particular assertion is a check that no additional lexical scopes are discovered during DWARF emission that weren't found during the earlier building of the lexical-scopes-map, which enumerates all scopes/inlining-chains for all instructions' DebugLocs. If any more unexpectedly appear after that, I believe there's a risk that a container for lexical scopes gets reallocated, causing random crashes. I see there are now types in the retainedNodes field for DISubprograms with "scope" fields (EDIT: !53 in the reproducer), I imagine that the discovery of those lexical scopes which weren't reachable from instructions might be causing the assertion.

I appreciate your help! Can I ask you how to reproduce that crash? When I run llc or llvm-lto on the gist, the assertion is triggered

Assertion failed: (((LinkageName.empty() || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"), function applySubprogramDefinitionAttributes, file DwarfUnit.cpp, line 1222.

that persists even with this commit reverted. Are you sure that the reproducer was correctly reduced?

Is there any issue with mixing IR built with LLVMContext::enableDebugTypeODRUniquing() and without? I'm suspecting that ThinLTO is mixing Rust and Clang IR which set that differently.

Apparently, it's an issue with ThinLTO that might be caused by a change in MetadataLoader https://github.com/llvm/llvm-project/pull/68986.
Is there a way to get a repro like this one https://bugs.chromium.org/p/chromium/issues/detail?id=1487910#c3 if it's not too complicated?

I've shared the repro tar with your commit email address, sorry it's so big. I tried linking again with the revert and now it looks like the debug info verifier is firing instead. Do you also need a repro tar with everything built with the revert?

Thank you so much! I think that's enough.

Hi,

Just to note that we've been seeing LTO crashes as a result of rG3b449bd46a11a (now reverted on trunk), which @StephenTozer has kindly reduced down to:

https://gist.github.com/jmorse/b0248c3c9f9195487ffd7c7431a8d15e

Which triggers the assertion: DwarfDebug.cpp:2335: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed. If you revert-the-revert rG6beddd668 that is.

I haven't familiarised myself with this patch series (while greatly appreciating that it exists!), so perhaps this is already obvious but: This particular assertion is a check that no additional lexical scopes are discovered during DWARF emission that weren't found during the earlier building of the lexical-scopes-map, which enumerates all scopes/inlining-chains for all instructions' DebugLocs. If any more unexpectedly appear after that, I believe there's a risk that a container for lexical scopes gets reallocated, causing random crashes. I see there are now types in the retainedNodes field for DISubprograms with "scope" fields (EDIT: !53 in the reproducer), I imagine that the discovery of those lexical scopes which weren't reachable from instructions might be causing the assertion.

I appreciate your help! Can I ask you how to reproduce that crash? When I run llc or llvm-lto on the gist, the assertion is triggered

Assertion failed: (((LinkageName.empty() || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"), function applySubprogramDefinitionAttributes, file DwarfUnit.cpp, line 1222.

that persists even with this commit reverted. Are you sure that the reproducer was correctly reduced?

Heads up, we're seeing issues from this in Chrome again, will try to post a reduced repro (https://crbug.com/1500022)

The reduced repro I found crashed very far back, so I'm guessing this patch caused bad debug info which triggers an existing issue that doesn't handle bad debug info.

Fix for that https://github.com/llvm/llvm-project/pull/75385.

Hi,

! In D144006#4658037, @dzhidzhoev wrote:
I appreciate your help! Can I ask you how to reproduce that crash? When I run llc or llvm-lto on the gist, the assertion is triggered
[...]
that persists even with this commit reverted. Are you sure that the reproducer was correctly reduced?

Apologies, Phab has stopped sending emails out so I didn't see this, and it looks like my sanitising of strings in that gist caused the extra problems. I've now updated the gist so that it compiles without that unrelated assertion; it should be "revision 2" of the same gist link.