This is an archive of the discontinued LLVM Phabricator instance.

[StripDeadDebugInfo] Drop dead CUs for const global expression
ClosedPublic

Authored by bader on Aug 4 2022, 9:18 AM.

Details

Summary

Compile units are no longer kept unconditionally just because they have
a const global expression. They should be kept only if that const global
is used in a function that is contained in the extracted module.

Signed-off-by: Mikhail Lychkov <mikhail.lychkov@intel.com>

Diff Detail

Event Timeline

bader created this revision. Aug 4 2022, 9:18 AM
Herald added a project: Restricted Project. Aug 4 2022, 9:18 AM
bader requested review of this revision. Aug 4 2022, 9:18 AM

I don't think this would end up matching the user expectation. If I'm in the debugger I would expect to be able to access global constants in the expression evaluator. Why would optimizing them out be the desired behavior?

bader added a comment. Sep 13 2022, 6:32 AM

I don't think this would end up matching the user expectation. If I'm in the debugger I would expect to be able to access global constants in the expression evaluator.

That sounds reasonable.

Why would optimizing them out be the desired behavior?

I'm following up on the discussion in this thread: https://discourse.llvm.org/t/questions-about-the-work-of-stripdeaddebuginfo-pass/60278. The situation we have is the following. There is an LLVM module (A), which was extracted from a bigger LLVM module (B). Let's imagine that the original module B is a library of functions and we are trying to extract a subset (i.e. module A) of functions that satisfy certain requirements (e.g. don't use particular HW features). We observe that module A includes a lot of debug information that is not relevant to the extracted functions. If I understand correctly, this patch removes compile unit debug information if it's referenced by "dead" constant expressions ("dead" here means they are not used in the extracted module A).
I think we have the following case in our code:

//-------------------------------------
// a.cpp
int shared(int);
const int A = 8 * 10;
int foo() { return A; }

int bar() { return shared(A); }

//-------------------------------------
// b.cpp
const int B = 2 * 20;
int baz() { return B; }

int shared (int x) { return x; }

//-------------------------------------
// c.cpp
const int C = 3 + 2;

int foobar() { return C; }

We compile and link these files.

clang++ -c -S -emit-llvm a.cpp -g
clang++ -c -S -emit-llvm b.cpp -g
clang++ -c -S -emit-llvm c.cpp -g
llvm-link a.ll b.ll c.ll -S -o linked.ll

Then let's say we extract only the function foo. I see the following debug info in the extracted module:

!7 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !8, producer: "clang version 16.0.0", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !9, splitDebugInlining: false, nameTableKind: None)
!8 = !DIFile(filename: "b.cpp", directory: "bader/tmp", checksumkind: CSK_MD5, checksum: "0c078350ba11ff83d6f23257d234280f")
!9 = !{!10}
!10 = !DIGlobalVariableExpression(var: !11, expr: !DIExpression(DW_OP_constu, 40, DW_OP_stack_value))
!11 = distinct !DIGlobalVariable(name: "B", scope: !7, file: !8, line: 2, type: !5, isLocal: true, isDefinition: true)
!12 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !13, producer: "clang version 16.0.0", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !14, splitDebugInlining: false, nameTableKind: None)
!13 = !DIFile(filename: "c.cpp", directory: "bader/tmp", checksumkind: CSK_MD5, checksum: "ae01ec03776b2faa3f5f8d470bac1874")
!14 = !{!15}
!15 = !DIGlobalVariableExpression(var: !16, expr: !DIExpression(DW_OP_constu, 5, DW_OP_stack_value))
!16 = distinct !DIGlobalVariable(name: "C", scope: !12, file: !13, line: 3, type: !5, isLocal: true, isDefinition: true)

Does it make sense to keep debug metadata for constant C in this case?

I think your example makes perfect sense, but it sounds like this should be a feature that only llvm-extract or whatever tool you are using in the described workflow should enable. Generally you don't want "dead" constant's debug info stripped in a normal compilation because you still want to be able to see these constants in the debugger. In your workflow the compiler has special knowledge that this debug info will be available externally, so it makes sense there.

bader added a comment. Mar 30 2023, 8:14 PM

I think your example makes perfect sense, but it sounds like this should be a feature that only llvm-extract or whatever tool you are using in the described workflow should enable. Generally you don't want "dead" constant's debug info stripped in a normal compilation because you still want to be able to see these constants in the debugger. In your workflow the compiler has special knowledge that this debug info will be available externally, so it makes sense there.

Is it okay to add an llvm::cl::opt to customize this pass, or are there better ways?

bader updated this revision to Diff 530148. Jun 9 2023, 9:29 PM

Add an option (disabled by default) to strip CUs with references to non-existing const global expressions.
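
To make the described behavior concrete, here is a minimal C++ sketch of the decision this update talks about. It is a simplified model and not the code from the patch: CompileUnit, GlobalVarExpr, liveCompileUnits, exampleModule and the stripDeadConstants flag are hypothetical names chosen for illustration, and the real pass operates on LLVM debug metadata rather than on plain structs.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a DIGlobalVariableExpression entry.
struct GlobalVarExpr {
  std::string name;
  bool isConstantExpr; // e.g. DIExpression(DW_OP_constu, 40, DW_OP_stack_value)
};

// Hypothetical stand-in for a DICompileUnit.
struct CompileUnit {
  std::string file;
  std::vector<GlobalVarExpr> globals;
  bool hasLiveSubprogram; // a function of this CU is still in the module
};

// Returns the file names of the compile units that would be kept.
// liveGlobals: names of globals that still exist in the module's IR.
// stripDeadConstants: hypothetical option, off by default in this model.
std::vector<std::string>
liveCompileUnits(const std::vector<CompileUnit> &cus,
                 const std::set<std::string> &liveGlobals,
                 bool stripDeadConstants = false) {
  std::vector<std::string> kept;
  for (const CompileUnit &cu : cus) {
    bool keep = cu.hasLiveSubprogram;
    for (const GlobalVarExpr &g : cu.globals) {
      if (liveGlobals.count(g.name)) {
        keep = true; // the global itself is still in the module
      } else if (g.isConstantExpr && !stripDeadConstants) {
        keep = true; // default: constants stay visible to the debugger
      }
    }
    if (keep)
      kept.push_back(cu.file);
  }
  return kept;
}

// The a.cpp/b.cpp/c.cpp case from this thread after extracting only foo().
std::vector<CompileUnit> exampleModule() {
  return {
      {"a.cpp", {{"A", true}}, true},  // foo() lives here
      {"b.cpp", {{"B", true}}, false}, // only a constant expression remains
      {"c.cpp", {{"C", true}}, false}, // only a constant expression remains
  };
}
```

In this model, with no live IR globals, the default setting keeps all three compile units, while enabling the hypothetical flag keeps only the a.cpp compile unit. That mirrors the two positions in the discussion: constants stay available in a normal compilation, and an extraction workflow can opt in to dropping them.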

bader added a comment. Jul 5 2023, 1:02 PM

Friendly ping^2.

aprantl accepted this revision. Jul 18 2023, 2:00 PM
This revision is now accepted and ready to land. Jul 18 2023, 2:00 PM