This is an archive of the discontinued LLVM Phabricator instance.

Remove line and file from DINamespace
ClosedPublic

Authored by aprantl on Apr 28 2017, 9:42 AM.

Details

Summary

Fixes the issue highlighted in http://lists.llvm.org/pipermail/cfe-dev/2014-June/037500.html.

The DW_AT_decl_file and DW_AT_decl_line attributes on namespaces prevent LLVM from uniquing types that are in the same namespace. They also don't carry any meaningful information.

Here's the canonical example of why this is broken:

$ cat test1.cpp
#include "test.h" // namespace test is in test.h
test::Test<int> foo1() {
   return test::Test<int>();
}

$ cat test2.cpp
namespace test { // namespace test is in test2.cpp
}
#include "test.h"
test::Test<int> foo2() {
   return test::Test<int>();
}

$ cat test.h
namespace test {
template <class T>
class Test {
};
}

$ clang++ -g -emit-llvm test1.cpp -S -o test1.ll
$ clang++ -g -emit-llvm test2.cpp -S -o test2.ll
$ llvm-link test1.ll test2.ll -S -o linked.ll
$ llc linked.ll

This patch removes the line and file attribute from DINamespace, and generates anonymous namespaces as distinct to prevent uniquing where it would be incorrect. The discussion in the linked thread predates distinct MDNodes which was the missing ingredient to make this happen.

rdar://problem/17484998

Diff Detail

Repository
rL LLVM

Event Timeline

aprantl created this revision.Apr 28 2017, 9:42 AM
dexonsmith added inline comments.Apr 28 2017, 9:48 AM
lib/Bitcode/Reader/MetadataLoader.cpp
1385โ€“1397 โ†—(On Diff #97112)

Is there something we should do here for upgrades for anonymous namespaces?

  • Old ones won't be distinct in bitcode.
  • They'll suddenly start merging.

Maybe it's not so bad if they merge, but if we do something easily (maybe triggered off of whether they have a name) we should try.

aprantl updated this revision to Diff 97120.Apr 28 2017, 10:21 AM

Add a bitcode upgrade for anonymous namespaces.
Thanks, Duncan!

dexonsmith edited edge metadata.Apr 28 2017, 10:28 AM

Upgrade LGTM. (I'm not planning to review the rest, but let me know if you need me to take a closer look.)

I'm all for this, couple of questions though:

  • When would the use of "distinct" on an anonymous namespace matter? Since LLVM uses cross-CU references to refer to entities in different CUs, never importing/moving entities between CUs, the anonymous namespaces shouldn't collide, should they? (anymore than named namespaces would collide when there are different entities in different CUs but both in the same named namespace - LLVM still produces two separate namespaces, one in each CU, one with each of the respective entities)?
  • Which tests cover the LLVM codegen side of this change - several tests no longer look for the line/file, but do any tests ensure it is not present? (if this patch were reverted, which tests would fail (not a perfect question, since the change in the metadata format would cause parse failures - but not talking about those))

I'm all for this, couple of questions though:

  • When would the use of "distinct" on an anonymous namespace matter? Since LLVM uses cross-CU references to refer to entities in different CUs, never importing/moving entities between CUs, the anonymous namespaces shouldn't collide, should they? (anymore than named namespaces would collide when there are different entities in different CUs but both in the same named namespace - LLVM still produces two separate namespaces, one in each CU, one with each of the respective entities)?

I believe the distinct is necessary because anonymous namespaces whose parent scope is the CU have a null scope. (This is what DIBuilder does to every type node to allow for type uniquing.) If the anonymous namespace are not distinct, all top-level anonymous namespaces would be uniqued. Thus two types from different CU's that are in different anonymous namespaces would both have the same scope pointing to the uniqued anonymous namespace. It is not unthinkable that this could happen to work correctly because of how DwarfDebug is processing the data, but I'd rather not rely on this.

  • Which tests cover the LLVM codegen side of this change - several tests no longer look for the line/file, but do any tests ensure it is not present? (if this patch were reverted, which tests would fail (not a perfect question, since the change in the metadata format would cause parse failures - but not talking about those))

test/DebugInfo/X86/dwarf-public-names.ll
will fail, but not for a good reason: It hardcodes the size of the CU :-)
I'll add CHECK-NOTs to one of the tests!

aprantl updated this revision to Diff 97134.Apr 28 2017, 11:52 AM

Added a negative check for decl_line/decl_file to DebugInfo/Generic/namespace.ll

aprantl updated this revision to Diff 97160.Apr 28 2017, 2:23 PM

Don't make anonymous namespaces distinct.

dblaikie added inline comments.Apr 28 2017, 2:37 PM
lib/Bitcode/Reader/MetadataLoader.cpp
1389 โ†—(On Diff #97160)

This is no longer accurate (& the code below's no longer correct)?

lib/Bitcode/Writer/BitcodeWriter.cpp
1647 โ†—(On Diff #97160)

So this won't improve the bitcode size? (well, it won't produce the file name/line number, but the fields will still be there, with teh minimal 0 encoding) Is that necessary? I thought in other places the number of fields in a record were detected and used to do some of the back/forwards compat?

lib/IR/AsmWriter.cpp
1759 โ†—(On Diff #97160)

Why is the file printing still here, but the line printing is removed? Seems inconsistent/surprising?

lib/IR/DIBuilder.cpp
733โ€“735 โ†—(On Diff #97160)

I'd probably ramble on a bit more here - explaining that not only is it OK, but an explicit tradeoff of link time versus memory usage versus code simplicity & could be revisited if those tradeoffs turn out to be not right.

Also, from my perspective it's not so much because the children (& maybe better to rephrase that "the things referring to an anonymous namespace" because the namespace doesn't have children in the mechanical sense - other nodes refer to it... eh anyway, whatever you think's best) are unique but because they're necessarily in different CUs & don't move between them because of how they're referenced.

aprantl updated this revision to Diff 97167.Apr 28 2017, 3:11 PM

Address further review feedback from David.

aprantl marked 5 inline comments as done.Apr 28 2017, 3:12 PM
dblaikie accepted this revision.Apr 28 2017, 3:21 PM
dblaikie added inline comments.
lib/CodeGen/AsmPrinter/DwarfUnit.cpp
379 โ†—(On Diff #97167)

no need to pass the filename down here either? (why does DINamespace still have a 'getFilename'? (I'm assuming the reason the line isn't passed down is because DINamespace doesn't have getLine anymore?))

Yeah, I see DIScope still has getFilename - one day might be nice to refactor out some of this if not all scopes have/need this information.

lib/IR/DebugInfoMetadata.cpp
513 โ†—(On Diff #97167)

What's the first null parameter here represent/for?

This revision is now accepted and ready to land.Apr 28 2017, 3:21 PM
This revision was automatically updated to reflect the committed changes.