This is an archive of the discontinued LLVM Phabricator instance.

IRMover: Avoid accidentally mapping types from the destination module (PR30799)
ClosedPublic

Authored by hans on Nov 1 2016, 1:11 PM.

Details

Summary

During an LTO build of Chromium, I ran into the "mapping to a source type" assert.

During Module linking, it's possible for SrcM->getIdentifiedStructTypes(); to return types that are actually defined in the destination module (DstM). Depending on how the bitcode file was read, getIdentifiedStructTypes() might do a walk over all values, including metadata nodes, looking for types. In my case, a debug info metadata node was shared between the two modules, and it referred to a type defined in the destination module (see test case).

Diff Detail

Repository
rL LLVM

Event Timeline

hans updated this revision to Diff 76621.Nov 1 2016, 1:11 PM
hans retitled this revision from to IRMover: Avoid accidentally mapping types from the destination module (PR30799).
hans updated this object.
hans added reviewers: pcc, rnk.
hans added a subscriber: llvm-commits.
inglorion added inline comments.
test/LTO/X86/type-mapping-bug.ll
23 ↗(On Diff #76621)

This explains what happens without the above change to IRMover.cpp, but it doesn't explain what happens after you make that change and why that behavior is correct. Could you add that information here? Other than that, looks good - thanks for fixing this!

pcc added inline comments.Nov 1 2016, 2:33 PM
test/LTO/X86/type-mapping-bug.ll
3 ↗(On Diff #76621)

Would it be possible to write this test with llvm-link instead of llvm-lto?

mehdi_amini added inline comments.Nov 1 2016, 2:43 PM
test/LTO/X86/type-mapping-bug.ll
3 ↗(On Diff #76621)

Likely not. I fixed a similar issues in r280599 / D23841 and it requires the LLVMContext to be initialized with ODR type uniquing, which only happens in LTO. (We may be able to add an option to llvm-link, but we can't really test it as it is supposed to be "NFC").

hans added inline comments.Nov 1 2016, 2:45 PM
test/LTO/X86/type-mapping-bug.ll
3 ↗(On Diff #76621)

Sadly, I wasn't able to reproduce this with llvm-link, only with llvm-lto.

Some code path must be different. While I think it should be possible to trigger the same problem there, I couldn't figure out how to do it.

23 ↗(On Diff #76621)

Thanks, updating the comment.

hans updated this revision to Diff 76637.Nov 1 2016, 2:45 PM

Updating comment.

hans added inline comments.Nov 1 2016, 2:48 PM
test/LTO/X86/type-mapping-bug.ll
3 ↗(On Diff #76621)

Thanks for the pointer! Yes, that must be it.
I'm kind of happy to see someone else has run into this. I thought I was about to go mad debugging it :-)

During Module linking, it's possible for SrcM->getIdentifiedStructTypes(); to return types that are actually defined in the destination module (DstM).

That's very surprising to me and I'd like to understand it a bit more (We're supposed to only ODR metadata).
getIdentifiedStructTypes() isn't even documented in the Module header unfortunately. It is not clear off-hand it the issue should be handled in the IRMover or in getIdentifiedStructTypes itself.

mehdi_amini added inline comments.Nov 1 2016, 2:58 PM
test/LTO/X86/type-mapping-bug.ll
3 ↗(On Diff #76621)

ODR type uniquing was introduced in the 3.9 release, so it can still be a bit rough around the edges...
(I'm debugging an infinite recursion in metadata in the IRMover when linking Webkit with ThinLTO these days)

hans added a comment.Nov 1 2016, 3:14 PM

During Module linking, it's possible for SrcM->getIdentifiedStructTypes(); to return types that are actually defined in the destination module (DstM).

That's very surprising to me and I'd like to understand it a bit more (We're supposed to only ODR metadata).

Yes, but the metadata can refer to values, which getIdentifiedStructTypes() will walk when hunting for types. It was very surprising to me too :-)

getIdentifiedStructTypes() isn't even documented in the Module header unfortunately. It is not clear off-hand it the issue should be handled in the IRMover or in getIdentifiedStructTypes itself.

I'm not sure how we'd change getIdentifiedStructTypes() though. Once we've ODR'd the metadata, the module now refers to a value with this type, and it makes sense that getIdentifiedStructTypes() finds it.

getIdentifiedStructTypes() isn't even documented in the Module header unfortunately. It is not clear off-hand it the issue should be handled in the IRMover or in getIdentifiedStructTypes itself.

I'm not sure how we'd change getIdentifiedStructTypes() though. Once we've ODR'd the metadata, the module now refers to a value with this type, and it makes sense that getIdentifiedStructTypes() finds it.

To begin with: in what situation walking the metadata graph can find a named typed that can't be found by walking only the IR?

Thanks for updating the comment, @hans. I'll give @mehdi_amini some time to look into getIdentifiedStructTypes() and whether or not it is supposed to be this way. For what it's worth, my first instinct was that it shouldn't give us the types that this diff skips over, but after a cursory glance at the implementation (since there is no documentation), I concluded that this is ok. There may be a follow-up diff in our future to clarify the intended behavior of getIdentifiedStructTypes()...

hans added a comment.Nov 1 2016, 3:28 PM

getIdentifiedStructTypes() isn't even documented in the Module header unfortunately. It is not clear off-hand it the issue should be handled in the IRMover or in getIdentifiedStructTypes itself.

I'm not sure how we'd change getIdentifiedStructTypes() though. Once we've ODR'd the metadata, the module now refers to a value with this type, and it makes sense that getIdentifiedStructTypes() finds it.

To begin with: in what situation walking the metadata graph can find a named typed that can't be found by walking only the IR?

I think my test case shows that: when the DI node looks like this:

!11 = !DITemplateValueParameter(type: !12, value: %Tricky.1* bitcast (i8* @templateValueParam to %Tricky.1*))

The non-MD IR doesn't refer to %Tricky.1, but this node has a constant expression that does.

TypeFinder::incorporateMDNode does:

if (auto *C = dyn_cast<ConstantAsMetadata>(Op)) {
  incorporateValue(C->getValue());

which hits:

/// incorporateValue - This method is used to walk operand lists finding types
/// hiding in constant expressions and other operands that won't be walked in
/// other ways.  GlobalValues, basic blocks, instructions, and inst operands are
/// all explicitly enumerated.
void TypeFinder::incorporateValue(const Value *V) {
hans added a comment.Nov 11 2016, 9:52 AM

Mehdi: ping?

mehdi_amini accepted this revision.Nov 11 2016, 5:25 PM
mehdi_amini added a reviewer: mehdi_amini.

LGTM.

I looked into how to put named type on Modules, but that has large implication (for example cloning a function from one module to another would now require special handling to remap named types).
Also before LLVM 3.0 type didn't have name, the names were kept on the side in a hash table on the module, having the current behavior has been a careful choice: http://blog.llvm.org/2011/11/llvm-30-type-system-rewrite.html

lib/Linker/IRMover.cpp
707 ↗(On Diff #76637)

some of which get linked by name add when ODR Type Uniquing is enabled on the Context

This revision is now accepted and ready to land.Nov 11 2016, 5:25 PM

Ping?

I think this should go in 3.9.1, so it'd be great to land it ASAP.

Ping?

I think this should go in 3.9.1, so it'd be great to land it ASAP.

Sorry, I got distracted by other stuff. Committing now.

As for 3.9.1, we should ask Tom (now cc'd) what he thinks.

lib/Linker/IRMover.cpp
707 ↗(On Diff #76637)

Done.

This revision was automatically updated to reflect the committed changes.