This is an archive of the discontinued LLVM Phabricator instance.

[Verifier] Check that ValueAsMetadata belongs to the module
Needs RevisionPublic

Authored by rtereshin on Jun 8 2018, 5:48 PM.

Details

Summary

Judging from https://bugs.llvm.org/show_bug.cgi?id=37685 we sometimes end up with module-level metadata that have value-operands belonging to a different module.

I've added the IR Verifier check and hit another interesting case:

llvm/test/ThinLTO/X86/dicompositetype-unique2.ll fails on the 3rd RUN-line (llvm-lto --thinlto-action=run %t1.bc %t2.bc -thinlto-save-temps=%t3.) with the following (partial) stack trace:

frame #11: 0x0000000100df4c5f llvm-lto`llvm::verifyModule(M=0x00000001077017c0, OS=0x00000001039aba08, BrokenDebugInfo=0x0000700002f56d4b) at Verifier.cpp:4672
    frame #12: 0x0000000100b448f3 llvm-lto`llvm::UpgradeDebugInfo(M=0x00000001077017c0) at AutoUpgrade.cpp:3099
    frame #13: 0x000000010221037c llvm-lto`llvm::FunctionImporter::importFunctions(this=0x0000700002f57d80, DestModule=0x0000000107700f00, ImportList=0x000000010750b9c8) at FunctionImport.cpp:970
    frame #14: 0x0000000100ee52b3 llvm-lto`(anonymous namespace)::crossImportIntoModule(TheModule=0x0000000107700f00, Index=0x000000010750bd50, ModuleMap=0x00007ffeefbfb8f0, ImportList=0x000000010750b9c8) at ThinLTOCodeGenerator.cpp:202
    frame #15: 0x0000000100f03d82 llvm-lto`(anonymous namespace)::ProcessThinLTOModule(TheModule=0x0000000107700f00, Index=0x000000010750bd50, ModuleMap=0x00007ffeefbfb8f0, TM=0x0000000107306be0, ImportList=0x000000010750b9c8, ExportList=size=0, GUIDPreservedSymbols=0x00007ffeefbfb8a8, DefinedGlobals=0x000000010750b888, CacheOptions=0x00007ffeefbff608, DisableCodeGen=false, SaveTempsDir=(Data = "/Volumes/Data/llvm/build/obj/test/ThinLTO/X86/Output/dicompositetype-unique2.ll.tmp3", Length = 84), Freestanding=false, OptLevel=3, count=0) at ThinLTOCodeGenerator.cpp:460
    frame #16: 0x0000000100f022d6 llvm-lto`llvm::ThinLTOCodeGenerator::run(this=0x000000010750dd88, count=0)::$_2::operator()(int) const at ThinLTOCodeGenerator.cpp:1016

That is an independent problem as CloneFunctionInto isn't used by this llvm-lto run at all, though it's quite similar: it would be !DITemplateValueParameter(name: "F", type: !24, value: %struct.S* ()* @_Z3Getv) referenced from a DISubprogram associated with a function belonging to a module M1, while the global value referenced by DITemplateValueParameter itself would belong to a different module M2.

Diff Detail

Repository
rL LLVM

Event Timeline

rtereshin created this revision.Jun 8 2018, 5:48 PM

Is it possible to add a short unittest for this?

Hi Roman, so I hit PR36785 while working on PR36784 which is a very similar bug triggered under certain conditions with LTO, you may want to take a look at that bug as well. It’s occurring for basically the exact same reason in another part of the code base, uniquing MD nodes leads to references across modules. In my case, my change to fix it (D47898) works around this because I took this (surprising) behavior to be expected, e.g. this review for a similar bug https://reviews.llvm.org/D26212 where the conclusion seemed to be that this behavior was expected—though that may be the wrong decision. Given that there’s been some history here and we’re taking slightly different approaches I think it’d be worth it to start a thread on llvm-dev to get some feedback on whether this behavior is allowable. Give me a day or so to take a look at ODR type uniquing in LTO and send that email.

Actually, I'm not sure that mail is necessary. After looking at ODR type uniquing a bit I think the correct behavior is to wrap your check in if (!M.getContext().isODRUniquingDebugTypes()) since I believe we expect ODR uniquing to be able to create cross-module references but we should otherwise not see them. Adding @tejohnson @pcc to take a look as well.

Actually, I'm not sure that mail is necessary. After looking at ODR type uniquing a bit I think the correct behavior is to wrap your check in if (!M.getContext().isODRUniquingDebugTypes()) since I believe we expect ODR uniquing to be able to create cross-module references but we should otherwise not see them. Adding @tejohnson @pcc to take a look as well.

Metadata are shared between modules by design, but having a Value (in this case, a global constant) that belongs to one module and is reachable from another in the same time is odd. Is that really something we explicitly support? Will we get a dangling pointer reachable from an alive module if the owning module gets deallocated?

Metadata are shared between modules by design, but having a Value (in this case, a global constant) that belongs to one module and is reachable from another in the same time is odd. Is that really something we explicitly support?

I agree that it's surprising, Teresa/Dexon/Mehdi have worked on DI ODR uniquing before so they're in a better place to comment on whether it's expected.

Will we get a dangling pointer reachable from an alive module if the owning module gets deallocated?

Value::~Value() calls out ValueAsMetadata::handleDeletion() so it seems like it would not leave a stale reference though I haven't tested this (and I'm also not sure if the code that handles DITemplateValueParameters would handle null Metadata entries well.)

Value::~Value() calls out ValueAsMetadata::handleDeletion() so it seems like it would not leave a stale reference

Looks like it. On the other hand, I would think that referencing a value from a different module is more tolerable if there is no copy of that value in the module directly referencing the metadata node. In all the tests cases we've got so far however it's not the case as the value referenced always has a copy in every module involved, so it seems having these cross-module references is just calling for trouble. Especially given this fair consideration:

I'm also not sure if the code that handles DITemplateValueParameters would handle null Metadata entries well.

Even if it does, it's a loss of information for no apparent reason.

IIRC, sharing uniqued metadata across modules should only happen as a temporary state. It's a trick for when loading modules that will be merged together, to avoid unnecessarily creating multiple copies of the graphs.

I feel like the verifier should reject this state, unless it's specifically being called in a context where modules-to-be-merged are still being loaded.

IIRC, sharing uniqued metadata across modules should only happen as a temporary state. It's a trick for when loading modules that will be merged together, to avoid unnecessarily creating multiple copies of the graphs.

I feel like the verifier should reject this state, unless it's specifically being called in a context where modules-to-be-merged are still being loaded.

The crash noted in the test occurs during cross-module importing, does that count as merging? Both of the modules are independently codegened, not merged together, but there is importing. To me it seems like cross-module references are a temporary state while merging for monolithic LTO, but not a temporary state under ThinLTO.

IIRC, sharing uniqued metadata across modules should only happen as a temporary state. It's a trick for when loading modules that will be merged together, to avoid unnecessarily creating multiple copies of the graphs.

I feel like the verifier should reject this state, unless it's specifically being called in a context where modules-to-be-merged are still being loaded.

The crash noted in the test occurs during cross-module importing, does that count as merging? Both of the modules are independently codegened, not merged together, but there is importing. To me it seems like cross-module references are a temporary state while merging for monolithic LTO, but not a temporary state under ThinLTO.

Yes this is essentially merging. For ThinLTO importing we still IRMove in the source module, but pass in the set of globals to move so that it is selective. The assert shown in the patch description here is happening after the source module has been loaded by the importer, but just before we have done the IRMove. So this is very much in a temporary state. Note that the caller of the function importer (crossImportIntoModule) does call the verifier again right after all importing is complete. It would be interesting to see if the assert is hit there if we can somehow skip it during the call from UpgradeDebugInfo.

The source modules being imported from are loaded separately from the memory buffers when they are codegened by a different thread.

dexonsmith requested changes to this revision.Oct 17 2020, 6:03 AM

Marking as needing changes since this would trigger in ThinLTO’s temporary state.

This revision now requires changes to proceed.Oct 17 2020, 6:03 AM