The Assignment Tracking debug-info feature is outlined in this RFC. This first series of patches adds documentation, the changes necessary to start emitting and using the new metadata, and updates clang with an option to enable the feature. Working with the new metadata in the middle and back end will come later. There are still a few rough edges but I'm putting these patches up now hoping to get feedback on the design and implementation from the upstream community.
Overview
It's possible to find intrinsics linked to an instruction by looking at the MetadataAsValue uses of the attached DIAssignID. That covers instruction -> intrinsic(s) lookup. Add a global DIAssignID -> instruction(s) map which gives us the ability to perform intrinsic -> instruction(s) lookup. Add plumbing to keep the map up to date through optimisations and add utility functions including two that perform those lookups. Finally, add a unittest.
Details / patch tour
In llvm/lib/IR/LLVMContextImpl.h add AssignmentIDToInstrs which maps DIAssignID * attachments to Instruction *s. Because the DIAssignID * is the key we can't use a TrackingMDNodeRef for it, and therefore cannot easily update the mapping when a temporary DIAssignID is replaced.
Temporary DIAssignID's are only used in IR parsing to deal with metadata forward references. Update llvm/lib/AsmParser/LLParser.cpp to avoid using temporary DIAssignID's for attachments.
In llvm/lib/IR/Metadata.cpp add Instruction::updateDIAssignIDMapping which is called to remove or add an entry (or both) to AssignmentIDToInstrs. Call this from Instruction::setMetadata and add a call to setMetadata in Intruction's dtor that explicitly unsets the DIAssignID so that the mappging gets updated.
In llvm/lib/IR/DebugInfo.cpp and DebugInfo.h add utility functions:
getAssignmentInsts(const DbgAssignIntrinsic *DAI) getAssignmentMarkers(const Instruction *Inst) RAUW(DIAssignID *Old, DIAssignID *New) deleteAll(Function *F)
These core utils are tested in llvm/unittests/IR/DebugInfoTest.cpp.
Notes / observations
This all needs to be looked at closely from a performance perspective once it's up and running. This mapping is obviously quite intrusive, but I'm not sure we can get around that.
On the one hand, "llvm::at::" isn't the most descriptive of namespaces; but on the other hand, I don't believe llvm provides a stable C++ API, so if someone doesn't like it then it can just be changed. Good use of namespaces IMO.