Prior to this patch, using any kind of interface (op interface, attr interface, type interface) as the key of an llvm::DenseSet, llvm::DenseMap or a related container compiles but leads to invalid pointer dereferences at runtime.
The gist of the problem is that the llvm::DenseMapInfo specialization for the base type (i.e. one of Operation*, Type or Attribute) is selected when using an interface as a key. That specialization uses getFromOpaquePointer with invalid pointer addresses to construct instances for the empty key and tombstone key values. The interface is then constructed from this invalid base type and an attempt is made to look up the implementation in the interface map, which dereferences the invalid pointer address. (For more details and the exact call chain involved, see the GitHub issue below.)
The current workaround is to use the more generic base type (e.g. DenseSet<Operation*> instead of DenseSet<FunctionOpInterface>), but this is strictly worse from a code perspective: it doesn't enforce the invariant, the code is less self-documenting, and casts have to be inserted.
This patch fixes that issue by defining a DenseMapInfo specialization for Interface subclasses which uses a new constructor to construct an instance without querying a concept. That allows getEmptyKey and getTombstoneKey to construct an interface from invalid pointer values without ever dereferencing them.
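The idea can be sketched outside of MLIR with a toy model. Everything below (ToyOperation, ToyConcept, ToyInterface, ToyInterfaceKeyInfo, the two-argument constructor signature and the sentinel addresses) is a hypothetical stand-in chosen for illustration, not the actual code of this patch; it only shows why a constructor that skips the concept lookup makes the sentinel keys safe to build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stand-ins for illustration only; these are NOT the real MLIR classes.
struct ToyConcept {
  const char *name;
};
struct ToyOperation {
  const ToyConcept *registeredConcept;
};

// Counts concept lookups so that skipping the lookup is observable.
static int lookupCount = 0;

class ToyInterface {
public:
  // Regular constructor: looks up the concept, which dereferences `op`.
  explicit ToyInterface(ToyOperation *op) : op(op), impl(lookupConcept(op)) {}

  // Sentinel constructor (hypothetical signature): stores the pointer
  // without ever dereferencing it, so `op` may be an invalid address.
  ToyInterface(ToyOperation *op, std::nullptr_t) : op(op), impl(nullptr) {}

  ToyOperation *getOperation() const { return op; }
  bool hasConcept() const { return impl != nullptr; }
  bool operator==(const ToyInterface &rhs) const { return op == rhs.op; }

private:
  static const ToyConcept *lookupConcept(ToyOperation *op) {
    ++lookupCount;
    return op->registeredConcept;
  }

  ToyOperation *op;
  const ToyConcept *impl;
};

// Plays the role of a DenseMapInfo-style traits class for the toy interface.
struct ToyInterfaceKeyInfo {
  static ToyInterface getEmptyKey() {
    return ToyInterface(sentinel(1), nullptr);
  }
  static ToyInterface getTombstoneKey() {
    return ToyInterface(sentinel(2), nullptr);
  }
  static bool isEqual(const ToyInterface &a, const ToyInterface &b) {
    return a == b;
  }

private:
  // Arbitrary, distinct addresses that are compared but never dereferenced.
  static ToyOperation *sentinel(std::uintptr_t n) {
    return reinterpret_cast<ToyOperation *>(static_cast<std::uintptr_t>(0) - n);
  }
};

// Returns true if both sentinel keys can be built without a concept lookup
// and compare as distinct values.
bool sentinelsSkipLookup() {
  const int before = lookupCount;
  ToyInterface empty = ToyInterfaceKeyInfo::getEmptyKey();
  ToyInterface tombstone = ToyInterfaceKeyInfo::getTombstoneKey();
  return lookupCount == before && !empty.hasConcept() &&
         !ToyInterfaceKeyInfo::isEqual(empty, tombstone);
}
```

In this sketch, routing the sentinel pointers through the regular one-argument constructor would read through an invalid address, which corresponds to the crash described above.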
Fixes https://github.com/llvm/llvm-project/issues/54908
Other bug fix approaches tried and tested that do not work:
- Specializing getFromOpaquePointer for mlir::detail::Interface: Doesn't work because the implementation can't differentiate between valid and invalid pointers, and always skipping the lookup of the concept implementation would be incorrect.
- Specializing llvm::DenseMapInfo for mlir::detail::Interface subclasses: Doesn't work as the specialization would be ambiguous with the existing specializations for OpState, Type and Attribute. This is because they differ only in the second std::enable_if_t parameter, and the conditions are not mutually exclusive.
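The ambiguity in the second approach can be illustrated with a small, self-contained sketch. The names ToyOpState, ToyInterfaceBase, ToyOpInterface and ToyKeyInfo are hypothetical stand-ins, and the class hierarchy is a simplifying assumption rather than the real MLIR one. The second partial specialization is left as a comment because enabling it makes the instantiation ill-formed.

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-ins for illustration only; these are NOT the real MLIR classes.
struct ToyOpState {};
template <typename ConcreteT>
struct ToyInterfaceBase {};

// Assumption for this sketch: an op interface derives from both bases.
struct ToyOpInterface : ToyOpState, ToyInterfaceBase<ToyOpInterface> {};

// Primary template, selected when no enable_if condition holds.
template <typename T, typename Enable = void>
struct ToyKeyInfo {
  static constexpr int kind = 0;
};

// Existing-style specialization, selected for anything derived from ToyOpState.
template <typename T>
struct ToyKeyInfo<T,
                  std::enable_if_t<std::is_base_of<ToyOpState, T>::value>> {
  static constexpr int kind = 1;
};

// A second specialization that differs only in its condition:
//
//   template <typename T>
//   struct ToyKeyInfo<
//       T, std::enable_if_t<std::is_base_of<ToyInterfaceBase<T>, T>::value>> {
//     static constexpr int kind = 2;
//   };
//
// Neither specialization is more specialized than the other, so with both
// present ToyKeyInfo<ToyOpInterface> would be ambiguous.

// Both conditions hold for the same type, which is the root of the ambiguity.
static_assert(std::is_base_of<ToyOpState, ToyOpInterface>::value,
              "matches the ToyOpState condition");
static_assert(
    std::is_base_of<ToyInterfaceBase<ToyOpInterface>, ToyOpInterface>::value,
    "also matches the interface condition");

// Runtime check of which template is picked in this sketch.
void checkToyDispatch() {
  assert(ToyKeyInfo<ToyOpInterface>::kind == 1);
  assert(ToyKeyInfo<int>::kind == 0);
}
```

Uncommenting the second specialization and instantiating ToyKeyInfo<ToyOpInterface> should be rejected by the compiler as an ambiguous partial specialization.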
What does the error message look like when one 1) doesn't set the dialect and/or 2) has it in a namespace? My guess is that it is not too intuitive. I wonder if we could improve that (beyond just sending out an email and giving folks a heads-up).