Prior to this patch, cloneInto would do a simple walk over the blocks and contained operations and clone and map them as it encounters them. As finishing touch it then remaps any successor and operands it has remapped during that process.
This is generally fine, but sadly leads to a lot of uses of both operations and blocks from the source region, in the cloned operations in the target region. Those uses lead to writes in the use-def list of the operations, making cloneInto never thread safe.
This patch reimplements cloneInto in three steps to avoid ever creating any extra uses on elements in the source region:
- It first creates the mapping of all blocks and block operands
- It then clones all operations to create the mapping of all operation results, but does not yet clone any regions or set the operands
- After all operation results have been mapped, it now sets the operations operands and clones their regions.
That way it is now possible to call cloneInto from multiple threads if the Region or Operation is isolated-from-above. This allows creating copies of functions or to use mlir::inlineCall with the same source region from multiple threads. In the general case, the method is thread-safe if through cloning, no new uses of Values from outside the cloned Operation/Region are created. This can be ensured by mapping any outside operands via the BlockAndValueMapping to Values owned by the caller thread.
While I was at it, I also reworked the clone method of Operation a little bit and added a proper options class to avoid having a cloneWithoutRegionsAndOperands method, and be more extensible in the future. cloneWithoutRegions is now also a simple wrapper that calls clone with the proper options set. That way all the operation cloning code is now contained solely within clone.
Regarding testing: I have no clue what an automated test for thread safety would look like, nor whether that is possible. I used TSAN on my own project to find uses of mlir::inlineCall making writes to the source callable, creating race conditions. After this patch, TSAN no longer reports any issues in my project.
Can you expand the documentation here?