This is functionally equivalent to the old implementation.
As per https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=4739e6e4eb54d3736e6457249c0919b30f6c855a&stat=instructions
this is a clear compile-time win, with no regressions and an overall geomean improvement of -0.08%.
32 PHIs appears to be the sweet spot; both 16 and 64 performed worse:
https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=c4efe1fbbfdf0305ac26cd19eacb0c7774cdf60e&stat=instructions
https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=e4989d1c67010d3339d1a40ff5286a31f10cfe82&stat=instructions
If we have more PHIs than that, we fall back to the original DenseSet-based implementation,
so the not-so-fast cases will still be handled.
However, compile time isn't the main motivation here.
I can name at least 3 limitations of this CSE:
- Assumes that all PHI nodes list their incoming basic blocks in the same order (this can be fixed while keeping the DenseMap)
- Does not special-case undef incoming values (I don't see how we could do this with hashing)
- Does not special-case backedge incoming values (this could perhaps be fixed by hashing the backedge as some sentinel value)
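For illustration, here is a minimal standalone sketch of the hash-based tier, which also demonstrates the first limitation (order sensitivity). `Phi`, `hashPhi`, and `csePhis` are hypothetical names for this sketch, not identifiers from the actual patch:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a PHI node: a list of incoming
// (basic block, value) pairs, identified here by plain strings.
struct Phi {
  std::vector<std::pair<std::string, std::string>> Incoming;
};

// Hash a PHI by folding the hashes of its incoming pairs *in order*.
// This mirrors the order-sensitivity limitation: PHIs that are
// semantically identical but list their predecessors in a different
// order hash differently, so the CSE misses them.
static std::size_t hashPhi(const Phi &P) {
  std::size_t H = 0;
  std::hash<std::string> HS;
  for (const auto &[BB, V] : P.Incoming)
    H = H * 31 + (HS(BB) * 131 + HS(V));
  return H;
}

// Hash-based CSE: map hash -> index of the first PHI seen with that
// hash; later duplicates are redirected to it. (A real implementation
// must also compare the PHIs to guard against hash collisions;
// elided here for brevity.)
std::vector<int> csePhis(const std::vector<Phi> &Phis) {
  std::unordered_map<std::size_t, int> Seen;
  std::vector<int> ReplacedBy(Phis.size());
  for (int I = 0; I != (int)Phis.size(); ++I) {
    auto [It, Inserted] = Seen.try_emplace(hashPhi(Phis[I]), I);
    ReplacedBy[I] = It->second; // self if unique, earlier PHI if duplicate
  }
  return ReplacedBy;
}
```

An exact duplicate is deduplicated, while the same incoming pairs in a different predecessor order are not, which is exactly the first limitation above.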
Thoughts?