This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unblock those optimizations, the blocking pseudo probes are moved out of the original blocks and tagged dangling, instead of allowing pseudo probes to be literally removed. The reason is that when the original block is removed, we won't be able to sample it. Instead of assigning it a zero weight, moving all its pseudo probes into another block and marking them dangling should allow the counts inference a chance to assign them a more reasonable weight. We have not seen counts quality degradation from our experiments.
The optimizations being unblocked are:
- Removing conditional probes for if-converted branches. Conditional probes are tagged dangling when their homing branch arms are folded so that they will not be over-counted.
- Unblocking jump threading from removing empty blocks. Pseudo probe prevents jump threading from removing logically empty blocks that only has one unconditional jump instructions.
- Unblocking SimplifyCFG and MIR tail duplicate to thread empty blocks and blocks with redundant branch checks.
Since dangling probes are logically deleted, they should not consume any samples in LTO postLink. This can be achieved by setting their distribution factors to zero when dangled.
It sounds like "SkipPseudoOp = true" is the more common cases and only certain cases will use the default value. It will lead to the confusion whether user should use the default value or not for now. I know you mentioned you would flip the default value in the future. If you think user is better to choose "SkipPseudoOp = true" in common cases, it may be good to suggest that in the front and add a TODO for flipping the default value.