The IR/MIR pseudo probe intrinsics are not materialized into real machine instructions, so they incur no direct runtime cost. They do have an indirect cost, however, because they block certain optimizations. Some of that blocking is intentional (such as blocking code merge) to keep counts quality high, while the rest is accidental. This change unblocks perf-critical optimizations that do not affect counts quality. They include:
- IR InstCombine, sinking load operations to shorten lifetimes.
- MIR LiveRangeShrink, similar to the InstCombine case.
- MIR MachineSinking, similar to the InstCombine case.
- MIR TwoAddressInstructionPass, i.e., the opeq transform.
- MIR function argument copy elision.
- IR stack protection (not perf-critical, but nice to have).
Similar to the IR passes, should we add a MachineInstr::isDebugOrPseudoInstr()?
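
To make the question concrete, here is a minimal, self-contained sketch of the idea. It is a toy model and not LLVM code: `Kind`, `Instr`, `countRealInstrs`, and `onlyMetaInstrsBetween` are hypothetical names invented for illustration, and only the predicate name follows the one suggested above. The point is that a pass scanning between a def and its use can skip debug and pseudo probe instructions with one check, so a probe sitting between them does not accidentally block the transform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an instruction kind; this is NOT the LLVM MachineInstr class.
enum class Kind { Real, Debug, PseudoProbe };

struct Instr {
  Kind kind;

  bool isDebugInstr() const { return kind == Kind::Debug; }
  bool isPseudoProbe() const { return kind == Kind::PseudoProbe; }

  // Hypothetical combined predicate: true for instructions that emit no
  // machine code and should not, by themselves, block an optimization.
  bool isDebugOrPseudoInstr() const {
    return isDebugInstr() || isPseudoProbe();
  }
};

// Counts the instructions in [begin, end) that a pass must treat as real code.
std::size_t countRealInstrs(const std::vector<Instr> &block,
                            std::size_t begin, std::size_t end) {
  assert(begin <= end && end <= block.size());
  std::size_t count = 0;
  for (std::size_t i = begin; i < end; ++i) {
    if (!block[i].isDebugOrPseudoInstr())
      ++count;
  }
  return count;
}

// Returns true if only debug or pseudo probe instructions lie strictly
// between the instruction at index `def` and the one at index `use`.
// A sinking or folding pass could use a check like this as a precondition.
bool onlyMetaInstrsBetween(const std::vector<Instr> &block,
                           std::size_t def, std::size_t use) {
  return countRealInstrs(block, def + 1, use) == 0;
}
```

With a single combined predicate, each pass needs one check instead of separate debug and pseudo probe checks, which is the same convenience the IR side already relies on in this change.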