Ignore ephemeral values (only feeding llvm.assume intrinsics) when
computing the instruction count to decide if a block is small enough for
threading. This is similar to the handling of these values in the
InlineCost computation. These instructions will eventually be removed
and shouldn't count against code size (similar to the existing ignoring
of phis).
Without this change, when enabling -fwhole-program-vtables, which causes
type test / assume sequences to be inserted by clang, we can get
different threading decisions. In particular, when building with
instrumentation FDO it can affect the optimizations decisions before FDO
matching, leading to some mismatches.
So there's two potential compile-time problems here: The first is that while this starts off with assume inside the block, it may scan uses outside the block as well. Additionally it scans over all assumes in order to find those in the block.
I'm not sure whether this is really important in practice, but having seen pathological ephemeral value collection during inlining, I'm being a bit cautious here.
I think for the particular case it is used for here, it might make the most sense to not collect ephemeral values upfront, instead compute them on the fly. We can do this by changing the direction of the instruction walk from end to start, and then doing something like:
This also has the advantage that we don't need to compute any ephemeral values past the ten or so instructions we look at.