Some (rare) control-flow graphs in datacenter binaries contain many basic
blocks, e.g., >50K. In such cases, the existing implementation might take
up to several hours. This diff introduces a simple flag, MaxChainSize, that
forbids creating chains of blocks exceeding the specified size. When setting
MaxChainSize=4096, the runtime of the algorithm decreases by ~100x on some
production benchmarks.
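
For illustration, a minimal sketch of the kind of size check such a flag
implies. The Chain struct and the canMerge/tryMerge helpers are hypothetical
names, not the ones used in the actual implementation, and treating the limit
as inclusive is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a chain: an ordered list of basic block indices.
struct Chain {
  std::vector<std::size_t> Blocks;
};

// Returns true if merging the two chains keeps the result within the limit.
// Assumption: a merged chain of exactly MaxChainSize blocks is still allowed.
bool canMerge(const Chain &A, const Chain &B, std::size_t MaxChainSize) {
  return A.Blocks.size() + B.Blocks.size() <= MaxChainSize;
}

// Appends From to Into when the size limit permits; otherwise leaves both
// chains untouched, so an oversized merge candidate is simply skipped.
bool tryMerge(Chain &Into, Chain &From, std::size_t MaxChainSize) {
  assert(MaxChainSize > 0 && "limit must be positive");
  if (!canMerge(Into, From, MaxChainSize))
    return false;
  Into.Blocks.insert(Into.Blocks.end(), From.Blocks.begin(),
                     From.Blocks.end());
  From.Blocks.clear();
  return true;
}
```

Rejecting a candidate before it is evaluated is what would bound the work
per merge step: no chain grows past the limit, so no single merge has to
consider more than MaxChainSize blocks.
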
Separately, this diff introduces an option to disable ext-tsp for functions
without profile data (OFF by default). If we ever hit a runtime problem, we
can turn it ON.
What's the typical chain size, and how often do we actually hit the 4096 limit?