DemandedBits currently uses a simple vector for the worklist, which means that an instruction may be inserted into it multiple times. Especially in combination with the deep lattice, this may cause instructions to be recomputed very often. To avoid this, switch to a SetVector.
Here are two debug traces for a very simple example, before and after this change: https://gist.github.com/nikic/83ea77186f1b873725135299b31aff3c
Notice that %x is recomputed 31 times before and once after.
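To illustrate why deduplicating the worklist helps, here is a minimal standalone sketch. It is not the DemandedBits code: `SimpleSetVector` is a hypothetical stand-in that mimics the insert-once behaviour of a SetVector using only the standard library, and the "users all enqueue the same operand" scenario is a toy assumption rather than the example from the traces.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a SetVector: keeps insertion order and
// rejects elements that are already pending.
template <typename T>
class SimpleSetVector {
  std::vector<T> Order;        // pending elements, in insertion order
  std::unordered_set<T> Seen;  // membership check for pending elements

public:
  // Returns true if V was newly added, false if it was already pending.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }

  bool empty() const { return Order.empty(); }

  // Removes and returns the most recently inserted element.
  T pop_back_val() {
    T V = Order.back();
    Order.pop_back();
    Seen.erase(V);
    return V;
  }
};

// Toy scenario: NumUsers users each enqueue the same operand (id 0)
// before the worklist is drained. Returns how often it is processed.
std::size_t recomputationsWithVector(int NumUsers) {
  std::vector<int> Worklist;
  for (int I = 0; I < NumUsers; ++I)
    Worklist.push_back(0);  // duplicates are stored as-is
  std::size_t Count = 0;
  while (!Worklist.empty()) {
    Worklist.pop_back();
    ++Count;  // one recomputation per popped entry
  }
  return Count;
}

std::size_t recomputationsWithSetVector(int NumUsers) {
  SimpleSetVector<int> Worklist;
  for (int I = 0; I < NumUsers; ++I)
    Worklist.insert(0);  // only the first insert takes effect
  std::size_t Count = 0;
  while (!Worklist.empty()) {
    Worklist.pop_back_val();
    ++Count;
  }
  return Count;
}
```

In this toy setup, `recomputationsWithVector(5)` returns 5 while `recomputationsWithSetVector(5)` returns 1. An element can still be re-inserted after it has been popped, so updates that arrive later are not lost; only redundant pending entries are dropped.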