To keep a branch or a fused pair from crossing or ending against the boundary, we currently emit NOPs before it (D70157), and in most cases this recovers the performance lost to the microcode update. We have also observed cases where NOP padding does not mitigate the effect very well but prefix padding does (D72225). As we discussed regarding the prefix padding, D72225 adds prefixes aggressively, and as a result every single instruction ends up in its own fragment, which greatly increases memory usage. We therefore propose a light-weight solution: to align a branch, at most one instruction can be prefixed, and if there is not enough room to add segment prefixes, NOPs are inserted instead. We measured the memory usage of the link step with LTO when building SPEC; it increased only slightly compared to NOP padding. We also turned on the new prefix padding by default and passed our internal large test set and LLVM's test suite.
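To make the fallback rule concrete, here is a minimal sketch of the decision described above. It is not the code in this patch: the names `paddingNeeded` and `choosePadding`, the 32-byte boundary, and the all-or-nothing choice between prefixes and NOPs are assumptions made for illustration. The only hard fact relied on is the 15-byte x86 instruction length limit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch, not the patch's implementation.
constexpr uint64_t kBoundary = 32;   // assumed alignment boundary in bytes
constexpr unsigned kMaxInstLen = 15; // x86 architectural instruction length limit

struct Padding {
  unsigned PrefixBytes; // segment prefixes added to the one preceding instruction
  unsigned NopBytes;    // NOP bytes emitted before the branch instead
};

// Bytes of padding needed so that a branch (or fused pair) of `Size` bytes
// placed at `Offset` neither crosses nor ends against the boundary.
inline unsigned paddingNeeded(uint64_t Offset, unsigned Size) {
  assert(Size < kBoundary && "sketch assumes the branch fits in one window");
  uint64_t Start = Offset % kBoundary;
  uint64_t End = Start + Size; // exclusive end, relative to the window start
  if (End < kBoundary)
    return 0; // fits in the window and does not end at the boundary
  return static_cast<unsigned>(kBoundary - Start); // push to the next window
}

// Prefix at most one instruction (the one before the branch). If the prefixes
// would push it past the length limit, fall back to NOP padding.
inline Padding choosePadding(uint64_t Offset, unsigned Size, unsigned PrevInstLen) {
  unsigned Need = paddingNeeded(Offset, Size);
  if (Need == 0)
    return {0, 0};
  if (PrevInstLen + Need <= kMaxInstLen)
    return {Need, 0}; // enough room: grow the previous instruction with prefixes
  return {0, Need};   // not enough room: emit NOPs before the branch
}
```

For example, under these assumptions a 6-byte branch at offset 28 needs 4 bytes of padding. A 5-byte preceding instruction can absorb them as prefixes, while a 13-byte one cannot, so NOPs are used.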
D75203 appears to support general alignment padding. If that general alignment padding supports adding prefixes to instructions, this patch is not needed. For now, this revision is opened here to avoid duplicate work.
This looks like a potentially unrelated change. Can it be separated?