The microcode update for the Jump Conditional Code (JCC) erratum may cause
performance loss for some workloads.
This patch mitigates the performance impact by aligning branches so that they
stay within a 32-byte boundary. The affected instructions are:
a. Conditional jump.
b. Fused conditional jump.
c. Unconditional jump.
d. Indirect jump.
Add two options to llvm-mc that align branches within a 32-byte boundary, to
reduce the potential performance loss of the microcode update:
1. `-x86-align-branch-boundary=NUM` aligns branches within a NUM-byte boundary.
2. `-x86-align-branch=TYPE[+TYPE...]` specifies the types of branches to align.
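
As an illustration of the `TYPE[+TYPE...]` syntax, here is a minimal sketch of how such a value could be split into a bitmask. It is not the code from this patch: the function name, the mask constants, and the token spellings (`fused`, `jcc`, `jmp`, `indirect`) are assumptions chosen to match the four branch kinds listed above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical bit flags, one per branch kind named in the description.
constexpr unsigned AlignBranchFused = 1u << 0;    // fused conditional jump
constexpr unsigned AlignBranchJcc = 1u << 1;      // conditional jump
constexpr unsigned AlignBranchJmp = 1u << 2;      // unconditional jump
constexpr unsigned AlignBranchIndirect = 1u << 3; // indirect jump

// Parse "TYPE[+TYPE...]" into a bitmask. The token spellings are assumed.
// Returns false (leaving Mask untouched) on empty input or an unknown token.
bool parseAlignBranchTypes(const std::string &Value, unsigned &Mask) {
  unsigned Result = 0;
  bool SawToken = false;
  std::istringstream Stream(Value);
  std::string Token;
  while (std::getline(Stream, Token, '+')) {
    SawToken = true;
    if (Token == "fused")
      Result |= AlignBranchFused;
    else if (Token == "jcc")
      Result |= AlignBranchJcc;
    else if (Token == "jmp")
      Result |= AlignBranchJmp;
    else if (Token == "indirect")
      Result |= AlignBranchIndirect;
    else
      return false; // unknown or empty token
  }
  if (!SawToken)
    return false;
  Mask = Result;
  return true;
}
```

Under these assumptions, `fused+jcc+jmp` would select the first three kinds and leave indirect jumps unaligned.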
A new MCFragment type, `MCBoundaryAlignFragment`, is added. It comes in four
kinds:
1. `BranchPadding`: A variable-size fragment that inserts NOPs before a branch.
2. `FusedJccPadding`: A variable-size fragment that inserts NOPs before a fused
conditional jump.
3. `FusedJccSplit`: A zero-size fragment that separates the first instruction of
a macro-fused pair from the conditional jump it is fused with.
4. `FusiblePlaceHolder`: A fragment inserted before an instruction that can be
the first instruction of a macro-fused pair. It turns into `FusedJccPadding` if
macro fusion actually happens.
`alignBranchesBegin` inserts an `MCBoundaryAlignFragment` before instructions.
`alignBranchesEnd` sets the target branch for the `MCBoundaryAlignFragment`.
`relaxBoundaryAlign` grows or shrinks the NOP padding to align the target branch.
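
The padding decision made during relaxation can be sketched as follows. This is a simplified model, not the patch's implementation: the helper names are hypothetical, and the "ends at a boundary" condition is an assumption taken from the public description of the erratum rather than from this text.

```cpp
#include <cassert>
#include <cstdint>

// True if the bytes [Start, Start + Size) span more than one
// Boundary-sized window. Boundary is assumed to be a power of two.
bool crossesBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  assert(Size > 0 && Boundary != 0 && (Boundary & (Boundary - 1)) == 0);
  return Start / Boundary != (Start + Size - 1) / Boundary;
}

// True if the last byte of the instruction is the last byte of a window.
// Treating this case as needing padding is an assumption of this sketch.
bool endsAtBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  return (Start + Size) % Boundary == 0;
}

// Number of NOP bytes to place before the branch so that it starts at the
// next boundary, or 0 if the branch already sits inside one window.
uint64_t nopPaddingFor(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  if (!crossesBoundary(Start, Size, Boundary) &&
      !endsAtBoundary(Start, Size, Boundary))
    return 0;
  return (Boundary - Start % Boundary) % Boundary;
}
```

For example, a 6-byte branch at offset 30 with a 32-byte boundary would get 2 bytes of padding in this model. Because padding one branch shifts every later offset, a relaxation step like this has to be re-run until the layout stops changing, which is why the padding can both grow and shrink.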
NOP padding is disabled when the instruction may be rewritten by the linker,
such as a TLS call.