
[X86] Implement -mibt-fix-direct=<aggressive,conservative,none>
Needs Review, Public

Authored by joaomoreira on Jan 27 2022, 11:56 PM.



When the compiler emits a direct call or jump to a function that starts with an ENDBR instruction, it may instead target the function entry point plus an offset of 4 bytes (the length of the ENDBR instruction), so that direct control flow bypasses the ENDBR and saves decode bandwidth in the CPU pipeline.
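To make the +4 offset concrete, here is a minimal C sketch that checks whether a code buffer begins with ENDBR64 (F3 0F 1E FA) or ENDBR32 (F3 0F 1E FB) and returns the offset a direct call could use. The helper names `starts_with_endbr` and `direct_call_offset` are hypothetical and not part of this diff. Inspecting bytes is closer to what a binary-level tool would do; the compiler change proposed here decides from what it knows about the target function instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ENDBR64 is F3 0F 1E FA and ENDBR32 is F3 0F 1E FB; both are 4 bytes long. */
enum { ENDBR_SIZE = 4 };

/* Returns 1 if the first bytes of `code` encode ENDBR64 or ENDBR32. */
static int starts_with_endbr(const uint8_t *code, size_t len) {
    static const uint8_t prefix[3] = {0xF3, 0x0F, 0x1E};
    assert(code != NULL);
    if (len < ENDBR_SIZE)
        return 0;
    if (memcmp(code, prefix, sizeof prefix) != 0)
        return 0;
    return code[3] == 0xFA || code[3] == 0xFB;
}

/* Hypothetical helper: byte offset from the function entry that a direct
 * call or jump may target. It is 4 when the entry is an ENDBR, else 0. */
static size_t direct_call_offset(const uint8_t *entry, size_t len) {
    return starts_with_endbr(entry, len) ? ENDBR_SIZE : 0;
}
```

A direct call would then target `entry + direct_call_offset(entry, len)`, while indirect calls must still land on the ENDBR at the entry itself.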

This feature was pointed out as desirable in [1]. In [2], a binary-level optimization tool with this goal was proposed for the kernel. In the latter, not having this feature enabled was described as a "compiler bug", since going through ENDBR instructions in direct control flow is a waste of processing power.

The proposed changes add a new flag, -mibt-fix-direct, which accepts the options aggressive or conservative. With -mibt-fix-direct=aggressive, the compiler tries to optimize every direct call with the offset, as long as the target is expected to have an ENDBR instruction in its prologue (static or nocf_check targets don't get their direct calls optimized). With -mibt-fix-direct=conservative, calls to non-materializable functions (those that are only declared, but not defined, within the compiler context) are also left unoptimized. The conservative mode is useful, for example, to avoid wrongly optimizing calls to assembly-defined functions, at the cost of applying the direct-call fix less broadly throughout the binary.
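The rules above can be summarized as a small decision function. This is only a sketch of the behavior as described in this summary, not the code in the diff: the `call_target` struct, its fields, and `should_skip_endbr` are hypothetical names, and treating `none` as "never apply the offset" is an assumption based on the option name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Modes named after the values accepted by -mibt-fix-direct. */
enum ibt_fix_mode { IBT_FIX_NONE, IBT_FIX_CONSERVATIVE, IBT_FIX_AGGRESSIVE };

/* Hypothetical summary of what the compiler knows about a call target. */
struct call_target {
    bool is_static;   /* static function: not expected to carry an ENDBR */
    bool nocf_check;  /* marked nocf_check: no ENDBR in the prologue */
    bool is_defined;  /* materializable: defined within the compiler context */
};

/* Returns true if a direct call to `t` may target entry + 4. */
static bool should_skip_endbr(enum ibt_fix_mode mode,
                              const struct call_target *t) {
    assert(t != NULL);
    if (mode == IBT_FIX_NONE)
        return false;              /* assumption: "none" disables the fix */
    if (t->is_static || t->nocf_check)
        return false;              /* target has no ENDBR to skip */
    if (mode == IBT_FIX_CONSERVATIVE && !t->is_defined)
        return false;              /* declared only, e.g. defined in assembly */
    return true;
}
```

Under this model, a function that is only declared is optimized in aggressive mode but left alone in conservative mode, which is why conservative mode covers more calls once LTO makes more definitions visible.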

Both optimization modes work regardless of LTO, but the conservative mode achieves more complete coverage when LTO is used. The feature was tested by compiling the Linux kernel in conservative mode with a config file based on defconfig + LTO + -mibt-seal. When enabled, the feature reduced the number of direct calls that needed to be fixed by objtool [2] from 198063 to 1828 (the remainder were likely missed because of assembly or otherwise unhandled sources being linked together).

Before this is merged, it would be good to have and already in. Either way, this diff is being posted now as it is practically ready to land (probably requiring only minor fixes afterwards to comply with these two other changes).

Please add any important reviewers I may have missed.

[1] -
[2] -

Diff Detail