This is an archive of the discontinued LLVM Phabricator instance.

[X86] Enable ibt-seal optimization when LTO is used in Kernel
ClosedPublic

Authored by joaomoreira on Dec 20 2021, 6:01 PM.

Details

Summary

Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit these instruction on function's prologues. Because this is a security feature, it is desirable that only actual indirect-branch-targeted functions are emitted with ENDBRs. While it is possible to identify address-taken functions through LTO, minimizing these ENDBR instructions remains a hard task for user-space binaries because exported functions may end being reachable through PLT entries, that will use an indirect branch for such. Because this cannot be determined during compilation-time, the compiler currently emits ENDBRs to every non-local-linkage function.

Despite the challenge presented for user-space, the kernel landscape is different as no PLTs are used. With the intent of providing the most fit ENDBR emission for the kernel, kernel developers proposed an optimization named "ibt-seal" which replaces the ENDBRs for NOPs directly in the binary. The discussion of this feature can be seen in [1].

This diff brings the enablement of the flag -mibt-seal, which in combination with LTO enforces a different policy for ENDBR placement in when the code-model is set to "kernel". In this scenario, the compiler will only emit ENDBRs to address taken functions, ignoring non-address taken functions that are don't have local linkage.

A comparison between an LTO-compiled kernel binaries without and with the -mibt-seal feature enabled shows that when -mibt-seal was used, the number of ENDBRs in the vmlinux.o binary patched by objtool decreased from 44383 to 33192, and that the number of superfluous ENDBR instructions nopped-out decreased from 11730 to 540.

The 540 missed superfluous ENDBRs need to be investigated further, but hypotheses are: assembly code not being taken care of by the compiler, kernel exported symbols mechanisms creating bogus address taken situations or even these being removed due to other binary optimizations like kernel's static_calls. For now, I assume that the large drop in the number of ENDBR instructions already justifies the feature being merged.

[1] - https://lkml.org/lkml/2021/11/22/591

Diff Detail

Unit TestsFailed

Event Timeline

joaomoreira created this revision.Dec 20 2021, 6:01 PM
joaomoreira requested review of this revision.Dec 20 2021, 6:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2021, 6:01 PM

Modularized needsPrologueENDBR function and removed missed comment.

Thanks for this optimization. The logic is clear on base of "the kernel landscape is different as no PLTs are used"

llvm/lib/Target/X86/X86IndirectBranchTracking.cpp
124

Not sure what the action if a function both has local and external call.
Can this be replace with "hasExternalLinkage" ?

joaomoreira added inline comments.Dec 21 2021, 5:54 PM
llvm/lib/Target/X86/X86IndirectBranchTracking.cpp
124

This is remaining from the same logic which was in place before this patch. Still, in my understanding, it cannot be replaced by "hasExternalLinkage" as static functions may have their address taken and still have local linkage.

LGTM, Let me accept later if no others opps

xiangzhangllvm accepted this revision.Dec 22 2021, 9:21 PM
This revision is now accepted and ready to land.Dec 22 2021, 9:21 PM
joaomoreira added a comment.EditedJan 18 2022, 1:31 PM

Hi, is anything still preventing this from being merged?

@pengfei or @craig.topper perhaps one of you could merge this? please, let me know if I'm missing any obvious requirement or detail.

@pengfei or @craig.topper perhaps one of you could merge this? please, let me know if I'm missing any obvious requirement or detail.

Sure, I can do it.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2022, 6:55 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

The module flags metadata usage "Override" was wrong. I fixed it by using "Min".

Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2022, 3:10 PM
MaskRay added a comment.EditedDec 18 2022, 7:08 PM

I have a question why the option is needed.

  if (IBTSeal) {
    return F.hasAddressTaken();
  }
  // if !IBTSeal, fall into default case.
  LLVM_FALLTHROUGH;
// Address taken or externally linked functions may be reachable.
default:
  return (F.hasAddressTaken() || !F.hasLocalLinkage());
  • If the link mixes bitcode files and ELF relocatable files, for a function in a bitcode file, F.hasAddressTaken() doesn't indicate that its address is not taken by an ELF relocatable file.
  • For ThinLTO, a function may have false F.hasAddressTaken() for the definition in one module and true F.hasAddressTaken() for a reference in another module.

In both cases, return F.hasAddressTaken(); isn't sufficient.

If we don't consider these issue: I think -mibt-seal can be removed if we merge unnamed_addr/local_unnamed_addr for ThinLTO (but we don't). If we do,
F.hasAddressTaken() || (F.hasHiddenVisibility() && !F.hasLocalLinkage()) can be used as a proxy.

(I substantially edited my previous comment.) It now says this:

If the link mixes bitcode files and ELF relocatable files, for a function in a bitcode file, F.hasAddressTaken() doesn't indicate that its address is not taken by an ELF relocatable file.

Sent D140363 to remove this incorrectly implemented option.

llvm/test/CodeGen/X86/ibtseal-small.ll
2

It's not necessary to create 3 identical test files. We can use 3 RUN lines instead of using the DejaGnu style approach:)