This is an archive of the discontinued LLVM Phabricator instance.

KCFI sanitizer
ClosedPublic

Authored by samitolvanen on Feb 8 2022, 3:30 PM.

Details

Summary

The KCFI sanitizer, enabled with -fsanitize=kcfi, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only indirect calls and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With -fsanitize=kcfi, Clang emits a kcfi
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
__kcfi_typeid_<function> symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
X86AsmPrinter::emitKCFITypeId for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a .kcfi_traps section.

Similarly to other sanitizers, KCFI checking can be disabled for
a function with a no_sanitize("kcfi") function attribute.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
samitolvanen added inline comments.Jun 2 2022, 2:17 PM
llvm/lib/Target/X86/X86ISelLowering.h
83 ↗(On Diff #433795)

Is there a reason why you really need specific KCFI_ nodes instead of only embedding the hash information into an attribute at the Machine Instruction?

This implementation is similar to CALL_RVMARKER, CALL_BTI and basically all other pseudo call instructions in LLVM. Is adding an attribute to MachineInstr the preferred approach instead?

I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.

Always happy to hear suggestions for alternative naming. Did you have something in mind?

joaomoreira added inline comments.Jun 2 2022, 2:41 PM
llvm/lib/Target/X86/X86ISelLowering.h
83 ↗(On Diff #433795)

This implementation is similar to CALL_RVMARKER, CALL_BTI and basically all other pseudo call instructions in LLVM. Is adding an attribute to MachineInstr the preferred approach instead?

My understanding is that if, every time a new mitigation or optimization comes in, you create a new opcode for it, it will eventually bloat to non-feasibility.

Imagine you have some mitigation like kguard being implemented. Now you can have calls which are KCFI checked but not KGUARD checked; then KCFI not-checked but KGUARD checked; then KCFI and KGUARD checked.; then none-checked. And then you need all these variations for tail calls (which imho is a first, minor, instance of the problem)...

So, in general, my understanding is that this approach works, yeah, but that in the long term it could become a hassle... so ideally we should use attributes to define these sub-specific instructions instead of opcodes.

I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.

Always happy to hear suggestions for alternative naming. Did you have something in mind?

I think switching from KCFI into CFI would already be good enough, as in the end these are all implementing the control-flow integrity concept.

samitolvanen added inline comments.Jun 2 2022, 3:59 PM
llvm/lib/Target/X86/X86ISelLowering.h
83 ↗(On Diff #433795)

My understanding is that if, every time a new mitigation or optimization comes in, you create a new opcode for it, it will eventually bloat to non-feasibility.

I do agree, if there are enough call variants that can be combined, this will quickly get messy. So far this probably just hasn't been an issue. I'm not particularly happy about the multiple pseudo instructions here either.

@pcc, any thoughts about the best way to make this cleaner? Should we switch to a MachineInstr attribute, or perhaps pass another operand with the specific call type, or something else?

so ideally we should use attributes to define these sub-specific instructions instead of opcodes.

One concern I have with using an attribute is that we have to ensure that no pass accidentally drops it while transforming the call instruction or replacing it with something else. That's not an issue if we have a separate pseudo instruction with the type as an operand. Did you run into any issues with this?

Also, how do you serialize the type attribute in MIR? Something similar to debug-instr-number?

I think switching from KCFI into CFI would already be good enough

Sure, sounds good to me. I'll change the naming once we have a consensus on the correct approach here.

MaskRay requested changes to this revision.Jun 6 2022, 10:33 PM

Please don't land the patch this state. There are several questions which should be sorted out first.

a) Naming

About the 'k' prefix: this is generic and does not need to be coupled with "kernel",
but perhaps an argument can be made that the 'k' does not need to refer to "kernel".

b) -fsanitize=function (D1338)

It is a very similar feature but uses prolog data (after the function label)
instead of prefix data (before the function label), with some limitation that
only available on x86 and restricted to C++.
I am thinking whether we can lift the C++ restriction (we'll need to move away from
CGM.GetAddrOfRTTIDescriptor) and switch to prefix data (easier to add more ports). @pcc

c) inliner

Add this change to enabling inlining

--- i/llvm/lib/Transforms/Utils/InlineFunction.cpp
+++ w/llvm/lib/Transforms/Utils/InlineFunction.cpp
@@ -1776,6 +1776,8 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
         continue;
       if (Tag == LLVMContext::OB_clang_arc_attachedcall)
         continue;
+      if (Tag == LLVMContext::OB_kcfi)
+        continue;
 
       return InlineResult::failure("unsupported operand bundle");
     }

d) x86-64 scheme

Why is the following scheme picked? How are the double-int3 before and after movl useful?

        .type   test,@function
        .globl  __cfi_test                      # @test
        .type   __cfi_test,@function
__cfi_test:
        int3
        int3
        movl    $917620134, %eax                # imm = 0x36B1C5A6
        int3
        int3

e) __kcfi_typeid_*

This deserves a better description in the summary.

Is it useful to suppress the typeid symbol for _Z (if some C++ projects want to use this feature)? A mangled name has encoded the type information.

IIUC the current scheme only works when two TUs have address-taken declarations of the same name but with different signatures,
and done by a linker warning.

ELF linkers don't error for two SHN_ABS STB_GLOBAL symbols of the same st_value.
When the st_value fields differ, there will be a diagnostic. If needed, the specialized diagnostic can be added there.
But I'd prefer the lld patch to be separate and be properly tested (I add myself as a reviewer for lld/ELF changes to catch possibly unintended usage, sorry!)

% cat a.s
.globl foo
.set foo, 1
% cat b.s
.globl foo
.set foo, 2
% ld.lld a.o b.o
ld.lld: error: duplicate symbol: foo
>>> defined in a.o
>>> defined in b.o
clang/docs/ControlFlowIntegrity.rst
314 ↗(On Diff #433795)

Don't repeat the heading.

clang/lib/CodeGen/CodeGenModule.cpp
2201
if (!(llvm::isAlnum(C) || C == '_' || C == '.'))
  return false;
6495

T

clang/lib/CodeGen/CodeGenModule.h
1399

Unlikely Emit*, there are many set* functions which use the correct case, so there is no excuse using the wrong case.

clang/lib/Driver/SanitizerArgs.cpp
699
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1339 ↗(On Diff #433795)

early return

1345 ↗(On Diff #433795)

Use getCodePointerSIze().

A 4-byte PC-relative relocation is insufficient if text and data are more than 31-bit away.

llvm/lib/MC/MCObjectFileInfo.cpp
1130 ↗(On Diff #433795)

Remove SHF_WRITE. The section doesn't need a dynamic relocation.

llvm/lib/Target/X86/X86AsmPrinter.cpp
135 ↗(On Diff #433795)

For a temporary symbol (STB_LOCAL), no need to add __ prefix.

llvm/test/CodeGen/AArch64/kcfi-bti.ll
4 ↗(On Diff #433795)

Add a test with "patchable-function-entry"="2"

This revision now requires changes to proceed.Jun 6 2022, 10:33 PM

a) Naming

About the 'k' prefix: this is generic and does not need to be coupled with "kernel",
but perhaps an argument can be made that the 'k' does not need to refer to "kernel".

This scheme was specifically designed for the kernel, hence the name. Clang has other CFI schemes for user space, and while this might be useful for other low-level software, everyone else is probably better off using -fsanitize=cfi. I'm always happy to hear ideas for improving naming, of course!

c) inliner

Add this change to enabling inlining

Good catch, I'll fix that in the next version.

d) x86-64 scheme

Why is the following scheme picked? How are the double-int3 before and after movl useful?

This preamble format was specifically requested by x86 kernel maintainers to avoid special casing objtool and to accommodate runtime patching. I'll add a comment to explain this.

e) __kcfi_typeid_*

This deserves a better description in the summary.

Sure, I'll add better description.

Is it useful to suppress the typeid symbol for _Z (if some C++ projects want to use this feature)? A mangled name has encoded the type information.

It's not, we still need a symbol that can be referenced directly in assembly code without having to compute the actual hash.

IIUC the current scheme only works when two TUs have address-taken declarations of the same name but with different signatures,
and done by a linker warning.

Yes, this will build, but an indirect call from the TU with an incorrect declaration will result in a runtime error.

ELF linkers don't error for two SHN_ABS STB_GLOBAL symbols of the same st_value.
When the st_value fields differ, there will be a diagnostic. If needed, the specialized diagnostic can be added there.

OK, so we could just not make the symbols weak and end up failing at link time if there's a mismatch. That sounds reasonable to me.

But I'd prefer the lld patch to be separate and be properly tested (I add myself as a reviewer for lld/ELF changes to catch possibly unintended usage, sorry!)

Sure, I'll drop the lld change from this patch and we can improve the error message later. I'll address the remaining inline comments you had in the next version as well. Thanks for taking a look!

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1345 ↗(On Diff #433795)

Use getCodePointerSIze().

A 4-byte PC-relative relocation is insufficient if text and data are more than 31-bit away.

I did use that earlier, but the kernel maintainers specifically requested that I use a 32-bit offset instead to reduce memory overhead. The kernel uses this scheme also for static call addresses, so the 31-bit limit doesn't seem to be a concern in practise.

samitolvanen marked 9 inline comments as done.
  • Addressed Fangrui's feedback.
  • Renamed KCFI_* DAG nodes and pseudo instructions to CFI_* for now based on Joao's feedback. Still looking for more feedback on the best way to implement this part.
lld/ELF/Symbols.cpp
553 ↗(On Diff #433795)

Dropped the lld changes for now, and will send another patch to improve the error message later.

llvm/test/CodeGen/AArch64/kcfi-bti.ll
4 ↗(On Diff #433795)

Added the "patchable-function-entry"="2" test to AArch64/kcfi.ll.

samitolvanen edited the summary of this revision. (Show Details)Jun 7 2022, 3:36 PM

ELF linkers don't error for two SHN_ABS STB_GLOBAL symbols of the same st_value.
When the st_value fields differ, there will be a diagnostic. If needed, the specialized diagnostic can be added there.

After testing this a bit further, it looks like this won't work with LTO where we're not yet linking ELF object files. For example:

$ llvm-modextract -b -n 0 -o - sound/soc/sh/rcar/dma.o | llvm-dis | grep __kcfi_typeid_rsnd_mod_get_status
module asm ".globl __kcfi_typeid_rsnd_mod_get_status"
module asm ".set __kcfi_typeid_rsnd_mod_get_status, 463955820"
$ llvm-modextract -b -n 0 -o - sound/soc/sh/rcar/cmd.o | llvm-dis | grep __kcfi_typeid_rsnd_mod_get_status
module asm ".globl __kcfi_typeid_rsnd_mod_get_status"
module asm ".set __kcfi_typeid_rsnd_mod_get_status, 463955820"
$ ld.lld -r -o test.o sound/soc/sh/rcar/dma.o sound/soc/sh/rcar/cmd.o
ld.lld: error: duplicate symbol: __kcfi_typeid_rsnd_mod_get_status
>>> defined in sound/soc/sh/rcar/dma.o
>>> defined in sound/soc/sh/rcar/cmd.o

Do you have any thoughts on how to solve this without going back to a weak symbol?

  • Per Fangrui's request, added comments explaining the X86 preamble format, and a note about it to the patch summary.
  • Switched back to .weak for the __kcfi_typeid_ symbols to fix compatibility with LTO. Whether we need a warning for symbol value mismatches and how this should be implemented can be addressed later. The warning isn't critical for the functionality and mismatches should be extremely rare.
samitolvanen edited the summary of this revision. (Show Details)Jun 9 2022, 2:22 PM
ychen added inline comments.Jun 9 2022, 6:13 PM
clang/lib/CodeGen/CodeGenModule.cpp
2192

FYI: using prefix data may not work for the C++ coroutine. (https://github.com/llvm/llvm-project/issues/49689) because corosplit pass may clone coro functions and change its function type. But prefix data is opaque, so there is no way to detect this, and then drop the prefix data in the corosplit pass.

If this is a concern for now or for the near future, it might be better to use alternative approaches like D115844.

samitolvanen added inline comments.Jun 10 2022, 9:18 AM
clang/lib/CodeGen/CodeGenModule.cpp
2192

FYI: using prefix data may not work for the C++ coroutine.

Thanks for the link, that's interesting. The Linux kernel doesn't use C++ so this isn't a concern there, but I suppose there could theoretically also be C++ users for this feature. @pcc, any thoughts if we should just switch from prefix data to a metadata node here?

Rebased after ToT member function name changes.

Switched from prefix data + attribute to a metadata node based on previous discussion. This seems to be a cleaner solution overall.

Fixed the debug output in InstCombine to use metadata as well.

pcc added inline comments.Jun 23 2022, 6:59 PM
clang/lib/CodeGen/CodeGenModule.cpp
2192

I think the issue with coroutines was more about the fact that the function was cloned (invalidating the relative reference in the prefix data) than that the function type was changed. I seem to recall from a discussion with the coroutine folks that the functions created by the coroutine pass with different types aren't actually intended to be called directly from C/C++, so the fact that the types are changed doesn't matter.

I think the main advantage of metadata here would be in doing things like the check for type mismatches in direct calls more easily, so it seems like a reasonable approach even though the rationale is different.

6505

It would be better to have a separate function for computing the KCFI type ids than to try and reuse this one, since you don't need the MDStrings (i.e. unnecessary string uniquing) in your new caller. It also isn't appropriate to pass MetadataIdMap in your new caller because you'll end up with inconsistent values added to MetadataIdMap if we end up calling the other caller (e.g. if both CFI and KCFI are enabled), but that's moot if you avoid it.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2453 ↗(On Diff #439137)

For PAuth ABI the motivation for avoiding the gap between check and call was around avoiding spilling the verified pointer, is this not possible with KCFI?

llvm/lib/Target/X86/X86AsmPrinter.cpp
119 ↗(On Diff #439137)

If this is just about satisfying objtool do these symbols need to be exported or can they just be STB_LOCAL?

llvm/lib/Target/X86/X86ISelLowering.h
83 ↗(On Diff #433795)

Would it make sense to add a field to MachineInstr::ExtraInfo to hold the additional information here? That way you can avoid increasing the size of MachineInstr.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3093 ↗(On Diff #439137)

With CFI, if this situation occurs we unconditionally crash, the rationale being that if we end up in this situation it is a situation unexpected by the developer and may even be intended to be unreachable at runtime, so it's best to crash if it does occur. The same approach is used for things like UBSan integer overflow checks if the optimizer figures out that the overflow will always happen. If I'm understanding the code correctly KCFI will just let the call occur in this case and I'm not sure if that's a good idea.

samitolvanen planned changes to this revision.Jun 24 2022, 10:45 AM
samitolvanen added inline comments.
clang/lib/CodeGen/CodeGenModule.cpp
6505

The code in SanitizerArgs.cpp doesn't allow both CFI and KCFI to be enabled at the same time, but you're right, it's probably better to just split this into a separate function for KCFI. I'll do that in the next version.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2453 ↗(On Diff #439137)

I don't believe that's a concern here. We might emit instructions for setting up call arguments between the check and the call, but nothing should spill the pointer or change the target register anymore. The main reason to avoid bundling the check and the call is to avoid further complications with expanding the remaining pseudo instructions.

llvm/lib/Target/X86/X86AsmPrinter.cpp
119 ↗(On Diff #439137)

They don't have to be exported, but we ran into some objtool confusion when the parent function was weak, so it's easier to just use the same linkage for the preamble symbol.

llvm/lib/Target/X86/X86ISelLowering.h
83 ↗(On Diff #433795)

I'll look into it. Per offline discussion, we might need a similar construction for SDNode to avoid increasing memory usage too much should we go with type hash attributes instead.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3093 ↗(On Diff #439137)

The difference here is that unlike with CFI, we never emit KCFI checks for direct calls in the back-end. While this code drops unnecessary kcfi operand bundles from the IR, it doesn't actually change the machine code we generate.

Also, I would argue that KCFI should focus only on runtime protection of actual indirect calls, and otherwise shouldn't trap on potential undefined behavior that already exists. Causing a runtime failure for something we can detect at compile-time and don't aim to protect against doesn't sound ideal, especially in the kernel where we prefer to avoid crashing unnecessarily. I did consider changing this to a warning though, so the developers are notified of potential issues, but opted for debug logging for now. Thoughts?

  • Moved to a standalone function for generating KCFI type hashes in Clang.
  • Switched from pseudo instructions to a MachineInstr::ExtraInfo + SDNode attributes.
  • Added KCFI passes for emitting the indirect call checks.
llvm/lib/Target/X86/X86ISelLowering.h
83 ↗(On Diff #433795)

In my testing, adding a uint32_t attribute to SDNode increased max RSS by ~0.15% during a kernel build. As there's no ExtraInfo type construction in SDNode currently, adding one would increase memory usage more. Perhaps we can look into that if more attributes are needed?

nickdesaulniers accepted this revision.Jun 30 2022, 11:21 AM

I see you modified the mir parser & printer; consider adding a .mir test.

Still LGTM. Might be nice to document the generated asm more for other compiler vendors to better understand the implementation, rather than having to read the tests added here.

clang/lib/CodeGen/CodeGenFunction.h
4595

Should this parameter be an ArrayRef? This might be a more precise type, but seeing an impl type kind of breaks the whole pImpl idiom; I'm pretty sure that's what ArrayRef is for.
https://llvm.org/docs/ProgrammersManual.html#llvm-adt-arrayref-h
That said, this header is pretty inconsistent in the use of SmallVectorImp vs ArrayRef, but I *think* ArrayRef is strongly preferred.

llvm/lib/CodeGen/MachineInstr.cpp
1770–1771 ↗(On Diff #440389)

If FirstOp is false, reassign it false? Is that right? I see that's what's done on L1763, but..."what?"

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7821–7830 ↗(On Diff #440389)

Can we check whether the CallBase is an indirect call first before bothering with getting the operand bundle?

7822–7840 ↗(On Diff #440389)

Are we able to reorder these?

CLI.setDebugLoc ...
  ...
if (Bundle && CB.isIndirectCall) {
  ...
  CLI.setCFIType ...
llvm/lib/Target/AArch64/AArch64KCFI.cpp
106 ↗(On Diff #440389)

const auto &SubTarget = ...

llvm/lib/Target/X86/X86KCFI.cpp
110 ↗(On Diff #440389)

const auto &SubTarget = ...

samitolvanen marked 3 inline comments as done.

Added a MIR parser test for cfi-type, address Nick's other feedback.

I see you modified the mir parser & printer; consider adding a .mir test.

I added a .mir test for parsing the cfi-type. The machine instructions generated for the various call types are covered in the .ll tests.

Might be nice to document the generated asm more for other compiler vendors to better understand the implementation, rather than having to read the tests added here.

The exact instruction sequence is also documented in the kernel patches in case other compiler vendors want to implement this, but sure, that makes sense. I assume comments in the various AsmPrinter functions would be sufficient?

clang/lib/CodeGen/CodeGenFunction.h
4595

I don't believe I can use ArrayRef when I need to append to the list, it appears to be read-only.

llvm/lib/CodeGen/MachineInstr.cpp
1770–1771 ↗(On Diff #440389)

Ah yes, it looks like this has been copied a few times earlier already. I'll stop the cycle.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7822–7840 ↗(On Diff #440389)

We can reorder these, but this does follow the same convention as the rest of the function. I assume CallLoweringInfo was specifically designed so that the arguments can all be initialized in one statement.

It uses LLVM prefix data to store a type identifier for each function and injects verification code before indirect calls.

Is "prefix data" stale now?

MaskRay added a comment.EditedJul 1 2022, 12:45 PM

There are 30+ comments not marked as "done". Having so many is distracting to reviewers as it's unclear whether this is in a ready-for-review state.
I personally definitely don't consider this in a ready-for-land state.

clang/include/clang/Basic/Features.def
226

move before modules for an alphabetical order.

MaskRay requested changes to this revision.Jul 1 2022, 12:59 PM
MaskRay added inline comments.
clang/lib/CodeGen/CodeGenModule.cpp
2187

delete blank line

2193

delete blank line

clang/test/CodeGen/kcfi.c
3

If -O2 has different behavior, having the test in clang/test/CodeGen is likely a layering violation.
Optimization tests should be placed in llvm/test/, in the relevant pass. Folks working on llvm optimizations generally don't want to test every frontend.

clang/test/Driver/fsanitize.c
670

Use modern spelling --target=. -target is deprecated

This revision now requires changes to proceed.Jul 1 2022, 12:59 PM
MaskRay added inline comments.Jul 1 2022, 1:29 PM
clang/lib/CodeGen/CodeGenModule.h
1401

End with a period.

This function can be lower-case as well. There is no strong precedent to keep it upper-case.

clang/lib/Driver/SanitizerArgs.cpp
63

This is incorrect.

If a violation is found, ud2 is executed. ud2 is not followed by normal control flow so I don't think recovery from the error is supported.

This seems like Unrecoverable

clang/test/CodeGen/kcfi.c
3

If may be useful to have a -x c++ test.

Add a not-address-taken external linkage function.

8

Optional: COM: is fine. But to make non-RUN non-CHECK lines stand out (from editor highlighting), a simpler /// suffices as well.

llvm/docs/LangRef.rst
7222 ↗(On Diff #441546)
MaskRay added inline comments.Jul 1 2022, 1:38 PM
llvm/lib/Target/X86/X86AsmPrinter.cpp
125 ↗(On Diff #441546)

A previous comment isn't done. This doesn't explain that the double-int3 before and after the identifier is an objtool requirement (or similar).

samitolvanen edited the summary of this revision. (Show Details)Jul 6 2022, 11:24 AM
samitolvanen marked 31 inline comments as done.

Addressed Fangrui's feedback.

It uses LLVM prefix data to store a type identifier for each function and injects verification code before indirect calls.

Is "prefix data" stale now?

Yes, I forgot to update the commit message here. Fixed.

There are 30+ comments not marked as "done". Having so many is distracting to reviewers as it's unclear whether this is in a ready-for-review state.

My bad, I wasn't familiar with the convention here. I've just been marking my replies as done.

I personally definitely don't consider this in a ready-for-land state.

I left some questions about your comments below. PTAL.

samitolvanen added inline comments.Jul 6 2022, 11:26 AM
clang/lib/Driver/SanitizerArgs.cpp
63

This variable is only used to indicate whether -fno-sanitize-recover command line parameter can be used with the sanitizer. It makes no sense to allow this with KCFI as we always emit a recoverable instruction sequence, hence it's included here.

Also, ud2 absolutely is recoverable in the kernel, and Linux specifically uses ud2 to trigger warnings in assembly code.

clang/test/CodeGen/kcfi.c
3

If -O2 has different behavior, having the test in clang/test/CodeGen is likely a layering violation.

Good point, the -O2 test isn't actually relevant anymore as we no longer rely on it. I'll drop it.

Optimization tests should be placed in llvm/test/, in the relevant pass.

Yes, the InstCombine test below covers this already.

3

If may be useful to have a -x c++ test.

Would you mind elaborating on this? The main difference with -x c++ is the name mangling, I'm not sure if that adds much value in this case.

Add a not-address-taken external linkage function.

Added.

clang/test/Driver/fsanitize.c
670

Use modern spelling --target=. -target is deprecated

It doesn't look like --target works here:

clang-15: error: unsupported option '--target'; did you mean '-target'?

Also the rest of the file uses -target, so perhaps these can all be updated at once when the flag is deprecated?

llvm/lib/Target/X86/X86AsmPrinter.cpp
125 ↗(On Diff #441546)

A previous comment isn't done. This doesn't explain that the double-int3 before and after the identifier is an objtool requirement (or similar).

These are not objtool requirements, and I believe the comments do explain why the int3 padding is added. The first two are to reserve some space for runtime patching this code, and the latter two is to avoid call target gadgets in the checks. I'm obviously happy to improve the comments, is there something specific you'd like me to add?

samitolvanen marked 2 inline comments as done.

Addressed the rest of Fangrui's comments after clarifying the open questions with him offline.

MaskRay added inline comments.Jul 7 2022, 12:18 PM
clang/lib/Driver/SanitizerArgs.cpp
63

ud2 being recoverable in the kernel is insufficient. The IR should consider this recoverable. In the presence of a failure, the control flow should be transferred as if no failure happens. E.g. for an asan out-of-bounds failure, the code should behave as if the failure is ignored.

samitolvanen marked an inline comment as done.Jul 7 2022, 12:46 PM
samitolvanen added inline comments.
clang/lib/Driver/SanitizerArgs.cpp
63

ud2 being recoverable in the kernel is insufficient. The IR should consider this recoverable. In the presence of a failure, the control flow should be transferred as if no failure happens. E.g. for an asan out-of-bounds failure, the code should behave as if the failure is ignored.

Sure, that's exactly now the code we emit here works as well. There are no traps emitted to the IR code, we only add an operand bundle to indirect calls, which doesn't affect the control flow as far as the IR is concerned.

samitolvanen marked 2 inline comments as done.
  • Added a missing equivalency check to MachineInstr.

Mostly looks good.

clang/lib/CodeGen/CodeGenModule.cpp
2200

Use llvm::all_of or llvm::any_of

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
332 ↗(On Diff #444022)

Add const. Delete blank line after the declaration.

350 ↗(On Diff #444022)

Add const

384 ↗(On Diff #444022)

delete blank line

401 ↗(On Diff #444022)

delete blank line

llvm/lib/Target/AArch64/AArch64KCFI.cpp
62 ↗(On Diff #444022)

delete blank line

103 ↗(On Diff #444022)

delete blank line

llvm-project doesn't typically add a blank line after a variable declaration. please fix other places in this patch.

111 ↗(On Diff #444022)

delete blank line

117 ↗(On Diff #444022)

don't use early return if the then part is simple.

llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
1183 ↗(On Diff #444022)
} else if (Info.CFIType) {
  ...
}
llvm/lib/Target/X86/X86KCFI.cpp
101 ↗(On Diff #444022)

delete blank line

107 ↗(On Diff #444022)

MaskRayUnsubmittedDone
delete blank line

115 ↗(On Diff #444022)

delete blank line

121 ↗(On Diff #444022)

Fix this in a way similar to AArch64.

samitolvanen marked 14 inline comments as done.

Addressed Fangrui's feedback.

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
332 ↗(On Diff #444022)

Deleted the blank line, but this can't be const because it's assigned to in the if statement below.

MaskRay accepted this revision.Jul 13 2022, 12:52 PM

Worth waiting a bit and checking whether @pcc has more comments.

This revision is now accepted and ready to land.Jul 13 2022, 12:52 PM
  • Simplified AArch64 by dropping the KCFI_CHECK_BTI pseudo and just using a different caller-saved temporary register if needed.
  • Added -verify-machineinstrs to llc tests.
samitolvanen planned changes to this revision.Jul 22 2022, 4:12 PM

Based on the recent LKML discussions about X86 retbleed mitigations (https://lore.kernel.org/lkml/20220716230344.239749011@linutronix.de/), we're going to need some changes to the code generated for X86.

(Will read new updates.)

Note that release/15.x is branching soon (https://discourse.llvm.org/t/llvm-15-0-0-release-schedule/63495): 2022-07-26.
It seems that we can miss the branch to have more time mature the feature and have it for 16.0.

Addressed conflicts with X86 retbleed mitigations (https://lore.kernel.org/lkml/20220716230344.239749011@linutronix.de/):

  1. Changed the type check instruction sequence emitted by X86AsmPrinter::LowerKCFI_CHECK not to include the full constant, which allows us to freely position the function preamble without worrying about call target gadgets at indirect call sites.
  2. Changed the lowering code to take patchable-function-prefix into account, and allowed -fpatchable-function-entry=N,M where M>0 to be used in Clang with KCFI.
  3. As we must maintain alignment of the function entry on X86 to avoid performance regressions (https://lore.kernel.org/lkml/87ilnuuiw8.ffs@tglx/), changed the preamble padding to ensure the function entry remains aligned with KCFI, also when combined with patchable-function-prefix.
This revision is now accepted and ready to land.Jul 25 2022, 2:36 PM
samitolvanen edited the summary of this revision. (Show Details)Jul 25 2022, 2:37 PM

It seems that we can miss the branch to have more time mature the feature and have it for 16.0.

Sure, that sounds reasonable.

joaomoreira accepted this revision.Jul 27 2022, 8:44 PM

I really like this revision. It removes the redundancy of having KCFI passes both in CodeGen and in the backend; it detangles CALL instructions from KCFI by creating a new MIR instruction; it fixes alignment while still supporting the -fpatchable-function-entry option; it doesn't add hashes/gadgets through the code as it was before needed by the use of cmp instructions. With all this said, the patch LGTM.

I tested the patch compiling toy snippets and also compiling the kernel and found no issues.

Some minor suggestions/not so relevant:

  • Change the name of CFIType to CFIHash, as it is probably more descriptive.
  • Change the KCFI.* into CFI.* where the passes/structures can be re-used by a different CFI approach. For example, X86KCFI::emitCheck can be easily re-used by other CFI schemes. Similarly, the LowerKCFI_CHECK function can also be easily re-used. Because these other approaches are hypothetical by now, this is not a big deal... but may be subject to changes in the future, in case someone comes up with new schemes.
  • Deny the ENDBR64/32 opcodes as valid hashes for the CFIType. There was an effort in the past to prevent it from being emitted as an immediate, thus, perhaps, we should also care. Since FineIBT may inherit hashes from KCFI, it would be great to prevent these from becoming valid indirect targets or needing to be checked.

Some relevant optimizations/improvements for future work:

  • When LTO is used, hashes may be suppressed for non-local + non-address taken functions.
  • As suggested by Andy Cooper, add a salt attribute that allows to differentiate hashes of functions with the same prototype.
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
617 ↗(On Diff #447476)

I wonder if CFIHash would be a more descriptive name.

  • Ensured that we don't emit ENDBR64/32 as the type hash on X86.

Some minor suggestions/not so relevant:

  • Change the name of CFIType to CFIHash, as it is probably more descriptive.

My thinking here was that Type is more generic than a Hash, but I don't really have a strong opinion about this.

  • Change the KCFI.* into CFI.* where the passes/structures can be re-used by a different CFI approach. For example, X86KCFI::emitCheck can be easily re-used by other CFI schemes. Similarly, the LowerKCFI_CHECK function can also be easily re-used. Because these other approaches are hypothetical by now, this is not a big deal... but may be subject to changes in the future, in case someone comes up with new schemes.

Sure, it should be trivial for a future patch that wants to reuse this code to rename the functions and classes as needed.

  • Deny the ENDBR64/32 opcodes as valid hashes for the CFIType. There was an effort in the past to prevent it from being emitted as an immediate, thus, perhaps, we should also care. Since FineIBT may inherit hashes from KCFI, it would be great to prevent these from becoming valid indirect targets or needing to be checked.

Good point. It should be quite unlikely that we'll end up randomly generating ENDBR as a hash, but I added a helper to the X86 code to ensure we don't end up emitting these values.

Some relevant optimizations/improvements for future work:

  • When LTO is used, hashes may be suppressed for non-local + non-address taken functions.
  • As suggested by Andy Cooper, add a salt attribute that allows to differentiate hashes of functions with the same prototype.

Agreed, and another request from Peter Z was to add a flag to omit ENDBR from functions with the !kcfi_type attribute. I think these can all be separate follow-up patches.

joaomoreira added inline comments.Aug 12 2022, 12:00 PM
llvm/lib/Target/X86/X86AsmPrinter.cpp
121 ↗(On Diff #448454)

Can we use another constant blinding scheme, such as a Value++ or anything else? This way, we would prevent endbrs from being emitted in the indirect branch guards too.

Since we are using Value (prologue) and ~Value (caller/guard) for doing the checks, we also need to check if ~ENDBR was picked as a KCFIType, otherwise ENDBR will be emitted in the ibranch guards.

llvm/test/CodeGen/X86/kcfi.ll
91 ↗(On Diff #448454)

We need to also ensure/test that these are not emitted in the caller/indirect branch guards.

I assume that in the current scheme (blinding with ~Value) would be unfeasible to do this, so maybe we need a different approach for masking (as suggested above).

samitolvanen marked 2 inline comments as done.
  • Changed MaskKCFIType to also avoid negative invalid patterns and return Value + 1 for clarity.
llvm/lib/Target/X86/X86AsmPrinter.cpp
121 ↗(On Diff #448454)

Can we use another constant blinding scheme, such as a Value++ or anything else? This way, we would prevent endbrs from being emitted in the indirect branch guards too.

Since we are using Value (prologue) and ~Value (caller/guard) for doing the checks, we also need to check if ~ENDBR was picked as a KCFIType, otherwise ENDBR will be emitted in the ibranch guards.

I don't mind changing this to Value + 1, but that actually doesn't change anything because we emit -Value in indirect call checks, not ~Value. Therefore, using ~Value works equally well here.

Specifically, this code currently emits ~Valuein the preamble and -(~Value) == Value + 1 in the indirect call check. Switching to Value + 1 simply reverses the order; we'll emit Value + 1 in the preamble and -(Value + 1) == ~Value in the indirect call check.

However, you are right that we also need to avoid -ENDBR in this function. I'll fix that and clarify the comment.

llvm/test/CodeGen/X86/kcfi.ll
91 ↗(On Diff #448454)

Added a test for -ENDBR too.

joaomoreira added inline comments.Aug 13 2022, 1:02 AM
llvm/lib/Target/X86/X86AsmPrinter.cpp
121 ↗(On Diff #448454)

Oops, got confused with the operands, tks for clearing it up. Otherwise, changes LGTM.

OK, so we could just not make the symbols weak and end up failing at link time if there's a mismatch. That sounds reasonable to me.

For STB_WEAK SHN_ABS __kcfi_typeid_*, there is no duplicate definition error. Is this behavior intentional?
Note: I don't think we should change lld to recognize some symbol prefix and enforce more rigid diagnostics.
An error must be implemented with existing ELF features.

clang/lib/CodeGen/CodeGenModule.cpp
2220–2222
if (const llvm::MDNode *MD = F.getMetadata(llvm::LLVMContext::MD_kcfi_type))
  Type = ...
else
  continue;
MaskRay added inline comments.Aug 18 2022, 4:47 PM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
330 ↗(On Diff #452339)

const?

331 ↗(On Diff #452339)

delete blank line

samitolvanen marked 3 inline comments as done.

Addressed Fangrui's comments.

For STB_WEAK SHN_ABS __kcfi_typeid_*, there is no duplicate definition error. Is this behavior intentional?

Yes, this is by design. Multiple translation units must be able to take the address of the same function declaration, and it must be possible to link them together.

Note: I don't think we should change lld to recognize some symbol prefix and enforce more rigid diagnostics.
An error must be implemented with existing ELF features.

Sure, that's reasonable. I don't think we need more diagnostics here as __kcfi_typeid_ value mismatches are unlikely, but should the need arise, we can figure out the best way to implement them later. Note that this must also work with LTO when linking bitcode files, so we can't rely on ELF features alone.

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
330 ↗(On Diff #452339)

This can't be const because we assign to this array below in the for (auto &Reg : ScratchRegs) loop.

efriedma added inline comments.Aug 19 2022, 12:41 PM
clang/docs/ControlFlowIntegrity.rst
319 ↗(On Diff #453838)

To clarify, by "indirect call checking", do you mean function pointers specifically? Or does this also work for C++ vtables/member function pointers?

samitolvanen added inline comments.Aug 19 2022, 12:47 PM
clang/docs/ControlFlowIntegrity.rst
319 ↗(On Diff #453838)

Yes, function pointers only. I'll clarify this part.

samitolvanen marked an inline comment as done.

Clarified that KCFI checks only function pointers in the documentation.

kees accepted this revision.Aug 19 2022, 3:12 PM
This revision was landed with ongoing or failed builds.Aug 24 2022, 12:02 PM
Closed by commit rG67504c95494f: KCFI sanitizer (authored by samitolvanen). · Explain Why
This revision was automatically updated to reflect the committed changes.
samitolvanen added inline comments.Aug 24 2022, 12:35 PM
llvm/include/llvm/CodeGen/MachineInstr.h
265 ↗(On Diff #455316)

This fails on 32-bit architectures as PointerEmbeddedInt doesn't allow storing 32 bits in a 32-bit pointer:

// Note: This '<' is correct; using '<=' would result in some shifts
// overflowing their storage types.
static_assert(Bits < sizeof(uintptr_t) * CHAR_BIT,
              "Cannot embed more bits than we have in a pointer!")
samitolvanen added inline comments.Aug 24 2022, 3:46 PM
llvm/include/llvm/CodeGen/MachineInstr.h
265 ↗(On Diff #455316)

PointerSumType also needs space for a tag, which we don't have in a 32-bit pointer. Looks like we just need to always store CFIType in ExtraInfo to avoid this issue, similarly to HeapAllocMarker.