A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the cfguard CC1 option.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Here is some feedback, I apologize for dragging my feet.
Also, I would really like to get feedback from @pcc. He's OOO currently, though.
clang/include/clang/Driver/Options.td | ||
---|---|---|
506 | @hans, WDYT about the -cc1 interface? I think this is fine. | |
clang/lib/Driver/ToolChains/Clang.cpp | ||
5980 | Comment formatting needs to be fixed, you can use clang-format for this. | |
llvm/include/llvm/CodeGen/MachineBasicBlock.h | ||
179 ↗ | (On Diff #213413) | I hesitate to make MBB labels public, I think it might be better to build your own MCSymbol label with a public name. |
llvm/lib/CodeGen/CFGuardLongjmp.cpp | ||
102 | Rather than changing the machine CFG, I would suggest using MachineInstr::setPostInstrSymbol, which I've been planning to use for exactly this purpose. :) You can make labels with MCContext::createSymbol, and you'll want to come up with symbols that won't clash with C or C++ symbols. I see MSVC makes labels like $LN4, so maybe $cfgsjN or something like that. I think MCContext will add numbers for you. | |
llvm/lib/Target/X86/X86FixupCFGuard.cpp | ||
11 ↗ | (On Diff #213413) | I guess it's not X86CFGuard.cpp now that you've generalized it as a cross-platform IR pass. |
57–58 ↗ | (On Diff #213413) | Can you elaborate on what the correctness requirement is? The correctness requirement is just that the loaded value is not saved in a CSR which is then spilled and reloaded from memory across a call site. If possible, I think it would be worth teaching both FastISel and SDISel to build the correct MachineInstrs during instruction selection, and turn this into a validation pass that fails if it sees indirect calls through registers in functions where CFG is enabled. |
114 ↗ | (On Diff #213413) | I think you probably also want to look for tail calls. I think the best way to do this would be to use MI.isCall() in the loop, and then switch on the opcode to handle all call variants. |
134 ↗ | (On Diff #213413) | Is it possible to use MachineRegisterInfo to iterate over all the uses of the load, instead of scanning the BB? It seems like this could be O(n^2) if a single BB contains a long sequence of indirect calls. I think it would look like this: MachineRegisterInfo *MRI = MF.getRegInfo(); for (MachineInstr &MI : MRI->use_instructions(Reg)) // do something with MI There's also use_operands. |
168–170 ↗ | (On Diff #213413) | Erasing at the end is always the safest way to do it. :) |
209 ↗ | (On Diff #213413) | This doesn't do anything with the return value, should it? |
llvm/lib/Transforms/CFGuard/CFGuard.cpp | ||
39 | Does this need to be a ModulePass? It's preferable to avoid module passes if possible. When the pass manager has a sequence of function passes to run, it runs all passes on function f, then g, then h, and so on, rather than running the function pass over the entire module and then the next pass over the entire module, and so on. This is done to get better cache locality: each function pass touches f's data structures repeatedly, and then moves on to g without ever going back to f. You should be able to do the global checks in an override of doInitialization, which receives a Module. | |
43 | I guess more LLVM-style names for these would be CF_Check, CF_Dispatch. Of course, Rui is changing the style soon anyway. | |
221 | You'd want to set the tail call kind here to forward tail or musttail. This matters for things like member pointer function thunks. This C++ makes a musttail thunk, for example: struct A { virtual void f(); }; auto mp = &A::f; Which gives this IR: define linkonce_odr void @"??_9A@@$BA@AA"(%struct.A* %this, ...) #0 comdat align 2 { entry: %0 = bitcast %struct.A* %this to void (%struct.A*, ...)*** %vtable = load void (%struct.A*, ...)**, void (%struct.A*, ...)*** %0, align 8, !tbaa !3 %1 = load void (%struct.A*, ...)*, void (%struct.A*, ...)** %vtable, align 8 musttail call void (%struct.A*, ...) %1(%struct.A* %this, ...) ret void } This does an indirect virtual call through the 0th slot, and if we lose musttail, it won't perfectly forward arguments. | |
255 | So, this isn't going to work if the original calling convention was something other than the default. For x64, the only commonly used non-standard convention would be vectorcall. Changing the convention here in the IR is going to make it hard to pass that along. I think as you can see from how much code you have here already, replacing call instructions in general is really hard. These days there are also bundles, which I guess is something else missing from the above. Here's a sketch of an alternative approach:
This way, the original call instruction remains with exactly the same arrangement of args, attributes, callbr successors, tail call kind, etc. All you're doing is changing the callee, and passing the original callee as an extra argument on the side. Basically, this approach leverages bundles to avoid messing with the function type, which more or less requires replacing the call. :) There are probably issues with this design that I haven't foreseen, but I think it's worth a try. | |
llvm/test/CodeGen/AArch64/cfguard-checks.ll | ||
4–5 | These annotations shouldn't be necessary, the tests should pass as written on Linux. The mtriple flag above tells llc to generate code for Windows. |
This is very exciting! I didn't look closely at the actual instrumentation code, as rnk knows that better and had some good comments.
clang/include/clang/Driver/Options.td | ||
---|---|---|
506 | The -cc1 interface looks good to me, but since these are cc1 options only, I think they should be in CC1Options.td, not Options.td (I realize this is a pre-existing issue with -cfguard). | |
clang/lib/CodeGen/CodeGenModule.cpp | ||
484–485 | Maybe check for this first, and then the ControlFlowGuardNoChecks one in an else-if, to make this one take precedence? | |
clang/lib/Driver/ToolChains/Clang.cpp | ||
5979 | It seems unfortunate that the Instrument and NoChecks variables allow four different states. whereas I think the logic should really only allow for three: 1) do nothing, 2) tables only, 3) tables and checks. Maybe we could have an enum instead? | |
5980 | Also I'd suggest braces since even though there's just one statement, there are multiple lines here and in the else block. | |
clang/lib/Driver/ToolChains/MSVC.cpp | ||
425 | Thanks for remembering /fallback. Please add a /guard:cf option to the first test in clang/test/Driver/cl-fallback.c | |
llvm/docs/LangRef.rst | ||
441 | Indentation looks a little off here. "function, which can be inserted before indirect calls" -- maybe instead "function, calls to which can be inserted ..." | |
447 | I don't think X86_32 is common nomenclature.. maybe just X86 is enough. | |
456 | Indentation off here too. | |
458 | Maybe explicitly call out in the text that this calling convention is only used on x86_64, and is used instead of cfguard_checkcc (iiuc)? | |
llvm/include/llvm/IR/CallingConv.h | ||
79 | nit: s/take/takes/ | |
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | ||
384 | ultra nit: period at the end, here and for other comments | |
llvm/lib/Target/X86/X86FixupCFGuard.cpp | ||
13 ↗ | (On Diff #213413) | Naive question: Why do we generate code as in the examples in the first place, and can't some general optimization pass do this folding? From the examples it looks like straight-forward constant propagation. |
llvm/lib/Target/X86/X86FixupCFGuard.cpp | ||
---|---|---|
13 ↗ | (On Diff #213413) | Actually, I used this test IR, LLVM seems to always fold the memory operand into the call: @fptr = external dso_local global void()* define i32 @foo() { %fp1 = load void()*, void()** @fptr call void %fp1() %fp2 = load void()*, void()** @fptr call void %fp2() ret i32 0 } Maybe it won't do it if there are more parameters, I'm not sure. I ran llc with both isels for x64 and ia32, and it always folded the load into the call. Maybe it's best to make this a verification pass that emits an error via MCContext if there is an unfolded load of the CFG check function pointer? |
Improved Control Flow Guard implementation.
As suggested by @rnk, the CFGuard dispatch mechanism now uses a new
cfguardtarget operand bundle to pass the indirect call target. Instruction
selection adds this target as a new ArgListEntry and sets a new
isCFGuardTarget bit. CC_X86_Win64_C puts this argument in RAX.
Thanks for this really helpful feedback @rnk! I think I've been able to address most of your comments in the latest update. Your suggested approach of using operand bundles for the dispatch mechanism instead of changing calling conventions works well and makes the design a lot cleaner.
clang/include/clang/Driver/Options.td | ||
---|---|---|
506 | Thanks for all the helpful feedback @hans. I think I've addressed all your comments in the latest revision. | |
llvm/lib/CodeGen/CFGuardLongjmp.cpp | ||
102 | Thanks for the suggestion! It looks like MCContext::createSymbol is private, so I'm generating a new symbol name and number for MCContext::getOrCreateSymbol. Does this look OK? | |
llvm/lib/Target/X86/X86FixupCFGuard.cpp | ||
13 ↗ | (On Diff #213413) | I'm seeing this when compiling with optimizations disabled. When I run llc with -fast-isel=false, the following slightly modified IR does not fold the memory operand into the call: @fptr = external dso_local global void()* define i32 @foo() #0 { %fp1 = load void()*, void()** @fptr call void %fp1() %fp2 = load void()*, void()** @fptr call void %fp2() ret i32 0 } attributes #0 = { noinline optnone } It looks like this is caused by checks for OptLevel == CodeGenOpt::None in:
I guess this is not high priority as it only happens at -O0. Should I look into enabling these specific optimizations when CFG is enabled, or just emit a warning when this happens? |
llvm/lib/Transforms/CFGuard/CFGuard.cpp | ||
207 | Is this the correct way to add operand bundles to what is essentially an existing call/invoke instruction? | |
255 | This design seems to work well. I guess at some point we would also have to teach GlobalISel to recognize these operand bundles, right? |
I think this looks pretty good, thanks! I really only noticed style nits. I think with some small fixes, we should go ahead and merge it. If you want, I can commit it on your behalf, but I know there are other folks at Microsoft with commit access to LLVM, if you'd prefer. When you upload the next diff, make sure to use the merge point with origin/master as the base, so that the patch will apply cleanly.
clang/lib/Driver/ToolChains/Clang.cpp | ||
---|---|---|
5982–5991 | Style nit: LLVM puts the close bracket on the same line as else pretty consistently. clang-format will make it so. | |
llvm/lib/CodeGen/CFGuardLongjmp.cpp | ||
102 | I guess I was thinking that createTempSymbol would work, but I see you are taking steps to make these labels public, so that one is not appropriate. If the name matters, then yes, I think getOrCreateSymbol is the right API. | |
llvm/lib/Transforms/CFGuard/CFGuard.cpp | ||
207 | To the best of my knowledge, yes. | |
221 | Thanks, I see the test case for this. | |
255 | Yep. I'm not sure what the status of x86 global isel is, honestly. I guess I'll find out more at the dev meeting. | |
llvm/test/CodeGen/X86/cfguard-checks.ll | ||
127 | I think it would be interesting to have another test that exercises x86_vectorcallcc, since that's something the bundle approach enables. I guess this is good C++ to start from: struct HVA { double x, y, z, w; }; void foo(void (__vectorcall *fp)(HVA), HVA *o) { fp(*o); } |
Split cfguard_setjmp test into target-specific tests and add missing cfguard_longjmp pass for ARM and AArch64.
All new files added in this patch had dos line endings. I fixed that in e59f7488 but going forward please configure your environment to create new files for LLVM with unix line endings.
Hi, I filed https://bugs.llvm.org/show_bug.cgi?id=44049 for some strange crashes that we're seeing because the CFG code overwrites the lower byte of function pointers before jumping to them. (Commenting separately here because I was unable to CC @ajpaverd in Bugzilla)
Are there any plans to implement __declspec(guard(nocf)) or an equivalent mechanism? __attribute__((nocf_check)) doesn't do anything without the -fcf-protection flag. (https://bugs.llvm.org/show_bug.cgi?id=44096)
I think -fcf-protection and __attribute__((nocf_check)) have to do with CET and Intel's endbranch instruction or what have you. Similar goals, different implementation. I think at this point it's "patches welcome". =S
Well, this patch specifically has code (even a test) that nocf_check prevents CFG, so it looks like the intent was there, it's just that there isn't a way to get that attribute onto functions from cpp in a CFG/non-CET world.
Apologies for the delayed reply! This code was indeed reusing nocf_check, but I now think it would be better to use a separate attribute for Control Flow Guard. Please see the proposed improvement at D72167: Add support for __declspec(guard(nocf))
Is building check-all immediately after configuration expected to be clean? I have a build configured to build LLVM with just clang and just with the PowerPC target, and it seems check-all (under such a configuration) does not trigger the building of the library added in this patch; which, in turn, leads to test failures.
/src/llvm/test/tools/llvm-config/booleans.test:27:24: error: CHECK-SHARED-MODE-NOT: excluded string found in input CHECK-SHARED-MODE-NOT: error: ^ <stdin>:3:14: note: found here llvm-config: error: missing: /build/lib/libLLVMCFGuard.a ^~~~~~
I have confirmed that the case I mentioned fails with rGd157a9b.
The CFGuard library shouldn't be needed for targets other than ARM, AArch64, and X86, and it's only being built for these targets. However, it looks like llvm-build is also including it in the LibraryDependencies.inc file for other targets, such as PowerPC, which I think is causing this error. I'm not sure how to indicate that the library is only needed for a specified subset of targets. @rnk or @hans any suggestions?
Is the CFGuard library meaningful to provide for the other targets? Is consistency in the availability of the library (e.g., in case of hardcoded -l in the Makefile of some LLVM-dependent project) desired? It may make sense to treat the library as not target-specific.
It seems like this pass was added in lib/Transforms but does not have any unit-tests (lit tests) provided. It isn't even linked into opt. As it is an LLVM IR pass it should be tested as such I believe. Can you provide tests for this?
Thanks for the feedback. D82447 now links this into opt and adds lit tests in test/Transforms/CFGuard/.
@hans, WDYT about the -cc1 interface? I think this is fine.