Page MenuHomePhabricator

Add Windows Control Flow Guard checks (/guard:cf).
ClosedPublic

Authored by ajpaverd on Aug 5 2019, 11:21 AM.

Details

Summary

A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the cfguard CC1 option.

Diff Detail

Event Timeline

ajpaverd created this revision.Aug 5 2019, 11:21 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 5 2019, 11:21 AM
alex added a subscriber: alex.Aug 17 2019, 7:49 AM
dmajor added a subscriber: dmajor.Aug 19 2019, 10:02 AM
rnk added a comment.Aug 26 2019, 4:16 PM

I plan to take a look at this tomorrow, sorry for putting this off for a while. :(

In D65761#1646059, @rnk wrote:

I plan to take a look at this tomorrow, sorry for putting this off for a while. :(

Thanks @rnk! If there's anything I can clarify/fix, please let me know.

rnk added a reviewer: pcc.Sep 4 2019, 10:55 AM
rnk added a subscriber: hans.Sep 5 2019, 5:50 PM

Here is some feedback, I apologize for dragging my feet.

Also, I would really like to get feedback from @pcc. He's OOO currently, though.

clang/include/clang/Driver/Options.td
506

@hans, WDYT about the -cc1 interface? I think this is fine.

clang/lib/Driver/ToolChains/Clang.cpp
5980

Comment formatting needs to be fixed, you can use clang-format for this.

llvm/include/llvm/CodeGen/MachineBasicBlock.h
179 ↗(On Diff #213413)

I hesitate to make MBB labels public, I think it might be better to build your own MCSymbol label with a public name.

llvm/lib/CodeGen/CFGuardLongjmp.cpp
102

Rather than changing the machine CFG, I would suggest using MachineInstr::setPostInstrSymbol, which I've been planning to use for exactly this purpose. :) You can make labels with MCContext::createSymbol, and you'll want to come up with symbols that won't clash with C or C++ symbols. I see MSVC makes labels like $LN4, so maybe $cfgsjN or something like that. I think MCContext will add numbers for you.

llvm/lib/Target/X86/X86FixupCFGuard.cpp
11 ↗(On Diff #213413)

I guess it's not X86CFGuard.cpp now that you've generalized it as a cross-platform IR pass.

57–58 ↗(On Diff #213413)

Can you elaborate on what the correctness requirement is? The correctness requirement is just that the loaded value is not saved in a CSR which is then spilled and reloaded from memory across a call site. If possible, I think it would be worth teaching both FastISel and SDISel to build the correct MachineInstrs during instruction selection, and turn this into a validation pass that fails if it sees indirect calls through registers in functions where CFG is enabled.

114 ↗(On Diff #213413)

I think you probably also want to look for tail calls. I think the best way to do this would be to use MI.isCall() in the loop, and then switch on the opcode to handle all call variants.

134 ↗(On Diff #213413)

Is it possible to use MachineRegisterInfo to iterate over all the uses of the load, instead of scanning the BB? It seems like this could be O(n^2) if a single BB contains a long sequence of indirect calls. I think it would look like this:

MachineRegisterInfo *MRI = MF.getRegInfo();
for (MachineInstr &MI : MRI->use_instructions(Reg))
  // do something with MI

There's also use_operands.

168–170 ↗(On Diff #213413)

Erasing at the end is always the safest way to do it. :)

209 ↗(On Diff #213413)

This doesn't do anything with the return value, should it?

llvm/lib/Transforms/CFGuard/CFGuard.cpp
39

Does this need to be a ModulePass? It's preferable to avoid module passes if possible. When the pass manager has a sequence of function passes to run, it runs all passes on function f, then g, then h, and so on, rather than running the function pass over the entire module and then the next pass over the entire module, and so on. This is done to get better cache locality: each function pass touches f's data structures repeatedly, and then moves on to g without ever going back to f.

You should be able to do the global checks in an override of doInitialization, which receives a Module.

43

I guess more LLVM-style names for these would be CF_Check, CF_Dispatch. Of course, Rui is changing the style soon anyway.

221

You'd want to set the tail call kind here to forward tail or musttail. This matters for things like member pointer function thunks. This C++ makes a musttail thunk, for example:

struct A { virtual void f(); };
auto mp = &A::f;

Which gives this IR:

define linkonce_odr void @"??_9A@@$BA@AA"(%struct.A* %this, ...) #0 comdat align 2 {
entry:
  %0 = bitcast %struct.A* %this to void (%struct.A*, ...)***
  %vtable = load void (%struct.A*, ...)**, void (%struct.A*, ...)*** %0, align 8, !tbaa !3
  %1 = load void (%struct.A*, ...)*, void (%struct.A*, ...)** %vtable, align 8
  musttail call void (%struct.A*, ...) %1(%struct.A* %this, ...)
  ret void
}

This does an indirect virtual call through the 0th slot, and if we lose musttail, it won't perfectly forward arguments.

255

So, this isn't going to work if the original calling convention was something other than the default. For x64, the only commonly used non-standard convention would be vectorcall. Changing the convention here in the IR is going to make it hard to pass that along.

I think as you can see from how much code you have here already, replacing call instructions in general is really hard. These days there are also bundles, which I guess is something else missing from the above.

Here's a sketch of an alternative approach:

  • load __guard_dispatch_icall_fptr as normal
  • get the pre-existing function type of the callee
  • cast the loaded pointer to the original function pointer type
  • use CB->setCalledOperand to replace the call target
  • add the original call target as an "argument bundle" to the existing instruction
  • change SelectionDAGBuilder.cpp to recognize the new bundle kind and create a new ArgListEntry in the argument list
  • make and set a new bit in ArgListEntry, something like isCFGTarget
  • change CC_X86_Win64_C to put CFGTarget arguments in RAX, similar to how CCIfNest puts things in R10

This way, the original call instruction remains with exactly the same arrangement of args, attributes, callbr successors, tail call kind, etc. All you're doing is changing the callee, and passing the original callee as an extra argument on the side. Basically, this approach leverages bundles to avoid messing with the function type, which more or less requires replacing the call. :)

There are probably issues with this design that I haven't foreseen, but I think it's worth a try.

llvm/test/CodeGen/AArch64/cfguard-checks.ll
4–5

These annotations shouldn't be necessary, the tests should pass as written on Linux. The mtriple flag above tells llc to generate code for Windows.

hans added a comment.Sep 6 2019, 5:07 AM

This is very exciting! I didn't look closely at the actual instrumentation code, as rnk knows that better and had some good comments.

clang/include/clang/Driver/Options.td
506

The -cc1 interface looks good to me, but since these are cc1 options only, I think they should be in CC1Options.td, not Options.td (I realize this is a pre-existing issue with -cfguard).

clang/lib/CodeGen/CodeGenModule.cpp
484–485

Maybe check for this first, and then the ControlFlowGuardNoChecks one in an else-if, to make this one take precedence?

clang/lib/Driver/ToolChains/Clang.cpp
5979

It seems unfortunate that the Instrument and NoChecks variables allow four different states. whereas I think the logic should really only allow for three: 1) do nothing, 2) tables only, 3) tables and checks. Maybe we could have an enum instead?

5980

Also I'd suggest braces since even though there's just one statement, there are multiple lines here and in the else block.

clang/lib/Driver/ToolChains/MSVC.cpp
425

Thanks for remembering /fallback. Please add a /guard:cf option to the first test in clang/test/Driver/cl-fallback.c

llvm/docs/LangRef.rst
441

Indentation looks a little off here.

"function, which can be inserted before indirect calls" -- maybe instead "function, calls to which can be inserted ..."

447

I don't think X86_32 is common nomenclature.. maybe just X86 is enough.

456

Indentation off here too.

458

Maybe explicitly call out in the text that this calling convention is only used on x86_64, and is used instead of cfguard_checkcc (iiuc)?

llvm/include/llvm/IR/CallingConv.h
79

nit: s/take/takes/

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
384

ultra nit: period at the end, here and for other comments

llvm/lib/Target/X86/X86FixupCFGuard.cpp
13 ↗(On Diff #213413)

Naive question: Why do we generate code as in the examples in the first place, and can't some general optimization pass do this folding? From the examples it looks like straight-forward constant propagation.

rnk added inline comments.Sep 6 2019, 11:22 AM
llvm/lib/Target/X86/X86FixupCFGuard.cpp
13 ↗(On Diff #213413)

Actually, I used this test IR, LLVM seems to always fold the memory operand into the call:

@fptr = external dso_local global void()*
define i32 @foo() {
	%fp1 = load void()*, void()** @fptr
	call void %fp1()
	%fp2 = load void()*, void()** @fptr
	call void %fp2()
	ret i32 0
}

Maybe it won't do it if there are more parameters, I'm not sure.

I ran llc with both isels for x64 and ia32, and it always folded the load into the call. Maybe it's best to make this a verification pass that emits an error via MCContext if there is an unfolded load of the CFG check function pointer?

ajpaverd updated this revision to Diff 223167.Oct 4 2019, 1:25 AM
ajpaverd marked 22 inline comments as done.

Improved Control Flow Guard implementation.

As suggested by @rnk, the CFGuard dispatch mechanism now uses a new
cfguardtarget operand bundle to pass the indirect call target. Instruction
selection adds this target as a new ArgListEntry and sets a new
isCFGuardTarget bit. CC_X86_Win64_C puts this argument in RAX.

In D65761#1660121, @rnk wrote:

Here is some feedback, I apologize for dragging my feet.

Also, I would really like to get feedback from @pcc. He's OOO currently, though.

Thanks for this really helpful feedback @rnk! I think I've been able to address most of your comments in the latest update. Your suggested approach of using operand bundles for the dispatch mechanism instead of changing calling conventions works well and makes the design a lot cleaner.

ajpaverd marked 9 inline comments as done.Oct 4 2019, 2:36 AM
ajpaverd added inline comments.
clang/include/clang/Driver/Options.td
506

Thanks for all the helpful feedback @hans. I think I've addressed all your comments in the latest revision.

llvm/lib/CodeGen/CFGuardLongjmp.cpp
102

Thanks for the suggestion! It looks like MCContext::createSymbol is private, so I'm generating a new symbol name and number for MCContext::getOrCreateSymbol. Does this look OK?

llvm/lib/Target/X86/X86FixupCFGuard.cpp
13 ↗(On Diff #213413)

I'm seeing this when compiling with optimizations disabled. When I run llc with -fast-isel=false, the following slightly modified IR does not fold the memory operand into the call:

@fptr = external dso_local global void()*
define i32 @foo() #0 {
  %fp1 = load void()*, void()** @fptr
  call void %fp1()
  %fp2 = load void()*, void()** @fptr
  call void %fp2()
  ret i32 0
}
attributes #0 = { noinline optnone }

It looks like this is caused by checks for OptLevel == CodeGenOpt::None in:

  • SelectionDAGISel::IsLegalToFold
  • X86DAGToDAGISel::IsProfitableToFold
  • X86DAGToDAGISel::PreprocessISelDAG (in this case OptLevel != CodeGenOpt::None)

I guess this is not high priority as it only happens at -O0. Should I look into enabling these specific optimizations when CFG is enabled, or just emit a warning when this happens?

llvm/lib/Transforms/CFGuard/CFGuard.cpp
207

Is this the correct way to add operand bundles to what is essentially an existing call/invoke instruction?

255

This design seems to work well. I guess at some point we would also have to teach GlobalISel to recognize these operand bundles, right?

rnk added a comment.Oct 17 2019, 4:32 PM

I think this looks pretty good, thanks! I really only noticed style nits. I think with some small fixes, we should go ahead and merge it. If you want, I can commit it on your behalf, but I know there are other folks at Microsoft with commit access to LLVM, if you'd prefer. When you upload the next diff, make sure to use the merge point with origin/master as the base, so that the patch will apply cleanly.

clang/lib/Driver/ToolChains/Clang.cpp
5982–5991

Style nit: LLVM puts the close bracket on the same line as else pretty consistently. clang-format will make it so.

llvm/lib/CodeGen/CFGuardLongjmp.cpp
102

I guess I was thinking that createTempSymbol would work, but I see you are taking steps to make these labels public, so that one is not appropriate. If the name matters, then yes, I think getOrCreateSymbol is the right API.

llvm/lib/Transforms/CFGuard/CFGuard.cpp
207

To the best of my knowledge, yes.

221

Thanks, I see the test case for this.

255

Yep. I'm not sure what the status of x86 global isel is, honestly. I guess I'll find out more at the dev meeting.

llvm/test/CodeGen/X86/cfguard-checks.ll
127

I think it would be interesting to have another test that exercises x86_vectorcallcc, since that's something the bundle approach enables. I guess this is good C++ to start from:

struct HVA {
  double x, y, z, w;
};
void foo(void (__vectorcall *fp)(HVA), HVA *o) {
  fp(*o);
}
ajpaverd updated this revision to Diff 225943.Oct 21 2019, 12:56 PM
ajpaverd marked 6 inline comments as done.

Final changes and tests suggested by @rnk.

ajpaverd updated this revision to Diff 225955.Oct 21 2019, 2:18 PM
ajpaverd marked 3 inline comments as done.

Formatting fixes for CFGuard patch.

ajpaverd edited the summary of this revision. (Show Details)Oct 21 2019, 2:23 PM
ajpaverd updated this revision to Diff 226164.Oct 23 2019, 10:26 AM

Split cfguard_setjmp test into target-specific tests and add missing cfguard_longjmp pass for ARM and AArch64.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 28 2019, 8:20 AM
This revision was automatically updated to reflect the committed changes.

All new files added in this patch had dos line endings. I fixed that in e59f7488 but going forward please configure your environment to create new files for LLVM with unix line endings.

Hi, I filed https://bugs.llvm.org/show_bug.cgi?id=44049 for some strange crashes that we're seeing because the CFG code overwrites the lower byte of function pointers before jumping to them. (Commenting separately here because I was unable to CC @ajpaverd in Bugzilla)

dmajor added a comment.Dec 3 2019, 6:05 AM

Are there any plans to implement __declspec(guard(nocf)) or an equivalent mechanism? __attribute__((nocf_check)) doesn't do anything without the -fcf-protection flag. (https://bugs.llvm.org/show_bug.cgi?id=44096)

rnk added a comment.Dec 5 2019, 6:00 PM

Are there any plans to implement __declspec(guard(nocf)) or an equivalent mechanism? __attribute__((nocf_check)) doesn't do anything without the -fcf-protection flag. (https://bugs.llvm.org/show_bug.cgi?id=44096)

I think -fcf-protection and __attribute__((nocf_check)) have to do with CET and Intel's endbranch instruction or what have you. Similar goals, different implementation. I think at this point it's "patches welcome". =S

dmajor added a comment.Dec 5 2019, 6:23 PM
In D65761#1772031, @rnk wrote:

I think -fcf-protection and __attribute__((nocf_check)) have to do with CET and Intel's endbranch instruction or what have you. Similar goals, different implementation. I think at this point it's "patches welcome". =S

Well, this patch specifically has code (even a test) that nocf_check prevents CFG, so it looks like the intent was there, it's just that there isn't a way to get that attribute onto functions from cpp in a CFG/non-CET world.

Well, this patch specifically has code (even a test) that nocf_check prevents CFG, so it looks like the intent was there, it's just that there isn't a way to get that attribute onto functions from cpp in a CFG/non-CET world.

Apologies for the delayed reply! This code was indeed reusing nocf_check, but I now think it would be better to use a separate attribute for Control Flow Guard. Please see the proposed improvement at D72167: Add support for __declspec(guard(nocf))

Is building check-all immediately after configuration expected to be clean? I have a build configured to build LLVM with just clang and just with the PowerPC target, and it seems check-all (under such a configuration) does not trigger the building of the library added in this patch; which, in turn, leads to test failures.

/src/llvm/test/tools/llvm-config/booleans.test:27:24: error: CHECK-SHARED-MODE-NOT: excluded string found in input
CHECK-SHARED-MODE-NOT: error:
                       ^
<stdin>:3:14: note: found here
llvm-config: error: missing: /build/lib/libLLVMCFGuard.a
             ^~~~~~

I have confirmed that the case I mentioned fails with rGd157a9b.

I have confirmed that the case I mentioned fails with rGd157a9b.

@ajpaverd, is a fix forthcoming for the issue I mentioned with this patch?

I have confirmed that the case I mentioned fails with rGd157a9b.

@ajpaverd, is a fix forthcoming for the issue I mentioned with this patch?

The CFGuard library shouldn't be needed for targets other than ARM, AArch64, and X86, and it's only being built for these targets. However, it looks like llvm-build is also including it in the LibraryDependencies.inc file for other targets, such as PowerPC, which I think is causing this error. I'm not sure how to indicate that the library is only needed for a specified subset of targets. @rnk or @hans any suggestions?

The CFGuard library shouldn't be needed for targets other than ARM, AArch64, and X86, and it's only being built for these targets. However, it looks like llvm-build is also including it in the LibraryDependencies.inc file for other targets, such as PowerPC, which I think is causing this error. I'm not sure how to indicate that the library is only needed for a specified subset of targets. @rnk or @hans any suggestions?

Is the CFGuard library meaningful to provide for the other targets? Is consistency in the availability of the library (e.g., in case of hardcoded -l in the Makefile of some LLVM-dependent project) desired? It may make sense to treat the library as not target-specific.

It seems like this pass was added in lib/Transforms but does not have any unit-tests (lit tests) provided. It isn't even linked into opt. As it is an LLVM IR pass it should be tested as such I believe. Can you provide tests for this?

It seems like this pass was added in lib/Transforms but does not have any unit-tests (lit tests) provided. It isn't even linked into opt. As it is an LLVM IR pass it should be tested as such I believe. Can you provide tests for this?

Thanks for the feedback. D82447 now links this into opt and adds lit tests in test/Transforms/CFGuard/.