Page MenuHomePhabricator

Add inline assembly load hardening mitigation for Load Value Injection (LVI) on X86 [6/6]
AcceptedPublic

Authored by sconstab on Fri, Mar 13, 1:39 PM.

Details

Summary

Added code to X86AsmParser::emitInstruction() to add an LFENCE after each instruction that may load.

Diff Detail

Event Timeline

sconstab created this revision.Fri, Mar 13, 1:39 PM
sconstab updated this revision to Diff 250583.Mon, Mar 16, 9:22 AM

Added warnings for instructions that cannot be automatically mitigating by inserting an LFENCE.

sconstab edited the summary of this revision. (Show Details)Mon, Mar 16, 9:23 AM
sconstab retitled this revision from Add inline assembly load hardening mitigation for Load Value Injection (LVI) on X86 to Add inline assembly load hardening mitigation for Load Value Injection (LVI) on X86 [6/6].Mon, Mar 16, 9:31 AM
craig.topper added inline comments.Mon, Mar 16, 10:10 AM
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3236

Can we move this code out to a separate method called from here? Something like performLVIMitigation?

zbrid accepted this revision.Mon, Mar 16, 11:36 AM

LGTM. This is awesome, thanks! Someone else should give another LGTM before submitting.

This revision is now accepted and ready to land.Mon, Mar 16, 11:36 AM
sconstab updated this revision to Diff 250619.Mon, Mar 16, 12:58 PM

Separated out the load/cfi hardening functionality into two separate member functions.

sconstab marked an inline comment as done.Mon, Mar 16, 12:59 PM
sconstab added inline comments.
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3236

I created two new methods. Let me know what you think.

mattdr added a subscriber: mattdr.Mon, Mar 16, 2:11 PM

This seems like a much more robust approach for mitigating inline asm. Thanks for that!

For now, my biggest question is around layering and separation of concerns. Do we really want to do transformations in the parser? It seems like that opens the door to making the parser API surprising and less helpful. For example, if the LVI mitigation is enabled the Parse and Emit operations of the parser no longer round-trip, which is very new and very nonobvious.

I definitely understand that you put the code here because it was the one path you could find that all attempts to emit ASM would go through. And I know it's frustrating I don't have a concrete suggestion for a different approach. Still working on that.

llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3158

Let's add a comment about how this list was created, or maybe a reference to public documentation if this list is pulled from there. Otherwise folks coming afterward have no way to tell if this list is correct.

3186

Once the details of LVI have faded, future maintainers will really appreciate a comment here about what this mitigation *is*, what it's doing, etc.

3192

Same comment as for the list above.

sconstab updated this revision to Diff 250804.Tue, Mar 17, 9:40 AM

Added comments with pointers to LVI public documentation.

First, want to apologize for the high latency -- thanks to (gestures vaguely in the air) I've been mostly offline during the day and trying to catch up at night.

I think I've probably said my piece here. Summarizing themes from my comments:

  1. the parser seems like a weird place to put this mitigation
  2. the lists of instructions feel very magic because there's no explanation of how they were arrived at. They can also be incomplete or become out of date. Consider fixing both by matching instructions by interesting properties instead of opcodes. Include an explanation of why the properties make the instructions interesting in the context of LVI.
  3. the warnings we fire aren't actionable

Ultimately, though, these are code-quality and maintainability issues and I'm not a maintainer. Especially since I'm only occasionally-online right now, please don't consider me a blocker for the review.

llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3167

Following up on my previous comment: it seems like this is supposed to be the full set of instructions that combine a load from memory with a control flow change dependent on that load.

How are we sure the list is complete? Just from a quick look at the return instructions LLVM has for X86, it looks like LRETL and LRETQ exist and would not be mitigated.

Can we just look at the MCInstDesc and warn for instructions that mayLoad and mayAffectControlFlow? Then this list will stay up to date even if new instructions are added.

3186

This warning should at least tell users *why* this is vulnerable, and ideally offer the next step for fixing it.

3206

I didn't find any mention of these instructions, why they would "require additional considerations", or what folks should actually *do* if they hit this warning by reading https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection.

Here's my best guess:
This is a list of string instructions that can be used with REP and which update ZF in EFLAGS instead of relying on an a count in rcx to terminate". When used with REP or REPNE, these instructions become interesting because an attacker could use LVI to change the target's speculative control flow by injecting a value for a load. Since the instruction is indivisible, there's no place to put an LFENCE, so like RET these instructions need to be split into load and compare/branch components.

Is that accurate?

If so, we could explain it that way and replace the hardcoded list with hasImplicitDefOfPhysReg(X86::EFLAGS)

zbrid accepted this revision.Thu, Apr 2, 5:11 PM
zbrid added a subscriber: jyknight.

LGTM, but I'm not sure whether or not this should be upstreamed due to Matt's concerns and due to my own lack of what's okay within LLVM for maintainers. Could @jyknight or @george.burgess.iv chime in? @craig.topper, is there an owner to this component who could chime in?

sconstab marked 2 inline comments as done.Thu, Apr 2, 10:30 PM
sconstab added inline comments.
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3167

I looked into this and unfortunately the mayLoad attribute is not set for any of the RET family (see https://github.com/llvm/llvm-project/blob/b1d581019f5d4d176ad70ddffee13db247a13ef1/llvm/lib/Target/X86/X86InstrControl.td#L21). Either this is a bug or I don't actually understand what the mayLoad attribute actually means. It seems pretty strange since the POPr family has mayLoad. @craig.topper : opinion?

3186

The manual mitigation is a little bit more involved than what would typically be conveyed in a compiler warning. Would it be acceptable for the warning to point to the LVI deep dive document?

sconstab marked 2 inline comments as not done.Fri, Apr 3, 12:17 PM
craig.topper added inline comments.Fri, Apr 3, 3:04 PM
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3167

I suspect it doesn't have the mayLoad flag because from the perspective of the codegen passes that normally use that flag, the load of the return address isn't really something we would care about. It could just as well be reading the return address from some stack dedicated return address stack or a stack implemented in hardware or something.

Not sure what the implications of just adding the flag would be to codegen.

3204

I don't think this works if the rep prefix is on the line before the cmps/scas instruction. That will assemble to the same thing but the parser sees it as two instructions.

mattdr added inline comments.Tue, Apr 7, 2:56 AM
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3167

Yikes, okay. It's incredibly surprising that mayLoad isn't set for the RET family, and if anything stops working when we set it that's almost certainly a bug we want to know about.

But I can accept that fixing that is out of scope for this change -- at least, for the initial version. If so, though, that makes it all the more important that we describe how we arrived at this particular list of instructions, in enough detail that someone else can retrace our steps.

3186

Sure, a link to an external site seems fine. Can we please link to the specific part of the page that explains the mitigations for this specific set of instructions?

3204

@craig.topper any notion what the best workaround would be? One option is keeping track of whether we've seen a REP or REPNE before this.

Although perhaps a simpler approach is just not worrying about whether we saw REP. We're here to try to warn users about instructions that we can't mitigate. What's the likelihood that we see a CMPSB without REP and it turns out it _doesn't_ need mitigation?

3206

And if we hit the same problem as above with RET, where somehow the instruction tables are inaccurate and we can't rely on them yet, then it's similarly okay with me if we leave the hardcoded list of instructions as long as we provide really specific comments describing why the list is all and only these instructions.

sconstab marked 3 inline comments as done.Tue, Apr 7, 1:25 PM
sconstab added inline comments.
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3186

@mattdr Working on it!

3204

@mattdr The difference is that CMPSB without REP can be mitigated simply by following it with an LFENCE. REP CMPSB must be manually decomposed into a loop, into which an LFENCE can be inserted.

I wonder if it would suffice to simply emit a warning/error if a lone REP is encountered to indicate that the user may need to manually mitigate?

3206

Similar to providing a link to a public document in a warning message, I think it should suffice to have a link to said document (and specifically the relevant subsection of that document) in the comment.