
[x86/SLH] Teach the x86 speculative load hardening pass to harden against v1.2 BCBS attacks directly.

Authored by chandlerc on Jul 23 2018, 4:46 AM.



Attacks using Spectre v1.2 (a subset of BCBS) are described in the paper

The core idea is to speculatively store over the address in a vtable,
jumptable, or other target of indirect control flow that will be
subsequently loaded. Speculative execution after such a store can
forward the stored value to subsequent loads, and if called or jumped
to, the speculative execution will be steered to this potentially
attacker controlled address.
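
As a rough illustration of the pattern described above (not code from the patch), the following sketch shows a bounds-checked store followed by an indirect call through a pointer stored next to the buffer. The names `Table`, `Handler`, `doubleIt`, and `storeThenDispatch` are hypothetical, and the claim that `slots[4]` would alias `handler` assumes a typical 64-bit layout with no padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical types for illustration only; nothing here comes from the patch.
using Handler = int (*)(int);

inline int doubleIt(int x) { return 2 * x; }

struct Table {
  std::uintptr_t slots[4]; // storage indexed by a possibly attacker-chosen value
  Handler handler;         // indirect call target placed right after the slots
};

// Architecturally safe: the store only happens when idx is in bounds.
// If the bounds check is mispredicted, the store may still execute
// speculatively with an out-of-bounds idx (idx == 4 would typically alias
// `handler`), and the stored value can be forwarded to the load of
// `handler` below, steering the speculative indirect call.
inline int storeThenDispatch(Table &t, std::size_t idx, std::uintptr_t value,
                             int arg) {
  if (idx < 4) {
    t.slots[idx] = value; // the speculative store can bypass this check
  }
  assert(t.handler != nullptr);
  return t.handler(arg);  // load of the call target, then indirect call
}
```

Architecturally, an out-of-bounds `idx` never modifies `handler`; the concern is only what happens on the mispredicted path before the branch resolves.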

Up until now, this could be mitigated by enabling retpolines. However,
that is a relatively expensive technique for mitigating this particular
flavor, especially because in most cases SLH will have already mitigated
it. To fully mitigate this with SLH, we need to do two core things:

  1. Unfold loads from calls and jumps, allowing the loads to be post-load hardened.
  2. Force hardening of incoming registers even if we didn't end up needing to harden the load itself.
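
The two steps above can be modeled in a few lines. This is a minimal sketch, assuming the usual SLH convention that the predicate state is all zeros on the correct path and all ones while misspeculating, and that hardening a value means OR-ing the predicate state into it. The function names are hypothetical and this is not the pass's actual implementation, which operates on machine instructions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed SLH convention: predicate state is 0 on the correct path and
// all ones while misspeculating.
constexpr std::uint64_t kCorrectPath = 0;
constexpr std::uint64_t kMisspeculating = ~std::uint64_t{0};

// Step 1 (unfold): instead of calling through memory, first load the
// target into a "register" so that it can be hardened separately.
inline std::uint64_t loadTarget(const std::uint64_t *slot) {
  assert(slot != nullptr);
  return *slot;
}

// Step 2 (harden the register): OR the predicate state into the target.
// On the correct path this is a no-op; while misspeculating, the target
// becomes all ones, which is not an attacker-chosen address.
inline std::uint64_t hardenTarget(std::uint64_t target,
                                  std::uint64_t predState) {
  return target | predState;
}
```

The point of the model is that the hardened value depends only on the predicate state, not on where the target was loaded from, which is why it also covers a speculatively stored target.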

The reason we need to do these two things is that hardening calls and
jumps against this particular variant is importantly different from
hardening against leaks of secret data. Because the "bad" data here isn't
a secret, but is in fact speculatively stored by the attacker, it may be
loaded from any address, regardless of whether it is read-only memory,
mapped memory, or a "hardened" address. The only 100% effective way to
harden these instructions is to harden their operand itself. But to
the extent possible, we'd like to take advantage of all the other
hardening going on; we just need a fallback in case none of that
happened to cover the particular input to the control transfer.

This patch implements all of this, but it isn't quite ready to go yet.
First, there is some duplicated code between this patch and the
post-load hardening. I'll work on refactoring that code separately and
then this patch will simplify when rebased. I also need to test to see
what (if anything) we need to do so that these hardening steps are
naturally skipped when retpolines are in fact enabled, but that should
be trivial.

However, this patch shows all of the important mechanics here, including
the hoops we have to go through to unfold the loads from all of the
different instructions and then harden their incoming registers. It also
shows the expected result on the indirect test case.

Users of SLH are currently paying 2% to 6% performance overhead
for retpolines, but this mechanism is expected to be substantially


Event Timeline

chandlerc created this revision. Jul 23 2018, 4:46 AM
emaste added a subscriber: emaste. Jul 23 2018, 8:17 AM
craig.topper added inline comments. Jul 23 2018, 4:27 PM
843 ↗(On Diff #156751)

Why not use getOpcodeAfterMemoryUnfold?

echristo accepted this revision. Jul 23 2018, 5:42 PM

Couple of small nits, ctopper's suggestion seems reasonable too. Only "better" way I had of dealing with this would be not folding the load at all in the first place, but it seems harder to justify given the existing unfolding machinery.


2233–2236 ↗(On Diff #156751)

Mind sinking this closer to the if statement?

2244 ↗(On Diff #156751)

I suppose technically you want < 8?

This revision is now accepted and ready to land. Jul 23 2018, 5:42 PM
chandlerc added inline comments. Jul 23 2018, 8:39 PM
843 ↗(On Diff #156751)

Because that doesn't (trivially) give me the register class...

But I can use that here to nuke a few of these lines.

If I'm missing a way to write this more simply let me know. The tricky part is passing in the register to the unfold API.

chandlerc updated this revision to Diff 156966. Jul 23 2018, 8:40 PM

Split the TCRETURN folding into to make the
messy situation there more clear.

Updated this patch based on review feedback.

chandlerc updated this revision to Diff 157013. Jul 24 2018, 6:01 AM

Rebase on top of the cleaned up folding patch, and now without any copy/paste
code smells. We just call out to the generic low-level register hardening

Other than checking for interactions w/ retpolines, this should be ready to go.

chandlerc updated this revision to Diff 157015. Jul 24 2018, 6:16 AM

Confirmed that this does-the-right-thing with retpolines (as they get
a different instruction).

Notably, we still harden a loaded value that is potentially a secret, because
speculating even a retpoline has a risk of disclosing the loaded value.
However, when the target of the retpoline is not considered a secret (it is
loaded from RO memory for example), we don't force hardening the target as the
retpoline will block any BCBS-style attack.
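
The interaction described above can be summarized as a small decision function. This is a simplified sketch of the stated policy, not the pass's code: `TargetInfo`, `Action`, and `chooseHardening` are hypothetical names, and the real pass makes these decisions per machine instruction rather than from three booleans.

```cpp
#include <cassert>

// Hypothetical summary of what is known about an indirect call or jump target.
struct TargetInfo {
  bool loadedValueMayBeSecret; // e.g. not loaded from read-only memory
  bool alreadyHardened;        // another hardening step already covered it
  bool retpolineEnabled;       // the control transfer goes through a retpoline
};

enum class Action { None, HardenLoadedValue, ForceHardenTarget };

// Simplified model of the policy described in the review comment.
inline Action chooseHardening(const TargetInfo &info) {
  if (info.alreadyHardened)
    return Action::None; // reuse hardening that already happened
  if (info.loadedValueMayBeSecret)
    return Action::HardenLoadedValue; // needed with or without retpolines
  if (info.retpolineEnabled)
    return Action::None; // the retpoline blocks the BCBS-style attack
  return Action::ForceHardenTarget; // fallback: harden the operand itself
}

// Usage example covering the two retpoline cases from the text.
inline void checkRetpolineCases() {
  assert(chooseHardening({true, false, true}) == Action::HardenLoadedValue);
  assert(chooseHardening({false, false, true}) == Action::None);
}
```

Without retpolines, the last branch is the fallback from the patch description: the target register is hardened even though the load itself did not need it.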

I've added a retpoline mode to the indirect test which shows both of these

I think this patch is now good-to-go for a last round of review.

echristo accepted this revision. Jul 24 2018, 3:24 PM

Couple of small nits, but that's it. Still LGTM.

2150–2151 ↗(On Diff #157015)

Nit: Flow.

2187 ↗(On Diff #157015)

Nit: I know you don't mean it, but it sounds like you're talking about r0 on some architecture :)

chandlerc marked 2 inline comments as done. Jul 24 2018, 6:32 PM

All done and submitting! Thanks!

2187 ↗(On Diff #157015)

Doh, yeah. Simpler to just say 'the first operand of the instruction'. Thanks.

This revision was automatically updated to reflect the committed changes.
chandlerc marked an inline comment as done.
Herald added a project: Restricted Project. Sep 3 2019, 12:25 AM