Introduce __builtin_load_no_speculate
Needs Review, Public

Authored by kristof.beyls on Jan 5 2018, 3:02 AM.

Details

Reviewers
olista01
Summary

Recently, Google Project Zero disclosed several classes of attack
against speculative execution. One of these, known as variant-1
(CVE-2017-5753), allows explicit bounds checks to be bypassed under
speculation, providing an arbitrary read gadget. Further details can
be found on the GPZ blog [1].

This patch adds a new builtin function that provides a mechanism for
limiting speculation by a CPU after a bounds-checked memory access.
This patch provides the Clang side of the needed functionality; it
depends on a companion LLVM-side patch.
We've tried to design this in such a way that it can be used for any
target where this might be necessary. The patch provides a generic
implementation of the builtin, with most of the target-specific
support in the LLVM counterpart to this Clang patch.

The signature of the new, polymorphic, builtin is:

T __builtin_load_no_speculate(const volatile T *ptr,
                              const volatile void *lower,
                              const volatile void *upper,
                              T failval,
                              const volatile void *cmpptr)

T can be any integral type (signed or unsigned char, int, short, long,
etc.) or any pointer type.

The builtin implements the following logical behaviour:

inline T __builtin_load_no_speculate(const volatile T *ptr,
                                     const volatile void *lower,
                                     const volatile void *upper, T failval,
                                     const volatile void *cmpptr) {
  T result;
  if (cmpptr >= lower && cmpptr < upper)
    result = *ptr;
  else
    result = failval;
  return result;
}

In addition, the builtin ensures that future speculation using *ptr may
continue only if cmpptr lies within the specified bounds.
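
A typical caller would keep its architectural bounds check and route the
load through the builtin. The following is a minimal sketch under stated
assumptions, not the patch's implementation: because the builtin exists only
as a proposal here, load_no_speculate_int is a hypothetical stand-in,
specialised to int, that models only the logical behaviour and provides no
protection against speculation. The names table, TABLE_LEN and table_get are
also invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed builtin, specialised to int.
 * It models only the logical behaviour; it does NOT inhibit speculation. */
static int load_no_speculate_int(const volatile int *ptr,
                                 const volatile void *lower,
                                 const volatile void *upper,
                                 int failval,
                                 const volatile void *cmpptr)
{
    /* Compare as integers so that unrelated pointers can be ordered. */
    uintptr_t c = (uintptr_t)cmpptr;

    assert(ptr != NULL);
    if (c >= (uintptr_t)lower && c < (uintptr_t)upper)
        return *ptr;
    return failval;
}

#define TABLE_LEN 8
static int table[TABLE_LEN] = {10, 11, 12, 13, 14, 15, 16, 17};

/* Returns table[idx], or -1 if idx is out of range. */
int table_get(size_t idx)
{
    if (idx < TABLE_LEN) {
        /* With the real builtin, a mispredicted bounds check would not be
         * able to feed an out-of-bounds value into later speculative code. */
        return load_no_speculate_int(&table[idx], table, table + TABLE_LEN,
                                     -1, &table[idx]);
    }
    return -1;
}
```

In this sketch the explicit `idx < TABLE_LEN` test is kept so that no
out-of-bounds pointer is ever formed architecturally; the builtin's range
check is what would constrain the speculative path.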

To make the builtin easier to use, the final two arguments can both be
omitted: failval then defaults to 0, and ptr is used in place of cmpptr
for the range check. In addition, either lower or upper (but not both)
may be a literal NULL, in which case the expansion ignores that boundary
condition.
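
The defaulting rules can be modelled roughly as follows. This is only an
illustrative sketch: in_range, load_long_defaults, samples and sample_at are
hypothetical names, the model is specialised to long, and it tests for NULL at
run time whereas the proposal requires a literal NULL that the compiler
resolves during expansion. It provides no speculation barrier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the range check; a NULL bound is ignored. */
static int in_range(const volatile void *lower, const volatile void *upper,
                    const volatile void *cmpptr)
{
    uintptr_t c = (uintptr_t)cmpptr;

    assert(lower != NULL || upper != NULL); /* both NULL is not allowed */
    if (lower != NULL && c < (uintptr_t)lower)
        return 0;
    if (upper != NULL && c >= (uintptr_t)upper)
        return 0;
    return 1;
}

/* Models the three-argument form for long: failval defaults to 0 and
 * ptr doubles as cmpptr. */
long load_long_defaults(const volatile long *ptr,
                        const volatile void *lower,
                        const volatile void *upper)
{
    return in_range(lower, upper, ptr) ? *ptr : 0;
}

static const long samples[4] = {5, 6, 7, 8};

/* Example caller that supplies only an upper bound. */
long sample_at(size_t i)
{
    if (i < 4)
        return load_long_defaults(&samples[i], NULL, samples + 4);
    return 0;
}
```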

This also introduces the predefined pre-processor macro
__HAVE_LOAD_NO_SPECULATE, which allows users to check whether their
version of the compiler supports this builtin.
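
Code that has to build with compilers lacking the builtin could guard on the
macro. The sketch below assumes a wrapper macro LOAD_NO_SPEC and a helper
read_byte, both hypothetical. The fallback branch reproduces only the
architectural behaviour (with ptr used as cmpptr), evaluates its ptr argument
more than once, and offers no protection against speculation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __HAVE_LOAD_NO_SPECULATE
/* Four-argument form: cmpptr is omitted, so ptr is range-checked. */
#define LOAD_NO_SPEC(ptr, lo, hi, fail) \
    __builtin_load_no_speculate((ptr), (lo), (hi), (fail))
#else
/* Hypothetical fallback: architectural behaviour only, no barrier.
 * Note that ptr is evaluated more than once. */
#define LOAD_NO_SPEC(ptr, lo, hi, fail)                     \
    (((uintptr_t)(ptr) >= (uintptr_t)(lo) &&                \
      (uintptr_t)(ptr) < (uintptr_t)(hi)) ? *(ptr) : (fail))
#endif

/* Returns buf[i], or 0 if i is out of range. */
unsigned char read_byte(const unsigned char *buf, size_t len, size_t i)
{
    assert(buf != NULL);
    if (i >= len)
        return 0;
    return LOAD_NO_SPEC(&buf[i], buf, buf + len, (unsigned char)0);
}
```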

The builtin is defined for all architectures, even if they do not
provide a mechanism for inhibiting speculation. If they do not have
such support, the compiler will emit a warning and simply implement the
architectural behaviour of the builtin.

This patch can be used with the header file that Arm recently
published here: https://github.com/ARM-software/speculation-barrier.

Kernel patches are also being developed, e.g.
https://lkml.org/lkml/2018/1/3/754. The intent is that eventually
code like this will be able to use support directly from the compiler
in a portable manner.

Similar patches are also being developed for GCC and have been posted to
their development list, see
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00205.html

[1] More information on the topic can be found here:
https://googleprojectzero.blogspot.co.uk/2018/01/reading-privileged-memory-with-side.html
Arm specific information can be found here:
https://www.arm.com/security-update


kristof.beyls created this revision. Jan 5 2018, 3:02 AM
fhahn added a subscriber: fhahn. Jan 5 2018, 3:03 AM
emaste added a subscriber: emaste. Jan 5 2018, 4:02 AM

The API design has been discussed in detail on the GCC mailing list over the past weeks. As a result, we propose adapting the API so that efficient code can also be generated on architectures that need a barrier instruction to achieve the desired semantics.

The main change in the proposed API is to drop the failval parameter and to change the semantics as described below.
There is a more detailed rationale for these changes at https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01546.html

I haven't updated the code to implement the new specification yet, but thought I'd share the new specification as soon as possible, while I find the time to adapt the implementation:

The signature of the new, polymorphic, builtin is:

T __builtin_speculation_safe_load(const volatile T *ptr,
                                  const volatile void *lower,
                                  const volatile void *upper,
                                  const volatile void *cmpptr)

T can be any integral type (signed or unsigned char, int, short, long, etc) or any pointer type.

This builtin provides a means to limit the extent to which a processor can continue speculative execution with the result of loading a value stored at ptr. The boundary conditions, described by cmpptr, lower and upper, define the conditions under which execution after the load can continue safely:

  • When the builtin is not being executed speculatively:
    • if lower <= cmpptr < upper, the value at address ptr is returned.
    • if cmpptr is not within these bounds, the behaviour is undefined.
  • When the builtin is being executed speculatively, either:
    • Execution of instructions following the builtin that have a dependency on the result of the builtin will be blocked until the builtin is no longer executing speculatively. At this point, the non-speculative semantics in the first case above apply.
    • Speculation may continue using the value at address ptr as the return value of the builtin if lower <= cmpptr < upper, or an unspecified constant value if cmpptr is outside these bounds.

The final argument, cmpptr, may be omitted if it is the same as ptr.

The builtin is supported for all architectures, but on machines where target-specific support for inhibiting speculation is not implemented, or not necessary, the compiler will emit a warning.

The pre-processor macro __HAVE_SPECULATION_SAFE_LOAD is defined with the value 1 when the compiler supports this builtin.
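
As a rough illustration of the revised interface, the sketch below guards on
the macro and falls back to a model of the non-speculative semantics only.
SAFE_LOAD, ptr_in_bounds and get_char are hypothetical names introduced for
the example. Since an out-of-bounds cmpptr is undefined behaviour under the new
specification, the fallback chooses to trap with assert; that choice is an
assumption of this sketch, not something the specification requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __HAVE_SPECULATION_SAFE_LOAD
/* Three-argument form: cmpptr is omitted because it equals ptr. */
#define SAFE_LOAD(ptr, lo, hi) \
    __builtin_speculation_safe_load((ptr), (lo), (hi))
#else
/* Hypothetical fallback: non-speculative semantics only, no barrier. */
static int ptr_in_bounds(const volatile void *p, const volatile void *lo,
                         const volatile void *hi)
{
    return (uintptr_t)p >= (uintptr_t)lo && (uintptr_t)p < (uintptr_t)hi;
}
/* Out-of-bounds use is undefined in the spec; this model asserts. */
#define SAFE_LOAD(ptr, lo, hi) \
    (assert(ptr_in_bounds((ptr), (lo), (hi))), *(ptr))
#endif

/* Returns s[i], or 0 if i is out of range. The caller's own bounds check
 * handles the out-of-range case now that failval is gone. */
char get_char(const char *s, size_t len, size_t i)
{
    if (i < len)
        return SAFE_LOAD(&s[i], s, s + len);
    return 0;
}
```

Compared with the earlier design, the value returned for an out-of-range
index now has to come from the caller's own code path, because the builtin no
longer takes a failval argument.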