This is an archive of the discontinued LLVM Phabricator instance.

[slh] x86 impl of ARM instrinsic + builtin for SLH
Needs ReviewPublic

Authored by zbrid on Mar 26 2019, 10:51 AM.

Details

Summary

This is similar to the work Kristof did for ARM here: https://reviews.llvm.org/D49072

For now, I have only implemented the version that lowers the intrinsic using an LFENCE. I'm workign on a version that can be lowered as an LFENCE or lowered using the control flow speculation available, so users have the option just as they do in the ARM patch.

This is intended to add to the discussion rather than be a definitive patch relating to the way we will handle spot mitigations as far as the final API/implementation in LLVM goes. Any comments about the API, the way implemented this, or anything else are welcome.

Note: This lowering to an lfence appears to be the same method that gcc currently uses in their implementation of the builtin.

This is part of implementing a technique to mitigate against Spectre v1,
similar in spirit to what has been proposed by Chandler for X86_64 at
http://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html.

This patch adds a new builtin function that provides a mechanism for
limiting the effects of miss-speculation by a CPU.
This patch provides the clang-side of the needed functionality; there is
also an llvm-side patch this patch is dependent on.

We've tried to design this in such a way that it can be used for any
target where this might be necessary. The patch provides a generic
implementation of the builtin, with most of the target-specific
support in the LLVM counter part to this clang patch.

The signature of the new, polymorphic, builtin is:

T __builtin_speculation_safe_value(T v)

T can be any integral type (signed or unsigned char, int, short, long,
etc) or any pointer type.

The builtin assures that value v will be made 0 on execution paths that
are being executed under control flow miss-speculation by the CPU, when
the miss-speculated path originated due to misprediction of a direct
conditional branch.

Whereas this still leaves open the possibility of execution on a
miss-speculated path starting at misprediction of other control flow
instructions, our believe is that the above guarantee is still useful in
mitigating vulnerability to Spectre v1-style attacks and implementable
for most, if not all, target instruction sets.

This also introduces the predefined pre-processor macro
__HAVE_SPECULATION_SAFE_VALUE, that allows users to check if their
version of the compiler supports this intrinsic.

Event Timeline

zbrid created this revision.Mar 26 2019, 10:51 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 26 2019, 10:51 AM
aheejin added inline comments.Mar 26 2019, 10:54 AM
clang/test/Preprocessor/init.c
9678

Nit: Remove the whitespace to be consistent with adjacent lines? (I think having a whitespace is better in general though)

zbrid updated this revision to Diff 192297.Mar 26 2019, 11:04 AM

update whitespace in wasm file to match surrounding

zbrid marked an inline comment as done.Mar 26 2019, 11:05 AM

Thanks for picking this up, Zola!

I quickly looked through the patch - comparing it with what I had done under D49070 and D49073.
Apart from the point remarks inline, I had the following immediate thoughts:

  1. Could you clang-format the patch?
  2. Could you rebase the patch to top-of-trunk (it seems it is a bit behind ToT)?
  3. For discussions, seeing the whole patch as it is might be helpful. OTOH, I think it also makes reviewing easier if the target-dependent and the target-independent parts would be split. I think that could also help others in implementing the intrinsics for their targets: they'd have guidance on what might be needed from that target-dependent implementation patches for X86 and AArch64.
  4. Lowering to LFENCE seems a correct lowering to me, but someone more knowledgeable about x86 should confirm.
  5. I think the LLVM-IR intrinsic should be target-independent, and not x86-specific. That would result in less duplication of code when implementing support for multiple architectures. I seem to remember that's how I implemented this in D49070. I didn't look so far at the SelectionDAG parts of this patch, as I think the differences between my implementation in D49070 and this patch may go away after making the intrinsic target-independent.

If we'd take the discussion about adding support for intrinsic T __builtin_speculation_safe_value(T v) further here, I'd be happy to abandon the patch series at D49073.
However, in that case, I think the explanation of the intrinsic there should be copied over here to provide a bit more context.

clang/lib/CodeGen/CGBuiltin.cpp
13

This line doesn't seem to be needed?

3987

line too long - run clang-format?

clang/lib/Sema/SemaChecking.cpp
1497

needs one more space of indentation?

5326

Should this be "TheCall->getNumArgs() > 1" (larger than rather then less than)?

clang/test/CodeGen/builtin-speculation-safe-value.c
2–3

When I wrote this test in D49073 this line read "REQUIRES: aarch64-registered-target". Looking at this now, I wonder why the requires might be needed, beyond the RUN line containing "-triple x86_64-linux-gnu". It'd be nice if this test didn't need a REQUIRES line.... But maybe there is a good reason it does need a requires line after all?

clang/test/Preprocessor/init.c
9678

It seems this is the only intended change in this file; all the other changes in this file were unintended for this patch?

llvm/include/llvm/IR/Intrinsics.td
1171

accidental new line diff?

llvm/include/llvm/IR/IntrinsicsX86.td
4819–4822 ↗(On Diff #192297)

I think this needs to be a target independent LLVM IR intrinsic, not x86 specific. See D49070. This will also need documentation in LangRef.rst then, also see D49070 for a possible documentation I proposed for this intrinsic there.

llvm/lib/Target/X86/X86SpeculativeLoadHardening.cpp
615–629

The lowering of the intrinsic on a 32 bit and a 64 bit value looks identical to me, so the if statement isn't needed?

zbrid updated this revision to Diff 192845.Mar 29 2019, 9:16 AM

update with clang-format

zbrid updated this revision to Diff 192852.Mar 29 2019, 10:06 AM

fix test formatting; make target independent intrinsic; add doc

zbrid updated this revision to Diff 192857.Mar 29 2019, 10:30 AM

remove unnecessary requirement from builtin test

zbrid updated this revision to Diff 192870.Mar 29 2019, 11:16 AM

remove unnecessary if in x86 slh intrinsic lowering function

zbrid marked 2 inline comments as done.Mar 29 2019, 11:18 AM
zbrid added inline comments.
llvm/lib/Target/X86/X86SpeculativeLoadHardening.cpp
615–629

Good catch.

zbrid updated this revision to Diff 192873.Mar 29 2019, 11:28 AM
zbrid marked an inline comment as done.

actually fix if statement

zbrid retitled this revision from [slh] x86 impl of ARM instrinsic for SLH to [slh] x86 impl of ARM instrinsic + builtin for SLH.Apr 1 2019, 9:18 AM
zbrid edited the summary of this revision. (Show Details)

This intrinsic got added to gcc a while ago and should become available in the upcoming gcc 9 release.
In gcc however, the prototype of the intrinsic is slightly different (see https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html):
type __builtin_speculation_safe_value (type val, type failval)
It provides a second optional argument "failval". From the gcc documentation: "The function may use target-dependent speculation tracking state to cause failval to be returned when it is known that speculative execution has incorrectly predicted a conditional branch operation."
So, when implementing the intrinsic using a speculation barrier such as lfence, that failval argument doesn't have any effect. However, when lowering the intrinsic using speculation tracking similar to how that's used in SLH, this failval parameter is used to return a non-zero value on miss-speculation, in case the developer prefers that over the default zero value.

I think we should make the intrinsic compatible with the one introduced in gcc.

llvm/test/CodeGen/X86/speculative-load-hardening-intrinsic.ll
2

I guess the -mtriple command line option may not be needed since the IR file contain "target triple" and "target datalayout" information?

4–5

I guess this is not strictly necessary for this test, so should be removed?

9–63

Thanks for those updates, Zola. It makes it easier to compare this patch with the code I wrote earlier.
Doing that comparison, I see that I had a few changes too in target-independent SelectionDAG under lib/Codegen/SelectionDAG.
IIRC, you might find that you'll need that code if you also add tests here to test the correct thing happens when applying the intrinsic on other types than i32 or i64.
You probably also would want a test on a pointer data type here, I guess.

65–72

I guess this is not strictly necessary for this test, so should be removed?