Page MenuHomePhabricator

kristof.beyls (Kristof Beyls)
User

Projects

User does not belong to any projects.

User Details

User Since
Jun 18 2014, 12:59 AM (234 w, 1 d)

Recent Activity

Yesterday

kristof.beyls updated the diff for D54896: Introduce control flow speculation tracking pass for AArch64..

This updated diff addresses Oliver's final comment, namely about what to do when the program uses X16, e.g because of use of inline assembly specifically requesting to use X16.
Besides all the options I came up with previously, there is a 4th option: prevent speculation by inserting DSB SYS/ISB instruction pairs. Conceptually this is not unlike using lfence in X86SpeculativeLoadHardening.
I believe the pros/cons of this approach are (compared to the 3 options I listed previously):
Pros:

  • A relatively simple implementation.
  • Still keeping the advantages of doing speculation hardening very late.
  • No silent non-protected code.
  • This alternative protection mechanism is expected to only be needed for a very small amount of code.

Cons:

  • This likely has higher overhead than if we could still use a data flow mechanism to track when control flow miss-speculation happens.
Wed, Dec 12, 5:45 AM

Thu, Dec 6

kristof.beyls added a comment to D54896: Introduce control flow speculation tracking pass for AArch64..

There currently isn't even a user interface to reserve X16.

X16 can be reserved by the user using inline assembly, either with a clobber or named register variable. I can think of a few cases where this might happen in real code:

D51432 uses it in libunwind to implement unwinding through functions using return-address signing.
Inline assembly which contains a function call will need to clobber X16 and X17.
If moving this pass to before register allocation would add a lot of extra complexity, maybe this could also be solved by copying the taint into SP before each inline asm block, and back out afterwards, like we currently do for calls?

Thu, Dec 6, 7:27 AM
kristof.beyls updated the diff for D54896: Introduce control flow speculation tracking pass for AArch64..
  • Addressing most of Oliver's comments.
Thu, Dec 6, 6:34 AM

Wed, Dec 5

kristof.beyls added inline comments to D54909: [clang][slh] add Clang attr no_speculative_load_hardening.
Wed, Dec 5, 10:02 AM

Mon, Dec 3

kristof.beyls added inline comments to D54909: [clang][slh] add Clang attr no_speculative_load_hardening.
Mon, Dec 3, 10:33 AM

Mon, Nov 26

kristof.beyls created D54896: Introduce control flow speculation tracking pass for AArch64..
Mon, Nov 26, 6:25 AM

Thu, Nov 22

kristof.beyls accepted D54555: [clang][slh] add attribute for speculative load hardening.

LGTM now. I'm not an expert on the clang attribute mechanics, so am relying on Aaron's judgement/review for that part.
Thanks Zola!

Thu, Nov 22, 12:00 AM

Tue, Nov 20

kristof.beyls added inline comments to D54555: [clang][slh] add attribute for speculative load hardening.
Tue, Nov 20, 2:25 AM

Fri, Nov 16

kristof.beyls added a comment to D54555: [clang][slh] add attribute for speculative load hardening.

Does this hardening impact the ABI in any way? e.g., do we have to do anything special to handle calls through function pointers where the bound function pointer is marked with this attribute?

Fri, Nov 16, 6:50 AM
kristof.beyls added inline comments to D54555: [clang][slh] add attribute for speculative load hardening.
Fri, Nov 16, 6:34 AM

Nov 9 2018

kristof.beyls accepted D53908: [AArch64] Support HiSilicon's TSV110 processor.

This patch looks fine to me, assuming the TSV110 core implements Armv8.2 including the optional extensions ARMv8.2-FP16, ARMv8.2-DotProd, ARMv8.2-FHM and SPE.

Nov 9 2018, 2:10 AM

Oct 31 2018

kristof.beyls updated the diff for D53691: Introduce bug life cycle documentation..

Add a sentence on statuses involved when triaging.

Oct 31 2018, 6:25 AM
kristof.beyls added a reviewer for D53927: [AArch64] Enable libm vectorized functions via SLEEF: fpetrogalli.
Oct 31 2018, 4:03 AM
kristof.beyls updated subscribers of D53908: [AArch64] Support HiSilicon's TSV110 processor.
Oct 31 2018, 1:56 AM

Oct 29 2018

kristof.beyls added inline comments to D53691: Introduce bug life cycle documentation..
Oct 29 2018, 8:59 AM
kristof.beyls updated the diff for D53691: Introduce bug life cycle documentation..
Oct 29 2018, 8:56 AM
kristof.beyls added inline comments to D53691: Introduce bug life cycle documentation..
Oct 29 2018, 8:18 AM

Oct 28 2018

kristof.beyls added inline comments to D53691: Introduce bug life cycle documentation..
Oct 28 2018, 1:26 AM

Oct 26 2018

kristof.beyls added inline comments to D53691: Introduce bug life cycle documentation..
Oct 26 2018, 7:44 AM

Oct 25 2018

kristof.beyls updated subscribers of D53691: Introduce bug life cycle documentation..
Oct 25 2018, 2:32 AM
kristof.beyls created D53691: Introduce bug life cycle documentation..
Oct 25 2018, 2:26 AM
kristof.beyls added a comment to D53190: ARM: avoid infinite combining loop.

I guess you're mainly asking review on this to get opinions on the approach to introduce new pseudo instructions just to avoid cycles during DAG combining and/or lowering?
From the patch, I furthermore guess that introducing a new pseudo leads to quite a bit of code duplication/violations of the "don't-repeat-yourself" principle in the instruction matching patterns?

Oct 25 2018, 1:24 AM

Oct 12 2018

kristof.beyls added inline comments to D53121: [Driver] Add defaults for Android ARM FPUs..
Oct 12 2018, 1:44 AM

Oct 11 2018

kristof.beyls updated subscribers of D53121: [Driver] Add defaults for Android ARM FPUs..

This LGTM, but we should wait to hear from Kristof before submitting.

Oct 11 2018, 1:33 AM

Sep 21 2018

kristof.beyls added a comment to D52341: [mctoll] Initial changes for MC to LL raiser that takes a binary and raises it back to llvm bitcode.

Hi Aaron,

Sep 21 2018, 12:15 AM

Sep 19 2018

kristof.beyls added a reviewer for D52258: Fix for bug 34002: olista01.
Sep 19 2018, 12:32 AM

Sep 18 2018

kristof.beyls added a comment to D51862: [LNT] more pep8 code style fixes.

I'm also using pycodestyle 2.4.0. The error/warning was not about raw strings in regexps, rather illegal escape sequences, i.e.:

.\lnt\server\db\regression.py:58:20: W605 invalid escape sequence '\d'

I think using raw strings is better than using something like \\d.

Btw, I doubt that I can commit to llvm repo, so I need somebody to commit this for me. Thanks!

Sep 18 2018, 3:06 AM

Sep 7 2018

kristof.beyls added a reviewer for D51780: ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.: SjoerdMeijer.
Sep 7 2018, 5:40 AM

Sep 6 2018

kristof.beyls added inline comments to D48580: [AArch64] Support reserving x1-7 registers..
Sep 6 2018, 2:24 AM

Aug 30 2018

kristof.beyls accepted D51465: Revamp test-suite documentation.

I am as excited by this as the other reviewers.
Thank you very much for this, Matthias!

Aug 30 2018, 5:50 AM

Aug 27 2018

kristof.beyls accepted D51157: [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative Load Hardening..

I'm not an expert on many of the areas touched by this patch, but it looks fine from me from a high-level point-of-view, modulo a few nits I have on a few comments.

Aug 27 2018, 5:18 AM

Aug 23 2018

kristof.beyls added inline comments to D51157: [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative Load Hardening..
Aug 23 2018, 11:15 PM

Jul 10 2018

kristof.beyls updated the summary of D49073: Introducing __builtin_speculation_safe_value.
Jul 10 2018, 1:14 AM

Jul 9 2018

kristof.beyls updated the diff for D49072: Enable automatic mitigation against control flow speculation..

update diff with full context

Jul 9 2018, 4:58 AM
kristof.beyls added a child revision for D49070: Introduce llvm.speculation_safe_value intrinsic.: D49073: Introducing __builtin_speculation_safe_value.
Jul 9 2018, 4:50 AM
kristof.beyls added a parent revision for D49073: Introducing __builtin_speculation_safe_value: D49070: Introduce llvm.speculation_safe_value intrinsic..
Jul 9 2018, 4:50 AM
kristof.beyls created D49073: Introducing __builtin_speculation_safe_value.
Jul 9 2018, 4:49 AM
kristof.beyls added a parent revision for D49072: Enable automatic mitigation against control flow speculation.: D49071: Enable lowering of llvm.speculation_safe_value to DSB/ISB pair..
Jul 9 2018, 4:46 AM
kristof.beyls added a child revision for D49071: Enable lowering of llvm.speculation_safe_value to DSB/ISB pair.: D49072: Enable automatic mitigation against control flow speculation..
Jul 9 2018, 4:46 AM
kristof.beyls created D49072: Enable automatic mitigation against control flow speculation..
Jul 9 2018, 4:46 AM
kristof.beyls removed a child revision for D41761: Introduce llvm.nospeculateload intrinsic: D49071: Enable lowering of llvm.speculation_safe_value to DSB/ISB pair..
Jul 9 2018, 4:44 AM
kristof.beyls edited parent revisions for D49071: Enable lowering of llvm.speculation_safe_value to DSB/ISB pair., added: 1; removed: 1.
Jul 9 2018, 4:44 AM
kristof.beyls added a child revision for D49070: Introduce llvm.speculation_safe_value intrinsic.: D49071: Enable lowering of llvm.speculation_safe_value to DSB/ISB pair..
Jul 9 2018, 4:44 AM
kristof.beyls added a child revision for D41761: Introduce llvm.nospeculateload intrinsic: D49071: Enable lowering of llvm.speculation_safe_value to DSB/ISB pair..
Jul 9 2018, 4:43 AM
kristof.beyls added a parent revision for D49071: Enable lowering of llvm.speculation_safe_value to DSB/ISB pair.: D41761: Introduce llvm.nospeculateload intrinsic.
Jul 9 2018, 4:43 AM
kristof.beyls created D49071: Enable lowering of llvm.speculation_safe_value to DSB/ISB pair..
Jul 9 2018, 4:42 AM
kristof.beyls added a child revision for D49069: Introduce control flow speculation tracking for AArch64.: D49070: Introduce llvm.speculation_safe_value intrinsic..
Jul 9 2018, 4:39 AM
kristof.beyls added a parent revision for D49070: Introduce llvm.speculation_safe_value intrinsic.: D49069: Introduce control flow speculation tracking for AArch64..
Jul 9 2018, 4:39 AM
kristof.beyls created D49070: Introduce llvm.speculation_safe_value intrinsic..
Jul 9 2018, 4:39 AM
kristof.beyls created D49069: Introduce control flow speculation tracking for AArch64..
Jul 9 2018, 4:36 AM

Jun 28 2018

kristof.beyls added inline comments to D47930: Make email options of find_interesting_reviews more flexible..
Jun 28 2018, 7:46 AM
kristof.beyls updated the diff for D47930: Make email options of find_interesting_reviews more flexible..
Jun 28 2018, 7:44 AM
kristof.beyls added a comment to D48580: [AArch64] Support reserving x1-7 registers..
Jun 28 2018, 12:02 AM

Jun 27 2018

kristof.beyls updated subscribers of D48581: [AArch64] Support reserving x1-7 registers..
Jun 27 2018, 12:50 AM
kristof.beyls added a comment to D48580: [AArch64] Support reserving x1-7 registers..

I believe this is aiming to implement the -ffixed-reg command line option from gcc.
The documentation is at https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html:

Jun 27 2018, 12:36 AM

Jun 22 2018

kristof.beyls accepted D48077: [LNT] Allow --use-perf=profile and --run-under to work together.

It's clear from your description that the run_under _mutateCommandLine isn't a "private" function in the module as is, and therefore removing the under score makes sense to me.
Maybe it'd be better for the mutatePlan in perf.py to be implemented in a different way so that _mutateCommandLine really can be private in the run_under module.
I'm afraid I don't have enough experience here to know whether that would be a reasonable goal, implementable with reasonable effort.
I hope @MatzeB could share his opinion on this.

Jun 22 2018, 12:05 AM

Jun 8 2018

kristof.beyls created D47930: Make email options of find_interesting_reviews more flexible..
Jun 8 2018, 12:07 AM

Jun 4 2018

kristof.beyls accepted D47581: [LNT] Fix use of field index in daily report.

LGTM.

Jun 4 2018, 6:06 AM

Jun 1 2018

kristof.beyls added a comment to D47581: [LNT] Fix use of field index in daily report.

Maybe the best way to test this is to add a helper function in tests/server/ui/V4Pages.py to extract the full content from a specific field from the table (i.e. including all HTML tags) and check against exact HTML content?

What do you think - would that be feasible and useful to implement?

After trying some things I think it will be easier to extract just the destination of the links and check that those are correct, especially as it's just the destination of the links that was wrong.

Jun 1 2018, 7:48 AM

May 31 2018

kristof.beyls added a comment to D47581: [LNT] Fix use of field index in daily report.

Hi John,

May 31 2018, 5:24 AM

May 17 2018

kristof.beyls added a comment to D46192: Script to match open Phabricator reviews with potential reviewers.

Oh - and I indeed also ran the source code through YAPF - thanks for that pointer, I didn't know that existed.

May 17 2018, 3:10 AM
kristof.beyls updated the diff for D46192: Script to match open Phabricator reviews with potential reviewers.

Many thanks for the detailed review, Jonas!
I think I've addressed all your comments - very useful feedback!

May 17 2018, 3:08 AM

May 14 2018

kristof.beyls added a comment to D46714: [test-suite] Add list of programs we might add..

It does seem like a wiki would be nice to maintain this kind of information. In the absence of that, I think that a file in the test-suite repository, or a page in www are about equally easy/hard to maintain: it requires commit access to make any changes.
A file in www in theory could be more visible as it becomes part of the llvm.org web pages. That being said, source code is also viewable online, so it's easy to browse this text too.

May 14 2018, 11:51 PM
kristof.beyls added a comment to D45000: [LNT] fix some typos/dead code.

@MatzeB Is there something else I should do for this to be commited? First time using the Phabricator so I'm a bit unfamiliar with the process.

May 14 2018, 1:30 AM

May 7 2018

kristof.beyls added inline comments to D46192: Script to match open Phabricator reviews with potential reviewers.
May 7 2018, 7:50 AM
kristof.beyls updated the diff for D46192: Script to match open Phabricator reviews with potential reviewers.
May 7 2018, 7:48 AM
kristof.beyls updated the diff for D46192: Script to match open Phabricator reviews with potential reviewers.

Updating diff to make it add the script to llvm/utils/Reviewing.
I chose to put it in a new folder "Reviewing", since ideas for further scripts related to reviewing have been made on this thread.

May 7 2018, 6:37 AM
kristof.beyls accepted D46433: [LNT] lnt profile upgrade command for large Spec2017 perf.data aborts.

I hope that in the long run, we can get rid of this cPerf.cpp C++ implementation, and instead make use of "perf data convert --to-ctf" to convert perf.data files into a CTF format which should be easier to parse in python.
At the time this C++ implementation was added, the "perf data convert --to-ctf" functionality tended not to be compiled in to perf as distributed by the main linux distros.
Looking at a reason stack overflow issue about this suggests that the situation may not have changed much (https://stackoverflow.com/questions/43576997/building-perf-with-babeltrace-for-perf-to-ctf-conversion), hence we'll probably need this C++ implementation for some time to come.

May 7 2018, 3:12 AM

May 1 2018

kristof.beyls added a comment to D46192: Script to match open Phabricator reviews with potential reviewers.

This should probably go into llvm/utils, not the root.

It would also be good to have the opposite script - "given git diff / commits, whom should i add as reviewers?"

I once wrote the following one-liner to do this. It works pretty well but it can be *very* slow. It will suggest a top 10 of reviewers for your staged changes.

git diff --name-only --cached | xargs -n 1 git blame --porcelain | grep \"^author \" | sort | uniq -c | sort -nr | head -10

It could evaluate CODE_OWNERS.txt, git log / git shortlog -sne of the changed files, git blame.

May 1 2018, 6:46 AM
kristof.beyls added a comment to D46192: Script to match open Phabricator reviews with potential reviewers.

Hi Kristof. Thanks for this.

One more heuristic that might help getting quick reviews:

  1. If there is overlap between the files touched by the patch-under-review and and a 'reviewer' of recent changes to same file, that person may be a good reviewer.

    This heuristic is helpful where original implementor may have moved on, but new active reviewers in same area are happy to review. Some commit messages have 'reviewers' listed and that could be used.
May 1 2018, 6:45 AM
kristof.beyls added a comment to D46192: Script to match open Phabricator reviews with potential reviewers.

Do you plan on actually commiting this to the repository, or would you like any feedback nontheless?

May 1 2018, 6:44 AM
kristof.beyls added a comment to D46215: [AArch64] Support reserving x16 and x17 register.

I have some doubts this is the right long term approach.

As https://android.googlesource.com/kernel/common/+/c0385b24af15020a1e505f2c984db0d7c0d017e1%5E%21/#F4 indicates, there are others who have use cases for reserving other registers than x16,x17,x18. (The linked patch reserves x1,x2,x3,x4,x5,x6,x7).
With that indication, I think it may be better to bite the bullet now and implement full support for gcc command line option "-ffixed-reg", where reg is, according to the gcc documentation: "reg must be the name of a register. The register names accepted are machine-specific and are defined in the REGISTER_NAMES macro in the machine description macro file.".
For the full documentation of this option, see https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html.

What do you think?

Thanks,

Kristof

That was original proposal bug @chandlerc suggested in our offline discussion that it's not easily doable. It seems like we might be able to duplicate the same code for all remaining registers which will require a lot of duplicated logic but it should work? However, I might be missing some details of the register allocator which will break.

May 1 2018, 5:56 AM

Apr 28 2018

kristof.beyls added a reviewer for D46215: [AArch64] Support reserving x16 and x17 register: srhines.

I have some doubts this is the right long term approach.

Apr 28 2018, 12:16 AM

Apr 27 2018

kristof.beyls created D46192: Script to match open Phabricator reviews with potential reviewers.
Apr 27 2018, 8:01 AM

Mar 26 2018

kristof.beyls added inline comments to D44374: [GlobalISel] Add support for lowering vector operations to scalar..
Mar 26 2018, 7:42 AM

Mar 16 2018

kristof.beyls added a comment to D44519: Add llvm-exegesis tool..

Thanks for sharing this - this looks like a really promising tool.
I wonder if you have thought about adding unit/regression tests? Having them would make it quite a bit easier for others to contribute to its development.

Sure, we do have unit tests (they are gmock unit tests, not lit tests). I did not include them in this diff to make review more manageable, but I'll add them now if you want.

Mar 16 2018, 3:36 AM
kristof.beyls added a comment to D44519: Add llvm-exegesis tool..

Thanks for sharing this - this looks like a really promising tool.
I wonder if you have thought about adding unit/regression tests? Having them would make it quite a bit easier for others to contribute to its development.
Off the top of my head - to be able to unit/regression test this, you'd need a mock/fake "libpfm-like" implementation to make the test not rely on the results coming from whatever machine you're running on?
If that's a direction you want to take, designing that in earlier may be easier than retrofitting it in later?

Mar 16 2018, 3:09 AM

Mar 12 2018

kristof.beyls added inline comments to D44374: [GlobalISel] Add support for lowering vector operations to scalar..
Mar 12 2018, 8:23 AM
kristof.beyls added a comment to D44374: [GlobalISel] Add support for lowering vector operations to scalar..

As far as I'm aware it is impossible to write unit tests for this unless a backend uses it?
I've attempted adding this to AArch64's GlobalISel implementation and hit other deficiencies in the backend that are unrelated but this code uncovered.

Mar 12 2018, 7:59 AM

Feb 22 2018

kristof.beyls accepted D43612: [zorg] Update testsuite parallelism settings for Linaro AArch64 builders..
Feb 22 2018, 6:20 AM
kristof.beyls added inline comments to D43612: [zorg] Update testsuite parallelism settings for Linaro AArch64 builders..
Feb 22 2018, 12:33 AM

Feb 3 2018

kristof.beyls added a comment to D41761: Introduce llvm.nospeculateload intrinsic.
  1. Is the API available in C too?

    IIUC, the SecureBoundOffset intrinsic is usable from both C and C++, but the IsPointerInRange intrinsic can only be used from C++? Do you have ideas around bringing similar functionality to C?

Yeah, this is part of why I suggest the much more generic ProtectFromSpeculation API which I think is easily applicable in C. The C version might use pointers or whatever, but this kind of API doesn't fundamentally require any interesting lang

  1. For the variant with a general predicate (bool ProtectFromSpeculation(bool predicate, ZeroArgs&... args);); do you have ideas about how to make sure that the optimizers in the compiler don’t simplify the predicate too much? For example:

    ` if (a>7) { x = a; if (ProtectFromSpeculation(a>5, x) { ... = v[x]; ... } }

    ` how to prevent this from being optimized to: ` if (a>7) { x = a; if (ProtectFromSpeculation(true, x) { ... = v[x]; ... } } ` which leads to no longer giving protection.

No matter what, this will require deep compiler support to implement. Even without the example you give, these construct fundamentally violate the rules the optimizer uses: they are by definition no-ops for execution of the program!

This means we will have to work to build up specific an dedicated infrastructure in the compiler to model these as having special semantics. That exact infrastructure can provide whatever optimization barriers are necessary to get the desired behavior. For example, the code generation I suggest above for x86 cannot be implemented in LLVM using its IR alone (I've actually tried). We'll have to model this both in the IR and even in the code generator specially in order to produce the kind of specific pattern that is necessary.

But there is also the question of what burden do we want to place on the user of these intrinsics vs. what performance hit we're willing to accept due to optimization barriers. I could imagine two approaches here:

  1. It is the programmers responsibility to correctly protect any predicates that their application is sensitive to. As a consequence, if the a>5 predicate is sensitive for the application, so must the a>7 predicate be, and it is the programmers responsibility to protect both of them. This allows the implementation to have the minimal set of optimization barriers, but may make it difficult for programmers to use correctly.
  2. The predicate provided to these APIs is truly special and is forced to be a *dynamic* predicate. That is, we require the compiler to emit the predicate as if no preconditions existed. There are ways to model this in LLVM and I assume any compiler. As a trivial (but obviously bad) example: all references to variables within the predicate could be lowered by rinsing that SSA value through an opaque construct like inline asm.

    There is clearly a "programmer ease / security" vs. "better optimization" tradeoff between the two. If one isn't *clearly* the correct choice in all cases, we could even expose both behind separate APIs that try to make it clear the extent of protections provided.

    Does that make sense?
Feb 3 2018, 12:00 PM

Jan 30 2018

kristof.beyls updated subscribers of D41761: Introduce llvm.nospeculateload intrinsic.

How can two variant-1 attacks be "different" enough that a speculationsafeload would protect against one but not the other, when both exploit the same load operation

Sorry, wasn't quite clear. There are two speculated loads for a variant-1 attack: the load that reads the secret, and the load that leaks the secret to the user. The first load has to be speculation-safe to stop the attack; whether the second is speculation-safe is irrelevant, at least in the proposals so far. That isn't really a fundamental problem, just an illustration that reasoning about these attacks is tricky.

Wait, no, this isn't right. It is actually possible to attack a speculationsafeload, at least in theory.

In general, the logic behind the "soft" speculation barriers being proposed is that they block execution if a speculation check fails. But given that the speculation check is itself computed speculatively, you have to ensure that not only is the result correct in the normal sense, but also that it's correct when computed speculatively. And as I showed earlier, IR transforms can turn an arbitrary value into a speculatively-uninitialized value. Therefore, the speculation check can spuriously succeed, and the speculation barrier doesn't reliably prevent speculation.

Jan 30 2018, 8:35 AM
kristof.beyls added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

I have no idea what form that assurance would take, since I don't know how LangRef handles such matters.

Well, I don't really know either; LangRef only describes the abstract virtual machine, mostly. That's part of the problem. :)

How can two variant-1 attacks be "different" enough that a speculationsafeload would protect against one but not the other, when both exploit the same load operation

Sorry, wasn't quite clear. There are two speculated loads for a variant-1 attack: the load that reads the secret, and the load that leaks the secret to the user. The first load has to be speculation-safe to stop the attack; whether the second is speculation-safe is irrelevant, at least in the proposals so far. That isn't really a fundamental problem, just an illustration that reasoning about these attacks is tricky.

I don't see how the code being spread over multiple functions matters- all that matters are the load, and the branch (or nested branches) that actually guard that load

Well, the CPU doesn't care (assuming it can perfectly predict calls), but it's problematic from an auditing perspective because it's harder to spot. Particularly since the vulnerable code might not explicitly reference any user-controlled data at all.

If the application logic doesn't explicitly prevent any of the loads the attacker is exploiting

A C programmer cannot reasonably come up with a complete list of all the potentially exploitable loads without the compiler being aware that the user needs Spectre-resistant code. There are two key problems:

  1. An exploitable load might not exist in the original program. One example is the switch-to-lookup-table transform. Given:

    ` int a(unsigned x) { switch (x) { case 0: return 2; case 1: return 44; case 2: return 23; default: return 8; } } `

    We transform to: ` int a(int x) { if (x > 2) return 8; const static int table[] = {2, 44, 23}; return table[x]; } `

    Now you have a speculated out-of-bounds load from code which didn't contain any loads.
  2. The exploitable code might come out of some non-obvious lowering. Even if a pointer points to something "obviously" safe, it might actually be uninitialized along some impossible path through the function. Probably the easiest example to explain is polly's invariant load hoisting. Basically, if you have a loop like this:

    ` int sum = 0; for (int i = 0; i < n; ++i) { if (b) sum += (*p)[i]; } `

    It gets transformed to something like this:

    ` int *pp; if (b) pp = *p; int sum = 0; for (int i = 0; i < n; ++i) { if (b) sum += pp[i]; } `

    So now you have the if() if() pattern which leads to a speculated load from an uninitialized pointer.
Jan 30 2018, 8:25 AM
kristof.beyls added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

(I hope you can bear with me while I come up to speed; I understand there's some time pressure here, so I'm erring on the side of speaking sooner)

The proposed API for nospeculateload seems like it could be problematic for programmers, because it gives them no way to tell whether the bounds check passed, unless they are able to identify a failval sentinel that can never be the result of a successful load (and yet is safe to use as the speculative result of the load). Thus, it seems likely that in a lot of cases, the bounds check will be duplicated: once inside nospeculateload, and once in user code. That will make code using this API "feel" inefficient, which will tend to discourage programmers from using it. Furthermore, if it actually is inefficient, that will create pressure for optimization passes to "peek inside" nospeculateload in order to eliminate the duplication, and that seems like a can of worms we really don't want to open.

Another way of putting it is that we probably want this API to be as primitive as possible, because the more logic we put inside the intrinsic, the greater the risk that some parts of it will be unsuited for some users. Consequently, we've been experimenting with APIs that are concerned solely with the bounds check, rather than with the subsequent load:

int64_t SecureBoundedOffset(int64_t offset, int64_t bound);

At the abstract machine level, this function just returns offset, and has the precondition that 0 <= offset < bound, but it also ensures that for speculative executions, offset will always appear to be within those bounds. There are also variants for uint64_t, and variants that take the base-2 log of the bound, for greater efficiency when the bound is a power of two.

template <typename T, typename... ZeroArgs>
bool IsPointerInRange(T*& pointer, T* begin, T* end, ZeroArgs... args);

This function returns whether pointer is between begin and end, and also guarantees that if the function returns false, then any speculative execution which assumes it to be true will treat pointer and args... as zero (all ZeroArgs must be integers or pointers). Notice that this API allows the optimizer to hoist loads past the branch, so long as the loads don't depend on pointer or args...; I'm not sure if that's true of nospeculateload or SecureBoundedOffset.

Notice that by not handling the load itself, these APIs avoid the ptr/cmpptr awkwardness, as well as the need for the user to designate a sentinel value. One tradeoff is that whereas nospeculateload can be thought of as a conditional load, plus some "security magic", these APIs are harder to understand without explicitly thinking about speculative execution. However, I'm not sure that's much of a problem in practice-- if you don't want to think about speculative execution, you probably shouldn't be using this API in the first place.

Is there any way we could implement an interface like those efficiently on ARM?

I don't think it's likely the compiler would intentionally introduce a load using the same pointer as the operand to a speculationsafeload; given most transforms don't understand what speculationsafeload does, the compiler has no reason to introduce code like that (even if it isn't technically prohibited by LangRef).

Do we need to explicitly prohibit it in LangRef so that future transformations don't start understanding too much about what speculationsafeload does?

More practically, I'm worried about the possibility that code which doesn't appear to be vulnerable at the source-code level will get lowered to assembly which is vulnerable. For example, the compiler could produce code where the CPU speculates a load from an uninitialized pointer value. Without an IR/MIR model for speculation, we have no way to prevent this from happening.

When you say "code which doesn't appear to be vulnerable at the source-code level", do you mean "code that is explicitly protected by speculationsafeload", or "code that doesn't appear to need speculationsafeload in the first place"? If the former, could you give a more concrete example?

It seems to me that we ought to be able to specify this intrinsic without having an explicit model of CPU speculation in the IR, because at the IR level we can just constrain the types of transformations that can be performed on this intrinsic. That way we only need to talk about speculation when we're specifying how this intrinsic is used to generate code for a specific platform that happens to feature CPU-level branch speculation. Very tentatively, I think the specific constraints on transformations that are needed here are that the intrinsic has unmodeled side-effects (so it can't be eliminated or executed speculatively), and that it should be treated as writing to all memory (or only to pointer and args.. in the case of an API like IsPointerInRange).

Jan 30 2018, 8:20 AM

Jan 26 2018

kristof.beyls added a comment to D42568: [GlobalISel] Bail out on calls to dllimported functions.

Hi Martin,

Jan 26 2018, 1:43 AM

Jan 23 2018

kristof.beyls added a comment to D40696: Enable aggressive FMA on T99 and provide AArch64 option for other micro-arch's.

I am not sure I follow. I think @MatzeB 's issue was using getProcFamily() and it should be fine to just check the subtarget feature in enableAggressiveFMAFusion. @MatzeB summarized some benefits of making it a subtarget feature here: https://reviews.llvm.org/D40177#936974

If there's only one EnableAggressiveFMA boolean in AArch64.td:

What happens if a certain micro-arch wants to enable AggressiveFMA for scalar doubles and vector doubles, but not for scalar floats and vector floats? (as an example).

Won't they have to call getProcFamily() in enableAggressiveFMAFusion()?

As there is only one boolean in AArch64.td, AggressiveFMA will be either enabled for all the floating-point types, or disabled for all the floating-point types, scalars or vectors. The only way of knowing which particular floating-point type is being tested for AggressiveFMA is by testing the type of the EVT, and then making a decision based on getProcFamily() and EVT type.

Is it safe to speculate that each and every AArch64 micro-arch wants AggressiveFMA enabled or disabled for all possible floating-point types?

Jan 23 2018, 12:22 AM

Jan 18 2018

kristof.beyls added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

It would be awesome to have static analysis rules to help identify *where* to put these intrinsics. Is somebody working on that? Or did I miss it?

Jan 18 2018, 7:14 AM
kristof.beyls added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

"When the intrinsic is being executed speculatively"

What does this mean?

LLVM IR defines the semantics of a program only in terms of visible side-effects. It does not define any semantics for code which does not execute, and it does not provide any guarantees in terms of what data can be leaked via side-channels. If you're going to attach meaningful semantics to speculationsafeload, you also need to generally define what code the compiler is allowed to emit in blocks which could be speculatively executed. As an extreme example, there needs to be some rule which prevents the compiler from inserting an attackable load immediately before a call to speculationsafeload.

Jan 18 2018, 7:13 AM

Jan 17 2018

kristof.beyls added a comment to D41760: Introduce __builtin_load_no_speculate.

The API design has been discussed over the past weeks in detail on the gcc mailing list. As a result of that, we propose to adapt the API, to enable efficient code generation also on architectures that need to generate a barrier instruction to achieve the desired semantics.

Jan 17 2018, 7:30 AM
kristof.beyls added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

The API design has been discussed over the past weeks in detail on the gcc mailing list. As a result of that, we propose to adapt the API, to enable efficient code generation also on architectures that need to generate a barrier instruction to achieve the desired semantics.

Jan 17 2018, 7:24 AM

Jan 11 2018

kristof.beyls added a comment to D38196: [AArch64] Avoid interleaved SIMD store instructions for Exynos.
In D38196#951927, @az wrote:

Thank you for providing several very useful feedback to improve this patch. In the future (not very soon), I am planning to make this pass more complete by 1) implementing the existing vector element sub-pass in the same way as optimize interleaved store instructions sub-pass for consistency reason (put all rewrite rules and not just one, caching, instruction replacement table, etc.) 2) Addressing the ST3 instruction inefficiency assuming I can find an efficient combination of instructions to replace ST3. I hope I can put you as the reviewer.

I would like also to get your feedback on a couple of things:

  1. Is it worth replacing how code generation (i.e. building new instructions) is implemented in this pass? It is currently hard-coded but we can replace it by a table driven approach similar to the one used in the analysis phase (see IRT Table). In particular, I need to find a simple way to express through a table or other means how the new instructions are built out of the old one and more specifically how the operands of new instructions related to old ones. This adds complexity to the implementation and understanding of the code generation but makes adding more rewrite rules quite simple. It tried this before but the patch was not accepted but may be I should bring it again now that we added several new rewrite pattern in this patch and expect to add more for ST3.
Jan 11 2018, 2:54 AM

Jan 10 2018

kristof.beyls added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

Thanks for the feedback, Chandler and Philip!

Jan 10 2018, 7:38 AM

Jan 5 2018

kristof.beyls updated subscribers of D41761: Introduce llvm.nospeculateload intrinsic.
Jan 5 2018, 5:49 AM
kristof.beyls updated subscribers of D41760: Introduce __builtin_load_no_speculate.
Jan 5 2018, 5:47 AM
kristof.beyls added a child revision for D41761: Introduce llvm.nospeculateload intrinsic: D41760: Introduce __builtin_load_no_speculate.
Jan 5 2018, 3:08 AM
kristof.beyls added a parent revision for D41760: Introduce __builtin_load_no_speculate: D41761: Introduce llvm.nospeculateload intrinsic.
Jan 5 2018, 3:08 AM
kristof.beyls created D41761: Introduce llvm.nospeculateload intrinsic.
Jan 5 2018, 3:07 AM
kristof.beyls created D41760: Introduce __builtin_load_no_speculate.
Jan 5 2018, 3:02 AM

Dec 8 2017

kristof.beyls accepted D40985: [AArch64] Add Exynos to host detection.

LGTM modulo the one nitpick for which I'm not sure if it is needed.

Dec 8 2017, 12:03 AM