chandlerc (Chandler Carruth)Administrator
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 7 2012, 2:54 PM (293 w, 1 d)
Roles
Administrator

Recent Activity

Fri, Feb 16

chandlerc added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.

Sorry for showing up late, but I was looking at this code because it turns out it is miscompiling code for us. We're working on a test case, but there are really a number of basic LLVM coding convention problems here.

Fri, Feb 16, 8:12 PM
chandlerc committed rL325420: [DAG, X86] Revert r324797, r324491, and r324359..
[DAG, X86] Revert r324797, r324491, and r324359.
Fri, Feb 16, 6:29 PM
chandlerc accepted D43203: [Driver] Generate .eh_frame_hdr for static executables too..

Agreed.

Fri, Feb 16, 5:05 PM

Tue, Feb 13

chandlerc accepted D43214: [X86] Use EDI for retpoline when no scratch regs are left.

So, if I understand correctly, this prevents the use of -mregparm=3, 32-bit x86, and a caller-save-all calling convention. And the reason this is reliable is because LLVM doesn't have such a calling convention (except maybe AnyReg) and if someone even managed to craft such IR they would just hit the assert and have to extend this? And we believe Clang definitively cannot create such a calling convention?

Tue, Feb 13, 1:35 AM

Wed, Feb 7

chandlerc committed rL324546: [x86] Fix nasty bug in the x86 backend that is essentially impossible to.
[x86] Fix nasty bug in the x86 backend that is essentially impossible to
Wed, Feb 7, 4:01 PM
This revision was not accepted when it landed; it landed in state Needs Review.
Wed, Feb 7, 4:01 PM
chandlerc added a comment to D42732: [x86] Fix nasty bug in the x86 backend that is essentially impossible to hit from IR but creates a minefield for MI passes..

Ping, love to get a review on the actual functionality here....

Wed, Feb 7, 1:06 AM
chandlerc updated the diff for D42732: [x86] Fix nasty bug in the x86 backend that is essentially impossible to hit from IR but creates a minefield for MI passes..

Update to tidy up MIR and reflect a syntax change.

Wed, Feb 7, 1:06 AM

Tue, Feb 6

chandlerc committed rL324449: [x86/retpoline] Make the external thunk names exactly match the names.
[x86/retpoline] Make the external thunk names exactly match the names
Tue, Feb 6, 10:18 PM
chandlerc closed D42998: [x86/retpoline] Make the external thunk names exactly match the names that happened to end up in GCC..
Tue, Feb 6, 10:18 PM
chandlerc added a comment to D42998: [x86/retpoline] Make the external thunk names exactly match the names that happened to end up in GCC..

Maybe it's only a theoretical issue because we only use r11 in 64bit mode though?

Not even a theoretical issue. LLVM has no code path that uses a register other than r11 in 64-bit mode as an explicit design decision, and we have asserts to that effect.

OK, sounds good.

It was not 100% clear only from looking at this patch, as there's no immediately-visible assert that we wouldn't emit __x86_indirect_thunk_e[acd]x with is64Bit().

Tue, Feb 6, 10:07 PM
chandlerc added a comment to D42998: [x86/retpoline] Make the external thunk names exactly match the names that happened to end up in GCC..

Oops, just realized this won't necessarily always match: gcc would name the thunk __x86_indirect_thunk_eax for 32bit mode, and __x86_indirect_thunk_rax for 64bit mode.

Tue, Feb 6, 7:56 PM
chandlerc added inline comments to D42998: [x86/retpoline] Make the external thunk names exactly match the names that happened to end up in GCC..
Tue, Feb 6, 7:43 PM
chandlerc updated the diff for D42998: [x86/retpoline] Make the external thunk names exactly match the names that happened to end up in GCC..

Code review suggestion.

Tue, Feb 6, 7:42 PM
chandlerc added a comment to D42998: [x86/retpoline] Make the external thunk names exactly match the names that happened to end up in GCC..

Should we just pick a name and go with it?

-eric

Tue, Feb 6, 7:38 PM
chandlerc created D42998: [x86/retpoline] Make the external thunk names exactly match the names that happened to end up in GCC..
Tue, Feb 6, 5:59 PM

Sat, Feb 3

chandlerc added a comment to D42889: [LoopIdiomRecognize] Add support for memmove. Fixes PR25165.

Some quick comments on the code etc., want to think a bit harder about the loop access checks just to make sure I'm not missing anything. Ben may also have thoughts there.

Sat, Feb 3, 5:15 PM

Fri, Feb 2

chandlerc added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

(I hope you can bear with me while I come up to speed; I understand there's some time pressure here, so I'm erring on the side of speaking sooner)

The proposed API for nospeculateload seems like it could be problematic for programmers, because it gives them no way to tell whether the bounds check passed, unless they are able to identify a failval sentinel that can never be the result of a successful load (and yet is safe to use as the speculative result of the load). Thus, it seems likely that in a lot of cases, the bounds check will be duplicated: once inside nospeculateload, and once in user code. That will make code using this API "feel" inefficient, which will tend to discourage programmers from using it. Furthermore, if it actually is inefficient, that will create pressure for optimization passes to "peek inside" nospeculateload in order to eliminate the duplication, and that seems like a can of worms we really don't want to open.

Another way of putting it is that we probably want this API to be as primitive as possible, because the more logic we put inside the intrinsic, the greater the risk that some parts of it will be unsuited for some users. Consequently, we've been experimenting with APIs that are concerned solely with the bounds check, rather than with the subsequent load:

int64_t SecureBoundedOffset(int64_t offset, int64_t bound);

At the abstract machine level, this function just returns offset, and has the precondition that 0 <= offset < bound, but it also ensures that for speculative executions, offset will always appear to be within those bounds. There are also variants for uint64_t, and variants that take the base-2 log of the bound, for greater efficiency when the bound is a power of two.

template <typename T, typename... ZeroArgs>
bool IsPointerInRange(T*& pointer, T* begin, T* end, ZeroArgs... args);

This function returns whether pointer is between begin and end, and also guarantees that if the function returns false, then any speculative execution which assumes it to be true will treat pointer and args... as zero (all ZeroArgs must be integers or pointers). Notice that this API allows the optimizer to hoist loads past the branch, so long as the loads don't depend on pointer or args...; I'm not sure if that's true of nospeculateload or SecureBoundedOffset.

Notice that by not handling the load itself, these APIs avoid the ptr/cmpptr awkwardness, as well as the need for the user to designate a sentinel value. One tradeoff is that whereas nospeculateload can be thought of as a conditional load, plus some "security magic", these APIs are harder to understand without explicitly thinking about speculative execution. However, I'm not sure that's much of a problem in practice-- if you don't want to think about speculative execution, you probably shouldn't be using this API in the first place.

Is there any way we could implement an interface like those efficiently on ARM?

I don't think it's likely the compiler would intentionally introduce a load using the same pointer as the operand to a speculationsafeload; given most transforms don't understand what speculationsafeload does, the compiler has no reason to introduce code like that (even if it isn't technically prohibited by LangRef).

Do we need to explicitly prohibit it in LangRef so that future transformations don't start understanding too much about what speculationsafeload does?

More practically, I'm worried about the possibility that code which doesn't appear to be vulnerable at the source-code level will get lowered to assembly which is vulnerable. For example, the compiler could produce code where the CPU speculates a load from an uninitialized pointer value. Without an IR/MIR model for speculation, we have no way to prevent this from happening.

When you say "code which doesn't appear to be vulnerable at the source-code level", do you mean "code that is explicitly protected by speculationsafeload", or "code that doesn't appear to need speculationsafeload in the first place"? If the former, could you give a more concrete example?

It seems to me that we ought to be able to specify this intrinsic without having an explicit model of CPU speculation in the IR, because at the IR level we can just constrain the types of transformations that can be performed on this intrinsic. That way we only need to talk about speculation when we're specifying how this intrinsic is used to generate code for a specific platform that happens to feature CPU-level branch speculation. Very tentatively, I think the specific constraints on transformations that are needed here are that the intrinsic has unmodeled side-effects (so it can't be eliminated or executed speculatively), and that it should be treated as writing to all memory (or only to pointer and args.. in the case of an API like IsPointerInRange).

Thanks very much for sharing this, Geoff!
I have a few immediate questions/thoughts, and wonder what you and others here think about them:

  1. Lowering to various instruction sets.

    I think one of the questions to ask is whether the APIs here can be lowered well to different instruction sets. I believe that may be possible for Arm, but I'm still looking into it. It would be useful for us to have a few examples of how these APIs are envisioned to be used in practice, to make sure we understand the proposal well enough. E.g. maybe a few examples in the same spirit as under “More complex cases” at https://developer.arm.com/support/security-update/compiler-support-for-mitigations? Do you happen to have a suggestion of how this intrinsic would best be lowered on some instruction sets other than Arm? There was quite a bit of debate about the ability to efficiently lower the intrinsic in our proposal on the gcc mailing list, e.g. see https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01136.html.

Focusing on ProtectFromSpeculation API style, it can be lowered *very* efficiently on x86 at least:

  xor  %zero_reg, %zero_reg
  ...
  cmpq %a, 42
  jb below
  ja above
equal:
  cmovne %zero_reg, %arg0
  cmovne %zero_reg, %arg1
  ...
  cmovne %zero_reg, %argN
  ...

below:
  cmovnb %zero_reg, %arg0
  cmovnb %zero_reg, %arg1
  ...
  cmovnb %zero_reg, %argN
  ...

above:
  cmovna %zero_reg, %arg0
  cmovna %zero_reg, %arg1
  ...
  cmovna %zero_reg, %argN
  ...

Especially in the common case of 1 or 2 arguments needing to be zeroed this ends up being quite nice code generation using the existing structure of the conditional branch.

  1. Is the API available in C too?

    IIUC, the SecureBoundOffset intrinsic is usable from both C and C++, but the IsPointerInRange intrinsic can only be used from C++? Do you have ideas around bringing similar functionality to C?

Yeah, this is part of why I suggest the much more generic ProtectFromSpeculation API which I think is easily applicable in C. The C version might use pointers or whatever, but this kind of API doesn't fundamentally require any interesting lang

Fri, Feb 2, 3:00 PM
chandlerc added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

(I hope you can bear with me while I come up to speed; I understand there's some time pressure here, so I'm erring on the side of speaking sooner)

The proposed API for nospeculateload seems like it could be problematic for programmers, because it gives them no way to tell whether the bounds check passed, unless they are able to identify a failval sentinel that can never be the result of a successful load (and yet is safe to use as the speculative result of the load). Thus, it seems likely that in a lot of cases, the bounds check will be duplicated: once inside nospeculateload, and once in user code. That will make code using this API "feel" inefficient, which will tend to discourage programmers from using it. Furthermore, if it actually is inefficient, that will create pressure for optimization passes to "peek inside" nospeculateload in order to eliminate the duplication, and that seems like a can of worms we really don't want to open.

Another way of putting it is that we probably want this API to be as primitive as possible, because the more logic we put inside the intrinsic, the greater the risk that some parts of it will be unsuited for some users. Consequently, we've been experimenting with APIs that are concerned solely with the bounds check, rather than with the subsequent load:

int64_t SecureBoundedOffset(int64_t offset, int64_t bound);

At the abstract machine level, this function just returns offset, and has the precondition that 0 <= offset < bound, but it also ensures that for speculative executions, offset will always appear to be within those bounds. There are also variants for uint64_t, and variants that take the base-2 log of the bound, for greater efficiency when the bound is a power of two.

template <typename T, typename... ZeroArgs>
bool IsPointerInRange(T*& pointer, T* begin, T* end, ZeroArgs... args);

This function returns whether pointer is between begin and end, and also guarantees that if the function returns false, then any speculative execution which assumes it to be true will treat pointer and args... as zero (all ZeroArgs must be integers or pointers). Notice that this API allows the optimizer to hoist loads past the branch, so long as the loads don't depend on pointer or args...; I'm not sure if that's true of nospeculateload or SecureBoundedOffset.

Notice that by not handling the load itself, these APIs avoid the ptr/cmpptr awkwardness, as well as the need for the user to designate a sentinel value. One tradeoff is that whereas nospeculateload can be thought of as a conditional load, plus some "security magic", these APIs are harder to understand without explicitly thinking about speculative execution. However, I'm not sure that's much of a problem in practice-- if you don't want to think about speculative execution, you probably shouldn't be using this API in the first place.

Is there any way we could implement an interface like those efficiently on ARM?

I don't think it's likely the compiler would intentionally introduce a load using the same pointer as the operand to a speculationsafeload; given most transforms don't understand what speculationsafeload does, the compiler has no reason to introduce code like that (even if it isn't technically prohibited by LangRef).

Do we need to explicitly prohibit it in LangRef so that future transformations don't start understanding too much about what speculationsafeload does?

More practically, I'm worried about the possibility that code which doesn't appear to be vulnerable at the source-code level will get lowered to assembly which is vulnerable. For example, the compiler could produce code where the CPU speculates a load from an uninitialized pointer value. Without an IR/MIR model for speculation, we have no way to prevent this from happening.

When you say "code which doesn't appear to be vulnerable at the source-code level", do you mean "code that is explicitly protected by speculationsafeload", or "code that doesn't appear to need speculationsafeload in the first place"? If the former, could you give a more concrete example?

It seems to me that we ought to be able to specify this intrinsic without having an explicit model of CPU speculation in the IR, because at the IR level we can just constrain the types of transformations that can be performed on this intrinsic. That way we only need to talk about speculation when we're specifying how this intrinsic is used to generate code for a specific platform that happens to feature CPU-level branch speculation. Very tentatively, I think the specific constraints on transformations that are needed here are that the intrinsic has unmodeled side-effects (so it can't be eliminated or executed speculatively), and that it should be treated as writing to all memory (or only to pointer and args.. in the case of an API like IsPointerInRange).

Thanks very much for sharing this, Geoff!
I have a few immediate questions/thoughts, and wonder what you and others here think about them:

  1. Lowering to various instruction sets.

    I think one of the questions to ask is whether the APIs here can be lowered well to different instruction sets. I believe that may be possible for Arm, but I'm still looking into it. It would be useful for us to have a few examples of how these APIs are envisioned to be used in practice, to make sure we understand the proposal well enough. E.g. maybe a few examples in the same spirit as under “More complex cases” at https://developer.arm.com/support/security-update/compiler-support-for-mitigations? Do you happen to have a suggestion of how this intrinsic would best be lowered on some instruction sets other than Arm? There was quite a bit of debate about the ability to efficiently lower the intrinsic in our proposal on the gcc mailing list, e.g. see https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01136.html.
Fri, Feb 2, 2:55 PM

Wed, Jan 31

chandlerc added inline comments to D42732: [x86] Fix nasty bug in the x86 backend that is essentially impossible to hit from IR but creates a minefield for MI passes..
Wed, Jan 31, 1:35 PM
chandlerc committed rL323915: [x86] Make the retpoline thunk insertion a machine function pass..
[x86] Make the retpoline thunk insertion a machine function pass.
Wed, Jan 31, 12:58 PM
chandlerc closed D42726: [x86] Make the retpoline thunk insertion a machine function pass..
Wed, Jan 31, 12:58 PM
chandlerc added a comment to D42732: [x86] Fix nasty bug in the x86 backend that is essentially impossible to hit from IR but creates a minefield for MI passes..

(Also, thanks for review on my first ever usage of MIR! I have so little idea of what I'm doing with it....)

Wed, Jan 31, 11:15 AM
chandlerc updated the diff for D42726: [x86] Make the retpoline thunk insertion a machine function pass..

Update with fixes from review.

Wed, Jan 31, 11:15 AM
chandlerc added inline comments to D42732: [x86] Fix nasty bug in the x86 backend that is essentially impossible to hit from IR but creates a minefield for MI passes..
Wed, Jan 31, 11:15 AM
chandlerc added a comment to D42726: [x86] Make the retpoline thunk insertion a machine function pass..

Thanks, applied fixes. I can update the patch if useful.

Wed, Jan 31, 11:11 AM
chandlerc created D42732: [x86] Fix nasty bug in the x86 backend that is essentially impossible to hit from IR but creates a minefield for MI passes..
Wed, Jan 31, 4:38 AM

Tue, Jan 30

chandlerc added a comment to D42453: Use branch funnels for virtual calls when retpoline mitigation is enabled..

Sorry for the delay, took me a bit to internalize what you were proposing.

Tue, Jan 30, 10:44 PM
chandlerc created D42726: [x86] Make the retpoline thunk insertion a machine function pass..
Tue, Jan 30, 10:25 PM

Wed, Jan 24

chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

@chandlerc is MacOS support omitted intentionally (ie. not implemented yet)?

Wed, Jan 24, 7:26 PM
chandlerc added inline comments to D42453: Use branch funnels for virtual calls when retpoline mitigation is enabled..
Wed, Jan 24, 7:00 PM
chandlerc updated subscribers of D41761: Introduce llvm.nospeculateload intrinsic.

Adding a few folks on my team who will hopefully take over helping here (my time has vanished). Note that they aren't on llvm-commits, so please reply-to-all etc.

Wed, Jan 24, 9:37 AM

Tue, Jan 23

chandlerc added inline comments to D42453: Use branch funnels for virtual calls when retpoline mitigation is enabled..
Tue, Jan 23, 8:43 PM
chandlerc added inline comments to D42453: Use branch funnels for virtual calls when retpoline mitigation is enabled..
Tue, Jan 23, 6:02 PM

Mon, Jan 22

chandlerc accepted D42397: Fix retpoline PLT header size for i386..

LGTM!

Mon, Jan 22, 7:17 PM
chandlerc committed rL323155: Introduce the "retpoline" x86 mitigation technique for variant #2 of the….
Introduce the "retpoline" x86 mitigation technique for variant #2 of the…
Mon, Jan 22, 2:09 PM
chandlerc committed rLLD323155: Introduce the "retpoline" x86 mitigation technique for variant #2 of the….
Introduce the "retpoline" x86 mitigation technique for variant #2 of the…
Mon, Jan 22, 2:08 PM
chandlerc committed rC323155: Introduce the "retpoline" x86 mitigation technique for variant #2 of the….
Introduce the "retpoline" x86 mitigation technique for variant #2 of the…
Mon, Jan 22, 2:08 PM
chandlerc closed D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...
Mon, Jan 22, 2:08 PM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Thanks Matthias and Eric!

Mon, Jan 22, 1:30 AM

Jan 19 2018

chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Ping! Would really like to get this landed, but as it has changed somewhat, looking for a final round of review....

Jan 19 2018, 4:43 PM

Jan 18 2018

chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Add the lfence instruction to all of the speculation capture loops, both
those generated by LLVM and those generated by LLD for the PLT.

Jan 18 2018, 6:14 PM

Jan 16 2018

chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Per kernel https://marc.info/?l=linux-kernel&m=151580566622935&w=2 and gcc https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01059.html, it seems AMD needs there to be an lfence in the speculation trap (and the pause is not useful for them, but does no harm). There seems to be some speculation (but no confirmation yet?) that pause *is* necessary vs lfence on intel. So in order to work generically, they seem to be suggesting using both instructions:

loop:
  pause
  lfence
  jmp loop

Some more links
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01209.html
and final patch:
https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2b

Per kernel https://marc.info/?l=linux-kernel&m=151580566622935&w=2 and gcc https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01059.html, it seems AMD needs there to be an lfence in the speculation trap (and the pause is not useful for them, but does no harm). There seems to be some speculation (but no confirmation yet?) that pause *is* necessary vs lfence on intel. So in order to work generically, they seem to be suggesting using both instructions:

loop:
  pause
  lfence
  jmp loop

Some more links
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01209.html
and final patch:
https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2b

Yes for AMD, we require "lfence" instruction after the "pause" in the "retpoline" loop filler. This solution has already been accepted in GCC and Linux kernel.
Can you please do the same in LLVM as well?

Jan 16 2018, 10:28 AM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Per kernel https://marc.info/?l=linux-kernel&m=151580566622935&w=2 and gcc https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01059.html, it seems AMD needs there to be an lfence in the speculation trap (and the pause is not useful for them, but does no harm). There seems to be some speculation (but no confirmation yet?) that pause *is* necessary vs lfence on intel. So in order to work generically, they seem to be suggesting using both instructions:

loop:
  pause
  lfence
  jmp loop

Some more links
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01209.html
and final patch:
https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2b

Jan 16 2018, 10:28 AM

Jan 15 2018

chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Ok, I think this is all done. Rafael, I think I've implemented your suggestion as well and it still passes all my tests (including the test-suite) and a bunch of internal code I have.

Jan 15 2018, 5:56 AM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Couple of other minor fixes.

Jan 15 2018, 5:55 AM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Re-implement the indirectbr removal pass based on Rafael's suggestion. It now
creates a single switch block and threads all the indirictbr's in the function
through that block. This should give substantially smaller code especially in
the case of using retpoline where the switch block expands to a search tree
that we probably don't want to duplicate 100s of times.

Jan 15 2018, 5:50 AM

Jan 12 2018

chandlerc added a comment to D41957: Utility for checking out llvm, clang, and associated tools and configuring a build folder.

This whole script is pretty opinionated:

  • Use of svn vs. the git mirrors
  • picking a specific set of llvm projects
  • picking a specific set of cmake options (BUILD_TYPE=Release, ENABLE_ASSERTIONS=OFF, ninja...)

    How can we avoid giving the impression that these are somehow recommended? (Because really I could start 3 longish bikeshed-like discussions now about what choices would be better here...)
Jan 12 2018, 10:01 AM

Jan 5 2018

chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

For AMD processors we may be able to handle indirect jumps via a simpler lfence mechanism. Indirect calls may still require retpoline. If this turns out to be the right solution for AMD processors we may need to put some code in to support this.

Yeah, if it ends up that we want non-retpoline mitigations for AMD we can and should add them. One hope I have is that this patch is at least generically *sufficient* (when paired with correct RSB filling) even if it suboptimal in some cases and we end up adding more precise tools later.

Just to say that at Sony we're still doing our investigation and might be interested in lfence. But who knows, we might just zap the predictor on syscalls and context switches; for environments that have mostly a few long-running processes with comparatively few syscalls it might be net cheaper than making every indirection more expensive.

Jan 5 2018, 9:06 PM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Teach the thunk emission to put them in comdats and enhance tests to verify
this.

Jan 5 2018, 5:45 PM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

(Er sorry, have two updates from Rafael I need to actually include, will be done momentarily...)

Jan 5 2018, 4:41 PM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Any more comments? I'd love to land this and start the backporting and merging process. =D

Jan 5 2018, 4:37 PM
chandlerc added a comment to D41761: Introduce llvm.nospeculateload intrinsic.

Just as an FYI, we have been experimenting with similar APIs ourselves. We developed two candidate alternative APIs that, IMO, seem substantially better than this.

Jan 5 2018, 3:20 AM
chandlerc added a comment to D39910: [ARM] Issue an eror when non-general-purpose registers used in address operands (alternative).

I approved this patch back in November.

Jan 5 2018, 3:08 AM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

I had one comment I think got missed: there are some references to X86::CALL64r/X86::CALL64m in X86FrameLowering.cpp and X86MCInstLower.cpp which look like they could be relevant, but aren't addressed by this patch.

Jan 5 2018, 2:47 AM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Add explicit checks for various lowerings that would need direct retpoline
support that are not yet implemented. These are constructs that don't show up
in typical C, C++, and other static languages today: state points, patch
points, stack probing and morestack.

Jan 5 2018, 2:46 AM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Add support for externally provided thunks. This is an independent feature;
when combined with the overall retpoline feature it suppresses the thunk
emission and rotates the names to be distinct names that an external build
system for the kernel (for example) can provide.

I've added some minimal documentation about the semantic requirements of these
thunks to the commit log, although it is fairly obvious. More comprehensive
documentation will be part of the large follow-up effort around docs.

Thanks! I'm however not seeing the updated commit message that contains the usage documentation of the new option in the differential revision.

Jan 5 2018, 2:23 AM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Rebase and fold in the latest batch of changes from Rui based on the review
from Rafael.

Jan 5 2018, 1:51 AM

Jan 4 2018

chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

All comments addressed now I think, thanks everyone! See some thoughts about a few of these below though.

Jan 4 2018, 6:13 PM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Add the new test file for external thunks.

Jan 4 2018, 6:12 PM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...
In D41723#967861, @ruiu wrote:

Chandler,

Please apply https://reviews.llvm.org/D41744 to this patch. It includes the following changes:

  1. xchg is replaced with mov/pop instructions
  2. x86-64 lazy PLT relocation target is now aligned to 16 byte
  3. the x86-64 PLT header for lazy PLT resolution is shrunk from 48 bytes to 32 bytes (which became possible by utilizing the space made by (2))
Jan 4 2018, 6:08 PM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Add support for externally provided thunks. This is an independent feature;
when combined with the overall retpoline feature it suppresses the thunk
emission and rotates the names to be distinct names that an external build
system for the kernel (for example) can provide.

Jan 4 2018, 6:06 PM
chandlerc added inline comments to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...
Jan 4 2018, 4:32 PM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Add another minor suggestion that I forgot previously.

Jan 4 2018, 4:30 PM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Fix several bugs caught in code review, as well as implementing suggestions.

Jan 4 2018, 4:22 PM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

That should be easy to support -- just don't do the thunk-emission -- but it does then mean the need to standardize on the names and semantics of the required thunks.

I don't think this is true. We need to document LLVM's semantics and the thunks required. If they match GCCs, cool. If not and a user provides a flag to request an external thunk, then they have to give LLVM one that matches LLVM's semantics. Since they control the name, they can even have thunks with each compiler's semantics and different names. While in practice, I expect the semantics to match, I don't think we should be in the business of forcing this into the ABI. We already have waaaay to much in the ABI. If we come up with a better way to do retpoline in the future, we should rapidly switch to that without needing to coordinate about names and semantics across compilers.

Right, I shouldn't have said "standardize", I did actually mean "document".

And it would be good if the same function-names were used as GCC. Fine to do as a followup patch, but maybe at least have an idea what the command-line UI will be for it.

I strongly disagree here. Again, this should *not* be part of the ABI and it should not be something that we have to engineer to match exactly with other compilers.

No, I agree with you, it does not _need_ to match. However, if the semantics are the same as GCC's thunks (which they do appear to be), it would be _good_ to coordinate with eachother, simply in order to make users' lives easier. Which is the same reason to coordinate the command-line UI. Neither are required, but both would be good.

We should even be free to add __llvm_retpoline2_foo() thunks in the future with new semantics with zero compatibility concerns.

It wouldn't have _zero_ concerns due to externally-provided thunks -- they'll need to provide new ones to use the new compiler version. But probably that's okay.

Jan 4 2018, 2:16 PM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Looks like this uses a different command-line argument than the preliminary patchset for GCC, which is rather unfortunate.
http://git.infradead.org/users/dwmw2/gcc-retpoline.git/shortlog/refs/heads/gcc-7_2_0-retpoline-20171219
It'd be awesome if both compilers would somehow end up using the same command-line spelling. Maybe GCC would like to switch to "-mretpoline"? Have we been talking to them?

Jan 4 2018, 11:34 AM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Pardon my unsolicited comment, but it seems to me that using a "retpoline" will have two unintended negative side-effects:

  1. It will ~neuter Control Flow Integrity by providing a "universal" gadget that pulls a call target off the stack and is allowed to call anything
Jan 4 2018, 11:18 AM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Thanks for the review!

Jan 4 2018, 10:25 AM
chandlerc updated the diff for D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Fix suggested in code review and a couple of indentation fixes via clang-format
that I missed previously.

Jan 4 2018, 10:17 AM
chandlerc added a comment to D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Comment for Rui about the 32-bit PLT sequence...

Jan 4 2018, 9:19 AM
chandlerc updated subscribers of D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...

Adding a bunch of subscribers who likely will care about this, and our release managers.

Jan 4 2018, 1:31 AM
chandlerc created D41723: Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre...
Jan 4 2018, 1:14 AM

Dec 29 2017

chandlerc updated subscribers of D40969: [DAG] Elide overlapping stores.

There was a comment by @nemanjai on the submitted revision in phab that I think got lost:

Dec 29 2017, 6:54 AM

Dec 27 2017

chandlerc added a comment to D39352: [SimplifyCFG] Don't do if-conversion if there is a long dependence chain.

I don't think this patch's approach is the right one, and we should probably pursue Eli's suggestion.... It also causes very significant regressions in benchmarks, so I think we should revert it for now.

Dec 27 2017, 10:13 PM

Dec 26 2017

chandlerc committed rL321469: There is no portable format string for printing `uintptr_t` values..
There is no portable format string for printing `uintptr_t` values.
Dec 26 2017, 9:48 PM

Dec 22 2017

chandlerc committed rC321392: Add an explicit `LLVM_FALLTHROUGH` annotation to an intentional.
Add an explicit `LLVM_FALLTHROUGH` annotation to an intentional
Dec 22 2017, 3:30 PM
chandlerc committed rL321392: Add an explicit `LLVM_FALLTHROUGH` annotation to an intentional.
Add an explicit `LLVM_FALLTHROUGH` annotation to an intentional
Dec 22 2017, 3:30 PM

Dec 21 2017

chandlerc committed rL321345: Rewrite the cached map used for locating the most precise DIE among.
Rewrite the cached map used for locating the most precise DIE among
Dec 21 2017, 10:42 PM
chandlerc closed D40987: Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address..
Dec 21 2017, 10:42 PM
chandlerc added a comment to D40987: Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address..

Thanks all! Going to land this after a touch more testing so that my test runs start being faster. =D

Dec 21 2017, 10:04 PM
chandlerc added inline comments to D41296: Limit size of non-GlobalValue name.
Dec 21 2017, 9:42 PM

Dec 13 2017

chandlerc added inline comments to D41203: [PM][InstCombine] fixing omission of AliasAnalysis in new-pass-manager's version of InstCombine.
Dec 13 2017, 10:57 PM
chandlerc accepted D41203: [PM][InstCombine] fixing omission of AliasAnalysis in new-pass-manager's version of InstCombine.

LGTM. It'd be great to add an instcombine test that uses new PM and requires AA to optimize as well. Can be in a follow-up commit if needed.

Dec 13 2017, 2:25 PM

Dec 8 2017

chandlerc added inline comments to D40987: Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address..
Dec 8 2017, 11:53 AM
chandlerc added a comment to D40987: Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address..

Thanks for all the feedback!

Dec 8 2017, 3:16 AM
chandlerc updated the diff for D40987: Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address..

Update based on code review comments.

Dec 8 2017, 3:16 AM
chandlerc requested changes to D40648: A few initializations to please Valgrind. NFC.

If MSan doesn't complain about these, I'm inclined to leave them uninitialized. I'm not sure how valuable it is to workaround valgrind false-positives.

Dec 8 2017, 2:31 AM

Dec 7 2017

chandlerc created D40987: Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address..
Dec 7 2017, 3:24 PM

Nov 28 2017

chandlerc committed rL319164: Add a new pass to speculate around PHI nodes with constant (integer) operands….
Add a new pass to speculate around PHI nodes with constant (integer) operands…
Nov 28 2017, 3:33 AM
chandlerc closed D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable. by committing rL319164: Add a new pass to speculate around PHI nodes with constant (integer) operands….
Nov 28 2017, 3:32 AM
chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Thanks for all the feedback, landing with suggested fixes.

Nov 28 2017, 3:16 AM
chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

LGTM

I've also only implemented this in the new pass manager. If folks are

very interested, I can try to add it to the old PM as well, but I didn't
really see much point (my use case is already switched over to the new
PM).

Isn't the legacy PM still the default we use for most frontends including clang and the new PM is only used for (Thin)LTO so far?

Nov 28 2017, 3:05 AM

Nov 20 2017

chandlerc added inline comments to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..
Nov 20 2017, 2:31 PM
chandlerc updated the diff for D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Update to address the issue highlighted by David and add more test cases.

Nov 20 2017, 2:30 PM
chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Without having read the code yet:

We could in theory make it part of the codegen pass
pipeline, but there doesn't really seem to be a good reason for that --
it isn't "lowering" in any sense and only relies on pretty standard cost
model based TTI queries, so it seems to fit well with the "optimization"
pipeline model.

This does sound like a CodeGen pass to me! Not all of CodeGen is lowering, we do have optimization passes there too. And the fact that TTI is used is another indicator that this is a machine specific pass. This being part of CodeGen would also allow targets to not add the pass to the pipeline if it doesn't help them.

Nov 20 2017, 2:22 PM

Nov 17 2017

chandlerc committed rL318549: [PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,.
[PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,
Nov 17 2017, 12:00 PM
chandlerc closed D34200: [PM/unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching, making it no longer even remotely simple. by committing rL318549: [PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,.
Nov 17 2017, 12:00 PM

Nov 15 2017

chandlerc added a comment to D34200: [PM/unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching, making it no longer even remotely simple..

Thanks Davide, I think I've got these covered either by adjustments or future work. Give a shout about anything you'd still like addresed in follow-up patches.

Nov 15 2017, 11:06 PM

Nov 14 2017

chandlerc accepted D40007: [NewPassManager] Pass the -fdebug-pass-manager flag setting into the Analysis managers to match what we do in opt.

LGTM, nice catch!

Nov 14 2017, 12:11 AM