@nikic raised an issue on D75936 that the added complexity to the O0 pipeline was causing noticeable slowdowns for -O0 builds. This patch addresses the issue by adding a pass with equal security properties, but without any optimizations (and more importantly, without the need for expensive analysis dependencies).
I should add that I am submitting this patch as an alternative to D80064. That revision invisibly disables the mitigation at -O0, which may not be secure for some users.
This approach looks reasonable to me.
Rather than creating a separate pass, we could also pass OptLevel into the existing one and pick the optimized or unoptimized approach based on that. No particular preference though, since apart from a couple of initial checks there seems to be no code to share.
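For concreteness, a rough sketch of that alternative, a single pass that branches on the optimization level; all names here (`OptLevel`, `LVIHardeningSketch`, `runUnoptimized`, `runOptimized`) are invented for illustration and are not the patch's actual API:

```cpp
#include <string>

// Hypothetical sketch: one pass selecting its strategy from the
// optimization level instead of registering two separate passes.
enum class OptLevel { O0, O1, O2, O3 };

struct LVIHardeningSketch {
  OptLevel Level;

  // Stand-in for the real pass body: report which strategy would run.
  std::string run() const {
    if (Level == OptLevel::O0)
      return runUnoptimized(); // LFENCE after every load, no analysis deps
    return runOptimized();     // gadget-graph analysis, fewer fences
  }
  std::string runUnoptimized() const { return "unoptimized"; }
  std::string runOptimized() const { return "optimized"; }
};
```

As noted, whether this is worth it over two separate passes depends on how much setup code the two strategies would actually share.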
Isn't this pass basically SESES? https://github.com/llvm/llvm-project/blob/master/llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp
Perhaps there's an opportunity to unify the two.
My understanding of SESES is that it inserts an LFENCE before each transmitter, whereas this pass inserts an LFENCE after each load. They target slightly different threat models; SESES is more strict. For example, SESES would mitigate the following vulnerability, whereas the LVI hardening will not:
```
uint64_t maybe_secret = *ptr; // architectural load
_mm_lfence();
if (is_secret) { // suppose `is_secret == true` but branch mispredicts
  // do something constant-time with `maybe_secret`
} else {
  return byte_array[maybe_secret * 4096]; // speculatively leak the secret
}
```
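To make the placement difference concrete, here is a hedged C sketch of a plain load-then-transmit sequence (the names `transmit` and `byte_array` and the array size are illustrative, not from either pass); comments mark where each mitigation would insert its fence, assuming x86 and `_mm_lfence` from `<emmintrin.h>`:

```c
#include <stdint.h>
#include <emmintrin.h> /* _mm_lfence */

static uint8_t byte_array[256 * 64]; /* illustrative probe array */

uint8_t transmit(const uint64_t *ptr) {
  uint64_t v = *ptr; /* load: may yield a poisoned value under LVI */
  _mm_lfence();      /* LVI hardening: fence AFTER the load, so `v` is
                        architecturally resolved before any use */
  /* SESES would instead fence here, immediately BEFORE the dependent
     load (the transmitter), which also cuts off speculative data flows
     into the transmitter that the LVI pass does not model, as in the
     branch-misprediction example above. */
  return byte_array[(v * 64) % sizeof byte_array]; /* transmitter */
}
```

In the straight-line case above both placements end up adjacent; the difference only shows once control flow or non-load data sources sit between the load and the transmitter.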
I'm not opposed to merging the two approaches. But I am also not sure how to justify it.
Thank you for the analysis and for the example, they were really helpful. I agree the approaches are different. They feel similar enough that I wish we could find one that satisfied all relevant requirements, but I can't say with any certainty that, e.g., SESES' performance is on par with the mitigation proposed here.
Inline comment on llvm/lib/Target/X86/X86TargetMachine.cpp, line 503:

Hi Scott, I'd like to push back on this submitted change. From what I can tell, using the SESES pass here would result in less code to maintain while fulfilling the goals of this change. That seems to be sufficient justification in a large codebase such as LLVM, though I'm open to arguments to the contrary. Is there a reason to prefer this specific, newly implemented approach over the SESES model in this situation? I can't immediately see any benefit to adding another pass here that does largely the same thing (if in a less strict manner). I may be missing something, so please let me know if I am.
Also to be a bit clearer, I don't think it's necessary to unify the approaches. It seems like deleting the new approach and dropping in SESES would be sufficient. Let me know if that's not the case.
@zbrid From a practical perspective I think you are correct. SESES mitigates a superset of gadgets that this pass mitigates, and therefore for code reuse/maintainability reasons it would make sense to replace this pass with SESES.
From a security perspective, I think that this could become problematic. It would mean that at -O0 I would get more security than I would at -O[1-3]. IMO optimization levels should not work like that.
It may also be worth noting that this new unoptimized pass is equivalent to the behavior of the mitigation implemented for gcc through binutils. Given that, I wonder if it makes sense to use this pass at O1 or O2 and save the most costly analysis for O3.
I have seen some build systems use -O2 for release builds. So if we go this route, maybe we could have unoptimized for -O0 and -O1, and optimized for -O2 and -O3.
I don't really think this is a concern. As long as the pass provides at least the same level of security as what users need from the LVI pass we can use SESES. The particular implementation used is hidden behind the abstraction of the compiler flag.
A related note is that it seems like the unoptimized LVI pass does not provide the exact same level of security as the graph LVI pass, so this concern is already an issue whether we use SESES or the unoptimized LVI pass. (EDITED TO ADD: Oh wait, I may have misunderstood your point. It sounds like you're saying the two LVI strategies are essentially the same, with the unoptimized one having extra LFENCEs that are redundant. That doesn't change my mind though, because one could say the same wrt optimized LVI and SESES, and just add that SESES has more redundant LFENCEs than unoptimized LVI.)
Ah that's an interesting point, but I'm not too sure similarity to gcc should be prioritized here. Is there a reason to value the similarities with the gcc approach? Up until now we've accepted the differences and I don't think we have new evidence suggesting the similarity should be prioritized.
I created a patch for the suggested change. Perhaps we should continue the conversation there? https://reviews.llvm.org/D82037
> That doesn't change my mind though, because one could say the same wrt optimized LVI and SESES, and just add that SESES has more redundant LFENCEs than unoptimized LVI.
I don't think that this is accurate. The relationships are:
Security(optimized LVI) == Security(unoptimized LVI)
Security(optimized LVI) < Security(SESES)
Security(unoptimized LVI) < Security(SESES)
SESES is targeting a broader threat model that encompasses non-universal Spectre v1 gadgets. Users who care about those gadgets should be using SESES outright and not the LVI passes.