This is an archive of the discontinued LLVM Phabricator instance.

Paths

clang/include/clang/Driver/Options.td
clang/lib/Driver/ToolChains/Arch/X86.cpp
clang/test/Driver/x86-target-features.c
llvm/lib/Target/X86/CMakeLists.txt
llvm/lib/Target/X86/ImmutableGraph.h
llvm/lib/Target/X86/X86.h
llvm/lib/Target/X86/X86.td
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
llvm/lib/Target/X86/X86Subtarget.h
llvm/lib/Target/X86/X86TargetMachine.cpp
llvm/test/CodeGen/X86/O0-pipeline.ll
llvm/test/CodeGen/X86/O3-pipeline.ll
llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll

Differential D75936

Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]
ClosedPublic

Authored by sconstab on Mar 10 2020, 10:00 AM.

Details

Reviewers

andrew.w.kaylor
chandlerc
zbrid
george.burgess.iv
kparzysz
mattdr
craig.topper

Commits

rG363720c2b0f6: [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI)…
rGe3ba468fc3c1: [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI)…
rGe97a3e5d9d42: [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI)…
rGc74dd640fd74: [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI)…

Summary

Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph.

More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).

Also adds a new target feature to X86: +lvi-load-hardening

The feature can be added via the clang CLI using -mlvi-hardening.
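
As an illustration of the SOURCE/SINK terminology in the summary, here is a minimal sketch of two gadget shapes at the source level. The names `Table`, `dependentLoad`, `Handler`, and `indirectCall` are hypothetical and do not come from the patch; the pass itself works on machine instructions, not C++.

```cpp
#include <cstdint>

// Hypothetical lookup table; the name and size are illustrative only.
static std::uint8_t Table[256 * 64];

// SOURCE: the load of *Ptr.
// SINK: the second load, whose address depends on the loaded value and
// could therefore reveal it through the cache.
std::uint8_t dependentLoad(const std::uint8_t *Ptr) {
  std::uint8_t Value = *Ptr; // SOURCE: load from memory
  return Table[Value * 64];  // SINK: address derived from the loaded value
}

using Handler = int (*)();

// SOURCE: the load of *Slot.
// SINK: the indirect call, whose target is the loaded value.
int indirectCall(const Handler *Slot) {
  Handler Target = *Slot; // SOURCE: load from memory
  return Target();        // SINK: loaded value determines the call target
}
```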

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Added help text for the CLI options

sconstab added a reviewer: kparzysz. Mar 12 2020, 8:24 AM

sconstab mentioned this in D75939: [x86][seses] Introduce SESES pass for LVI. Mar 12 2020, 8:32 AM

sconstab added a child revision: D76158: Add inline assembly load hardening mitigation for Load Value Injection (LVI) on X86 [6/6]. Mar 13 2020, 1:50 PM

sconstab retitled this revision from Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/5] to Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]. Mar 16 2020, 9:30 AM

Thanks for putting this up! Here are a few comments.

llvm/lib/Target/X86/ImmutableGraph.h
2	Might be useful if you add a comment about what makes this a fast DAG impl in case someone may want to use it later.
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
255	I think this should go at the top of the function.
272	Am I misunderstanding this comment? It sounds like if FixedLoads is true then BOTH fixed loads and non-fixed loads will be mitigated. Since runOnMachineFunction would call hardenLoads twice for non-fixed loads, would that result in double mitigation for non-fixed loads in the case where we also harden fixed loads? Unfortunately I'm having trouble reasoning through this myself, so I'd appreciate some clarification.
llvm/test/CodeGen/X86/O0-pipeline.ll
61	Remove pass from name since that's typically the convention.

Addressed Zola's comments.

sconstab marked 5 inline comments as done. Mar 18 2020, 7:10 PM

sconstab added inline comments.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
272	The comment was incorrect.

LGTM. I would prefer if an actual LLVM maintainer also gave LGTM. @jyknight, @george.burgess.iv, @craig.topper?

This revision is now accepted and ready to land. Apr 2 2020, 5:13 PM

Closed by commit rGc74dd640fd74: [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI)… (authored by sconstab, committed by craig.topper). Apr 3 2020, 1:03 PM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Apr 3 2020, 1:03 PM

chandlerc added inline comments. Apr 3 2020, 4:41 PM

llvm/lib/Target/X86/ImmutableGraph.h
47–57	Folks, this isn't even close to following LLVM's coding conventions or naming conventions. These violate the C++ standard. This shouldn't have been landed as-is. Can you all back this out and actually dig into the review and get this to match LLVM's actual coding style and standards?

Adding a few early style notes for the next round, but overall echo @chandlerc that this seems significantly outside of normal LLVM code.

llvm/lib/Target/X86/ImmutableGraph.h
27	It's sort of surprising that the LLVM style guide doesn't call this out explicitly, but `#include` guards are supposed to include the full file path. If they just used the filename, like this, files with the same name in different paths would collide. For an example of the expected style, see an adjacent header in this directory: https://github.com/llvm/llvm-project/blob/ba8b3052b59ebee4311d10bee5209dac8747acea/llvm/lib/Target/X86/X86AsmPrinter.h#L10
319	As a general rule `new` is a code-smell in modern C++. This should be a `vector`.
336	this should return a `unique_ptr` to signal ownership transfer
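
A minimal sketch of the two style points above: a guard derived from the full path, and a factory that returns `std::unique_ptr` to signal ownership transfer. The guard name follows the pattern of the linked adjacent header; `ToyGraph` and `buildToyGraph` are hypothetical stand-ins, not names from the patch.

```cpp
// Guard derived from the path llvm/lib/Target/X86/ImmutableGraph.h, so a
// header with the same file name in another directory cannot collide.
#ifndef LLVM_LIB_TARGET_X86_IMMUTABLEGRAPH_H
#define LLVM_LIB_TARGET_X86_IMMUTABLEGRAPH_H

#include <memory>

// Hypothetical stand-in for the real graph type.
struct ToyGraph {
  int NumNodes;
};

// Returning std::unique_ptr tells the caller that it now owns the graph.
inline std::unique_ptr<ToyGraph> buildToyGraph(int NumNodes) {
  auto G = std::make_unique<ToyGraph>();
  G->NumNodes = NumNodes;
  return G;
}

#endif // LLVM_LIB_TARGET_X86_IMMUTABLEGRAPH_H
```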

craig.topper added inline comments. Apr 3 2020, 5:10 PM

llvm/lib/Target/X86/ImmutableGraph.h
47–57	Reverted at 1d42c0db9a2b27c149c5bac373caa5a6d38d1f74

craig.topper reopened this revision. Apr 3 2020, 5:15 PM

This revision is now accepted and ready to land. Apr 3 2020, 5:15 PM

Fix include guard on ImmutableGraph.h

Remove underscores from names.

craig.topper planned changes to this revision. Apr 3 2020, 8:51 PM

sconstab added inline comments. Apr 3 2020, 9:42 PM

llvm/lib/Target/X86/ImmutableGraph.h
319	@mattdr I do agree with the general rule. I also think that in this case where the structure is immutable, std::vector is wasteful because it needs to keep separate values for the current number of elements and the current capacity. At local scope within a function the unneeded value would likely be optimized away, but then there would be an awkward handoff to transfer the data from the vector to the array members. I would not want to see the array members changed to vectors, unless LLVM provides an encapsulated array structure that does not need to grow and shrink.
336	Yes, agree.

Use std::unique_ptr for arrays in the graph. I started trying to use std::vector, but it kept crashing. Initially I thought this was some issue with the fact that we store pointers into the vectors in other places and that std::move of the vector was somehow breaking that. I now think it's because the array is 1 larger than the real number of nodes and my std::vector version didn't take that into account. But thinking about it more, since we are storing pointers into the array, it probably makes more sense for it to be a plain array rather than a vector, since resizing a vector would invalidate those pointers. Using an array makes it clearer that we don't resize.

This revision is now accepted and ready to land. Apr 3 2020, 11:59 PM

craig.topper mentioned this in D75937: Add Support to X86 for Load Hardening to Mitigate Load Value Injection (LVI) [5/6]. Apr 4 2020, 12:00 AM

Use unique_ptr::operator[] in a few places.

mattdr added inline comments. Apr 4 2020, 2:06 AM

llvm/lib/Target/X86/ImmutableGraph.h
14	erm, "terrific"? If there's a substantive argument w.r.t. cache locality etc., please make it explicit.
17	"extraordinarily" is, again, not a useful engineering categorization. Please restrict comments to describing quantifiable claims of complexity.
41	Every template argument for a class represents combinatorial addition of complexity for the resulting code. Why does each of these template arguments need to exist? In particular, why does SizeT need to exist?
319	So, first: I'm glad you removed the unnecessary use of `new[]` here and the corresponding (and error-prone!) use of `delete[]` later. That removes a memory leak LLVM won't have to debug. You suggest here that something other than `std::vector` would be more efficient. If so, would `std::array` suffice? If not, can you explain why static allocation is impossible but dynamic allocation would be too expensive?

-Add edge_begin()/edge_end()/edges() to Node class. Hides the N+1 trick used to find the end of a Node's edges.
-Add nodes()/edges() and use range-based for loops.
-Stop using things in the traits class since it doesn't have range-based for loops.
-Const-correct as required since nodes()/edges() return an ArrayRef that ends up making the range for loops const.
-Use llvm::for_each instead of std::for_each.
-Rename Node::value()/Edge::value() to getValue() to align with llvm naming convention.
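
The "N+1 trick" mentioned in this update can be sketched roughly as follows: each node stores a pointer to its first edge in a shared edge array, and one extra sentinel node marks the end of the last node's edges. This is a simplified illustration with hypothetical names (`ToyGraph`, `makeToyGraph`), not the code from ImmutableGraph.h.

```cpp
#include <cstddef>
#include <memory>

struct Edge {
  int Dest; // index of the destination node
};

struct Node {
  const Edge *FirstEdge; // first outgoing edge in the shared edge array

  const Edge *edges_begin() const { return FirstEdge; }
  // Valid only because the node array holds one extra sentinel element, so
  // (this + 1) exists even for the last real node.
  const Edge *edges_end() const { return (this + 1)->FirstEdge; }
  std::size_t edges_size() const {
    return static_cast<std::size_t>(edges_end() - edges_begin());
  }
};

struct ToyGraph {
  std::unique_ptr<Node[]> Nodes; // NodesSize + 1 elements; last is sentinel
  std::unique_ptr<Edge[]> Edges;
  int NodesSize = 0;
  int EdgesSize = 0;
};

// Builds a 3-node graph with edges 0 -> 1, 0 -> 2, 2 -> 0.
// Node 1 has no outgoing edges.
ToyGraph makeToyGraph() {
  ToyGraph G;
  G.NodesSize = 3;
  G.EdgesSize = 3;
  G.Edges = std::make_unique<Edge[]>(3);
  G.Nodes = std::make_unique<Node[]>(4); // 3 real nodes + 1 sentinel
  G.Edges[0] = {1};
  G.Edges[1] = {2};
  G.Edges[2] = {0};
  G.Nodes[0].FirstEdge = G.Edges.get();
  G.Nodes[1].FirstEdge = G.Edges.get() + 2; // empty range [2, 2)
  G.Nodes[2].FirstEdge = G.Edges.get() + 2;
  G.Nodes[3].FirstEdge = G.Edges.get() + 3; // sentinel: one past last edge
  return G;
}
```

Because the arrays are never resized after construction, the `FirstEdge` pointers stay valid for the lifetime of the graph.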

sconstab added inline comments. Apr 4 2020, 8:53 AM

llvm/lib/Target/X86/ImmutableGraph.h
14	This is valid. I will reword.
17	AFAIK there is not a precise engineering term for "tiny O(1)." Nonetheless I will reword.
41	I suspect that there may be more uses for this data structure and that eventually it may migrate to ADT. I have SizeT as a template argument because I found it plenty sufficient to have `int` as the size parameter for the array bounds, but I suspect other uses may require `size_t`.
319	A statically sized array (e.g., std::array) is insufficient because the size in this case is not compiler determinable; a dynamically sized and dynamically resizable array (e.g., std::vector) is sufficient but overly costly; a dynamically sized and dynamically unresizable array is sufficient and has minimal cost.

sconstab added inline comments. Apr 4 2020, 9:26 AM

llvm/lib/Target/X86/ImmutableGraph.h

286

@craig.topper It now occurs to me that these fields should probably be reordered to:

std::unique_ptr<Node[]> Nodes;
std::unique_ptr<Edge[]> Edges;
size_type NodesSize;
size_type EdgesSize;

The current ordering will cause internal fragmentation.

Old ordering:

static_assert(sizeof(ImmutableGraph<T, V>) == 32);

New ordering:

static_assert(sizeof(ImmutableGraph<T, V>) == 24);

With vectors instead of arrays:

static_assert(sizeof(ImmutableGraph<T, V>) == 48);
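
The layout effect behind these numbers can be reproduced with a small standalone sketch. The struct names below are hypothetical, and the exact sizes are implementation-defined; 32, 24, and 48 are what a typical 64-bit target with 4-byte `int` yields.

```cpp
#include <memory>
#include <vector>

struct Node { int Value; };
struct Edge { int Dest; };

// Pointer and int interleaved: each int is typically followed by padding
// so that the next pointer (or the struct end) is suitably aligned.
struct Interleaved {
  std::unique_ptr<Node[]> Nodes;
  int NodesSize;
  std::unique_ptr<Edge[]> Edges;
  int EdgesSize;
};

// Pointers first, then the two ints packed next to each other.
struct Reordered {
  std::unique_ptr<Node[]> Nodes;
  std::unique_ptr<Edge[]> Edges;
  int NodesSize;
  int EdgesSize;
};

// Vectors carry their own size and capacity, so no separate size fields.
struct WithVectors {
  std::vector<Node> Nodes;
  std::vector<Edge> Edges;
};
```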

Overall, the restyling by @craig.topper looks much better than what I had committed before. I agree that std::unique_ptr<T[]> is the right "container" in this circumstance. And the addition of ArrayRef<> accessors is also a nice touch. A few extra inline comments.

llvm/lib/Target/X86/ImmutableGraph.h
14	"Iteration and traversal operations benefit from cache locality."
17	"Operations on sets of nodes/edges are efficient, and representations of those sets in memory are compact. For instance..."
85	After the members are reordered, this list must also be reordered.
104	This had not occurred to me until now, but a lot of code is shared between `NodeSet` and `EdgeSet`. Maybe a template could reduce the redundancy?

sconstab added inline comments. Apr 4 2020, 1:26 PM

llvm/lib/Target/X86/ImmutableGraph.h
308	Just noticed that `ImmutableGraphBuilder` and `ImmutableGraph` have non-identical types called `NodeRef`. Suggest renaming this one to `BuilderNodeRef`.

craig.topper marked 2 inline comments as done. Apr 4 2020, 2:05 PM

craig.topper added inline comments.

llvm/lib/Target/X86/ImmutableGraph.h
286	I noticed that too. I just didn't focus on it since we only ever have one in memory at a time. I'll change it in my next update.
308	NodeRef is in the Traits class not the ImmutableGraph, but I will rename the builder one.

craig.topper marked 2 inline comments as done. Apr 4 2020, 2:17 PM

craig.topper added inline comments.

llvm/lib/Target/X86/ImmutableGraph.h
330	Can this be changed to VI < VertexSize?
370	I think I'll change this to llvm::count_if. Also there was previously a conditional here that made sure the distance between edges was >0, but it didn't seem necessary. Please let me know if there's a reason I should put that back

-Apply updates to comments.
-Use nodes()/edges() to implement nodes_begin/end and edges_begin/end to simplify the code a little
-Reorder fields in the Graph class.

-Add methods to get the index of a Node or Edge to remove calls to std::distance in various places

craig.topper requested review of this revision. Apr 6 2020, 10:18 AM

mattdr added inline comments. Apr 7 2020, 3:28 AM

llvm/lib/Target/X86/ImmutableGraph.h
18	"spatial"
42	I think this self-reference to `ImmutableGraph` dropped the `SizeT` parameter.
74	Seems like you also want to add a comment here that we know we will never be asked for `edges_end` for the last stored node -- that is, we know that `this + 1` always refers to a valid Node (which is presumably a dummy/sentinel)
80	Why "protected" rather than "private"? Usually seeing "protected" makes me think subclassing is expected, but that doesn't seem to be the case here.
118	How do we know that a value of `size_type` (aka `SizeT`) can be cast to `unsigned` without truncation?
299	this will also break if a non-default `SizeT` is provided. Maybe a good argument to just leave out `SizeT` for now, and it can be added in the future as needed?
319	I'm not sure we allocate enough of these in the course of a compilation for the one extra word in a `std::vector` to matter, but I won't press the point.
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
114	Cleaner how?
234	If the user requests hardening and we can't do it, it seems better to fail loudly so they don't accidentally deploy an unmitigated binary.

Summary points for @craig.topper who has commandeered this diff:

fix the typo that Matt pointed out
SizeT should not be a template parameter, and size_type should be fixed to int.
Maybe have a member reference in MachineGadgetGraph to the associated MachineFunction.
Determine how this pass (and other X86 machine passes) should fail on unsupported X86 subtargets.

llvm/lib/Target/X86/ImmutableGraph.h
42	Yup. Good catch.
74	Not sure I agree. I cannot think of a conventional use of this interface that would perform an operation on the sentinel.

G->nodes_end().edges_end(); // invalid use of any end iterator
SomeNode.edges_end(); // invalid if SomeNode == G->nodes_end()

That is, the way that we "know" that we will never be asked for `edges_end()` for the last stored node is that the ask itself would already violate C++ conventions.
80	The `MachineGadgetGraph` class actually does subclass `ImmutableGraph` to add some contextual information. I did not want the constructors for `ImmutableGraph` to be public, because the intent is to use the builder. So `protected` seemed like the best option.
118	Ah. We do not know that. We could have a static assert here, but maybe the best thing to do would be to follow Matt's earlier advice and fix `size_type` to `int`, rather than have it as a template parameter. Anything larger would break the `BitVectors` and/or waste space.
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
114	Maybe by keeping a member reference to the associated `MachineFunction`?
234	@craig.topper I think this is related to the discussion we were having about what would happen for SLH on unsupported subtargets. I'm not sure what the most appropriate solution would be.
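
For the truncation concern at line 118 of ImmutableGraph.h, one possible shape of the suggested static assert, assuming `size_type` is fixed to `int` as proposed. The helper name `toBitVectorSize` is hypothetical and only illustrates where such a conversion would happen.

```cpp
// Assumption for this sketch: size_type is fixed to int, as proposed above.
using size_type = int;

// Fails to compile if size_type could hold values wider than unsigned.
static_assert(sizeof(size_type) <= sizeof(unsigned),
              "size_type must fit in unsigned without truncation");

// Hypothetical helper: converts a graph size to the unsigned value that a
// bit-vector-style container would take as its length.
unsigned toBitVectorSize(size_type N) {
  // Sizes are never expected to be negative; clamp defensively.
  return N < 0 ? 0u : static_cast<unsigned>(N);
}
```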

Some more comments. FWIW, I'm doing rounds of review as I can in some evening quiet or during my son's nap. This is a huge change and it's really hard to get any part of it into my head at once in a reasonable amount of time.

llvm/lib/Target/X86/ImmutableGraph.h
74	I believe any operation on the last `Node` in the array will end up accessing the sentinel:

Node* LastNode = G->nodes_begin() + (G->nodes_size() - 1); // valid reference to the last node
LastNode->edges_end(); // uses `this+1`, accessing the sentinel value in the Nodes array
80	Ah, I missed that. I searched through the file for `public ImmutableGraph` and didn't find it because `MachineGadgetGraph` uses the default inheritance specifier.
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
84–86	Please replace these with constants or functions.
114	Let's put that in the comment instead.

I'll wait for current comments to be addressed before doing my next round here.

sconstab added inline comments. Apr 7 2020, 3:30 PM

llvm/lib/Target/X86/ImmutableGraph.h
74	`G->nodes_size()` will return the size without the sentinel node, so your example should actually operate on the last data node. Right?

craig.topper updated this revision to Diff 256073. Apr 8 2020, 11:01 AM

Address review comments

craig.topper marked an inline comment as done. Apr 8 2020, 11:16 AM

craig.topper added inline comments.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
84–86	Oops forgot to do this one.

craig.topper marked 37 inline comments as done and an inline comment as not done. Apr 8 2020, 11:25 AM

craig.topper added inline comments.

llvm/lib/Target/X86/ImmutableGraph.h
42	Removed the template argument
74	Comment added to clarify the extra node was allocated.
118	Removed the template argument
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
114	I found a way to remove the getMF method entirely.
234	Added a fatal error. Which isn't great as it will generate a crash report in clang. But it will tell the user to file a compiler bug so I guess that's something.

-Replace ARG_NODE and GADGET_EDGE defines with static constexpr members

-Put llvm:: on the for_each calls in this patch instead of D75937

sconstab added inline comments. Apr 9 2020, 6:45 PM

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
234	Would it be better to have

report_fatal_error("LVI load hardening is only supported on 64-bit "
                   "targets.", false);

so that the crash diagnostic is not generated?

@craig.topper can you please update/rebase this stack to remove the early LVI CFI work that ended up landing in https://reviews.llvm.org/D76812 instead?

craig.topper removed a child revision: D76158: Add inline assembly load hardening mitigation for Load Value Injection (LVI) on X86 [6/6]. Apr 10 2020, 5:30 PM

craig.topper edited parent revisions, added: D75935: Add RET-hardening Support to X86 to mitigate Load Value Injection (LVI) [3/6]; removed: D75932: Move RDF from Hexagon to Codegen [1/6].

Accepting modulo some comments to be addressed. Most of my review effort was spent on the data structure and algorithm employed as well as code style and readability.

I am least confident about my understanding of instrAccessesStackSlot and the other functions that make up instrIsFixedAccess, since each function seems to be pattern-matching on something very specific without a reference to why. I also don't know if this diff provides an exhaustive list of fixed loads, or indeed if it was intended to.

llvm/lib/Target/X86/ImmutableGraph.h
103	Worth adding a comment for this (and `getEdgeIndex`) that this will crash if the `Node` (`Edge`) provided is not a reference into this specific instance of `ImmutableGraph`.
104	Ideally I agree we'd find a way to collapse these -- but for this diff, let's content ourselves with a FIXME comment to that effect.
341	This `if` is unnecessary
349	Technically a "generic" graph, so we should leave out "Gadget" here
367	Two comments here would make the code significantly easier to understand:
-Note that we're using `.size()` here rather than `.count()`, so we're actually iterating over all Node indices, not just the ones to be trimmed.
-The `TrimmedNodes` vector maps indices in the original NodeSet to the number of `Node`s before that index that have been trimmed by that index, to allow later code to map elements to their new position in a dense array with the trimmed items removed.
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
57	This is another case where references to good documentation will go a long way. Without details about what the tradeoff is and how to reason about it, it doesn't seem like anyone should use this flag.
237	Each call to `hardenLoads` leads to a call to `buildGadgetGraph`. A lot of the work that `getGadgetGraph` does seems to be common between mitigating fixed and non-fixed loads -- for example, computing register dataflow and liveness over the entire function. And calling `hardenLoads` twice looks to be the common case, since `NoFixedLoads` is `false` by default. Could we make this pass about half as expensive by default by combining these two calls to `hardenLoads` into one? It would do the expensive work once, then either harden _all_ loads or _only_ non-fixed loads.
323	Please add a comment explaining the semantics of the boolean return here. I think it's: `true` if we need to consider defs of this instruction tainted by this use (and therefore add them for analysis); `false` if this instruction consumes its use
330	Why is it okay to assume that a call doesn't propagate its uses to defs? Is it because we can assume the CFI transform is already inserting an LFENCE? Whatever the reason, let's state it explicitly here
342	Some more detail would be useful here: precise about what? What are the likely errors?
410	We analyze every def from every instruction in the function, but then also in `AnalyzeDefUseChain` analyze every def of every instruction with an interesting use. Are we doing a lot of extra work?
494	Worth a comment here that we don't need to worry about indirect branches (jmp to register) because elsewhere we prevent them from being generated
507	It seems very weird to make this a template argument rather than just, like, a regular argument.

This revision is now accepted and ready to land. Apr 13 2020, 4:53 PM

Address some of the review comments. Primarily the ones in ImmutableGraph. I did de-templatize the method in X86LoadValueInjectionLoadHardening.cpp

craig.topper marked 6 inline comments as done. Apr 14 2020, 1:25 PM

craig.topper added inline comments.

llvm/lib/Target/X86/ImmutableGraph.h
367	I've stopped using TrimmedNodes.size() and TrimEdges.size() in favor of the size methods from the graph, which should make things more obvious. I renamed TrimmedNodes to RemappedNodeIndex and stored the new index rather than the adjustment needed. I also changed it to walk nodes instead of indices so we don't have to translate to Node to make the contains call. I also removed the NewNumEdges count_if and the if statement around the edge loop from the loop below. I don't think that provided any value and it just complicated the code.
llvm/lib/Target/X86/X86ISelLowering.cpp
8626 ↗	(On Diff #257465)	Oops this snuck in from something else I was experimenting with in my repo earlier. I'll remove.
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
507	Agreed. I've removed the template argument and made it a static function instead of a method since it doesn't use anything from the class.

Fix some mistakes I made in the previous upload where I accidentally deleted the pipeline tests instead of updating them.

Merge in test case from D77431 since it logically belongs with these changes.

craig.topper mentioned this in D77431: [X86] Add tests to clang Driver to ensure that SLH/Retpoline features are not enabled with LVI-hardening. Apr 14 2020, 4:04 PM

mattdr marked an inline comment as done. Apr 14 2020, 10:42 PM

mattdr added inline comments.

llvm/lib/Target/X86/ImmutableGraph.h
367	Many thanks! These changes make the code much more accessible.
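
The remapping discussed in this thread (storing each surviving node's new index rather than an adjustment) might look roughly like the sketch below. `remapNodeIndices` is a hypothetical helper, and using -1 to mark trimmed nodes is an assumption of this sketch, not something taken from the patch.

```cpp
#include <cstddef>
#include <vector>

// Trimmed[I] is true when node I is to be removed. Returns, for every
// original index, its index in the dense trimmed array, or -1 if the node
// was trimmed (the -1 marker is an assumption of this sketch).
std::vector<int> remapNodeIndices(const std::vector<bool> &Trimmed) {
  std::vector<int> Remapped(Trimmed.size(), -1);
  int Next = 0;
  // Walk all original indices, not only the trimmed ones.
  for (std::size_t I = 0; I < Trimmed.size(); ++I)
    if (!Trimmed[I])
      Remapped[I] = Next++;
  return Remapped;
}
```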

sconstab commandeered this revision. Apr 16 2020, 12:11 PM

sconstab edited reviewers, added: craig.topper; removed: sconstab.

Removed the -x86-lvi-no-fixed CLI flag. This change simplifies the code flow quite a bit.

craig.topper added inline comments. Apr 28 2020, 3:26 PM

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
323	Was this comment addressed?
342	Was this answered somewhere?
410	Was this answered somewhere?

Addressed the previously unaddressed comments, as pointed out by @craig.topper.

Herald added a subscriber: mgrang. May 4 2020, 5:10 PM

sconstab added inline comments. May 4 2020, 5:10 PM

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
323	It had not been addressed, so thank you for pointing this out. That lambda was doing too many things at once, which made it more confusing than it needed to be. So I just inlined it in the for (auto N : Uses) { … } loop, and I added some additional clarifying comments.
342	This was referring to the use of `mayLoad()`. At the time I wrote that comment, I wasn't sure that `mayLoad()` was exactly what was needed there, but I now think that it does suffice (SLH also uses `MachineInstr::mayLoad()`).
410	Wow, big oversight on my part. @mattdr was correct that this was doing a LOT of extra work. I added a memoization scheme that remembers the instructions that may transmit for each def. The getGadgetGraph() routine now runs about 75% faster.
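
A rough sketch of what a memoization scheme of this kind could look like on a toy def-use model. Everything here is hypothetical (`ToyUse`, `analyzeDef`, integer ids, and the assumption that the def-use graph is acyclic); the real pass works on RDF nodes and is more involved.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Toy def-use model: defs and instructions are plain integer ids.
struct ToyUse {
  int Instr;                // id of the using instruction
  bool Transmits;           // true if the instruction may transmit its operand
  std::vector<int> NewDefs; // defs produced by the using instruction
};

using UseMap = std::map<int, std::vector<ToyUse>>;
using TransmitterMap = std::map<int, std::vector<int>>;

// Returns the transmitters reachable from Def, analyzing each def only once.
// Assumes the toy def-use graph is acyclic.
const std::vector<int> &analyzeDef(int Def, const UseMap &Uses,
                                   TransmitterMap &Memo) {
  auto It = Memo.find(Def);
  if (It != Memo.end())
    return It->second; // already analyzed, possibly with an empty result

  std::vector<int> Found;
  auto UIt = Uses.find(Def);
  if (UIt != Uses.end()) {
    for (const ToyUse &U : UIt->second) {
      if (U.Transmits)
        Found.push_back(U.Instr);
      for (int D : U.NewDefs) {
        const std::vector<int> &Sub = analyzeDef(D, Uses, Memo);
        Found.insert(Found.end(), Sub.begin(), Sub.end());
      }
    }
  }
  std::sort(Found.begin(), Found.end());
  Found.erase(std::unique(Found.begin(), Found.end()), Found.end());
  // Storing even an empty vector records that Def has been analyzed.
  return Memo[Def] = std::move(Found);
}
```

In this sketch a def that is present in the map with an empty vector means "analyzed, no transmitters", so presence in the map is what prevents repeated work.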

Calling special attention to the comment at line 341, since I think it affects the correctness of the pass.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
288	This comment doesn't seem to match how the map is used -- it looks like the loop assumes a def has been analyzed iff it is present in the map. This matches my expectation that, if a def is present and maps to an empty list, it would mean the def had been analyzed and found not to transmit.
296	fwiw, this code would be easier to understand if we didn't shadow `Def` with another variable named `Def`.
322–323	"current def" is a bit ambiguous here. I _believe_ it means `AnalyzeDef`'s `Def` argument? At least, that's the interpretation that makes the comment make sense since `UsesVisited` is in `AnalyzeDef`'s scope.
332	Copying a comment from a previous iteration: Why is it okay to assume that a call doesn't propagate its uses to defs? Is it because we can assume the CFI transform is already inserting an LFENCE? Whatever the reason, let's state it explicitly here
342	The comment doesn't match the loop, which is traversing over `Uses`. More importantly, though: why are we allowed to stop traversing through `Uses` here? This `Def` won't be analyzed again, so this is our only chance to enumerate all transmitters to make sure we have all the necessary source -> sink edges in the gadget graph.
365–366	This is also the place we populate `Transmitters` (with a default-constructed vector) for the current def if we haven't otherwise found any transmits. That's good, and necessary for `Transmitters` to remember we've analyzed the current def. But we should leave a comment about this subtle load-bearing side-effect.
367–370	Should `Transmitters` map to an `llvm::SmallSet`?

Addressed comments by @mattdr.

Several comments in the code have been updated, but the code has not changed.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
296	Changed the outer def to `SourceDef`, which also seems to make the code after the lambda a lot clearer.
322–323	I am now trying to be clearer by using capital-d "Def" to refer specifically to the def that is being analyzed, and lower-case-d "def" to refer to any other defs. Do you think this is better? Good enough?
332	Added clarification to the comment.
342	@mattdr I think that the code is correct, and I added more to the comment in an attempt to clarify. Let me know if you still think that this is an issue.
367–370	In my testing, `std::vector` seems a bit faster than `llvm::SmallSet`. I also suspect that `llvm::SmallSet` may waste more space because many defs will have no transmitters.

The extra comments and the new variable name are all helpful. Thanks again.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
322–323	Much better. Thank you for the change!
342	I definitely misread `continue` as `break` here. Thanks for the extra clarity and sorry for the noise.

Closed by commit rGe97a3e5d9d42: [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI)… (authored by sconstab, committed by craig.topper). May 11 2020, 1:29 PM

This revision was automatically updated to reflect the committed changes.

This change causes a 0.8% compile-time regression for unoptimized builds. Based on the pipeline test diffs, I expect this is because the new pass requests a bunch of analyses, which it most likely (LVI load hardening disabled) will not need. Would it be possible to compute the analyses only if LVI load hardening is actually enabled?

In D75936#2032027, @nikic wrote:

This change causes a 0.8% compile-time regression for unoptimized builds. Based on the pipeline test diffs, I expect this is because the new pass requests a bunch of analyses, which it most likely (LVI load hardening disabled) will not need. Would it be possible to compute the analyses only if LVI load hardening is actually enabled?

@craig.topper Do you have any ideas on how this could be done?

In D75936#2032078, @sconstab wrote:

@craig.topper Do you have any ideas on how this could be done?

Unfortunately due to LTO the need for LVI hardening is carried as a function attribute. The pass manager system doesn't allow for running different passes per function. So I don't have any good ideas of how to do that.

craig.topper added inline comments. May 12 2020, 12:24 PM

llvm/test/CodeGen/X86/O3-pipeline.ll
147	I'm curious what happens if we add AU.setPreservesCFG() to getAnalysisUsage in FixupStatepointCallerSaved.cpp From a quick look through that pass it doesn't look like it changes the Machine CFG. PostRA Machine Sink already preserves CFG. So I think that should remove the dominator tree construction after PostRA machine sink.

I'm no expert in the pass manager, but given the very targeted applicability of LVI it definitely seems like our goal should be 0% impact for the vast majority of compilations that don't concern themselves with it.

Is there a way to require the pass be enabled for the compiler invocation as well as for the subtarget? If there's a mismatch (where LVI is desired but the required analyses weren't done) we can fail the compile.

craig.topper mentioned this in D79813: [Statepoint] Mark FixupStatepointCallerSaved as preserving the CFG. May 12 2020, 3:19 PM

In D75936#2032090, @craig.topper wrote:

Unfortunately due to LTO the need for LVI hardening is carried as a function attribute. The pass manager system doesn't allow for running different passes per function. So I don't have any good ideas of how to do that.

Hm, I see. One possibility would be to make those analyses lazy, but that's a larger change.

Possibly a pragmatic choice would be to not support this feature at O0? It does not seem relevant for non-production binaries. The relative impact of a couple unnecessary analysis passes is much higher at O0 than it is at O3.

craig.topper mentioned this in rGde92dc2850c1: [Statepoint] Mark FixupStatepointCallerSaved as preserving the CFG. May 13 2020, 11:25 AM

nikic mentioned this in D80064: [X86] Disable LVI load hardening pass at O0. May 16 2020, 9:06 AM

sconstab mentioned this in D80964: [X86] Add an Unoptimized Load Value Injection (LVI) Load Hardening Pass. Jun 1 2020, 5:05 PM

craig.topper mentioned this in rG7e06cf0011a8: [X86] Add an Unoptimized Load Value Injection (LVI) Load Hardening Pass. Jun 10 2020, 3:36 PM

tstellar mentioned this in rG72bff7855d8c: [X86] Add an Unoptimized Load Value Injection (LVI) Load Hardening Pass. Jun 24 2020, 9:45 AM

Revision Contents

Path	Size
clang/include/clang/Driver/Options.td	4 lines
clang/lib/Driver/ToolChains/Arch/X86.cpp	8 lines
clang/test/Driver/x86-target-features.c	12 lines
llvm/lib/Target/X86/CMakeLists.txt	1 line
llvm/lib/Target/X86/ImmutableGraph.h	446 lines
llvm/lib/Target/X86/X86.h	2 lines
llvm/lib/Target/X86/X86.td	7 lines
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp	521 lines
llvm/lib/Target/X86/X86Subtarget.h	5 lines
llvm/lib/Target/X86/X86TargetMachine.cpp	2 lines
llvm/test/CodeGen/X86/O0-pipeline.ll	4 lines
llvm/test/CodeGen/X86/O3-pipeline.ll	3 lines
llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll	129 lines

Diff 263262

clang/include/clang/Driver/Options.td

 def mno_stackrealign : Flag<["-"], "mno-stackrealign">, Group<m_Group>;

 def mretpoline : Flag<["-"], "mretpoline">, Group<m_Group>, Flags<[CoreOption,DriverOption]>;
 def mno_retpoline : Flag<["-"], "mno-retpoline">, Group<m_Group>, Flags<[CoreOption,DriverOption]>;
 def mspeculative_load_hardening : Flag<["-"], "mspeculative-load-hardening">,
   Group<m_Group>, Flags<[CoreOption,CC1Option]>;
 def mno_speculative_load_hardening : Flag<["-"], "mno-speculative-load-hardening">,
   Group<m_Group>, Flags<[CoreOption]>;
+def mlvi_hardening : Flag<["-"], "mlvi-hardening">, Group<m_Group>, Flags<[CoreOption,DriverOption]>,
+  HelpText<"Enable all mitigations for Load Value Injection (LVI)">;
+def mno_lvi_hardening : Flag<["-"], "mno-lvi-hardening">, Group<m_Group>, Flags<[CoreOption,DriverOption]>,
+  HelpText<"Disable mitigations for Load Value Injection (LVI)">;
 def mlvi_cfi : Flag<["-"], "mlvi-cfi">, Group<m_Group>, Flags<[CoreOption,DriverOption]>,
   HelpText<"Enable only control-flow mitigations for Load Value Injection (LVI)">;
 def mno_lvi_cfi : Flag<["-"], "mno-lvi-cfi">, Group<m_Group>, Flags<[CoreOption,DriverOption]>,
   HelpText<"Disable control-flow mitigations for Load Value Injection (LVI)">;

 def mrelax : Flag<["-"], "mrelax">, Group<m_riscv_Features_Group>,
   HelpText<"Enable linker relaxation">;
 def mno_relax : Flag<["-"], "mno-relax">, Group<m_riscv_Features_Group>,

clang/lib/Driver/ToolChains/Arch/X86.cpp

 if (Args.hasArgNoClaim(options::OPT_mretpoline, options::OPT_mno_retpoline,
     // FIXME: Add a warning about failing to specify `-mretpoline` and
     // eventually switch to an error here.
     Features.push_back("+retpoline-indirect-calls");
     Features.push_back("+retpoline-indirect-branches");
     SpectreOpt = options::OPT_mretpoline_external_thunk;
   }

   auto LVIOpt = clang::driver::options::ID::OPT_INVALID;
-  if (Args.hasFlag(options::OPT_mlvi_cfi, options::OPT_mno_lvi_cfi, false)) {
+  if (Args.hasFlag(options::OPT_mlvi_hardening, options::OPT_mno_lvi_hardening,
+                   false)) {
+    Features.push_back("+lvi-load-hardening");
+    Features.push_back("+lvi-cfi"); // load hardening implies CFI protection
+    LVIOpt = options::OPT_mlvi_hardening;
+  } else if (Args.hasFlag(options::OPT_mlvi_cfi, options::OPT_mno_lvi_cfi,
+                          false)) {
     Features.push_back("+lvi-cfi");
     LVIOpt = options::OPT_mlvi_cfi;
   }

   if (SpectreOpt != clang::driver::options::ID::OPT_INVALID &&
       LVIOpt != clang::driver::options::ID::OPT_INVALID) {
     D.Diag(diag::err_drv_argument_not_allowed_with)
         << D.getOpts().getOptionName(SpectreOpt)
         << D.getOpts().getOptionName(LVIOpt);
   }

   // Now add any that the user explicitly requested on the command line,
   // which may override the defaults.
   handleTargetFeaturesGroup(Args, Features, options::OPT_m_x86_Features_Group);
 }
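The flag-to-feature mapping in this hunk can be modeled outside of Clang. The sketch below is a toy under stated assumptions: `hasFlag` here is a simplified "last occurrence wins" helper written for this example (it only imitates the behavior of the driver's `Args.hasFlag`), and `lviFeatures` is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Toy "last flag wins" lookup: returns true if Pos is the last of Pos/Neg
// seen in Args, false if Neg is, and Default if neither appears.
inline bool hasFlag(const std::vector<std::string> &Args,
                    const std::string &Pos, const std::string &Neg,
                    bool Default) {
  bool Result = Default;
  for (const std::string &A : Args) {
    if (A == Pos)
      Result = true;
    else if (A == Neg)
      Result = false;
  }
  return Result;
}

// Mirrors the branch structure of the hunk: full hardening implies CFI,
// and -mlvi-cfi alone enables only the control-flow feature.
inline std::vector<std::string>
lviFeatures(const std::vector<std::string> &Args) {
  std::vector<std::string> Features;
  if (hasFlag(Args, "-mlvi-hardening", "-mno-lvi-hardening", false)) {
    Features.push_back("+lvi-load-hardening");
    Features.push_back("+lvi-cfi");
  } else if (hasFlag(Args, "-mlvi-cfi", "-mno-lvi-cfi", false)) {
    Features.push_back("+lvi-cfi");
  }
  return Features;
}
```

Because the hardening check comes first, `-mlvi-hardening -mlvi-cfi` yields both features, while `-mno-lvi-hardening -mlvi-cfi` falls through to the CFI-only branch.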

clang/test/Driver/x86-target-features.c

 // RUN: %clang -target i386-linux-gnu -mlvi-cfi -mspeculative-load-hardening %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVICFI-SLH %s
 // LVICFI-SLH: error: invalid argument 'mspeculative-load-hardening' not allowed with 'mlvi-cfi'
 // RUN: %clang -target i386-linux-gnu -mlvi-cfi -mretpoline %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVICFI-RETPOLINE %s
 // LVICFI-RETPOLINE: error: invalid argument 'mretpoline' not allowed with 'mlvi-cfi'
 // RUN: %clang -target i386-linux-gnu -mlvi-cfi -mretpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVICFI-RETPOLINE-EXTERNAL-THUNK %s
 // LVICFI-RETPOLINE-EXTERNAL-THUNK: error: invalid argument 'mretpoline-external-thunk' not allowed with 'mlvi-cfi'

+// RUN: %clang -target i386-linux-gnu -mlvi-hardening %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVIHARDENING %s
+// RUN: %clang -target i386-linux-gnu -mno-lvi-hardening %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-LVIHARDENING %s
+// LVIHARDENING: "-target-feature" "+lvi-load-hardening" "-target-feature" "+lvi-cfi"
+// NO-LVIHARDENING-NOT: lvi

+// RUN: %clang -target i386-linux-gnu -mlvi-hardening -mspeculative-load-hardening %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVIHARDENING-SLH %s
+// LVIHARDENING-SLH: error: invalid argument 'mspeculative-load-hardening' not allowed with 'mlvi-hardening'
+// RUN: %clang -target i386-linux-gnu -mlvi-hardening -mretpoline %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVIHARDENING-RETPOLINE %s
+// LVIHARDENING-RETPOLINE: error: invalid argument 'mretpoline' not allowed with 'mlvi-hardening'
+// RUN: %clang -target i386-linux-gnu -mlvi-hardening -mretpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVIHARDENING-RETPOLINE-EXTERNAL-THUNK %s
+// LVIHARDENING-RETPOLINE-EXTERNAL-THUNK: error: invalid argument 'mretpoline-external-thunk' not allowed with 'mlvi-hardening'

 // RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WAITPKG %s
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"

 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIRI %s
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIRI %s
 // MOVDIRI: "-target-feature" "+movdiri"

llvm/lib/Target/X86/CMakeLists.txt

 set(sources
   X86IndirectThunks.cpp
   X86InterleavedAccess.cpp
   X86InsertPrefetch.cpp
   X86InstrFMA3Info.cpp
   X86InstrFoldTables.cpp
   X86InstrInfo.cpp
   X86EvexToVex.cpp
   X86LegalizerInfo.cpp
+  X86LoadValueInjectionLoadHardening.cpp
   X86LoadValueInjectionRetHardening.cpp
   X86MCInstLower.cpp
   X86MachineFunctionInfo.cpp
   X86MacroFusion.cpp
   X86OptimizeLEAs.cpp
   X86PadShortFunction.cpp
   X86PartialReduction.cpp
   X86RegisterBankInfo.cpp

llvm/lib/Target/X86/ImmutableGraph.h

This file was added.

				//==========-- ImmutableGraph.h - A fast DAG implementation ---------=========//
				//
				[Inline comment, Done] zbrid: Might be useful if you add a comment about what makes this a fast DAG impl in case someone may want to use it later.
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				///
				/// Description: ImmutableGraph is a fast DAG implementation that cannot be
				/// modified, except by creating a new ImmutableGraph. ImmutableGraph is
				/// implemented as two arrays: one containing nodes, and one containing edges.
				/// The advantages to this implementation are two-fold:
				/// 1. Iteration and traversal operations benefit from cache locality.
				/// 2. Operations on sets of nodes/edges are efficient, and representations of
				[Inline comment, Not Done] mattdr: erm, "terrific"? If there's a substantive argument w.r.t. cache locality etc., please make it explicit.
				[Inline comment, Not Done] sconstab (author): This is valid. I will reword.
				[Inline comment, Not Done] sconstab (author): "Iteration and traversal operations benefit from cache locality."
				/// those sets in memory are compact. For instance, a set of edges is
				/// implemented as a bit vector, wherein each bit corresponds to one edge in
				/// the edge array. This implies a lower bound of 64x spatial improvement
				[Inline comment, Not Done] mattdr: "extraordinarily" is, again, not a useful engineering categorization. Please restrict comments to describing quantifiable claims of complexity.
				[Inline comment, Not Done] sconstab (author): AFAIK there is not a precise engineering term for "tiny O(1)." Nonetheless I will reword.
				[Inline comment, Not Done] sconstab (author): "Operations on sets of nodes/edges are efficient, and representations of those sets in memory are compact. For instance..."
				/// over, e.g., an llvm::DenseSet or llvm::SmallSet. It also means that
				[Inline comment, Done] mattdr: "spatial"
				/// insert/erase/contains operations complete in negligible constant time:
				/// insert and erase require one load and one store, and contains requires
				/// just one load.
				///
				//===----------------------------------------------------------------------===//
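The compactness claim in the header comment can be illustrated with a minimal standalone sketch. `ToyIndexSet` is a hypothetical stand-in for the `llvm::BitVector`-backed sets in this file, not the code under review: membership is one bit per element of the backing array, so insert and erase touch a single word and contains reads a single word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a BitVector-backed set over array indices.
class ToyIndexSet {
  std::vector<std::uint64_t> Words; // one bit per element of the domain
  std::size_t Domain;

  static std::uint64_t maskFor(std::size_t Idx) {
    return std::uint64_t{1} << (Idx % 64);
  }

public:
  explicit ToyIndexSet(std::size_t DomainSize)
      : Words((DomainSize + 63) / 64, 0), Domain(DomainSize) {}

  // Returns true if Idx was newly inserted.
  bool insert(std::size_t Idx) {
    assert(Idx < Domain && "index outside the set's domain");
    std::uint64_t &Word = Words[Idx / 64];
    bool WasAbsent = (Word & maskFor(Idx)) == 0;
    Word |= maskFor(Idx);
    return WasAbsent;
  }
  void erase(std::size_t Idx) {
    assert(Idx < Domain && "index outside the set's domain");
    Words[Idx / 64] &= ~maskFor(Idx);
  }
  bool contains(std::size_t Idx) const {
    assert(Idx < Domain && "index outside the set's domain");
    return (Words[Idx / 64] & maskFor(Idx)) != 0;
  }
  // Bytes of bit storage used for the whole domain.
  std::size_t storageBytes() const {
    return Words.size() * sizeof(std::uint64_t);
  }
};
```

A domain of 128 edges needs 16 bytes of bit storage here. Storing one 64-bit pointer per edge of the same domain would take 1024 bytes, which is presumably the comparison behind the 64x figure, assuming 64-bit pointers.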

				#ifndef LLVM_LIB_TARGET_X86_IMMUTABLEGRAPH_H
				#define LLVM_LIB_TARGET_X86_IMMUTABLEGRAPH_H

				[Inline comment, Done] mattdr: It's sort of surprising that the LLVM style guide doesn't call this out explicitly, but `#include` guards are supposed to include the full file path. If they just used the filename, like this, files with the same name in different paths would collide. For an example of the expected style, see an adjacent header in this directory: https://github.com/llvm/llvm-project/blob/ba8b3052b59ebee4311d10bee5209dac8747acea/llvm/lib/Target/X86/X86AsmPrinter.h#L10
				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/ADT/BitVector.h"
				#include "llvm/ADT/GraphTraits.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/Support/raw_ostream.h"
				#include <algorithm>
				#include <cassert>
				#include <iterator>
				#include <memory>
				#include <type_traits>
				#include <utility>
				#include <vector>

				namespace llvm {

				template <typename NodeValueT, typename EdgeValueT> class ImmutableGraph {
				using Traits = GraphTraits<ImmutableGraph<NodeValueT, EdgeValueT> *>;
				template <typename> friend class ImmutableGraphBuilder;
				[Inline comment, Done] mattdr: Every template argument for a class represents combinatorial addition of complexity for the resulting code. Why do each of these template arguments need to exist? In particular, why does SizeT need to exist?
				[Inline comment, Done] sconstab (author): I suspect that there may be more uses for this data structure and that eventually it may migrate to ADT. I have SizeT as a template argument because I found it plenty sufficient to have `int` as the size parameter for the array bounds, but I suspect other uses may require `size_t`.

				[Inline comment, Done] mattdr: I think this self-reference to `ImmutableGraph` dropped the `SizeT` parameter.
				[Inline comment, Done] sconstab (author): Yup. Good catch.
				[Inline comment, Done] craig.topper: Removed the template argument
				public:
				using node_value_type = NodeValueT;
				using edge_value_type = EdgeValueT;
				using size_type = int;
				class Node;
				class Edge {
				friend class ImmutableGraph;
				template <typename> friend class ImmutableGraphBuilder;

				const Node *Dest;
				edge_value_type Value;

				public:
				const Node *getDest() const { return Dest; }
				const edge_value_type &getValue() const { return Value; }
				[Inline comment, Done] chandlerc: Folks, this isn't even close to following LLVM's coding conventions or naming conventions. These violate the C++ standard. This shouldn't have been landed as-is. Can you all back this out and actually dig into the review and get this to match LLVM's actual coding style and standards?
				[Inline comment, Done] craig.topper: Reverted at 1d42c0db9a2b27c149c5bac373caa5a6d38d1f74
				};
				class Node {
				friend class ImmutableGraph;
				template <typename> friend class ImmutableGraphBuilder;

				const Edge *Edges;
				node_value_type Value;

				public:
				const node_value_type &getValue() const { return Value; }

				const Edge *edges_begin() const { return Edges; }
				// Nodes are allocated sequentially. Edges for a node are stored together.
				// The end of this Node's edges is the beginning of the next node's edges.
				// An extra node was allocated to hold the end pointer for the last real
				// node.
				const Edge *edges_end() const { return (this + 1)->Edges; }
				[Inline comment, Done] mattdr: Seems like you also want to add a comment here that we know we will never be asked for `edges_end` for the last stored node -- that is, we know that `this + 1` always refers to a valid Node (which is presumably a dummy/sentinel)
				[Inline comment, Done] sconstab (author): Not sure I agree. I cannot think of a conventional use of this interface that would perform an operation on the sentinel.
					G->nodes_end().edges_end(); // invalid use of any end iterator
					SomeNode.edges_end(); // invalid if SomeNode == G->nodes_end()
				That is, the way that we "know" that we will never be asked for `edges_end()` for the last stored node is that the ask itself would already violate C++ conventions.
				[Inline comment, Done] mattdr: I believe any operation on the last `Node` in the array will end up accessing the sentinel:
					Node* LastNode = G->nodes_begin() + (G->nodes_size() - 1); // valid reference to the last node
					LastNode->edges_end(); // uses `this+1`, accessing the sentinel value in the Nodes array
				[Inline comment, Done] sconstab (author): `G->nodes_size()` will return the size without the sentinel node, so your example should actually operate on the last data node. Right?
				[Inline comment, Done] craig.topper: Comment added to clarify the extra node was allocated.
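The sentinel arrangement debated in this thread is easier to see with indices instead of pointers. The following is a minimal sketch under assumed names (`ToyGraph`, `makeToyGraph`); it is not the layout code from this diff, only the same "one extra entry holds the last node's end" idea.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyGraph {
  // FirstEdge has one more entry than there are real nodes. The extra
  // (sentinel) entry holds the end offset for the last real node.
  std::vector<std::size_t> FirstEdge;
  std::vector<int> EdgeDest;

  // Number of real nodes, excluding the sentinel.
  std::size_t nodesSize() const { return FirstEdge.size() - 1; }

  std::size_t edgesBegin(std::size_t N) const {
    assert(N < nodesSize() && "the sentinel is not a valid node");
    return FirstEdge[N];
  }
  // A node's edges end where the next node's edges begin; for the last
  // real node, "the next node" is the sentinel entry.
  std::size_t edgesEnd(std::size_t N) const {
    assert(N < nodesSize() && "the sentinel is not a valid node");
    return FirstEdge[N + 1];
  }
  std::size_t outDegree(std::size_t N) const {
    return edgesEnd(N) - edgesBegin(N);
  }
};

// Builds the graph 0 -> 1, 0 -> 2, 1 -> 2; node 2 has no out-edges.
inline ToyGraph makeToyGraph() {
  return ToyGraph{{0, 2, 3, 3}, {1, 2, 2}};
}
```

Asking for the end of the last real node reads the sentinel entry, which is valid; asking for the edges of the sentinel itself is what the assertion rules out.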
				ArrayRef<Edge> edges() const {
				return makeArrayRef(edges_begin(), edges_end());
				}
				};

				protected:
				[Inline comment, Done] mattdr: Why "protected" rather than "private"? Usually seeing "protected" makes me think subclassing is expected, but that doesn't seem to be the case here.
				[Inline comment, Done] sconstab (author): The `MachineGadgetGraph` class actually does subclass `ImmutableGraph` to add some contextual information. I did not want the constructors for `ImmutableGraph` to be public, because the intent is to use the builder. So `protected` seemed like the best option.
				[Inline comment, Done] mattdr: Ah, I missed that. I searched through the file for `public ImmutableGraph` and didn't find it because `MachineGadgetGraph` uses the default inheritance specifier.
				ImmutableGraph(std::unique_ptr<Node[]> Nodes, std::unique_ptr<Edge[]> Edges,
				size_type NodesSize, size_type EdgesSize)
				: Nodes(std::move(Nodes)), Edges(std::move(Edges)), NodesSize(NodesSize),
				EdgesSize(EdgesSize) {}
				ImmutableGraph(const ImmutableGraph &) = delete;
				[Inline comment, Done] sconstab (author): After the members are reordered, this list must also be reordered.
				ImmutableGraph(ImmutableGraph &&) = delete;
				ImmutableGraph &operator=(const ImmutableGraph &) = delete;
				ImmutableGraph &operator=(ImmutableGraph &&) = delete;

				public:
				ArrayRef<Node> nodes() const { return makeArrayRef(Nodes.get(), NodesSize); }
				const Node *nodes_begin() const { return nodes().begin(); }
				const Node *nodes_end() const { return nodes().end(); }

				ArrayRef<Edge> edges() const { return makeArrayRef(Edges.get(), EdgesSize); }
				const Edge *edges_begin() const { return edges().begin(); }
				const Edge *edges_end() const { return edges().end(); }

				size_type nodes_size() const { return NodesSize; }
				size_type edges_size() const { return EdgesSize; }

				// Node N must belong to this ImmutableGraph.
				size_type getNodeIndex(const Node &N) const {
				[Inline comment, Done] mattdr: Worth adding a comment for this (and `getEdgeIndex`) that this will crash if the `Node` (`Edge`) provided is not a reference into this specific instance of `ImmutableGraph`.
				return std::distance(nodes_begin(), &N);
				[Inline comment, Not Done] sconstab (author): This had not occurred to me until now, but a lot of code is shared between `NodeSet` and `EdgeSet`. Maybe a template could reduce the redundancy?
				[Inline comment, Done] mattdr: Ideally I agree we'd find a way to collapse these -- but for this diff, let's content ourselves with a FIXME comment to that effect.
				}
				// Edge E must belong to this ImmutableGraph.
				size_type getEdgeIndex(const Edge &E) const {
				return std::distance(edges_begin(), &E);
				}

				// FIXME: Could NodeSet and EdgeSet be templated to share code?
				class NodeSet {
				const ImmutableGraph &G;
				BitVector V;

				public:
				NodeSet(const ImmutableGraph &G, bool ContainsAll = false)
				: G{G}, V{static_cast<unsigned>(G.nodes_size()), ContainsAll} {}
				[Inline comment, Done] mattdr: How do we know that a value of `size_type` (aka `SizeT`) can be cast to `unsigned` without truncation?
				[Inline comment, Done] sconstab (author): Ah. We do not know that. We could have a static assert here, but maybe the best thing to do would be to follow Matt's earlier advice and fix `size_type` to `int`, rather than have it as a template parameter. Anything larger would break the `BitVectors` and/or waste space.
				[Inline comment, Done] craig.topper: Removed the template argument
				bool insert(const Node &N) {
				size_type Idx = G.getNodeIndex(N);
				bool AlreadyExists = V.test(Idx);
				V.set(Idx);
				return !AlreadyExists;
				}
				void erase(const Node &N) {
				size_type Idx = G.getNodeIndex(N);
				V.reset(Idx);
				}
				bool contains(const Node &N) const {
				size_type Idx = G.getNodeIndex(N);
				return V.test(Idx);
				}
				void clear() { V.reset(); }
				bool empty() const { return V.none(); }
				/// Return the number of elements in the set
				size_type count() const { return V.count(); }
				/// Return the size of the set's domain
				size_type size() const { return V.size(); }
				/// Set union
				NodeSet &operator|=(const NodeSet &RHS) {
				assert(&this->G == &RHS.G);
				V |= RHS.V;
				return *this;
				}
				/// Set intersection
				NodeSet &operator&=(const NodeSet &RHS) {
				assert(&this->G == &RHS.G);
				V &= RHS.V;
				return *this;
				}
				/// Set symmetric difference
				NodeSet &operator^=(const NodeSet &RHS) {
				assert(&this->G == &RHS.G);
				V ^= RHS.V;
				return *this;
				}

				using index_iterator = typename BitVector::const_set_bits_iterator;
				index_iterator index_begin() const { return V.set_bits_begin(); }
				index_iterator index_end() const { return V.set_bits_end(); }
				void set(size_type Idx) { V.set(Idx); }
				void reset(size_type Idx) { V.reset(Idx); }

				class iterator {
				const NodeSet &Set;
				size_type Current;

				void advance() {
				assert(Current != -1);
				Current = Set.V.find_next(Current);
				}

				public:
				iterator(const NodeSet &Set, size_type Begin)
				: Set{Set}, Current{Begin} {}
				iterator operator++(int) {
				iterator Tmp = *this;
				advance();
				return Tmp;
				}
				iterator &operator++() {
				advance();
				return *this;
				}
				const Node *operator*() const {
				assert(Current != -1);
				return Set.G.nodes_begin() + Current;
				}
				bool operator==(const iterator &other) const {
				assert(&this->Set == &other.Set);
				return this->Current == other.Current;
				}
				bool operator!=(const iterator &other) const { return !(*this == other); }
				};

				iterator begin() const { return iterator{*this, V.find_first()}; }
				iterator end() const { return iterator{*this, -1}; }
				};

				class EdgeSet {
				const ImmutableGraph &G;
				BitVector V;

				public:
				EdgeSet(const ImmutableGraph &G, bool ContainsAll = false)
				: G{G}, V{static_cast<unsigned>(G.edges_size()), ContainsAll} {}
				bool insert(const Edge &E) {
				size_type Idx = G.getEdgeIndex(E);
				bool AlreadyExists = V.test(Idx);
				V.set(Idx);
				return !AlreadyExists;
				}
				void erase(const Edge &E) {
				size_type Idx = G.getEdgeIndex(E);
				V.reset(Idx);
				}
				bool contains(const Edge &E) const {
				size_type Idx = G.getEdgeIndex(E);
				return V.test(Idx);
				}
				void clear() { V.reset(); }
				bool empty() const { return V.none(); }
				/// Return the number of elements in the set
				size_type count() const { return V.count(); }
				/// Return the size of the set's domain
				size_type size() const { return V.size(); }
				/// Set union
				EdgeSet &operator|=(const EdgeSet &RHS) {
				assert(&this->G == &RHS.G);
				V |= RHS.V;
				return *this;
				}
				/// Set intersection
				EdgeSet &operator&=(const EdgeSet &RHS) {
				assert(&this->G == &RHS.G);
				V &= RHS.V;
				return *this;
				}
				/// Set symmetric difference
				EdgeSet &operator^=(const EdgeSet &RHS) {
				assert(&this->G == &RHS.G);
				V ^= RHS.V;
				return *this;
				}

				using index_iterator = typename BitVector::const_set_bits_iterator;
				index_iterator index_begin() const { return V.set_bits_begin(); }
				index_iterator index_end() const { return V.set_bits_end(); }
				void set(size_type Idx) { V.set(Idx); }
				void reset(size_type Idx) { V.reset(Idx); }

				class iterator {
				const EdgeSet &Set;
				size_type Current;

				void advance() {
				assert(Current != -1);
				Current = Set.V.find_next(Current);
				}

				public:
				iterator(const EdgeSet &Set, size_type Begin)
				: Set{Set}, Current{Begin} {}
				iterator operator++(int) {
				iterator Tmp = *this;
				advance();
				return Tmp;
				}
				iterator &operator++() {
				advance();
				return *this;
				}
				const Edge *operator*() const {
				assert(Current != -1);
				return Set.G.edges_begin() + Current;
				}
				bool operator==(const iterator &other) const {
				assert(&this->Set == &other.Set);
				return this->Current == other.Current;
				}
				bool operator!=(const iterator &other) const { return !(*this == other); }
				};

				iterator begin() const { return iterator{*this, V.find_first()}; }
				iterator end() const { return iterator{*this, -1}; }
				};
				[Inline comment, Done] sconstab (author): @craig.topper It now occurs to me that these fields should probably be reordered to:
					std::unique_ptr<Node[]> Nodes;
					std::unique_ptr<Edge[]> Edges;
					size_type NodesSize;
					size_type EdgesSize;
				The current ordering will cause internal fragmentation.
					Old ordering: static_assert(sizeof(ImmutableGraph<T, V>) == 32);
					New ordering: static_assert(sizeof(ImmutableGraph<T, V>) == 24);
					With vectors instead of arrays: static_assert(sizeof(ImmutableGraph<T, V>) == 48);
				[Inline comment, Done] craig.topper: I noticed that too. I just didn't focus on it since we only ever have one in memory at a time. I'll change in my next update.
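The size figures quoted in this thread come from struct padding and can be reproduced with a small standalone sketch. The struct names below are hypothetical and `int` stands in for `size_type`. The exact numbers (32, 24, 48) are what a typical 64-bit ABI gives and are not guaranteed by the C++ standard, so the code only asserts the relative ordering.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct ToyNode { int Value; };
struct ToyEdge { int Value; };

// Pointer, int, pointer, int: on a typical 64-bit ABI each int is followed
// by padding so that the next pointer (or the struct end) stays aligned.
struct InterleavedLayout {
  std::unique_ptr<ToyNode[]> Nodes;
  int NodesSize;
  std::unique_ptr<ToyEdge[]> Edges;
  int EdgesSize;
};

// Pointers first, then the two ints, which can share one pointer-sized slot.
struct GroupedLayout {
  std::unique_ptr<ToyNode[]> Nodes;
  std::unique_ptr<ToyEdge[]> Edges;
  int NodesSize;
  int EdgesSize;
};

// Vectors carry their own size and capacity, so no separate size fields.
struct VectorLayout {
  std::vector<ToyNode> Nodes;
  std::vector<ToyEdge> Edges;
};

static_assert(sizeof(GroupedLayout) <= sizeof(InterleavedLayout),
              "grouping the pointers should never make the struct larger");

// Bytes saved per object by grouping the pointers (ABI-dependent).
inline std::size_t bytesSavedByGrouping() {
  return sizeof(InterleavedLayout) - sizeof(GroupedLayout);
}
```

Printing `sizeof` for the three structs on the target of interest is the quickest way to check whether the figures from the thread apply there.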

				private:
				std::unique_ptr<Node[]> Nodes;
				std::unique_ptr<Edge[]> Edges;
				size_type NodesSize;
				size_type EdgesSize;
				};

				template <typename GraphT> class ImmutableGraphBuilder {
				using node_value_type = typename GraphT::node_value_type;
				using edge_value_type = typename GraphT::edge_value_type;
				static_assert(
				std::is_base_of<ImmutableGraph<node_value_type, edge_value_type>,
				[Inline comment, Done] mattdr: this will also break if a non-default `SizeT` is provided. Maybe a good argument to just leave out `SizeT` for now, and it can be added in the future as needed?
				GraphT>::value,
				"Template argument to ImmutableGraphBuilder must derive from "
				"ImmutableGraph<>");
				using size_type = typename GraphT::size_type;
				using NodeSet = typename GraphT::NodeSet;
				using Node = typename GraphT::Node;
				using EdgeSet = typename GraphT::EdgeSet;
				using Edge = typename GraphT::Edge;
				using BuilderEdge = std::pair<edge_value_type, size_type>;
				[Inline comment, Done] sconstab (author): Just noticed that `ImmutableGraphBuilder` and `ImmutableGraph` have non-identical types called `NodeRef`. Suggest renaming this one to `BuilderNodeRef`.
				[Inline comment, Done] craig.topper: NodeRef is in the Traits class not the ImmutableGraph, but I will rename the builder one.
				using EdgeList = std::vector<BuilderEdge>;
				using BuilderVertex = std::pair<node_value_type, EdgeList>;
				using VertexVec = std::vector<BuilderVertex>;

				public:
				using BuilderNodeRef = size_type;

				BuilderNodeRef addVertex(const node_value_type &V) {
				auto I = AdjList.emplace(AdjList.end(), V, EdgeList{});
				return std::distance(AdjList.begin(), I);
				}
				mattdr (Done): As a general rule `new` is a code smell in modern C++. This should be a `vector`.
				sconstab (Author, Done): @mattdr I do agree with the general rule. I also think that in this case, where the structure is immutable, `std::vector` is wasteful because it needs to keep separate values for the current number of elements and the current capacity. At local scope within a function the unneeded value would likely be optimized away, but then there would be an awkward handoff to transfer the data from the vector to the array members. I would not want to see the array members changed to vectors, unless LLVM provides an encapsulated array structure that does not need to grow and shrink.
				mattdr (Done): So, first: I'm glad you removed the unnecessary use of `new[]` here and the corresponding (and error-prone!) use of `delete[]` later. That removes a memory leak LLVM won't have to debug. You suggest here that something other than `std::vector` would be more efficient. If so, would `std::array` suffice? If not, can you explain why static allocation is impossible but dynamic allocation would be too expensive?
				sconstab (Author, Done): A statically sized array (e.g., `std::array`) is insufficient because the size in this case cannot be determined at compile time; a dynamically sized and dynamically resizable array (e.g., `std::vector`) is sufficient but overly costly; a dynamically sized array that cannot be resized is sufficient and has minimal cost.
				mattdr (Done): I'm not sure we allocate enough of these in the course of a compilation for the one extra word in a `std::vector` to matter, but I won't press the point.
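The "dynamically sized but not resizable" array discussed in this thread can be sketched roughly as below. This is only an illustration under stated assumptions: `FixedArray` and `freeze` are hypothetical names that do not appear in the patch, which instead holds `std::unique_ptr<Node[]>` and `std::unique_ptr<Edge[]>` members directly. On common standard library implementations such a type stores one pointer and one size, while `std::vector` additionally tracks capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical illustration type; not part of the patch.
// The size is chosen at run time but never changes afterwards.
template <typename T> class FixedArray {
  std::unique_ptr<T[]> Data; // owns the storage; freed automatically
  std::size_t Size;          // element count, fixed at construction

public:
  explicit FixedArray(std::size_t N)
      : Data(std::make_unique<T[]>(N)), Size(N) {}

  std::size_t size() const { return Size; }

  T &operator[](std::size_t I) {
    assert(I < Size && "index out of range");
    return Data[I];
  }
  const T &operator[](std::size_t I) const {
    assert(I < Size && "index out of range");
    return Data[I];
  }
};

// Hypothetical helper: stage values in a growable vector, then copy them
// into a fixed-size array once the final element count is known.
inline FixedArray<int> freeze(const std::vector<int> &Staging) {
  FixedArray<int> Out(Staging.size());
  for (std::size_t I = 0; I < Staging.size(); ++I)
    Out[I] = Staging[I];
  return Out;
}
```

The builder in this diff follows the same shape: vertices and edges are staged in `std::vector`s while the graph is being built, and `get()` copies them into arrays allocated with `std::make_unique<Node[]>` and `std::make_unique<Edge[]>`.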

				void addEdge(const edge_value_type &E, BuilderNodeRef From,
				BuilderNodeRef To) {
				AdjList[From].second.emplace_back(E, To);
				}

				bool empty() const { return AdjList.empty(); }

				template <typename... ArgT> std::unique_ptr<GraphT> get(ArgT &&... Args) {
				size_type VertexSize = AdjList.size(), EdgeSize = 0;
				for (const auto &V : AdjList) {
				craig.topper (Not Done): Can this be changed to `VI < VertexSize`?
				EdgeSize += V.second.size();
				}
				auto VertexArray =
				std::make_unique<Node[]>(VertexSize + 1 /* terminator node */);
				auto EdgeArray = std::make_unique<Edge[]>(EdgeSize);
				size_type VI = 0, EI = 0;
				mattdr (Done): This should return a `unique_ptr` to signal ownership transfer.
				sconstab (Author, Not Done): Yes, agree.
				for (; VI < VertexSize; ++VI) {
				VertexArray[VI].Value = std::move(AdjList[VI].first);
				VertexArray[VI].Edges = &EdgeArray[EI];
				auto NumEdges = static_cast<size_type>(AdjList[VI].second.size());
				for (size_type VEI = 0; VEI < NumEdges; ++VEI, ++EI) {
				mattdr (Not Done): This `if` is unnecessary.
				auto &E = AdjList[VI].second[VEI];
				EdgeArray[EI].Value = std::move(E.first);
				EdgeArray[EI].Dest = &VertexArray[E.second];
				}
				}
				assert(VI == VertexSize && EI == EdgeSize && "ImmutableGraph malformed");
				VertexArray[VI].Edges = &EdgeArray[EdgeSize]; // terminator node
				return std::make_unique<GraphT>(std::move(VertexArray),
				mattdr (Not Done): Technically a "generic" graph, so we should leave out "Gadget" here.
				std::move(EdgeArray), VertexSize, EdgeSize,
				std::forward<ArgT>(Args)...);
				}

				template <typename... ArgT>
				static std::unique_ptr<GraphT> trim(const GraphT &G, const NodeSet &TrimNodes,
				const EdgeSet &TrimEdges,
				ArgT &&... Args) {
				size_type NewVertexSize = G.nodes_size() - TrimNodes.count();
				size_type NewEdgeSize = G.edges_size() - TrimEdges.count();
				auto NewVertexArray =
				std::make_unique<Node[]>(NewVertexSize + 1 /* terminator node */);
				auto NewEdgeArray = std::make_unique<Edge[]>(NewEdgeSize);

				// Walk the nodes and determine the new index for each node.
				size_type NewNodeIndex = 0;
				std::vector<size_type> RemappedNodeIndex(G.nodes_size());
				for (const Node &N : G.nodes()) {
				mattdr (Not Done): Two comments here would make the code significantly easier to understand: 1. Note that we're using `.size()` here rather than `.count()`, so we're actually iterating over all Node indices, not just the ones to be trimmed. 2. The `TrimmedNodes` vector maps indices in the original NodeSet to the number of `Node`s before that index that have been trimmed, to allow later code to map elements to their new position in a dense array with the trimmed items removed.
				craig.topper (Done): I've stopped using `TrimmedNodes.size()` and `TrimEdges.size()` in favor of the size methods from the graph, which should make things more obvious. I renamed `TrimmedNodes` to `RemappedNodeIndex` and stored the new index rather than the adjustment needed. I also changed it to walk nodes instead of indices so we don't have to translate to `Node` to make the `contains` call. I also removed the `NewNumEdges` `count_if` and the `if` statement around the edge loop from the loop below. I don't think that provided any value and just complicated the code.
				mattdr (Done): Many thanks! These changes make the code much more accessible.
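The remapping scheme described in this thread (store each surviving node's new index directly, rather than an adjustment) can be sketched in isolation as follows. This is a minimal sketch, not the patch's code: `remapNodeIndices` is a hypothetical helper, and a plain `std::vector<bool>` stands in for the patch's `NodeSet`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper. Trimmed[I] is true when node I is to be removed.
// For every surviving node I, the result holds its index in the dense array
// that remains once the trimmed nodes are dropped. Entries for trimmed nodes
// are left at 0 and are not meaningful.
inline std::vector<std::size_t>
remapNodeIndices(const std::vector<bool> &Trimmed) {
  std::vector<std::size_t> Remapped(Trimmed.size());
  std::size_t NewIndex = 0;
  // Walk all original indices, not just the trimmed ones.
  for (std::size_t I = 0; I < Trimmed.size(); ++I) {
    if (Trimmed[I])
      continue;
    Remapped[I] = NewIndex++;
  }
  return Remapped;
}
```

With such a table in hand, an edge whose destination had old index `D` can be redirected to `Remapped[D]` in the new vertex array, which is what the second loop in `trim` does through `RemappedNodeIndex[DestIdx]`.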
				if (TrimNodes.contains(N))
				continue;
				RemappedNodeIndex[G.getNodeIndex(N)] = NewNodeIndex++;
				craig.topper (Done): I think I'll change this to `llvm::count_if`. Also, there was previously a conditional here that made sure the distance between edges was >0, but it didn't seem necessary. Please let me know if there's a reason I should put that back.
				}
				assert(NewNodeIndex == NewVertexSize &&
				"Should have assigned NewVertexSize indices");

				size_type VertexI = 0, EdgeI = 0;
				for (const Node &N : G.nodes()) {
				if (TrimNodes.contains(N))
				continue;
				NewVertexArray[VertexI].Value = N.getValue();
				NewVertexArray[VertexI].Edges = &NewEdgeArray[EdgeI];
				for (const Edge &E : N.edges()) {
				if (TrimEdges.contains(E))
				continue;
				NewEdgeArray[EdgeI].Value = E.getValue();
				size_type DestIdx = G.getNodeIndex(*E.getDest());
				size_type NewIdx = RemappedNodeIndex[DestIdx];
				assert(NewIdx < NewVertexSize);
				NewEdgeArray[EdgeI].Dest = &NewVertexArray[NewIdx];
				++EdgeI;
				}
				++VertexI;
				}
				assert(VertexI == NewVertexSize && EdgeI == NewEdgeSize &&
				"Gadget graph malformed");
				NewVertexArray[VertexI].Edges = &NewEdgeArray[NewEdgeSize]; // terminator
				return std::make_unique<GraphT>(std::move(NewVertexArray),
				std::move(NewEdgeArray), NewVertexSize,
				NewEdgeSize, std::forward<ArgT>(Args)...);
				}

				private:
				VertexVec AdjList;
				};

				template <typename NodeValueT, typename EdgeValueT>
				struct GraphTraits<ImmutableGraph<NodeValueT, EdgeValueT> *> {
				using GraphT = ImmutableGraph<NodeValueT, EdgeValueT>;
				using NodeRef = typename GraphT::Node const *;
				using EdgeRef = typename GraphT::Edge const &;

				static NodeRef edge_dest(EdgeRef E) { return E.getDest(); }
				using ChildIteratorType =
				mapped_iterator<typename GraphT::Edge const *, decltype(&edge_dest)>;

				static NodeRef getEntryNode(GraphT *G) { return G->nodes_begin(); }
				static ChildIteratorType child_begin(NodeRef N) {
				return {N->edges_begin(), &edge_dest};
				}
				static ChildIteratorType child_end(NodeRef N) {
				return {N->edges_end(), &edge_dest};
				}

				static NodeRef getNode(typename GraphT::Node const &N) { return NodeRef{&N}; }
				using nodes_iterator =
				mapped_iterator<typename GraphT::Node const *, decltype(&getNode)>;
				static nodes_iterator nodes_begin(GraphT *G) {
				return {G->nodes_begin(), &getNode};
				}
				static nodes_iterator nodes_end(GraphT *G) {
				return {G->nodes_end(), &getNode};
				}

				using ChildEdgeIteratorType = typename GraphT::Edge const *;

				static ChildEdgeIteratorType child_edge_begin(NodeRef N) {
				return N->edges_begin();
				}
				static ChildEdgeIteratorType child_edge_end(NodeRef N) {
				return N->edges_end();
				}
				static typename GraphT::size_type size(GraphT *G) { return G->nodes_size(); }
				};

				} // end namespace llvm

				#endif // LLVM_LIB_TARGET_X86_IMMUTABLEGRAPH_H

llvm/lib/Target/X86/X86.h

	/// a reduction sequence and is therefore safe to reassociate in interesting
	/// ways.
	FunctionPass *createX86PartialReductionPass();

	InstructionSelector *createX86InstructionSelector(const X86TargetMachine &TM,
	X86Subtarget &,
	X86RegisterBankInfo &);

				FunctionPass *createX86LoadValueInjectionLoadHardeningPass();
	FunctionPass *createX86LoadValueInjectionRetHardeningPass();
	FunctionPass *createX86SpeculativeLoadHardeningPass();
	FunctionPass *createX86SpeculativeExecutionSideEffectSuppression();

	void initializeEvexToVexInstPassPass(PassRegistry &);
	void initializeFixupBWInstPassPass(PassRegistry &);
	void initializeFixupLEAPassPass(PassRegistry &);
	void initializeFPSPass(PassRegistry &);
	void initializeWinEHStatePassPass(PassRegistry &);
	void initializeX86AvoidSFBPassPass(PassRegistry &);
	void initializeX86AvoidTrailingCallPassPass(PassRegistry &);
	void initializeX86CallFrameOptimizationPass(PassRegistry &);
	void initializeX86CmovConverterPassPass(PassRegistry &);
	void initializeX86CondBrFoldingPassPass(PassRegistry &);
	void initializeX86DomainReassignmentPass(PassRegistry &);
	void initializeX86ExecutionDomainFixPass(PassRegistry &);
	void initializeX86ExpandPseudoPass(PassRegistry &);
	void initializeX86FixupSetCCPassPass(PassRegistry &);
	void initializeX86FlagsCopyLoweringPassPass(PassRegistry &);
				void initializeX86LoadValueInjectionLoadHardeningPassPass(PassRegistry &);
	void initializeX86LoadValueInjectionRetHardeningPassPass(PassRegistry &);
	void initializeX86OptimizeLEAPassPass(PassRegistry &);
	void initializeX86PartialReductionPass(PassRegistry &);
	void initializeX86SpeculativeLoadHardeningPassPass(PassRegistry &);
	void initializeX86SpeculativeExecutionSideEffectSuppressionPass(PassRegistry &);

	namespace X86AS {
	enum : unsigned {

llvm/lib/Target/X86/X86.td

	def FeatureLVIControlFlowIntegrity
	: SubtargetFeature<
	"lvi-cfi", "UseLVIControlFlowIntegrity", "true",
	"Prevent indirect calls/branches from using a memory operand, and "
	"precede all indirect calls/branches from a register with an "
	"LFENCE instruction to serialize control flow. Also decompose RET "
	"instructions into a POP+LFENCE+JMP sequence.">;

				// Mitigate LVI attacks against data loads
				def FeatureLVILoadHardening
				: SubtargetFeature<
				"lvi-load-hardening", "UseLVILoadHardening", "true",
				"Insert LFENCE instructions to prevent data speculatively injected "
				"into loads from being used maliciously.">;

	// Direct Move instructions.
	def FeatureMOVDIRI : SubtargetFeature<"movdiri", "HasMOVDIRI", "true",
	"Support movdiri instruction">;
	def FeatureMOVDIR64B : SubtargetFeature<"movdir64b", "HasMOVDIR64B", "true",
	"Support movdir64b instruction">;

	def FeatureFastBEXTR : SubtargetFeature<"fast-bextr", "HasFastBEXTR", "true",
	"Indicates that the BEXTR instruction is implemented as a single uop "

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp

This file was added.

				//==-- X86LoadValueInjectionLoadHardening.cpp - LVI load hardening for x86 --=//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				///
				/// Description: This pass finds Load Value Injection (LVI) gadgets consisting
				/// of a load from memory (i.e., SOURCE), and any operation that may transmit
				/// the value loaded from memory over a covert channel, or use the value loaded
				/// from memory to determine a branch/call target (i.e., SINK).
				///
				//===----------------------------------------------------------------------===//

				#include "ImmutableGraph.h"
				#include "X86.h"
				#include "X86Subtarget.h"
				#include "X86TargetMachine.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/DenseSet.h"
				#include "llvm/ADT/SmallSet.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/CodeGen/MachineBasicBlock.h"
				#include "llvm/CodeGen/MachineDominanceFrontier.h"
				#include "llvm/CodeGen/MachineDominators.h"
				#include "llvm/CodeGen/MachineFunction.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				#include "llvm/CodeGen/MachineInstr.h"
				#include "llvm/CodeGen/MachineInstrBuilder.h"
				#include "llvm/CodeGen/MachineLoopInfo.h"
				#include "llvm/CodeGen/MachineRegisterInfo.h"
				#include "llvm/CodeGen/RDFGraph.h"
				#include "llvm/CodeGen/RDFLiveness.h"
				#include "llvm/InitializePasses.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Support/DOTGraphTraits.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/GraphWriter.h"
				#include "llvm/Support/raw_ostream.h"

				using namespace llvm;

				#define PASS_KEY "x86-lvi-load"
				#define DEBUG_TYPE PASS_KEY

				STATISTIC(NumFunctionsConsidered, "Number of functions analyzed");
				STATISTIC(NumFunctionsMitigated, "Number of functions for which mitigations "
				"were deployed");
				STATISTIC(NumGadgets, "Number of LVI gadgets detected during analysis");

				static cl::opt<bool> NoConditionalBranches(
				PASS_KEY "-no-cbranch",
				cl::desc("Don't treat conditional branches as disclosure gadgets. This "
				"may improve performance, at the cost of security."),
				cl::init(false), cl::Hidden);
				mattdr (Not Done): This is another case where references to good documentation will go a long way. Without details about what the tradeoff is and how to reason about it, it doesn't seem like anyone should use this flag.

				static cl::opt<bool> EmitDot(
				PASS_KEY "-dot",
				cl::desc(
				"For each function, emit a dot graph depicting potential LVI gadgets"),
				cl::init(false), cl::Hidden);

				static cl::opt<bool> EmitDotOnly(
				PASS_KEY "-dot-only",
				cl::desc("For each function, emit a dot graph depicting potential LVI "
				"gadgets, and do not insert any fences"),
				cl::init(false), cl::Hidden);

				static cl::opt<bool> EmitDotVerify(
				PASS_KEY "-dot-verify",
				cl::desc("For each function, emit a dot graph to stdout depicting "
				"potential LVI gadgets, used for testing purposes only"),
				cl::init(false), cl::Hidden);

				namespace {

				struct MachineGadgetGraph : ImmutableGraph<MachineInstr *, int> {
				static constexpr int GadgetEdgeSentinel = -1;
				static constexpr MachineInstr *const ArgNodeSentinel = nullptr;

				using GraphT = ImmutableGraph<MachineInstr *, int>;
				using Node = typename GraphT::Node;
				using Edge = typename GraphT::Edge;
				using size_type = typename GraphT::size_type;
				mattdr (Not Done): Please replace these with constants or functions.
				craig.topper (Done): Oops, forgot to do this one.
				MachineGadgetGraph(std::unique_ptr<Node[]> Nodes,
				std::unique_ptr<Edge[]> Edges, size_type NodesSize,
				size_type EdgesSize, int NumFences = 0, int NumGadgets = 0)
				: GraphT(std::move(Nodes), std::move(Edges), NodesSize, EdgesSize),
				NumFences(NumFences), NumGadgets(NumGadgets) {}
				static inline bool isCFGEdge(const Edge &E) {
				return E.getValue() != GadgetEdgeSentinel;
				}
				static inline bool isGadgetEdge(const Edge &E) {
				return E.getValue() == GadgetEdgeSentinel;
				}
				int NumFences;
				int NumGadgets;
				};

				class X86LoadValueInjectionLoadHardeningPass : public MachineFunctionPass {
				public:
				X86LoadValueInjectionLoadHardeningPass() : MachineFunctionPass(ID) {}

				StringRef getPassName() const override {
				return "X86 Load Value Injection (LVI) Load Hardening";
				}
				void getAnalysisUsage(AnalysisUsage &AU) const override;
				bool runOnMachineFunction(MachineFunction &MF) override;

				static char ID;

				private:
				mattdr (Done): Cleaner how?
				sconstab (Author, Done): Maybe by keeping a member reference to the associated `MachineFunction`?
				mattdr (Done): Let's put that in the comment instead.
				craig.topper (Done): I found a way to remove the `getMF` method entirely.
				using GraphBuilder = ImmutableGraphBuilder<MachineGadgetGraph>;
				using EdgeSet = MachineGadgetGraph::EdgeSet;
				using NodeSet = MachineGadgetGraph::NodeSet;
				using Gadget = std::pair<MachineInstr *, MachineInstr *>;

				const X86Subtarget *STI;
				const TargetInstrInfo *TII;
				const TargetRegisterInfo *TRI;

				std::unique_ptr<MachineGadgetGraph>
				getGadgetGraph(MachineFunction &MF, const MachineLoopInfo &MLI,
				const MachineDominatorTree &MDT,
				const MachineDominanceFrontier &MDF) const;

				bool instrUsesRegToAccessMemory(const MachineInstr &I, unsigned Reg) const;
				bool instrUsesRegToBranch(const MachineInstr &I, unsigned Reg) const;
				inline bool isFence(const MachineInstr *MI) const {
				return MI && (MI->getOpcode() == X86::LFENCE ||
				(STI->useLVIControlFlowIntegrity() && MI->isCall()));
				}
				};

				} // end anonymous namespace

				namespace llvm {

				template <>
				struct GraphTraits<MachineGadgetGraph *>
				: GraphTraits<ImmutableGraph<MachineInstr *, int> *> {};

				template <>
				struct DOTGraphTraits<MachineGadgetGraph *> : DefaultDOTGraphTraits {
				using GraphType = MachineGadgetGraph;
				using Traits = llvm::GraphTraits<GraphType *>;
				using NodeRef = typename Traits::NodeRef;
				using EdgeRef = typename Traits::EdgeRef;
				using ChildIteratorType = typename Traits::ChildIteratorType;
				using ChildEdgeIteratorType = typename Traits::ChildEdgeIteratorType;

				DOTGraphTraits(bool isSimple = false) : DefaultDOTGraphTraits(isSimple) {}

				std::string getNodeLabel(NodeRef Node, GraphType *) {
				if (Node->getValue() == MachineGadgetGraph::ArgNodeSentinel)
				return "ARGS";

				std::string Str;
				raw_string_ostream OS(Str);
				OS << *Node->getValue();
				return OS.str();
				}

				static std::string getNodeAttributes(NodeRef Node, GraphType *) {
				MachineInstr *MI = Node->getValue();
				if (MI == MachineGadgetGraph::ArgNodeSentinel)
				return "color = blue";
				if (MI->getOpcode() == X86::LFENCE)
				return "color = green";
				return "";
				}

				static std::string getEdgeAttributes(NodeRef, ChildIteratorType E,
				GraphType *) {
				int EdgeVal = (*E.getCurrent()).getValue();
				return EdgeVal >= 0 ? "label = " + std::to_string(EdgeVal)
				: "color = red, style = \"dashed\"";
				}
				};

				} // end namespace llvm

				constexpr MachineInstr *MachineGadgetGraph::ArgNodeSentinel;
				constexpr int MachineGadgetGraph::GadgetEdgeSentinel;

				char X86LoadValueInjectionLoadHardeningPass::ID = 0;

				void X86LoadValueInjectionLoadHardeningPass::getAnalysisUsage(
				AnalysisUsage &AU) const {
				MachineFunctionPass::getAnalysisUsage(AU);
				AU.addRequired<MachineLoopInfo>();
				AU.addRequired<MachineDominatorTree>();
				AU.addRequired<MachineDominanceFrontier>();
				AU.setPreservesCFG();
				}

				static void WriteGadgetGraph(raw_ostream &OS, MachineFunction &MF,
				MachineGadgetGraph *G) {
				WriteGraph(OS, G, /*ShortNames*/ false,
				"Speculative gadgets for \"" + MF.getName() + "\" function");
				}

				bool X86LoadValueInjectionLoadHardeningPass::runOnMachineFunction(
				MachineFunction &MF) {
				LLVM_DEBUG(dbgs() << "***** " << getPassName() << " : " << MF.getName()
				<< " *****\n");
				STI = &MF.getSubtarget<X86Subtarget>();
				if (!STI->useLVILoadHardening())
				return false;

				// FIXME: support 32-bit
				if (!STI->is64Bit())
				report_fatal_error("LVI load hardening is only supported on 64-bit", false);

				// Don't skip functions with the "optnone" attr but participate in opt-bisect.
				const Function &F = MF.getFunction();
				if (!F.hasOptNone() && skipFunction(F))
				return false;

				++NumFunctionsConsidered;
				TII = STI->getInstrInfo();
				TRI = STI->getRegisterInfo();
				LLVM_DEBUG(dbgs() << "Building gadget graph...\n");
				const auto &MLI = getAnalysis<MachineLoopInfo>();
				const auto &MDT = getAnalysis<MachineDominatorTree>();
				const auto &MDF = getAnalysis<MachineDominanceFrontier>();
				std::unique_ptr<MachineGadgetGraph> Graph = getGadgetGraph(MF, MLI, MDT, MDF);
				LLVM_DEBUG(dbgs() << "Building gadget graph... Done\n");
				if (Graph == nullptr)
				return false; // didn't find any gadgets

				if (EmitDotVerify) {
				mattdr (Done): If the user requests hardening and we can't do it, it seems better to fail loudly so they don't accidentally deploy an unmitigated binary.
				sconstab (Author, Done): @craig.topper I think this is related to the discussion we were having about what would happen for SLH on unsupported subtargets. I'm not sure what the most appropriate solution would be.
				craig.topper (Done): Added a fatal error. Which isn't great, as it will generate a crash report in clang. But it will tell the user to file a compiler bug, so I guess that's something.
				sconstab (Author, Not Done): Would it be better to have `report_fatal_error("LVI load hardening is only supported on 64-bit " "targets.", false);` so that the crash diagnostic is not generated?
				WriteGadgetGraph(outs(), MF, Graph.get());
				return false;
				}
				mattdr (Not Done): Each call to `hardenLoads` leads to a call to `buildGadgetGraph`. A lot of the work that `getGadgetGraph` does seems to be common between mitigating fixed and non-fixed loads, for example computing register dataflow and liveness over the entire function. And calling `hardenLoads` twice looks to be the common case, since `NoFixedLoads` is `false` by default. Could we make this pass about half as expensive by default by combining these two calls to `hardenLoads` into one? It would do the expensive work once, then either harden _all_ loads or _only_ non-fixed loads.

				if (EmitDot || EmitDotOnly) {
				LLVM_DEBUG(dbgs() << "Emitting gadget graph...\n");
				std::error_code FileError;
				std::string FileName = "lvi.";
				FileName += MF.getName();
				FileName += ".dot";
				raw_fd_ostream FileOut(FileName, FileError);
				if (FileError)
				errs() << FileError.message();
				WriteGadgetGraph(FileOut, MF, Graph.get());
				FileOut.close();
				LLVM_DEBUG(dbgs() << "Emitting gadget graph... Done\n");
				if (EmitDotOnly)
				return false;
				}

				return false;
				zbrid (Done): I think this should go at the top of the function.
				}

				std::unique_ptr<MachineGadgetGraph>
				X86LoadValueInjectionLoadHardeningPass::getGadgetGraph(
				MachineFunction &MF, const MachineLoopInfo &MLI,
				const MachineDominatorTree &MDT,
				const MachineDominanceFrontier &MDF) const {
				using namespace rdf;

				// Build the Register Dataflow Graph using the RDF framework
				TargetOperandInfo TOI{*TII};
				DataFlowGraph DFG{MF, TII, TRI, MDT, MDF, TOI};
				DFG.build();
				Liveness L{MF.getRegInfo(), DFG};
				L.computePhiInfo();

				GraphBuilder Builder;
				zbrid (Done): Am I misunderstanding this comment? It sounds like if `FixedLoads` is true then BOTH fixed loads and non-fixed loads will be mitigated. Since `runOnMachineFunction` would call `hardenLoads` twice for non-fixed loads, would that result in double mitigation for non-fixed loads in the case where we also harden fixed loads? Unfortunately I'm having trouble reasoning through this myself, so I'd appreciate some clarification.
				sconstab (Author, Done): The comment was incorrect.
				using GraphIter = typename GraphBuilder::BuilderNodeRef;
				DenseMap<MachineInstr *, GraphIter> NodeMap;
				int FenceCount = 0, GadgetCount = 0;
				auto MaybeAddNode = [&NodeMap, &Builder](MachineInstr *MI) {
				auto Ref = NodeMap.find(MI);
				if (Ref == NodeMap.end()) {
				auto I = Builder.addVertex(MI);
				NodeMap[MI] = I;
				return std::pair<GraphIter, bool>{I, true};
				}
				return std::pair<GraphIter, bool>{Ref->getSecond(), false};
				};

				// The `Transmitters` map memoizes transmitters found for each def. If a def
				// has not yet been analyzed, then it will not appear in the map. If a def
				// has been analyzed and was determined not to have any transmitters, then
				mattdr (Done): This comment doesn't seem to match how the map is used: it looks like the loop assumes a def has been analyzed iff it is present in the map. This matches my expectation that, if a def is present and maps to an empty list, it would mean the def had been analyzed and found not to transmit.
				// its list of transmitters will be empty.
				DenseMap<NodeId, std::vector<NodeId>> Transmitters;
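The memoization convention settled on in the comment above (absent key means "not yet analyzed"; present key with an empty list means "analyzed, no transmitters") can be sketched on its own as below. This is an illustrative sketch only: `DefId` and `TransmitterCache` are hypothetical stand-ins, and `std::map` replaces the `DenseMap` used by the pass so that the example does not depend on LLVM headers.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using DefId = unsigned; // hypothetical stand-in for rdf::NodeId

// Hypothetical wrapper illustrating the convention. The pass itself uses
// the map directly rather than a wrapper like this.
struct TransmitterCache {
  std::map<DefId, std::vector<DefId>> Map;

  // A def has been analyzed if and only if it has an entry in the map.
  bool analyzed(DefId D) const { return Map.count(D) != 0; }

  // Record the result of analyzing D; an empty list is a valid result.
  void record(DefId D, std::vector<DefId> Found) {
    Map[D] = std::move(Found);
  }

  // True only when D was analyzed and at least one transmitter was found.
  bool transmits(DefId D) const {
    auto It = Map.find(D);
    return It != Map.end() && !It->second.empty();
  }
};
```

Keeping "analyzed" and "transmits" as separate questions is what lets the early return at the top of `AnalyzeDefUseChain` skip defs that were already visited, including those found to have no transmitters.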

				// Analyze all machine instructions to find gadgets and LFENCEs, adding
				// each interesting value to `Nodes`
				auto AnalyzeDef = [&](NodeAddr<DefNode *> SourceDef) {
				SmallSet<NodeId, 8> UsesVisited, DefsVisited;
				std::function<void(NodeAddr<DefNode *>)> AnalyzeDefUseChain =
				mattdr (Done): FWIW, this code would be easier to understand if we didn't shadow `Def` with another variable named `Def`.
				sconstab (Author, Done): Changed the outer def to `SourceDef`, which also seems to make the code after the lambda a lot clearer.
				[&](NodeAddr<DefNode *> Def) {
				if (Transmitters.find(Def.Id) != Transmitters.end())
				return; // Already analyzed `Def`

				// Use RDF to find all the uses of `Def`
				rdf::NodeSet Uses;
				RegisterRef DefReg = DFG.getPRI().normalize(Def.Addr->getRegRef(DFG));
				for (auto UseID : L.getAllReachedUses(DefReg, Def)) {
				auto Use = DFG.addr<UseNode *>(UseID);
				if (Use.Addr->getFlags() & NodeAttrs::PhiRef) { // phi node
				NodeAddr<PhiNode *> Phi = Use.Addr->getOwner(DFG);
				for (auto I : L.getRealUses(Phi.Id)) {
				if (DFG.getPRI().alias(RegisterRef(I.first), DefReg)) {
				for (auto UA : I.second)
				Uses.emplace(UA.first);
				}
				}
				} else { // not a phi node
				Uses.emplace(UseID);
				}
				}

				// For each use of `Def`, we want to know whether:
				// (1) The use can leak the Def'ed value,
				// (2) The use can further propagate the Def'ed value to more defs
				for (auto UseID : Uses) {
				if (!UsesVisited.insert(UseID).second)
				mattdr (Done): Please add a comment explaining the semantics of the boolean return here. I think it's: `true` if we need to consider defs of this instruction tainted by this use (and therefore add them for analysis); `false` if this instruction consumes its use.
				craig.topper (Done): Was this comment addressed?
				sconstab (Author, Done): It had not been addressed, so thank you for pointing this out. That lambda was doing too many things at once, which made it more confusing than it needed to be. So I just inlined it in the `for (auto N : Uses) { … }` loop, and I added some additional clarifying comments.
				mattdr (Not Done): "current def" is a bit ambiguous here. I _believe_ it means `AnalyzeDef`'s `Def` argument? At least, that's the interpretation that makes the comment make sense, since `UsesVisited` is in `AnalyzeDef`'s scope.
				sconstab (Author, Done): I am now trying to be clearer by using capital-D "Def" to refer specifically to the def that is being analyzed, and lower-case-d "def" to refer to any other defs. Do you think this is better? Good enough?
				mattdr (Done): Much better. Thank you for the change!
				continue; // Already visited this use of `Def`

				auto Use = DFG.addr<UseNode *>(UseID);
				assert(!(Use.Addr->getFlags() & NodeAttrs::PhiRef));
				MachineOperand &UseMO = Use.Addr->getOp();
				MachineInstr &UseMI = *UseMO.getParent();
				assert(UseMO.isReg());
				mattdr (Not Done): Why is it okay to assume that a call doesn't propagate its uses to defs? Is it because we can assume the CFI transform is already inserting an LFENCE? Whatever the reason, let's state it explicitly here.

				// We naively assume that an instruction propagates any loaded
				mattdr (Done): Copying a comment from a previous iteration: Why is it okay to assume that a call doesn't propagate its uses to defs? Is it because we can assume the CFI transform is already inserting an LFENCE? Whatever the reason, let's state it explicitly here.
				sconstab (Author, Done): Added clarification to the comment.
				// uses to all defs unless the instruction is a call, in which
				// case all arguments will be treated as gadget sources during
				// analysis of the callee function.
				if (UseMI.isCall())
				continue;

				// Check whether this use can transmit (leak) its value.
				if (instrUsesRegToAccessMemory(UseMI, UseMO.getReg()) ||
				(!NoConditionalBranches &&
				instrUsesRegToBranch(UseMI, UseMO.getReg()))) {
				mattdr (Done): Some more detail would be useful here: precise about what? What are the likely errors?
				craig.topper (Done): Was this answered somewhere?
				sconstab (Author, Done): This was referring to the use of `mayLoad()`. At the time I wrote that comment, I wasn't sure that `mayLoad()` was exactly what was needed there, but I now think that it does suffice (SLH also uses `MachineInstr::mayLoad()`).
				mattdr (Not Done): The comment doesn't match the loop, which is traversing over `Uses`. More importantly, though: why are we allowed to stop traversing through `Uses` here? This `Def` won't be analyzed again, so this is our only chance to enumerate all transmitters to make sure we have all the necessary source -> sink edges in the gadget graph.
				sconstab (Author, Done): @mattdr I think that the code is correct, and I added more to the comment in an attempt to clarify. Let me know if you still think that this is an issue.
				mattdr (Done): I definitely misread `continue` as `break` here. Thanks for the extra clarity and sorry for the noise.
				Transmitters[Def.Id].push_back(Use.Addr->getOwner(DFG).Id);
				if (UseMI.mayLoad())
				continue; // Found a transmitting load -- no need to continue
				// traversing its defs (i.e., this load will become
				// a new gadget source anyways).
				}

				// Check whether the use propagates to more defs.
				NodeAddr<InstrNode *> Owner{Use.Addr->getOwner(DFG)};
				rdf::NodeList AnalyzedChildDefs;
				for (auto &ChildDef :
				Owner.Addr->members_if(DataFlowGraph::IsDef, DFG)) {
				if (!DefsVisited.insert(ChildDef.Id).second)
				continue; // Already visited this def
				if (Def.Addr->getAttrs() & NodeAttrs::Dead)
				continue;
				if (Def.Id == ChildDef.Id)
				continue; // `Def` uses itself (e.g., increment loop counter)

				AnalyzeDefUseChain(ChildDef);

				// `Def` inherits all of its child defs' transmitters.
				for (auto TransmitterId : Transmitters[ChildDef.Id])
				Transmitters[Def.Id].push_back(TransmitterId);
				mattdr (Done): This is also the place we populate `Transmitters` (with a default-constructed vector) for the current def if we haven't otherwise found any transmits. That's good, and necessary for `Transmitters` to remember we've analyzed the current def. But we should leave a comment about this subtle load-bearing side-effect.
				}
				}

				// Note that this statement adds `Def.Id` to the map if no
				mattdr (Not Done): Should `Transmitters` map to an `llvm::SmallSet`?
				sconstab (Author, Done): In my testing, `std::vector` seems a bit faster than `llvm::SmallSet`. I also suspect that `llvm::SmallSet` may waste more space because many defs will have no transmitters.
				// transmitters were found for `Def`.
				auto &DefTransmitters = Transmitters[Def.Id];

				// Remove duplicate transmitters
				llvm::sort(DefTransmitters);
				DefTransmitters.erase(
				std::unique(DefTransmitters.begin(), DefTransmitters.end()),
				DefTransmitters.end());
				};

				// Find all of the transmitters
				AnalyzeDefUseChain(SourceDef);
				auto &SourceDefTransmitters = Transmitters[SourceDef.Id];
				if (SourceDefTransmitters.empty())
				return; // No transmitters for `SourceDef`

				MachineInstr *Source = SourceDef.Addr->getFlags() & NodeAttrs::PhiRef
				? MachineGadgetGraph::ArgNodeSentinel
				: SourceDef.Addr->getOp().getParent();
				auto GadgetSource = MaybeAddNode(Source);
				// Each transmitter is a sink for `SourceDef`.
				for (auto TransmitterId : SourceDefTransmitters) {
				MachineInstr *Sink = DFG.addr<StmtNode *>(TransmitterId).Addr->getCode();
				auto GadgetSink = MaybeAddNode(Sink);
				// Add the gadget edge to the graph.
				Builder.addEdge(MachineGadgetGraph::GadgetEdgeSentinel,
				GadgetSource.first, GadgetSink.first);
				++GadgetCount;
				}
				};

				LLVM_DEBUG(dbgs() << "Analyzing def-use chains to find gadgets\n");
				// Analyze function arguments
				NodeAddr<BlockNode *> EntryBlock = DFG.getFunc().Addr->getEntryBlock(DFG);
				for (NodeAddr<PhiNode *> ArgPhi :
				EntryBlock.Addr->members_if(DataFlowGraph::IsPhi, DFG)) {
				NodeList Defs = ArgPhi.Addr->members_if(DataFlowGraph::IsDef, DFG);
				llvm::for_each(Defs, AnalyzeDef);
				}
				// Analyze every instruction in MF
				mattdr (Done): We analyze every def from every instruction in the function, but then also in `AnalyzeDefUseChain` analyze every def of every instruction with an interesting use. Are we doing a lot of extra work?
				craig.topper (Done): Was this answered somewhere?
				sconstab (Author, Done): Wow, big oversight on my part. @mattdr was correct that this was doing a LOT of extra work. I added a memoization scheme that remembers the instructions that may transmit for each def. The getGadgetGraph() routine now runs about 75% faster.
				for (NodeAddr<BlockNode *> BA : DFG.getFunc().Addr->members(DFG)) {
				for (NodeAddr<StmtNode *> SA :
				BA.Addr->members_if(DataFlowGraph::IsCode<NodeAttrs::Stmt>, DFG)) {
				MachineInstr *MI = SA.Addr->getCode();
				if (isFence(MI)) {
				MaybeAddNode(MI);
				++FenceCount;
				} else if (MI->mayLoad()) {
				NodeList Defs = SA.Addr->members_if(DataFlowGraph::IsDef, DFG);
				llvm::for_each(Defs, AnalyzeDef);
				}
				}
				}
				LLVM_DEBUG(dbgs() << "Found " << FenceCount << " fences\n");
				LLVM_DEBUG(dbgs() << "Found " << GadgetCount << " gadgets\n");
				if (GadgetCount == 0)
				return nullptr;
				NumGadgets += GadgetCount;

				// Traverse CFG to build the rest of the graph
				SmallSet<MachineBasicBlock *, 8> BlocksVisited;
				std::function<void(MachineBasicBlock *, GraphIter, unsigned)> TraverseCFG =
				[&](MachineBasicBlock *MBB, GraphIter GI, unsigned ParentDepth) {
				unsigned LoopDepth = MLI.getLoopDepth(MBB);
				if (!MBB->empty()) {
				// Always add the first instruction in each block
				auto NI = MBB->begin();
				auto BeginBB = MaybeAddNode(&*NI);
				Builder.addEdge(ParentDepth, GI, BeginBB.first);
				if (!BlocksVisited.insert(MBB).second)
				return;

				// Add any instructions within the block that are gadget components
				GI = BeginBB.first;
				while (++NI != MBB->end()) {
				auto Ref = NodeMap.find(&*NI);
				if (Ref != NodeMap.end()) {
				Builder.addEdge(LoopDepth, GI, Ref->getSecond());
				GI = Ref->getSecond();
				}
				}

				// Always add the terminator instruction, if one exists
				auto T = MBB->getFirstTerminator();
				if (T != MBB->end()) {
				auto EndBB = MaybeAddNode(&*T);
				if (EndBB.second)
				Builder.addEdge(LoopDepth, GI, EndBB.first);
				GI = EndBB.first;
				}
				}
				for (MachineBasicBlock *Succ : MBB->successors())
				TraverseCFG(Succ, GI, LoopDepth);
				};
				// ArgNodeSentinel is a pseudo-instruction that represents MF args in the
				// GadgetGraph
				GraphIter ArgNode = MaybeAddNode(MachineGadgetGraph::ArgNodeSentinel).first;
				TraverseCFG(&MF.front(), ArgNode, 0);
				std::unique_ptr<MachineGadgetGraph> G{Builder.get(FenceCount, GadgetCount)};
				LLVM_DEBUG(dbgs() << "Found " << G->nodes_size() << " nodes\n");
				return G;
				}
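
The memoization in `AnalyzeDefUseChain` can be illustrated outside of LLVM with a small sketch. It is not the pass's code: `ToyUse`, `ToyAnalysis`, and the integer ids are hypothetical stand-ins for RDF nodes, and the def-use graph is supplied by hand. It shows three things discussed in the inline comments above: the early return for a def that already has an entry in `Transmitters`, the empty entry that marks a def as analyzed even when it has no transmitters, and duplicate removal via sort plus `std::unique`.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one use of a def (not an RDF type).
struct ToyUse {
  int InstrId;                // instruction that consumes the def
  bool Transmits;             // e.g. the value is used as a memory address
  bool IsLoad;                // a transmitting load becomes its own source
  std::vector<int> ChildDefs; // defs produced by the using instruction
};

struct ToyAnalysis {
  std::map<int, std::vector<ToyUse>> UsesOfDef; // input: def id -> its uses
  std::map<int, std::vector<int>> Transmitters; // memo: def id -> sink instrs
  int Analyses = 0; // how many defs were analyzed rather than looked up

  const std::vector<int> &analyzeDef(int Def) {
    auto Memo = Transmitters.find(Def);
    if (Memo != Transmitters.end())
      return Memo->second; // already analyzed `Def`
    ++Analyses;
    // Insert an empty entry up front. It marks `Def` as analyzed even if no
    // transmitter is found, and it makes a cyclic chain terminate (in this
    // toy, results for defs on a cycle may then be incomplete).
    Transmitters.emplace(Def, std::vector<int>{});

    std::vector<int> Found;
    auto Uses = UsesOfDef.find(Def);
    if (Uses != UsesOfDef.end()) {
      for (const ToyUse &U : Uses->second) {
        if (U.Transmits) {
          Found.push_back(U.InstrId);
          if (U.IsLoad)
            continue; // the load is analyzed separately as a source
        }
        for (int Child : U.ChildDefs) {
          if (Child == Def)
            continue; // `Def` uses itself (e.g. a loop counter)
          const std::vector<int> &Inherited = analyzeDef(Child);
          Found.insert(Found.end(), Inherited.begin(), Inherited.end());
        }
      }
    }
    // Remove duplicate transmitters.
    std::sort(Found.begin(), Found.end());
    Found.erase(std::unique(Found.begin(), Found.end()), Found.end());
    return Transmitters[Def] = std::move(Found);
  }
};
```

If defs 1 and 3 both feed def 2, calling `analyzeDef(1)` and then `analyzeDef(3)` analyzes def 2 only once; the second call reuses the stored vector, which is the effect the memoization scheme is after.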

				bool X86LoadValueInjectionLoadHardeningPass::instrUsesRegToAccessMemory(
				const MachineInstr &MI, unsigned Reg) const {
				if (!MI.mayLoadOrStore() || MI.getOpcode() == X86::MFENCE ||
				MI.getOpcode() == X86::SFENCE || MI.getOpcode() == X86::LFENCE)
				return false;

				// FIXME: This does not handle pseudo loading instruction like TCRETURN*
				const MCInstrDesc &Desc = MI.getDesc();
				int MemRefBeginIdx = X86II::getMemoryOperandNo(Desc.TSFlags);
				if (MemRefBeginIdx < 0) {
				LLVM_DEBUG(dbgs() << "Warning: unable to obtain memory operand for loading "
				"instruction:\n";
				MI.print(dbgs()); dbgs() << '\n';);
				return false;
				}
				MemRefBeginIdx += X86II::getOperandBias(Desc);

				const MachineOperand &BaseMO =
				MI.getOperand(MemRefBeginIdx + X86::AddrBaseReg);
				const MachineOperand &IndexMO =
				MI.getOperand(MemRefBeginIdx + X86::AddrIndexReg);
				mattdr (Not Done): Worth a comment here that we don't need to worry about indirect branches (jmp to register) because elsewhere we prevent them from being generated.
				return (BaseMO.isReg() && BaseMO.getReg() != X86::NoRegister &&
				TRI->regsOverlap(BaseMO.getReg(), Reg)) ||
				(IndexMO.isReg() && IndexMO.getReg() != X86::NoRegister &&
				TRI->regsOverlap(IndexMO.getReg(), Reg));
				}

				bool X86LoadValueInjectionLoadHardeningPass::instrUsesRegToBranch(
				const MachineInstr &MI, unsigned Reg) const {
				if (!MI.isConditionalBranch())
				return false;
				for (const MachineOperand &Use : MI.uses())
				if (Use.isReg() && Use.getReg() == Reg)
				return true;
				mattdr (Done): It seems very weird to make this a template argument rather than just, like, a regular argument.
				craig.topper (Done): Agreed. I've removed the template argument and made it a static function instead of a method since it doesn't use anything from the class.
				return false;
				}

				INITIALIZE_PASS_BEGIN(X86LoadValueInjectionLoadHardeningPass, PASS_KEY,
				"X86 LVI load hardening", false, false)
				INITIALIZE_PASS_DEPENDENCY(MachineLoopInfo)
				INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree)
				INITIALIZE_PASS_DEPENDENCY(MachineDominanceFrontier)
				INITIALIZE_PASS_END(X86LoadValueInjectionLoadHardeningPass, PASS_KEY,
				"X86 LVI load hardening", false, false)

				FunctionPass *llvm::createX86LoadValueInjectionLoadHardeningPass() {
				return new X86LoadValueInjectionLoadHardeningPass();
				}
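
The `TraverseCFG` lambda in `getGadgetGraph()` condenses the machine CFG: each block contributes its first instruction, any instructions that are already graph nodes (gadget sources, sinks, fences), and its terminator, and every CFG edge carries a loop depth as its weight. The sketch below reproduces that shape on a hand-built CFG. All names (`ToyBlock`, `ToyEdge`, `CondensedCFG`, `condense`, `ArgNode`) are hypothetical, instructions are plain integer ids assumed to be unique across blocks, and blocks are assumed to be non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <vector>

constexpr int ArgNode = -1; // hypothetical sentinel for function arguments

struct ToyBlock {
  std::vector<int> Instrs; // instruction ids in program order (non-empty)
  bool HasTerminator;      // true if Instrs.back() is a terminator
  unsigned LoopDepth;      // loop nesting depth of this block
  std::vector<int> Succs;  // indices of successor blocks
};

struct ToyEdge {
  int From, To;
  unsigned Weight; // loop depth attached to the edge
};

struct CondensedCFG {
  std::set<int> Nodes;
  std::vector<ToyEdge> Edges;
};

// Builds the condensed graph, starting at block 0. `GadgetParts` holds the
// instructions that were already identified as gadget components or fences.
CondensedCFG condense(const std::vector<ToyBlock> &Blocks,
                      const std::set<int> &GadgetParts) {
  CondensedCFG G;
  G.Nodes = GadgetParts;
  G.Nodes.insert(ArgNode);
  std::set<int> Visited;
  std::function<void(int, int, unsigned)> Traverse =
      [&](int BlockIdx, int Prev, unsigned ParentDepth) {
        const ToyBlock &B = Blocks[BlockIdx];
        assert(!B.Instrs.empty() && "toy blocks must be non-empty");
        // Always add the first instruction, and link the predecessor to it.
        int First = B.Instrs.front();
        G.Nodes.insert(First);
        G.Edges.push_back({Prev, First, ParentDepth});
        if (!Visited.insert(BlockIdx).second)
          return; // block body was already condensed
        // Chain together the instructions that are already graph nodes.
        int Cur = First;
        for (std::size_t I = 1; I < B.Instrs.size(); ++I) {
          int Instr = B.Instrs[I];
          if (G.Nodes.count(Instr)) {
            G.Edges.push_back({Cur, Instr, B.LoopDepth});
            Cur = Instr;
          }
        }
        // Always add the terminator; link it only if it was not a node yet.
        if (B.HasTerminator) {
          int Term = B.Instrs.back();
          if (G.Nodes.insert(Term).second)
            G.Edges.push_back({Cur, Term, B.LoopDepth});
          Cur = Term;
        }
        for (int Succ : B.Succs)
          Traverse(Succ, Cur, B.LoopDepth);
      };
  if (!Blocks.empty())
    Traverse(0, ArgNode, 0);
  return G;
}
```

Instructions that are neither block boundaries nor gadget components never become nodes, which is what keeps the graph small compared with the full instruction-level CFG.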

llvm/lib/Target/X86/X86Subtarget.h

(431 lines hidden)	protected:
bool UseRetpolineExternalThunk = false;

/// Prevent generation of indirect call/branch instructions from memory,
/// and force all indirect call/branch instructions from a register to be
/// preceded by an LFENCE. Also decompose RET instructions into a
/// POP+LFENCE+JMP sequence.
bool UseLVIControlFlowIntegrity = false;

		/// Insert LFENCE instructions to prevent data speculatively injected into
		/// loads from being used maliciously.
		bool UseLVILoadHardening = false;

/// Use software floating point for code generation.
bool UseSoftFloat = false;

/// Use alias analysis during code generation.
bool UseAA = false;

/// The minimum alignment known to hold of the stack frame on
/// entry to the function and which must be maintained by every function.
(286 lines hidden)	public:
}
bool useIndirectThunkBranches() const {
return useRetpolineIndirectBranches() || useLVIControlFlowIntegrity();
}

bool preferMaskRegisters() const { return PreferMaskRegisters; }
bool useGLMDivSqrtCosts() const { return UseGLMDivSqrtCosts; }
bool useLVIControlFlowIntegrity() const { return UseLVIControlFlowIntegrity; }
		bool useLVILoadHardening() const { return UseLVILoadHardening; }

unsigned getPreferVectorWidth() const { return PreferVectorWidth; }
unsigned getRequiredVectorWidth() const { return RequiredVectorWidth; }

// Helper functions to determine when we should allow widening to 512-bit
// during codegen.
// TODO: Currently we're always allowing widening on CPUs without VLX,
// because for many cases we don't have a better option.
(160 lines hidden)

llvm/lib/Target/X86/X86TargetMachine.cpp

(79 lines hidden)	extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeX86Target() {
initializeX86ExecutionDomainFixPass(PR);
initializeX86DomainReassignmentPass(PR);
initializeX86AvoidSFBPassPass(PR);
initializeX86AvoidTrailingCallPassPass(PR);
initializeX86SpeculativeLoadHardeningPassPass(PR);
initializeX86SpeculativeExecutionSideEffectSuppressionPass(PR);
initializeX86FlagsCopyLoweringPassPass(PR);
initializeX86CondBrFoldingPassPass(PR);
		initializeX86LoadValueInjectionLoadHardeningPassPass(PR);
initializeX86LoadValueInjectionRetHardeningPassPass(PR);
initializeX86OptimizeLEAPassPass(PR);
initializeX86PartialReductionPass(PR);
}

static std::unique_ptr<TargetLoweringObjectFile> createTLOF(const Triple &TT) {
if (TT.isOSBinFormatMachO()) {
if (TT.getArch() == Triple::x86_64)
(395 lines hidden)
}
void X86PassConfig::addMachineSSAOptimization() {
addPass(createX86DomainReassignmentPass());
TargetPassConfig::addMachineSSAOptimization();
}

void X86PassConfig::addPostRegAlloc() {
addPass(createX86FloatingPointStackifierPass());
		addPass(createX86LoadValueInjectionLoadHardeningPass());
}

void X86PassConfig::addPreSched2() { addPass(createX86ExpandPseudoPass()); }

void X86PassConfig::addPreEmitPass() {
if (getOptLevel() != CodeGenOpt::None) {
addPass(new X86ExecutionDomainFix());
addPass(createBreakFalseDeps());
(55 lines hidden)

llvm/test/CodeGen/X86/O0-pipeline.ll

	(49 lines hidden)
	; CHECK-NEXT: MachineDominator Tree Construction
	; CHECK-NEXT: X86 EFLAGS copy lowering
	; CHECK-NEXT: X86 WinAlloca Expander
	; CHECK-NEXT: Eliminate PHI nodes for register allocation
	; CHECK-NEXT: Two-Address instruction pass
	; CHECK-NEXT: Fast Register Allocator
	; CHECK-NEXT: Bundle Machine CFG Edges
	; CHECK-NEXT: X86 FP Stackifier
				; CHECK-NEXT: MachineDominator Tree Construction
				; CHECK-NEXT: Machine Natural Loop Construction
				; CHECK-NEXT: Machine Dominance Frontier Construction
				; CHECK-NEXT: X86 Load Value Injection (LVI) Load Hardening
				zbrid (Done): Remove pass from name since that's typically the convention.
	; CHECK-NEXT: Fixup Statepoint Caller Saved
	; CHECK-NEXT: Lazy Machine Block Frequency Analysis
	; CHECK-NEXT: Machine Optimization Remark Emitter
	; CHECK-NEXT: Prologue/Epilogue Insertion & Frame Finalization
	; CHECK-NEXT: Post-RA pseudo instruction expansion pass
	; CHECK-NEXT: X86 pseudo instruction expansion pass
	; CHECK-NEXT: Analyze Machine Code For Garbage Collection
	; CHECK-NEXT: Insert fentry calls
	(22 lines hidden)

llvm/test/CodeGen/X86/O3-pipeline.ll

	(135 lines hidden)
	; CHECK-NEXT: Machine Optimization Remark Emitter
	; CHECK-NEXT: Greedy Register Allocator
	; CHECK-NEXT: Virtual Register Rewriter
	; CHECK-NEXT: Stack Slot Coloring
	; CHECK-NEXT: Machine Copy Propagation Pass
	; CHECK-NEXT: Machine Loop Invariant Code Motion
	; CHECK-NEXT: Bundle Machine CFG Edges
	; CHECK-NEXT: X86 FP Stackifier
				; CHECK-NEXT: MachineDominator Tree Construction
				; CHECK-NEXT: Machine Dominance Frontier Construction
				; CHECK-NEXT: X86 Load Value Injection (LVI) Load Hardening
	; CHECK-NEXT: Fixup Statepoint Caller Saved
				craig.topper (Not Done): I'm curious what happens if we add AU.setPreservesCFG() to getAnalysisUsage in FixupStatepointCallerSaved.cpp. From a quick look through that pass, it doesn't look like it changes the Machine CFG. PostRA Machine Sink already preserves CFG, so I think that should remove the dominator tree construction after PostRA machine sink.
	; CHECK-NEXT: PostRA Machine Sink
	; CHECK-NEXT: MachineDominator Tree Construction
	; CHECK-NEXT: Machine Natural Loop Construction
	; CHECK-NEXT: Machine Block Frequency Analysis
	; CHECK-NEXT: MachinePostDominator Tree Construction
	; CHECK-NEXT: Lazy Machine Block Frequency Analysis
	; CHECK-NEXT: Machine Optimization Remark Emitter
	; CHECK-NEXT: Shrink Wrapping analysis
	(48 lines hidden)

llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll

This file was added.

				; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -x86-lvi-load-dot-verify -o %t < %s | FileCheck %s

				; Function Attrs: noinline nounwind optnone uwtable
				define dso_local i32 @test(i32* %untrusted_user_ptr, i32* %secret, i32 %secret_size) #0 {
				entry:
				%untrusted_user_ptr.addr = alloca i32*, align 8
				%secret.addr = alloca i32*, align 8
				%secret_size.addr = alloca i32, align 4
				%ret_val = alloca i32, align 4
				%i = alloca i32, align 4
				store i32* %untrusted_user_ptr, i32** %untrusted_user_ptr.addr, align 8
				store i32* %secret, i32** %secret.addr, align 8
				store i32 %secret_size, i32* %secret_size.addr, align 4
				store i32 0, i32* %ret_val, align 4
				call void @llvm.x86.sse2.lfence()
				store i32 0, i32* %i, align 4
				br label %for.cond

				for.cond: ; preds = %for.inc, %entry
				%0 = load i32, i32* %i, align 4
				%1 = load i32, i32* %secret_size.addr, align 4
				%cmp = icmp slt i32 %0, %1
				br i1 %cmp, label %for.body, label %for.end

				for.body: ; preds = %for.cond
				%2 = load i32, i32* %i, align 4
				%rem = srem i32 %2, 2
				%cmp1 = icmp eq i32 %rem, 0
				br i1 %cmp1, label %if.then, label %if.else

				if.then: ; preds = %for.body
				%3 = load i32*, i32** %secret.addr, align 8
				%4 = load i32, i32* %ret_val, align 4
				%idxprom = sext i32 %4 to i64
				%arrayidx = getelementptr inbounds i32, i32* %3, i64 %idxprom
				%5 = load i32, i32* %arrayidx, align 4
				%6 = load i32*, i32** %untrusted_user_ptr.addr, align 8
				store i32 %5, i32* %6, align 4
				br label %if.end

				if.else: ; preds = %for.body
				%7 = load i32*, i32** %secret.addr, align 8
				%8 = load i32, i32* %ret_val, align 4
				%idxprom2 = sext i32 %8 to i64
				%arrayidx3 = getelementptr inbounds i32, i32* %7, i64 %idxprom2
				store i32 42, i32* %arrayidx3, align 4
				br label %if.end

				if.end: ; preds = %if.else, %if.then
				%9 = load i32*, i32** %untrusted_user_ptr.addr, align 8
				%10 = load i32, i32* %9, align 4
				store i32 %10, i32* %ret_val, align 4
				br label %for.inc

				for.inc: ; preds = %if.end
				%11 = load i32, i32* %i, align 4
				%inc = add nsw i32 %11, 1
				store i32 %inc, i32* %i, align 4
				br label %for.cond

				for.end: ; preds = %for.cond
				%12 = load i32, i32* %ret_val, align 4
				ret i32 %12
				}

				; CHECK: digraph "Speculative gadgets for \"test\" function" {
				; CHECK-NEXT: label="Speculative gadgets for \"test\" function";
				; CHECK: Node0x{{[0-9a-f]+}} [shape=record,color = green,label="{LFENCE\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JCC_1 %bb.6, 13, implicit killed $eflags\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{CMP32rm killed renamable $eax, %stack.2.secret_size.addr, 1, $noreg, 0, $noreg, implicit-def $eflags :: (dereferenceable load 4 from %ir.secret_size.addr)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JCC_1 %bb.4, 5, implicit killed $eflags\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $rax = MOV64rm %stack.1.secret.addr, 1, $noreg, 0, $noreg :: (dereferenceable load 8 from %ir.secret.addr)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm killed renamable $rax, 4, killed renamable $rcx, 0, $noreg :: (load 4 from %ir.arrayidx)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $rcx = MOVSX64rm32 %stack.3.ret_val, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.ret_val)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $rcx = MOV64rm %stack.0.untrusted_user_ptr.addr, 1, $noreg, 0, $noreg :: (dereferenceable load 8 from %ir.untrusted_user_ptr.addr)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{MOV32mr killed renamable $rcx, 1, $noreg, 0, $noreg, killed renamable $eax :: (store 4 into %ir.6)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $rax = MOV64rm %stack.1.secret.addr, 1, $noreg, 0, $noreg :: (dereferenceable load 8 from %ir.secret.addr)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{MOV32mi killed renamable $rax, 4, killed renamable $rcx, 0, $noreg, 42 :: (store 4 into %ir.arrayidx3)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $rcx = MOVSX64rm32 %stack.3.ret_val, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.ret_val)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $rax = MOV64rm %stack.0.untrusted_user_ptr.addr, 1, $noreg, 0, $noreg :: (dereferenceable load 8 from %ir.untrusted_user_ptr.addr)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm killed renamable $rax, 1, $noreg, 0, $noreg :: (load 4 from %ir.9)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,color = blue,label="{ARGS}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{MOV64mr %stack.0.untrusted_user_ptr.addr, 1, $noreg, 0, $noreg, killed renamable $rdi :: (store 8 into %ir.untrusted_user_ptr.addr)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JMP_1 %bb.5\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JMP_1 %bb.1\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.3.ret_val, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.ret_val)\n}"];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
				; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{RET 0, $eax\n}"];
				; CHECK-NEXT: }

				; Function Attrs: nounwind
				declare void @llvm.x86.sse2.lfence() #1

				attributes #0 = { "target-features"="+lvi-cfi"
				"target-features"="+lvi-load-hardening" }
				attributes #1 = { nounwind }
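
The CHECK lines above match Graphviz DOT text in which fences are green nodes, the `ARGS` node is blue, gadget edges are red and dashed, and CFG edges are labeled with a loop depth. As a rough illustration of that layout only, the sketch below prints text of the same shape from a hand-built graph. It does not use LLVM's graph-writing facilities; `DotNode`, `DotEdge`, `NodeKind`, and `emitDot` are hypothetical names, and nodes are numbered `Node0`, `Node1`, ... instead of the `Node0x<address>` names in the real output.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

enum class NodeKind { Instr, Fence, Args };

struct DotNode {
  std::string Label; // printed verbatim inside "{...}"
  NodeKind Kind;
};

struct DotEdge {
  int From, To;       // indices into the node vector
  bool IsGadget;      // true: source -> sink gadget edge
  unsigned LoopDepth; // label for CFG edges, ignored for gadget edges
};

// Prints each node followed by its outgoing edges.
std::string emitDot(const std::string &FnName,
                    const std::vector<DotNode> &Nodes,
                    const std::vector<DotEdge> &Edges) {
  for (const DotEdge &E : Edges) {
    assert(E.From >= 0 && static_cast<std::size_t>(E.From) < Nodes.size());
    assert(E.To >= 0 && static_cast<std::size_t>(E.To) < Nodes.size());
  }
  std::ostringstream OS;
  OS << R"(digraph "Speculative gadgets for \")" << FnName
     << R"(\" function" {)" << '\n';
  OS << "\tlabel=" << R"("Speculative gadgets for \")" << FnName
     << R"(\" function";)" << '\n';
  for (std::size_t I = 0; I < Nodes.size(); ++I) {
    OS << "\tNode" << I << " [shape=record,";
    if (Nodes[I].Kind == NodeKind::Fence)
      OS << "color = green,";
    else if (Nodes[I].Kind == NodeKind::Args)
      OS << "color = blue,";
    OS << R"(label="{)" << Nodes[I].Label << R"(}"];)" << '\n';
    for (const DotEdge &E : Edges) {
      if (static_cast<std::size_t>(E.From) != I)
        continue;
      OS << "\tNode" << E.From << " -> Node" << E.To;
      if (E.IsGadget)
        OS << R"([color = red, style = "dashed"];)" << '\n';
      else
        OS << "[label = " << E.LoopDepth << "];\n";
    }
  }
  OS << "}\n";
  return OS.str();
}
```

Feeding the returned string to Graphviz (for example via a `.dot` file) should render the same visual conventions the test checks for, although the node order and names will differ from what the pass prints.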

Diff 263262