This is an archive of the discontinued LLVM Phabricator instance.

Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]
ClosedPublic

Authored by sconstab on Mar 10 2020, 10:00 AM.

Details

Summary

Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph.

More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).

Also adds a new target feature to X86: +lvi-load-hardening

The feature can be added via the clang CLI using -mlvi-hardening.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Added help text for the CLI options

sconstab retitled this revision from Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/5] to Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6].Mar 16 2020, 9:30 AM
zbrid added a comment.Mar 18 2020, 5:34 PM

Thanks for putting this up! Here are a few comments.

llvm/lib/Target/X86/ImmutableGraph.h
2

Might be useful if you add a comment about what makes this a fast DAG impl in case someone may want to use it later.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
255

I think this should go at the top of the function.

272

Am I misunderstanding this comment? It sounds like if FixedLoads is true then BOTH fixed loads and non-fixed loads will be mitigated. Since runOnMachineFunction would call hardenLoads twice for non-fixed loads, would that result in double mitigation for non-fixed loads in the case where we also harden fixed loads? Unfortunately I'm having trouble reasoning through this myself, so I'd appreciate some clarification.

llvm/test/CodeGen/X86/O0-pipeline.ll
61

Remove pass from name since that's typically the convention.

sconstab updated this revision to Diff 251242.Mar 18 2020, 7:09 PM

Addressed Zola's comments.

sconstab marked 5 inline comments as done.Mar 18 2020, 7:10 PM
sconstab added inline comments.
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
272

The comment was incorrect.

zbrid accepted this revision.Apr 2 2020, 5:13 PM
zbrid added a subscriber: jyknight.

LGTM. I would prefer if an actual LLVM maintainer also gave LGTM. @jyknight, @george.burgess.iv, @craig.topper?

This revision is now accepted and ready to land.Apr 2 2020, 5:13 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2020, 1:03 PM
chandlerc added inline comments.Apr 3 2020, 4:41 PM
llvm/lib/Target/X86/ImmutableGraph.h
47–57

Folks, this isn't even close to following LLVM's coding conventions or naming conventions.

These violate the C++ standard.

This shouldn't have been landed as-is. Can you all back this out and actually dig into the review and get this to match LLVM's actual coding style and standards?

mattdr added a subscriber: mattdr.Apr 3 2020, 4:57 PM

Adding a few early style notes for the next round, but overall echo @chandlerc that this seems significantly outside of normal LLVM code.

llvm/lib/Target/X86/ImmutableGraph.h
27

It's sort of surprising that the LLVM style guide doesn't call this out explicitly, but #include guards are supposed to include the full file path. If they just used the filename, like, this, files with the same name in different paths would collide.

For an example of the expected style, see an adjacent header in this directory: https://github.com/llvm/llvm-project/blob/ba8b3052b59ebee4311d10bee5209dac8747acea/llvm/lib/Target/X86/X86AsmPrinter.h#L10

319

As a general rule new is a code-smell in modern C++. This should be a vector.

336

this should return a unique_ptr to signal ownership transfer

craig.topper added inline comments.Apr 3 2020, 5:10 PM
llvm/lib/Target/X86/ImmutableGraph.h
47–57
craig.topper reopened this revision.Apr 3 2020, 5:15 PM
This revision is now accepted and ready to land.Apr 3 2020, 5:15 PM
craig.topper commandeered this revision.Apr 3 2020, 5:15 PM
craig.topper updated this revision to Diff 254962.
craig.topper edited reviewers, added: sconstab; removed: craig.topper.
craig.topper marked an inline comment as done.

Fix include guard on ImmutableGraph.h

Remove underscores from names.

craig.topper planned changes to this revision.Apr 3 2020, 8:51 PM
sconstab added inline comments.Apr 3 2020, 9:42 PM
llvm/lib/Target/X86/ImmutableGraph.h
319

@mattdr I do agree with the general rule. I also think that in this case where the structure is immutable, std::vector is wasteful because it needs to keep separate values for the current number of elements and the current capacity. At local scope within a function the unneeded value would likely be optimized away, but then there would be an awkward handoff to transfer the data from the vector to the array members.

I would not want to see the array members changed to vectors, unless LLVM provides an encapsulated array structure that does not need to grow and shrink.

336

Yes, agree.

Use std::unique_ptr for arrays in the graph. I started trying to use std::vector, but it kept crashing. Which initially I thought was some issue with the fact that we store pointers into the vectors in other places and that somehow std::move of the vector was breaking that. I think I now realize its because the array is 1 larger than the real number of nodes and my std::vector version didn't take that into account. But thinking about it more, since we are storing pointers into the array, it probably makes more sense for it to just be a plain array than a vector since resizing a vector would invalidate those pointers. Using an array makes it more clear than we don't resize.

This revision is now accepted and ready to land.Apr 3 2020, 11:59 PM

Use unique_ptr::operator[] in a few places.

mattdr added inline comments.Apr 4 2020, 2:06 AM
llvm/lib/Target/X86/ImmutableGraph.h
14

erm, "terrific"? If there's a substantive argument w.r.t. cache locality etc., please make it explicit.

17

"extraordinarily" is, again, not a useful engineering categorization. Please restrict comments to describing quantifiable claims of complexity.

41

Every template argument for a class represents combinatorial addition of complexity for the resulting code. Why do each of these template arguments need to exist? in particular, why does SizeT need to exist?

319

So, first: I'm glad you removed the unnecessary use of new[] here and the corresponding (and error-prone!) use of delete[] later. That removes a memory leak LLVM won't have to debug.

You suggest here that something other than std::vector would be more efficient. If so, would std::array suffice? If not, can you explain why static allocation is impossible but dynamic allocation would be too expensive?

-Add edge_begin()/edge_end()/edges() to Node class. Hides the N+1 trick used to find the end of a Node's edges.
-Add nodes()/edges() and use range-based for loops.
-Stop using things in the traits class since it doesn't have range-based for loops.
-Const-correct as required since nodes()/edges() return an ArrayRef that ends up making the range for loops const.
-Use llvm::for_each instead of std::for_each.
-Rename Node::value()/Edge::value() to getValue() to align with llvm naming convention.

sconstab added inline comments.Apr 4 2020, 8:53 AM
llvm/lib/Target/X86/ImmutableGraph.h
14

This is valid. I will reword.

17

AFAIK there is not a precise engineering term for "tiny O(1)." Nonetheless I will reword.

41

I suspect that there may be more uses for this data structure and that eventually it may migrate to ADT. I have SizeT as a template argument because I found it plenty sufficient to have int as the size parameter for the array bounds, but I suspect other uses may require size_t.

319

A statically sized array (e.g., std::array) is insufficient because the size in this case is not compiler determinable; a dynamically sized and dynamically resizable array (e.g., std::vector) is sufficient but overly costly; a dynamically sized and dynamically unresizable array is sufficient and has minimal cost.

sconstab added inline comments.Apr 4 2020, 9:26 AM
llvm/lib/Target/X86/ImmutableGraph.h
286

@craig.topper It now occurs to me that these fields should probably be reordered to:

std::unique_ptr<Node[]> Nodes;
std::unique_ptr<Edge[]> Edges;
size_type NodesSize;
size_type EdgesSize;

The current ordering will cause internal fragmentation.

Old ordering:

static_assert(sizeof(ImmutableGraph<T, V>) == 32);

New ordering:

static_assert(sizeof(ImmutableGraph<T, V>) == 24);

With vectors instead of arrays:

static_assert(sizeof(ImmutableGraph<T, V>) == 48);

Overall, the restyling by @craig.topper looks much better than what I had committed before. I agree that std::unique_ptr<T *> is the right "container" in this circumstance. And the addition of ArrayRef<> accessors is also a nice touch. A few extra inline comments.

llvm/lib/Target/X86/ImmutableGraph.h
14

"Iteration and traversal operations benefit from cache locality."

17

"Operations on sets of nodes/edges are efficient, and representations of those sets in memory are compact. For instance..."

85

After the members are reordered, this list must also be reordered.

104

This had not occurred to me until now, but a lot of code is shared between NodeSet and EdgeSet. Maybe a template could reduce the redundancy?

sconstab added inline comments.Apr 4 2020, 1:26 PM
llvm/lib/Target/X86/ImmutableGraph.h
308

Just noticed that ImmutableGraphBuilder and ImmutableGraph have non-identical types called NodeRef. Suggest renaming this one to BuilderNodeRef.

craig.topper marked 2 inline comments as done.Apr 4 2020, 2:05 PM
craig.topper added inline comments.
llvm/lib/Target/X86/ImmutableGraph.h
286

I noticed that too. I just didn't focus on it since we only ever one in memory at a time. I'll change in my next update.

308

NodeRef is in the Traits class not the ImmutableGraph, but I will rename the builder one.

craig.topper marked 2 inline comments as done.Apr 4 2020, 2:17 PM
craig.topper added inline comments.
llvm/lib/Target/X86/ImmutableGraph.h
330

Can this be changed to VI < VertexSize?

370

I think I'll change this to llvm::count_if. Also there was previously a conditional here that made sure the distance between edges was >0, but it didn't seem necessary. Please let me know if there's a reason I should put that back

-Apply updates to comments.
-Use nodes()/edges() to implement nodes_begin/end and edges_begin/end to simplify the code a little
-Reorder fields in the Graph class.

-Add methods to get the index of a Node or Edge to remove calls to std::distance in various places

craig.topper requested review of this revision.Apr 6 2020, 10:18 AM
mattdr added inline comments.Apr 7 2020, 3:28 AM
llvm/lib/Target/X86/ImmutableGraph.h
17

"spatial"

41

I think this self-reference to ImmutableGraph dropped the SizeT parameter.

73

Seems like you also want to add a comment here that we know we will never be asked for edges_end for the last stored node -- that is, we know that this + 1 always refers to a valid Node (which is presumably a dummy/sentinel)

79

Why "protected" rather than "private"? Usually seeing "protected" makes me think subclassing is expected, but that doesn't seem to be the case here.

117

How do we know that a value of size_type (aka SizeT) can be cast to unsigned without truncation?

298

this will also break if a non-default SizeT is provided. Maybe a good argument to just leave out SizeT for now, and it can be added in the future as needed?

319

I'm not sure we allocate enough of these in the course of a compilation for the one extra word in a std::vector to matter, but I won't press the point.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
113

Cleaner how?

233

If the user requests hardening and we can't do it, it seems better to fail loudly so they don't accidentally deploy an unmitigated binary.

Summary points for @craig.topper who has commandeered this diff:

  • fix the typo that Matt pointed out
  • SizeT should not be a template parameter, and size_type should be fixed to int.
  • Maybe have a member reference in MachineGadgetGraph to the associated MachineFunction.
  • Determine how this pass (and other X86 machine passes) should fail on unsupported X86 subtargets.
llvm/lib/Target/X86/ImmutableGraph.h
41

Yup. Good catch.

73

Not sure I agree. I cannot think of a conventional use of this interface that would perform an operation on the sentinel.

G->nodes_end().edges_end(); // invalid use of any end iterator
SomeNode.edges_end(); // invalid if SomeNode == G->nodes_end()

That is, the way that we "know" that we will never be asked for edges_end() for the last stored node is that the ask itself would already violate C++ conventions.

79

The MachineGadgetGraph class actually does subclass ImmutableGraph to add some contextual information. I did not want the constructors for ImmutableGraph to be public, because the intent is to use the builder. So protected seemed like the best option.

117

Ah. We do not know that. We could have a static assert here, but maybe the best thing to do would be to follow Matt's earlier advice and fix size_type to int, rather than have it as a template parameter. Anything larger would break the BitVectors and/or waste space.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
113

Maybe by keeping a member reference to the associated MachineFunction?

233

@craig.topper I think this is related to the discussion we were having about what would happen for SLH on unsupported subtargets. I'm not sure what the most appropriate solution would be.

mattdr added a comment.Apr 7 2020, 3:09 PM

Some more comments. FWIW, I'm doing rounds of review as I can in some evening quiet or during my son's nap. This is a huge change and it's really hard to get any part of it into my head at once in a reasonable amount of time.

llvm/lib/Target/X86/ImmutableGraph.h
73

I believe any operation on the last Node in the array will end up accessing the sentinel:

Node* LastNode = G->nodes_begin() + (G->nodes_size() - 1);  // valid reference to the last node
LastNode->edges_end();  // uses `this+1`, accessing the sentinel value in the Nodes array
79

Ah, I missed that. I searched through the file for public ImmutableGraph and didn't find it because MachineGadgetGraph uses the default inheritance specifier.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
83–85

Please replace these with constants or functions.

113

Let's put that in the comment instead.

mattdr added a comment.Apr 7 2020, 3:18 PM

I'll wait for current comments to be addressed before doing my next round here.

sconstab added inline comments.Apr 7 2020, 3:30 PM
llvm/lib/Target/X86/ImmutableGraph.h
73

G->nodes_size() will return the size without the sentinel node, so your example should actually operate on the last data node. Right?

Address review comments

craig.topper marked an inline comment as done.Apr 8 2020, 11:16 AM
craig.topper added inline comments.
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
83–85

Oops forgot to do this one.

craig.topper marked 37 inline comments as done and an inline comment as not done.Apr 8 2020, 11:25 AM
craig.topper added inline comments.
llvm/lib/Target/X86/ImmutableGraph.h
41

Removed the template argument

73

Comment added to clarify the extra node was allocated.

117

Removed the template argument

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
113

I found a way to remove the getMF method entirely.

233

Added a fatal error. Which isn't great as it will generate a crash report in clang. But it will tell the user to file a compiler bug so I guess that's something.

craig.topper marked 5 inline comments as done.

-Replace ARG_NODE and GADGET_EDGE defines with static constexpr members

-Put llvm:: on the for_each calls in this patch instead of D75937

sconstab added inline comments.Apr 9 2020, 6:45 PM
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
233

Would it be better to have

report_fatal_error("LVI load hardening is only supported on 64-bit "
                   "targets.", false);

So that the crash diagnostic is not generated?

@craig.topper can you please update/rebase this stack to remove the early LVI CFI work that ended up landed in https://reviews.llvm.org/D76812 instead?

mattdr accepted this revision.Apr 13 2020, 4:53 PM

Accepting modulo some comments to be addressed. Most of my review effort was spent on the data structure and algorithm employed as well as code style and readability.

I am least confident about my understanding of instrAccessesStackSlot and the other functions that make up instrIsFixedAccess, since each function seems to be pattern-matching on something very specific without a reference to why. I also don't know if this diff provides an exhaustive list of fixed loads, or indeed if it was intended to.

llvm/lib/Target/X86/ImmutableGraph.h
103

Worth adding a comment for this (and getEdgeIndex) that this will crash if the Node (Edge) provided is not a reference into this specific instance of ImmutableGraph.

104

Ideally I agree we'd find a way to collapse these -- but for this diff, let's content ourselves with a FIXME comment to that effect.

341

This if is unnecessary

349

Technically a "generic" graph, so we should leave out "Gadget" here

367

Two comments here would make the code significantly easier to understand:

  1. Note that we're using .size() here rather than .count(), so we're actually iterating over *all* Node indices, not just the ones to be trimmed
  2. The TrimmedNodes vector maps indices in the original NodeSet to the number of Nodes before that index that have been trimmed by that index, to allow later code to map elements to their new position in a dense array with the trimmed items removed
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
57

This is another case where references to good documentation will go a long way. Without details about what the tradeoff is and how to reason about it, it doesn't seem like anyone should use this flag.

237

Each call to hardenLoads leads to a call to buildGadgetGraph. A *lot* of the work that getGadgetGraph does seems to be common between mitigating fixed and non-fixed loads -- for example, computing register dataflow and liveness over the entire function.

And calling hardenLoads twice looks to be the common case, since NoFixedLoads is false by default.

Could we make this pass about half as expensive by default by combining these two calls to hardenLoads into one? It would do the expensive work once, then either harden _all_ loads or _only_ non-fixed loads.

323

Please add a comment explaining the semantics of the boolean return here. I think it's: true if we need to consider defs of this instruction tainted by this use (and therefore add them for analysis); false if this instruction consumes its use

330

Why is it okay to assume that a call doesn't propagate its uses to defs? Is it because we can assume the CFI transform is already inserting an LFENCE? Whatever the reason, let's state it explicitly here

342

Some more detail would be useful here: precise about what? What are the likely errors?

410

We analyze every def from every instruction in the function, but then also in AnalyzeDefUseChain analyze every def of every instruction with an interesting use. Are we doing a lot of extra work?

494

Worth a comment here that we don't need to worry about indirect branches (jmp to register) because elsewhere we prevent them from being generated

507

It seems very weird to make this a template argument rather than just, like, a regular argument.

This revision is now accepted and ready to land.Apr 13 2020, 4:53 PM

Address some of the review comments. Primarily the ones in ImmutableGraph. I did de-templatize the method in X86LoadValueInjectionLoadHardening.cpp

craig.topper marked 6 inline comments as done.Apr 14 2020, 1:25 PM
craig.topper added inline comments.
llvm/lib/Target/X86/ImmutableGraph.h
367

I've stopped using TrimmedNodes.size() and TrimEdges.size() in favor of the size methods from the graph which should make things more obvious.

I renamed TrimmedNodes to RemappedNodeIndex and stored the new index rather than the adjustment needed. I'm also changed it to walk nodes instead of indices so we don't have to translate to Node to make the contains call.

I also removed the NewNumEdges count_if and the if statement around the edge loop from the loop below. I don't think that provided any value and just complicated the code.

llvm/lib/Target/X86/X86ISelLowering.cpp
8626 ↗(On Diff #257465)

Oops this snuck in from something else I was experimenting with in my repo earlier. I'll remove.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
507

Agreed. I've remove the template argument and made it a static function instead of a method since it doesn't use anything from the class.

craig.topper marked an inline comment as done.

Fix some mistakes I made in the previous upload where I accidentally deleted the pipeline tests instead of updating them.

Merge in test case from D77431 since it logically belongs with these changes.

mattdr marked an inline comment as done.Apr 14 2020, 10:42 PM
mattdr added inline comments.
llvm/lib/Target/X86/ImmutableGraph.h
367

Many thanks! These changes make the code much more accessible.

sconstab commandeered this revision.Apr 16 2020, 12:11 PM
sconstab edited reviewers, added: craig.topper; removed: sconstab.
sconstab updated this revision to Diff 260439.Apr 27 2020, 1:15 PM

Removed the -x86-lvi-no-fixed CLI flag. This change simplifies the code flow quite a bit.

craig.topper added inline comments.Apr 28 2020, 3:26 PM
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
323

Was this comment addressed?

342

Was this answered somewhere?

410

Was this answered somewhere?

sconstab updated this revision to Diff 261968.May 4 2020, 5:10 PM
sconstab marked 9 inline comments as done.

Addressed the previously unaddressed comments, as pointed out by @craig.topper.

sconstab added inline comments.May 4 2020, 5:10 PM
llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
323

It had not been addressed, so thank you for pointing this out. That lambda was doing too many things at once, which made it more confusing than it needed to be. So I just inlined it in the

for (auto N : Uses) {
…
}

loop, and I added some additional clarifying comments.

342

This was referring to the use of mayLoad(). At the time I wrote that comment, I wasn't sure that mayLoad() was exactly what was needed there, but I now think that it does suffice (SLH also uses MachineInstr::mayLoad()).

410

Wow, big oversight on my part. @mattdr was correct that this was doing a LOT of extra work. I added a memoization scheme that remembers the instructions that may transmit for each def. The getGadgetGraph() routine now runs about 75% faster.

mattdr added a comment.May 7 2020, 1:57 PM

Calling special attention to the comment at line 341, since I think it affects the correctness of the pass.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
288

This comment doesn't seem to match how the map is used -- it looks like the loop assumes a def has been analyzed iff it is present in the map. This matches my expectation that, if a def is present and maps to an empty list, it would meant the def had been analyzed and found not to transmit.

296

fwiw, this code would be easier to understand if we didn't shadow Def with another variable named Def.

322–323

"current def" is a bit ambiguous here. I _believe_ it means AnalyzeDef's Def argument? At least, that's the interpretation that makes the comment make sense since UsesVisited is in AnalyzeDef's scope.

332

Copying a comment from a previous iteration:

Why is it okay to assume that a call doesn't propagate its uses to defs? Is it because we can assume the CFI transform is already inserting an LFENCE? Whatever the reason, let's state it explicitly here

342

The comment doesn't match the loop, which is traversing over Uses. More importantly, though: why are we allowed to stop traversing through Uses here? This Def won't be analyzed again, so this is our only chance to enumerate all transmitters to make sure we have all the necessary source -> sink edges in the gadget graph.

365–366

This is also the place we populate Transmitters (with a default-constructed vector) for the current def if we haven't otherwise found any transmits. That's good, and necessary for Transmitters to remember we've analyzed the current def. But we should leave a comment about this subtle load-bearing side-effect.

367–370

Should Transmitters map to an llvm::SmallSet?

sconstab updated this revision to Diff 262816.May 7 2020, 11:32 PM
sconstab marked 9 inline comments as done.

Addressed comments by @mattdr.

Several comments in the code have been updated, but the code has not changed.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
296

Changed the outer def to SourceDef, which also seems to make the code after the lambda a lot clearer.

322–323

I am now trying to be clearer by using capital-d "Def" to refer specifically to the def that is being analyzed, and lower-case-d "def" to refer to any other defs. Do you think this is better? Good enough?

332

Added clarification to the comment.

342

@mattdr I think that the code is correct, and I added more to the comment in an attempt to clarify. Let me know if you still think that this is an issue.

367–370

In my testing, std::vector seems a bit faster than llvm::SmallSet. I also suspect that llvm::SmallSet may waste more space because many defs will have no transmitters.

mattdr accepted this revision.May 8 2020, 12:17 AM
mattdr marked 2 inline comments as done.

The extra comments and the new variable name are all helpful. Thanks again.

llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
322–323

Much better. Thank you for the change!

342

I definitely misread continue as break here. Thanks for the extra clarity and sorry for the noise.

This revision was automatically updated to reflect the committed changes.
nikic added a subscriber: nikic.May 12 2020, 11:41 AM

This change causes a 0.8% compile-time regression for unoptimized builds. Based on the pipeline test diffs, I expect this is because the new pass requests a bunch of analyses, which it most likely (LVI load hardening disabled) will not need. Would it be possible to compute the analyses only if LVI load hardening is actually enabled?

This change causes a 0.8% compile-time regression for unoptimized builds. Based on the pipeline test diffs, I expect this is because the new pass requests a bunch of analyses, which it most likely (LVI load hardening disabled) will not need. Would it be possible to compute the analyses only if LVI load hardening is actually enabled?

@craig.topper Do you have any ideas on how this could be done?

This change causes a 0.8% compile-time regression for unoptimized builds. Based on the pipeline test diffs, I expect this is because the new pass requests a bunch of analyses, which it most likely (LVI load hardening disabled) will not need. Would it be possible to compute the analyses only if LVI load hardening is actually enabled?

@craig.topper Do you have any ideas on how this could be done?

Unfortunately due to LTO the need for LVI hardening is carried as a function attribute. The pass manager system doesn't allow for running different passes per function. So I don't have any good ideas of how to do that.

craig.topper added inline comments.May 12 2020, 12:24 PM
llvm/test/CodeGen/X86/O3-pipeline.ll
141–145

I'm curious what happens if we add AU.setPreservesCFG() to getAnalysisUsage in FixupStatepointCallerSaved.cpp From a quick look through that pass it doesn't look like it changes the Machine CFG.

PostRA Machine Sink already preserves CFG. So I think that should remove the dominator tree construction after PostRA machine sink.

I'm no expert in the pass manager, but given the very targeted applicability of LVI it definitely seems like our goal should be 0% impact for the vast majority of compilations that don't concern themselves with it.

Is there a way to require the pass be enabled for the compiler invocation as well as for the subtarget? If there's a mismatch (where LVI is desired but the required analyses weren't done) we can fail the compile.

nikic added a comment.May 13 2020, 9:04 AM

This change causes a 0.8% compile-time regression for unoptimized builds. Based on the pipeline test diffs, I expect this is because the new pass requests a bunch of analyses, which it most likely (LVI load hardening disabled) will not need. Would it be possible to compute the analyses only if LVI load hardening is actually enabled?

@craig.topper Do you have any ideas on how this could be done?

Unfortunately due to LTO the need for LVI hardening is carried as a function attribute. The pass manager system doesn't allow for running different passes per function. So I don't have any good ideas of how to do that.

Hm, I see. One possibility would be to make those analyses lazy, but that's a larger change.

Possibly a pragmatic choice would be to not support this feature at O0? It does not seem relevant for non-production binaries. The relative impact of a couple unnecessary analysis passes is much higher at O0 than it is at O3.