This is an archive of the discontinued LLVM Phabricator instance.

[SelectOpti] Restrict load sinking
ClosedPublic

Authored by apostolakis on Sep 15 2022, 5:28 PM.

Details

Summary

This is a follow-up to D133777, which resolved a use-after-free case but
did not cover all possible memory bugs due to misplacement of loads.

In short, the overall problem was that sinked loads could be moved after
state-modifying instructions leading to memory bugs.

The solution is to restrict load sinking unless it is found to be sound.
i) Within a basic block (to-be-sinked load and select-user are in the same BB),
loads can be sinked only if there is no intervening state-modifying instruction.
This is a conservative approach to avoid resorting to alias analysis to detect
potential memory overlap.
ii) Across basic blocks, sinking of loads is avoided. This is because going over
multiple basic blocks looking for memory conflicts could be computationally
expensive and also unlikely to allow loads to sink. Further, experiments showed
that not sinking these loads has a slight positive performance effect.
Maybe for some of these loads, having some separation allows enough time
for the load to be executed in time for its user. This is not the case for
floating point operations that benefit more from sinking.

The solution in D133777 was essentially undone in this patch,
since the latter is a complete solution to the observed problem.

Overall, the performance impact of this patch is minimal.
Tested on two internal Google workloads with instrPGO.
Search application showed <0.05% perf difference,
while the database one showed a slight improvement,
but not statistically significant.

Diff Detail

Event Timeline

apostolakis created this revision.Sep 15 2022, 5:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2022, 5:28 PM
apostolakis requested review of this revision.Sep 15 2022, 5:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2022, 5:28 PM
apostolakis edited the summary of this revision. (Show Details)Sep 15 2022, 5:58 PM
apostolakis edited the summary of this revision. (Show Details)Sep 15 2022, 6:03 PM
apostolakis edited the summary of this revision. (Show Details)

Typo in comment

apostolakis edited the summary of this revision. (Show Details)Sep 15 2022, 7:40 PM
davidxl added inline comments.Sep 15 2022, 9:19 PM
llvm/test/CodeGen/X86/select-optimize.ll
247

why is this unsafe to sink?

apostolakis added inline comments.Sep 15 2022, 9:34 PM
llvm/test/CodeGen/X86/select-optimize.ll
247

it is not unsafe to sink this load but the current version of the transformation does not check for sink-safety if such loads from other basic blocks and thus they should not be sunk.
See patch description for why cross-basic-block load moves were avoided, even though there might be cases such as this one where sinking would have been safe.

davidxl added inline comments.Sep 16 2022, 8:28 AM
llvm/lib/CodeGen/SelectOptimize.cpp
713

In most of the cases, the load and sinkpt are not in the same bb right -- as the case in the test case I commented.

apostolakis added inline comments.Sep 16 2022, 11:31 AM
llvm/lib/CodeGen/SelectOptimize.cpp
713

Actually, most of the sinkable loads are in the same basic block.
For a search workload, only 2.8% of the sinkable loads were in a different BB, while the rest (i.e., the vast majority, 97.2%) were in the same BB as the sinkpt.
The intuition here is that sinkable loads, as defined in this optimization, are loads that have a single-use (only used for the purpose of computing the select operand). Thus, it is expected that they will be close-by in the same BB as the select, i.e., their use.
If they are not in the same BB, there might be a good reason for that. Either there is a state-modifying and aliasing instruction in-between that prevents them for moving, or due to some other optimization (e.g., for both reported bugs, the loaded stack objects were freed early).

Note also that for the search workload, out of all the sinkable loads in the same BB, only 3 load-sinks (0.06% of sinkable loads) were conservatively skipped due to the check for potential (although non-aliasing in these cased) state-modifying instructions in line 717. So, even for same-BB loads, the impact of this transformation seems minimal. This also explains how rare it was for an actual memory bug to occur.

These statistics also justify the small perf impact of this patch

apostolakis added inline comments.Sep 16 2022, 11:38 AM
llvm/lib/CodeGen/SelectOptimize.cpp
713

Oh I now realized that your comment was on this check within isSafeToSinkLoad, and not about skipping cross-BB load sinks in line 760, although the provided statistics should provide more clarity overall.

Regarding this specific check: isSafeToSinkLoad function is only called for cases where both LoadI and SinkPt are in the same BB (see callsite in line 762). Additionally, the iteration here will only work properly for intra-BB searches, so to prevent any misuse it returns conservatively false.

Actually, re-reading this check I could have just called this function and avoid the if-statement in line 760

Ok -- my confusion was the naming of the SinkPt variable -- it should actually be named 'SI'. I through the sinkpt is the actual bb after select is expanded into branch.

Ok -- my confusion was the naming of the SinkPt variable -- it should actually be named 'SI'. I through the sinkpt is the actual bb after select is expanded into branch.

Oh I see. Naming it as sink-point was indeed confusing. I will change it.

davidxl added inline comments.Sep 16 2022, 12:52 PM
llvm/test/CodeGen/X86/select-optimize.ll
203

For sunkable loads, should the lifetime marker still be preserved?

apostolakis added inline comments.Sep 16 2022, 12:56 PM
llvm/test/CodeGen/X86/select-optimize.ll
203

The lifetime-marker intrinsic writes to memory and thus there will not be any sunk loads bypassing lifetime markers.
This is why this patch also covers the use-after-free resolved by D133777.

davidxl added inline comments.Sep 16 2022, 1:02 PM
llvm/test/CodeGen/X86/select-optimize.ll
203

Can you add another case with lifemarker and check if it is not sunk?

apostolakis added inline comments.Sep 16 2022, 1:03 PM
llvm/test/CodeGen/X86/select-optimize.ll
203

Sure.

Add test with lifetime-marker, rename SinktPt to SI to avoid
confusion, and remove redundant check for same/different BB.

davidxl accepted this revision.Sep 16 2022, 1:32 PM

lgtm

This revision is now accepted and ready to land.Sep 16 2022, 1:32 PM
This revision was landed with ongoing or failed builds.Sep 16 2022, 1:55 PM
This revision was automatically updated to reflect the committed changes.