PeepholeOpt knows how to fold loads into using instructions, but if we encounter an instruction with store semantics, we discard all candidates. This patch relaxes that slightly for memory accesses which are known to be invariant. This is one step towards a more general selective clearing based on aliasing, but doing the full aliasing scheme efficiently would be challenging.
There are two design questions which come up where I need input from reviewers:
- Currently, loads from constant global variables are not treated as invariant when queried from PeepholeOpt. This is because PeepholeOpt doesn't have access to AA, and the query function uses pointsToConstantMemory. I can tackle this in three ways: (a) add direct GV checking in the query routine, (b) set the associated flag in SelectionDAG, or (c) pipe through AA. (c) would be complicated, so I'm tentatively rejecting that. Out of (a) and (b), which is more consistent with the overall design? I've included both in the patch so that you can see what they look like.
- The patch as implemented waits until a fold barrier is encountered, and then selectively filters the candidate sets. Unfortunately, this makes the operation O(n^2) in the worst case. Other options would be to maintain a separate InvariantFoldSet - O(n) - or to simply query each operand at the using instruction - O(operands), but provides cross-block functionality. What would folks prefer to see?
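To make the first two options concrete, here is a minimal toy sketch in standalone C++. It is not the patch's code and does not use LLVM's types: `ToyInstr`, `survivorsFilterAtBarrier`, and `survivorsSeparateSets` are hypothetical names invented for illustration, and a "block" is just a vector of simplified instructions. It only shows that filtering at the barrier and keeping a separate invariant set leave the same fold candidates alive; the per-operand query option is not modeled.

```cpp
#include <set>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a machine instruction (not LLVM's MachineInstr).
struct ToyInstr {
  int DefReg;         // register defined by the instruction, -1 if none
  bool IsLoad;        // instruction is a foldable load candidate
  bool IsInvariant;   // load is known to read invariant memory
  bool IsFoldBarrier; // e.g. an instruction with store semantics
};

// Option 1: one candidate map, filtered each time a barrier is seen.
// Every barrier walks all current candidates.
std::set<int> survivorsFilterAtBarrier(const std::vector<ToyInstr> &Block) {
  std::unordered_map<int, bool> Candidates; // reg -> is invariant load?
  for (const ToyInstr &I : Block) {
    if (I.IsFoldBarrier) {
      for (auto It = Candidates.begin(); It != Candidates.end();) {
        if (It->second)
          ++It; // invariant loads survive the barrier
        else
          It = Candidates.erase(It); // everything else is dropped
      }
      continue;
    }
    if (I.IsLoad)
      Candidates[I.DefReg] = I.IsInvariant;
  }
  std::set<int> Result;
  for (const auto &KV : Candidates)
    Result.insert(KV.first);
  return Result;
}

// Option 2: invariant loads live in their own set, so a barrier only
// clears the ordinary set and never inspects individual candidates.
std::set<int> survivorsSeparateSets(const std::vector<ToyInstr> &Block) {
  std::set<int> FoldSet, InvariantFoldSet;
  for (const ToyInstr &I : Block) {
    if (I.IsFoldBarrier) {
      FoldSet.clear();
      continue;
    }
    if (I.IsLoad)
      (I.IsInvariant ? InvariantFoldSet : FoldSet).insert(I.DefReg);
  }
  std::set<int> Result = InvariantFoldSet;
  Result.insert(FoldSet.begin(), FoldSet.end());
  return Result;
}
```

In this toy model, a block consisting of a normal load, an invariant load, a barrier, and another normal load leaves the invariant load and the post-barrier load as candidates under both strategies.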
p.s. The current tests are all atomic loads, but there's nothing atomic specific about this. I'll add new tests, and rebase once the design questions are mostly settled.