Recognize invariants which specify alignment restrictions on pointers, and use this to strengthen alignment requirements where possible.
Diff Detail
Event Timeline
This is a rebased version of this pass, slightly simplified and with more comments (and a correctness fix so that it actually checks for dominance of the pointer use by the invariant). Generally, it is a ScalarEvolution-powered transformation that updates load, store and memory intrinsic pointer alignments based on invariant((a+q) & b == 0) expressions.
Many of the simple cases we can get with the computeKnownBits patch (http://reviews.llvm.org/D4490), but we still need something like this for the more complicated cases (such as those with an offset) that require some algebra. Note that gcc's __builtin_assume_aligned's optional third argument provides exactly for this kind of 'misalignment' offset.
FYI, I'm OK with this, but I don't understand SCEV. Someone who does should look at it as well because my eyes juts glazed over there.
lib/Transforms/Scalar/AlignmentFromInvariants.cpp | ||
---|---|---|
65–70 ↗ | (On Diff #11593) | nit: indent to "(", maybe via clang-format? |
Now uses AssumptionTracker too (although it was already pretty fast, just a linear scan, for the assumption finding, now it can be even faster).
Why does this pass run late? Is it because it's only intended to prepare for ISEL? What happens if AlignmentFromAssumptions runs early? Are there transformations that don't preserve load alignment info after you generate it? Unless there's a good reason, it should run before passes that require SCEV or after passes that preserve SCEV. There may have been a thread on this that I missed/forgot, but I think the motivation/justification should be commented somewhere, like the file header.
AlignmentFromAssumptions::runOnFunction is giant. I like it to see the full scope of a loop within a single page of code and like the lifetime of local variables to be obvious. The logic to deduce alignment from llvm.assume can be factored out, as well as the logic for processing each use on the worklist.
The NewAlignment for MemTransferInst logic is a little hard to follow. Can you just have something roughly like this:
NewDestAlign = max(OldDestAlign, getNewAlignment(..., MI->getDest())) NewSrcAlign = max(OldSrcAlign, getNewAlignment(..., MI->getSource())) NewAlign = min(NewDestAlign, NewSrcAlign).
When you have a MemInstrinsic but know it's not MemTransfer, you set the alignment unconditionally. I think you should explicitly check for MemSetInst instead.
I think you are using SCEV primarily to determine whether two induction variables have a constant difference. e.g.
const SCEV *DiffSCEV = SE->getMinusSCEV(PtrSCEV, AASCEV);
That's fine, but I would avoid overusing SCEV. For example, getNewAlignmentDiff() would be much more efficient operating on constants. Is there any value in using SCEV do perform arithmetic in that case. In general, only use SCEV if you need to reason about induction variables.
I didn't review isValidAssumeForContext yet, because I need to find the patch it is in.
lib/Transforms/Scalar/AlignmentFromAssumptions.cpp | ||
---|---|---|
45 | I never indent within namespace. I don't see any value in it. |
- Original Message -----
From: "Andrew Trick" <atrick@apple.com>
To: hfinkel@anl.gov, chandlerc@gmail.com
Cc: atrick@apple.com, gribozavr@gmail.com, llvm-commits@cs.uiuc.edu
Sent: Thursday, July 31, 2014 9:31:27 PM
Subject: Re: [PATCH] Alignment assumptions using invariantsWhy does this pass run late? Is it because it's only intended to
prepare for ISEL? What happens if AlignmentFromAssumptions runs
early? Are there transformations that don't preserve load alignment
info after you generate it? Unless there's a good reason, it should
run before passes that require SCEV or after passes that preserve
SCEV. There may have been a thread on this that I missed/forgot, but
I think the motivation/justification should be commented somewhere,
like the file header.
Yes, the primary motivation is to fixup alignments for vector loads/stores after vectorization (and unrolling). I've added a comment to this effect. I added this pass just after the SLP vectorizer runs (which, admittedly, does not preserve SE, although I imagine it could). Regardless, I actually don't think that the preservation matters too much in this case: SE computes lazily, and this pass won't issue any SE queries unless there are any assume intrinsics, so there should be no real additional cost in the common case (SLP does preserve DT and LoopInfo).
AlignmentFromAssumptions::runOnFunction is giant. I like it to see
the full scope of a loop within a single page of code and like the
lifetime of local variables to be obvious. The logic to deduce
alignment from llvm.assume can be factored out, as well as the logic
for processing each use on the worklist.
Agreed.
The NewAlignment for MemTransferInst logic is a little hard to
follow. Can you just have something roughly like this:NewDestAlign = max(OldDestAlign, getNewAlignment(..., MI->getDest())) NewSrcAlign = max(OldSrcAlign, getNewAlignment(..., MI->getSource())) NewAlign = min(NewDestAlign, NewSrcAlign).
To some extend. For each of the source and destination we could have two potential determinations for the alignment. We want to pick a the maximum for each field that is less than the maximum of the other field. I've simplified this somewhat by making use of std::max.
When you have a MemInstrinsic but know it's not MemTransfer, you set
the alignment unconditionally. I think you should explicitly check
for MemSetInst instead.
Makes sense; I'll add an assert.
I think you are using SCEV primarily to determine whether two
induction variables have a constant difference. e.g.const SCEV *DiffSCEV = SE->getMinusSCEV(PtrSCEV, AASCEV);That's fine, but I would avoid overusing SCEV. For example,
getNewAlignmentDiff() would be much more efficient operating on
constants. Is there any value in using SCEV do perform arithmetic in
that case. In general, only use SCEV if you need to reason about
induction variables.
I think that SCEV is the right thing to use here. In the line you have above, for example, the subtraction will in general be two linear recurrences, and we want to know if they have a constant difference. There are a lot of potential special cases that could be done more efficiently, but I'm not sure that is worthwhile (in this case, for example, I think the preceding division is more expensive regardless).
Fundamentally, the problem is dealing with comparing values within a loop to those outside the loop, so if we have:
__builtin_assume_aligned(x, 32, -2*y + 4); // (x + 2*y - 4) is 32-byte aligned
for (int i = 0; i < n; ++i) {
x[2*(y + i)] = q[i];
}
then the loop gets unrolled:
for (int i = 0; i < n; i += 4) {
x[2*(y + i)] = q[i]; x[2*(y + i+1)] = q[i+1]; x[2*(y + i+2)] = q[i+2]; x[2*(y + i+3)] = q[i+3];
}
SCEV provides a natural way of expressing the problem, and one which already understands permissible things to do with potential overflows, etc. Doing this by hand is much more complicated than just looking for constant offsets to the base objects and dividing them, because you need to find the non-constant part of each addressing expression (which might not be the same SSA value), subtract that, then deal with the PHI-node part, and then relate all that to the common base. This is exactly what SCEV is good at, so I'd like to use it.
I didn't review isValidAssumeForContext yet, because I need to find
the patch it is in.
This is in http://reviews.llvm.org/D4490.
Comment at: lib/Transforms/Scalar/AlignmentFromAssumptions.cpp:44
@@ +43,3 @@
+
+namespace {+ struct AlignmentFromAssumptions : public FunctionPass {
I never indent within namespace. I don't see any value in it.
Done.
Thanks again,
Hal
I think that SCEV is the right thing to use here. In the line you have
above, for example, the subtraction will in general be two linear
recurrences, and we want to know if they have a constant
difference. There are a lot of potential special cases that could be
done more efficiently, but I'm not sure that is worthwhile (in this
case, for example, I think the preceding division is more expensive
regardless).
Ok. My point here was that it looks like getNewAlignmentDiff only works (only reduces to a constant) if both DiffSCEV and AlignSCEV are constants to begin with. In that case, just extract the constants first, and don't bother with SCEV at this particular point.
lib/Transforms/Scalar/AlignmentFromAssumptions.cpp | ||
---|---|---|
362 | "Of these four" | |
393–397 | To avoid redundant worklist entries why not this: for (User *UJ : J->users()) { Instruction *K = cast<Instruction>(UJ); if (Visited.insert(K) && isValidAssumeForContext(I, K, DL, DT)) WorkList.push_back(K); nit: it's a little hard to guess what 'I' means this far away from where it is declared as an argument. 'AssumptionCall'? |
I never indent within namespace. I don't see any value in it.