I'll split this patch up for review, just starting a discussion and
putting something people can play with.
This patch adds a pass to build extended SSA (see "ABCD: eliminating
array bounds checks on demand"), and an intrinsic to support it. This
is then used to get functionality equivalent to propagateEquality in
GVN, in NewGVN (without having to replace instructions as we go). It
would work similarly in SCCP or other passes. This has been talked
about a few times, so i built a real implementation and tried to
productionize it.
The intrinsic is essentially a copy intrinsic which the passes uses to
attach data to (depending on what predicates apply). It is marked as
readnone and returning it's first argument (which is the operand it's
a copy of).
Copies are inserted for operands used in assumes and conditional
branches that are based on comparisons (see below for more)
Every use affected by the predicate is renamed to the appropriate
intrinsic result.
E.g.
%cmp = icmp eq i32 %x, 50
br i1 %cmp, label %true, label %false
true:
ret i32 %x
false:
ret i32 1
will become
%cmp = icmp eq i32, %x, 50
br i1 %cmp, label %true, label %false
true:
; Has predicate info
; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 }
%x.0 = call @llvm.predicateinfo.i32(i32 %x)
ret i32 %x.0
false:
ret i23 1
(you can use -print-predicateinfo to get an annotated-with-predicateinfo dump)
This enables us to easily determine what operations are affected by a
given predicate, and how operations affected by a chain of
predicates.
The intrinsic, marked as returning it's first argument, has no code
generation effect (though currently not every optimization pass knows
that intrinsics with the returned attribute can be looked through).
We deliberately do not attach any info through a second operand so
that the intrinsics do not need to dominate the comparisons/etc (since
in the case of assume, we may want to push them up the post-dominator
tree)
The actual pass stores data about each renamed operand. For operands
used in comparisons in branches, it stores what edge you are in (true
or false), the original comparison, and both parts of the branch edge
(IE the branch block and the successor block for this edge). This is
done so they can be moved without worrying and we can detect they are
invalid. For operands renamed due to assumes, we store where the
assume is.
The pass does not do insertions in blocks unless the operand with
predicateinfo is used there.
(see the false edge above)
Time wise, the pass is O(defs+uses of operands of branch ending
comparisons/assumes). For a CFG that contains 2 million blocks and 1
million icmps, it takes 600ms to insert predicate info. The largest
amount of time is acutally spent in the depth first iterator walking
the dominator tree and inserting blocks into the visited set (which is
a waste of time since it's a tree, sadly). Most passes simply crash
on this file :) Renaming uses takes about 150ms. (timing on most
normal testcases is not really measurable)
We could make it possible to insert only for certain operations if we
want, and at least what i've seen so far, it would be fast enough to
not worry horribly about. Given how fast it is, I have not worried
about updating. Currently, NewGVN nowunderstands the returned
attribute, so it destroys them all. Also as mentioned, we can detect
if they've been invalidated by being moved. Just moving them would
not actually invalidate them (since they have all the info in them
necessary to give correct answers), the only thing that actually would
make them do something bad would be to add uses of them that *aren't*
dominated by one of the edges.
Note that if we decide we don't want to go this direction for some
reason, i would likely make this private to NewGVN or something (i
have another way to do what it does, but it's not as nice).
We also may want to centralize some of the knowledge about what things
imply what (IE have "getConstantValue" or "getEquivalentNames" or
various things, otherwise people have to use isTrueWhenEqual and
isImpliedTrue, everywhere. We have a bunch of missed opt reports about
the places that try to do this but don't catch every case, etc).
redundant.