Page MenuHomePhabricator

[SCEV][LoopVectorize] Allow ScalarEvolution to make assumptions about overflows
AbandonedPublic

Authored by sbaranga on Jun 1 2015, 9:15 AM.

Details

Summary

Add a new pass - AssumingScalarEvolution - that extends
SCEV, but can make assumptions and generate code that
can check these assumptions. This pass uses the normal
ScalarEvolution pass for anything outside the current
analyzed loop, but uses its own data structures to handle
the SCEVs within the loop. For now, the pass assumes
that chrec expressions will not overflow.

The AssumingScalarEvolution pass can add checks for the
made assumptions, so that a loop can be versioned.

We use this pass in order to add runtime overflow checks
in the Loop Vectorize pass of expressions which can
in theory overflow and would prevent the vectorization
a loop. Also note that the runtime checks will almost always
pass, since having an overflow is usually a sign of a
coding error on the part of user.

The main reasons behind inheriting from ScalarEvolution:
Since ScalarEvolution maintains its own cache for various
results, making any assumption would likely require invalidating
the entire cache.

This way we can use the new pass as an almost drop-in
replacement for ScalarEvolution. The users that are doing
a transformation would have to know about it in order to add
the checks.

However, there are probably a lot more ways in which this could
be implemented.

Any comments are much appreciated!

Thanks,
Silviu

Diff Detail

Event Timeline

sbaranga updated this revision to Diff 26900.Jun 1 2015, 9:15 AM
sbaranga retitled this revision from to [SCEV][LoopVectorize] Allow ScalarEvolution to make assumptions about overflows.
sbaranga updated this object.
sbaranga edited the test plan for this revision. (Show Details)
sbaranga added a subscriber: Unknown Object (MLST).
jmolloy added a subscriber: jmolloy.Jun 1 2015, 9:48 AM
hfinkel added a subscriber: hfinkel.

Thanks for working on this! cc'ing Andy in case he has an opinion regarding the general approach.

include/llvm/Analysis/ScalarEvolution.h
786

Are these two methods special in theory, in this context, or in the future would overrides for more methods be useful?

1262

Don't use 'chrec' here without defining it.

1313

You should clear the cache here?

lib/Analysis/ScalarEvolution.cpp
9178

Given that the SCEVs are uniqued, you could at least eliminate those that are structurally identical easily.

lib/Transforms/Vectorize/LoopVectorize.cpp
2904

Line too long?

atrick edited edge metadata.Jun 22 2015, 9:38 PM

Overall, it's great to see something like this working in practice. In the past, several alternatives have been discussed. IIRC the main objection to this approach is that SCEV could make assumptions that are not necessary for the optimization leading to overhead for unnecessary checks.

Given this is a major design decision, it would be good to have feedback from anyone who has done a lot of work in the area (thanks to Hal for the review). What about Sanjoy, Arnold (especially for the vectorizer change), MichaelZ, AdamN ??

I don't see a major problem with this approach. Just a few comments on the implementation...

I'm afraid that iterating over a DenseMap will produce SCEV expressions and IR checks in a nondeterministic order. We usually fix this with a MapVector.

+ AssumptionMapTy::iterator getLoopAssumptionBegin(const Loop *L) {

I don't see AnalyzedLoop being initialized in the ctor:

+ AssumingScalarEvolution::AssumingScalarEvolution() :
+ ScalarEvolution(false, ID), SE(nullptr) {

See Instruction::getModule()

+ Module *M = Loc->getParent()->getParent()->getParent();

On most targets, I think it's more efficient to check the high bits for overflow. e.g.

if ((a + 0x800000) & 0xffffff000000) overflow

Instead of:

+ if (Signed) {
+ APInt CmpMinValue = APInt::getSignedMinValue(DstBits).sext(SrcBits);
+ ConstantInt *CTMin = ConstantInt::get(M->getContext(), CmpMinValue);
+ Value *MinCheck = OFBuilder.CreateICmp(ICmpInst::ICMP_SLT,
+ TripCount,
+ CTMin);
+ TripCountCheck = OFBuilder.CreateOr(TripCountCheck, MinCheck);
+ }

sanjoy edited edge metadata.Jun 23 2015, 12:57 AM

I'll make it a point to do a more detailed review this week, but I
have a high level question:

Instead of a separate AssumingScalarEvolution, why not have an
interface like:

NoOverflowResult ScalarEvolution::proveNoOverflow(SCEVAddRecExpr

*AddRec, FlagType, bool CanAddAssumptions);

NoOverflowResult can be Yes, No or (if CanAddAssumptions is true) a
predicate that, if satisfied, will guarantee AddRec does not overflow
in the FlagType sense. Then the client can choose if it is profitable
to actually emit the predicate if it is cheap enough relative to expected
payoff.

A much more general interface could be

Result ScalarEvolution::provePredicate(CmpInst::Pred, SCEV *LHS, SCEV *RHS)

and have it return one of "Yes | No | Guarded (Predicate)"

Then not only can you ask the "sext{addrec} == addrec{sext}" question
to check for no-wrap, but also push more general logic into
ScalarEvolution as needed in the future (i.e. is this add-rec equal to this
other add-rec at any iteration? if not, can I make it so by adding runtime
checks?).

  • Sanjoy

Hi Hal,

Thanks for having a look! I've only replied to comments related to the design for now. I think we can get the others after we've converged on some conclusion.

Thanks,
Silviu

include/llvm/Analysis/ScalarEvolution.h
786

I think they are special because these expressions cannot fold sext/zext expressions. As far as I can see all the others can, so these seem like they are the only expressions that are stopping us for getting AddRecExprs as results?

For example if we have (add (sext({x, + , 1}), y). Getting rid of the sext gets us {x + y, +, 1}.

However, there are other methods which might be useful to override, but not related to expression folding. For example SimplifyICmpOperands when taking an ULE predicate can only canonicalize the comparison if it knows that it can add one to the right operand without overflowing (the operand is loop invariant and could possibly be checked). This issue appears mostly when trying to get the number of backedges taken.

Hi Andy,

Thanks for looking at this!

Overall, it's great to see something like this working in practice. In the past, several alternatives have been discussed. IIRC the main objection to this approach is that SCEV could make assumptions that are not necessary for the optimization leading to overhead for unnecessary checks.

Would that be on the list? If so, it would be an interesting read.
Yes, in theory this can produce unnecessary checks. I think it works ok for the vectorizer (since any case where we wouldn't make this assumption would result in not vectorizing). It looks like a trade-off between not having all the users deal explicitly with this against the number of memchecks.

On most targets, I think it's more efficient to check the high bits for overflow. e.g.

if ((a + 0x800000) & 0xffffff000000) overflow

Instead of:

+ if (Signed) {
+ APInt CmpMinValue = APInt::getSignedMinValue(DstBits).sext(SrcBits);
+ ConstantInt *CTMin = ConstantInt::get(M->getContext(), CmpMinValue);
+ Value *MinCheck = OFBuilder.CreateICmp(ICmpInst::ICMP_SLT,
+ TripCount,
+ CTMin);
+ TripCountCheck = OFBuilder.CreateOr(TripCountCheck, MinCheck);
+ }

Nice. I wonder if the optimizers handle well the cases where we would test some consecutive values with this method, or if we don't end up generating something similar anyway.

Thanks,
Silviu

Hi Sanjoy,

I'll make it a point to do a more detailed review this week, but I
have a high level question:

Instead of a separate AssumingScalarEvolution, why not have an
interface like:

NoOverflowResult ScalarEvolution::proveNoOverflow(SCEVAddRecExpr

*AddRec, FlagType, bool CanAddAssumptions);

I actually started from something similar, but ended up moving to the current form as it seemed cleaner and the loop vectorizer doesn't need the flexibility (if something doesn't come out as a SCEVAddRecExpr it will fail).

I think the users actually want to get a SCEVAddRecExpr (or a SCEVConstant if possible) form a normal SCEV. It is possible to parse the SCEV and, figure out if there are any assumptions that can be made and add them somewhere.

It would also be incorrect in this case to set the nuw/nsw bits (which didn't seem ideal).

Another problem is that ScalarEvolution's expression cache really gets in the way.

NoOverflowResult can be Yes, No or (if CanAddAssumptions is true) a
predicate that, if satisfied, will guarantee AddRec does not overflow
in the FlagType sense. Then the client can choose if it is profitable
to actually emit the predicate if it is cheap enough relative to expected
payoff.

A much more general interface could be

Result ScalarEvolution::provePredicate(CmpInst::Pred, SCEV *LHS, SCEV *RHS)

and have it return one of "Yes | No | Guarded (Predicate)"

Then not only can you ask the "sext{addrec} == addrec{sext}" question
to check for no-wrap, but also push more general logic into
ScalarEvolution as needed in the future (i.e. is this add-rec equal to this
other add-rec at any iteration? if not, can I make it so by adding runtime
checks?).

  • Sanjoy

That would be also one way to do it. We would probably need some utility functions to drive that interface. The initial solution would be at the exact opposite side of the spectrum (easy to use, but not flexible).

Cheers,
Silviu

anemet edited edge metadata.Jun 23 2015, 1:25 PM

Hi Silviu,

Can you please discuss the use-cases? We all ran into SCEV not always being the right vehicle to prove no-overflow but this proposes a pretty big change, so I want to make sure we can't take more targeted/distributed solutions to the problem. (My general feeling is similar to what Andy and Sanjoy have already expressed.)

I feel like that in some cases, we can prove no-overflow at *compile* time by further analyzing the IR (like what I am proposing in D10472). Essentially this is relying on C/C++ signed overflow being undefined.

In other cases we may need prove no-overflow of smaller types so that we can up-level the sign/zero-extensions. Is this perhaps something that's better done in indvars? The idea is (maybe flawed) that you can eliminate an extension in the loop by using an overflow check outside the loop.

Anyhow, you collected some testcases so categorizing the issues would probably help the discussion.

I am also in favor of allowing finer level of control along the lines of Sanjoy's comments. Your approach may work for the vectorizer but in case of the general dependence analysis, we may not need to prove of no-overflow of all pointers. For example, if a pointer can't alias with any other accesses in the loop, we don't care that we can't get a true affine form for it.

Thanks,
Adam

Hi Adam,

Can you please discuss the use-cases? We all ran into SCEV not always being the right vehicle to prove no-overflow but this proposes a pretty big change, so I want to make sure we can't take more targeted/distributed solutions to the problem. (My general feeling is similar to what Andy and Sanjoy have already expressed.)

I think the biggest problem would be where overflow can happen, depending on the input data. I would prefer to add some no overflow proving code for cases when it is possbile to prove rather than implementing something like this, but it looks like for certain cases there is no work-around. I've listed some examples below.

I feel like that in some cases, we can prove no-overflow at *compile* time by further analyzing the IR (like what I am proposing in D10472). Essentially this is relying on C/C++ signed overflow being undefined.

Ideally we would figure out at compile time, but I think it would be impossible (or impractical) to cover all the cases where we could prove this, and there are cases where we cannot prove no-overflow (and the overflow condition would depend on input data).

In other cases we may need prove no-overflow of smaller types so that we can up-level the sign/zero-extensions. Is this perhaps something that's better done in indvars? The idea is (maybe flawed) that you can eliminate an extension in the loop by using an overflow check outside the loop.

Anyhow, you collected some testcases so categorizing the issues would probably help the discussion.

I am also in favor of allowing finer level of control along the lines of Sanjoy's comments. Your approach may work for the vectorizer but in case of the general dependence analysis, we may not need to prove of no-overflow of all pointers. For example, if a pointer can't alias with any other accesses in the loop, we don't care that we can't get a true affine form for it.

Yes, good point! I need to think a bit about the interface (and perhaps do some experimenting), but it should definitely be possible to have something similar to what Sanjoy suggested (and I like the idea).

The test cases I have come from C/C++ where unsigned integers are being used as induction variables and for some reason they would get extended at some point. There are cases where we wouldn't need the extend, but those are outside current scope. The problem with these cases is that the behaviour of unsigned overflow is defined (at least in C/C++), and there is no way of statically reasoning about these cases (we can overflow, and it can for example cause infinite loops). For example:

void test(uint32_t n) {

for (uint16_t i = 0; i < n; ++i) {
  <do something>
}

}

Here we would need to compare i can overflow, and for values larger than 2^16-1 we'll get an infinite loop.
In fact there is no way to progress here besides versioning the loop as far as I can see.

A related example:

void test(uint32_t n) {

for (uint32_t i = 0; i <= n; ++i) {
  <do something>
}

}

Here we have no extend operations, but for n == 2^32-1 this will be an infinite loop and i will overflow. SCEV gives up on computing the backedge count because there is no correct result that it can give.

This can affect memory accesses as well:

void test(uint32_t n, uint16_t ind, uint32_t offset, char *a, char *b) {

for (uint16_t i = 0; i < n; ++i) {
  ind++;
  a[ind + offset] = b[ind + offset] * 3;
}

}

ind + offset does not evaluate here as a chrec. ind is a chrec, but then we apply zext to it and add it to the offset. The absence of nsw/nuw means that we actually get add(offset, zext({ind, +, 1}) for ind + offset. This causes a number of problems (accesses to a are not consecutive, etc) And that's correct, they can be non-consecutive for some values of n - but unlikely to ever happen at execution.

In fact since we would get or not sign extends here depending on the pointer size, this would mean that it's possible to run into issues when porting code from a 32-bit target to a 64-bit one.

I think there are valid reasons to use usigned integers (eg. for the extra range), so it may be easy to hit these cases. But it looks like any combination of unsigned integers and zext will disable most loop optimizations? Having unsigned integers in the exit condition will probably cause issues on its own.

Thanks,
Silviu

sbaranga updated this revision to Diff 29882.Jul 16 2015, 5:08 AM
sbaranga edited edge metadata.

Re-designed to allow the users to make only the assumptions that they need.

The new implementation does not need the AssumingScalarEvolution pass.

Added "SCEVPredicate" as a class in SCEV as an interface for an assumption that
can be made and checked at run-time. Currently we only have implementations for
the SCEVAddRec overflow assumptions, and a SCEVPredicate that can hold
multiple other predicates (SCEVPredicateSet).

We use a SCEV rewriter to take a SCEV and transform it according to the
predicates in a predicate set. Combined with the SCEV AddRec overflow
predicate, this allows us to fold sext/zext expressions into AddRec
expressions and will produce folded SCEVs equivalent with what
the AssumingScalarEvolution pass was producing.

Added a SCEVPredicate in ExitLimit, to allow getting a backedge count
guarded by a SCEV predicate and patched the ExitLimit logic to handle the
predicate. Added code to generate SCEV overflow predicates when
computing the loop count from an icmp condition (more code paths could also
theoretically do this).

Added a getGuardedBackedgeCount method to ScalarEvolution that can return
the backedge count as a SCEV guarded by a predicate. Now LoopAccessAnalysis
and LoopVectorize are using this method.

Modified LoopAccessAnalysis and LoopVectorize to use the SCEV rewriter.
These passes were already using something similar for the symbolic stride.

We now also allow LoopDistribute to use generate the SCEV run-time checks.
This allows us to remove special handling in the LoopAccessAnalysis for the
case where a client wasn't aware of the new run-time checks.

Hi Silviu,

I have some comments, mostly nitpicks. I'll continue the review as I go, but here are the first ones.

cheers,
--renato

include/llvm/Analysis/LoopAccessAnalysis.h
656

this comment seems a bit terse. UseAssumptions is an argument to some methods in SCEV, nothing to do with LoopInfo.

include/llvm/Analysis/ScalarEvolution.h
191

nitpick: maybe isAlwaysTrue / isNeverTrue or isAlwaysTrue / isAlwaysFalse would be more descriptive?

247

No description?

248

Why the duplication? Will you have one for each further type?

275

Why here, and not in SCEVPredicate?

Hi Silviu,

I have some comments, mostly nitpicks. I'll continue the review as I go, but here are the first ones.

cheers,
--renato

Thanks! I'll make sure to fix these in the next upload. Some comments from the last review should also apply, so I'll do that as well.

Cheers,
Silviu

I think this change is orthogonal to improving SCEV for cases where we can prove no wrapping by looking at the IR. At the high level this change looks good to me (I have always wanted to have the ability for SCEV to gather assumptions needed to prove no wrapping) - and the examples demonstrate that there is a need. Silviu has addressed the issue of only generating the checks when needed.

I'll leave it to Sanjoy and Adam to do a detailed review.

rengolin added inline comments.Jul 20 2015, 1:26 AM
include/llvm/Analysis/ScalarEvolutionExpressions.h
752

Do you need to keep this public?

In addOverflowAssumption, you check for MakeAssumptions to add or not the SCEV to it, and if it remains public, anyone will be able to bypass this check.

rengolin added inline comments.Jul 20 2015, 6:53 AM
lib/Analysis/LoopAccessAnalysis.cpp
122

Why are you using the llvm namespace here?

Shouldn't these functions be static?

127

replaceSymbolicStrideSCEV is defined on the header, as namespace llvm, static, and leads to confusions like these.

If you need it in more than one place, it shouldn't be static and it should be implemented in its own cpp file. If these are local only, they should be static or in anonymous namespaces.

hfinkel added inline comments.Jul 25 2015, 5:52 PM
include/llvm/Analysis/ScalarEvolutionExpressions.h
869

Why not make this a member function of SCEV (same for rewriteSCEVWithAssumptions below)?

lib/Analysis/ScalarEvolution.cpp
8303

Don't remove blank line.

Thanks for the reviews! I'll have a new version out soon to fix all of the highlighted issues.

-Silviu

include/llvm/Analysis/ScalarEvolution.h
248

I was thinking of having a SmallVector for storage of each predicate, plus all the references in Preds. So adding further types would require adding additional SmallVectors (we will probably have more types, I can think of at least one more for checking the range of an expression).

include/llvm/Analysis/ScalarEvolutionExpressions.h
869

Makes sense.

sbaranga updated this revision to Diff 32418.Aug 18 2015, 9:01 AM
sbaranga edited edge metadata.

Rebased the patch on trunk (which is the largest part of this update).

Moved the SCEV rewriter implementation from the ScalarEvolutionExpressions header
to the ScalarEvolution .cpp file. ScalarEvolution now has methods to perform
the SCEV rewrite.

Renamed the isAlways/isNever methods from SCEVPredicate to isAlwaysTrue/isAlwaysFalse.

Hi Silviu,

Sorry about the delay! Also my comments below are somewhat disorganized because I am still trying to wrap my head around this large patch. I *think* that the overall direction is good but I have a hard time seeing how the different parts connect at this point. I also have some ideas below of how you could probably restructure this patch to help the review.

As a high-level comment, why doesn't the strided assumption become another type of assumption after these changes?

Added a SCEVPredicate in ExitLimit, to allow getting a backedge count
guarded by a SCEV predicate and patched the ExitLimit logic to handle the
predicate. Added code to generate SCEV overflow predicates when
computing the loop count from an icmp condition (more code paths could also theoretically do this).

Why is this needed? It's not explained.

As another high-level comment, I am also wondering if this 1000+ line patch can be split up to ease review. The steps I am thinking:

  1. Somehow generalize/refactor the stride-checks to make them more similar to the overflow assumptions
  2. Add the minimal predicate set logic and drive it from LAA or somewhere so that you can add tests for simple cases. Perhaps only make strided assumptions work with predicate sets first.
  3. Add overflow assumptions
  4. Add the guarded back-edge count extension along with analysis tests
  5. Add the ExitLimit mods with some more tests. This may have to come before 4 but I don't really understand this change at this point.

This is just an quick idea, feel free to structure the patches differently but I would be more comfortable with a more incremental approach.

Thanks,
Adam

include/llvm/Analysis/LoopAccessAnalysis.h
295–296

Can you please make this a bit more precise? E.g.

With these assumption satisfied the memory accesses become simple AddRecs which allows them to get analyzed.

Also it should either be called Preds or PredSet.

576

What's the deal with unique_ptr above and this here? Can you explaining the ownership situation?

Also it's probably less confusing if we use the same variable name for PredicateSets.

lib/Analysis/LoopAccessAnalysis.cpp
141–148

Hmm, why don't you only write Preds after to ensure that we succeeded? Isn't this the same as:

if (R.Ret && !R.P.isNever()) {

R.P.add(&Preds);  <-- write Preds here.
return R.Ret:

}

if not it needs a comment.

158–161

We should not recompute this for every insert. Also we need to bound the number of predicates so getting the guarded backedge count at a more central place is probably a better idea.

591–606

This does not look tight enough.

Can't we fail in hasComputable... for reasons other than the SCEV not being an AddRec?
Also the return value of convertSCEV... should be checked rather than just repeating hasComputable... blindly.

lib/Transforms/Vectorize/LoopVectorize.cpp
312–314

There is no def for this function.

Hi Adam,

Hi Silviu,

Sorry about the delay! Also my comments below are somewhat disorganized because I am still trying to wrap my head around this large patch. I *think* that the overall direction is good but I have a hard time seeing how the different parts connect at this point. I also have some ideas below of how you could probably restructure this patch to help the review.

Thanks! Restructuring the patch sounds like a good idea.

As a high-level comment, why doesn't the strided assumption become another type of assumption after these changes?

That's a good idea, we should be able to do that. I don't see any problem with making a predicate for it in one of the patches.

Added a SCEVPredicate in ExitLimit, to allow getting a backedge count
guarded by a SCEV predicate and patched the ExitLimit logic to handle the
predicate. Added code to generate SCEV overflow predicates when
computing the loop count from an icmp condition (more code paths could also theoretically do this).

Why is this needed? It's not explained.

The exit limit is what's being cached. That result is used to generate the (exact) backedge count, and other things. In order to keep caching mostly the same, we need to add the predicate to the ExitLimit. So this is all part of getting getBackedgeCount to work with predicates.

As another high-level comment, I am also wondering if this 1000+ line patch can be split up to ease review. The steps I am thinking:

  1. Somehow generalize/refactor the stride-checks to make them more similar to the overflow assumptions
  2. Add the minimal predicate set logic and drive it from LAA or somewhere so that you can add tests for simple cases. Perhaps only make strided assumptions work with predicate sets first.
  3. Add overflow assumptions
  4. Add the guarded back-edge count extension along with analysis tests
  5. Add the ExitLimit mods with some more tests. This may have to come before 4 but I don't really understand this change at this point.

Yes, I've also felt the need to split this up. At least all the places where we're making assumptions should probably be reviewed separately. My plans were slightly different though (more focused towards being able to get the backedge count). I'll have to think about the exact details, but it will probably end up being similar to what you've listed above.

This is just an quick idea, feel free to structure the patches differently but I would be more comfortable with a more incremental approach.

Thanks,
Adam

include/llvm/Analysis/LoopAccessAnalysis.h
576

This should be removed as it doesn't do anything. The unique_ptr above should contain the predicates, LoopAccessInfo ownes it and all other classes/structs have reference to it.

lib/Analysis/LoopAccessAnalysis.cpp
141–148

R.P.add(&Preds) adds Preds to R.P, so that wouldn't write Preds. We use the temporary because we can only be sure if we've succeeded or not after we see the final set of predicates.

158–161

Caching expressions would make sense to me. We would have to recompute the expressions every time we commit to a new predicate though.

591–606

Yes, you're correct. A better solution seems to only make the assumption after we've seen that hasComputableBounds.. now returns true.

This change probably needs a closer look.

lib/Transforms/Vectorize/LoopVectorize.cpp
312–314

Must be an artefact from the rebase. It's not being used anywhere either, so it just needs to be removed.

  1. Somehow generalize/refactor the stride-checks to make them more similar to the overflow assumptions
  2. Add the minimal predicate set logic and drive it from LAA or somewhere so that you can add tests for simple cases. Perhaps only make strided assumptions work with predicate sets first.
  3. Add overflow assumptions
  4. Add the guarded back-edge count extension along with analysis tests
  5. Add the ExitLimit mods with some more tests. This may have to come before 4 but I don't really understand this change at this point.

Yes, I've also felt the need to split this up. At least all the places where we're making assumptions should probably be reviewed separately. My plans were slightly different though (more focused towards being able to get the backedge count). I'll have to think about the exact details, but it will probably end up being similar to what you've listed above.

Great! I understand that your actual goal to get the backedge count into a suitable form for analysis by using predicates. The tricky part that that needs multiple things: the whole predicated SCEV infrastructure *plus* a good way to expose this otherwise implicit step to clients so that they have a way to reason about the cost of adding these predicates.

That's why if possible I'd like to get the first set in where the clients are in full control (based on potentially wrapping pointers for example) and then start bringing in the whole backedge count complexity.

That's said it's entirely possible that I am oversimplifying things.

Thanks,
Adam

lib/Analysis/LoopAccessAnalysis.cpp
141–148

I see. It still feels like a common multi-step pattern that clients can get wrong. I wonder if we can make the API better here. We can discuss this after you split this into multiple patches.

158–161

I don't follow.

Is this still relevant, or has this been superseded by D13595 ?

sbaranga abandoned this revision.Oct 26 2015, 3:41 AM

Is this still relevant, or has this been superseded by D13595 ?

It is only relevant as a reference for future work I'll be splitting out patches from here in the future, although we've already diverged with D13595 in some points. I'll mark it as abandoned.