This is an archive of the discontinued LLVM Phabricator instance.

Fix a bug when unswitching on partial LIV for SwitchInst
ClosedPublic

Authored by trentxintong on Jan 24 2017, 5:07 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

trentxintong created this revision.Jan 24 2017, 5:07 PM

Reorganize some logic.

This is the CFG I get by running unswitching on the 1st test case included in this patch (without the patch), we are unswitching on the value 11 a few times till we've taken up all the quota. (Also learn how to attach files in Phabricator).

david2050 added a comment.EditedJan 26 2017, 6:45 PM

Hi Xin, let me make sure I understand given:

for(...;...;++i)
   switch (f(i) & c) 
        case 11:
        case 12:

we might unswitch this too

if (c) 
   for(...;...;++i)
     switch (f(i) & c) 
         case 11:
         case 12:
else
  for(...;...;++i)
   switch (0) 
        case 11:
        case 12:

and we achieve no simplification in one loop.

How does restricting to a single bit change this?

Hi Xin, let me make sure I understand given:

for(...;...;++i)
   switch (f(i) & c) 
        case 11:
        case 12:

we might unswitch this too

if (c) 
   for(...;...;++i)
     switch (f(i) & c) 
         case 11:
         case 12:
else
  for(...;...;++i)
   switch (0) 
        case 11:
        case 12:

and we achieve no simplification in one loop.

How does restricting to a single bit change this?

Hi David

We are trying to pick a value for c such that f(i) & c can be simplified to either a loop _variant_ value which we will not try to unswitch on again, or a constant which we will not unswitch either. The problem is when c is more than 1 bit, we can not pick a value that will guarantee such simplifications. Lets say c is a signed int here and we pick a random value, then we do not know what f(i) & c will simplify to, maybe it will not simplify at all. In such cases, given the way loop unswitching is implemented, it will try to unswitch this loop again (It records what case value has been unswitched and will not unswitch on that case-value again).

In your case, we picked c==0, we get simplification in the else-loop and no simplification in the if-loop as what you have in the example. However, the way loop unswitching is implemented is that we try to keep unswitching the current loop (the if-loop in this case) until it cant be unswitched anymore or no quota left. We know c is not 0, but we do not know what c is and thus f(i) & c can not be simplified, so we do not know whats unswitched out ...

If c is one bit, it can only be true or false, we are guaranteed to either reach a constant or a loop _variant_ input value for the switch for the current loop and the cloned loop.

Hi Xin, let me make sure I understand given:

for(...;...;++i)
   switch (f(i) & c) 
        case 11:
        case 12:

we might unswitch this too

if (c) 
   for(...;...;++i)
     switch (f(i) & c) 
         case 11:
         case 12:
else
  for(...;...;++i)
   switch (0) 
        case 11:
        case 12:

and we achieve no simplification in one loop.

How does restricting to a single bit change this?

Hi David

We are trying to pick a value for c such that f(i) & c can be simplified to either a loop _variant_ value which we will not try to unswitch on again, or a constant which we will not unswitch either. The problem is when c is more than 1 bit, we can not pick a value that will guarantee such simplifications. Lets say c is a signed int here and we pick a random value, then we do not know what f(i) & c will simplify to, maybe it will not simplify at all. In such cases, given the way loop unswitching is implemented, it will try to unswitch this loop again (It records what case value has been unswitched and will not unswitch on that case-value again).

In your case, we picked c==0, we get simplification in the else-loop and no simplification in the if-loop as what you have in the example. However, the way loop unswitching is implemented is that we try to keep unswitching the current loop (the if-loop in this case) until it cant be unswitched anymore or no quota left. We know c is not 0, but we do not know what c is and thus f(i) & c can not be simplified, so we do not know whats unswitched out ...

If c is one bit, it can only be true or false, we are guaranteed to either reach a constant or a loop _variant_ input value for the switch for the current loop and the cloned loop.

To be more precise, its not we are picking c == 0, its we are trying to unswitch the following input value f(i) & c (0 in this case) out of the loop.

@sanjoy @efriedma @hfinkel Mind taking a look ? Especially this fixes a bug which make loop unswitching less effective. Thanks !.

sanjoy edited edge metadata.Feb 1 2017, 4:18 PM

I would say the problem here is that we're not doing a good enough job of choosing a good value to unswitch on (and not that we're unswitching on the wrong "type" of value). That is, say we have a switch on Inv & Var (Inv being the loop invariant value) that dispatches on 0, 200 and 300. In such a case, we should clearly unswitch on Inv == 0 to simplify the switch to a direct branch to the (Inv & Var) == 0 case in one of the loops. Similar logic follows if we have a switch on Inv | Var and one of the cases is -1.

So I think the patch should be structured as:

  • Keep FindLIVLoopCondition the same (maybe return a bit of extra information to make the next steps easier)
  • In processCurrentLoop, check to see if the expression we're switching on simplifies to any of the case values given some constant value for what was returned by FindLIVLoopCondition
    • If there is such a constant value, unswitch on that and profit
    • Else give up
lib/Transforms/Scalar/LoopUnswitch.cpp
442 ↗(On Diff #85698)

Did you clang-format this bit?

I would say the problem here is that we're not doing a good enough job of choosing a good value to unswitch on (and not that we're unswitching on the wrong "type" of value). That is, say we have a switch on Inv & Var (Inv being the loop invariant value) that dispatches on 0, 200 and 300. In such a case, we should clearly unswitch on Inv == 0 to simplify the switch to a direct branch to the (Inv & Var) == 0 case in one of the loops. Similar logic follows if we have a switch on Inv | Var and one of the cases is -1.

So I think the patch should be structured as:

  • Keep FindLIVLoopCondition the same (maybe return a bit of extra information to make the next steps easier)
  • In processCurrentLoop, check to see if the expression we're switching on simplifies to any of the case values given some constant value for what was returned by FindLIVLoopCondition
    • If there is such a constant value, unswitch on that and profit
    • Else give up

Yes, the ability to simplify f(x) (The chain of AND(s) and OR(s)) to a meaningful case value is important here. Our current way is to pick 1 of the case values and expecting f(x) simplifies to a meaningful case value, which probably does not happen in many cases. Since we are concerning with a chain of and/or operators working on one or multiple loop-variant values and a single loop invariant values, I think this is what we could do, we record wether the chain contains AND and OR. Once we have that we can do the following:

  • If we have a case 0 in the switch and we only have AND, we can pick loop_invariant as 0, this way we can unswitch case 0 out of the loop.
  • if we have a case UNSIGNED_TYPE_MAX (signed -1) and we only have ORs, we can pick the loop_invariant as UNSIGNED_TYPE_MAX and unswitch UNSIGNED_TYPE_MAX out of the loop.

What do you think ?

Yes, the ability to simplify f(x) (The chain of AND(s) and OR(s)) to a meaningful case value is important here. Our current way is to pick 1 of the case values and expecting f(x) simplifies to a meaningful case value, which probably does not happen in many cases. Since we are concerning with a chain of and/or operators working on one or multiple loop-variant values and a single loop invariant values, I think this is what we could do, we record wether the chain contains AND and OR. Once we have that we can do the following:

  • If we have a case 0 in the switch and we only have AND, we can pick loop_invariant as 0, this way we can unswitch case 0 out of the loop.
  • if we have a case UNSIGNED_TYPE_MAX (signed -1) and we only have ORs, we can pick the loop_invariant as UNSIGNED_TYPE_MAX and unswitch UNSIGNED_TYPE_MAX out of the loop.

What do you think ?

Sounds good!

Address @sanjoy comments.

sanjoy requested changes to this revision.Feb 14 2017, 10:46 AM

Comments inline.

lib/Transforms/Scalar/LoopUnswitch.cpp
372 ↗(On Diff #88246)

Why not drop the Ty and call it a OperatorChain? I've only seen Ty used in templated code when it isn't 100% obvious that something is a type.

409 ↗(On Diff #88246)

Please put braces around such a large if block to make its extent obvious.

439 ↗(On Diff #88246)

Doesn't this regress the callers of FindLIVLoopCondition that pass in nullptr for OCS? They used to get the recursive search before, and now they don't?

My preference would be to avoid allowing a null OCS, and instead have it be a OperatorChainTy & instead. Callers that don't need it can pass in a dummy.

684 ↗(On Diff #88246)

auto *

This revision now requires changes to proceed.Feb 14 2017, 10:46 AM
trentxintong added inline comments.Feb 14 2017, 10:58 AM
lib/Transforms/Scalar/LoopUnswitch.cpp
439 ↗(On Diff #88246)

The only places which we pass in a nullptr is TryTrivialLoopUnswitch (with the patch), i.e. before the patch TryTrivialLoopUnswitch ignores partial LIV even we manage to get one by walking up the chain. So there is no place which will get regressed in the current state the pass is written.

trentxintong edited edge metadata.

Address comments from @sanjoy. There is still one unresolved issue, waiting to be discussed and addressed.

I took @sanjoy suggestion about passing in the OperatorChain as a reference, instead of pointer (and allowing nullptr to be passed) in.

sanjoy requested changes to this revision.Feb 18 2017, 3:21 PM

Mostly minor comments.

lib/Transforms/Scalar/LoopUnswitch.cpp
373 ↗(On Diff #88782)

Normally I've seen LLVM prefixes these enums as OC_OpChainNone etc.

420 ↗(On Diff #88782)

I'd leave this uninitialized here. That way if you forget to assign to it down one path, sanitizers / compiler warnings will catch the bug.

I'm also not sold on the OCS and CChain names -- how about ParentChain and NewChain?

431 ↗(On Diff #88782)

I'd rather have this last case be:

case OpChainMixed:
  CChain = OpChainMixed;
  break;

That will make this switch "fully covered" and have the compiler issue a warning if someone adds a new enum but forgets to update the switch.

453 ↗(On Diff #88782)

This style is a bit awkward, IMO. (In a later change, no need to pre-commit review) do you mind changing FindLIVLoopCondition to return std::pair<Value *, OperatorChain> as well?

679 ↗(On Diff #88782)

Why do you care about the case being selected not being the default?

test/Transforms/LoopUnswitch/basictest.ll
186 ↗(On Diff #88782)

s/dont/don't/
s/a input/an input/

This revision now requires changes to proceed.Feb 18 2017, 3:21 PM
trentxintong added inline comments.Feb 18 2017, 4:56 PM
lib/Transforms/Scalar/LoopUnswitch.cpp
679 ↗(On Diff #88782)

In its form, loop unswitch does not unswitch any value out of the default case. If we unswitch something out of the default here, it will not be recorded as unswitched-out and it will try to unswitch it again.

I guess this can be improved, but i prefer in another patch.

trentxintong edited edge metadata.

Address @sanjoy comments.

lib/Transforms/Scalar/LoopUnswitch.cpp
453 ↗(On Diff #88782)

Yes, I will do this in a subsequent commit.

Decided to teach how to unswitch value out of default-case in case its can lead to
simplifications in at least one of the loops.

Also fix a small bug about how unswitched value is recorded. Updated the test cases to
check for that.

sanjoy accepted this revision.Feb 18 2017, 9:58 PM

LGTM, but let's not fold in the changes to RewriteLoopBodyWithConditionConstant into this patch.

lib/Transforms/Scalar/LoopUnswitch.cpp
684 ↗(On Diff #89053)

Not sure what you meany by !.. If it is supposed to be sentence terminator then I'd just go with a period. Same below.

1338 ↗(On Diff #89053)

This bit is new right? Let's do this change in a different commit and go with the bailout on a default switch case in FindLIVLoopCondition in the first step (but please add a FIXME to FindLIVLoopCondition).

This revision is now accepted and ready to land.Feb 18 2017, 9:58 PM
trentxintong added inline comments.Feb 18 2017, 10:24 PM
lib/Transforms/Scalar/LoopUnswitch.cpp
1338 ↗(On Diff #89053)

Actually this change is necessary to make the patch work. When I first wrote this patch to solve the problem, I was disabling walking up the operator chain for switches. And this would work and I manually checked the test case worked.

Then we changed the patch to move up the chain and now the problem is: If we have a chain of ands and we have a partial LIV on top of the chain, we will not get to BranchesInfo.setUnswitched(SI, Val); here. Because the SwitchInst is not a direct user of the partial LIV.

What i did is to move this setUnswitched out to where SwitchInst is unswitched, i.e. in TrivialUnswitching and NonTrivialUnswitching. And with that we can record what is unswitched. I also added more checks into the test cases to make sure unswitching only happen once.

Minor comment fixes.

Basically, there are 2 problems here this patch addresses.

  1. we try to find a value that actually simplify the operator chain and in case we cant, we give up and this is better than picking a random value and hope it simplifies.
  2. we fix how unswitch value is recorded. current way (before the patch) can not record partial LIV.

Both are needed to have the patch working, as we can not allow partial LIV by walking up the chain and not fixing how to record unswitched partial LIV correctly.

@sanjoy can you please take a look at this again ? Thanks.

lgtm again

lib/Transforms/Scalar/LoopUnswitch.cpp
1060 ↗(On Diff #89061)

I understand why you need this, but IIUC the CurLoopInstructions table is unnecessary.

We can solve the problem it solves in either of two ways:

  • Remove cases from switches after unswitching them. That is, with switch (V) { case 0: ... } the V != 0 version of the loop does not have a case 0:. This would prevent us from unswitching on that condition again by definition.
  • Do not unswitch on cases that lead to a block that solely contains unreachable. This way we will avoid doing redundant work generally (simplifycfg will clean these trivial cases up), and also prevent us unswitching switch es that we just unswitched.

So while your change itself is okay, depending on how much time you have it would be nice to remove CurLoopInstructions altogether.

trentxintong added inline comments.Feb 27 2017, 10:10 AM
lib/Transforms/Scalar/LoopUnswitch.cpp
1060 ↗(On Diff #89061)
  • Remove cases from switches after unswitching them. I think this is possible.
  • Do not unswitch on cases that lead to a block that solely contains unreachable. - We can not split critical edges with some obscure cases correctly right now, say we have 2 edges from BB-A to BB-B, when we call SplitEdge, we do not know which one is split. To be more specific, the one returned by GetSuccessorNumber(BB, Succ) is split. In such case we can not mark a case with unreachable because we may not get a basic block to put that unreachable instruction. This can be fixed as well.

I am going to land this for now. And think about what i can do to make this better sometime next month or so.

This revision was automatically updated to reflect the committed changes.

@sanjoy Thanks alot for spending time with me to drive this forward.