This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Implement the suggested ctlz transform
AbandonedPublic

Authored by shawnl on Apr 25 2019, 4:23 PM.

Details

Reviewers
jmolloy
Summary

This is patch 5 is a series beginning with D61150

Diff Detail

Repository
rL LLVM

Event Timeline

shawnl created this revision.Apr 25 2019, 4:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2019, 4:23 PM
nikic added a subscriber: nikic.Apr 26 2019, 12:32 AM

Can you please point to patch 5? I think it's missing. (Best directly add it as a parent revision.)

jmolloy requested changes to this revision.Apr 26 2019, 1:42 AM

Thanks for implementing this! It's really cool.

lib/Transforms/Utils/SimplifyCFG.cpp
5492

s/Another/A. "Another" requires the user to know some context which has now gone.

5493

TODO -> FIXME

5494

No underscores in function names. ReduceSwitchRangeWithCtlz?

5495

It looks to me like this isn't clang-formatted. Can you git-clang-format all your diffs?

5496

You don't need DL or TTI here, please remove them.

5497

I'd use a reference here. It's what most of the rest of LLVM uses.

5498

s/reagarding/regarding.

"when given input 0, they return the bitwidth" would be a more natural phrasing.

5500

I would avoid talking about the solution you didn't implement. Checking for zero input (which your popcount does) is a totally reasonable and understandable thing (and much more understandable than trying to grok the xor/rotate explanation to then find it isn't used :D )

5503

getIntegerBitWidth() is defined on Type. You can avoid the cast.

unsigned BitWidth = SI->getCondition()->getType()->getIntegerBitWidth()
5515

This would be a pattern for instcombine to pick up. We shouldn't have to worry about this here.

5519

Isn't the order exactly reversed?

llvm::reverse(Values);
5520

And if you do need to sort, please use

llvm::sort(*Values)
5553

Please keep using SmallVector.

5556

Please keep using llvm::sort.

5620

Please remove the extra scope. There's no RAII objects here, so they aren't required.

5624

IntegerType::get(Ty->getContext(), 1)

Change to

Builder.getInt1(false);

And please change "Zero" to a more descriptive name for the second parameter to ctlz.

5632

See comments on diffbase about fshr. I don't think you need it, the rotate sequence is good enough. Is there a good reason why you think fshr is better?

5645

Why are you converting from APInt to uint64_t to APInt again? Just remove .getLimitedValue() and keep Sub.lshr() in the next line?

test/Transforms/SimplifyCFG/rangereduce.ll
85

Uh, something's fishy here - why was this fshr here in the first place? This switch just needs a subtract.

121

Your previous patch is clearly bogus; this is a shift by zero.

This revision now requires changes to proceed.Apr 26 2019, 1:42 AM
shawnl marked 6 inline comments as done.Apr 26 2019, 5:23 AM
shawnl added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
5500

which your popcount does

It does not. It only checks if the MAXIMUM is 1.

5515

Yes we do, because InstCombine does not know how to rearrange case statements.

5519

Nope, the first becomes the last. I guess I could use swap() and move().

5553

I was having problems when I tried to modify the reference to SmallVectorImpl that I did not know how to fix.

test/Transforms/SimplifyCFG/rangereduce.ll
121

shifts by zero are well defined, and get cleaned up by the optimizer, but I agree it didn't need to be inserted.

shawnl edited the summary of this revision. (Show Details)Apr 26 2019, 5:24 AM
lebedev.ri added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
5515

Sounds like a missed optimization on instcombine side then?
Trying to handle every single unrelated micro-transform in every optimization is a path to failure.

jmolloy added inline comments.Apr 26 2019, 6:10 AM
lib/Transforms/Utils/SimplifyCFG.cpp
5519

I'm not sure I agree with that, or at least I'm missing something.

Given a sorted array A of integers with popcount 1 (single bit set), I believe:

order(A) == order(cttz(A))  //  cttz does not change the order.
order(A) == reverse(order(ctlz(A))) // Conversely ctlz reverses the order.

If you need to handle the zero value too, then I would have thought cttz would barrel-shift as you describe and ctlz would barrel-shift the reversed order.

My comment above was based on the assumption that you're not allowing zero to be passed to ctlz/cttz. Given you are, just keep on using llvm::sort.

5553

We can fix that. Please keep using SmallVector and we can work through the difficulties.

shawnl marked 3 inline comments as done.Apr 28 2019, 12:24 AM
shawnl added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
5515

Yeah I am working on some patches.

5519

Oh yes, I was think about cttz, but now I am only using ctlz, so I can use llvm::reverse()

shawnl updated this revision to Diff 197008.Apr 28 2019, 1:24 AM
shawnl planned changes to this revision.Apr 28 2019, 7:56 PM

This transform is deeply flawed, and needs much work. If the default is not reachable, there is no need to limit to countPopulation == 1, only assure that there are no duplicates. If the default _is_ reachable, then countPopulation <= 1 continues to be important, but also ctlz and cttz must be run, except in very specific circumstances (such as utf-8 first character parsing)

shawnl updated this revision to Diff 197296.Apr 30 2019, 4:09 AM
jmolloy requested changes to this revision.Apr 30 2019, 5:32 AM
This revision now requires changes to proceed.Apr 30 2019, 5:32 AM
shawnl updated this revision to Diff 197334.Apr 30 2019, 8:03 AM
jmolloy requested changes to this revision.May 1 2019, 1:58 AM
jmolloy added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
5520

This loop is scary and hard to read. It's really asking for a helper function. I would do something like this:

// Given a set of equivalence classes, return true if, for every pair A, B in
// Values:
//   Fn(A) == Fn(B) if and only if EC(A) == EC(B).
template <typename Function>
bool IsBijectiveOverEquivalenceClasses(const EquivalenceClasses<APInt> &Values,
                                       const Function &Fn) {
  DenseMap<unsigned, APInt> Leaders;
  for (auto &V : Values) {
    auto TransformedValue = Fn(V.getData());
    auto LeaderIt = Leaders.find(TransformedValue);
    if (LeadersIt == Leaders.end()) {
      Leaders[TransformedValue] = V.getLeader()->getData();
      continue;
    }
    if (LeaderIt->getData() != V.getData())
      // This transformed value has already been seen and did not belong to the
      // same equivalence class. This is not a bijective relation.
      return false;
  }
  return true;
}

static bool CanReduceSwitchRangeWithCtlzOrCttz(SwitchInst &SI, bool &UseCtlz,
                                               bool &UseCttz, bool &UseInvert) {
  // Preconstruct equivalence classes of values based on successor block.
  DenseMap<BasicBlock *, APInt> Last;
  EquivalenceClasses<APInt> ECs;
  for (auto CI : SI.cases()) {
    auto I = Last.try_emplace(CI.getCaseSuccessor(), CI.getCaseValue()).first;
    ECs.unionSets(I.second, V);
  }
  UseInvert = false;

  UseCtlz = true;
  if (IsBijectiveOverEquivalenceClasses(
          ECs, [&](const APInt &V) { return V.countLeadingZeros(); }))
    return true;

  UseCtlz = false;
  if (IsBijectiveOverEquivalenceClasses(
          ECs, [&](const APInt &V) { return V.countTrailingZeros(); }))
    return true;

  UseInvert = true;
  UseCtlz = true;
  if (IsBijectiveOverEquivalenceClasses(
          ECs, [&](const APInt &V) { return (~V).countLeadingZeros(); }))
    return true;

  UseCtlz = false;
  if (IsBijectiveOverEquivalenceClasses(
          ECs, [&](const APInt &V) { return (~V).countTrailingZeros(); }))
    return true;

  return false;
}

static bool ReduceSwitchRangeWithCtlzOrCttz(SwitchInst &SI,
                                            SmallVectorImpl<uint64_t> &Values,
                                            bool &UseCtlz, bool &UseCttz,
                                            bool &UseInvert) {
  // Note we don't pass Values into this function. This predicate uses the
  // switch's current case values *and* their associated successor block, so a
  // rearranged Values array cannot be analyzed (because we wouldn't know which
  // successsor belonged to which value).
  if (!CanReduceSwitchRangeWithCtlzOrCttz(SI, UseCtlz, UseCttz, UseInvert))
    return false;

  for (auto CI : SI.cases()) {
    APInt I = CI.getCaseValue();
    if (UseCtlz)
      Values.push_back((Invert ? ~V : V).countTrailingZeros());
    else
      Values.push_back((Invert ? ~V : V).countLeadingZeros());
  }
  llvm::sort(Values);
  return true;
}
5532

This is a fallacy (will always be false) because:

  1. SwitchInst::CaseHandle is unique based on switched value. There is one CaseHandle for every value in Values.
  2. Prev is zero-initialized just above this line, in the loop.
This revision now requires changes to proceed.May 1 2019, 1:58 AM
shawnl abandoned this revision.Jun 14 2019, 8:31 AM

This optimization is not really worth it, for the amount of effort needed to implement it.