This is patch 5 is a series beginning with D61150
Details
Diff Detail
Event Timeline
Can you please point to patch 5? I think it's missing. (Best directly add it as a parent revision.)
Thanks for implementing this! It's really cool.
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5492 | s/Another/A. "Another" requires the user to know some context which has now gone. | |
5493 | TODO -> FIXME | |
5494 | No underscores in function names. ReduceSwitchRangeWithCtlz? | |
5495 | It looks to me like this isn't clang-formatted. Can you git-clang-format all your diffs? | |
5496 | You don't need DL or TTI here, please remove them. | |
5497 | I'd use a reference here. It's what most of the rest of LLVM uses. | |
5498 | s/reagarding/regarding. "when given input 0, they return the bitwidth" would be a more natural phrasing. | |
5500 | I would avoid talking about the solution you didn't implement. Checking for zero input (which your popcount does) is a totally reasonable and understandable thing (and much more understandable than trying to grok the xor/rotate explanation to then find it isn't used :D ) | |
5503 | getIntegerBitWidth() is defined on Type. You can avoid the cast. unsigned BitWidth = SI->getCondition()->getType()->getIntegerBitWidth() | |
5515 | This would be a pattern for instcombine to pick up. We shouldn't have to worry about this here. | |
5519 | Isn't the order exactly reversed? llvm::reverse(Values); | |
5520 | And if you do need to sort, please use llvm::sort(*Values) | |
5553 | Please keep using SmallVector. | |
5556 | Please keep using llvm::sort. | |
5620 | Please remove the extra scope. There's no RAII objects here, so they aren't required. | |
5624 |
Change to Builder.getInt1(false); And please change "Zero" to a more descriptive name for the second parameter to ctlz. | |
5632 | See comments on diffbase about fshr. I don't think you need it, the rotate sequence is good enough. Is there a good reason why you think fshr is better? | |
5645 | Why are you converting from APInt to uint64_t to APInt again? Just remove .getLimitedValue() and keep Sub.lshr() in the next line? | |
test/Transforms/SimplifyCFG/rangereduce.ll | ||
85 ↗ | (On Diff #196754) | Uh, something's fishy here - why was this fshr here in the first place? This switch just needs a subtract. |
121 ↗ | (On Diff #196754) | Your previous patch is clearly bogus; this is a shift by zero. |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5500 |
It does not. It only checks if the MAXIMUM is 1. | |
5515 | Yes we do, because InstCombine does not know how to rearrange case statements. | |
5519 | Nope, the first becomes the last. I guess I could use swap() and move(). | |
5553 | I was having problems when I tried to modify the reference to SmallVectorImpl that I did not know how to fix. | |
test/Transforms/SimplifyCFG/rangereduce.ll | ||
121 ↗ | (On Diff #196754) | shifts by zero are well defined, and get cleaned up by the optimizer, but I agree it didn't need to be inserted. |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5515 | Sounds like a missed optimization on instcombine side then? |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5519 | I'm not sure I agree with that, or at least I'm missing something. Given a sorted array A of integers with popcount 1 (single bit set), I believe: order(A) == order(cttz(A)) // cttz does not change the order. order(A) == reverse(order(ctlz(A))) // Conversely ctlz reverses the order. If you need to handle the zero value too, then I would have thought cttz would barrel-shift as you describe and ctlz would barrel-shift the reversed order. My comment above was based on the assumption that you're not allowing zero to be passed to ctlz/cttz. Given you are, just keep on using llvm::sort. | |
5553 | We can fix that. Please keep using SmallVector and we can work through the difficulties. |
This transform is deeply flawed, and needs much work. If the default is not reachable, there is no need to limit to countPopulation == 1, only assure that there are no duplicates. If the default _is_ reachable, then countPopulation <= 1 continues to be important, but also ctlz and cttz must be run, except in very specific circumstances (such as utf-8 first character parsing)
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5520 | This loop is scary and hard to read. It's really asking for a helper function. I would do something like this: // Given a set of equivalence classes, return true if, for every pair A, B in // Values: // Fn(A) == Fn(B) if and only if EC(A) == EC(B). template <typename Function> bool IsBijectiveOverEquivalenceClasses(const EquivalenceClasses<APInt> &Values, const Function &Fn) { DenseMap<unsigned, APInt> Leaders; for (auto &V : Values) { auto TransformedValue = Fn(V.getData()); auto LeaderIt = Leaders.find(TransformedValue); if (LeadersIt == Leaders.end()) { Leaders[TransformedValue] = V.getLeader()->getData(); continue; } if (LeaderIt->getData() != V.getData()) // This transformed value has already been seen and did not belong to the // same equivalence class. This is not a bijective relation. return false; } return true; } static bool CanReduceSwitchRangeWithCtlzOrCttz(SwitchInst &SI, bool &UseCtlz, bool &UseCttz, bool &UseInvert) { // Preconstruct equivalence classes of values based on successor block. DenseMap<BasicBlock *, APInt> Last; EquivalenceClasses<APInt> ECs; for (auto CI : SI.cases()) { auto I = Last.try_emplace(CI.getCaseSuccessor(), CI.getCaseValue()).first; ECs.unionSets(I.second, V); } UseInvert = false; UseCtlz = true; if (IsBijectiveOverEquivalenceClasses( ECs, [&](const APInt &V) { return V.countLeadingZeros(); })) return true; UseCtlz = false; if (IsBijectiveOverEquivalenceClasses( ECs, [&](const APInt &V) { return V.countTrailingZeros(); })) return true; UseInvert = true; UseCtlz = true; if (IsBijectiveOverEquivalenceClasses( ECs, [&](const APInt &V) { return (~V).countLeadingZeros(); })) return true; UseCtlz = false; if (IsBijectiveOverEquivalenceClasses( ECs, [&](const APInt &V) { return (~V).countTrailingZeros(); })) return true; return false; } static bool ReduceSwitchRangeWithCtlzOrCttz(SwitchInst &SI, SmallVectorImpl<uint64_t> &Values, bool &UseCtlz, bool &UseCttz, bool &UseInvert) { // Note we don't pass Values into this function. This predicate uses the // switch's current case values *and* their associated successor block, so a // rearranged Values array cannot be analyzed (because we wouldn't know which // successsor belonged to which value). if (!CanReduceSwitchRangeWithCtlzOrCttz(SI, UseCtlz, UseCttz, UseInvert)) return false; for (auto CI : SI.cases()) { APInt I = CI.getCaseValue(); if (UseCtlz) Values.push_back((Invert ? ~V : V).countTrailingZeros()); else Values.push_back((Invert ? ~V : V).countLeadingZeros()); } llvm::sort(Values); return true; } | |
5532 | This is a fallacy (will always be false) because:
|
This optimization is not really worth it, for the amount of effort needed to implement it.
s/Another/A. "Another" requires the user to know some context which has now gone.