This is patch 5 is a series beginning with D61150
Details
Diff Detail
Event Timeline
Can you please point to patch 5? I think it's missing. (Best directly add it as a parent revision.)
Thanks for implementing this! It's really cool.
| lib/Transforms/Utils/SimplifyCFG.cpp | ||
|---|---|---|
| 5492 | s/Another/A. "Another" requires the user to know some context which has now gone. | |
| 5493 | TODO -> FIXME | |
| 5494 | No underscores in function names. ReduceSwitchRangeWithCtlz? | |
| 5495 | It looks to me like this isn't clang-formatted. Can you git-clang-format all your diffs? | |
| 5496 | You don't need DL or TTI here, please remove them. | |
| 5497 | I'd use a reference here. It's what most of the rest of LLVM uses. | |
| 5498 | s/reagarding/regarding. "when given input 0, they return the bitwidth" would be a more natural phrasing. | |
| 5500 | I would avoid talking about the solution you didn't implement. Checking for zero input (which your popcount does) is a totally reasonable and understandable thing (and much more understandable than trying to grok the xor/rotate explanation to then find it isn't used :D ) | |
| 5503 | getIntegerBitWidth() is defined on Type. You can avoid the cast. unsigned BitWidth = SI->getCondition()->getType()->getIntegerBitWidth() | |
| 5515 | This would be a pattern for instcombine to pick up. We shouldn't have to worry about this here. | |
| 5519 | Isn't the order exactly reversed? llvm::reverse(Values); | |
| 5520 | And if you do need to sort, please use llvm::sort(*Values) | |
| 5553 | Please keep using SmallVector. | |
| 5556 | Please keep using llvm::sort. | |
| 5620 | Please remove the extra scope. There's no RAII objects here, so they aren't required. | |
| 5624 |
Change to Builder.getInt1(false); And please change "Zero" to a more descriptive name for the second parameter to ctlz. | |
| 5632 | See comments on diffbase about fshr. I don't think you need it, the rotate sequence is good enough. Is there a good reason why you think fshr is better? | |
| 5645 | Why are you converting from APInt to uint64_t to APInt again? Just remove .getLimitedValue() and keep Sub.lshr() in the next line? | |
| test/Transforms/SimplifyCFG/rangereduce.ll | ||
| 84–85 | Uh, something's fishy here - why was this fshr here in the first place? This switch just needs a subtract. | |
| 118–119 | Your previous patch is clearly bogus; this is a shift by zero. | |
| lib/Transforms/Utils/SimplifyCFG.cpp | ||
|---|---|---|
| 5500 |
It does not. It only checks if the MAXIMUM is 1. | |
| 5515 | Yes we do, because InstCombine does not know how to rearrange case statements. | |
| 5519 | Nope, the first becomes the last. I guess I could use swap() and move(). | |
| 5553 | I was having problems when I tried to modify the reference to SmallVectorImpl that I did not know how to fix. | |
| test/Transforms/SimplifyCFG/rangereduce.ll | ||
| 118–119 | shifts by zero are well defined, and get cleaned up by the optimizer, but I agree it didn't need to be inserted. | |
| lib/Transforms/Utils/SimplifyCFG.cpp | ||
|---|---|---|
| 5515 | Sounds like a missed optimization on instcombine side then? | |
| lib/Transforms/Utils/SimplifyCFG.cpp | ||
|---|---|---|
| 5519 | I'm not sure I agree with that, or at least I'm missing something. Given a sorted array A of integers with popcount 1 (single bit set), I believe: order(A) == order(cttz(A)) // cttz does not change the order. order(A) == reverse(order(ctlz(A))) // Conversely ctlz reverses the order. If you need to handle the zero value too, then I would have thought cttz would barrel-shift as you describe and ctlz would barrel-shift the reversed order. My comment above was based on the assumption that you're not allowing zero to be passed to ctlz/cttz. Given you are, just keep on using llvm::sort. | |
| 5553 | We can fix that. Please keep using SmallVector and we can work through the difficulties. | |
This transform is deeply flawed, and needs much work. If the default is not reachable, there is no need to limit to countPopulation == 1, only assure that there are no duplicates. If the default _is_ reachable, then countPopulation <= 1 continues to be important, but also ctlz and cttz must be run, except in very specific circumstances (such as utf-8 first character parsing)
| lib/Transforms/Utils/SimplifyCFG.cpp | ||
|---|---|---|
| 5520 | This loop is scary and hard to read. It's really asking for a helper function. I would do something like this: // Given a set of equivalence classes, return true if, for every pair A, B in
// Values:
// Fn(A) == Fn(B) if and only if EC(A) == EC(B).
template <typename Function>
bool IsBijectiveOverEquivalenceClasses(const EquivalenceClasses<APInt> &Values,
const Function &Fn) {
DenseMap<unsigned, APInt> Leaders;
for (auto &V : Values) {
auto TransformedValue = Fn(V.getData());
auto LeaderIt = Leaders.find(TransformedValue);
if (LeadersIt == Leaders.end()) {
Leaders[TransformedValue] = V.getLeader()->getData();
continue;
}
if (LeaderIt->getData() != V.getData())
// This transformed value has already been seen and did not belong to the
// same equivalence class. This is not a bijective relation.
return false;
}
return true;
}
static bool CanReduceSwitchRangeWithCtlzOrCttz(SwitchInst &SI, bool &UseCtlz,
bool &UseCttz, bool &UseInvert) {
// Preconstruct equivalence classes of values based on successor block.
DenseMap<BasicBlock *, APInt> Last;
EquivalenceClasses<APInt> ECs;
for (auto CI : SI.cases()) {
auto I = Last.try_emplace(CI.getCaseSuccessor(), CI.getCaseValue()).first;
ECs.unionSets(I.second, V);
}
UseInvert = false;
UseCtlz = true;
if (IsBijectiveOverEquivalenceClasses(
ECs, [&](const APInt &V) { return V.countLeadingZeros(); }))
return true;
UseCtlz = false;
if (IsBijectiveOverEquivalenceClasses(
ECs, [&](const APInt &V) { return V.countTrailingZeros(); }))
return true;
UseInvert = true;
UseCtlz = true;
if (IsBijectiveOverEquivalenceClasses(
ECs, [&](const APInt &V) { return (~V).countLeadingZeros(); }))
return true;
UseCtlz = false;
if (IsBijectiveOverEquivalenceClasses(
ECs, [&](const APInt &V) { return (~V).countTrailingZeros(); }))
return true;
return false;
}
static bool ReduceSwitchRangeWithCtlzOrCttz(SwitchInst &SI,
SmallVectorImpl<uint64_t> &Values,
bool &UseCtlz, bool &UseCttz,
bool &UseInvert) {
// Note we don't pass Values into this function. This predicate uses the
// switch's current case values *and* their associated successor block, so a
// rearranged Values array cannot be analyzed (because we wouldn't know which
// successsor belonged to which value).
if (!CanReduceSwitchRangeWithCtlzOrCttz(SI, UseCtlz, UseCttz, UseInvert))
return false;
for (auto CI : SI.cases()) {
APInt I = CI.getCaseValue();
if (UseCtlz)
Values.push_back((Invert ? ~V : V).countTrailingZeros());
else
Values.push_back((Invert ? ~V : V).countLeadingZeros());
}
llvm::sort(Values);
return true;
} | |
| 5532 | This is a fallacy (will always be false) because:
| |
This optimization is not really worth it, for the amount of effort needed to implement it.
s/Another/A. "Another" requires the user to know some context which has now gone.