https://alive2.llvm.org/ce/z/94yRMN
Fixes #54177
Differential D122077
[InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2 hkmatsumoto on Mar 19 2022, 1:16 PM. Authored by
Details
Diff Detail
Unit Tests Event Timeline
Comment Actions Address code reviews
foldOrOfCtpop folds the aforementioned pattern into (icmp uge ctpop(X) 2)
Comment Actions Reflect code reviews
Comment Actions A few changes for tests suggested inline. There might be some generalization of ctpop analysis that we can make as a follow-up patch.
Comment Actions Reflect code reviews
Comment Actions Thank you for the making the test changes. I pushed the baseline tests on your behalf here: Please rebase and update the patch here in Phabricator - it should only show changes in the CHECK lines after the update. Comment Actions Why is this fold preferable to (X & (X-1)) == 0? At least on all architectures without native population count, the binary-and based test is preferable and it might even be better with it. Comment Actions Less IR instructions. I think SDAG already expands some ctpop patterns to logic (backends should decide about optimal form) Comment Actions Correct - we try to convert the setcc(ctpop) pattern here: That doesn't check whether the target has a popcount instruction for scalar types, so it is potentially too aggressive. For x86, we have a bonus optimization after that, so it should not show up there:
Comment Actions @spatel Since I don't have commit access, can you land this patch? |
Why create ">= 2" instead of "> 1" directly?
I don't think it makes the transform or code any clearer with ">= 2", and we will always canonicalize to the other form, so I would prefer to go directly to the final result for efficiency.