https://alive2.llvm.org/ce/z/94yRMN
Fixes #54177
Differential D122077
[InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2 hkmatsumoto on Mar 19 2022, 1:16 PM. Authored by
Details
Diff Detail
Event Timeline
Comment Actions Address code reviews
foldOrOfCtpop folds the aforementioned pattern into (icmp uge ctpop(X) 2)
Comment Actions Reflect code reviews
Comment Actions A few changes for tests suggested inline. There might be some generalization of ctpop analysis that we can make as a follow-up patch.
Comment Actions Reflect code reviews
Comment Actions Thank you for the making the test changes. I pushed the baseline tests on your behalf here: Please rebase and update the patch here in Phabricator - it should only show changes in the CHECK lines after the update. Comment Actions Why is this fold preferable to (X & (X-1)) == 0? At least on all architectures without native population count, the binary-and based test is preferable and it might even be better with it. Comment Actions Less IR instructions. I think SDAG already expands some ctpop patterns to logic (backends should decide about optimal form) Comment Actions Correct - we try to convert the setcc(ctpop) pattern here: That doesn't check whether the target has a popcount instruction for scalar types, so it is potentially too aggressive. For x86, we have a bonus optimization after that, so it should not show up there:
Comment Actions @spatel Since I don't have commit access, can you land this patch? |
Why create ">= 2" instead of "> 1" directly?
I don't think it makes the transform or code any clearer with ">= 2", and we will always canonicalize to the other form, so I would prefer to go directly to the final result for efficiency.