As discussed in D89952, instcombine can sometimes find a way to reduce similar patterns, but it is incomplete.
InstSimplify uses the computeConstantRange() ValueTracking analysis via simplifyICmpWithConstant(), so we just need to fill in the max value of ctpop to process any "icmp pred ctpop(X), C" pattern (the min value is initialized to zero automatically).
Details
Diff Detail
Event Timeline
FYI -- I'm not sure if it'll change the test impact of this patch, but I did commit a bunch of x86 vector popcnt tests to the repository this morning.
IIUC those were codegen tests -- 8556f38 -- so there won't be any overlap with the change here. I think some targets do run -instsimplify as a pre-codegen IR pass, but x86 is not one of them.
In general, we want to have this kind of analysis in IR to efficiently reduce code as quickly as possible. If there's evidence that these patterns are created later in the pipeline, then we could repeat the transforms in codegen. But we don't want to have that redundancy blindly because everything has a compile-time cost.
Thanks. Unless less I missed it, we should also add the upper range for count leading/trailing zero too.
Yes - it should just be adding the case line on top of this and adding tests. But I prefer to do it in steps, so it's easier to debug if we hit any problems.
LGTM
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
6465 | Use Width instead of II.getType()->getScalarSizeInBits()? |
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
6465 | Yes, we already have the bit-width here. I'll also add an assert for 'Lower', so the zero initializer assumption doesn't break. |
Use Width instead of II.getType()->getScalarSizeInBits()?