The one thing of note here is that the 'bitwidth' constant (32/64) was previously pessimistic.
Given x & (-1 >> (C - z)), we were taking C to be bitwidth(x), but what we really
want is for the (-1 >> (C - z)) pattern to mean "the low z bits are all-ones".
For that, C should be bitwidth(-1 >> (C - z)), i.e. the bit width of the shift operation itself.
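
To illustrate why C has to match the width of the shift, here is a minimal Python sketch that models a logical right shift of all-ones at a fixed bit width. The helper names (lshr_all_ones, low_bits_mask) are hypothetical and only serve the illustration; they are not part of the patch.

```python
def lshr_all_ones(width: int, amount: int) -> int:
    """Model a logical right shift of the all-ones value in a `width`-bit type."""
    assert 0 <= amount < width, "shift amount must be smaller than the bit width"
    return ((1 << width) - 1) >> amount


def low_bits_mask(z: int) -> int:
    """Mask with exactly the low `z` bits set."""
    return (1 << z) - 1


z = 5

# Shift performed in a 64-bit type with C = 64: exactly the low z bits are set.
assert lshr_all_ones(64, 64 - z) == low_bits_mask(z)

# Shift still performed in a 64-bit type, but with C = 32 (e.g. the width of a
# narrower x): the mask now covers z + 32 low bits, not z.
assert lshr_all_ones(64, 32 - z) == low_bits_mask(z + 32)
```

In the second case the mask only means "low z bits" if C is taken from the type in which the shift is actually performed.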
The last pattern, D, does not seem to exhibit any of these truncation issues.
It does have the opposite problem, though: if we extract low bits (no shift) from an i64
and then truncate to i32, we fail to shrink the 64-bit extraction into a 32-bit extraction.
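
The missed shrinking can be sketched as follows, assuming pattern D has the shape x << (C - z) >> (C - z) with a logical right shift (the pattern is not spelled out in this message, so that shape is an assumption). The helpers pattern_d and trunc are hypothetical names used only for this sketch.

```python
def trunc(x: int, width: int) -> int:
    """Keep only the low `width` bits of x."""
    return x & ((1 << width) - 1)


def pattern_d(x: int, z: int, width: int) -> int:
    """Assumed shape of pattern D in a `width`-bit type, with C == width."""
    amount = width - z
    assert 0 <= amount < width, "shift amount must be smaller than the bit width"
    return trunc(x << amount, width) >> amount


x = 0x0123456789ABCDEF

# For 1 <= z <= 32, extracting in 64 bits and then truncating to 32 bits gives
# the same value as truncating first and extracting in 32 bits.
for z in range(1, 33):
    wide_then_trunc = trunc(pattern_d(x, z, 64), 32)
    trunc_then_narrow = pattern_d(trunc(x, 32), z, 32)
    assert wide_then_trunc == trunc_then_narrow
```

Under that assumption the two forms agree for every z in the checked range, which is why the 64-bit extraction could in principle be narrowed to 32 bits.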