This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add more patterns for BZHI isel
ClosedPublic

Authored by craig.topper on Apr 27 2017, 2:20 PM.

Details

Summary

This patch adds more patterns that a reasonable person might write that can be compiled to BZHI.

This adds support for

(~0U >> (32 - b)) & a;

and

a << (32 - b) >> (32 - b);

This was inspired by the code in APInt::clearUnusedBits.
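As a rough illustration of why both source forms are candidates for BZHI, the sketch below compares them against a portable model that keeps the low `b` bits of `a`. The names `bzhi32_model`, `mask_form` and `shift_form` are hypothetical helpers for this note, not part of the patch, and the model assumes that an index of 32 leaves the value unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of 32-bit BZHI for index in [0, 32]:
 * keep the low `index` bits of `a`. Assumes index == 32 masks nothing. */
static uint32_t bzhi32_model(uint32_t a, uint32_t index) {
    if (index >= 32)
        return a;
    return a & ((UINT32_C(1) << index) - 1);
}

/* First form from the summary: (~0U >> (32 - b)) & a.
 * Shifting a 32-bit value by 32 is undefined in C, so b must be 1..32. */
static uint32_t mask_form(uint32_t a, uint32_t b) {
    assert(b >= 1 && b <= 32);
    return (~UINT32_C(0) >> (32 - b)) & a;
}

/* Second form from the summary: a << (32 - b) >> (32 - b).
 * Same restriction on b; `a` must be unsigned so the right shift is logical. */
static uint32_t shift_form(uint32_t a, uint32_t b) {
    assert(b >= 1 && b <= 32);
    return a << (32 - b) >> (32 - b);
}
```

For every `b` from 1 to 32 both forms agree with the model. The `b == 32` case is the one that hands an index of 32 to the instruction, and `b == 0` is excluded because the C expressions themselves would shift by 32.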

This can pass an index of 32 to the bzhi instruction, which a quick test on Haswell hardware shows will not mask any bits. The description text in the Intel manual says the "index is saturated to OperandSize-1", but the pseudocode in the same manual indicates that no bits will be zeroed in this case.
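To make the difference between the two readings concrete, here is a minimal sketch that models each one in portable C. Both function names are hypothetical, and taking the index from the low 8 bits of the operand is my understanding of the manual's pseudocode rather than something stated in this review.

```c
#include <stdint.h>

/* Reading 1 (the manual's pseudocode, as summarized above):
 * an index >= 32 leaves the source unchanged. */
static uint32_t bzhi32_no_mask(uint32_t src, uint32_t index) {
    uint32_t n = index & 0xffu; /* assumption: only bits 7:0 are used */
    if (n >= 32)
        return src;
    return src & ((UINT32_C(1) << n) - 1);
}

/* Reading 2 (the "saturated to OperandSize-1" prose, taken literally):
 * an index >= 32 is clamped to 31, so bit 31 would be cleared. */
static uint32_t bzhi32_saturating(uint32_t src, uint32_t index) {
    uint32_t n = index & 0xffu; /* assumption: only bits 7:0 are used */
    if (n > 31)
        n = 31;
    return src & ((UINT32_C(1) << n) - 1);
}
```

With a source of 0xffffffff and an index of 32, the first model yields 0xffffffff and the second yields 0x7fffffff, so a single run on hardware is enough to tell the readings apart. The two models agree for every index below 32.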

I think this is still missing cases where the subtract portion is an 8-bit operation.

Event Timeline

craig.topper created this revision. Apr 27 2017, 2:20 PM
RKSimon edited edge metadata. Apr 30 2017, 4:49 AM

According to the AMD APM v3:

If the value of index is greater than or equal to the operand size, index is set to (op_size-1). In this case, the CF flag is set.

So AMD's description is similar to the incorrect Intel description. Do you have an AMD machine that you can verify this with?

I tested with

printf("%x", _bzhi_u32(0xffffffff, 32));

On Haswell that returned 0xffffffff.

Excavator returned 0xffffffff as well. The Ryzen is in pieces so I can't test it right now.

RKSimon accepted this revision. May 9 2017, 2:49 AM

Ryzen returns 0xffffffff as well.

LGTM.

This revision is now accepted and ready to land. May 9 2017, 2:49 AM
This revision was automatically updated to reflect the committed changes.