LowerBitSets: Use byte arrays instead of bit sets to represent in-memory bit sets.

By loading from indexed offsets into a byte array and applying a mask, a
program can test bits from the bit set with a relatively short instruction
sequence. For example, suppose we have 15 bit sets to lay out:

A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits),
F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits),
L (4 bits), M (3 bits), N (2 bits), O (1 bit)

These bits can be laid out in a 16-byte array like this:

      Byte Offset
    0123456789ABCDEF
Bit
  7 HHHHHHHHHIIIIIII
  6 GGGGGGGGGGJJJJJJ
  5 FFFFFFFFFFFKKKKK
  4 EEEEEEEEEEEELLLL
  3 DDDDDDDDDDDDDMMM
  2 CCCCCCCCCCCCCCNN
  1 BBBBBBBBBBBBBBBO
  0 AAAAAAAAAAAAAAAA

For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to
test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done
in 1-2 machine instructions on x86, or 4-6 instructions on ARM.
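
The bit test described above can be sketched in C++ as follows. This is
only an illustration of the lookup, not the code the pass emits: the
PackedBitSet struct, the testBit helper, and the SetA/SetI constants are
hypothetical names, with values taken from the example layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical descriptor for one bit set packed into the shared byte array.
struct PackedBitSet {
  std::size_t byteOffset; // index of the byte that holds the set's bit 0
  std::uint8_t mask;      // single-bit mask selecting the set's bit position
  std::size_t size;       // number of bits in the set
};

// Returns true if bit x of `set` is 1 in the byte array `bits`.
// Each bit of the set lives in its own byte, always at the same bit position.
inline bool testBit(const std::uint8_t *bits, const PackedBitSet &set,
                    std::size_t x) {
  assert(x < set.size && "bit index out of range");
  return (bits[set.byteOffset + x] & set.mask) != 0;
}

// Descriptors for two sets from the example layout (assumed values):
// A occupies bit 0 of bytes 0..15, I occupies bit 7 of bytes 9..15.
constexpr PackedBitSet SetA{0, 0x01, 16};
constexpr PackedBitSet SetI{9, 0x80, 7};
```

With these descriptors, testBit(bits, SetA, X) reduces to
((bits[X] & 1) != 0) and testBit(bits, SetI, X) reduces to
((bits[9 + X] & 0x80) != 0).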

This uses the LPT multiprocessor scheduling algorithm to lay out the bits
efficiently.
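
A minimal sketch of how LPT (longest processing time first) could map
onto this problem, assuming the eight bit positions of a byte play the
role of machines and each bit set is a job whose length is its size in
bits. The Placement struct and layOutLPT function are hypothetical names
for illustration and are not taken from the pass itself.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Where one bit set ends up in the shared byte array.
struct Placement {
  std::size_t byteOffset; // byte holding the set's bit 0
  std::uint8_t mask;      // single-bit mask for the chosen bit position
};

// Greedy LPT: visit sets from largest to smallest and append each one to
// the bit position that currently has the fewest bytes in use.
// Returns one Placement per input set, in input order.
inline std::vector<Placement> layOutLPT(const std::vector<std::size_t> &sizes) {
  std::vector<std::size_t> order(sizes.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return sizes[a] > sizes[b];
                   });

  std::array<std::size_t, 8> used{}; // bytes already claimed per bit position
  std::vector<Placement> result(sizes.size());
  for (std::size_t idx : order) {
    // min_element returns the first minimum, so ties go to the lowest bit.
    std::size_t bit = static_cast<std::size_t>(
        std::min_element(used.begin(), used.end()) - used.begin());
    result[idx] = {used[bit], static_cast<std::uint8_t>(1u << bit)};
    used[bit] += sizes[idx];
  }
  return result;
}
```

Given the sizes 16, 15, ..., 9, 7, 6, ..., 1 from the example, this sketch
places I at byte offset 9 with mask 0x80, matching the layout shown
earlier; the required array length is the largest per-bit total, here 16
bytes.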

Saves ~450KB of instructions in a recent build of Chromium.

Differential Revision: http://reviews.llvm.org/D7954