This is an archive of the discontinued LLVM Phabricator instance.

Use bitset for assembler predicates
ClosedPublic

Authored by rampitec on Mar 5 2019, 3:17 PM.

Details

Summary

AMDGPU target run out of Subtarget feature flags hitting the limit of 64.
AssemblerPredicates uses at most uint64_t for their representation.
At the same time CodeGen has exhausted this a long time ago and switched
to a FeatureBitset with the current limit of 192 bits.

This patch completes transition to the bitset for feature bits extending
it to asm matcher and MC code emitter.

Diff Detail

Event Timeline

rampitec created this revision.Mar 5 2019, 3:17 PM
rampitec updated this revision to Diff 189636.Mar 6 2019, 6:40 PM

Reduced tables size by creating a unique set of FeatureBitsets and referencing them from the tables.
That solves the problem with gcc hanging while compiling generated asm matcher if used with sanitizers.

craig.topper added inline comments.Mar 6 2019, 7:17 PM
include/llvm/MC/MCParser/MCTargetAsmParser.h
258

return a reference

363–368

Should this return a "const FeatureBitset&"?

364

Should probably take "const FeatureBitset&"

lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
4169–4170

Probably should pass the bits by const reference?

lib/Target/AArch64/MCTargetDesc/AArch64MCCodeEmitter.cpp
190–193

const FeatureBitset&

lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCCodeEmitter.h
67–70

const FeatureBitset&

lib/Target/ARM/AsmParser/ARMAsmParser.cpp
6071

const FeatureBitset&

10482

Can be a const reference?

10590–10591

This invokes the std::initializer_list constructor in FeatureBitset that can't be evaluated at compile time. Thus this is calling a static global constructor.

I just recently fixed this issue in the *GenSubtargetInfo.inc files

rampitec updated this revision to Diff 189764.Mar 7 2019, 11:58 AM
rampitec marked 9 inline comments as done.

Switched to const references to FeatureBitset where possible.

lib/Target/ARM/AsmParser/ARMAsmParser.cpp
10590–10591

I understand the concern, but use of FeatureBitArray makes it very dangerous for manual initialization since it does not provide any symbolic names for the initializers. If an individual bit will move to another dword it will start silently malfunction.

So far I have moved the array inside the consuming function in a hope that a smart compiler can postpone actual initialization until the first use, which may never happen. I suppose a best solution would be to change it when the above FIXME is resolved, so that this array will be auto generated by tablegen.

craig.topper added inline comments.Mar 8 2019, 2:28 PM
lib/Target/ARM/AsmParser/ARMAsmParser.cpp
10484

This seems to be making the assumption that ARM feature bits fit in a unisgned long long?

utils/TableGen/AsmMatcherEmitter.cpp
2835–2836

Can we bound this to less than a full unsigned?

3369

How many entries end up in this table on different targets? Since we're tablegening this, can we use hex contants and use FeatureBitArray instead of invoking the std::initializer_list constructor? That probably depends on how many entries end up in the table.

3395–3396

Surely we can bound this to less than a full unsigned? X86 uses almost no features in the assembler which means we can probably use a uint8_t and MatchEntry is used about 28000 times. So that's going to save like 3 * 28000 bytes just from X86.

rampitec updated this revision to Diff 189962.Mar 8 2019, 4:44 PM
rampitec marked 5 inline comments as done.

Reduced type size.

lib/Target/ARM/AsmParser/ARMAsmParser.cpp
10484

Yes, if exceptions enabled it would raise an exception. The set cannot be constructed from a FeatureBitset since it does not have comparison operators.
I would add an assert, but unfortunately cannot enumerate all features. The solution can be to use a SmallVector, but as long as it fits 64 bit this seems to be an overkill.

utils/TableGen/AsmMatcherEmitter.cpp
2835–2836

Yes, 8 bit is enough.

3369

Asm matcher:

MSP430:0
AArch64:32
ARM:88
WebAssembly:7
Hexagon:11
BPF:0
Sparc:5
AMDGPU:55
Mips:137
PowerPC:10
SystemZ:28
Lanai:0
X86:5

MCCodeEmitter:

MSP430:0
AArch64:28
ARM:81
WebAssembly:8
Hexagon:11
BPF:0
Sparc:5
AMDGPU:68
AMDGPU:0
Mips:149
PowerPC:0
SystemZ:29
Lanai:0

Do you think we need FeatureBitArray here? It has a cost as well, every time one calls getAsBitset().

rampitec updated this revision to Diff 189967.Mar 8 2019, 4:56 PM

Added FeatureBitset::operator < () so that ARM can use set of it.

rampitec marked 3 inline comments as done.Mar 8 2019, 4:58 PM
rampitec added inline comments.
lib/Target/ARM/AsmParser/ARMAsmParser.cpp
10484

I have just added comparison operator to be on a safe side.

utils/TableGen/AsmMatcherEmitter.cpp
3395–3396

BTW, table is now smaller. It used to be full 64 bit before per record. OperandMatchEntry can also be rearranged to save on padding now.

craig.topper accepted this revision.Mar 8 2019, 5:17 PM

LGTM

utils/TableGen/AsmMatcherEmitter.cpp
3395–3396

It wasn't a full 64 bit before, getMinimalTypeForEnumBitfield was making it uint8_t for X86.

This revision is now accepted and ready to land.Mar 8 2019, 5:17 PM
rampitec marked an inline comment as done.Mar 8 2019, 5:26 PM
rampitec added inline comments.
utils/TableGen/AsmMatcherEmitter.cpp
3395–3396

Oh, I see. It was for us AMDGPU. Also a pretty big table.

rampitec updated this revision to Diff 189974.Mar 8 2019, 5:29 PM

Use getMinimalTypeForRange() for FeatureBitset types. In case it will grow.
Resorted operand match table fields to decrease table size.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2019, 10:04 AM
Herald added a subscriber: jrtc27. · View Herald Transcript