This is an archive of the discontinued LLVM Phabricator instance.

Differential D98046

[TableGen] Fix excessive compile time issue in FixedLenDecoderEmitter
ClosedPublic

Authored by foad on Mar 5 2021, 7:37 AM.

Details

Reviewers

dsanders
jmolloy
Joe_Nash
hgreving

Commits

rGb8bf94df2576: [TableGen] Fix excessive compile time issue in FixedLenDecoderEmitter

Summary

This patch reduces the time taken for clang to compile the generated
disassembler for an out-of-tree target with InsnType bigger than 64 bits
from 4m30s to 48s.

D67686 did a similar thing for CodeEmitterGen.

The idea is to tweak the API of the APInt-like InsnType class so that
we don't need so many temporary InsnTypes. This takes advantage of the
rule stated in D52100 that currently "no string of bits extracted
from the encoding may exceed 64-bits", so we can use uint64_t for some
temporaries.
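As a rough illustration of the API change described above, the sketch below pairs a hypothetical 128-bit `Insn128` type (invented for this example, not part of LLVM) with two `fieldFromInstruction` overloads shaped like the ones this diff emits. The point it tries to show is that the non-integral overload returns a plain `uint64_t`, so no temporary `InsnType` objects are built for the mask and shift.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical APInt-like 128-bit encoding type; an assumption for this sketch.
struct Insn128 {
  uint64_t Lo = 0, Hi = 0;

  // Return numBits (1..64) starting at startBit, zero-extended to uint64_t.
  uint64_t extractBitsAsZExtValue(unsigned numBits, unsigned startBit) const {
    assert(numBits >= 1 && numBits <= 64 && startBit + numBits <= 128);
    uint64_t V;
    if (startBit >= 64)
      V = Hi >> (startBit - 64);
    else if (startBit == 0)
      V = Lo;
    else
      V = (Lo >> startBit) | (Hi << (64 - startBit));
    return numBits == 64 ? V : V & ((uint64_t(1) << numBits) - 1);
  }
};

// Integral InsnType: mask and shift directly on the integer.
template <typename InsnType>
static std::enable_if_t<std::is_integral<InsnType>::value, InsnType>
fieldFromInstruction(const InsnType &insn, unsigned startBit,
                     unsigned numBits) {
  assert(startBit + numBits <= 64 && "Cannot support >64-bit extractions!");
  assert(startBit + numBits <= (sizeof(InsnType) * 8) &&
         "Instruction field out of bounds!");
  InsnType fieldMask;
  if (numBits == sizeof(InsnType) * 8)
    fieldMask = (InsnType)(-1LL);
  else
    fieldMask = (((InsnType)1 << numBits) - 1) << startBit;
  return (insn & fieldMask) >> startBit;
}

// APInt-like InsnType: delegate to the class and return a uint64_t.
template <typename InsnType>
static std::enable_if_t<!std::is_integral<InsnType>::value, uint64_t>
fieldFromInstruction(const InsnType &insn, unsigned startBit,
                     unsigned numBits) {
  return insn.extractBitsAsZExtValue(numBits, startBit);
}
```

Overload selection happens through `std::enable_if_t`, so callers write the same three-argument call for both kinds of `InsnType`.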

D52100 goes on to say that "fields are still permitted to exceed 64-bits
so long as they aren't one contiguous string of bits". This patch breaks
that by always using a "uint64_t tmp" in the generated decodeToMCInst,
but it should be easy to fix in FilterChooser::emitBinaryParser by
choosing to use a different type of tmp based on the known total field
width.
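The type selection used for `tmp` in this diff, and the follow-up fix suggested above, can be sketched as follows. `TmpTypeFor` restates the `std::conditional_t` expression from the generated `decodeToMCInst`; `WideInsn` and `fitsInUint64Tmp` are hypothetical names introduced only for illustration, and the width check is one possible reading of "based on the known total field width", not something this patch implements.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical non-integral encoding type, standing in for an APInt-like class.
struct WideInsn {
  uint64_t Words[2];
};

// Same selection as the generated code: keep integral InsnType as-is,
// otherwise fall back to uint64_t for the temporary.
template <typename InsnType>
using TmpTypeFor =
    std::conditional_t<std::is_integral<InsnType>::value, InsnType, uint64_t>;

static_assert(std::is_same<TmpTypeFor<uint32_t>, uint32_t>::value,
              "integral InsnType is used directly");
static_assert(std::is_same<TmpTypeFor<WideInsn>, uint64_t>::value,
              "APInt-like InsnType gets a uint64_t tmp");

// Hypothetical emitter-side check (an assumption, not in the patch): a
// uint64_t tmp is enough only when the operand's total width fits in 64 bits.
constexpr bool fitsInUint64Tmp(unsigned TotalFieldWidth) {
  return TotalFieldWidth <= 64;
}
```

With a check like this, an emitter could keep the cheap `uint64_t` temporary for most operands and switch to an `InsnType`-sized one only for operands wider than 64 bits.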

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

foad requested review of this revision. Mar 5 2021, 7:37 AM

foad created this revision.

Herald added a project: Restricted Project. Mar 5 2021, 7:37 AM

Herald added a subscriber: llvm-commits.

D52100 goes on to say that "fields are still permitted to exceed 64-bits
so long as they aren't one contiguous string of bits". This patch breaks
that by always using a "uint64_t tmp" in the generated decodeToMCInst,
but it should be easy to fix in FilterChooser::emitBinaryParser by
choosing to use a different type of tmp based on the known total field
width.

I didn't try to implement the fix because I don't have a target that needs it so I wouldn't be able to test it. @dsanders do you?

jmolloy added a reviewer: hgreving. Mar 5 2021, 9:10 AM

Harbormaster completed remote builds in B92320: Diff 328527. Mar 6 2021, 12:28 AM

Ping! Does anyone have an interest in this for their own out-of-tree target?

I didn't try to implement the fix because I don't have a target that needs it so I wouldn't be able to test it. @dsanders do you?

Sorry, I've gone to try and confirm this a couple of times in the last week and things kept coming up. I'll take another look in a moment.

In D98046#2629692, @dsanders wrote:

I didn't try to implement the fix because I don't have a target that needs it so I wouldn't be able to test it. @dsanders do you?

Sorry, I've gone to try and confirm this a couple of times in the last week and things kept coming up. I'll take another look in a moment.

We've got a few that reach 64 but AFAICT none that exceed it

LGTM

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
1125–1133	It looks like insertBits has implementations that cover both values of UseInsertBits. Would it make sense to always use insertBits here and rely on the type of tmp to pick the appropriate version of insertBits()?

This revision is now accepted and ready to land. Mar 16 2021, 12:22 PM

foad added inline comments. Mar 16 2021, 12:32 PM

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
1125–1133	Using insertBits relies on us having written the initial `tmp = 0;` line, which we prefer to avoid for the simple cases. I think that's the only reason not to use insertBits in all cases.

dsanders added inline comments. Mar 16 2021, 1:25 PM

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
1125–1133	That makes sense to me. Thanks
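To make the trade-off in this thread concrete, here is a minimal sketch of the two shapes of generated code. The integral `insertBits` overload follows the one added by this diff; `decodeSimple` and `decodeSplit` are hypothetical stand-ins for generated decoder cases, with the field extraction written out by hand instead of calling `fieldFromInstruction`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Integral overload, shaped like the helper emitted by emitInsertBits().
// It ORs into `field`, so `field` must already hold a defined value.
template <typename InsnType>
static std::enable_if_t<std::is_integral<InsnType>::value>
insertBits(InsnType &field, InsnType bits, unsigned startBit,
           unsigned numBits) {
  assert(startBit + numBits <= sizeof field * 8);
  field |= (InsnType)bits << startBit;
}

// Simple case: one field, zero initial value. A plain assignment is enough,
// so no `tmp = 0x0;` line is needed.
inline uint64_t decodeSimple(uint64_t insn) {
  uint64_t tmp;
  tmp = (insn >> 8) & 0x7f; // stands in for fieldFromInstruction(insn, 8, 7)
  return tmp;
}

// Split operand: tmp is initialised first, then each piece is inserted.
inline uint64_t decodeSplit(uint64_t insn) {
  uint64_t tmp;
  tmp = 0x0;
  insertBits(tmp, (insn >> 8) & 0xf, 7, 4);  // fieldFromInstruction(insn, 8, 4)
  insertBits(tmp, (insn >> 12) & 0xf, 3, 4); // fieldFromInstruction(insn, 12, 4)
  return tmp;
}
```

Using `insertBits` in `decodeSimple` would also work, but only after adding the extra initialisation line, which is the cost foad describes wanting to avoid.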

Closed by commit rGb8bf94df2576: [TableGen] Fix excessive compile time issue in FixedLenDecoderEmitter (authored by foad). Mar 17 2021, 2:38 AM

This revision was automatically updated to reflect the committed changes.

foad added a commit: rGb8bf94df2576: [TableGen] Fix excessive compile time issue in FixedLenDecoderEmitter.

Revision Contents

Path                                                    Size
llvm/test/TableGen/BitOffsetDecoder.td                  4 lines
llvm/test/TableGen/FixedLenDecoderEmitter/InitValue.td  4 lines
llvm/utils/TableGen/FixedLenDecoderEmitter.cpp          82 lines

Diff 328527

llvm/test/TableGen/BitOffsetDecoder.td

def baz : Instruction {
let AsmString = "baz $factor";		let AsmString = "baz $factor";
field bits<16> SoftFail = 0;		field bits<16> SoftFail = 0;
}		}

}		}

// CHECK: tmp = fieldFromInstruction(insn, 8, 7);		// CHECK: tmp = fieldFromInstruction(insn, 8, 7);
// CHECK: tmp = fieldFromInstruction(insn, 8, 8) << 3;		// CHECK: tmp = fieldFromInstruction(insn, 8, 8) << 3;
// CHECK: tmp |= fieldFromInstruction(insn, 8, 4) << 7;		// CHECK: insertBits(tmp, fieldFromInstruction(insn, 8, 4), 7, 4);
// CHECK: tmp |= fieldFromInstruction(insn, 12, 4) << 3;		// CHECK: insertBits(tmp, fieldFromInstruction(insn, 12, 4), 3, 4);
// CHECK: tmp = fieldFromInstruction(insn, 8, 8) << 4;		// CHECK: tmp = fieldFromInstruction(insn, 8, 8) << 4;

llvm/test/TableGen/FixedLenDecoderEmitter/InitValue.td

def bax : Instruction {
let factor{32} = 1; // non-zero initial value		let factor{32} = 1; // non-zero initial value
let Inst{15...8} = factor{32...25};		let Inst{15...8} = factor{32...25};
}		}

}		}

// CHECK: tmp = fieldFromInstruction(insn, 9, 7) << 1;		// CHECK: tmp = fieldFromInstruction(insn, 9, 7) << 1;
// CHECK: tmp = 0x1;		// CHECK: tmp = 0x1;
// CHECK: tmp |= fieldFromInstruction(insn, 9, 7) << 1;		// CHECK: insertBits(tmp, fieldFromInstruction(insn, 9, 7), 1, 7);
// CHECK: tmp = 0x100000000;		// CHECK: tmp = 0x100000000;
// CHECK: tmp |= fieldFromInstruction(insn, 8, 7) << 25;		// CHECK: insertBits(tmp, fieldFromInstruction(insn, 8, 7), 25, 7);
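A worked instance of the last pair of CHECK lines may help. Reading the visible `.td` lines, `factor{32}` is fixed to 1 and the seven variable bits `factor{31...25}` come from instruction bits 14 down to 8; that reading is an inference from the two `let` lines shown here. `decodeFactor` is a hypothetical name, and the extraction and insertion are written out by hand instead of calling the generated helpers.

```cpp
#include <cassert>
#include <cstdint>

// Hand-written equivalent of the checked output for `bax`:
//   tmp = 0x100000000;
//   insertBits(tmp, fieldFromInstruction(insn, 8, 7), 25, 7);
inline uint64_t decodeFactor(uint64_t insn) {
  uint64_t tmp;
  tmp = 0x100000000;                  // initial value: factor{32} = 1
  uint64_t bits = (insn >> 8) & 0x7f; // 7 bits starting at instruction bit 8
  tmp |= bits << 25;                  // place them at factor{31...25}
  return tmp;
}
```

For an instruction word of `0x5500` the extracted bits are `0x55`, so the operand value is `0x100000000 | (0x55 << 25)`, which is `0x1AA000000`. The result needs 33 bits, so this test also exercises a temporary wider than 32 bits.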

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp

emitDecoderFunction(formatted_raw_ostream &OS, DecoderSet &Decoders,
// input decoder index.		// input decoder index.
OS.indent(Indentation) << "template <typename InsnType>\n";		OS.indent(Indentation) << "template <typename InsnType>\n";
OS.indent(Indentation) << "static DecodeStatus decodeToMCInst(DecodeStatus S,"		OS.indent(Indentation) << "static DecodeStatus decodeToMCInst(DecodeStatus S,"
<< " unsigned Idx, InsnType insn, MCInst &MI,\n";		<< " unsigned Idx, InsnType insn, MCInst &MI,\n";
OS.indent(Indentation) << " uint64_t "		OS.indent(Indentation) << " uint64_t "
<< "Address, const void *Decoder, bool &DecodeComplete) {\n";		<< "Address, const void *Decoder, bool &DecodeComplete) {\n";
Indentation += 2;		Indentation += 2;
OS.indent(Indentation) << "DecodeComplete = true;\n";		OS.indent(Indentation) << "DecodeComplete = true;\n";
OS.indent(Indentation) << "InsnType tmp;\n";		// TODO: When InsnType is large, using uint64_t limits all fields to 64 bits
		// It would be better for emitBinaryParser to use a 64-bit tmp whenever
		// possible but fall back to an InsnType-sized tmp for truly large fields.
		OS.indent(Indentation) << "using TmpType = "
		"std::conditional_t<std::is_integral<InsnType>::"
		"value, InsnType, uint64_t>;\n";
		OS.indent(Indentation) << "TmpType tmp;\n";
OS.indent(Indentation) << "switch (Idx) {\n";		OS.indent(Indentation) << "switch (Idx) {\n";
OS.indent(Indentation) << "default: llvm_unreachable(\"Invalid index!\");\n";		OS.indent(Indentation) << "default: llvm_unreachable(\"Invalid index!\");\n";
unsigned Index = 0;		unsigned Index = 0;
for (const auto &Decoder : Decoders) {		for (const auto &Decoder : Decoders) {
OS.indent(Indentation) << "case " << Index++ << ":\n";		OS.indent(Indentation) << "case " << Index++ << ":\n";
OS << Decoder;		OS << Decoder;
OS.indent(Indentation+2) << "return S;\n";		OS.indent(Indentation+2) << "return S;\n";
}		}
unsigned FilterChooser::getIslands(std::vector<unsigned> &StartBits,
return Num;		return Num;
}		}

void FilterChooser::emitBinaryParser(raw_ostream &o, unsigned &Indentation,		void FilterChooser::emitBinaryParser(raw_ostream &o, unsigned &Indentation,
const OperandInfo &OpInfo,		const OperandInfo &OpInfo,
bool &OpHasCompleteDecoder) const {		bool &OpHasCompleteDecoder) const {
const std::string &Decoder = OpInfo.Decoder;		const std::string &Decoder = OpInfo.Decoder;

if (OpInfo.numFields() != 1 || OpInfo.InitValue != 0) {		bool UseInsertBits = OpInfo.numFields() != 1 || OpInfo.InitValue != 0;

		if (UseInsertBits) {
o.indent(Indentation) << "tmp = 0x";		o.indent(Indentation) << "tmp = 0x";
o.write_hex(OpInfo.InitValue);		o.write_hex(OpInfo.InitValue);
o << ";\n";		o << ";\n";
}		}

for (const EncodingField &EF : OpInfo) {		for (const EncodingField &EF : OpInfo) {
o.indent(Indentation) << "tmp ";		o.indent(Indentation);
if (OpInfo.numFields() != 1 || OpInfo.InitValue != 0) o << '|';		if (UseInsertBits)
o << "= fieldFromInstruction"		o << "insertBits(tmp, ";
<< "(insn, " << EF.Base << ", " << EF.Width << ')';		else
if (OpInfo.numFields() != 1 \|\| EF.Offset != 0)		o << "tmp = ";
		o << "fieldFromInstruction(insn, " << EF.Base << ", " << EF.Width << ')';
		if (UseInsertBits)
		o << ", " << EF.Offset << ", " << EF.Width << ')';
		else if (EF.Offset != 0)
o << " << " << EF.Offset;		o << " << " << EF.Offset;
o << ";\n";		o << ";\n";
}		}

if (Decoder != "") {		if (Decoder != "") {
OpHasCompleteDecoder = OpInfo.HasCompleteDecoder;		OpHasCompleteDecoder = OpInfo.HasCompleteDecoder;
o.indent(Indentation) << Emitter->GuardPrefix << Decoder		o.indent(Indentation) << Emitter->GuardPrefix << Decoder
<< "(MI, tmp, Address, Decoder)"		<< "(MI, tmp, Address, Decoder)"
...
// fieldFromInstruction().		// fieldFromInstruction().
// On Windows we make sure that this function is not inlined when		// On Windows we make sure that this function is not inlined when
// using the VS compiler. It has a bug which causes the function		// using the VS compiler. It has a bug which causes the function
// to be optimized out in some circustances. See llvm.org/pr38292		// to be optimized out in some circustances. See llvm.org/pr38292
static void emitFieldFromInstruction(formatted_raw_ostream &OS) {		static void emitFieldFromInstruction(formatted_raw_ostream &OS) {
OS << "// Helper functions for extracting fields from encoded instructions.\n"		OS << "// Helper functions for extracting fields from encoded instructions.\n"
<< "// InsnType must either be integral or an APInt-like object that "		<< "// InsnType must either be integral or an APInt-like object that "
"must:\n"		"must:\n"
<< "// * Have a static const max_size_in_bits equal to the number of bits "
"in the\n"
<< "// encoding.\n"
<< "// * be default-constructible and copy-constructible\n"		<< "// * be default-constructible and copy-constructible\n"
<< "// * be constructible from a uint64_t\n"		<< "// * be constructible from a uint64_t\n"
<< "// * be constructible from an APInt (this can be private)\n"		<< "// * be constructible from an APInt (this can be private)\n"
<< "// * Support getBitsSet(loBit, hiBit)\n"		<< "// * Support insertBits(bits, startBit, numBits)\n"
<< "// * be convertible to uint64_t\n"		<< "// * Support extractBitsAsZExtValue(numBits, startBit)\n"
<< "// * Support the ~, &, ==, !=, and |= operators with other objects of "		<< "// * be convertible to bool\n"
		<< "// * Support the ~, &, ==, and != operators with other objects of "
"the same type\n"		"the same type\n"
<< "// * Support shift (<<, >>) with signed and unsigned integers on the "
"RHS\n"
<< "// * Support put (<<) to raw_ostream&\n"		<< "// * Support put (<<) to raw_ostream&\n"
<< "template <typename InsnType>\n"		<< "template <typename InsnType>\n"
<< "#if defined(_MSC_VER) && !defined(__clang__)\n"		<< "#if defined(_MSC_VER) && !defined(__clang__)\n"
<< "__declspec(noinline)\n"		<< "__declspec(noinline)\n"
<< "#endif\n"		<< "#endif\n"
<< "static InsnType fieldFromInstruction(InsnType insn, unsigned "		<< "static std::enable_if_t<std::is_integral<InsnType>::value, InsnType>\n"
"startBit,\n"		<< "fieldFromInstruction(const InsnType &insn, unsigned startBit,\n"
<< " unsigned numBits, "		<< " unsigned numBits) {\n"
"std::true_type) {\n"
<< " assert(startBit + numBits <= 64 && \"Cannot support >64-bit "		<< " assert(startBit + numBits <= 64 && \"Cannot support >64-bit "
"extractions!\");\n"		"extractions!\");\n"
<< " assert(startBit + numBits <= (sizeof(InsnType) * 8) &&\n"		<< " assert(startBit + numBits <= (sizeof(InsnType) * 8) &&\n"
<< " \"Instruction field out of bounds!\");\n"		<< " \"Instruction field out of bounds!\");\n"
<< " InsnType fieldMask;\n"		<< " InsnType fieldMask;\n"
<< " if (numBits == sizeof(InsnType) * 8)\n"		<< " if (numBits == sizeof(InsnType) * 8)\n"
<< " fieldMask = (InsnType)(-1LL);\n"		<< " fieldMask = (InsnType)(-1LL);\n"
<< " else\n"		<< " else\n"
<< " fieldMask = (((InsnType)1 << numBits) - 1) << startBit;\n"		<< " fieldMask = (((InsnType)1 << numBits) - 1) << startBit;\n"
<< " return (insn & fieldMask) >> startBit;\n"		<< " return (insn & fieldMask) >> startBit;\n"
<< "}\n"		<< "}\n"
<< "\n"		<< "\n"
<< "template <typename InsnType>\n"		<< "template <typename InsnType>\n"
<< "static InsnType fieldFromInstruction(InsnType insn, unsigned "		<< "static std::enable_if_t<!std::is_integral<InsnType>::value, "
"startBit,\n"		"uint64_t>\n"
<< " unsigned numBits, "		<< "fieldFromInstruction(const InsnType &insn, unsigned startBit,\n"
"std::false_type) {\n"		<< " unsigned numBits) {\n"
<< " assert(startBit + numBits <= InsnType::max_size_in_bits && "		<< " return insn.extractBitsAsZExtValue(numBits, startBit);\n"
"\"Instruction field out of bounds!\");\n"		<< "}\n\n";
<< " InsnType fieldMask = InsnType::getBitsSet(0, numBits);\n"		}
<< " return (insn >> startBit) & fieldMask;\n"
		// emitInsertBits - Emit the templated helper function insertBits().
		static void emitInsertBits(formatted_raw_ostream &OS) {
		OS << "// Helper function for inserting bits extracted from an encoded "
		"instruction into\n"
		<< "// a field.\n"
		<< "template <typename InsnType>\n"
		<< "static std::enable_if_t<std::is_integral<InsnType>::value>\n"
		<< "insertBits(InsnType &field, InsnType bits, unsigned startBit, "
		"unsigned numBits) {\n"
		<< " assert(startBit + numBits <= sizeof field * 8);\n"
		<< " field |= (InsnType)bits << startBit;\n"
<< "}\n"		<< "}\n"
<< "\n"		<< "\n"
<< "template <typename InsnType>\n"		<< "template <typename InsnType>\n"
<< "static InsnType fieldFromInstruction(InsnType insn, unsigned "		<< "static std::enable_if_t<!std::is_integral<InsnType>::value>\n"
"startBit,\n"		<< "insertBits(InsnType &field, uint64_t bits, unsigned startBit, "
<< " unsigned numBits) {\n"		"unsigned numBits) {\n"
<< " return fieldFromInstruction(insn, startBit, numBits, "		<< " field.insertBits(bits, startBit, numBits);\n"
"std::is_integral<InsnType>());\n"
<< "}\n\n";		<< "}\n\n";
}		}

// emitDecodeInstruction - Emit the templated helper function		// emitDecodeInstruction - Emit the templated helper function
// decodeInstruction().		// decodeInstruction().
static void emitDecodeInstruction(formatted_raw_ostream &OS) {		static void emitDecodeInstruction(formatted_raw_ostream &OS) {
OS << "template <typename InsnType>\n"		OS << "template <typename InsnType>\n"
<< "static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], "		<< "static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], "
void FixedLenDecoderEmitter::run(raw_ostream &o) {
OS << "#include \"llvm/Support/Debug.h\"\n";		OS << "#include \"llvm/Support/Debug.h\"\n";
OS << "#include \"llvm/Support/LEB128.h\"\n";		OS << "#include \"llvm/Support/LEB128.h\"\n";
OS << "#include \"llvm/Support/raw_ostream.h\"\n";		OS << "#include \"llvm/Support/raw_ostream.h\"\n";
OS << "#include <assert.h>\n";		OS << "#include <assert.h>\n";
OS << '\n';		OS << '\n';
OS << "namespace llvm {\n\n";		OS << "namespace llvm {\n\n";

emitFieldFromInstruction(OS);		emitFieldFromInstruction(OS);
		emitInsertBits(OS);

Target.reverseBitsForLittleEndianEncoding();		Target.reverseBitsForLittleEndianEncoding();

// Parameterize the decoders based on namespace and instruction width.		// Parameterize the decoders based on namespace and instruction width.
std::set<StringRef> HwModeNames;		std::set<StringRef> HwModeNames;
const auto &NumberedInstructions = Target.getInstructionsByEnumValue();		const auto &NumberedInstructions = Target.getInstructionsByEnumValue();
NumberedEncodings.reserve(NumberedInstructions.size());		NumberedEncodings.reserve(NumberedInstructions.size());
DenseMap<Record *, unsigned> IndexOfInstruction;		DenseMap<Record *, unsigned> IndexOfInstruction;