This is an archive of the discontinued LLVM Phabricator instance.

Differential D43962

[GlobalISel][utils] Adding the init version of Instruction Select Testgen
Abandoned, Public

Authored by rtereshin on Mar 1 2018, 12:37 PM.

Details

Reviewers

qcolombet
ab
dsanders
aditya_nandakumar
bogner
volkan
t.p.northover
rovka
javed.absar
aemerson

Commits

rL330988: [GlobalISel] Reporting rules covered as part of the InstructionSelect's debug…
rL331398: [GlobalISel][InstructionSelect] Making Coverage Info generation optional on per…
rL331396: [GlobalISel][InstructionSelect] Refactoring buildMatchTable out, NFC
rL331395: [GlobalISel][InstructionSelect] Refactoring out a getMatchTable virtual method…

Summary

This is the first version of the testgen - a tool, currently implemented as an
llc MIR-pass, that generates regression lit-tests for GlobalISel's Instruction
Selector. The generation is done on a rule-by-rule basis and currently covers
selection rules automatically imported by TableGen from SelectionDAGISel.

This is a prerequisite for the tests generated for

AArch64: https://reviews.llvm.org/D43976
ARM: https://reviews.llvm.org/D43979, updated in https://reviews.llvm.org/D43982 (this also demonstrates how the tests could be updated when GlobalISel changes)
X86: https://reviews.llvm.org/D43994 (this also demonstrates the usefulness of ABI speculation done by Testgen)

What this tool is and isn't:

This is not a fuzzer for the Instruction Selector: it isn't trying to come up with malicious input to break it, nor is it trying to discover bugs in it.

This is a regression testing tool. Its main goal is to capture the current state of GlobalISel's InstructionSelect pass, providing the best test coverage for it with small and highly targeted tests that pass, and to catch any later regressions due to changes in:
1. *.td-definitions of the instructions and selection patterns;
2. GlobalISel's emitter (the TableGen backend), including the ones that intend to change rules' priorities and the ones that don't;
3. manually written parts of the Instruction Selector.

Potentially this is also an analysis tool that may make it easier to see and control the actual effects of changes like those listed above on the Selector, detect dead rules, etc.

It may be extended in the future to generate tests for other passes of the GlobalISel's pipeline, and / or have a fuzzer mode of operation, but currently these aren't the goals.

It could also be turned into a benchmark generator for the Instruction Selector relatively easily. That way we could benchmark the Selector on large inputs created in memory, avoiding the huge (~90%) overhead of parsing MIR off disk. The input could contain the supported patterns with any desired probabilities and nothing that is not selectable, thus giving more stable, targeted, and precise performance data.

Potential user stories:

New backend development.
Porting an existing backend from SelectionDAG ISel to GlobalISel.

While the first one is promising, it appears that the second one is more
prominent right now and therefore the main target of this tool.

Design goals:

As we mostly care about providing regression testing of the InstructionSelect pass of GlobalISel's implementations early in development for pre-existing targets, we cannot rely on any other GlobalISel passes being well-developed and fully functional. In particular, we expect the InstructionSelect pass to be well ahead of everything else due to the semi-automatic porting mechanism.

See https://reviews.llvm.org/rL326396 as an example of breaking ties with the Legalizer, selectUnconstrainedRegBanks of this patch as an example of the same w.r.t. RegBankSelect.

We want the testgen to be relatively robust and able to gracefully handle non-functional changes, for instance, changes in the typical order of the MatchTable opcodes for rules, or even the presence of specific opcodes, like the number-of-operands check, or changes in the concrete serialization format for MIR.

We want the testgen to be as target-independent and generic as possible and to impose as little maintenance burden on backend writers as possible.

If it doesn't jeopardize other goals and isn't too difficult to do, we want the testgen to generate natural-looking tests, close to what a human would write.

Design decisions made:

Current implementation of testgen uses TableGen'erated MatchTable's to generate the tests. We could've branched off input data-wise earlier, but that would mean re-implementing too much of the GlobalISel's emitter.

We're using only the matching parts of the MatchTable to generate MIR and relying on the selector itself to generate FileCheck's for the expected output for a few major reasons:
1. it simplifies the implementation;
2. it reduces the number of tests failing at the time of their generation due to the MIR being selected by a rule other than the one intended, which is desirable as we aren't fuzzing the selector, but trying to generate passing tests.

Usage:

llc -mtriple aarch64-- -run-pass instruction-select-testgen -simplify-mir input.mir -o output.mir

will add a number of Machine Functions, one per imported *.td-defined
selection rule, into input.mir and write the result as output.mir.

Command line options:

-testgen-from-rule=N -testgen-until-rule=M - generate tests for a subrange of rules only;

-testgen-exclude-rules=N{,N} - skip specific rules;

-testgen-include-only=N{,N} - generate tests for explicitly listed rules only;

-testgen-set-all-features - speculatively satisfy all target / module / and function features requirements to cover feature-specific rules;

-testgen-no-abi - don't speculate on ABI boundaries trying to make the test look natural and test COPY's selection, but just IMPLICIT_DEF undefined vregs instead.
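
How the rule-selection options above might combine can be sketched as follows. This is only an illustration, not the patch's code: `RuleFilter` and `shouldGenerateTest` are hypothetical names, and the sketch assumes that the from/until range is half-open (`until` itself is excluded), which the summary does not specify.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <set>

// Hypothetical container for the parsed option values.
struct RuleFilter {
  uint64_t FromRule = 0;                                     // -testgen-from-rule=N
  uint64_t UntilRule = std::numeric_limits<uint64_t>::max(); // -testgen-until-rule=M
  std::set<uint64_t> Excluded;                               // -testgen-exclude-rules=N{,N}
  std::set<uint64_t> IncludeOnly;                            // -testgen-include-only=N{,N}
};

// Returns true if a test should be generated for the rule with this ID.
// Assumption: the range is [FromRule, UntilRule), and an empty IncludeOnly
// list means "no explicit restriction".
bool shouldGenerateTest(const RuleFilter &F, uint64_t RuleID) {
  if (RuleID < F.FromRule || RuleID >= F.UntilRule)
    return false;
  if (F.Excluded.count(RuleID))
    return false;
  if (!F.IncludeOnly.empty() && !F.IncludeOnly.count(RuleID))
    return false;
  return true;
}
```

In practice the exclude list would hold the rules that crash or assert the selector, as described under "Fail to Select" in the coverage section.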

Note:
-testgen-no-abi=false tried to emit real RET opcodes at some point by using
CallLowering::lowerReturn and deriving IR Types from LLTs, but it proved
to be unreliable for most targets and created an extra dependency.

This patch also provides utils/update_instruction_select_testgen_tests.sh tool
that would generate a couple of lit-tests:

usage: ./utils/update_instruction_select_testgen_tests.sh <testgen'd file> <llc binary> <target triple> [extra llc args]

for instance, executing

../../utils/update_instruction_select_testgen_tests.sh ../../test/CodeGen/AArch64/GlobalISel/arm64-instruction-select-testgen-testgend.mir ./bin/llc aarch64--

from a build/obj directory would create 2 files:

../../test/CodeGen/AArch64/GlobalISel/arm64-instruction-select-testgen-testgend.mir
and
../../test/CodeGen/AArch64/GlobalISel/arm64-instruction-select-testgen-selected.mir

testing that the testgen outputs the same MIR and the selector selects that MIR
the same way respectively.

Coverage:

Target  | Rules    | Fail to | Tests     | Selected by the
        | Imported | Select  | Generated | Rule Intended
--------+----------+---------+-----------+----------------
AArch64 |  1654    |  0.0%   |  1449     |  85%
ARM     |  1055    |  0.2%   |   991     |  78%
x86     |   887    | 13.8%   |   765     |  68%

"Fail to Select" stands for "a generated test crashed / asserted the selector",
this is something to -testgen-exclude-rules in practice. The major reason
for this right now is a limited support of COPY_TO_REGCLASS in *.td-defined
patterns by the GlobalISel importing mechanism.

"Selected by the Rule Intended" basically means the target coverage provided by
the tool. A test could be selected by a rule different from the rule that was
used to generate it for a variety of reasons, approximately in order from most
prominent ones to the rarest ones:

The test generated isn't specific enough due to:
1. lack of support of complex patterns by the testgen;
2. too basic support of immediate predicates by the testgen;
3. rules genuinely intersecting with each other and local approach of the testgen not considering rules partially hiding each other.

A rule is genuinely dead and
1. it was rendered dead by GlobalISel;
2. it was dead in SelectionDAG ISel to begin with;
3. it is rendered dead by manually written parts of the selector executing before trying TableGen'erated selectImpl.

Known deficiencies, approximately from the most important to fix soon to the least important:

Testgen currently cannot be easily extended by a target to support complex patterns; such support should greatly improve coverage.

Testgen's way of dealing with features is very sketchy at the moment and needs to be improved.

Testgen should probably be a binary tool separate from llc.

A couple of dependencies for this patch as well as the tests generated
are coming soon in separate patches.

See also test/CodeGen/AArch64/GlobalISel/select-with-no-legality-check.mir
currently committed for an example output of the testgen.

Diff Detail

Repository: rL LLVM

Event Timeline

rtereshin created this revision. Mar 1 2018, 12:37 PM

Herald added subscribers: mgrang, kristof.beyls, mgorny. Mar 1 2018, 12:37 PM

rtereshin edited the summary of this revision. Mar 1 2018, 12:41 PM

rtereshin edited the summary of this revision. Mar 1 2018, 1:27 PM

rtereshin edited the summary of this revision. Mar 1 2018, 1:43 PM

rtereshin edited the summary of this revision. Mar 1 2018, 3:31 PM

rtereshin edited the summary of this revision. Mar 1 2018, 4:35 PM

Fixing a little non-determinism in picking live-in phys regs used to define input vregs.

rtereshin updated this revision to Diff 136675. Mar 1 2018, 9:07 PM

fixing little formatting issues, NFC

Performance

or why all the tests are in a single file?

In short, with test cases in separate test files it takes at least ~30 times longer to update the tests (or at least 2 minutes, just for AArch64 and only for the rules currently imported by the GlobalISel emitter), and at least 15 times longer to run the tests (or at least 40 seconds, under the same restrictions), on a 4-core SSD-only iMac running macOS, depending on the build type. In some cases, the difference reaches 130x / 5 minutes.

With all the tests in a single file it takes about 2 seconds to update them for AArch64 at the moment, and about 1.33 seconds to run them on the same machine as described above (both tests ran in parallel, where the test testing the Testgen itself takes ~1/3 of a second, and the test testing the Instruction Selector takes the full 1.33 seconds).

rtereshin edited the summary of this revision. Mar 5 2018, 9:47 AM

Rebased against master.
GIM_RecordInsn handler is made more tolerant to alternative orderings of GIM_RecordInsn's with respect to other opcodes while matching multi-instruction patterns.
A use-after-free bug is fixed in emitReturnIntoTestCase caught by ASAN.

Hi Roman,

This is a very large patch with lots of complexity, and is hard to review. While your description outlines the general idea of the tool, there's little documentation in the code of how this actually works under the hood. Can you add more descriptions so someone can follow what's happening when it comes to maintenance later, e.g. what's LiveInRA supposed to do, what are the pre-conditions and expected outputs of each phase of this tool?

Without that I suspect this patch will continue to lie in the review queue for a long while.

Thanks,
Amara

Hi Amara,

Thanks for taking a look at this! Will do.

Bringing perf-data up as the comment is hidden due to the diff update: https://reviews.llvm.org/D43962#1026201

Aside from being a large patch with few function-level comments, some of the non-functional changes also make this difficult to review. It would be helpful to move things like the indentation correction on testImm*(), the introduction of buildTable and getMatchTable(), the changes to coverage, moving the emission of selectImpl() down, etc. into a separate patch(es).

I think the overall approach of parsing the match table, emitting something that matches, and constructing some scaffolding around it is a good plan. We'll need to find a good way of handling immediates, C++ predicates, ComplexPattern and similar but that should definitely be left for later patches.

Some targets have a rather large number of rules (e.g. X86 is around 17k IIRC). Do we have a mechanism for keeping the number of generated tests to a reasonable level?

include/llvm/CodeGen/GlobalISel/InstructionSelector.h
225	Is this comment accurate? GIR_AddRegister is listed as having 2 operands and I only see 2 in my build
include/llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h
27–29	This one probably isn't harmful since InstructionSelectorTestgen.h isn't going to be widely included, but we ought to avoid anonymous namespaces in headers since each compilation unit that includes the header will get its own version of the declaration. It's probably best to put it in the llvm namespace if we can't push it into the cpp entirely. InstructionSelectorTestgen doesn't seem to have any state, so it looks like generateTestCase() could be a static function outside the class.
33–38	Are all of these really needed in the header? Most seem to only be used from InstructionSelectorTestgen.cpp
include/llvm/Support/CodeGenCoverage.h
26 ↗	(On Diff #139188)	Just a small nit: we should probably have 'const' somewhere in the 'covered_iterator' name.
lib/CodeGen/GlobalISel/InstructionSelectTestgen.cpp
46–48	Could you add a comment indicating that verification is the only thing we do with these functions and why?
lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp
59	If the function already exists somehow then this might not return an empty test case due to the getOrInsertFunction() we should probably fail in that case
103	LLT's scalars aren't always integers. So long as we inject bitcasts this will be fine though
124	If I understand correctly, this is a register allocator that is used to generate the live-ins for the test function. This needs some explaining in the comments and possibly also renaming (I don't have a good spelling for it, maybe InputRegAllocator)
274–309	This lambda is getting pretty big. I'd be inclined to make it a static function in its own right
283	I think a word is missing from "of the required by"
307	greadily -> greedily
344–346	As a general thing: Writes to dbgs() should generally be wrapped in DEBUG(...) or DEBUG_ONLY(...) so that we don't format strings we're not going to emit.
354	I'm not sure IMPLICIT_DEF is the right thing to use for these if-statements. IMPLICIT_DEF is a definition with an unknown value (much like UNDEF), so it wouldn't be wrong to propagate it in something like `%0 = IMPLICIT_DEF; %2 = G_ADD %0, %1` to `%2 = IMPLICIT_DEF`. A COPY of a live-in phys-reg would be safer, but it's probably ok since constant propagation isn't ISel's job. I think this would only become an issue if we started porting this to combiners.
367	This table is going to be quite fragile. We should put this somewhere near the GIM_/GIR_ declarations or at least cross-reference them in the comments. Tablegen-erating them might be sensible if we get additional metadata.
422	If we have a rule with variadic instructions, what do we do about the number of defs?
469	Is this ever false? ensureNumOperands() looks like it adds operands until this is true
485	What is this for?
486–489	This comment explains the reasons behind something but doesn't really explain what that something is
498	I don't think I understand this variable. What does it represent?
503	This should probably indicate what is being skipped
504	In some ways it would be nice to support this (e.g. to check the tests are the same) but I agree it's way too big a task for a first patch.
505–511	Doesn't NDEBUG also disable the dbgs() stream? That assert can't succeed (NestingLevel > 1 vs NestingLevel == 1). It looks like this ought to be report_fatal_error() or similar
597	I don't think I understand the Def.getReg() part of this. Both the Use and the Def must be the same register so I'd expect to always use one or the other for all cases.
681–682	Function-level features can be handled by listing them in the function attributes: `attributes #0 = { "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" }`. The tricky bit will be mapping the enum back to the feature name. Module-level features are harder since they're only read once per module. You'd have to separate them into multiple files.
715	There's no guarantee that 1 will match. There's a few predicates that check they're multiples of N
761–784	One thing to mention here is that if something changes the register number of OtherInsnB.getOperand(OtherOpIdx) after this opcode is processed then these might diverge. I think we're ok on that since setReg() doesn't seem to be called from latePass()
796	As with the other immediate case, '1' might not match all predicates
849	I was confused by this function name at first. It sounded like the regbanks were unconstrained and that didn't make sense. I see it's about assigning regbanks to vregs that don't already check one. I haven't thought of a good spelling ('assignRegBanksToUnconstrainedVregs()' or 'ensureRegistersHaveBanks()' are a couple ideas) but we may be able to find a clearer name.
884	We should have a couple examples of how duplicates can happen (e.g. i32 and f32 both mapping to s32)
utils/TableGen/GlobalISelEmitter.cpp
4069–4073	It seems that this block has been moved down from the other side of the sort. Non-functional changes like this ought to be in separate patches
4091	This definition should probably be wrapped in a #ifdef
utils/update_instruction_select_testgen_tests.sh
40–43	The 'perl' commands might be an issue for the windows bots.

rtereshin marked 16 inline comments as done. Apr 25 2018, 5:25 PM

rtereshin added inline comments.

include/llvm/CodeGen/GlobalISel/InstructionSelector.h
225	The comments in this section describe an opcode below the comments, not above. So this is all about `GIR_AddTempRegister` and it appears to have 3 operands (https://github.com/llvm-mirror/llvm/blob/master/include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h#L598-L601)
include/llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h
27–29	That's a good insight, thanks, I didn't think of that. `generateTestCase` calls virtual member-functions and may do so more in the future, and I'd rather keep it that way and keep it a class member. I'm moving the helper class (`LiveInRA` or however we would (re)name it) into `InstructionSelectorTestgen` as a member class instead; I hope that solves the problem well enough.
33–38	Only `TestgenSetAllFeatures` is required to be here as it's used by the instruction selector itself. I'm not exactly sure why not to list all the options in a header, in a sense it's part of its interface, CLI in this case, but sure, I'll remove all of those that don't have to be here. After all, nothing would force this list to be complete, so maybe it's better if it's explicitly incomplete, and also this is rarely (if ever) done across other parts of LLVM.
include/llvm/Support/CodeGenCoverage.h
26 ↗	(On Diff #139188)	Sure, good catch! I'm renaming it to `const_covered_iterator`
lib/CodeGen/GlobalISel/InstructionSelectTestgen.cpp
46–48	Sure, adding the following comment:
// Generally it is possible to run the Testgen over a non-empty module,
// practically probably a lit-test, and have it add additional test-cases
// to it. The easiest way to limit the number of verification failures on
// the final output caused by initially present machine functions rather
// than the ones added, is to run the verification on those already
// present functions first.
lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp
59	True. Let's just `MachineFunction::reset()` it for now if such a thing happens; that would make `llc` update the existing test cases if run over an already generated test-module. It's a sensible thing to do, I think. Generally, if not reset, it would actually just add another machine basic block and repeat the instruction sequence derived from the selection rule, so we'll end up with a single machine-function test-case with multiple identical basic blocks testing the same pattern. W/ no bug fixes it will screw up the ABI lowering at the moment though, so it will only work for patterns w/ no input vregs, such as having only constants for operands. And that means it will generate a small test-module; it won't fail or crash, but rather just filter out the broken machine functions using machine verifier, as it always does. Or tries to, at least. Technically, I'm still considering turning this into a sort of a benchmark-gen, not sure at the moment if that could be actually useful though. Theoretically, we could have a tool that will be able to generate long def-use chains of instruction sequences corresponding to a particular pattern or a subset of patterns in a type-driven manner and use them to benchmark the selector as well as other parts of the pipeline.
103	They aren't, but it appears to me that this is the best we could do w/o going to great lengths, as the information about the actual type is pretty much lost at match table level. At some point, I tried to lower ABI, specifically to insert an appropriate return sequence, by re-using `CallLowering::lowerReturn` provided by the target. The method expects an LLVM `Value`, however, the targets only analyze (or were at the moment) the `Value`'s type, so it seemed sufficient to provide it an IR constant of an appropriate IR type. It didn't work out for a number of reasons, most notably not all the targets could handle all the types, especially vector types. So I started to use the `PATCHABLE_RET` instead for this purpose. So currently `deduceIRType` is only used to build appropriate machine memory operands to make `GIM_CheckAtomicOrdering` happy, specifically, figure out a reasonable size and alignment. As the size is directly inferable from LLT, this is mostly to figure out the alignment.
124	Your understanding is correct. I'm adding the following comment:
+/// A helper providing sensible phys regs to define patterns' input vregs.
+///
+/// It appears that the most portable and robust way to define all the input
+/// vregs (used, but not defined) of an instruction sequence being generated is
+/// to define them as COPY's from appropriate and preferably distinct physical
+/// registers, live-in to the basic block containing the instruction sequence
+/// and preferably the entire function as well. The best pick is to have the
+/// same size in bits as the vreg, same register bank, to be allocatable, and
+/// from the beginning of the allocation order. The class also handles the
+/// following issues:
+///
+/// 1) RegisterBankInfo::getRegBankFromRegClass not being defined for
+/// tablegen'erated reg classes.
+/// 2) Register classes containing the same physical register and yet having
+/// different sizes, and therefore physical registers having only weakly
+/// defined size as the maximum of the sizes of all register classes they
+/// belong to.
+/// 3) Register banks not having a full list of register sizes available
+/// directly.
+///
+/// The latter is also used for picking sensible register banks for internal
+/// (defined and used both by the instruction sequence being generated) vregs.
and renaming the class from `LiveInRA` to `InputRegAllocator` (from `(anonymous namespace)::LiveInRA` to `llvm::InstructionSelectorTestgen::InputRegAllocator` to be exact).
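
The selection policy that comment describes (same bank, same size, allocatable, earliest in the allocation order, preferably distinct) might look roughly like the sketch below. It uses no LLVM types: `ToyPhysReg`, `ToyInputRegAllocator`, and the register names in the usage are made up for illustration, and the real `InputRegAllocator` additionally has to derive sizes and banks from register classes.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flat description of a physical register.
struct ToyPhysReg {
  std::string Name;
  std::string Bank;
  unsigned SizeInBits;
  bool Allocatable;
};

class ToyInputRegAllocator {
  std::vector<ToyPhysReg> Order; // registers in allocation order
  std::set<std::string> Used;    // registers already handed out

public:
  explicit ToyInputRegAllocator(std::vector<ToyPhysReg> AllocationOrder)
      : Order(std::move(AllocationOrder)) {}

  // Picks the first unused allocatable register of the requested bank and
  // size. An empty result means "nothing suitable", in which case a caller
  // could fall back to defining the vreg with IMPLICIT_DEF.
  std::string pick(const std::string &Bank, unsigned SizeInBits) {
    for (const ToyPhysReg &R : Order) {
      if (!R.Allocatable || R.Bank != Bank || R.SizeInBits != SizeInBits)
        continue;
      if (Used.insert(R.Name).second)
        return R.Name;
    }
    return std::string();
  }
};
```

Each successful `pick` would correspond to one live-in physical register and one COPY into an input vreg of the generated test.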
274–309	Agreed, doing that. I'm also changing `Size2RegBanksTy` from `DenseMap` to `std::map`, that gets rid of the extracting keys (`Sizes`) and sorting them every time, makes the implementation a little simpler and reduces the number of arguments (former closure captured variables) of the static function being extracted.
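
Why an ordered map removes the need to extract and sort the keys can be seen in a small sketch. The alias name `Size2RegBanksTy` comes from the comment above, but the value type (a list of bank names) and the helper `findBanksForSize` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Assumed shape: register size in bits -> names of banks that have
// allocatable registers of that size. std::map keeps the keys sorted.
using Size2RegBanksTy = std::map<unsigned, std::vector<std::string>>;

// Returns the banks registered under the smallest size that is greater than
// or equal to SizeInBits, or nullptr if no such size exists.
const std::vector<std::string> *findBanksForSize(const Size2RegBanksTy &Map,
                                                 unsigned SizeInBits) {
  auto It = Map.lower_bound(SizeInBits);
  return It == Map.end() ? nullptr : &It->second;
}
```

With a hash map such as `DenseMap`, the same "this size or larger" lookup would first need the keys collected and sorted, which is the step the comment says is being removed.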
283	An example sentence would be "Didn't find a register bank containing allocatable registers of the required by LLT <2 x s16> size of 32 bits or larger for an unconstrained vreg %1". Breaking it down as follows: "Didn't find a register bank containing allocatable registers of a compatible size; vreg: %1(<2 x s16>), size: 32 bits (or larger)."
307	Oops, good catch, thanks!
354	"COPY of a live-in phys-reg would be safer": it is, and this is why I'm using `IMPLICIT_DEF` only if I couldn't find a phys-reg with the appropriate size and within the required register bank (or if this behavior is explicitly requested by a command line option).
367	Agreed. I'm making this less fragile by doing the following:
1. specifying a list of pairs implicitly defining a mapping from an opcode to its number of operands instead, so there is no need to maintain the records in a specific order matching the order of the opcode definitions;
2. using the opcodes directly, not just as a comment, thus making sure there are no non-existent opcodes mentioned in the list;
3. adding assertions that make sure that there are no opcodes missing from the mapping;
4. putting the definition right next to the opcode definitions.
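
A minimal sketch of that approach, using a made-up opcode set rather than the real GIM_/GIR_ enum: the `TOY_*` opcodes, their operand counts, and the helper names are all hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical opcodes standing in for the match table opcodes.
enum ToyOpcode : unsigned {
  TOY_Try,
  TOY_CheckOpcode,
  TOY_Coverage,
  TOY_Done,
  TOY_NumOpcodes
};

// Order-independent list of (opcode, number of operands) pairs. Using the
// enumerators directly means a misspelled opcode fails to compile.
static const std::pair<ToyOpcode, int> ToyOperandCounts[] = {
    {TOY_Done, 0}, {TOY_Try, 1}, {TOY_Coverage, 1}, {TOY_CheckOpcode, 2}};

// Builds a dense lookup table and asserts that every opcode is listed
// exactly once.
std::array<int, TOY_NumOpcodes> buildNumOperandsTable() {
  std::array<int, TOY_NumOpcodes> Table;
  Table.fill(-1);
  for (const auto &Entry : ToyOperandCounts) {
    assert(Table[Entry.first] == -1 && "opcode listed twice");
    Table[Entry.first] = Entry.second;
  }
  for (int N : Table) {
    (void)N;
    assert(N >= 0 && "opcode missing from the mapping");
  }
  return Table;
}

// Index of the opcode following the one at Idx: skip the opcode itself and
// all of its operands.
std::size_t nextToyOpcodeIdx(const std::array<int, TOY_NumOpcodes> &Counts,
                             const long long *MatchTable, std::size_t Idx) {
  return Idx + 1 + static_cast<std::size_t>(Counts[MatchTable[Idx]]);
}
```

Adding a new opcode to the enum without extending the pair list would then trip the "missing from the mapping" assertion in assert builds instead of silently misparsing the table.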
469	Of course, the number of operands could be greater than `OpIdx`, `ensureNumOperands` doesn't add or remove operands in that case.
485	Often I need to create a generic virtual register w/o knowing which type is expected from it yet, and it's not exactly possible to create a virtual register w/o a type. This constant exists to be consistent with the type I use as a default / initial option.
utils/TableGen/GlobalISelEmitter.cpp
4069–4073	I'm extracting a number of patches from this one to break it down a little.

Herald added a reviewer: javed.absar. Apr 25 2018, 5:25 PM

Hi Daniel @dsanders,

Thank you for looking into this and the detailed review.

"It would be helpful to move things like the indentation correction on testImm*(), the introduction of buildTable and getMatchTable(), the changes to coverage, moving the emission of selectImpl() down, etc. into a separate patch(es)."

I believe I have this done, please take a look at the extracted patches:
https://reviews.llvm.org/D46095
https://reviews.llvm.org/D46096
https://reviews.llvm.org/D46097
https://reviews.llvm.org/D46098

I'm also half-through the inline comments.

rtereshin marked 31 inline comments as done. Apr 26 2018, 6:47 PM

rtereshin added inline comments.

lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp
344–346	Done. Regarding "so that we don't format strings we're not going to emit": w/o the DEBUG macro we actually emit all of this in assert and release builds alike.
422	That's a very good question, thanks! I suppose, I will have to replace `NumDefs = InsnB->getDesc().NumDefs;` line below with `NumDefs = std::max(InsnB->getDesc().NumDefs, InsnB->getNumExplicitDefs());` as soon as we get `MachineInstr::getNumExplicitDefs()` merged in ;-) (https://reviews.llvm.org/D45640). It will help, but won't solve the problem. It's not a problem for now, though, as `InstructionSelector` can't really handle those either. For instance, record instruction opcode clearly assumes that the definition is always the operand 0. When it does support the case, however, most likely it won't be doing that by checking how many definitions an instruction has, it will most likely just rely on MIR being valid. That naturally assumes machine verifier can check this stuff. So I guess at some point machine verifier will special case instructions like `G_UNMERGE_VALUES`. We can implement that check in the machine verifier as `unsigned getNumExplicitDefsExpected(const MachineInstr &I)` refactored out and then checking if the actual `I` has the number of defs expected. And then we can reuse `getNumExplicitDefsExpected` right here in Testgen. If the generic opcode has a very flexible number of defs, as in, it's not derivable from the number of operands and their types (do we even have these?), it will probably be still all right, we just see the highest operand index that the pattern explicitly requires to be a definition (via record instruction opcodes, for instance), and we say that that operand is the last def (and of course every operand with a lower index is also a def). And that should produce a) valid MIR, we started by saying this mysterious opcode is very flexible with defs b) MIR that could be matched by the pattern, and it's all we care about. 
Not to mention, if we end up having generic opcodes like this - with non-derivable number of defs and the number of defs having an impact on the instructions' semantics - we will end up having a match table opcode checking the number of defs explicitly. But again, it's not a problem for now.
486–489	I'm replacing the comment with the following:
// Raw representation of a single match table rule as an ordered union of
// several continuous regions of the match table. The representation tries its
// best to ignore parts of the table that don't affect semantics of a single
// rule in isolation, like labels and rule IDs.
//
// Assuming that all the parts of the MatchTable that don't affect the
// selection process but only identify a rule, like GIR_Coverage opcodes, come
// within a rule as a single continuous block, the meaningful parts of the
// rule could be represented as some prefix (starting from the first
// non-control flow opcode, in other words, skipping GIM_Try and its
// label-operand) and suffix:
using RuleBodyTy = std::pair<ArrayRef<int64_t>, ArrayRef<int64_t>>;
498	I'm renaming the variable from `CoverageBlockPassed` to `ExcludedRegionPassed` and adding a bunch of comments that should make it clearer, like this:
// Get a rule descriptor, containing the index of its GIR_Done opcode, RuleID,
// and a raw representation of the entire body.
//
// \pre MatchTable is a non-optimized linear match table.
// \pre CurrentIdx points to the first (and only) GIM_Try opcode of the rule
// that has all its semantically meaningless opcodes that are to be excluded
// from the body as a single continuous subregion somewhere.
// \post the prefix of the body is a range from the first opcode after the
// initial GIM_Try until GIR_Coverage (or, more generally, the first
// semantically meaningless opcode), the suffix is a range from the first
// opcode after the last semantically meaningless opcode until GIR_Done
// (exclusive).
static std::tuple<uint64_t, int64_t, RuleBodyTy>
getDoneIdxRuleIDAndBody(const int64_t *MatchTable, uint64_t CurrentIdx) {
  // skipping GIM_Try
  const uint64_t BodyIdx = nextGIOpcodeIdx(MatchTable, CurrentIdx);
  uint64_t ExcludedFirst = BodyIdx;
  uint64_t ExcludedLast = ExcludedFirst;
  // RuleID we discovered so far. Or the first one if we have many O_o
  int64_t FirstRuleID = -1;
  // Have we already iterated over that continuous region of unimportant
  // opcodes we are going to exclude from the body?
  bool ExcludedRegionPassed = false;
  unsigned NestingLevel = 0;
  do {
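
The prefix / suffix split stated in the pre- and post-conditions above can be illustrated on a toy table. This is not a completion of the truncated function: the `T_*` opcodes, the assumption that every opcode except `T_Done` takes exactly one operand, and the `splitRule` helper are all invented for the example, and nested try-blocks are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy opcodes. T_Coverage plays the role of the "semantically meaningless"
// opcode that only identifies the rule.
enum : int64_t { T_Try, T_Check, T_Coverage, T_Build, T_Done };

// Assumption for this toy table: T_Done has no operands, the rest have one.
static std::size_t numOperands(int64_t Opcode) {
  return Opcode == T_Done ? 0 : 1;
}

struct Range { std::size_t Begin, End; }; // half-open [Begin, End)
struct RuleBody { Range Prefix, Suffix; int64_t RuleID; };

// Splits the rule starting at TryIdx into the part before the excluded
// region and the part after it, up to (not including) T_Done.
RuleBody splitRule(const std::vector<int64_t> &Table, std::size_t TryIdx) {
  assert(Table[TryIdx] == T_Try && "must point at the rule's try opcode");
  std::size_t Idx = TryIdx + 1 + numOperands(T_Try); // skip opcode and label
  RuleBody Body{{Idx, Idx}, {Idx, Idx}, -1};
  bool ExcludedRegionSeen = false;
  while (Table[Idx] != T_Done) {
    const std::size_t Next = Idx + 1 + numOperands(Table[Idx]);
    if (Table[Idx] == T_Coverage) {
      if (!ExcludedRegionSeen) {
        Body.Prefix.End = Idx;        // prefix stops at the first excluded opcode
        Body.RuleID = Table[Idx + 1]; // remember the first rule ID seen
        ExcludedRegionSeen = true;
      }
      Body.Suffix.Begin = Next;       // suffix starts after the last excluded opcode
    }
    Idx = Next;
  }
  if (!ExcludedRegionSeen) {
    Body.Prefix.End = Idx;            // nothing excluded: everything is prefix
    Body.Suffix.Begin = Idx;
  }
  Body.Suffix.End = Idx;              // Idx now points at T_Done
  return Body;
}
```

Representing the body as two ranges rather than one is what lets the identifying opcodes move around or change without affecting the semantic content the test generator consumes.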
503	I'm adding the following comment: // Skipping the OnFail label operand
504	It will be hard to make sure that the tests are the same. For instance, let's suppose that non-optimized table does some meaningless checks, for instance, checks register banks on internal vregs (the registers defined and used inside the pattern and not being the pattern's overall inputs or outputs), while optimized table doesn't. Semantically they are the same, but in the case of the latter Testgen will have to guess more regbanks, and it might guess it differently (from what is explicitly checked by the non-optimized table). It makes no difference in the selected code, and the selected test will pass, however, the testgend test (and that one only tests the Testgen itself, not the selector) will be technically different. Also, if optimization reorders opcodes, the Testgen might easily end up with different virtual register names, while keeping the actual def-use chain the same, or schedule the def-use chain differently. Not to mention, it will noticeably increase the maintenance burden, and I think it's best to keep that at a minimum.
505–511	> Doesn't NDEBUG also disable the dbgs() stream?
No, the only difference is that `NDEBUG` resolves `dbgs()` directly to `errs()` and therefore sends the output straight to `stderr`, while in assert builds there is a circular buffer in the middle (that is smart enough to flush if killed).
> That assert can't succeed (NestingLevel > 1 vs NestingLevel == 1). It looks like this ought to be report_fatal_error() or similar
True, good catch, I'm tidying this up.
597	Both the Use and the Def must be the same register so I'd expect to always use one or the other for all cases. They must be the same register by the end of this `case`, but we know little in the beginning. We know that `Def` is a register and that's about it. Either (and we don't know which) or both `Def` and `Use` could be `%noreg`s (`0`). If one of them is an actual vreg (not `%noreg`) it might have a meaningful type and/or a bank assigned, and we can't lose that assignment. Technically, to make it even more robust we need to intersect both of their constraints here (what if both of them are valid vregs already, one has a bank checked, but not a type yet, and the other has the type checked, but not the bank?), but so far it has been working pretty well as is.
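The "intersect both of their constraints" idea mentioned above could look roughly like the following sketch. It is an illustration under stated assumptions, not what the patch does: `VRegConstraints`, `mergeField` and `intersect` are hypothetical names, types and banks are reduced to plain integer IDs, and `0` stands for "not known yet" in the same way `0` stands for `%noreg`.

```cpp
#include <cassert>

// Hypothetical, simplified record of what is known about one register.
// A value of 0 in any field means "unknown" (for Reg: %noreg).
struct VRegConstraints {
  unsigned Reg;
  unsigned TypeID;
  unsigned BankID;
};

// Merges one field of two records. Fails only if both sides are known
// and disagree.
inline bool mergeField(unsigned A, unsigned B, unsigned &Out) {
  if (A != 0 && B != 0 && A != B)
    return false;
  Out = (A != 0) ? A : B;
  return true;
}

// Combines what is known about Def and Use so that neither side's
// type or bank assignment is lost. Returns false on a conflict.
inline bool intersect(const VRegConstraints &Def, const VRegConstraints &Use,
                      VRegConstraints &Out) {
  // Prefer Def's register if it is a real vreg, otherwise fall back to Use's.
  Out.Reg = (Def.Reg != 0) ? Def.Reg : Use.Reg;
  return mergeField(Def.TypeID, Use.TypeID, Out.TypeID) &&
         mergeField(Def.BankID, Use.BankID, Out.BankID);
}
```

The case raised in the comment (one register has only a bank, the other only a type) then yields a result carrying both, while two different known types are reported as a conflict instead of one silently winning.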
681–682	Thanks, I didn't dig into this deep enough yet to know that for myself! And yes, the mapping is the hardest part. That is why the ugly `TestgenSetAllFeatures` command line option exists for now: to overcome this the cheap and dirty way. Of course, that deprives us of testing the feature-checking part of every pattern and of the match table as a whole. This is for future patches and improvements, though.
715	Absolutely no guarantee. However, I took a quick look and it appeared to me that 1 would match more often than any other fixed constant; I didn't measure it, though. Full support for immediate predicates is for future patches.
761–784	True, and that's the whole point of having more than one pass: satisfying dependencies. If the number and complexity of the dependencies were much greater, I would probably process the table once, "box" (as in encapsulate) every opcode (with its operands) in an object, and then sort the objects in a way that satisfies the dependencies. However, the dependencies we've got seem to be simple enough to get away with just a few passes over the table, so I find the "an object per opcode" solution greatly over-engineered and not needed.
781	@dsanders This situation, btw, is very similar to the one with the record instruction opcode: there we don't know which register is defined and which is not, and we cannot afford losing the definition. Here we don't know which register might already have an LLT and/or regbank "checked", and we can't afford losing that info.
796	For future patches.
849	I'm renaming `selectUnconstrainedRegBanks` to `assignRegBanksToUnconstrainedVRegs` and `selectRegBank` helper function (former lambda) to `assignRegBank` for consistency.
884	I'm adding the following comment:
// The major source of literal duplicates is the fact that we map MVTs
// like i32 and f32 to the same s32 LLT, therefore 2 or more patterns
// originally written for SelectionDAG ISel get imported as the exact same
// sequence of semantically meaningful match table opcodes, matching and
// rendering opcodes both:
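How such literal duplicates arise can be shown with a toy model. Everything in the sketch below is an assumption made for illustration: `Pattern`, `toLLT`, `loweredForm` and `groupDuplicates` are hypothetical names, `G_FOO` is a made-up opcode, and the type mapping only handles scalar `iN` / `fN` spellings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified pattern: an opcode name plus operand types
// spelled as SelectionDAG-style MVT names ("i32", "f32", ...).
struct Pattern {
  std::string Opcode;
  std::vector<std::string> OperandTypes;
};

// Assumption for this sketch: a scalar MVT "iN" or "fN" becomes the LLT "sN",
// so integer and floating-point types of equal width become indistinguishable.
inline std::string toLLT(const std::string &MVT) {
  if (MVT.size() > 1 && (MVT[0] == 'i' || MVT[0] == 'f'))
    return "s" + MVT.substr(1);
  return MVT;
}

// Spells out a pattern after type lowering, e.g. "G_FOO:s32,s32".
inline std::string loweredForm(const Pattern &P) {
  std::string Key = P.Opcode + ":";
  for (std::size_t I = 0; I < P.OperandTypes.size(); ++I) {
    if (I != 0)
      Key += ",";
    Key += toLLT(P.OperandTypes[I]);
  }
  return Key;
}

// Groups pattern indices by lowered form. Any group with more than one
// member is a set of literal duplicates.
inline std::map<std::string, std::vector<std::size_t>>
groupDuplicates(const std::vector<Pattern> &Patterns) {
  std::map<std::string, std::vector<std::size_t>> Groups;
  for (std::size_t I = 0; I < Patterns.size(); ++I)
    Groups[loweredForm(Patterns[I])].push_back(I);
  return Groups;
}
```

An `i32` and an `f32` variant of the same pattern end up in one group here, which mirrors why only one test per group of identical rule bodies is worth generating.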
utils/TableGen/GlobalISelEmitter.cpp
4091	It is wrapped in `GET_GLOBALISEL_IMPL` along with the InstructionSelector implementation. Due to `InstructionSelector::getTestgen` method we need the class definition even if we aren't planning to use testgen, but why would we not?
utils/update_instruction_select_testgen_tests.sh
40–43	`sed` is horrible with multiline patterns, hm... This script isn't required to run tests, only to generate / update them, so maybe we could deal with it a bit later. Maybe with a little help from somebody running Windows?

rtereshin marked 33 inline comments as done. Apr 26 2018, 6:49 PM

Hi Daniel @dsanders

I've addressed all of the inline comments, I believe.

> We'll need to find a good way of handling immediates, C++ predicates, ComplexPattern and similar but that should definitely be left for later patches.

Yes.

> Some targets have a rather large number of rules (e.g. X86 is around 17k IIRC). Do we have a mechanism for keeping the number of generated tests to a reasonable level?

Not yet, but it's not a problem yet either as we import only a fraction of the rules as of now.

> with few function-level comments

+ @aemerson (Hi Amara)

> there's little documentation in the code of how this actually works under the hood. Can you add more descriptions so someone can follow what's happening when it comes to maintenance later, e.g. what's LiveInRA supposed to do, what are the pre-conditions and expected outputs of each phase of this tool?

All of this improved noticeably, I hope, but not everything is commented and I'll probably add to it later.

Thank you!
Roman

rtereshin added commits: rL331395: [GlobalISel][InstructionSelect] Refactoring out a getMatchTable virtual method…, rL331396: [GlobalISel][InstructionSelect] Refactoring buildMatchTable out, NFC, rL331398: [GlobalISel][InstructionSelect] Making Coverage Info generation optional on per…, rL330988: [GlobalISel] Reporting rules covered as part of the InstructionSelect's debug…. May 2 2018, 1:24 PM

rtereshin added a child revision: D44700: [GlobalISel] Improving InstructionSelect's performance by reducing MatchTable. May 4 2018, 2:57 PM

Improved in-source comments
Improved usability and scripts, in particular, added find_failing_instruction_select_rules.sh script that will automatically find the selection rules that will fail if executed (most of the time will crash the selector)
Added support for extending loads / truncating stores MatchTable checks
Rebased against master and re-solved namespace / visibility issues introduced by AMDGPU backend (the only backend that put its target-specific derived InstructionSelector's declaration in a header)
Increased Testgen's tolerance towards register bank quirks: AMDGPU is the only backend that has banks "covering" register classes that span across multiple banks.
Decreased invasiveness of the patch in its surroundings, especially within the global isel emitter.
Made sure that if the tests are updated they are updated in place as much as possible (w/o re-ordering machine functions representing selection rules within the test file) and identified by their much more stable Rule IDs rather than number / position or initial index in the MatchTable. This makes diffs much more manageable for a manual review.
Other small improvements and bug fixes.

rtereshin removed a child revision: D44700: [GlobalISel] Improving InstructionSelect's performance by reducing MatchTable. May 16 2018, 10:02 PM

I have:

Fixed IR-building code that failed to update def-use chains properly until the very end of the test generation, rendering GIM_CheckSameOperands fragile and dependent on which operand was already defined / checked explicitly and which one was not
Started sorting the inserted machine instructions in topological order explicitly (rather than implicitly relying on GIM_RecordInsn's opcodes), thus a) reducing the size of the diffs in case of changes b) making GIM_CheckSameOperands less fragile (it could have created uses not dominated by defs previously)
Stabilized resulting vreg numbers to reduce the diffs in case of changes
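The explicit topological ordering mentioned in the list above can be sketched as follows. This is a generic Kahn-style sort over a toy instruction model, not the patch's implementation: `Insn` and `topoOrder` are hypothetical names, registers are plain integers with `0` meaning "no def", and ties are broken by original position as one possible way to keep the output stable between runs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <vector>

// Hypothetical toy instruction: at most one defined register (0 = none)
// and a list of used registers.
struct Insn {
  unsigned Def;
  std::vector<unsigned> Uses;
};

// Returns instruction indices ordered so that every instruction comes after
// the instructions defining its uses. Among instructions that are ready at
// the same time, the one with the smaller original index goes first.
// The result is shorter than the input if the def-use graph has a cycle.
inline std::vector<std::size_t> topoOrder(const std::vector<Insn> &Insns) {
  const std::size_t N = Insns.size();
  std::map<unsigned, std::size_t> DefOf;
  for (std::size_t I = 0; I < N; ++I)
    if (Insns[I].Def != 0)
      DefOf[Insns[I].Def] = I;

  std::vector<std::vector<std::size_t>> Users(N);
  std::vector<std::size_t> Pending(N, 0);
  for (std::size_t I = 0; I < N; ++I)
    for (unsigned Reg : Insns[I].Uses) {
      std::map<unsigned, std::size_t>::const_iterator It = DefOf.find(Reg);
      // Registers without a def in this list are treated as live-ins.
      if (It != DefOf.end() && It->second != I) {
        Users[It->second].push_back(I);
        ++Pending[I];
      }
    }

  std::priority_queue<std::size_t, std::vector<std::size_t>,
                      std::greater<std::size_t>>
      Ready;
  for (std::size_t I = 0; I < N; ++I)
    if (Pending[I] == 0)
      Ready.push(I);

  std::vector<std::size_t> Order;
  while (!Ready.empty()) {
    std::size_t I = Ready.top();
    Ready.pop();
    Order.push_back(I);
    for (std::size_t U : Users[I])
      if (--Pending[U] == 0)
        Ready.push(U);
  }
  return Order;
}
```

Because every use is emitted after its def, a check that makes two operands refer to the same register cannot produce a use that is not dominated by its definition in straight-line code.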

mgrang added inline comments. May 19 2018, 11:26 PM

lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp
261	Please use llvm::sort instead of std::sort. See https://llvm.org/docs/CodingStandards.html#beware-of-non-deterministic-sorting-order-of-equal-elements.

ping

lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp
261	Good catch, thanks, will do!
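The concern behind this request is that `std::sort` makes no promise about the relative order of elements that compare equal, so output derived from such a sort can differ between standard library implementations. Independently of which sort function is called, one way to remove the dependence is to make the comparator a total order. The sketch below uses hypothetical names (`Entry`, `totalLess`, `sortedNames`) and plain standard C++ rather than LLVM's helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record sorted primarily by Key.
struct Entry {
  int Key;
  std::string Name;
};

// Total order: ties on Key are broken by Name, so no two distinct entries
// compare equal and the result cannot depend on the sort implementation
// or on the initial order of the input.
inline bool totalLess(const Entry &A, const Entry &B) {
  return std::tie(A.Key, A.Name) < std::tie(B.Key, B.Name);
}

inline std::vector<std::string> sortedNames(std::vector<Entry> Entries) {
  std::sort(Entries.begin(), Entries.end(), totalLess);
  std::vector<std::string> Names;
  for (const Entry &E : Entries)
    Names.push_back(E.Name);
  return Names;
}
```

`std::stable_sort` is another option when the input order itself is already deterministic.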

aemerson resigned from this revision. Jan 30 2019, 1:36 PM

Herald added subscribers: Petar.Avramovic, jfb. Jan 30 2019, 1:36 PM

Closing as we decided not to pursue this.

Herald added a project: Restricted Project. Feb 26 2019, 9:11 AM

Herald added a subscriber: jdoerfert.

rtereshin abandoned this revision. Feb 26 2019, 9:12 AM

Revision Contents

Path	Size
include/llvm/CodeGen/GlobalISel/InstructionSelectTestgen.h	30 lines
include/llvm/CodeGen/GlobalISel/InstructionSelector.h	69 lines
include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h	1 line
include/llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h	63 lines
include/llvm/InitializePasses.h	1 line
lib/CodeGen/GlobalISel/CMakeLists.txt	2 lines
lib/CodeGen/GlobalISel/GlobalISel.cpp	1 line
lib/CodeGen/GlobalISel/InstructionSelectTestgen.cpp	79 lines
lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp	916 lines
test/TableGen/GlobalISelEmitter.td	2 lines
utils/TableGen/GlobalISelEmitter.cpp	81 lines
utils/update_instruction_select_testgen_tests.sh	62 lines

Diff 144043

include/llvm/CodeGen/GlobalISel/InstructionSelectTestgen.h

This file was added.

				//===-- GlobalISel/InstructionSelectTestgen.h -------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				/// \file This file describes the interface of the ModulePass responsible
				/// for auto-generating regression tests for the InstructionSelect pass of
				/// GlobalISel.
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECT_TESTGEN_H
				#define LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECT_TESTGEN_H

				#include "llvm/Pass.h"

				namespace llvm {
				class InstructionSelectTestgen : public ModulePass {
				public:
				static char ID;
				InstructionSelectTestgen() : ModulePass(ID) {}
				StringRef getPassName() const override { return "InstructionSelectTestgen"; }
				void getAnalysisUsage(AnalysisUsage &AU) const override;
				bool runOnModule(Module &M) override;
				};
				} // namespace llvm

				#endif // LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECT_TESTGEN_H

include/llvm/CodeGen/GlobalISel/InstructionSelector.h

Show All 13 Lines
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_H		#ifndef LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_H
#define LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_H		#define LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_H

#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
		#include "llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h"
#include "llvm/Support/CodeGenCoverage.h"		#include "llvm/Support/CodeGenCoverage.h"
#include <bitset>		#include <bitset>
#include <cstddef>		#include <cstddef>
#include <cstdint>		#include <cstdint>
#include <functional>		#include <functional>
#include <initializer_list>		#include <initializer_list>
#include <vector>		#include <vector>

▲ Show 20 Lines • Show All 186 Lines • ▼ Show 20 Lines	enum {
GIR_AddImplicitUse,		GIR_AddImplicitUse,
/// Add an register to the specified instruction		/// Add an register to the specified instruction
/// - InsnID - Instruction ID to modify		/// - InsnID - Instruction ID to modify
/// - RegNum - The register to add		/// - RegNum - The register to add
GIR_AddRegister,		GIR_AddRegister,
/// Add a temporary register to the specified instruction		/// Add a temporary register to the specified instruction
/// - InsnID - Instruction ID to modify		/// - InsnID - Instruction ID to modify
/// - TempRegID - The temporary register ID to add		/// - TempRegID - The temporary register ID to add
/// - TempRegFlags - The register flags to set		/// - TempRegFlags - The register flags to set
		dsanders: Is this comment accurate? GIR_AddRegister is listed as having 2 operands and I only see 2 in my build
		rtereshin (author): The comments in this section describe an opcode below the comments, not above. So this is all about `GIR_AddTempRegister` and it appears to have 3 operands (https://github.com/llvm-mirror/llvm/blob/master/include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h#L598-L601)
GIR_AddTempRegister,		GIR_AddTempRegister,
/// Add an immediate to the specified instruction		/// Add an immediate to the specified instruction
/// - InsnID - Instruction ID to modify		/// - InsnID - Instruction ID to modify
/// - Imm - The immediate to add		/// - Imm - The immediate to add
GIR_AddImm,		GIR_AddImm,
/// Render complex operands to the specified instruction		/// Render complex operands to the specified instruction
/// - InsnID - Instruction ID to modify		/// - InsnID - Instruction ID to modify
/// - RendererID - The renderer to call		/// - RendererID - The renderer to call
Show All 40 Lines	enum {

/// A successful emission		/// A successful emission
GIR_Done,		GIR_Done,

/// Increment the rule coverage counter.		/// Increment the rule coverage counter.
/// - RuleID - The ID of the rule that was covered.		/// - RuleID - The ID of the rule that was covered.
GIR_Coverage,		GIR_Coverage,

		/// Keeping track of the number of the GI opcodes. Must be the last entry.
GIU_NumOpcodes,		GIU_NumOpcodes,
};		};

		/// Maintaining some meta-information on GI opcodes.
		struct GIOpcodeMeta {
		GIOpcodeMeta() {
		static constexpr unsigned GIU_NumOperands[GIU_NumOpcodes][2] = {
		{GIM_Try, 1},
		{GIM_RecordInsn, 3},
		{GIM_CheckFeatures, 1},
		{GIM_CheckOpcode, 2},
		{GIM_CheckNumOperands, 2},
		{GIM_CheckI64ImmPredicate, 2},
		{GIM_CheckAPIntImmPredicate, 2},
		{GIM_CheckAPFloatImmPredicate, 2},
		{GIM_CheckAtomicOrdering, 2},
		{GIM_CheckAtomicOrderingOrStrongerThan, 2},
		{GIM_CheckAtomicOrderingWeakerThan, 2},
		{GIM_CheckType, 3},
		{GIM_CheckPointerToAny, 3},
		{GIM_CheckRegBankForClass, 3},
		{GIM_CheckComplexPattern, 4},
		{GIM_CheckConstantInt, 3},
		{GIM_CheckLiteralInt, 3},
		{GIM_CheckIntrinsicID, 3},
		{GIM_CheckIsMBB, 2},
		{GIM_CheckIsSafeToFold, 1},
		{GIM_CheckIsSameOperand, 4},
		{GIM_Reject, 0},

		{GIR_MutateOpcode, 3},
		{GIR_BuildMI, 2},
		{GIR_Copy, 3},
		{GIR_CopyOrAddZeroReg, 4},
		{GIR_CopySubReg, 4},
		{GIR_AddImplicitDef, 2},
		{GIR_AddImplicitUse, 2},
		{GIR_AddRegister, 2},
		{GIR_AddTempRegister, 3},
		{GIR_AddImm, 2},
		{GIR_ComplexRenderer, 2},
		{GIR_ComplexSubOperandRenderer, 3},
		{GIR_CustomRenderer, 3},
		{GIR_CopyConstantAsSImm, 2},
		{GIR_ConstrainOperandRC, 3},
		{GIR_ConstrainSelectedInstOperands, 1},
		{GIR_MergeMemOperands, 3},
		{GIR_EraseFromParent, 1},
		{GIR_MakeTempReg, 2},
		{GIR_Done, 0},
		{GIR_Coverage, 1},
		};
		#ifndef NDEBUG
		bool Visited[GIU_NumOpcodes] = {};
		#endif
		for (const auto &R : GIU_NumOperands) {
		assert(!Visited[R[0]] && (Visited[R[0]] = true) &&
		"GIU_NumOperands table contains duplicates or missing a record");
		NumOperands[R[0]] = R[1];
		}
		}
		unsigned NumOperands[GIU_NumOpcodes];
		};

enum {		enum {
/// Indicates the end of the variable-length MergeInsnID list in a		/// Indicates the end of the variable-length MergeInsnID list in a
/// GIR_MergeMemOperands opcode.		/// GIR_MergeMemOperands opcode.
GIU_MergeMemOperands_EndOfList = -1,		GIU_MergeMemOperands_EndOfList = -1,
};		};

/// Provides the logic to select generic machine instructions.		/// Provides the logic to select generic machine instructions.
class InstructionSelector {		class InstructionSelector {
▲ Show 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	protected:
/// means that we don't need to worry about G_OR with equivalent semantics.		/// means that we don't need to worry about G_OR with equivalent semantics.
bool isBaseWithConstantOffset(const MachineOperand &Root,		bool isBaseWithConstantOffset(const MachineOperand &Root,
const MachineRegisterInfo &MRI) const;		const MachineRegisterInfo &MRI) const;

/// Return true if MI can obviously be folded into IntoMI.		/// Return true if MI can obviously be folded into IntoMI.
/// MI and IntoMI do not need to be in the same basic blocks, but MI must		/// MI and IntoMI do not need to be in the same basic blocks, but MI must
/// preceed IntoMI.		/// preceed IntoMI.
bool isObviouslySafeToFold(MachineInstr &MI, MachineInstr &IntoMI) const;		bool isObviouslySafeToFold(MachineInstr &MI, MachineInstr &IntoMI) const;

		public:
		virtual std::unique_ptr<const InstructionSelectorTestgen> getTestgen() const {
		llvm_unreachable("Subclasses must use tablegen'erated GlobalISel "
		"implementation to auto-generate test cases");
		}
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_H		#endif // LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_H
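
A per-opcode operand-count table like `GIOpcodeMeta` in the diff above is what makes it possible to step through a linear match table one opcode at a time. The following standalone sketch shows the idea with a handful of made-up opcodes; the `OP_*` names, `nextOpcodeIdx` and `opcodePositions` are hypothetical, and only the operand counts (1, 2, 2, 1, 0) are borrowed from the corresponding entries of the table above. Variable-length opcodes are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical miniature opcode set; OP_NumOpcodes must stay last.
enum : int64_t {
  OP_Try,
  OP_CheckOpcode,
  OP_BuildMI,
  OP_Coverage,
  OP_Done,
  OP_NumOpcodes
};

// Number of operands following each opcode in the table.
static const unsigned NumOperands[OP_NumOpcodes] = {
    1, // OP_Try: OnFail label
    2, // OP_CheckOpcode
    2, // OP_BuildMI
    1, // OP_Coverage: rule ID
    0, // OP_Done
};

// Index of the opcode that follows the one at Idx.
inline std::size_t nextOpcodeIdx(const std::vector<int64_t> &Table,
                                 std::size_t Idx) {
  assert(Idx < Table.size() && Table[Idx] >= 0 && Table[Idx] < OP_NumOpcodes);
  return Idx + 1 + NumOperands[Table[Idx]];
}

// Positions of all opcodes (as opposed to operands) in a linear table.
inline std::vector<std::size_t>
opcodePositions(const std::vector<int64_t> &Table) {
  std::vector<std::size_t> Positions;
  for (std::size_t I = 0; I < Table.size(); I = nextOpcodeIdx(Table, I))
    Positions.push_back(I);
  return Positions;
}
```

Without the counts, an operand whose value happens to equal an opcode number could not be told apart from a real opcode.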

include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h

Show First 20 Lines • Show All 100 Lines • ▼ Show 20 Lines	case GIM_RecordInsn: {
DEBUG_WITH_TYPE(TgtInstructionSelector::getName(),		DEBUG_WITH_TYPE(TgtInstructionSelector::getName(),
dbgs() << CurrentIdx << ": Is a physical register\n");		dbgs() << CurrentIdx << ": Is a physical register\n");
if (handleReject() == RejectAndGiveUp)		if (handleReject() == RejectAndGiveUp)
return false;		return false;
break;		break;
}		}

MachineInstr *NewMI = MRI.getVRegDef(MO.getReg());		MachineInstr *NewMI = MRI.getVRegDef(MO.getReg());
		assert(NewMI && "Expected a vreg definition");
if ((size_t)NewInsnID < State.MIs.size())		if ((size_t)NewInsnID < State.MIs.size())
State.MIs[NewInsnID] = NewMI;		State.MIs[NewInsnID] = NewMI;
else {		else {
assert((size_t)NewInsnID == State.MIs.size() &&		assert((size_t)NewInsnID == State.MIs.size() &&
"Expected to store MIs in order");		"Expected to store MIs in order");
State.MIs.push_back(NewMI);		State.MIs.push_back(NewMI);
}		}
DEBUG_WITH_TYPE(TgtInstructionSelector::getName(),		DEBUG_WITH_TYPE(TgtInstructionSelector::getName(),
▲ Show 20 Lines • Show All 653 Lines • Show Last 20 Lines

include/llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h

This file was added.

				//===- llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h ------ C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				/// \file This file declares the API for the instruction selector testgen.
				/// The class is responsible for auto-generating regression tests for the
				/// InstructionSelect pass of GlobalISel on rule by rule basis.
				/// The basic and fully functional implementation is TableGen'erated, the
				/// class is used by the InstructionSelectTestgen pass.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_TESTGEN_H
				#define LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_TESTGEN_H

				#include "llvm/ADT/StringRef.h"
				#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
				#include "llvm/CodeGen/MachineModuleInfo.h"
				#include "llvm/Support/CommandLine.h"

				namespace llvm {

				/// Force all target (as well as function and module) features to be available.
				extern cl::opt<bool> TestgenSetAllFeatures;
				dsanders: This one probably isn't harmful since InstructionSelectorTestgen.h isn't going to be widely included, but we ought to avoid anonymous namespaces in headers since each compilation unit that includes the header will get its own version of the declaration. It's probably best to put it in the llvm namespace if we can't push it into the cpp entirely. InstructionSelectorTestgen doesn't seem to have any state so it looks like generateTestCase() could be a static function outside the class
				rtereshin (author): That's a good insight, thanks, I didn't think of that. `generateTestCase` calls virtual member-functions and may do so more in the future and I'd rather keep it that way and keep it a class member. I'm moving the helper class (`LiveInRA` or however we would (re)name it) in `InstructionSelectorTestgen` as a member-class instead, I hope that solves the problem well enough.

				class InstructionSelectorTestgen {
				public:
				virtual ~InstructionSelectorTestgen() = default;

				virtual void generateTestCases(Module &M, MachineModuleInfo &MMI) const = 0;

				virtual bool checkFeatures(unsigned FeatureBitsetID) const = 0;
				virtual LLT getTypeObject(unsigned TypeObjectID) const = 0;
				dsanders: Are all of these really needed in the header? Most seem to only be used from InstructionSelectorTestgen.cpp
				rtereshin (author): Only `TestgenSetAllFeatures` is required to be here as it's used by the instruction selector itself. I'm not exactly sure why not to list all the options in a header, in a sense it's part of its interface, CLI in this case, but sure, I'll remove all of those that don't have to be here. After all, nothing would force this list to be complete, so maybe it's better if it's explicitly incomplete, and also this is rarely (if ever) done across other parts of LLVM.
				virtual const TargetInstrInfo &getTII() const = 0;
				virtual const TargetRegisterInfo &getTRI() const = 0;
				virtual const RegisterBankInfo &getRBI() const = 0;
				virtual const int64_t *getMatchTable() const = 0;

				static MachineIRBuilder createEmptyTestCase(StringRef Name, Module &M,
				MachineModuleInfo &MMI);
				static void emitReturnIntoTestCase(MachineIRBuilder &MIRBuilder);
				static Type *deduceIRType(LLT LLTy, LLVMContext &Context);

				class InputRegAllocator;

				protected:
				InstructionSelectorTestgen() = default;

				void generateTestCasesImpl(Module &M, MachineModuleInfo &MMI) const;

				private:
				void generateTestCase(uint64_t CurrentIdx, InputRegAllocator &RA,
				MachineIRBuilder &MIRBuilder) const;
				};

				} // namespace llvm

				#endif // LLVM_CODEGEN_GLOBALISEL_INSTRUCTIONSELECTOR_TESTGEN_H

include/llvm/InitializePasses.h

	Show First 20 Lines • Show All 174 Lines • ▼ Show 20 Lines
	void initializeInferFunctionAttrsLegacyPassPass(PassRegistry&);			void initializeInferFunctionAttrsLegacyPassPass(PassRegistry&);
	void initializeInlineCostAnalysisPass(PassRegistry&);			void initializeInlineCostAnalysisPass(PassRegistry&);
	void initializeInstCountPass(PassRegistry&);			void initializeInstCountPass(PassRegistry&);
	void initializeInstNamerPass(PassRegistry&);			void initializeInstNamerPass(PassRegistry&);
	void initializeInstSimplifierPass(PassRegistry&);			void initializeInstSimplifierPass(PassRegistry&);
	void initializeInstrProfilingLegacyPassPass(PassRegistry&);			void initializeInstrProfilingLegacyPassPass(PassRegistry&);
	void initializeInstructionCombiningPassPass(PassRegistry&);			void initializeInstructionCombiningPassPass(PassRegistry&);
	void initializeInstructionSelectPass(PassRegistry&);			void initializeInstructionSelectPass(PassRegistry&);
				void initializeInstructionSelectTestgenPass(PassRegistry&);
	void initializeInterleavedAccessPass(PassRegistry&);			void initializeInterleavedAccessPass(PassRegistry&);
	void initializeInternalizeLegacyPassPass(PassRegistry&);			void initializeInternalizeLegacyPassPass(PassRegistry&);
	void initializeIntervalPartitionPass(PassRegistry&);			void initializeIntervalPartitionPass(PassRegistry&);
	void initializeJumpThreadingPass(PassRegistry&);			void initializeJumpThreadingPass(PassRegistry&);
	void initializeLCSSAVerificationPassPass(PassRegistry&);			void initializeLCSSAVerificationPassPass(PassRegistry&);
	void initializeLCSSAWrapperPassPass(PassRegistry&);			void initializeLCSSAWrapperPassPass(PassRegistry&);
	void initializeLazyBlockFrequencyInfoPassPass(PassRegistry&);			void initializeLazyBlockFrequencyInfoPassPass(PassRegistry&);
	void initializeLazyBranchProbabilityInfoPassPass(PassRegistry&);			void initializeLazyBranchProbabilityInfoPassPass(PassRegistry&);
	▲ Show 20 Lines • Show All 209 Lines • Show Last 20 Lines

lib/CodeGen/GlobalISel/CMakeLists.txt

	add_llvm_library(LLVMGlobalISel			add_llvm_library(LLVMGlobalISel
	CallLowering.cpp			CallLowering.cpp
	GlobalISel.cpp			GlobalISel.cpp
	Combiner.cpp			Combiner.cpp
	CombinerHelper.cpp			CombinerHelper.cpp
	IRTranslator.cpp			IRTranslator.cpp
	InstructionSelect.cpp			InstructionSelect.cpp
	InstructionSelector.cpp			InstructionSelector.cpp
				InstructionSelectTestgen.cpp
				InstructionSelectorTestgen.cpp
	LegalityPredicates.cpp			LegalityPredicates.cpp
	LegalizeMutations.cpp			LegalizeMutations.cpp
	Legalizer.cpp			Legalizer.cpp
	LegalizerHelper.cpp			LegalizerHelper.cpp
	LegalizerInfo.cpp			LegalizerInfo.cpp
	Localizer.cpp			Localizer.cpp
	MachineIRBuilder.cpp			MachineIRBuilder.cpp
	RegBankSelect.cpp			RegBankSelect.cpp
	RegisterBank.cpp			RegisterBank.cpp
	RegisterBankInfo.cpp			RegisterBankInfo.cpp
	Utils.cpp			Utils.cpp

	DEPENDS			DEPENDS
	intrinsics_gen			intrinsics_gen
	)			)

lib/CodeGen/GlobalISel/GlobalISel.cpp

	Show All 16 Lines
	using namespace llvm;			using namespace llvm;

	void llvm::initializeGlobalISel(PassRegistry &Registry) {			void llvm::initializeGlobalISel(PassRegistry &Registry) {
	initializeIRTranslatorPass(Registry);			initializeIRTranslatorPass(Registry);
	initializeLegalizerPass(Registry);			initializeLegalizerPass(Registry);
	initializeLocalizerPass(Registry);			initializeLocalizerPass(Registry);
	initializeRegBankSelectPass(Registry);			initializeRegBankSelectPass(Registry);
	initializeInstructionSelectPass(Registry);			initializeInstructionSelectPass(Registry);
				initializeInstructionSelectTestgenPass(Registry);
	}			}

lib/CodeGen/GlobalISel/InstructionSelectTestgen.cpp

This file was added.

				//===- llvm/CodeGen/GlobalISel/InstructionSelectTestgen.cpp ---------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				/// \file
				/// This file implements the InstructionSelectTestgen class.
				//===----------------------------------------------------------------------===//

				#include "llvm/CodeGen/GlobalISel/InstructionSelectTestgen.h"
				#include "llvm/CodeGen/GlobalISel/InstructionSelector.h"
				#include "llvm/IR/Verifier.h"

				#define DEBUG_TYPE "instruction-select-testgen"

				using namespace llvm;

				char InstructionSelectTestgen::ID = 0;
				INITIALIZE_PASS(InstructionSelectTestgen, DEBUG_TYPE,
				"Generate instruction selection test cases", false, false)

				void InstructionSelectTestgen::getAnalysisUsage(AnalysisUsage &AU) const {
				AU.addRequired<MachineModuleInfo>();
				}

				static const MachineFunction &createReturnOnlyTestCase(Module &M,
				MachineModuleInfo &MMI) {
				MachineIRBuilder MIRBuilder =
				InstructionSelectorTestgen::createEmptyTestCase("test_return", M, MMI);
				InstructionSelectorTestgen::emitReturnIntoTestCase(MIRBuilder);
				MachineFunction &MF = MIRBuilder.getMF();
				MF.getProperties().set(MachineFunctionProperties::Property::Legalized);
				MF.getProperties().set(MachineFunctionProperties::Property::RegBankSelected);
				return MF;
				}

				bool InstructionSelectTestgen::runOnModule(Module &M) {
				assert(!verifyModule(M, &dbgs()) && "Input module is not valid");

				MachineModuleInfo &MMI = getAnalysis<MachineModuleInfo>();

				for (const Function &F : M.functions())
				if (const MachineFunction *MF = MMI.getMachineFunction(F))
				// Generally it is possible to run the Testgen over a non-empty module,
				// practically probably a lit-test, and have it add additional test-cases
				dsanders: Could you add a comment indicating that verification is the only thing we do with these functions and why?
				rtereshin (author): Sure, adding the following comment:
				// Generally it is possible to run the Testgen over a non-empty module,
				// practically probably a lit-test, and have it add additional test-cases
				// to it. The easiest way to limit the number of verification failures on
				// the final output caused by initially present machine functions rather
				// than the ones added, is to run the verification on those already
				// present functions first.
				// to it. The easiest way to limit the number of verification failures on
				// the final output caused by initially present machine functions rather
				// than the ones added, is to run the verification on those already
				// present functions first.
				MF->verify(this, "Pre-existing function", /*AbortOnErrors=*/true);

				// Looks like we need to start off with at least some machine function as
				// there is no such thing as a machine module, in particular, we need to gain
				// access to the Target and underlying InstructionSelector and
				// InstructionSelectorTestgen objects.
				const MachineFunction &MF = createReturnOnlyTestCase(M, MMI);

				// An early Testgen sanity check, in particular, see if the return sequence
				// common to all the following test cases makes sense:
				MF.verify(this, "Return only test function", /*AbortOnErrors=*/true);

				const InstructionSelector *ISel = MF.getSubtarget().getInstructionSelector();
				assert(ISel && "Can not work without InstructionSelector");

				const std::unique_ptr<const InstructionSelectorTestgen> Testgen =
				ISel->getTestgen();
				assert(Testgen && "Can not work without InstructionSelectorTestgen");
				Testgen->generateTestCases(M, MMI);

				for (const Function &F : M.functions())
				if (const MachineFunction *MF = MMI.getMachineFunction(F))
				MF->verify(this, "Output machine function", /*AbortOnErrors=*/true);

				assert(!verifyModule(M, &dbgs()) && "Output module is not valid");
				return true;
				}

lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp

This file was added.

				//===- llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp -------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				/// \file
				/// This file implements the InstructionSelectorTestgen class.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/CodeGen/GlobalISel/InstructionSelector.h"
				#include "llvm/CodeGen/GlobalISel/RegisterBankInfo.h"
				#include "llvm/IR/Verifier.h"

				#define DEBUG_TYPE "instructionselector-testgen"

				using namespace llvm;

				cl::opt<unsigned> TestgenFromRule(
				"testgen-from-rule",
				cl::desc("Generate test cases for instruction selector rules from the one "
				"specified (by index, inclusive)"),
				cl::init(std::numeric_limits<unsigned>::min()), cl::Hidden);

				cl::opt<unsigned> TestgenUntilRule(
				"testgen-until-rule",
				cl::desc("Generate test cases for instruction selector rules until the one "
				"specified (by index, inclusive)"),
				cl::init(std::numeric_limits<unsigned>::max()), cl::Hidden);

				cl::list<unsigned> TestgenExcludeRules(
				"testgen-exclude-rules",
				cl::desc("Don't try to generate test cases for the instruction selector "
				"rules listed (by index)"),
				cl::CommaSeparated, cl::Hidden);

				cl::list<unsigned> TestgenIncludeOnly(
				"testgen-include-only",
				cl::desc("Don't try to generate test cases for any instruction selector "
				"rules but the ones listed (by index, every other testgen- option "
				"is ignored)"),
				cl::CommaSeparated, cl::Hidden);

				cl::opt<bool>
				TestgenNoABI("testgen-no-abi",
				cl::desc("Don't try to imitate proper ABI boundaries"),
				cl::Hidden);

				cl::opt<bool> llvm::TestgenSetAllFeatures(
				"testgen-set-all-features",
				cl::desc("Force all target features to be available"), cl::Hidden);

				MachineIRBuilder
				InstructionSelectorTestgen::createEmptyTestCase(StringRef Name, Module &M,
				MachineModuleInfo &MMI) {
				dsanders: If the function already exists somehow then this might not return an empty test case due to the getOrInsertFunction(). We should probably fail in that case.
				rtereshin: True. Let's just `MachineFunction::reset()` it for now if such a thing happens; that would make `llc` update the existing test cases if run over an already generated test-module. It's a sensible thing to do, I think.
				Generally, if not reset, it would actually just add another machine basic block and repeat the instruction sequence derived from the selection rule, so we'll end up with a single machine-function test-case with multiple identical basic blocks testing the same pattern. W/ no bug fixes it will screw up the ABI lowering at the moment though, so it will only work for patterns w/ no input vregs, such as having only constants for operands. And that means it will generate a small test-module; it won't fail or crash, but rather just filter out the broken machine functions using the machine verifier, as it always does. Or tries to, at least.
				Technically, I'm still considering turning this into a sort of a benchmark-gen, not sure at the moment if that could be actually useful though. Theoretically, we could have a tool that will be able to generate long def-use chains of instruction sequences corresponding to a particular pattern or a subset of patterns in a type-driven manner and use them to benchmark the selector as well as other parts of the pipeline.
				LLVMContext &Context = M.getContext();

				Function &F = *cast<Function>(M.getOrInsertFunction(
				Name, FunctionType::get(Type::getVoidTy(Context), /*isVarArg=*/false)));
				auto *BB = F.empty() ? BasicBlock::Create(Context, "entry", &F) : &F.front();
				if (BB->empty())
				new UnreachableInst(Context, BB);
				assert(!verifyFunction(F, &dbgs()) && "Empty test IR is broken");

				MachineFunction &MF = MMI.getOrCreateMachineFunction(F);
				// In case the test case with that particular name already existed and we are
				// in the update mode:
				MF.reset();

				MachineBasicBlock *MBB = MF.CreateMachineBasicBlock(BB);
				MF.push_back(MBB);
				MF.verify(/*Pass=*/nullptr, "Empty test function", /*AbortOnErrors=*/true);

				MachineIRBuilder MIRBuilder(MF);
				MIRBuilder.setMBB(*MBB);
				return MIRBuilder;
				}

				void InstructionSelectorTestgen::emitReturnIntoTestCase(
				MachineIRBuilder &MIRBuilder) {
				MachineBasicBlock &MBB = MIRBuilder.getMBB();
				unsigned RetVReg = 0;
				if (!MBB.empty()) {
				const MachineInstr &RootInstr = MBB.back();
				if (RootInstr.getNumOperands()) {
				const MachineOperand &MO = RootInstr.getOperand(0);
				if (MO.isReg() && MO.isDef())
				RetVReg = MO.getReg();
				}
				}
				auto InsnB = MIRBuilder.buildInstr(TargetOpcode::PATCHABLE_RET).addDef(0);
				if (RetVReg)
				InsnB.addUse(RetVReg);
				}

				Type *InstructionSelectorTestgen::deduceIRType(LLT LLTy, LLVMContext &Context) {
				assert(LLTy.isValid() && "Didn't expect an invalid LLT as input");
				if (LLTy.isVector())
				return VectorType::get(deduceIRType(LLTy.getElementType(), Context),
				dsanders: LLT's scalars aren't always integers. So long as we inject bitcasts this will be fine though.
				rtereshin: They aren't, but it appears to me that this is the best we could do w/o going to great lengths, as the information about the actual type is pretty much lost at match table level.
				At some point, I tried to lower ABI, specifically to insert an appropriate return sequence, by re-using `CallLoweringInfo::lowerReturn` provided by the target. The method expects an LLVM `Value`; however, the targets only analyze (or did at the time) the `Value`'s type, so it seemed sufficient to provide it an IR constant of an appropriate IR type. It didn't work out for a number of reasons, most notably not all the targets could handle all the types, especially vector types. So I started to use the `PATCHABLE_RET` instead for this purpose.
				So currently `deduceIRType` is only used to build appropriate machine memory operands to make `GIM_CheckAtomicOrdering` happy, specifically, to figure out a reasonable size and alignment. As the size is directly inferable from LLT, this is mostly to figure out the alignment.
				LLTy.getNumElements());
				if (LLTy.isPointer())
				return PointerType::get(Type::getInt32Ty(Context), LLTy.getAddressSpace());
				return Type::getIntNTy(Context, LLTy.getSizeInBits());
				}

				// RegisterBankInfo's getRegBankFromRegClass is only required to be defined on
				// user-defined register classes. It could be unable to map TableGen'erated
				// register classes (intersections, for instance) to their reg banks
				static const RegisterBank *
				getRegBankFromRegClass(const RegisterBankInfo &RBI,
				const TargetRegisterClass *RC) {
				const RegisterBank *Result = nullptr;
				for (unsigned i = 0; i < RBI.getNumRegBanks(); ++i) {
				const RegisterBank &RB = RBI.getRegBank(i);
				if (RB.covers(*RC)) {
				assert(!Result && "Register banks should not intersect");
				Result = &RB;
				}
				}
				return Result;
				dsanders: If I understand correctly, this is a register allocator that is used to generate the live-ins for the test function. This needs some explaining in the comments and possibly also renaming (I don't have a good spelling for it, maybe InputRegAllocator).
				rtereshin: Your understanding is correct. I'm adding the following comment:
				+/// A helper providing sensible phys regs to define patterns' input vregs.
				+///
				+/// It appears that the most portable and robust way to define all the input
				+/// vregs (used, but not defined) of an instruction sequence being generated is
				+/// to define them as COPY's from appropriate and preferably distinct physical
				+/// registers, live-in to the basic block containing the instruction sequence and
				+/// preferably the entire function as well. The best pick is to have the same
				+/// size in bits as the vreg, same register bank, to be allocatable, and from
				+/// the beginning of the allocation order. The class also handles the following
				+/// issues:
				+///
				+/// 1) RegisterBankInfo::getRegBankFromRegClass not being defined for
				+/// tablegen'erated reg classes.
				+/// 2) Register classes containing the same physical register and yet having
				+/// different sizes, and therefore physical registers having only weakly
				+/// defined size as the maximum of the sizes of all register classes they
				+/// belong to.
				+/// 3) Register banks not having a full list of register sizes available
				+/// directly.
				+///
				+/// The latter is also used for picking sensible register banks for internal
				+/// (defined and used both by the instruction sequence being generated) vregs.
				and renaming the class from `LiveInRA` to `InputRegAllocator` (from `(anonymous namespace)::LiveInRA` to `llvm::InstructionSelectorTestgen::InputRegAllocator` to be exact).
				}

				/// A helper providing sensible phys regs to define patterns' input vregs.
				///
				/// It appears that the most portable and robust way to define all the input
				/// vregs (used, but not defined) of an instruction sequence being generated is
				/// to define them as COPY's from appropriate and preferably distinct physical
				/// registers, live-in to the basic block containing the instruction sequence and
				/// preferably the entire function as well. The best pick is to have the same
				/// size in bits as the vreg, same register bank, to be allocatable, and from
				/// the beginning of the allocation order. The class also handles the following
				/// issues:
				///
				/// 1) RegisterBankInfo::getRegBankFromRegClass not being defined for
				/// tablegen'erated reg classes.
				/// 2) Register classes containing the same physical register and yet having
				/// different sizes, and therefore physical registers having only weakly
				/// defined size as the maximum of the sizes of all register classes they
				/// belong to.
				/// 3) Register banks not having a full list of register sizes available
				/// directly.
				///
				/// The latter is also used for picking sensible register banks for internal
				/// (defined and used both by the instruction sequence being generated) vregs.
				class InstructionSelectorTestgen::InputRegAllocator {
				public:
				InputRegAllocator(const TargetRegisterInfo &TRI, const RegisterBankInfo &RBI,
				const MachineFunction &MF) {
				// Classes processed first have the largest impact on the resulting order as
				// we only append non-visited registers. Therefore, it's better to process
				// the largest classes first:
				SmallVector<const TargetRegisterClass *, 32> RegClasses(TRI.regclasses());
				std::stable_sort(
				RegClasses.begin(), RegClasses.end(),
				[](const TargetRegisterClass *RC1, const TargetRegisterClass *RC2) {
				// For the reg classes of the same size, it's better to start with
				// more concrete ones so we don't overuse weird stuff like register
				// tuples (see AArch64) and whatnot, so stable sort this way here,
				// reverse later:
				return RC1->getNumRegs() < RC2->getNumRegs();
				});
				std::reverse(RegClasses.begin(), RegClasses.end());

				// Every physical register belongs to a unique register bank, but it may
				// belong to register classes of different sizes. Therefore, "visited"
				// property needs to be per size:
				SmallDenseMap<unsigned, BitVector, 16> Size2VisitedPhysRegs;
				for (const TargetRegisterClass *RC : RegClasses)
				if (const RegisterBank *RB = getRegBankFromRegClass(RBI, RC)) {
				const unsigned Size = TRI.getRegSizeInBits(*RC);
				// RegisterClassInfo::getOrder is not available as the reserved
				// registers set is not frozen yet:
				for (MCPhysReg PhysReg : RC->getRawAllocationOrder(MF))
				if (!setVisited(Size2VisitedPhysRegs[Size], PhysReg))
				SizeRegBank2Regs[std::make_pair(Size, RB)].first.push_back(PhysReg);
				}
				// Initializing TrueSizeRegBank2Regs and Reg2TrueSize
				SmallVector<SizeRegBank2RegsTy::key_type, 32> Keys;
				for (const auto &Item : SizeRegBank2Regs)
				Keys.push_back(Item.first);
				std::sort(Keys.begin(), Keys.end(), std::greater<decltype(Keys.front())>());
				BitVector VisitedPhysRegs;
				for (const auto Key : Keys)
				for (const MCPhysReg PhysReg : SizeRegBank2Regs[Key].first)
				if (!setVisited(VisitedPhysRegs, PhysReg)) {
				Reg2TrueSize[PhysReg] = Key.first;
				TrueSizeRegBank2Regs[Key].first.push_back(PhysReg);
				}
				}

				/// Returns an inferred actual size of the physical register: true size.
				/// As a physical register can belong to multiple register classes
				/// with different register sizes (in bits), the actual size of the
				/// register could only be guessed as the maximum of all sizes:
				unsigned getTrueRegSize(MCPhysReg PhysReg) const {
				return Reg2TrueSize.find(PhysReg)->second;
				}

				/// See if there is a phys reg available for the given size and reg bank.
				/// \p ByTrueSize has the same meaning as in next(...) member function.
				bool hasNext(unsigned SizeInBits, const RegisterBank &RB,
				bool ByTrueSize = false) const {
				const auto &Map = ByTrueSize ? TrueSizeRegBank2Regs : SizeRegBank2Regs;
				return Map.count(std::make_pair(SizeInBits, &RB));
				}

				/// Get the next available phys reg for the given size and reg bank.
				/// \pre hasNext(...) returns true
				/// \p ByTrueSize indicates if the physical register requested must have the
				/// given size as its true size (maximum of the sizes of all register classes
				/// it belongs to), or it's enough for it to have at least one register class
				/// containing it with the size specified.
				MCPhysReg next(unsigned SizeInBits, const RegisterBank &RB,
				bool ByTrueSize = false) {
				assert(hasNext(SizeInBits, RB, ByTrueSize) && "Didn't find an allocatable "
				"physical register for the "
				"register bank and size");
				auto &Map = ByTrueSize ? TrueSizeRegBank2Regs : SizeRegBank2Regs;
				auto &Pool = Map[std::make_pair(SizeInBits, &RB)];
				MCPhysReg AllocatedPhysReg = Pool.first[Pool.second];
				Pool.second = (Pool.second + 1) % Pool.first.size();
				return AllocatedPhysReg;
				}

				/// Continue all further phys reg allocations from the beginning of the
				/// allocation order.
				InputRegAllocator &reset() {
				for (auto &Item : SizeRegBank2Regs)
				Item.second.second = 0;
				for (auto &Item : TrueSizeRegBank2Regs)
				Item.second.second = 0;
				return *this;
				}

				using Size2RegBanksTy =
				std::map<unsigned, SmallPtrSet<const RegisterBank *, 2>>;

				/// Get a map from a register (class) size in bits to the full set of register
				/// banks containing at least one (non-empty) register class of that size.
				const Size2RegBanksTy &getSize2RegBanks() const {
				if (Size2RegBanks.empty())
				for (const auto &Item : SizeRegBank2Regs)
				Size2RegBanks[Item.first.first].insert(Item.first.second);
				return Size2RegBanks;
				}

				void dump(const TargetRegisterInfo &TRI) const {
				for (const auto &Item : SizeRegBank2Regs) {
				dbgs() << "Phys regs of size " << Item.first.first
				<< " bits from reg bank " << Item.first.second->getName() << ":";
				for (const MCPhysReg PhysReg : Item.second.first)
				dbgs() << " " << TRI.getRegAsmName(PhysReg);
				dbgs() << "\n";
				}
				}

				private:
				mgrang: Please use llvm::sort instead of std::sort. See https://llvm.org/docs/CodingStandards.html#beware-of-non-deterministic-sorting-order-of-equal-elements.
				rtereshin: Good catch, thanks, will do!
				using SizeRegBank2RegsTy =
				DenseMap<std::pair<unsigned, const RegisterBank *>,
				std::pair<SmallVector<MCPhysReg, 32>, unsigned>>;

				// All the physical registers available per bank and size in the preferred
				// allocation order with an "allocated so far" index attached:
				SizeRegBank2RegsTy SizeRegBank2Regs;

				// getTrueRegSize's cache
				DenseMap<MCPhysReg, unsigned> Reg2TrueSize;

				// All the physical registers available per bank and true size in the
				// preferred allocation order with an "allocated so far" index attached:
				SizeRegBank2RegsTy TrueSizeRegBank2Regs;

				// getSize2RegBanks' cache
				mutable Size2RegBanksTy Size2RegBanks;

				static bool setVisited(BitVector &VisitedPhysRegs, MCPhysReg PhysReg) {
				if (PhysReg >= VisitedPhysRegs.size())
				VisitedPhysRegs.resize(PhysReg + 1);
				if (!VisitedPhysRegs[PhysReg]) {
				dsanders: I think a word is missing from "of the required by".
				rtereshin: An example sentence would be "Didn't find a register bank containing allocatable registers of the required by LLT <2 x s16> size of 32 bits or larger for an unconstrained vreg %1". Breaking it down as follows: "Didn't find a register bank containing allocatable registers of a compatible size; vreg: %1(<2 x s16>), size: 32 bits (or larger)."
				VisitedPhysRegs[PhysReg] = true;
				return false;
				}
				return true;
				}
				};
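
The wrap-around behaviour of `next(...)` and `reset()` above can be illustrated outside of LLVM. The following is a minimal standalone sketch, not the patch's implementation: `RoundRobinRegPool`, `PhysReg` and `BankId` are hypothetical stand-ins (plain integers replace `MCPhysReg` and `RegisterBank`), and a `std::map` replaces the `DenseMap` used in the class.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins: registers and banks are plain integers here.
using PhysReg = unsigned;
using BankId = unsigned;

class RoundRobinRegPool {
public:
  // Registers are expected to be added in the preferred allocation order.
  void add(unsigned SizeInBits, BankId Bank, PhysReg Reg) {
    Pools[std::make_pair(SizeInBits, Bank)].Regs.push_back(Reg);
  }

  // True if at least one register was added for this (size, bank) pair.
  bool hasNext(unsigned SizeInBits, BankId Bank) const {
    return Pools.count(std::make_pair(SizeInBits, Bank)) != 0;
  }

  // Hands out registers in order and wraps around once the pool is
  // exhausted, so later requests reuse registers instead of failing.
  PhysReg next(unsigned SizeInBits, BankId Bank) {
    assert(hasNext(SizeInBits, Bank) && "no register for this size and bank");
    Pool &P = Pools[std::make_pair(SizeInBits, Bank)];
    PhysReg Reg = P.Regs[P.NextIdx];
    P.NextIdx = (P.NextIdx + 1) % P.Regs.size();
    return Reg;
  }

  // Restart every pool from the beginning of its allocation order.
  void reset() {
    for (auto &Item : Pools)
      Item.second.NextIdx = 0;
  }

private:
  struct Pool {
    std::vector<PhysReg> Regs;
    std::size_t NextIdx = 0;
  };
  std::map<std::pair<unsigned, BankId>, Pool> Pools;
};
```

With two registers in a pool, three consecutive `next` calls return the first, the second, and then the first again; this is why the class comment asks only for "preferably distinct" physical registers.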

				using InputRegAllocator = InstructionSelectorTestgen::InputRegAllocator;

				/// Speculatively assign a sensible register bank to the \p Reg provided.
				///
				/// Do nothing if the \p Reg already has a bank assigned, otherwise pick a bank
				/// based on vreg's size and the register bank frequency analysis provided via
				/// \p Hist. Update \p Hist to reflect the changes.
				static void
				selectRegBank(unsigned Reg,
				const InputRegAllocator::Size2RegBanksTy &Size2RegBanks,
				SmallDenseMap<const RegisterBank *, unsigned, 4> &Hist,
				MachineFunction &MF) {
				MachineRegisterInfo &MRI = MF.getRegInfo();
				if (MRI.getRegBankOrNull(Reg))
				return;
				const LLT LLTy = MRI.getType(Reg);
				unsigned Size = LLTy.getSizeInBits();
				dsanders: greadily -> greedily
				rtereshin: Oops, good catch, thanks!
				// Some of the vregs are so small that it's not possible to allocate equally
				// sized phys regs for them. In that case we'd better find the closest
				dsanders: This lambda is getting pretty big. I'd be inclined to make it a static function in its own right.
				rtereshin: Agreed, doing that. I'm also changing `Size2RegBanksTy` from `DenseMap` to `std::map`; that gets rid of extracting the keys (`Sizes`) and sorting them every time, makes the implementation a little simpler and reduces the number of arguments (former closure-captured variables) of the static function being extracted.
				// allocatable size still sufficiently large to initialize all bits of the
				// vreg. We can't do the same in a robust and portable way if the vreg is
				// larger than any allocatable phys reg though.
				const auto I = Size2RegBanks.lower_bound(Size);
				if (I == Size2RegBanks.end()) {
				dbgs() << "- Didn't find a register bank containing allocatable registers\n"
				" of a compatible size; vreg: %"
				<< TargetRegisterInfo::virtReg2Index(Reg) << "(" << LLTy
				<< "), size: " << Size << " bits (or larger).\n";
				// This test will definitely fail during the actual selection due to
				// register banks not being fully defined yet, but we don't want to
				// create too much noise on targets early in GlobalISel migration:
				MF.getProperties().set(MachineFunctionProperties::Property::FailedISel);
				return;
				}
				const auto &Banks = I->second;
				assert(!Banks.empty() && "Size2RegBanks is broken: contains empty sets");
				// Out of the banks that have physical registers of an appropriate size
				// pick the one that is most common already:
				const RegisterBank *RB = *std::max_element(
				Banks.begin(), Banks.end(),
				[&Hist](const RegisterBank *RB1, const RegisterBank *RB2) {
				// Add the unique ID as a tiebreaker for determinism
				return std::make_pair(Hist[RB1], RB1->getID()) <
				std::make_pair(Hist[RB2], RB2->getID());
				});
				assert(RB && "Size2RegBanks is broken: contains null pointers");
				MRI.setRegBank(Reg, *RB);
				// As we might have not picked the most common bank at the previous step
				// out of all banks, but only out of a subset of them, update the stats so
				// we could greedily select the locally best result for the next vreg:
				++Hist[RB];
				}
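
The selection logic of `selectRegBank` (closest sufficiently large size via `lower_bound`, then the most frequent bank with the ID as a tiebreaker) can be sketched in isolation. This is an illustrative model under stated assumptions, not the patch's code: `pickBank`, `BankId`, `Size2Banks` and `Histogram` are hypothetical names, banks are plain integers, and failure is reported through a `bool` instead of setting `FailedISel`.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <utility>

// Hypothetical stand-ins for RegisterBank pointers and the maps used above.
using BankId = unsigned;
using Size2Banks = std::map<unsigned, std::set<BankId>>;
using Histogram = std::map<BankId, unsigned>;

// Returns true and sets Picked if some bank has registers of at least
// SizeInBits bits; returns false if the value is larger than any register.
bool pickBank(unsigned SizeInBits, const Size2Banks &Sizes, Histogram &Hist,
              BankId &Picked) {
  // The smallest available size that still covers every bit of the value.
  auto I = Sizes.lower_bound(SizeInBits);
  if (I == Sizes.end() || I->second.empty())
    return false;
  const std::set<BankId> &Banks = I->second;
  // Most common bank so far wins; the bank ID breaks ties deterministically.
  Picked = *std::max_element(Banks.begin(), Banks.end(),
                             [&Hist](BankId A, BankId B) {
                               return std::make_pair(Hist[A], A) <
                                      std::make_pair(Hist[B], B);
                             });
  // Record the choice so the next value is picked greedily against it.
  ++Hist[Picked];
  return true;
}
```

Note that with equal counts the pair comparison makes the larger ID win, which mirrors the `(Hist[RB], RB->getID())` ordering in the function above.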

				/// Speculatively assign sensible register banks to all vregs in a function
				/// missing a register bank.
				static void selectUnconstrainedRegBanks(
				dsanders: As a general thing: Writes to dbgs() should generally be wrapped in DEBUG(...) or DEBUG_ONLY(...) so that we don't format strings we're not going to emit.
				rtereshin: Done. Regarding "so that we don't format strings we're not going to emit": w/o the DEBUG macro we actually emit all of this in both assert and release builds.
				const InputRegAllocator::Size2RegBanksTy &Size2RegBanks,
				MachineFunction &MF) {
				MachineRegisterInfo &MRI = MF.getRegInfo();
				SmallDenseMap<const RegisterBank *, unsigned, 4> Hist;
				for (unsigned i = 0; i < MRI.getNumVirtRegs(); ++i) {
				const unsigned Reg = TargetRegisterInfo::index2VirtReg(i);
				// Overall in & out vregs of the entire instruction sequence affect the
				// selection, the internal vregs and their reg banks don't usually matter,
				dsanders: I'm not sure IMPLICIT_DEF is the right thing to use for these if-statements. IMPLICIT_DEF is a definition with an unknown value (much like UNDEF) so it wouldn't be wrong to propagate it in something like `%0 = IMPLICIT_DEF; %2 = G_ADD %0, %1` to `%2 = IMPLICIT_DEF`. A COPY of a live-in phys-reg would be safer but it's probably ok since constant propagation isn't ISel's job. I think this would only become an issue if we started porting this to combiners.
				rtereshin: Regarding "COPY of a live-in phys-reg would be safer": it is, which is why I'm using `IMPLICIT_DEF` only if I couldn't find a phys-reg with the appropriate size and within the required register bank (or if this behavior is explicitly requested by a command line option).
				// but if the selector went to the trouble of checking them, they probably do
				// so better take them into account as well as the external ones:
				if (!MRI.def_empty(Reg) || !MRI.use_empty(Reg))
				// See which bank the instruction sequence mostly lies in:
				++Hist[MRI.getRegBankOrNull(Reg)];
				}
				// Process in & out "external" vregs first so our choices on internal ones
				// don't affect our choices on the external ones:
				for (unsigned i = 0; i < MRI.getNumVirtRegs(); ++i) {
				const unsigned Reg = TargetRegisterInfo::index2VirtReg(i);
				if (MRI.def_empty(Reg) ^ MRI.use_empty(Reg))
				selectRegBank(Reg, Size2RegBanks, Hist, MF);
				}
				dsanders: This table is going to be quite fragile. We should at least put this somewhere near the GIM_/GIR_ declarations or at least cross-reference them in the comments. Tablegen-erating them might be sensible if we get additional metadata.
				rtereshin: Agreed. I'm making this less fragile by doing the following:
				1. Specifying a list of pairs implicitly defining a mapping from an opcode to its number of operands instead, so there is no need to maintain the records in a specific order matching the order of the opcode definitions.
				2. Opcodes are used directly, not just as a comment, thus making sure there are no non-existent opcodes mentioned in the list.
				3. Adding assertions that would make sure that there are no opcodes missing from the mapping.
				4. Putting the definition right next to the opcode definitions.
				for (unsigned i = 0; i < MRI.getNumVirtRegs(); ++i) {
				const unsigned Reg = TargetRegisterInfo::index2VirtReg(i);
				// The internal vregs don't really matter, but we still need to pick
				// something for them. Technically, out-of-place banks, banks with no
				// registers big enough or no registers at all are fine here, but just in
				// case we don't really want to stress selector too much with malicious
				// input:
				if (!MRI.def_empty(Reg) && !MRI.use_empty(Reg))
				selectRegBank(Reg, Size2RegBanks, Hist, MF);
				}
				}
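
The three loops in `selectUnconstrainedRegBanks` rely on a simple classification of virtual registers by whether they have definitions and uses inside the generated sequence. A minimal sketch of that classification follows, assuming plain def/use counts in place of `MRI.def_empty` / `MRI.use_empty`; `VRegKind` and `classify` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical classification mirroring the predicates used above.
enum class VRegKind { Unused, External, Internal };

VRegKind classify(unsigned NumDefs, unsigned NumUses) {
  const bool DefEmpty = NumDefs == 0;
  const bool UseEmpty = NumUses == 0;
  if (DefEmpty && UseEmpty)
    return VRegKind::Unused;   // contributes nothing to the histogram
  if (DefEmpty != UseEmpty)
    return VRegKind::External; // input (used only) or output (defined only)
  return VRegKind::Internal;   // both defined and used by the sequence
}
```

External registers correspond to the `def_empty ^ use_empty` loop and are processed first; internal ones correspond to the `!def_empty && !use_empty` loop that runs afterwards.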

				static void defineUndefinedVRegs(const TargetInstrInfo &TII,
				const TargetRegisterInfo &TRI,
				const RegisterBankInfo &RBI,
				InputRegAllocator &RA,
				MachineIRBuilder &MIRBuilder) {
				MachineFunction &MF = MIRBuilder.getMF();
				MachineRegisterInfo &MRI = MF.getRegInfo();
				const auto OrigInsertPt = MIRBuilder.getInsertPt();
				MIRBuilder.setInsertPt(MIRBuilder.getMBB(), MIRBuilder.getMBB().begin());
				for (unsigned i = 0; i < MRI.getNumVirtRegs(); ++i) {
				const unsigned Reg = TargetRegisterInfo::index2VirtReg(i);
				if (MRI.def_empty(Reg) && !MRI.use_empty(Reg)) {
				LLT LLTy = MRI.getType(Reg);
				assert(LLTy.isValid() && "Undefined virtual reg doesn't have a type");
				const RegisterBank *RB = MRI.getRegBankOrNull(Reg);
				if (!RB) {
				dbgs() << "- Undefined virtual register %"
				<< TargetRegisterInfo::virtReg2Index(Reg)
				<< " doesn't have a reg bank\n";
				MIRBuilder.buildInstr(TargetOpcode::IMPLICIT_DEF).addDef(Reg);
				} else if (!RA.hasNext(LLTy.getSizeInBits(), *RB, true)) {
				dbgs() << "- Didn't find an allocatable physical register for the\n"
				" register bank " << RB->getName() << " of the required by"
				" LLT " << LLTy << "\n size of " << LLTy.getSizeInBits()
				<< " bits for an undefined virtual register %"
				<< TargetRegisterInfo::virtReg2Index(Reg) << "\n";
				MIRBuilder.buildInstr(TargetOpcode::IMPLICIT_DEF).addDef(Reg);
				} else if (TestgenNoABI)
				MIRBuilder.buildInstr(TargetOpcode::IMPLICIT_DEF).addDef(Reg);
				else
				MRI.addLiveIn(RA.next(LLTy.getSizeInBits(), *RB, true), Reg);
				}
				}
				MIRBuilder.setInsertPt(MIRBuilder.getMBB(), OrigInsertPt);
				MRI.EmitLiveInCopies(&MF.front(), TRI, TII);
				}

				static uint64_t nextGIOpcodeIdx(const int64_t *MatchTable,
				uint64_t CurrentIdx) {
				static const GIOpcodeMeta GIU;
				if (MatchTable[CurrentIdx] == GIR_MergeMemOperands)
				do {
				++CurrentIdx;
				dsanders: If we have a rule with variadic instructions, what do we do about the number of defs?
				rtereshin: That's a very good question, thanks! I suppose I will have to replace the `NumDefs = InsnB->getDesc().NumDefs;` line below with `NumDefs = std::max(InsnB->getDesc().NumDefs, InsnB->getNumExplicitDefs());` as soon as we get `MachineInstr::getNumExplicitDefs()` merged in ;-) (https://reviews.llvm.org/D45640). It will help, but won't solve the problem.
				It's not a problem for now, though, as `InstructionSelector` can't really handle those either. For instance, the record instruction opcode clearly assumes that the definition is always operand 0. When it does support the case, however, most likely it won't be doing that by checking how many definitions an instruction has; it will most likely just rely on MIR being valid. That naturally assumes the machine verifier can check this stuff. So I guess at some point the machine verifier will special-case instructions like `G_UNMERGE_VALUES`. We can implement that check in the machine verifier as `unsigned getNumExplicitDefsExpected(const MachineInstr &I)` refactored out and then checking if the actual `I` has the number of defs expected. And then we can reuse `getNumExplicitDefsExpected` right here in Testgen.
				If the generic opcode has a very flexible number of defs, as in, it's not derivable from the number of operands and their types (do we even have these?), it will probably still be all right: we just see the highest operand index that the pattern explicitly requires to be a definition (via record instruction opcodes, for instance), and we say that that operand is the last def (and of course every operand with a lower index is also a def). And that should produce a) valid MIR, as we started by saying this mysterious opcode is very flexible with defs, and b) MIR that could be matched by the pattern, and it's all we care about.
				Not to mention, if we end up having generic opcodes like this, with a non-derivable number of defs and the number of defs having an impact on the instructions' semantics, we will end up having a match table opcode checking the number of defs explicitly. But again, it's not a problem for now.
				} while (MatchTable[CurrentIdx] != GIU_MergeMemOperands_EndOfList);
				else
				CurrentIdx += GIU.NumOperands[MatchTable[CurrentIdx]];
				return CurrentIdx + 1;
				}
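
The stepping rule in `nextGIOpcodeIdx` (fixed operand counts for most opcodes, a sentinel-terminated list for one of them) can be modelled on a toy table. The sketch below is only an illustration: the opcode values, `LIST_END`, the `NumOperands` array and `nextOpcodeIdx` are all hypothetical and unrelated to the real GIM_/GIR_ encodings or to `GIOpcodeMeta`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes, not the real GIM_/GIR_ values.
enum : int64_t {
  OP_Check = 0,     // two fixed operands
  OP_Emit = 1,      // one fixed operand
  OP_MergeList = 2, // variable-length list terminated by LIST_END
  OP_Done = 3,      // no operands
  LIST_END = -1
};

// Fixed operand count per opcode, indexed by opcode value. The entry for
// OP_MergeList is unused because its length is found by scanning.
static const std::size_t NumOperands[] = {2, 1, 0, 0};

// Returns the index of the opcode following the one at Idx.
std::size_t nextOpcodeIdx(const std::vector<int64_t> &Table, std::size_t Idx) {
  if (Table[Idx] == OP_MergeList) {
    // Advance until Idx points at the terminating sentinel.
    do {
      ++Idx;
    } while (Table[Idx] != LIST_END);
  } else {
    // Skip over the fixed number of operands.
    Idx += NumOperands[Table[Idx]];
  }
  // Step past the last operand (or the sentinel) to the next opcode.
  return Idx + 1;
}
```

Walking a table by repeatedly calling `nextOpcodeIdx` visits only opcode positions, which is what makes a per-opcode operand-count mapping sufficient for decoding.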

				static MachineInstrBuilder &ensureNumOperands(MachineInstrBuilder &InsnB,
				int64_t NumOperands,
				int64_t NumDefs = -1) {
				if (NumDefs == -1)
				NumDefs = InsnB->getDesc().NumDefs;

				while (InsnB->getNumOperands() < NumOperands)
				if (InsnB->getNumOperands() < NumDefs)
				InsnB.addDef(0);
				else
				InsnB.addReg(0);

				for (int I = 0; I < NumDefs; ++I)
				InsnB->getOperand(I).setIsDef();

				return InsnB;
				}
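
The padding behaviour of `ensureNumOperands` is easy to misread (see the review question further down about whether its post-condition can ever be false), so here is a minimal model of it. It is a sketch under assumptions: `Operand` is a hypothetical two-field struct standing in for `MachineOperand`, the operand list is a `std::vector`, `NumDefs` is always passed explicitly, and, unlike the original, the final loop is bounded by the vector size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a register machine operand.
struct Operand {
  unsigned Reg = 0;   // 0 plays the role of "no register yet"
  bool IsDef = false;
};

// Pads Ops with placeholder operands up to NumOperands; never removes any.
void ensureNumOperands(std::vector<Operand> &Ops, std::size_t NumOperands,
                       std::size_t NumDefs) {
  while (Ops.size() < NumOperands) {
    Operand Op;
    // Placeholders within the first NumDefs positions are definitions.
    Op.IsDef = Ops.size() < NumDefs;
    Ops.push_back(Op);
  }
  // Operands that already existed in def positions are marked as defs too.
  for (std::size_t I = 0; I < NumDefs && I < Ops.size(); ++I)
    Ops[I].IsDef = true;
}
```

Calling it with a smaller `NumOperands` than the current size leaves the list untouched, so afterwards the size is at least, but not necessarily exactly, the requested count.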

				static void setType(MachineInstrBuilder &InsnB, unsigned OpIdx,
				LLT ExpectedType, MachineRegisterInfo &MRI) {
				SmallVector<unsigned, 3> OpIdxs{OpIdx};
				const MCInstrDesc &MCID = InsnB->getDesc();
				if (OpIdx < MCID.getNumOperands() && MCID.OpInfo[OpIdx].isGenericType()) {
				const unsigned TypeIdx = MCID.OpInfo[OpIdx].getGenericTypeIndex();
				for (unsigned I = 0;
				I < std::min(MCID.getNumOperands(), InsnB->getNumOperands()); ++I) {
				if (I == OpIdx || !MCID.OpInfo[I].isGenericType())
				continue;
				if (MCID.OpInfo[I].getGenericTypeIndex() == TypeIdx)
				OpIdxs.push_back(I);
				}
				}
				ensureNumOperands(InsnB, *std::max_element(OpIdxs.begin(), OpIdxs.end()) + 1);
				for (auto Idx : OpIdxs) {
				MachineOperand &MO = InsnB->getOperand(Idx);
				if (!MO.getReg())
				MO.setReg(MRI.createGenericVirtualRegister(ExpectedType));
				else
				MRI.setType(MO.getReg(), ExpectedType);
				}
				}
				dsanders: Is this ever false? ensureNumOperands() looks like it adds operands until this is true.
				rtereshin: Of course, the number of operands could be greater than `OpIdx`; `ensureNumOperands` doesn't add or remove operands in that case.
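
The exchange above turns on `ensureNumOperands` only ever growing the operand list. A minimal standalone sketch of that behaviour follows, assuming a hypothetical `Operand` struct and a `std::vector` in place of `MachineInstrBuilder` (neither is LLVM API). Unlike the patch, the sketch clamps the def-marking loop to the vector size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MachineOperand: a register number and a def flag.
struct Operand {
  unsigned Reg = 0;   // 0 plays the role of %noreg
  bool IsDef = false;
};

// Pads Ops with placeholder operands until it holds at least NumOperands
// entries and marks the first NumDefs entries as definitions.
// It never removes operands that are already there.
inline void ensureNumOperands(std::vector<Operand> &Ops,
                              std::size_t NumOperands, std::size_t NumDefs) {
  while (Ops.size() < NumOperands)
    Ops.push_back(Operand{0, Ops.size() < NumDefs});
  for (std::size_t I = 0; I < NumDefs && I < Ops.size(); ++I)
    Ops[I].IsDef = true;
}
```

Calling it with a `NumOperands` smaller than the current size leaves the list untouched, which is why the operand count can exceed `OpIdx` afterwards.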

				static MachineInstrBuilder &
				setOperand(MachineInstrBuilder &InsnB, int64_t OpIdx,
				std::function<void(MachineInstrBuilder &)> OperandAdder) {
				ensureNumOperands(InsnB, OpIdx);

				if (InsnB->getNumOperands() == OpIdx)
				OperandAdder(InsnB);
				else {
				SmallVector<MachineOperand, 4> Uses;
				for (int64_t I = InsnB->getNumOperands() - 1; I > OpIdx; --I) {
				Uses.push_back(InsnB->getOperand(I));
				InsnB->RemoveOperand(I);
				}
				InsnB->RemoveOperand(OpIdx);
				OperandAdder(InsnB);
				dsanders: What is this for?
				rtereshin: Often I need to create a generic virtual register without knowing yet which type is expected of it, and it's not exactly possible to create a virtual register without a type. This constant exists to be consistent with the type I use as a default / initial option.
				while (!Uses.empty())
				InsnB.add(Uses.pop_back_val());
				}
				return InsnB;
				dsanders: This comment explains the reasons behind something but doesn't really explain what that something is.
				rtereshin: I'm replacing the comment with the following:
				// Raw representation of a single match table rule as an ordered union of
				// several continuous regions of the match table. The representation tries its
				// best to ignore parts of the table that don't affect semantics of a single
				// rule in isolation, like labels and rule IDs.
				//
				// Assuming that all the parts of the MatchTable that don't affect the
				// selection process but only identify a rule, like GIR_Coverage opcodes, come
				// within a rule as a single continuous block, the meaningful parts of the
				// rule could be represented as some prefix (starting from the first
				// non-control flow opcode, in other words, skipping GIM_Try and its
				// label-operand) and suffix:
				using RuleBodyTy = std::pair<ArrayRef<int64_t>, ArrayRef<int64_t>>;
				}
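
`setOperand` above replaces the operand at a given index even though operands can only be appended: it pads up to the index, peels off the tail, lets a callback append the new operand, and re-appends the tail. A minimal sketch of the same technique over a plain `std::vector<int>`, where `OperandList` and the `0` placeholder are assumptions made for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an instruction's operand list.
using OperandList = std::vector<int>;

// Makes the operand at OpIdx be whatever Adder appends, keeping all the
// operands before and after OpIdx in their original order.
inline void setOperand(OperandList &Ops, std::size_t OpIdx,
                       const std::function<void(OperandList &)> &Adder) {
  // Pad with placeholders (0) so that at least OpIdx operands exist.
  while (Ops.size() < OpIdx)
    Ops.push_back(0);

  if (Ops.size() == OpIdx) {
    Adder(Ops); // nothing to replace, just append
    return;
  }

  // Save everything after OpIdx, last operand first.
  OperandList Tail;
  while (Ops.size() > OpIdx + 1) {
    Tail.push_back(Ops.back());
    Ops.pop_back();
  }
  Ops.pop_back(); // drop the old operand at OpIdx
  Adder(Ops);     // append the replacement
  // Restore the saved operands in their original order.
  while (!Tail.empty()) {
    Ops.push_back(Tail.back());
    Tail.pop_back();
  }
}
```

For example, applying it to `{10, 20, 30, 40}` at index 1 with a callback that appends `99` yields `{10, 99, 30, 40}`.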

				static const LLT DummyLLT = LLT::scalar(32);
				// Assuming that all the parts of the MatchTable that don't affect the selection
				// process but only identify a rule, like GIR_Coverage opcodes, come within a
				// rule as a single continuous block, so the meaningful parts of the rule could
				// be represented as some prefix and suffix:
				using RuleBodyTy = std::pair<ArrayRef<int64_t>, ArrayRef<int64_t>>;

				dsanders: I don't think I understand this variable. What does it represent?
				rtereshin: I'm renaming the variable from `CoverageBlockPassed` to `ExcludedRegionPassed` and adding a bunch of comments that should make it clearer, like this:
				// Get a rule descriptor, containing the index of its GIR_Done opcode, RuleID,
				// and a raw representation of the entire body.
				//
				// \pre MatchTable is a non-optimized linear match table.
				// \pre CurrentIdx points to the first (and only) GIM_Try opcode of the rule
				// that has all its semantically meaningless opcodes that are to be excluded
				// from the body as a single continuous subregion somewhere.
				// \post the prefix of the body is a range from the first opcode after the
				// initial GIM_Try until GIR_Coverage (or, more generally, the first
				// semantically meaningless opcode), the suffix is a range from the first
				// opcode after the last semantically meaningless opcode until GIR_Done
				// (exclusive).
				static std::tuple<uint64_t, int64_t, RuleBodyTy>
				getDoneIdxRuleIDAndBody(const int64_t *MatchTable, uint64_t CurrentIdx) {
				// skipping GIM_Try
				const uint64_t BodyIdx = nextGIOpcodeIdx(MatchTable, CurrentIdx);
				uint64_t ExcludedFirst = BodyIdx;
				uint64_t ExcludedLast = ExcludedFirst;
				// RuleID we discovered so far. Or the first one if we have many O_o
				int64_t FirstRuleID = -1;
				// Did we already iterate over that continuous region of unimportant opcodes
				// we are going to exclude from the body?
				bool ExcludedRegionPassed = false;
				unsigned NestingLevel = 0;
				do {
				static std::tuple<uint64_t, int64_t, RuleBodyTy>
				getDoneIdxRuleIDAndBody(const int64_t *MatchTable, uint64_t CurrentIdx) {
				const uint64_t BodyIdx = nextGIOpcodeIdx(MatchTable, CurrentIdx);
				uint64_t CoverageFirst = BodyIdx;
				uint64_t CoverageLast = CoverageFirst;
				dsanders: This should probably indicate what is being skipped.
				rtereshin: I'm adding the following comment: `// Skipping the OnFail label operand`
				int64_t FirstRuleID = -1;
				dsanders: In some ways it would be nice to support this (e.g. to check the tests are the same) but I agree it's way too big a task for a first patch.
				rtereshin: It will be hard to make sure that the tests are the same. For instance, suppose that the non-optimized table does some meaningless checks, such as checking register banks on internal vregs (the registers defined and used inside the pattern that are not the pattern's overall inputs or outputs), while the optimized table doesn't. Semantically they are the same, but in the latter case Testgen will have to guess more regbanks, and it might guess them differently (from what is explicitly checked by the non-optimized table). It makes no difference in the selected code, and the selected test will pass; however, the testgen'd test (and that one only tests Testgen itself, not the selector) will be technically different.
				Also, if optimization reorders opcodes, Testgen might easily end up with different virtual register names while keeping the actual def-use chain the same, or schedule the def-use chain differently.
				Not to mention, it will noticeably increase the maintenance burden, and I think it's best to keep that at a minimum.
				bool CoverageBlockPassed = false;
				unsigned NestingLevel = 0;
				do {
				switch (MatchTable[CurrentIdx++]) {
				case GIM_Try: {
				CurrentIdx++;
				if (++NestingLevel > 1) {
				dsanders: Doesn't NDEBUG also disable the dbgs() stream?
				That assert can't succeed (NestingLevel > 1 vs NestingLevel == 1). It looks like this ought to be report_fatal_error() or similar.
				rtereshin:
				> Doesn't NDEBUG also disable the dbgs() stream?
				No, the only difference is that `NDEBUG` resolves `dbgs()` directly to `errs()` and therefore sends the output straight to `stderr`, while in assert builds there is a circular buffer in the middle (that is smart enough to flush if killed).
				> That assert can't succeed (NestingLevel > 1 vs NestingLevel == 1). It looks like this ought to be report_fatal_error() or similar
				True, good catch, I'm tidying this up.
				#ifdef NDEBUG
				dbgs()
				<< "Testgen doesn't support non-linear or optimized Match Tables\n";
				exit(1);
				#endif
				assert(NestingLevel == 1 &&
				"Testgen doesn't support non-linear or optimized Match Tables");
				}
				break;
				}
				case GIR_Coverage: {
				int64_t RuleID = MatchTable[CurrentIdx++];
				assert(RuleID >= 0 && "Expected a non-negative RuleID");
				if (FirstRuleID < 0) {
				FirstRuleID = RuleID;
				CoverageFirst = CurrentIdx - 2;
				}
				CoverageLast = CurrentIdx;
				if (CoverageBlockPassed)
				CoverageFirst = CoverageLast;
				break;
				}
				default:
				CoverageBlockPassed = FirstRuleID >= 0;
				CurrentIdx = nextGIOpcodeIdx(MatchTable, CurrentIdx - 1);
				}
				} while (MatchTable[CurrentIdx] != GIR_Done);
				assert(BodyIdx <= CoverageFirst &&
				(CoverageFirst < CoverageLast && FirstRuleID >= 0 ||
				CoverageFirst == CoverageLast) &&
				CoverageLast <= CurrentIdx && "Broken Coverage Block");
				const auto &RuleBody = RuleBodyTy(
				ArrayRef<int64_t>(&MatchTable[BodyIdx], &MatchTable[CoverageFirst]),
				ArrayRef<int64_t>(&MatchTable[CoverageLast], &MatchTable[CurrentIdx]));
				return std::make_tuple(CurrentIdx, FirstRuleID, RuleBody);
				}
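
The prefix/suffix representation built by `getDoneIdxRuleIDAndBody` can be illustrated on a toy table. The sketch below is not the patch's code: the opcodes `OP_Try`, `OP_Check`, `OP_Coverage`, and `OP_Done` are hypothetical, every opcode except `OP_Done` is assumed to take exactly one operand, and the coverage opcodes are assumed to form one continuous block (the same assumption the patch states).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy opcodes; every opcode except OP_Done takes one operand.
constexpr int64_t OP_Try = 0, OP_Check = 1, OP_Coverage = 2, OP_Done = 3;

struct RuleInfo {
  std::size_t DoneIdx = 0;      // index of the OP_Done opcode
  int64_t RuleID = -1;          // operand of the first OP_Coverage, if any
  std::vector<int64_t> Prefix;  // body before the excluded region
  std::vector<int64_t> Suffix;  // body after the excluded region
};

// Splits the rule starting at TryIdx into the parts before and after its
// (single, continuous) block of OP_Coverage opcodes.
inline RuleInfo splitRule(const std::vector<int64_t> &Table,
                          std::size_t TryIdx) {
  const std::size_t BodyIdx = TryIdx + 2; // skip OP_Try and its label operand
  std::size_t ExcludedFirst = BodyIdx, ExcludedLast = BodyIdx;
  RuleInfo Info;
  std::size_t Idx = BodyIdx;
  for (; Table[Idx] != OP_Done; Idx += 2) {
    if (Table[Idx] != OP_Coverage)
      continue;
    if (Info.RuleID < 0) {
      Info.RuleID = Table[Idx + 1];
      ExcludedFirst = Idx;
    }
    ExcludedLast = Idx + 2;
  }
  Info.DoneIdx = Idx;
  Info.Prefix.assign(Table.begin() + BodyIdx, Table.begin() + ExcludedFirst);
  Info.Suffix.assign(Table.begin() + ExcludedLast, Table.begin() + Idx);
  return Info;
}
```

Two rules that differ only in their coverage operand then compare equal on `Prefix` and `Suffix`, which is what makes the body usable as a key for duplicate detection.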

				static void
				initInsnBuilders(const int64_t *MatchTable, uint64_t CurrentIdx,
				MachineIRBuilder &MIRBuilder,
				SmallVectorImpl<MachineInstrBuilder> &InsnBuilders) {
				do {
				switch (MatchTable[CurrentIdx++]) {
				case GIM_CheckOpcode: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t Expected = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckOpcode " << InsnID << ", " << Expected << "\n";
				if (static_cast<uint64_t>(InsnID) >= InsnBuilders.size())
				InsnBuilders.resize(InsnID + 1);
				InsnBuilders[InsnID] = MIRBuilder.buildInstrNoInsert(Expected);

				// G_CONSTANT's and G_FCONSTANT's are assumed to be well-formed by the
				// selector w/o actually checking it.
				if (Expected == TargetOpcode::G_CONSTANT) {
				ensureNumOperands(InsnBuilders[InsnID], 2);
				InsnBuilders[InsnID]->getOperand(1).ChangeToImmediate(1);
				} else if (Expected == TargetOpcode::G_FCONSTANT) {
				ensureNumOperands(InsnBuilders[InsnID], 2);
				InsnBuilders[InsnID]->getOperand(1).ChangeToFPImmediate(ConstantFP::get(
				MIRBuilder.getMF().getFunction().getContext(), APFloat(0.0)));
				}
				break;
				}
				default:
				CurrentIdx = nextGIOpcodeIdx(MatchTable, CurrentIdx - 1);
				}
				} while (MatchTable[CurrentIdx] != GIR_Done);
				}

				static void mainPass(const InstructionSelectorTestgen &Testgen,
				uint64_t CurrentIdx, MachineIRBuilder &MIRBuilder,
				SmallVectorImpl<MachineInstrBuilder> &InsnBuilders,
				SmallVectorImpl<unsigned> &InsnOrder) {
				MachineFunction &MF = MIRBuilder.getMF();
				MachineRegisterInfo &MRI = MF.getRegInfo();
				const TargetRegisterInfo &TRI = Testgen.getTRI();
				const RegisterBankInfo &RBI = Testgen.getRBI();
				const int64_t *MatchTable = Testgen.getMatchTable();
				InsnOrder.push_back(0);
				do {
				switch (MatchTable[CurrentIdx++]) {
				case GIM_RecordInsn: {
				int64_t NewInsnID = MatchTable[CurrentIdx++];
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OpIdx = MatchTable[CurrentIdx++];
				dbgs() << "GIM_RecordInsn " << NewInsnID << ", " << InsnID << ", "
				dsanders: I don't think I understand the Def.getReg() part of this. Both the Use and the Def must be the same register so I'd expect to always use one or the other for all cases.
				rtereshin:
				> Both the Use and the Def must be the same register so I'd expect to always use one or the other for all cases.
				They must be the same register by the end of this `case`, but we know little at the beginning. We know that `Def` is a register and that's about it. Either (and we don't know which) or both of `Def` and `Use` could be `%noreg` (`0`). If one of them is an actual vreg (not `%noreg`), it might have a meaningful type and / or a bank assigned, and we can't lose that assignment. Technically, to make it even more robust we need to intersect both of their constraints here (what if both of them are valid vregs already, one has a bank checked but not a type yet, and the other has the type checked but not the bank?), but so far it has been working pretty well as is.
				<< OpIdx << "\n";
				InsnOrder.push_back(NewInsnID);
				MachineOperand &Def =
				ensureNumOperands(InsnBuilders[NewInsnID], 1, 1)->getOperand(0);
				MachineOperand &Use =
				ensureNumOperands(InsnBuilders[InsnID], OpIdx + 1)->getOperand(OpIdx);
				unsigned Reg = Use.isReg() && Use.getReg() ? Use.getReg() : Def.getReg();
				Reg = Reg ? Reg : MRI.createGenericVirtualRegister(DummyLLT);
				Def.setReg(Reg);
				Use.ChangeToRegister(Reg, /*isDef*/ false);
				break;
				}
				case GIM_CheckNumOperands: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t Expected = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckNumOperands " << InsnID << ", " << Expected << "\n";
				ensureNumOperands(InsnBuilders[InsnID], Expected);
				break;
				}
				case GIM_CheckIntrinsicID: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OpIdx = MatchTable[CurrentIdx++];
				int64_t Value = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckIntrinsicID " << InsnID << ", " << OpIdx << ", "
				<< Value << "\n";
				auto &InsnB = ensureNumOperands(InsnBuilders[InsnID], OpIdx, OpIdx);
				setOperand(InsnB, OpIdx, [Value](MachineInstrBuilder &InsnB) {
				InsnB.addIntrinsicID(static_cast<Intrinsic::ID>(Value));
				});
				break;
				}
				case GIM_CheckIsMBB: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OpIdx = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckIsMBB " << InsnID << ", " << OpIdx << "\n";
				setOperand(
				InsnBuilders[InsnID], OpIdx,
				[&MF](MachineInstrBuilder &InsnB) { InsnB.addMBB(&MF.back()); });
				break;
				}
				case GIM_CheckType: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OpIdx = MatchTable[CurrentIdx++];
				int64_t TypeID = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckType " << InsnID << ", " << OpIdx << ", " << TypeID
				<< "\n";
				setType(InsnBuilders[InsnID], OpIdx, Testgen.getTypeObject(TypeID), MRI);
				break;
				}
				case GIM_CheckPointerToAny: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OpIdx = MatchTable[CurrentIdx++];
				int64_t SizeInBits = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckPointerToAny " << InsnID << ", " << OpIdx << ", "
				<< SizeInBits << "\n";
				if (!SizeInBits)
				SizeInBits = MF.getDataLayout().getPointerSizeInBits(0);
				setType(InsnBuilders[InsnID], OpIdx, LLT::pointer(0, SizeInBits), MRI);
				break;
				}
				case GIM_CheckRegBankForClass: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OpIdx = MatchTable[CurrentIdx++];
				int64_t RCEnum = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckRegBankForClass " << InsnID << ", " << OpIdx << ", "
				<< RCEnum << "\n";
				const TargetRegisterClass *RC = TRI.getRegClass(RCEnum);
				const RegisterBank *ExpectedRegBank = getRegBankFromRegClass(RBI, RC);
				MachineOperand &MO =
				ensureNumOperands(InsnBuilders[InsnID], OpIdx + 1)->getOperand(OpIdx);
				if (!MO.getReg())
				MO.setReg(MRI.createGenericVirtualRegister(DummyLLT));
				if (ExpectedRegBank)
				MRI.setRegBank(MO.getReg(), *ExpectedRegBank);
				else {
				dbgs() << "- Didn't find a register bank covering class "
				<< TRI.getRegClassName(RC) << " for operand #" << OpIdx
				<< " (" << MO << ")" << " of\n"
				<< " " << *InsnBuilders[InsnID].getInstr() << "\n";
				// This test will definitely fail during the actual select due to
				// register banks not being fully defined yet, but we don't want to
				// create too much noise on targets early in GlobalISel migration:
				MF.getProperties().set(MachineFunctionProperties::Property::FailedISel);
				}
				break;
				dsanders: Function-level features can be handled by listing them in the function attributes: `attributes #0 = { "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" }`. The tricky bit will be mapping the enum back to the feature name. Module-level features are harder since they're only read once per module. You'd have to separate them into multiple files.
				rtereshin: Thanks, I haven't dug into this deeply enough yet to know that myself! And yes, the mapping is the hardest part. This is why there is that ugly `TestgenSetAllFeatures` command-line option for now, to overcome this the cheap and dirty way. Of course, that deprives us of testing the feature-checking part of every pattern and of the match table as a whole. This is for future patches and improvements, though.
				}
				case GIM_CheckFeatures: {
				int64_t ExpectedBitsetID = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckFeatures " << ExpectedBitsetID << "\n";
				if (!Testgen.checkFeatures(ExpectedBitsetID)) {
				dbgs() << "! Testgen deficiency: Don't know how to satisfy\n"
				<< " Module & Function Target Features yet!\n";
				// This test definitely won't match the rule being tested and might not
				// even select at all, so mark it to be skipped:
				MF.getProperties().set(MachineFunctionProperties::Property::FailedISel);
				}
				break;
				}
				case GIM_CheckConstantInt: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OpIdx = MatchTable[CurrentIdx++];
				int64_t Value = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckConstantInt " << InsnID << ", " << OpIdx << ", "
				<< Value << "\n";
				MachineOperand &MO =
				ensureNumOperands(InsnBuilders[InsnID], OpIdx + 1)->getOperand(OpIdx);
				if (MO.isReg() && MO.isUse() && MRI.def_empty(MO.getReg())) {
				if (!MO.getReg())
				MO.ChangeToRegister(MRI.createGenericVirtualRegister(DummyLLT),
				/*isDef*/ false);
				InsnOrder.push_back(InsnBuilders.size());
				InsnBuilders.push_back(
				MIRBuilder.buildInstrNoInsert(TargetOpcode::G_CONSTANT)
				.addDef(MO.getReg())
				.addImm(Value));
				}
				break;
				}
				dsanders: There's no guarantee that 1 will match. There are a few predicates that check they're multiples of N.
				rtereshin: Absolutely no guarantee. However, I took a quick look and it appeared to me that 1 would match more often than any other fixed constant; I didn't measure it, though. Full support for immediate predicates is for future patches.
				case GIM_CheckI64ImmPredicate: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t Predicate = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckI64ImmPredicate " << InsnID << ", " << Predicate
				<< "\n";
				setOperand(InsnBuilders[InsnID], 1,
				[](MachineInstrBuilder &InsnB) { InsnB.addImm(1); });
				break;
				}
				default:
				CurrentIdx = nextGIOpcodeIdx(MatchTable, CurrentIdx - 1);
				}
				} while (MatchTable[CurrentIdx] != GIR_Done);
				}
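
The register choice in the `GIM_RecordInsn` case of `mainPass` above (prefer the use's register, fall back to the def's, otherwise create a fresh one) can be sketched in isolation. This is a minimal illustration with plain `unsigned` register numbers; `unifyDefAndUse` and the `NextFreeReg` counter are hypothetical stand-ins for the operands and `MRI.createGenericVirtualRegister`, and `0` plays the role of `%noreg`.

```cpp
#include <cassert>

// Makes DefReg and UseReg refer to the same register without discarding a
// register that already exists on either side. Returns the chosen register.
inline unsigned unifyDefAndUse(unsigned &DefReg, unsigned &UseReg,
                               unsigned &NextFreeReg) {
  // Prefer the use's register, then the def's register.
  unsigned Reg = UseReg ? UseReg : DefReg;
  // Neither side has a register yet: allocate a new one.
  if (!Reg)
    Reg = NextFreeReg++;
  DefReg = Reg;
  UseReg = Reg;
  return Reg;
}
```

As the author notes in the discussion, this keeps whichever register already exists but does not merge constraints when both sides already hold distinct registers; the sketch shares that limitation.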

				static void latePass(const int64_t *MatchTable, uint64_t CurrentIdx,
				MachineFunction &MF,
				SmallVectorImpl<MachineInstrBuilder> &InsnBuilders) {
				MachineRegisterInfo &MRI = MF.getRegInfo();
				do {
				switch (MatchTable[CurrentIdx++]) {
				case GIM_CheckAtomicOrdering: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OrderingVal = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckAtomicOrdering " << InsnID << ", " << OrderingVal
				<< "\n";
				MachineInstrBuilder &InsnB = InsnBuilders[InsnID];
				auto PtrInfo = MachinePointerInfo::getUnknownStack(MF);
				auto Flags = (InsnB->mayLoad() ? MachineMemOperand::MOLoad
				: MachineMemOperand::MONone) |
				(InsnB->mayStore() ? MachineMemOperand::MOStore
				: MachineMemOperand::MOLoad);
				LLT LLTy = DummyLLT;
				for (const MachineOperand &MO : InsnB->operands())
				if (MO.isReg() && MO.getReg() && MRI.getType(MO.getReg()).isValid()) {
				LLTy = MRI.getType(MO.getReg());
				break;
				}
				Type *ValTy = InstructionSelectorTestgen::deduceIRType(
				LLTy, MF.getFunction().getContext());
				auto Size = MF.getDataLayout().getTypeStoreSize(ValTy);
				auto Alignment = MF.getDataLayout().getABITypeAlignment(ValTy);
				auto Ordering = static_cast<AtomicOrdering>(OrderingVal);
				MachineMemOperand *MMO =
				MF.getMachineMemOperand(PtrInfo, Flags, Size, Alignment, AAMDNodes(),
				nullptr, SyncScope::System, Ordering);
				if (InsnB->memoperands_empty())
				InsnB.addMemOperand(MMO);
				else
				*InsnB->memoperands_begin() = MMO;
				break;
				}
				case GIM_CheckIsSameOperand: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t OpIdx = MatchTable[CurrentIdx++];
				int64_t OtherInsnID = MatchTable[CurrentIdx++];
				int64_t OtherOpIdx = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckIsSameOperand " << InsnID << ", " << OpIdx << ", "
				<< OtherInsnID << ", " << OtherOpIdx << "\n";
				auto &InsnB = ensureNumOperands(InsnBuilders[InsnID], OpIdx + 1);
				auto &OtherInsnB =
				ensureNumOperands(InsnBuilders[OtherInsnID], OtherOpIdx + 1);
				const auto &MO = InsnB->getOperand(OpIdx);
				const auto &OtherMO = OtherInsnB->getOperand(OtherOpIdx);
				assert(MO.isReg() && OtherMO.isReg() && "Expected regs as same operands");
				if (MRI.def_empty(OtherMO.getReg()) && MO.getReg())
				rtereshin: @dsanders This situation, btw, is very similar to the one with the record-instruction opcode: we don't know which register is defined and which is not, and we cannot afford to lose the definition. With record instruction we don't know which register might already have an LLT and / or regbank "checked", and we can't afford to lose that info.
				setOperand(OtherInsnB, OtherOpIdx,
				[&MO](MachineInstrBuilder &OtherInsnB) {
				OtherInsnB.addReg(MO.getReg());
				dsanders: One thing to mention here is that if something changes the register number of OtherInsnB.getOperand(OtherOpIdx) after this opcode is processed then these might diverge. I think we're ok on that since setReg() doesn't seem to be called from latePass().
				rtereshin: True, and that's the whole point of having more than one pass: satisfying dependencies. If the number and complexity of the dependencies were much greater, I would probably process the table once, "box" (as in encapsulate) every opcode (with its operands) in an object, and then sort the objects in a way that satisfies the dependencies. However, the dependencies we've got seem to be simple enough to get away with just a few passes over the table, so I find the "an object per opcode" solution greatly over-engineered and not needed.
				});
				else
				setOperand(InsnB, OpIdx, [&OtherMO](MachineInstrBuilder &InsnB) {
				InsnB.addReg(OtherMO.getReg());
				});
				break;
				}
				case GIM_CheckAPIntImmPredicate: {
				int64_t InsnID = MatchTable[CurrentIdx++];
				int64_t Predicate = MatchTable[CurrentIdx++];
				dbgs() << "GIM_CheckAPIntImmPredicate " << InsnID << ", " << Predicate
				<< "\n";
				dsanders: As with the other immediate case, '1' might not match all predicates.
				rtereshin: For future patches.
				setOperand(
				InsnBuilders[InsnID], 1, [&MF, &MRI](MachineInstrBuilder &InsnB) {
				const MachineOperand &Def = InsnB->getOperand(0);
				LLT LLTy = Def.isReg() ? MRI.getType(Def.getReg()) : LLT{};
				LLTy = LLTy.isValid() ? LLTy : DummyLLT;
				InsnB.addCImm(ConstantInt::get(MF.getFunction().getContext(),
				APInt(LLTy.getSizeInBits(), 1)));
				});
				break;
				}
				default:
				CurrentIdx = nextGIOpcodeIdx(MatchTable, CurrentIdx - 1);
				}
				} while (MatchTable[CurrentIdx] != GIR_Done);
				}

				static std::string buildName(unsigned RuleNum, uint64_t RuleIdx,
				int64_t RuleID = -1) {
				const auto &ID = RuleID < 0 ? Twine{} : "_id" + Twine(RuleID);
				return ("test_rule" + Twine(RuleNum) + ID + "_at_idx" + Twine(RuleIdx)).str();
				}

				static bool skipRule(unsigned RuleNum, uint64_t CurrentIdx,
				int64_t RuleID = -1) {
				static DenseSet<unsigned> ExplicitlyListedRules;
				if (ExplicitlyListedRules.empty())
				for (unsigned Idx : TestgenIncludeOnly)
				ExplicitlyListedRules.insert(Idx);

				if (!ExplicitlyListedRules.empty())
				return !ExplicitlyListedRules.count(RuleNum);

				if (RuleNum < TestgenFromRule || RuleNum > TestgenUntilRule)
				return true;

				static SmallDenseSet<unsigned, 32> ExcludedRules;
				if (ExcludedRules.empty())
				for (unsigned Idx : TestgenExcludeRules)
				ExcludedRules.insert(Idx);
				return ExcludedRules.count(RuleNum);
				}
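
The precedence in `skipRule` above is: an explicit include-only list overrides everything else; otherwise the rule must fall inside the from/until range and must not be on the exclude list. A minimal standalone sketch follows, in which the `RuleFilter` struct and its default bounds are assumptions standing in for the `Testgen*` command-line options (whose actual defaults are not shown in this diff):

```cpp
#include <cassert>
#include <limits>
#include <set>

// Hypothetical bundle of the filtering options.
struct RuleFilter {
  std::set<unsigned> IncludeOnly; // if non-empty, only these rules are kept
  unsigned FromRule = 0;          // assumed default: no lower bound
  unsigned UntilRule = std::numeric_limits<unsigned>::max(); // assumed default
  std::set<unsigned> Exclude;     // rules to drop within the range
};

// Returns true if no test case should be generated for RuleNum.
inline bool skipRule(const RuleFilter &F, unsigned RuleNum) {
  // An explicit include list takes precedence over range and exclusions.
  if (!F.IncludeOnly.empty())
    return F.IncludeOnly.count(RuleNum) == 0;
  if (RuleNum < F.FromRule || RuleNum > F.UntilRule)
    return true;
  return F.Exclude.count(RuleNum) != 0;
}
```

Note that with an include-only list present, a rule listed there is generated even if it is also excluded or out of range, which matches the early return in the patch.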

				void InstructionSelectorTestgen::generateTestCase(
				uint64_t CurrentIdx, InputRegAllocator &RA,
				MachineIRBuilder &MIRBuilder) const {

				SmallVector<MachineInstrBuilder, 1> InsnBuilders;
				initInsnBuilders(getMatchTable(), CurrentIdx, MIRBuilder, InsnBuilders);

				SmallVector<unsigned, 1> InsnOrder;
				mainPass(*this, CurrentIdx, MIRBuilder, InsnBuilders, InsnOrder);

				MachineFunction &MF = MIRBuilder.getMF();
				dsanders: I was confused by this function name at first. It sounded like the regbanks were unconstrained and that didn't make sense. I see it's about assigning regbanks to vregs that don't already check one. I haven't thought of a good spelling ('assignRegBanksToUnconstrainedVregs()' or 'ensureRegistersHaveBanks()' are a couple of ideas) but we may be able to find a clearer name.
				rtereshin: I'm renaming `selectUnconstrainedRegBanks` to `assignRegBanksToUnconstrainedVRegs` and the `selectRegBank` helper function (former lambda) to `assignRegBank` for consistency.
				latePass(getMatchTable(), CurrentIdx, MF, InsnBuilders);

				for (auto I = InsnOrder.rbegin(), E = InsnOrder.rend(); I != E; ++I)
				MIRBuilder.insertInstr(InsnBuilders[*I]);

				// We need all the banks in place so we could speculate on ABI better:
				dbgs() << "** Selecting RegBanks for VRegs unconstrained by the rule\n";
				selectUnconstrainedRegBanks(RA.getSize2RegBanks(), MF);
				// selectUnconstrainedRegBanks relies on the input vregs being undefined,
				// so run defineUndefinedVRegs later, not before:
				dbgs() << "** Defining VRegs undefined by the rule\n";
				defineUndefinedVRegs(getTII(), getTRI(), getRBI(), RA, MIRBuilder);
				// finalize the MBB with a non-generic terminator that uses the root vreg:
				InstructionSelectorTestgen::emitReturnIntoTestCase(MIRBuilder);

				MF.getProperties().set(MachineFunctionProperties::Property::Legalized);
				MF.getProperties().set(MachineFunctionProperties::Property::RegBankSelected);
				}

				void InstructionSelectorTestgen::generateTestCasesImpl(
				Module &M, MachineModuleInfo &MMI) const {

				const MachineFunction *MF = MMI.getMachineFunction(*M.begin());
				assert(MF && "Expected the Module / MMI to contain an init Machine Function");

				const int64_t *MatchTable = getMatchTable();
				unsigned RuleNum = 0;
				uint64_t CurrentIdx = 0;
				uint64_t CurrentDoneIdx = ~0ULL;
				int64_t RuleID = -1;
				using RuleSig = std::tuple<unsigned, uint64_t, int64_t>;
				RuleBodyTy RuleBody;

				DenseMap<RuleBodyTy, SmallVector<RuleSig, 1>> RuleDuplicates;
				InputRegAllocator RA(getTRI(), getRBI(), *MF);
				dsanders: We should have a couple of examples of how duplicates can happen (e.g. i32 and f32 both mapping to s32).
				rtereshin: I'm adding the following comment:
				// The major source of literal duplicates is the fact that we map MVTs
				// like i32 and f32 to the same s32 LLT, therefore 2 or more patterns
				// originally written for SelectionDAG ISel get imported as the exact same
				// sequence of semantically meaningful match table opcodes, matching and
				// rendering opcodes both:

				while (MatchTable[CurrentIdx] != GIM_Reject) {
				std::tie(CurrentDoneIdx, RuleID, RuleBody) =
				getDoneIdxRuleIDAndBody(MatchTable, CurrentIdx);

				if (!skipRule(RuleNum, CurrentIdx, RuleID)) {
				const auto &Name = buildName(RuleNum, CurrentIdx, RuleID);
				SmallVectorImpl<RuleSig> &Dups = RuleDuplicates[RuleBody];
				Dups.emplace_back(RuleNum, CurrentIdx, RuleID);

				if (Dups.size() > 1) {
				dbgs() << "*** Skipping test case " << Name << " as a duplicate of "
				<< buildName(std::get<0>(Dups.front()),
				std::get<1>(Dups.front()),
				std::get<2>(Dups.front()))
				<< "\n";
				} else {
				dbgs() << "\n* Generating test case " << Name << " *\n";
				MachineIRBuilder MIRBuilder =
				InstructionSelectorTestgen::createEmptyTestCase(Name, M, MMI);
				generateTestCase(CurrentIdx, RA.reset(), MIRBuilder);
				if (!MIRBuilder.getMF().verify(nullptr, Name.c_str(), false)) {
				Function *F = M.getFunction(Name);
				MMI.deleteMachineFunctionFor(*F);
				F->eraseFromParent();
				}
				}
				}
				++RuleNum;
				CurrentIdx = nextGIOpcodeIdx(MatchTable, CurrentDoneIdx);
				}
				}
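
The duplicate handling in `generateTestCasesImpl` keys a map on the rule body (prefix and suffix) and generates a test only for the first rule seen with a given body. A minimal sketch of that idea, using `std::map` and owned vectors instead of `DenseMap` and `ArrayRef`; the names `RuleBody`, `DedupResult`, and `dedupRules` are hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical owned version of the (prefix, suffix) rule body.
using RuleBody = std::pair<std::vector<int64_t>, std::vector<int64_t>>;

struct DedupResult {
  std::vector<std::string> Generated;             // rules that get a test
  std::map<std::string, std::string> DuplicateOf; // skipped rule -> original
};

// Keeps the first rule for every distinct body and records the rest as
// duplicates of that first rule.
inline DedupResult
dedupRules(const std::vector<std::pair<std::string, RuleBody>> &Rules) {
  std::map<RuleBody, std::string> FirstSeen;
  DedupResult Result;
  for (const auto &Rule : Rules) {
    auto Inserted = FirstSeen.emplace(Rule.second, Rule.first);
    if (Inserted.second)
      Result.Generated.push_back(Rule.first);
    else
      Result.DuplicateOf[Rule.first] = Inserted.first->second;
  }
  return Result;
}
```

Because the body excludes rule IDs and labels, two patterns imported from, say, an i32 and an f32 SelectionDAG pattern that both map to s32 end up with identical keys and only the first one produces a test.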

test/TableGen/GlobalISelEmitter.td

	Show First 20 Lines • Show All 73 Lines • ▼ Show 20 Lines
	// CHECK-NEXT: typedef void(MyTargetInstructionSelector::*CustomRendererFn)(MachineInstrBuilder &, const MachineInstr&) const;			// CHECK-NEXT: typedef void(MyTargetInstructionSelector::*CustomRendererFn)(MachineInstrBuilder &, const MachineInstr&) const;
	// CHECK-NEXT: const ISelInfoTy<PredicateBitset, ComplexMatcherMemFn, CustomRendererFn> ISelInfo;			// CHECK-NEXT: const ISelInfoTy<PredicateBitset, ComplexMatcherMemFn, CustomRendererFn> ISelInfo;
	// CHECK-NEXT: static MyTargetInstructionSelector::ComplexMatcherMemFn ComplexPredicateFns[];			// CHECK-NEXT: static MyTargetInstructionSelector::ComplexMatcherMemFn ComplexPredicateFns[];
	// CHECK-NEXT: static MyTargetInstructionSelector::CustomRendererFn CustomRenderers[];			// CHECK-NEXT: static MyTargetInstructionSelector::CustomRendererFn CustomRenderers[];
	// CHECK-NEXT: bool testImmPredicate_I64(unsigned PredicateID, int64_t Imm) const override;			// CHECK-NEXT: bool testImmPredicate_I64(unsigned PredicateID, int64_t Imm) const override;
	// CHECK-NEXT: bool testImmPredicate_APInt(unsigned PredicateID, const APInt &Imm) const override;			// CHECK-NEXT: bool testImmPredicate_APInt(unsigned PredicateID, const APInt &Imm) const override;
	// CHECK-NEXT: bool testImmPredicate_APFloat(unsigned PredicateID, const APFloat &Imm) const override;			// CHECK-NEXT: bool testImmPredicate_APFloat(unsigned PredicateID, const APFloat &Imm) const override;
	// CHECK-NEXT: const int64_t *getMatchTable() const override;			// CHECK-NEXT: const int64_t *getMatchTable() const override;
				// CHECK-NEXT: std::unique_ptr<const InstructionSelectorTestgen> getTestgen() const override;
				// CHECK-NEXT: friend class MyTargetInstructionSelectorTestgen;
	// CHECK-NEXT: #endif // ifdef GET_GLOBALISEL_TEMPORARIES_DECL			// CHECK-NEXT: #endif // ifdef GET_GLOBALISEL_TEMPORARIES_DECL

	// CHECK-LABEL: #ifdef GET_GLOBALISEL_TEMPORARIES_INIT			// CHECK-LABEL: #ifdef GET_GLOBALISEL_TEMPORARIES_INIT
	// CHECK-NEXT: , State(2),			// CHECK-NEXT: , State(2),
	// CHECK-NEXT: ISelInfo({TypeObjects, FeatureBitsets, ComplexPredicateFns, CustomRenderers})			// CHECK-NEXT: ISelInfo({TypeObjects, FeatureBitsets, ComplexPredicateFns, CustomRenderers})
	// CHECK-NEXT: #endif // ifdef GET_GLOBALISEL_TEMPORARIES_INIT			// CHECK-NEXT: #endif // ifdef GET_GLOBALISEL_TEMPORARIES_INIT

	// CHECK-LABEL: enum SubtargetFeatureBits : uint8_t {			// CHECK-LABEL: enum SubtargetFeatureBits : uint8_t {
	(1,021 lines not shown)

utils/TableGen/GlobalISelEmitter.cpp

(3,803 lines not shown)
	OS << " static " << Target.getName()
<< "InstructionSelector::CustomRendererFn CustomRenderers[];\n"		<< "InstructionSelector::CustomRendererFn CustomRenderers[];\n"
<< " bool testImmPredicate_I64(unsigned PredicateID, int64_t Imm) const "		<< " bool testImmPredicate_I64(unsigned PredicateID, int64_t Imm) const "
"override;\n"		"override;\n"
<< " bool testImmPredicate_APInt(unsigned PredicateID, const APInt &Imm) "		<< " bool testImmPredicate_APInt(unsigned PredicateID, const APInt &Imm) "
"const override;\n"		"const override;\n"
<< " bool testImmPredicate_APFloat(unsigned PredicateID, const APFloat "		<< " bool testImmPredicate_APFloat(unsigned PredicateID, const APFloat "
"&Imm) const override;\n"		"&Imm) const override;\n"
<< " const int64_t *getMatchTable() const override;\n"		<< " const int64_t *getMatchTable() const override;\n"
		<< " std::unique_ptr<const InstructionSelectorTestgen> "
		"getTestgen() const override;\n"
		<< " friend class " << Target.getName() << "InstructionSelectorTestgen;\n"
<< "#endif // ifdef GET_GLOBALISEL_TEMPORARIES_DECL\n\n";		<< "#endif // ifdef GET_GLOBALISEL_TEMPORARIES_DECL\n\n";

OS << "#ifdef GET_GLOBALISEL_TEMPORARIES_INIT\n"		OS << "#ifdef GET_GLOBALISEL_TEMPORARIES_INIT\n"
<< ", State(" << MaxTemporaries << "),\n"		<< ", State(" << MaxTemporaries << "),\n"
<< "ISelInfo({TypeObjects, FeatureBitsets, ComplexPredicateFns, "		<< "ISelInfo({TypeObjects, FeatureBitsets, ComplexPredicateFns, "
"CustomRenderers})\n"		"CustomRenderers})\n"
<< "#endif // ifdef GET_GLOBALISEL_TEMPORARIES_INIT\n\n";		<< "#endif // ifdef GET_GLOBALISEL_TEMPORARIES_INIT\n\n";

(154 lines not shown)
	if (A.isHigherPriorityThan(B)) {
"the same time");		"the same time");
return true;		return true;
}		}
return false;		return false;
});		});

OS << "bool " << Target.getName()		OS << "bool " << Target.getName()
<< "InstructionSelector::selectImpl(MachineInstr &I, CodeGenCoverage "		<< "InstructionSelector::selectImpl(MachineInstr &I, CodeGenCoverage "
"&CoverageInfo) const {\n"		<< "&CoverageInfo) const {\n"
<< " MachineFunction &MF = *I.getParent()->getParent();\n"		<< " MachineFunction &MF = *I.getParent()->getParent();\n"
<< " MachineRegisterInfo &MRI = MF.getRegInfo();\n"		<< " MachineRegisterInfo &MRI = MF.getRegInfo();\n"
<< " // FIXME: This should be computed on a per-function basis rather "		<< " // FIXME: This should be computed on a per-function basis rather "
"than per-insn.\n"		<< "than per-insn.\n"
<< " AvailableFunctionFeatures = computeAvailableFunctionFeatures(&STI, "		<< " AvailableFunctionFeatures = computeAvailableFunctionFeatures(&STI"
"&MF);\n"		<< ", &MF);\n"
<< " const PredicateBitset AvailableFeatures = getAvailableFeatures();\n"		<< " const PredicateBitset AvailableFeatures = getAvailableFeatures();\n"
<< " NewMIVector OutMIs;\n"		<< " NewMIVector OutMIs;\n"
<< " State.MIs.clear();\n"		<< " State.MIs.clear();\n"
<< " State.MIs.push_back(&I);\n\n"		<< " State.MIs.push_back(&I);\n\n"
<< " if (executeMatchTable(*this, OutMIs, State, ISelInfo"		<< " if (executeMatchTable(*this, OutMIs, State, ISelInfo"
<< ", getMatchTable(), TII, MRI, TRI, RBI, AvailableFeatures"		<< ", getMatchTable(), TII, MRI, TRI, RBI, AvailableFeatures"
<< ", CoverageInfo)) {\n"		<< ", CoverageInfo)) {\n"
<< " return true;\n"		<< " return true;\n"
<< " }\n\n"		<< " }\n\n"
<< " return false;\n"		<< " return false;\n"
		<< "}\n\n"

		<< "namespace {\n"
		<< "class " << Target.getName() << "InstructionSelectorTestgen\n"
		<< " : public llvm::InstructionSelectorTestgen {\n"
		<< "public:\n"
		<< " " << Target.getName() << "InstructionSelectorTestgen(const "
		<< Target.getName() << "InstructionSelector &ISel)\n"
		<< " : ISel(ISel) {}\n"
		<< " void "
		<< "generateTestCases(Module &M, MachineModuleInfo &MMI) const override;\n"
		<< " bool checkFeatures(unsigned FeatureBitsetID) const override;\n"
		<< " LLT getTypeObject(unsigned TypeObjectID) const override;\n"
		<< " const TargetInstrInfo &getTII() const override { return ISel.TII;"
		<< " }\n"
		<< " const TargetRegisterInfo &getTRI() const override { return ISel.TRI;"
		<< " }\n"
		<< " const RegisterBankInfo &getRBI() const override { return ISel.RBI;"
		<< " }\n"
		<< " const int64_t *getMatchTable() const override;\n"
		<< "private:\n"
		<< " const " << Target.getName() << "InstructionSelector &ISel;\n"
		<< "};\n"
		<< "} // end anonymous namespace\n\n"

		<< "std::unique_ptr<const llvm::InstructionSelectorTestgen>\n"
		<< Target.getName() << "InstructionSelector::getTestgen() const {\n"
		<< " return llvm::make_unique<const " << Target.getName()
		<< "InstructionSelectorTestgen>(*this);\n"
		<< "}\n\n"

		<< "void " << Target.getName()
		<< "InstructionSelectorTestgen::generateTestCases(Module &M"
		<< ", MachineModuleInfo &MMI) const {\n"
		<< " MachineFunction *MF = MMI.getMachineFunction(*M.rbegin());\n"
		<< " ISel.AvailableFunctionFeatures = "
		<< "ISel.computeAvailableFunctionFeatures(&ISel.STI, MF);\n"
		<< " generateTestCasesImpl(M, MMI);\n"
		<< "}\n\n"

		<< "bool " << Target.getName()
		<< "InstructionSelectorTestgen::checkFeatures(unsigned FeatureBitsetID) "
		<< "const {\n"
		<< " const PredicateBitset &ExpectedFeatures = "
		<< "ISel.ISelInfo.FeatureBitsets[FeatureBitsetID];\n"
		<< " const PredicateBitset &AvailableFeatures = "
		<< "ISel.getAvailableFeatures();\n"
		<< " return (ExpectedFeatures & AvailableFeatures) == ExpectedFeatures;\n"
		<< "}\n\n"

		<< "LLT " << Target.getName()
		<< "InstructionSelectorTestgen::getTypeObject(unsigned TypeObjectID) "
		<< "const {\n"
		<< " return ISel.ISelInfo.TypeObjects[TypeObjectID];\n"
<< "}\n\n";		<< "}\n\n";

		OS << "const int64_t *" << Target.getName()
		<< "InstructionSelectorTestgen::getMatchTable() const {\n";
		if (OptimizeMatchTable || !GenerateCoverage) {
		const MatchTable VanillaTable = buildMatchTable(Rules, false, true);
		VanillaTable.emitDeclaration(OS);
		OS << " return ";
		VanillaTable.emitUse(OS);
		} else
		OS << " return ISel.getMatchTable()";
		OS << ";\n}\n\n";

const MatchTable Table =		const MatchTable Table =
buildMatchTable(Rules, OptimizeMatchTable, GenerateCoverage);		buildMatchTable(Rules, OptimizeMatchTable, GenerateCoverage);
OS << "const int64_t *" << Target.getName()		OS << "const int64_t *" << Target.getName()
<< "InstructionSelector::getMatchTable() const {\n";		<< "InstructionSelector::getMatchTable() const {\n";
Table.emitDeclaration(OS);		Table.emitDeclaration(OS);
		dsanders: It seems that this block has been moved down from the other side of the sort. Non-functional changes like this ought to be in separate patches.
		rtereshin (Author): I'm extracting a number of patches from this one to break it down a little.
OS << " return ";		OS << " return ";
Table.emitUse(OS);		Table.emitUse(OS);
OS << ";\n}\n";		OS << ";\n}\n";
OS << "#endif // ifdef GET_GLOBALISEL_IMPL\n";
		OS << "#endif // ifdef GET_GLOBALISEL_IMPL\n\n";

OS << "#ifdef GET_GLOBALISEL_PREDICATES_DECL\n"		OS << "#ifdef GET_GLOBALISEL_PREDICATES_DECL\n"
<< "PredicateBitset AvailableModuleFeatures;\n"		<< "PredicateBitset AvailableModuleFeatures;\n"
<< "mutable PredicateBitset AvailableFunctionFeatures;\n"		<< "mutable PredicateBitset AvailableFunctionFeatures;\n"
<< "PredicateBitset getAvailableFeatures() const {\n"		<< "PredicateBitset getAvailableFeatures() const {\n"
		<< " if (TestgenSetAllFeatures)\n"
		<< " return PredicateBitset().set();\n"
<< " return AvailableModuleFeatures | AvailableFunctionFeatures;\n"		<< " return AvailableModuleFeatures | AvailableFunctionFeatures;\n"
<< "}\n"		<< "}\n"
<< "PredicateBitset\n"		<< "PredicateBitset\n"
<< "computeAvailableModuleFeatures(const " << Target.getName()		<< "computeAvailableModuleFeatures(const " << Target.getName()
<< "Subtarget *Subtarget) const;\n"		<< "Subtarget *Subtarget) const;\n"
<< "PredicateBitset\n"		<< "PredicateBitset\n"
		dsanders: This definition should probably be wrapped in a #ifdef.
		rtereshin (Author): It is wrapped in `GET_GLOBALISEL_IMPL` along with the InstructionSelector implementation. Because of the `InstructionSelector::getTestgen` method, we need the class definition even if we aren't planning to use testgen, but why would we not?
<< "computeAvailableFunctionFeatures(const " << Target.getName()		<< "computeAvailableFunctionFeatures(const " << Target.getName()
<< "Subtarget *Subtarget,\n"		<< "Subtarget *Subtarget,\n"
<< " const MachineFunction *MF) const;\n"		<< " const MachineFunction *MF) const;\n"
<< "#endif // ifdef GET_GLOBALISEL_PREDICATES_DECL\n";		<< "#endif // ifdef GET_GLOBALISEL_PREDICATES_DECL\n";

OS << "#ifdef GET_GLOBALISEL_PREDICATES_INIT\n"		OS << "#ifdef GET_GLOBALISEL_PREDICATES_INIT\n"
<< "AvailableModuleFeatures(computeAvailableModuleFeatures(&STI)),\n"		<< "AvailableModuleFeatures(computeAvailableModuleFeatures(&STI)),\n"
<< "AvailableFunctionFeatures()\n"		<< "AvailableFunctionFeatures()\n"
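
For readers following the feature handling, the emitted `checkFeatures` and the `TestgenSetAllFeatures` shortcut in `getAvailableFeatures` amount to a subset test on bitsets. The sketch below assumes a plain `std::bitset` of a hypothetical width of 8 and uses free functions with explicit parameters; the generated code uses members of the target's instruction selector instead.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical width; the generated PredicateBitset is sized per target.
constexpr std::size_t MaxSubtargetPredicates = 8;
using PredicateBitset = std::bitset<MaxSubtargetPredicates>;

// A rule is usable when every feature it expects is available, i.e. the
// expected set is a subset of the available set.
bool checkFeatures(const PredicateBitset &Expected,
                   const PredicateBitset &Available) {
  return (Expected & Available) == Expected;
}

// Mirrors the TestgenSetAllFeatures shortcut: report every feature as
// available instead of combining module- and function-level features.
PredicateBitset getAvailableFeatures(bool SetAllFeatures,
                                     const PredicateBitset &ModuleFeatures,
                                     const PredicateBitset &FunctionFeatures) {
  if (SetAllFeatures)
    return PredicateBitset().set();
  return ModuleFeatures | FunctionFeatures;
}
```

With all features forced on, every rule passes the subset test, which is presumably why the update script passes `-testgen-set-all-features` on every llc invocation it writes.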
(150 lines not shown)

utils/update_instruction_select_testgen_tests.sh

This file was added.

Property	Old Value	New Value
File Mode	null	100755

				#!/bin/bash -e

				testgend="$1"
				llc="$2"
				triple="$3"
				testgen_extra_args="${@:4}"

				if [ -z "$testgend" -o -z "$llc" -o -z "$triple" ]; then
				echo "usage: $0 <testgen'd file> <llc binary> <target triple> [extra llc args]"
				exit 1
				fi

				selected="${testgend%-testgend.mir}-selected.mir"

				testgen_command="$llc -x mir -mtriple $triple -testgen-set-all-features \
				-run-pass instruction-select-testgen $testgen_extra_args \
				-verify-machineinstrs -simplify-mir -o -"

				text="$(echo | $testgen_command 2> /dev/null | perl -pe 's/\s+$/\n/')"

				test0_line_number=$(echo "$text" | grep -n 'name:\s*test_' \
				| head -n 1 | cut -d: -f 1)
				if [ -z "$test0_line_number" ]; then
				echo "Couldn't generate any tests by running the following testgen command:"
				echo "echo | $testgen_command"
				exit 2
				fi
				tests=$(echo "$text" | tail +$((test0_line_number - 1)))

				echo "# NOTE: This test has been autogenerated by utils/$(basename $0)" > "$testgend"
				echo "# RUN: llc -mtriple $triple -run-pass instruction-select-testgen \\" >> "$testgend"
				if [ -n "$testgen_extra_args" ]; then
				echo "# RUN: $testgen_extra_args \\" >> "$testgend"
				fi
				echo "# RUN: -testgen-set-all-features -verify-machineinstrs -simplify-mir %s \\" >> "$testgend"
				echo "# RUN: -o - 2>&1 | FileCheck %s --check-prefix=TESTGEND" >> "$testgend"
				echo "#" >> "$testgend"
				echo "$tests" \
				| perl -pe 's/(test_rule\d+)(_id\d+)?_at_idx\d+/\1/' \
				| perl -pe 's/^(name:\s*test_rule\d+)/# TESTGEND-LABEL: \1/' \
				| perl -pe 's/^([^#\n])/# TESTGEND: \1/' \
				| perl -pe 's/^\n$/#\n/' >> "$testgend"

				dsanders: The 'perl' commands might be an issue for the Windows bots.
				rtereshin (Author): `sed` is horrible with multiline patterns, hm... This script isn't required to run tests, only to generate / update them, so maybe we could deal with it a bit later. Maybe with a little help from somebody running Windows?
				echo "# RUN: llc -mtriple $triple -run-pass instruction-select \\" > "$selected"
				echo "# RUN: -testgen-set-all-features -disable-gisel-legality-check \\" >> "$selected"
				echo "# RUN: -verify-machineinstrs -simplify-mir %s -o - 2>&1 \\" >> "$selected"
				echo "# RUN: | FileCheck %s --check-prefix=SELECTED" >> "$selected"
				echo "#" >> "$selected"
				echo "# Test if this file is in sync with the current state of the selector:" >> "$selected"
				echo "$text" >> "$selected"

				$(dirname $0)/update_mir_test_checks.py --llc-binary="$llc" "$selected"

				sed -i '' -e "/^# Test if this file is in sync with the current state of the selector:$/ a\\
				# RUN: cat %s | FileCheck --check-prefix=TESTGEND \\\\\\
				# RUN: %S/$(basename $testgend)
				" "$selected"

				sed -i '' -e "1 i\\
				# NOTE: This test has been autogenerated by utils/$(basename $0)\\
				#
				" "$selected"

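The chain of four `perl -pe` substitutions in the script above rewrites each line of testgen output into a FileCheck directive. As a rough, hedged sketch of the same per-line transformation (written in C++ with `std::regex` only to make the rules explicit; `toCheckLine` is a hypothetical name and not part of the patch):

```cpp
#include <regex>
#include <string>

// Turns one line of testgen output into a FileCheck line, following the four
// perl substitutions in the script. The line is passed without its newline.
std::string toCheckLine(std::string Line) {
  static const std::regex IdxSuffix(R"((test_rule\d+)(_id\d+)?_at_idx\d+)");
  static const std::regex NameLine(R"(^(name:\s*test_rule\d+))");
  // Drop the match-table index (and the optional rule ID) from the test
  // name; like perl's s/// without /g, only the first match is replaced.
  Line = std::regex_replace(Line, IdxSuffix, "$1",
                            std::regex_constants::format_first_only);
  if (Line.empty())
    return "#"; // blank lines become bare comment lines
  if (std::regex_search(Line, NameLine))
    return "# TESTGEND-LABEL: " + Line; // function names start a new label
  if (Line[0] != '#')
    return "# TESTGEND: " + Line; // everything else is matched verbatim
  return Line; // lines that are already comments are left alone
}
```

Stripping the `_at_idx` suffix presumably keeps the generated checks stable when the match table layout shifts but the rule numbering does not.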