This is an archive of the discontinued LLVM Phabricator instance.

[RFC] "Alternative" matches for TableGen DAG patterns
AbandonedPublic

Authored by uweigand on Jun 19 2018, 9:46 AM.

Details

Summary

A TableGen instruction record usually contains a DAG pattern that will
describe the SelectionDAG operation that can be implemented by this
instruction. However, there will be cases where several different DAG
patterns can all be implemented by the same instruction. The way to
represent this today is to write additional patterns in the Pattern
(or usually Pat) class that map those extra DAG patterns to the
instruction. This usually also works fine.

However, I've noticed cases where the current setup seems to require
quite a bit of extra (and duplicated) text in the target .td files.
For example, in the SystemZ back-end, there are quite a number of
instructions that can implement an "add-with-overflow" operation.
The same instructions also need to be used to implement just plain
addition (simply ignoring the extra overflow output). The current
solution requires creating extra Pat pattern for every instruction,
duplicating the information about which particular add operands
map best to which particular instruction.

Similarly, when working on my current proposal to support strict
floating-point operations in the back-end, it seems that this will
require (at least) two patterns for each floating-point instruction:
one that matches the regular FP operation, and one that matches the
strict FP operation -- the latter must be distinct at the DAG level
since it carries a chain.

It seems to me that it might be useful to have TableGen implicity
generate multiple DAG matcher patterns for a single instruction,
without having to explicitly spell those out in the .td files.
Looking at CodeGenDAGPatterns.cpp, I see some precedents for this
idea: there is already "ExpandHwModeBasedTypes", which implicitly
generates multiple patterns for different HW types, and there is
"GenerateVariants", which implictly generates multiple patterns
based on arithmetic combinations of commutative / associative
operations.

Along those lines, I'd like to propose a new feature that allows a
back-end implementer to concisely write instruction patterns that
match multiple "alternative" DAG patterns and will be implicitly
expanded into regular DAG matchers by TableGen. As an example,
this patch uses the feature in the SystemZ back-end to eliminate
all the duplicated Pat patterns for add/sub instructions mentioned
above. In their place, I can now simply define an alternative:

def z_sadd : PatFrag<(ops node:$src1, node:$src2),
                     (alternative (z_saddo node:$src1, node:$src2),
                                  (add node:$src1, node:$src2))>;

and then write all "addition" instructions to match the z_sadd
operation. This will then get expanded into two DAG matchers
for each of those instructions, one matching z_saddo (which maps
the add-with-overflow operation) and one matching a regular add.

The implementation is mostly straightforward; the actual generation
of duplicated patterns is done by a new routine EnumerateAlternatives,
implemented along the lines of ExpandHwModeBasedTypes and GenerateVariants.

There's one not quite obvious change required: GetNumNodeResults now
needs to take a DagInit operand instead of an Operator, since for
"alternative" records we need to iterate across the DAG children
to determine the number of results of all alternatives. Since the
caller of GetNumNodeResults doesn't have a DagInit, but at this
point the Operator is actually always an instruction, this now
calls an outlined routine GetNumInstructionResults.

Note that "alternative" nodes are completely eliminated by
EnumerateAlternatives, so there are no "down-stream" code changes
required beyond this point at all. The generation of the
SelectionDAG matchers works unchanged as-is (and similarly there
are no changes required to GlobalISel).

Diff Detail

Event Timeline

uweigand created this revision.Jun 19 2018, 9:46 AM

I don't think that the new keyword is necessary. Why not have a new class Patternsthat takes a list of input dag patterns (otherwise identical to the current Pattern), and matches when any of these inputs matches? Then you can have a PatFrags as an analogy to PatFrag.

I've not looked at the patch itself, but regarding the general functionality: Nice! I've wanted this for as long as I've been working on LLVM :-)

I don't think that the new keyword is necessary. Why not have a new class Patternsthat takes a list of input dag patterns (otherwise identical to the current Pattern), and matches when any of these inputs matches? Then you can have a PatFrags as an analogy to PatFrag.

I'm not sure I fully see how this would work yet, but I'll give it a try. Note that the primary motivation of my change was specifically to allow multiple matching patterns for the the main instruction pattern, i.e. the Pattern field of the Instruction class, and I'm not sure I see how your suggestion can work for this case, given that the main pattern does operate a bit differently from the Pattern case ...

For the purposes of matching the main pattern is really the same as the ones specified by Pats. I am opposed to making this feature specific to the "Pattern" field, (1) because this is a very useful feature to have in general, and (2) because if the implementation is main-pattern-specific, it would need to be replaced to make it work with Pats in the future. The main strength of this feature comes from multiple "source patterns" in a PatFrag.

The main issue you'd encounter is that wherever there is a single pattern in CodeGenDAGPatterns.cpp, there will now have to be a list of them. The function InlinePatternFragments would create new patterns for each alternative source pattern, so it would no longer be single input -> single output, but single input -> multiple outputs. Other than that, there shouldn't really be more significant differences. The patterns are detached from instruction definitions as soon as you add them to the PatternsToMatch list.

For the purposes of matching the main pattern is really the same as the ones specified by Pats. I am opposed to making this feature specific to the "Pattern" field, (1) because this is a very useful feature to have in general, and (2) because if the implementation is main-pattern-specific, it would need to be replaced to make it work with Pats in the future. The main strength of this feature comes from multiple "source patterns" in a PatFrag.

Fully agreed, it certainly should work for any pattern matching, both main pattern and Pat.

The main issue you'd encounter is that wherever there is a single pattern in CodeGenDAGPatterns.cpp, there will now have to be a list of them. The function InlinePatternFragments would create new patterns for each alternative source pattern, so it would no longer be single input -> single output, but single input -> multiple outputs. Other than that, there shouldn't really be more significant differences. The patterns are detached from instruction definitions as soon as you add them to the PatternsToMatch list.

Sure, for adding to the PatternsToMatch list I can see how this would work. What I was wondering is those other uses of the main pattern, e.g. to find implicit register definitions -- this is where the main pattern already today can be a list of multiple patterns, where all but the first one must be implicit statements. If a PatFrag expands the first pattern in the list to multiple ones, how will those implicit statements then fit in? There's also this bit in MatcherGen::EmitResultInstructionAsOperand where the main pattern is specifically consulted again to find out whether to place a chain on this output node ... This can probably all be made to work somehow, it's just something that needs a bit more investigation ...

(In general, it seems that those other uses of the main pattern are a bit dubious anyway -- I'm seeing comments that the main pattern and a Pat statement really ought to be equivalent, but currently aren't ...)

kpn added a subscriber: kpn.Jun 21 2018, 12:27 PM

I don't think that the new keyword is necessary. Why not have a new class Patternsthat takes a list of input dag patterns (otherwise identical to the current Pattern), and matches when any of these inputs matches? Then you can have a PatFrags as an analogy to PatFrag.

I've now implemented this approach for comparison here:
https://reviews.llvm.org/D48545

Let me know which one is preferable ...