This is an alternative approach to implement the same goal as described here:
https://reviews.llvm.org/D48326
See the summary of that patch for more information.
The difference of this approach is that instead of using a new keyword ("alternative"), we rather extend the PatFrag class to allow specifying multiple alternative match patterns -- this approach was suggested by Krzysztof in comments to D48326.
As an example, the z_sadd fragment mentioned in D48326 would instead be written as follows:
def z_sadd : PatFrags<(ops node:$src1, node:$src2), [(z_saddo node:$src1, node:$src2), (add node:$src1, node:$src2)]>;
Implementation-wise, the major difference is that instead of expanding "alternative" nodes in a separate pass over the PatternsToMatch vector after instruction/pattern parsing was completed, we now directly expand multiple alternatives when inlining pattern fragments. That is, the InlinePatterFragments routine no longer performs an in-place update of one single pattern, but instead maps one input pattern to a set of output patterns.
However, a number of other implementation differences follow from that. This mostly results from the fact that there doesn't seem to be a good way to handle multiple alternatives for the "set" nodes in the main instruction pattern. The approach in D48326 would continue to fully inline pattern fragments before parsing main instructions patterns -- this would inline the "alternative" nodes, but still retain a single "set" node. Instead, with the new approach we now defer inlining pattern fragments until after the initial parsing of the instruction pattern, and only inline fragments on the resulting "PatternToMatch". However, this doesn't fully work, since before inlining we may no longer recognize all input/output nodes; therefore, FindPatternInputsAndOutputs will now do a partial inlining just on named nodes (that serve as inputs) and set destination nodes (that serve as outputs). Since type resolution only works after pattern inlining, this is likewise defered. As an effect, parsing of the DAG pattern now shares a lot more code between the Instruction and the Pattern case, which should be a good thing in any case ...
As a follow-on, we also need to update the two other places that currently still refer to the full instruction pattern (including the "set" and "implicit" nodes): these are InferFromPattern and EmitResultInstructionAsOperand. The former currently infers flags separately from Instruction and Pattern nodes; now it just uses the PatternToMatch list, which contains both. This required rewriting the way the "isBitcast" flag is inferred. The latter uses the instruction pattern to search for SDNPHasChain property flags. This is now moved to InferFromPattern as well (since it basically is the same operation).
Note that in order to avoid spurious changes to other targets, I've for now still restricted the new inference code for the isBitcast and hasChain flags to patterns that originated from Instruction nodes. In general, it would probably be better to remove this final difference between instruction patterns and "regular" patterns, but that should IMO only be done after analysing the specific impacts on existing targets, so this should be done as a follow-on patch.
For now, this patch should be NFC for all targets:
- for SystemZ I've verified that identical machine code is generated
- for all other targets except Sparc, all TableGen generated files are identical, so machine code must be identical too
- for Sparc, the TableGen generated file does change, because we're now running more type inference logic on Instruction patterns (which we used to run only on Pattern patterns), which allows TableGen to infer that the "iPTR:$dst" operand of LEAX_ADDri must actually have type i64. Since the pattern is only active when generating 64-bit code, this should likewise not cause any codegen changes.
A few open questions:
- I've renamed the PatFrag class to PatFrags, holding a "list<dag> Fragments" instead of a "dag Fragment". Does the name make sense? Or should this be called something like MultiPatFrag, or PatFragAlternatives, or ...?
- The original PatFrag is now a subclass of PatFrags, specialized to a single fragment. This required some changes to two targets (Hexagon and NVPTX) that directly accessed implementation details of PatFrag (specifically, the "Fragment" member). Should we instead leave PatFrag as is and and a nonrelated PatFrags class? This would make the implementation in TableGen a bit wordier ...
- Theoretically, a PatFrags instance can now have an empty Fragments list -- this would simply never match anything. Should this be explicitly forbidden? On the other hand, we might also use this to implement "null_frag" without hardcoding its name in TableGen ...