This adds a basic tablegen backend that analyzes the SelectionDAG patterns to find simple ones that are eligible for GlobalISel-emission.
That's similar to FastISel, with one notable difference: we're not fed ISD opcodes, so we need to map the SDNode operators to generic opcodes. That mapping is expressed using GINodeEquiv records in TargetGlobalISel.td.
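For illustration, the equivalence records look roughly like this (a sketch of the TargetGlobalISel.td contents):

    // Declare that a generic instruction is "equivalent" to an SDNode: any
    // SelectionDAG pattern rooted at that node can be imported as a matcher
    // for the generic instruction.
    class GINodeEquiv<Instruction i, SDNode node> {
      Instruction I = i;
      SDNode Node = node;
    }

    def : GINodeEquiv<G_ADD, add>;
    def : GINodeEquiv<G_BR, br>;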
Otherwise, this is mostly boilerplate, plus a lot of filtering to reject any kind of "complicated" pattern. On AArch64, this is sufficient to match G_ADD up to s64 (to ADDWrr/ADDXrr) and G_BR (to B).
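As a concrete example, the patterns that survive the filtering are the plain register-register forms; a hand-written equivalent would be (just for illustration; AArch64 really defines these through multiclasses):

    // Simple "eligible" pattern: a single SDNode, register-class operands,
    // no predicates or complex operands.
    def : Pat<(add GPR32:$Rn, GPR32:$Rm),
              (ADDWrr GPR32:$Rn, GPR32:$Rm)>;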
This depends on a local patch that adds a "Widenable" flag to generic instructions, to express what we currently only rely on implicitly: that operations like G_ADD are OK to do on wider types (full diff at https://reviews.llvm.org/differential/diff/78609/).
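To give an idea, the flag could be declared along these lines (a purely hypothetical sketch, not the actual patch; see the diff above for the real thing):

    // Hypothetical: mark generic opcodes whose result is still correct when
    // all operands are widened to a larger scalar type.
    class GenericInstruction : StandardPseudoInstruction {
      bit Widenable = 0;
    }

    def G_ADD : GenericInstruction {
      let OutOperandList = (outs unknown:$dst);
      let InOperandList = (ins unknown:$src1, unknown:$src2);
      let Widenable = 1; // an add on a wider type produces the same low bits
    }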
But before we dive into the details, I'd like to get feedback on the overall approach. We discussed several alternatives:
- invent a new syntax and write new patterns from scratch: let's avoid duplicating all of the existing work.
- invent a new syntax and convert the older patterns (at check-in or build time): I don't think we're at a stage where we can make well-informed decisions about the eventual tablegen design. I'm hoping that, as we add support for more SDAG constructs, we'll get more insight into what we'll eventually need. For now, emitting matchers from the SDAG patterns is a small incremental step.
- mutate the existing representation (e.g., starting by adding a "GlobalISel" class to the SDNodes that have equivalents): we decided against that, as one of our goals in the bring-up is to have absolutely no impact whatsoever on SelectionDAG.
For the record, this currently generates:
    bool AArch64InstructionSelector::selectImpl(MachineInstr &I) const {
      MachineRegisterInfo &MRI = I.getParent()->getParent()->getRegInfo();

      // Src: (add:i32 GPR32:i32:$Rn, GPR32:i32:$Rm)
      // Dst: (ADDWrr:i32 GPR32:i32:$Rn, GPR32:i32:$Rm)
      if ((I.getOpcode() == TargetOpcode::G_ADD) &&
          (MRI.getType(I.getOperand(0).getReg()).equalOrNarrower(LLT::scalar(32))) &&
          (MRI.getType(I.getOperand(1).getReg()).equalOrNarrower(LLT::scalar(32))) &&
          (MRI.getType(I.getOperand(2).getReg()).equalOrNarrower(LLT::scalar(32)))) {
        I.setDesc(TII.get(AArch64::ADDWrr));
        constrainSelectedInstRegOperands(I, TII, TRI, RBI);
        return true;
      }

      // Src: (add:i64 GPR64:i64:$Rn, GPR64:i64:$Rm)
      // Dst: (ADDXrr:i64 GPR64:i64:$Rn, GPR64:i64:$Rm)
      if ((I.getOpcode() == TargetOpcode::G_ADD) &&
          (MRI.getType(I.getOperand(0).getReg()).equalOrNarrower(LLT::scalar(64))) &&
          (MRI.getType(I.getOperand(1).getReg()).equalOrNarrower(LLT::scalar(64))) &&
          (MRI.getType(I.getOperand(2).getReg()).equalOrNarrower(LLT::scalar(64)))) {
        I.setDesc(TII.get(AArch64::ADDXrr));
        constrainSelectedInstRegOperands(I, TII, TRI, RBI);
        return true;
      }

      // Src: (br (bb:Other):$addr)
      // Dst: (B (bb:Other):$addr)
      if ((I.getOpcode() == TargetOpcode::G_BR) &&
          (I.getOperand(0).isMBB())) {
        I.setDesc(TII.get(AArch64::B));
        constrainSelectedInstRegOperands(I, TII, TRI, RBI);
        return true;
      }

      return false;
    }
Note that no sorting is done, so it's pure luck that we pick the s32 ADDWrr rather than ADDXrr for the smaller sizes. We'll need some higher-level sorting to order the matchers appropriately; that said, picking ADDXrr wouldn't be incorrect, though it might cause failures when constraining the register operands.
Also note that a lot of redundant work is done. I'm hoping we can optimize the matchers (much like SDAG currently does): for instance, all three operands share the same type (type0), so we could check it once rather than querying each operand separately.
Finally, note that this isn't strictly correct with regard to register banks: they're never checked. We'll need to add tablegen support for bank definitions before we can land this.
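For instance, that support might let targets declare banks along these lines (purely hypothetical syntax; the names are invented for illustration):

    // Hypothetical: a bank covering the GPR classes, so emitted matchers can
    // check each operand's register bank in addition to its type.
    def GPRRegBank : RegisterBank<"GPR", [GPR32, GPR64]>;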