This is an archive of the discontinued LLVM Phabricator instance.

[globalisel][tablegen] Add support for EXTRACT_SUBREG.
ClosedPublic

Authored by dsanders on May 26 2017, 7:10 AM.

Details

Summary

After this patch, we finally have test cases that require multiple
instruction emission.

Depends on D33590

Diff Detail

Repository
rL LLVM

Event Timeline

dsanders created this revision.May 26 2017, 7:10 AM

After re-testing this on current trunk, there's a bug in X86 due to GR64 not fully supporting the 8bit subregister index. I'll fix this shortly using failedImport() for this case since we don't have multi-insn emission quite yet. The following patches introduce the infrastructure for it.

dsanders updated this revision to Diff 101196.Jun 2 2017, 6:12 AM

Fix the X86 machine verifier bug where EXTRACT_SUBREG could emit a subregister
copy with classes that didn't quite support it. TableGen now finds a suitable
pair of classes. However, if this results in a copy of the operand into another
class, tablegen will currently refuse to import SelectionDAG because we lack
multi-insn emission at the moment. Support for that will start to be introduced
as part of the state-machine patches.

qcolombet edited edge metadata.Jun 2 2017, 7:23 PM

Could you re-upload the patch without the diffs from https://reviews.llvm.org/D33590?

test/CodeGen/AArch64/GlobalISel/select-trunc.mir
19 ↗(On Diff #101196)

Do you know why we end up with 'all' in one case and 'sp' in the other?
I would have expected to both have 'all' or 'sp'.

utils/TableGen/CodeGenRegisters.h
368 ↗(On Diff #101196)

The "as far as possible" part doesn't sound right. I thought tablegen was generating the missing register classes for that.

dsanders updated this revision to Diff 101490.Jun 5 2017, 5:39 PM

Resubmit the patch minus the bits that were already in D33590. Sorry about that.

dsanders added inline comments.Jun 6 2017, 6:28 PM
test/CodeGen/AArch64/GlobalISel/select-trunc.mir
19 ↗(On Diff #101196)

The gpr32sp is indirectly specified by the rule:

// (trunc:i32 GPR64sp:i64:$src)  =>  (EXTRACT_SUBREG:i32 GPR64sp:i64:$src, sub_32:i32)

$src is specified to be GPR64sp and the best fit for the sub_32 subregister index is GPR32sp. At this point, $src is left as GPR (which sounds wrong to me now that I've noticed it, I think it should constrain it to GPR64sp, I'll take another look at it). The GPR64all comes from the '%vreg0<def>(s64) = COPY %X0; GPR:%vreg0'

utils/TableGen/CodeGenRegisters.h
368 ↗(On Diff #101196)

It tries to find the best class (often one of the generated classes) but I'm not sure I can guarantee that it will always find it for all targets. AMDGPU's unusual use of subregister indices is what I had in mind since I think that may end up picking a subreg register class that contains additional registers. Additional registers in the subreg register class isn't a problem if it happens.

dsanders updated this revision to Diff 101662.Jun 6 2017, 6:45 PM

Constrain the source operand too. This fixes the odd GPR64all, it's now GPR64sp.

ab edited edge metadata.Jun 15 2017, 3:48 PM

Can you add a tablegen testcase?

lib/Target/AArch64/AArch64InstructionSelector.cpp
943–944 ↗(On Diff #101662)

Remove the:

} else if (DstRC == &AArch64::GPR32RegClass &&
           SrcRC == &AArch64::GPR64RegClass) {

below instead.

utils/TableGen/CodeGenRegisters.cpp
923 ↗(On Diff #101490)

Looks like we already have an escape hatch in TargetRegisterInfo::getSubClassWithSubReg. Should we just use that?

utils/TableGen/CodeGenRegisters.h
374 ↗(On Diff #101662)

Return an Optional<std::pair> ?

utils/TableGen/GlobalISelEmitter.cpp
871 ↗(On Diff #101662)

Triple-slash.

1625–1630 ↗(On Diff #101662)

At this point, we don't need to compute SrcRC/DstRC, right? Can you check this when emitting the constraining code?

1817 ↗(On Diff #101662)

Split the 'if' from the define and early-exit?

1825 ↗(On Diff #101662)

Stale text?

dsanders marked 5 inline comments as done.Jun 23 2017, 2:41 AM
dsanders added inline comments.
lib/Target/AArch64/AArch64InstructionSelector.cpp
953 ↗(On Diff #101662)

This code is also used by G_PTRTOINT. I'll fix these comments.

976–977 ↗(On Diff #101662)

I've updated this to account for the change you requested but I couldn't make that exact change due to a couple problems.

It turns out this is needed for G_PTRTOINT to prevent select-int-ptr-casts.mir failing. This is because a suitable rule isn't imported. As far as I can see there are no suitable rules to import.

It also turned out to be needed for G_TRUNC to support s1/s8/s16 to avoid errors like:

LLVM ERROR: cannot select: %vreg1<def>(s8) = G_TRUNC %vreg0; GPR32:%vreg1 GPR64:%vreg0 (in function: trunc_s8_s64)

This happens because getRegClassForTypeOnBank() returns GPR32 for integer types smaller than 32 bit. To fix this, I've done the equivalent with DstTy and SrcTy.

utils/TableGen/CodeGenRegisters.cpp
923 ↗(On Diff #101490)

That only selects a class suitable for the source of the copy. We also need to find a suitable class for the destination otherwise the MachineVerifier will complain that some destination registers for the subreg COPY are not subregisters of a register from the source class.

utils/TableGen/GlobalISelEmitter.cpp
871 ↗(On Diff #101662)

Well spotted

1625–1630 ↗(On Diff #101662)

We need to compute it to determine whether we need to copy the source to a suitable register class or not. The current code deals with the case where the extra copy isn't needed and rejects the one that needs it. To support the case that needs it we'll need to refactor to support multiple insn emission.

dsanders updated this revision to Diff 103705.Jun 23 2017, 2:44 AM
dsanders marked an inline comment as done.

Fix the nits ( -> /, Optional<std::pair<...>>, etc.)

Made the change requested for AArch64InstructionSelector.cpp as far as
possible. There are a few issues with the exact change as mentioned in my
comment.

dsanders marked 2 inline comments as done.Jun 23 2017, 2:45 AM
This revision was automatically updated to reflect the committed changes.
ab added a comment.Jun 27 2017, 11:51 AM

Can you add a tablegen testcase? If nothing else, it's a good way to document the tablegen backend.

llvm/trunk/utils/TableGen/CodeGenRegisters.cpp
919

Looks like we already have an escape hatch in TargetRegisterInfo::getSubClassWithSubReg. Should we just use that?

That only selects a class suitable for the source of the copy. We also need to find a suitable class for the destination otherwise the MachineVerifier will complain that some destination registers for the subreg COPY are not subregisters of a register from the source class.

Huh, OK; I thought we did have a helper somewhere (that computes the class of all subregisters of a superregister class via a given index).

llvm/trunk/utils/TableGen/GlobalISelEmitter.cpp
1669

There are multiple 80-col violations, btw

1679

We need to compute it to determine whether we need to copy the source to a suitable register class or not. The current code deals with the case where the extra copy isn't needed and rejects the one that needs it. To support the case that needs it we'll need to refactor to support multiple insn emission.

Sure, but wouldn't it work to only compute it once, later, when emitting, and check whether we need the copy then? Here's a concrete patch:

 utils/TableGen/GlobalISelEmitter.cpp | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git i/utils/TableGen/GlobalISelEmitter.cpp w/utils/TableGen/GlobalISelEmitter.cpp
index 50da9085c21..edb772e5c26 100644
--- i/utils/TableGen/GlobalISelEmitter.cpp
+++ w/utils/TableGen/GlobalISelEmitter.cpp
@@ -1667,18 +1667,8 @@ Expected<BuildMIAction &> GlobalISelEmitter::createAndImportInstructionRenderer(
       return failedImport("EXTRACT_SUBREG child #1 is not a leaf");

     if (DefInit *SubRegInit = dyn_cast<DefInit>(Dst->getChild(1)->getLeafValue())) {
-      CodeGenRegisterClass *RC = CGRegs.getRegClass(
-          getInitValueAsRegClass(Dst->getChild(0)->getLeafValue()));
       CodeGenSubRegIndex *SubIdx = CGRegs.getSubRegIdx(SubRegInit->getDef());

-      const auto &SrcRCDstRCPair =
-          RC->getMatchingSubClassWithSubRegs(CGRegs, SubIdx);
-      if (SrcRCDstRCPair.hasValue()) {
-        assert(SrcRCDstRCPair->second && "Couldn't find a matching subclass");
-        if (SrcRCDstRCPair->first != RC)
-          return failedImport("EXTRACT_SUBREG requires an additional COPY");
-      }
-
       DstMIBuilder.addRenderer<CopySubRegRenderer>(
           InsnMatcher, Dst->getChild(0)->getName(), SubIdx);
       return DstMIBuilder;
@@ -1898,6 +1888,8 @@ Expected<RuleMatcher> GlobalISelEmitter::runOnPattern(const PatternToMatch &P) {
       const auto &SrcRCDstRCPair =
           SrcRC->getMatchingSubClassWithSubRegs(CGRegs, SubIdx);
       assert(SrcRCDstRCPair->second && "Couldn't find a matching subclass");
+      if (SrcRCDstRCPair->first != SrcRC)
+        return failedImport("EXTRACT_SUBREG requires an additional COPY");
       M.addAction<ConstrainOperandToRegClassAction>("NewI", 0, *SrcRCDstRCPair->second);
       M.addAction<ConstrainOperandToRegClassAction>("NewI", 1, *SrcRCDstRCPair->first);
1878

This was where I was suggesting an early-exit, not L1866: it might help to drop the if around SubRegInit?

1884

The operand number is also wrong, no? (here and elsewhere)

1896

Add a message?

In D33596#792538, @ab wrote:

Can you add a tablegen testcase? If nothing else, it's a good way to document the tablegen backend.

Sure.

llvm/trunk/utils/TableGen/GlobalISelEmitter.cpp
1669

Ok, I'll fix these. Hopefully, I'll also find time soon to get the commit hook I used to have working again.

1679

That looks like it will work at the moment but I'm not sure it fits with the emission of nested instructions. For example:

(EXTRACT_SUBREG (ADD x:s64, y:s64), sub_32)

could select to:

%2(A64) = ADD %0(A64), %1(A64)
%3(B64) = COPY %2(A64)
%4(B32) = COPY %3(B64), sub_32

where A64 and B64 are disjoint register classes and B32 is a subregister class of B64. Handling the extra copy outside the tree walking code would need a way to insert the copy into the right part of the emitted list. If we handle it during the tree walking code we can just append to the list as we go.

Here's a particularly complicated example from Mips:

def : MSAPat<
  (i32 (vextract_zext_i8 v16i8:$ws, i64:$idx)),
  (SRL (COPY_TO_REGCLASS
         (i32 (EXTRACT_SUBREG
                 (SPLAT_B v16i8:$ws,
                   (COPY_TO_REGCLASS
                     (i32 (EXTRACT_SUBREG i64:$idx, sub_32)), GPR32)),
                 sub_lo)),
         GPR32),
       (i32 24))>;
1878

Ah I see. I don't mind either way, I'll make the change.

1884

It depends how you view the data. The gMIR view is:

%0 = EXTRACT_SUBREG %1, subidx

in which case it's operand 1, but the DAG view is

(EXTRACT_SUBREG %1, subidx)

in which case it's child 1. I've been using 'operand' for the gMIR view and 'child' for the DAG view.

1896

Ok