This is an archive of the discontinued LLVM Phabricator instance.

[TailDuplicator] Fix old bugs in TailDuplicator::duplicateInstruction
ClosedPublic

Authored by bjope on Dec 21 2022, 1:23 PM.

Details

Summary

This patch is updating TailDuplicator::duplicateInstruction to fix
some old bugs that has been found with an out-of-tree target. There
are three different things being addressed:

  1. In one situation two subregister indices are combined using the composeSubRegIndices helper. But the order in which those indices are combined has been incorrect. For this problem I managed to create some kind of reproducer using AArch64 (see the test case touched in this patch).
  1. Another fault was found in the else branch for the above situation. Here we do not compose the two subregisters, instead we insert a COPY to replace the PHI, and then the subreg index in the using MO remains. Thus, the virtual register created for the COPY should always match with the size of the original register. Therefore the optimization that "constrain" (or rather relax) the register class by looking at the instruction desc must be limited to the situation when there is no subregister access. Otherwise we create a vreg with the wrong class.
  1. Last problem addressed in this patch is that when a new register class is picked by looking at the instruction desc, then it isn't guaranteed that the isAllocatable property is set for that class. So one need to use the getAllocatableClass helper to find a subclass that is allocatable before using createVirualRegister, or alternatively (as in this patch) just use the OrigRC instead of relaxing the register class for the COPY destination.

Haven't been able to find any in-tree reproducers for problem 2 and 3.
The tricky part is to find a target that has register hierarchies that
match with the problem to trigger those code paths (and with subreg
accesses involved).

Diff Detail

Event Timeline

bjope created this revision.Dec 21 2022, 1:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 21 2022, 1:23 PM
bjope requested review of this revision.Dec 21 2022, 1:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 21 2022, 1:23 PM
Herald added a subscriber: wdng. · View Herald Transcript
bjope added a comment.Jan 11 2023, 8:28 AM

Gentle ping!

Let me know if I should try to find more reviewers, not sure who else is good at subreg hierarchies etc.
But afaict the erroneous code was added in this commit by @kparzysz :

commit 4773f647bdf86da3fb95d7e34513d12bea55d60d
Author: Krzysztof Parzyszek <kparzysz@codeaurora.org>
Date:   Tue Apr 26 18:36:34 2016 +0000

    [Tail duplication] Handle source registers with subregisters
    
    When a block is tail-duplicated, the PHI nodes from that block are
    replaced with appropriate COPY instructions. When those PHI nodes
    contained use operands with subregisters, the subregisters were
    dropped from the COPY instructions, resulting in incorrect code.
    
    Keep track of the subregister information and use this information
    when remapping instructions from the duplicated block.
    
    Differential Revision: http://reviews.llvm.org/D19337
    
    llvm-svn: 267583
kparzysz added inline comments.Feb 3 2023, 10:08 AM
llvm/lib/CodeGen/TailDuplicator.cpp
454

Maybe we can just use OrigRC here, instead of trying to come up with a new class? OrigRC is guaranteed to be valid, since that's what the original code used.

bjope added inline comments.Feb 5 2023, 7:53 AM
llvm/lib/CodeGen/TailDuplicator.cpp
454

Yes. That is an alternative solution.

If I remember correct I tried to avoid changing the behavior for the non-faulty case (assuming that there was some good reason why the old code tried to find a new RC here).

Since register coalescing is doing register class widening (taking getLargestLegalSuperClass into consideration) I don't think that it is important to find a "better" class here. And there are no lit tests that are impacted if I simplify this to just use OrigRC (and I haven't seen any regressions in downstream testing either).

I'll update the patch accordingly, to simplify this part.

bjope updated this revision to Diff 494914.Feb 5 2023, 7:55 AM

Simplified code based on review feedback.

bjope edited the summary of this revision. (Show Details)Feb 5 2023, 7:56 AM
kparzysz accepted this revision.Feb 6 2023, 6:12 AM

LGTM, thanks!

This revision is now accepted and ready to land.Feb 6 2023, 6:12 AM
This revision was landed with ongoing or failed builds.Feb 6 2023, 10:23 AM
This revision was automatically updated to reflect the committed changes.