This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][CodeGen] Fold the mov and lsl into ubfiz
ClosedPublic

Authored by Allen on Aug 21 2022, 2:52 AM.

Diff Detail

Event Timeline

Allen created this revision.Aug 21 2022, 2:52 AM
Allen requested review of this revision.Aug 21 2022, 2:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2022, 2:52 AM
Allen retitled this revision from [AArch64][SelectionDAG] Fold the mov and lsl into ubfiz to [AArch64][CodeGen] Fold the mov and lsl into ubfiz.Aug 21 2022, 7:32 PM
efriedma accepted this revision.Aug 23 2022, 10:29 AM

LGTM

Some of these changes feel a little silly, like the change to select_cc.ll, but there isn't any performance difference, so I guess it's fine. (lsl is actually ubfiz in disguise.)

This revision is now accepted and ready to land.Aug 23 2022, 10:29 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
7348

@efriedma / @Allen : Sorry for the naive question but I'm not hugely familiar with this node and the documentation says:

/// All other bits are
/// assumed to be equal to the bits in the immediate integer constant in the
/// first operand. This instruction just communicates information; No code
/// should be generated.

So is this safe? I mean if no code is generated then in this instance how can we be sure $Rn has it's top 32bits zeroed? I'm kind of assuming this is why the original code is using INSERT_SUBREG?

The emitted code is valid, but could something query the SUBREG_TO_REG post isel and reuse/transform it in an invalid way?

efriedma added inline comments.Aug 23 2022, 12:23 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
7348

Oh, I didn't notice it was changed from using INSERT_SUBREG to use SUBREG_TO_REG. I just glanced over the patterns and assumed it was just using the same pattern we were already using. Not sure why it was changed. We normally only use SUBREG_TO_REG if we know the operand actually clears the high bits.

Not sure there's really any practical consequence here; there isn't very much code that's actually aware of the semantics of SUBREG_TO_REG.

Allen added inline comments.Aug 23 2022, 8:19 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
7348

hi @paulwalker-arm @efriedma

 According https://developer.arm.com/documentation/ddi0602/2021-12/Base-Instructions/UBFIZ--Unsigned-Bitfield-Insert-in-Zero--an-alias-of-UBFM-?lang=en, the insn ubfiz  copies specified bits from the source $Rn and clears other unspecified bits. 
In this pattern, the top 32bits will be clean, so I think it is save.

If we use the original INSERT_SUBREG, I find the test case @loop2 in file llvm/test/CodeGen/AArch64/tbl-loops.ll regressses.

+++ b/llvm/test/CodeGen/AArch64/tbl-loops.ll
@@ -151,7 +151,8 @@ define void @loop2(i8* noalias nocapture noundef writeonly %dst, float* nocaptur
 ; CHECK-NEXT:    cmp w8, #2
 ; CHECK-NEXT:    b.ls .LBB1_4
 ; CHECK-NEXT:  // %bb.2: // %vector.memcheck
-; CHECK-NEXT:    ubfiz x9, x8, #1, #32
+; CHECK-NEXT:    mov w9, w8
+; CHECK-NEXT:    ubfiz x9, x9, #1, #32

I think this is because the extra %37: grp64all = IMPLICIT_DEF need alloc a register for %37, which is not always specified the same register of %36
%36:gpr64 = INSERT_SUBREG %37:grp64all (tied-def 0), %24:gpr32common, %subreg.sub_32

paulwalker-arm added inline comments.Aug 24 2022, 7:40 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
7348

I can see the emitted code is correct but my concern relates to a scenario where a later pass decides to reuse the result of the SUBREG_TO_REG MachineInstr in isolation, with the understanding that the top 32bits will be zero. In this instance we don't know how $Rn was produced so we don't know anything about its top 32bits.

Within PeepholeOptimizer.cpp there's this comment:

// It's an error to translate this:
//
//    %reg1025 = <sext> %reg1024
//     ...
//    %reg1026 = SUBREG_TO_REG 0, %reg1024, 4
//
// into this:
//
//    %reg1025 = <sext> %reg1024
//     ...
//    %reg1027 = COPY %reg1025:4
//    %reg1026 = SUBREG_TO_REG 0, %reg1027, 4
//
// The problem here is that SUBREG_TO_REG is there to assert that an
// implicit zext occurs. It doesn't insert a zext instruction. If we allow
// the COPY here, it will give us the value after the <sext>, not the
// original value of %reg1024 before <sext>.

This articulates what I mean. An optimisation is prevented that would otherwise break SUBREG_TO_REG's contract. Presumably this is for a good reason and suggests we should only use SUBREG_TO_REG when we can honour its contract.

That said, if Eli is happy then I shalt worry about it, but to me this just looks unsafe.

Perhaps we're missing a MachineInstr transformation for INSERT_SUBREG (undef), $GPR -> SUBREG_TO_REG 0, $GPR for the cases where we know the top bits of $GPR will be zero?

Allen added inline comments.Aug 24 2022, 8:39 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
7348

Thanks @paulwalker-arm for detail explanation, and now I think your worry is make sense.
I'll try to take a look at the MachineInstr transformation as your suggestion.

Allen added inline comments.Aug 30 2022, 6:40 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
7348

Address with D132939.

Allen updated this revision to Diff 458951.Sep 8 2022, 8:08 PM
Allen edited the summary of this revision. (Show Details)

rebase after D132939

Allen added a comment.Sep 8 2022, 8:12 PM

hi @paulwalker-arm, I have updated the pattern with INSERT_SUBREG, Any further concerns? Thanks.

paulwalker-arm accepted this revision.Sep 9 2022, 4:13 AM

hi @paulwalker-arm, I have updated the pattern with INSERT_SUBREG, Any further concerns? Thanks.

I've a small request to ensure we preserve an existing test but otherwise this looks good to me. Thanks for persevering with the work.

llvm/test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll
461–466

Your version of these CHECK-MACHO lines is missing add x9, x9, #15 that is fundamental to what the checks are validating. I've checked myself and the add is present so you just need to re-add the CHECK-MACHO line itself.

Allen updated this revision to Diff 459042.Sep 9 2022, 6:45 AM

update test case according comment

This revision was automatically updated to reflect the committed changes.
Allen marked an inline comment as done.
kda added a subscriber: kda.EditedSep 9 2022, 4:30 PM

Looks like this is breaking ASAN buildbot:
https://lab.llvm.org/buildbot/#/builders/168/builds/8760

Log snippet:

******************** TEST 'MLIR-Unit :: IR/./MLIRIRTests/25/57' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/usr/local/google/home/kda/src/bw02/llvm-project/llvm_build_asan/tools/mlir/unittests/IR/./MLIRIRTests-MLIR-Unit-669455-25-57.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=57 GTEST_SHARD_INDEX=25 /usr/local/google/home/kda/src/bw02/llvm-project/llvm_build_asan/tools/mlir/unittests/IR/./MLIRIRTests
--


Note: This is test shard 26 of 57.
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from BlockAndValueMapping
[ RUN      ] BlockAndValueMapping.TypedValue
[       OK ] BlockAndValueMapping.TypedValue (9 ms)
[----------] 1 test from BlockAndValueMapping (9 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (9 ms total)
[  PASSED  ] 1 test.

=================================================================
==2874064==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x5571e8d4b60e in malloc /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x5571e920c8d8 in mlir::Operation::create(mlir::Location, mlir::OperationName, mlir::TypeRange, mlir::ValueRange, mlir::NamedAttrList&&, mlir::BlockRange, unsigned int) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/lib/IR/Operation.cpp:77:46
    #2 0x5571e920c48d in mlir::Operation::create(mlir::Location, mlir::OperationName, mlir::TypeRange, mlir::ValueRange, mlir::NamedAttrList&&, mlir::BlockRange, mlir::RegionRange) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/lib/IR/Operation.cpp:39:19
    #3 0x5571e920c171 in mlir::Operation::create(mlir::OperationState const&) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/lib/IR/Operation.cpp:28:10
    #4 0x5571e908b8d4 in mlir::OpBuilder::create(mlir::OperationState const&) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/lib/IR/Builders.cpp:409:17
    #5 0x5571e8dbbddb in test::TestOpConstant mlir::OpBuilder::create<test::TestOpConstant, mlir::FloatType, mlir::FloatAttr>(mlir::Location, mlir::FloatType&&, mlir::FloatAttr&&) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/include/mlir/IR/Builders.h:457:16
    #6 0x5571e8dbaf8a in BlockAndValueMapping_TypedValue_Test::TestBody() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/unittests/IR/BlockAndValueMapping.cpp:27:26
    #7 0x5571e8f8ec4c in HandleExceptionsInMethodIfSupported<testing::Test, void> /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc
    #8 0x5571e8f8ec4c in testing::Test::Run() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2508:5
    #9 0x5571e8f9141c in testing::TestInfo::Run() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2684:11
    #10 0x5571e8f9281f in testing::TestSuite::Run() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2816:28
    #11 0x5571e8fbd61a in testing::internal::UnitTestImpl::RunAllTests() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:5338:44
    #12 0x5571e8fbc301 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc
    #13 0x5571e8fbc301 in testing::UnitTest::Run() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:4925:10
    #14 0x5571e8f71c7a in RUN_ALL_TESTS /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:2472:46
    #15 0x5571e8f71c7a in main /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/UnitTestMain/TestMain.cpp:55:10
    #16 0x7fbe1698481c in __libc_start_main csu/../csu/libc-start.c:332:16

Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x5571e8d4b60e in malloc /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x5571e920c8d8 in mlir::Operation::create(mlir::Location, mlir::OperationName, mlir::TypeRange, mlir::ValueRange, mlir::NamedAttrList&&, mlir::BlockRange, unsigned int) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/lib/IR/Operation.cpp:77:46
    #2 0x5571e920c48d in mlir::Operation::create(mlir::Location, mlir::OperationName, mlir::TypeRange, mlir::ValueRange, mlir::NamedAttrList&&, mlir::BlockRange, mlir::RegionRange) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/lib/IR/Operation.cpp:39:19
    #3 0x5571e920c171 in mlir::Operation::create(mlir::OperationState const&) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/lib/IR/Operation.cpp:28:10
    #4 0x5571e908b8d4 in mlir::OpBuilder::create(mlir::OperationState const&) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/lib/IR/Builders.cpp:409:17
    #5 0x5571e8dbb75b in test::TestOpConstant mlir::OpBuilder::create<test::TestOpConstant, mlir::IntegerType, mlir::IntegerAttr>(mlir::Location, mlir::IntegerType&&, mlir::IntegerAttr&&) /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/include/mlir/IR/Builders.h:457:16
    #6 0x5571e8dbaefc in BlockAndValueMapping_TypedValue_Test::TestBody() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/mlir/unittests/IR/BlockAndValueMapping.cpp:25:26
    #7 0x5571e8f8ec4c in HandleExceptionsInMethodIfSupported<testing::Test, void> /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc
    #8 0x5571e8f8ec4c in testing::Test::Run() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2508:5
    #9 0x5571e8f9141c in testing::TestInfo::Run() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2684:11
    #10 0x5571e8f9281f in testing::TestSuite::Run() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2816:28
    #11 0x5571e8fbd61a in testing::internal::UnitTestImpl::RunAllTests() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:5338:44
    #12 0x5571e8fbc301 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc
    #13 0x5571e8fbc301 in testing::UnitTest::Run() /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:4925:10
    #14 0x5571e8f71c7a in RUN_ALL_TESTS /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:2472:46
    #15 0x5571e8f71c7a in main /usr/local/google/home/kda/src/bw02/llvm-project/llvm-project/llvm/utils/unittest/UnitTestMain/TestMain.cpp:55:10
    #16 0x7fbe1698481c in __libc_start_main csu/../csu/libc-start.c:332:16

SUMMARY: AddressSanitizer: 160 byte(s) leaked in 2 allocation(s).

--
exit: 1
--

********************

Looks like this is breaking ASAN buildbot:
https://lab.llvm.org/buildbot/#/builders/168/builds/8760

Are you certain you're assigning blame correctly? That's a unittest failure in MLIR on x86; I don't see how it could possibly involve the AArch64 backend.

kda added a comment.Sep 9 2022, 4:50 PM

Looks like this is breaking ASAN buildbot:
https://lab.llvm.org/buildbot/#/builders/168/builds/8760

Are you certain you're assigning blame correctly? That's a unittest failure in MLIR on x86; I don't see how it could possibly involve the AArch64 backend.

Well, I reverted this change and re-ran the buildbot. It looks like it passed (but admittedly it got hung up, but I saw the offending test pass).
I am re-running the test to see if I can get it to run to completion.
If it passes, I will probably submit the reversion.

I agree, I thought this was an unlikely candidate, but I tried 3 others with no luck.