Page MenuHomePhabricator

Optimize shift and accumulate pattern in AArch64.
Needs ReviewPublic

Authored by adriantong1024 on Nov 22 2021, 4:48 PM.

Details

Summary

AArch64 supports unsigned shift right and accumulate. In case we see a
unsigned shift right followed by an OR. We could turn them into a USRA
instruction, given the operands of the OR has no common bits.

Diff Detail

Unit TestsFailed

TimeTest
70 msx64 debian > LLVM.Bindings/Go::go.test
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llvm-go go=/usr/bin/go test llvm.org/llvm/bindings/go/llvm
1,070 msx64 debian > SanitizerCommon-tsan-x86_64-Linux.Linux::decorate_proc_maps.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -gline-tables-only -fsanitize=thread -m64 -funwind-tables -ldl -g /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Linux/decorate_proc_maps.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/sanitizer_common/tsan-x86_64-Linux/Linux/Output/decorate_proc_maps.cpp.tmp

Event Timeline

adriantong1024 created this revision.Nov 22 2021, 4:48 PM
adriantong1024 requested review of this revision.Nov 22 2021, 4:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 22 2021, 4:48 PM

Could this be handled with a tablegen pattern? Something like the or_is_add pattern from X86 (but using haveNoCommonBitsSet). Maybe a PatFrag that accepts an add or an add-like-or?

Also should ssra be treated the same way?

llvm/test/CodeGen/AArch64/shift-accumulate.ll
13

It's probably a better test to return the %5, not extract a single lane from it.

Could this be handled with a tablegen pattern? Something like the or_is_add pattern from X86 (but using haveNoCommonBitsSet). Maybe a PatFrag that accepts an add or an add-like-or?

Thanks for the comment David. I will take a look at handling in tablegen pattern as well. Will update when I have more.

Also should ssra be treated the same way?

Address Dave's comment. Moving the rules into tablegen.

Thanks. I like the added Known bits - they can be useful in many situations.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1808

We know that the operands 1 and 2 will be constants, so I think we can just grab the values for them directly:

uint64_t Mask = ~(Op->getConstantOperandVal(1) << Op->getConstantOperandVal(2));

The Mask then specifies which bits are known to be 0.

llvm/lib/Target/AArch64/AArch64InstrInfo.td
8237

Now that we have this, it may be worth using it in the existing patterns, using a PatFrags that accepts either "add" or "or_is_add". That would save the need for new patterns.

llvm/test/CodeGen/AArch64/shift-accumulate.ll
2

Can you run the update_llc_test_checks on the file? It's missing some expected output.

It's also worth adding some simple case for each signedness / type that have tablegen patterns added / changed.

19

Unfortunately this doesn't verify. I think because the BIC code is giving incorrect Known bits.

Address Dave's comment.

Add 2 more test cases, testing different datatypes.

Thanks. Looks good, as far as I can tell.

Can you make sure there are some i64 tests, especially v1i64. Those have a different tablegen class, so it would be good to make sure they have tests covering them.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1800–1804

Is this chunk needed? I think the operands should be constants by definition.

llvm/test/CodeGen/AArch64/shift-accumulate.ll
3

I tend to use -mtriple=aarch64-none-eabi

5

You can usually remove dso_local and local_unnamed_addr #0 align 32, to make the tests a little cleaner.

Address Dave's comment. Thanks!