This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Revive optimize add/sub with immediate through MIPeepholeOpt
ClosedPublic

Authored by red1bluelost on Jan 16 2022, 7:18 AM.

Details

Summary

Fixes the build issue with D111034, whose goal was to optimize
add/sub with long immediates.

Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

The change which fixed the build issue in D111034 was the use of new virtual
registers so that SSA form is maintained until deleting MI.

Testing:

Tested by replicating the 2 stage bootstage process using QEMU and Docker,
replicating an AArch64 Linux environment. The steps for the build process
are taken from the "clang-aarch64-full-2stage" BuildBot.

Ran a 2 stage build on a clean LLVM repo to verify the environment.
Ran a 2 stage build on D111034 which failed in when building stage 2.
Ran a 2 stage build of these changes and it built with no errors.

Notes on the Docker Environment:

To set up the environment, make sure to have QEMU and Docker installed then run
the following commands:

docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
docker run -it --entrypoint=/bin/bash linaro/ci-arm64-tcwg-llvmbot-ubuntu:bionic

I mounted my LLVM folder in the Docker container so that I could quickly edit
the source from my host environment.

Some of the tests will due to the QEMU environment, which I found when verifying
the clean branch of LLVM. These appear to mainly be associated with the
Compiler-RT, such as libFuzzer and some of the sanitizers.

Diff Detail

Event Timeline

red1bluelost created this revision.Jan 16 2022, 7:18 AM
red1bluelost requested review of this revision.Jan 16 2022, 7:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2022, 7:18 AM

Fixes some changes I made when merging the comment.

benshi001 accepted this revision.Jan 17 2022, 3:43 AM

Thanks for fixing my previous issue.

This revision is now accepted and ready to land.Jan 17 2022, 3:43 AM
dmgreen accepted this revision.Jan 17 2022, 4:09 AM

So this now does the same as visitORR? LGTM in that case, thanks for pushing this through!

So this now does the same as visitORR? LGTM in that case, thanks for pushing this through!

I don't think it is the same as visitORR. If you meant visitAND, I just checked an they are very similar expect differences where AND uses splitBitmaskImm while ADDSUB uses splitAndSubImm. Also the building of the ADDSUB MI's has an additional immediate for the shift value.

I'll try to see if it would be worth while to abstract the similarities after I add the SUBS instruction.

Thank you for the reviews! Would one of you be able to commit on my behalf as I do not have commit access?

I don't think it is the same as visitORR. If you meant visitAND, I just checked an they are very similar expect differences where AND uses splitBitmaskImm while ADDSUB uses splitAndSubImm. Also the building of the ADDSUB MI's has an additional immediate for the shift value.

Oh yeah. I meant the replaceRegWith etc at the end of visitAND (or visitORR). The way we update the operands is similar to something that we know has worked in the past, which is always a good sign.

I can commit this, I just need a name/email to attribute it to. Is your @gmail account OK for that?

I don't think it is the same as visitORR. If you meant visitAND, I just checked an they are very similar expect differences where AND uses splitBitmaskImm while ADDSUB uses splitAndSubImm. Also the building of the ADDSUB MI's has an additional immediate for the shift value.

Oh yeah. I meant the replaceRegWith etc at the end of visitAND (or visitORR). The way we update the operands is similar to something that we know has worked in the past, which is always a good sign.

I can commit this, I just need a name/email to attribute it to. Is your @gmail account OK for that?

Yep, it was the replaceRegWith that fixed the second stage.

Here is name and email:
Micah Weston
micahsweston@gmail.com

This revision was landed with ongoing or failed builds.Jan 17 2022, 9:17 AM
This revision was automatically updated to reflect the committed changes.
fhahn added a subscriber: fhahn.Jan 18 2022, 5:18 AM

This change appears to introduce new machine verifier failures when building the llvm-test-suite with -mllvm -verify-machineinstrs enabled. I reverted e6698f09929a so the bot gets back to green for now.

https://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O3/11061/

FAILED: MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o
/Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite-build/tools/timeit --summary MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.time /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin/clang -DNDEBUG  -B /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin    -Wno-unused-command-line-argument -mllvm -verify-machineinstrs -O3 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS13.5.sdk   -w -Werror=date-time -DTORONTO -MD -MT MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -MF MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.d -o MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o   -c /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c
*** Bad machine code: Illegal virtual register for instruction ***
- function:    alloc_tree
- basic block: %bb.1 if.else (0x7fc0db8f8bb0)
- instruction: %31:gpr64 = nsw MADDXrrr killed %39:gpr64sp, killed %25:gpr64, $xzr
- operand 1:   killed %39:gpr64sp
Expected a GPR64 register, but got a GPR64sp register
fatal error: error in backend: Found 1 machine code errors.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin/clang -DNDEBUG -B /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin -Wno-unused-command-line-argument -mllvm -verify-machineinstrs -O3 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS13.5.sdk -w -Werror=date-time -DTORONTO -MD -MT MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -MF MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.d -o MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -c /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module '/Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c'.
4.      Running pass 'Verify generated machine code' on function '@alloc_tree'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  clang         0x000000011191896b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 43
1  clang         0x00000001119179b5 llvm::sys::RunSignalHandlers() + 85
2  clang         0x00000001119180e2 llvm::sys::CleanupOnSignal(unsigned long) + 210
3  clang         0x0000000111849f6a (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) + 106
4  clang         0x0000000111849ee8 llvm::CrashRecoveryContext::HandleExit(int) + 24
5  clang         0x0000000111914acc llvm::sys::Process::Exit(int, bool) + 44
6  clang         0x000000010f4e9be9 LLVMErrorHandler(void*, char const*, bool) + 89
7  clang         0x0000000114eba333 llvm::report_fatal_error(llvm::Twine const&, bool) + 323
8  clang         0x0000000110d8c620 (anonymous namespace)::MachineVerifier::BBInfo::~BBInfo() + 0
9  clang         0x0000000110cdddca llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 378
10 clang         0x00000001110b0154 llvm::FPPassManager::runOnFunction(llvm::Function&) + 1092
11 clang         0x00000001110b6268 llvm::FPPassManager::runOnModule(llvm::Module&) + 72
12 clang         0x00000001110b074a llvm::legacy::PassManagerImpl::run(llvm::Module&) + 986
13 clang         0x0000000111c20ad4 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) + 3764
14 clang         0x0000000111f6dd31 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) + 1905
15 clang         0x00000001131a28b3 clang::ParseAST(clang::Sema&, bool, bool) + 643
16 clang         0x00000001122b02a4 clang::FrontendAction::Execute() + 84
17 clang         0x000000011222d6a9 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 873
18 clang         0x000000011232faf5 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 661
19 clang         0x000000010f4e9860 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 2544
20 clang         0x000000010f4e7168 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) + 312
21 clang         0x00000001120ab187 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*, bool*) const::$_1>(long) + 23
22 clang         0x0000000111849eb4 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) + 228
23 clang         0x00000001120aac24 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*, bool*) const + 324
24 clang         0x000000011207b85d clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const + 221
25 clang         0x000000011207bdad clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*> >&) const + 125
26 clang         0x0000000112092f7c clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*> >&) + 204
27 clang         0x000000010f4e6977 main + 10375
28 libdyld.dylib 0x00007fff6be90cc9 start + 1
29 libdyld.dylib 0x0000000000000018 start + 18446603338705728336
clang-14: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 14.0.0 (https://github.com/llvm/llvm-project.git c90d136be4e055f1b409f38706d0fe3e2211af08)
Target: arm64-apple-darwin19.5.0
Thread model: posix
InstalledDir: /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin
clang-14: note: diagnostic msg:

Thank you for reverting. I will look into this.

red1bluelost reopened this revision.Jan 19 2022, 7:05 AM
This revision is now accepted and ready to land.Jan 19 2022, 7:05 AM
red1bluelost updated this revision to Diff 401225.EditedJan 19 2022, 7:06 AM

Adds a test to catch the bug earlier and then fixes the issue.

The issue was related to differing register classes between the rr and ri
versions of the instructions.

The issue found in the CI is related to contraining the register class of the
replaced instruction. The [ADD|SUB][W|X]rr instructions use GPR[32|64]RegClass
while [ADD|SUB][W|X]ri instructions use GPR[32|64]spRegClass. To fix the
conflict, we constrain to GPR[32|64]commonRegClass (rather than spRegClass)
so that machine instructions remain valid.

To ensure that this issue is caught earlier, addsub-24bit-imm.mir was added for
tests. After the typical peephole patterns, an instruction that expects
GPR[32|64]RegClass uses the result. This was shown to fail with the prior
implementation of MIPeepholeOpt.

Note: I could not figure out a way to make the test in IR which is why I
ended up making an MIR test.

Rebase to fix the patch.

I think that even though it might no exist right now, in the future there could be a register class which is a subset of GPR64commonRegClass. The newly created reg should be choosing the common reg class of GPR64sp and the old DstReg.

Something like MRI->constrainRegClass(NewDstReg, MRI->getRegClass(DstReg));

Improves contraining of register class with the suggestion. It constrains the
source register to the register class used by the new immediate instructions.
It constrains the new destination register to the reg class used by the
original register.

Thanks. I'm pretty sure this LGTM, but you may need to squash the commits to show the full diff again.

dmgreen accepted this revision.Jan 21 2022, 11:05 AM

Squashes commits as suggested to fix the patch.