This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Adjust operand sequence for Add+Sub to combine more inline shift
ClosedPublic

Authored by bcl5980 on Oct 18 2022, 4:22 AM.

Diff Detail

Event Timeline

bcl5980 created this revision.Oct 18 2022, 4:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 18 2022, 4:22 AM
bcl5980 requested review of this revision.Oct 18 2022, 4:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 18 2022, 4:22 AM

Have you tried testing this? I think it will get stuck in DAG combines.

Otherwise the idea sounds good. Have you considered doing it for GlobalISel, or in a way that GlobalISel would also benefit?

Have you tried testing this? I think it will get stuck in DAG combines.

Otherwise the idea sounds good. Have you considered doing it for GlobalISel, or in a way that GlobalISel would also benefit?

I haven't found any stuck in DAG combine local and it looks all of the premerge test is also passed:
https://reviews.llvm.org/harbormaster/build/291788/

Which combination do you think will cause the dead loop in DAG combine with this patch? I can try to create a test for that.

I can try to do it in GISel also in another patch later.

Have you tried testing this? I think it will get stuck in DAG combines.

Otherwise the idea sounds good. Have you considered doing it for GlobalISel, or in a way that GlobalISel would also benefit?

I haven't found any stuck in DAG combine local and it looks all of the premerge test is also passed:
https://reviews.llvm.org/harbormaster/build/291788/

Which combination do you think will cause the dead loop in DAG combine with this patch? I can try to create a test for that.

Yes - the llvm regression tests are very much a subset of all the patterns that can come up in practice. I'm not sure what this is running into. Perhaps try and bootstrap or compiling the llvm-test-suite?

Have you tried testing this? I think it will get stuck in DAG combines.

Otherwise the idea sounds good. Have you considered doing it for GlobalISel, or in a way that GlobalISel would also benefit?

I haven't found any stuck in DAG combine local and it looks all of the premerge test is also passed:
https://reviews.llvm.org/harbormaster/build/291788/

Which combination do you think will cause the dead loop in DAG combine with this patch? I can try to create a test for that.

Yes - the llvm regression tests are very much a subset of all the patterns that can come up in practice. I'm not sure what this is running into. Perhaps try and bootstrap or compiling the llvm-test-suite?

Or can we do it on Machine IR(AArch64MIPeepholeOpt) to avoid dead loop and also GISel can get benefit?

bcl5980 planned changes to this revision.Oct 25 2022, 8:17 PM

dead loop happen in this file

bcl5980 updated this revision to Diff 470688.Oct 25 2022, 8:35 PM

Z can't be constant to avoid dead loop

bcl5980 updated this revision to Diff 470695.Oct 25 2022, 10:36 PM

test pass:
llvm self-build
llvm test-suite
spec2017
sqlite3
python3.11

dmgreen accepted this revision.Oct 26 2022, 10:20 AM

Thanks for finding the problem. This LGTM now.

Or can we do it on Machine IR(AArch64MIPeepholeOpt) to avoid dead loop and also GISel can get benefit?

I was wondering what to do about optimizations that can be shared between GISel and SDAG. AArch64MIPeepholeOpt often feels like the wrong place (MachineCombiner might be better for some optimizations like this one, but that currently doesn't scale very well). We don't really want to write them in the wrong place if they would be better as combines during ISel. For now at least this patch looks good.

This revision is now accepted and ready to land.Oct 26 2022, 10:20 AM

Building Firefox with LTO for any aarch64 target (linux, macos, windows) enters in some sort of infinite loop since this change (linking never ends).

Building Firefox with LTO for any aarch64 target (linux, macos, windows) enters in some sort of infinite loop since this change (linking never ends).

reverted by rGdfb16bd5526b

Is it link time infinite loop? Somehow this change doesn't have any effect on link stage.
Can this reproducer work after I revert the code?

Thanks for reverting - I've also bisected down hangs to the same commit. For my case, it can be reproduced with https://martin.st/temp/aarch64-hang.c, with clang -target aarch64-linux-gnu -c -O2 aarch64-hang.c.

Is it link time infinite loop? Somehow this change doesn't have any effect on link stage.

In his case, he's doing LTO builds, so the compilation/codegen happens at link time. But my testcase might be smaller and easier to run.

Is it link time infinite loop? Somehow this change doesn't have any effect on link stage.

In his case, he's doing LTO builds, so the compilation/codegen happens at link time. But my testcase might be smaller and easier to run.

Thanks for the case.

bcl5980 updated this revision to Diff 471473.Oct 28 2022, 3:14 AM

Fix infinite loop caused by Z is one-use shift C also.
Add some more tests.

bcl5980 reopened this revision.Oct 28 2022, 3:16 AM

@dmgreen, should I send a new patch review for the fix, or still in this page?

This revision is now accepted and ready to land.Oct 28 2022, 3:16 AM
thakis added a subscriber: thakis.Oct 28 2022, 6:06 AM

We also bisected hangs due to this (arm64 on all of android fuchsia ios mac). It looks like you have a bunch of reproducers already, but there's one more at https://bugs.chromium.org/p/chromium/issues/detail?id=1379286#c7 if you're interested (it's just a file from boringssl – we saw this hang on many files. I'd guess, but haven't checked, that doing a bootstrap build of clang on an arm64 box would also trigger this.)

Thanks for the revert!

Thanks for all of your tests. And current patch can pass all the tests now.

dmgreen accepted this revision.Oct 31 2022, 1:10 AM

I see. Thanks for the update. Providing you have ran a bootstrap, this LGTM.

I see. Thanks for the update. Providing you have ran a bootstrap, this LGTM.

I'm sorry but I don't know what is the bootstrap. Can you give me some suggestion for that? Where can I get a opensource bootstrap and build?

Ah - it just means compiler the compiler with itself (on AArch64) and make sure it still passes all the tests. There are some details, including some handy cmake options in https://llvm.org/docs/AdvancedBuilds.html

bcl5980 added a comment.EditedOct 31 2022, 3:54 AM

Ah - it just means compiler the compiler with itself (on AArch64) and make sure it still passes all the tests. There are some details, including some handy cmake options in https://llvm.org/docs/AdvancedBuilds.html

Thanks for the explaination. I have already build and test llvm itself on MacbookAir with M2 processor.