Page MenuHomePhabricator

AaronLiu (Aaron H Liu)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 29 2020, 2:38 PM (163 w, 5 d)

Recent Activity

Sep 9 2022

AaronLiu added a comment to D131465: C++/ObjC++: switch to gnu++17 as the default standard.

This cause a large amount failures in our testing with errors such as:

error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
error: ISO C++17 does not allow dynamic exception specifications [-Wdynamic-exception-spec]

Sorry to hear that. Who is "our" in this case and do you have a suggestion for how you'd like to proceed?

Sep 9 2022, 10:59 AM · Restricted Project, Restricted Project, Restricted Project
AaronLiu added a comment to D131465: C++/ObjC++: switch to gnu++17 as the default standard.

This cause a large amount failures in our testing with errors such as:

error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
error: ISO C++17 does not allow dynamic exception specifications [-Wdynamic-exception-spec]
Sep 9 2022, 10:23 AM · Restricted Project, Restricted Project, Restricted Project

Jun 22 2022

AaronLiu added a comment to D128128: [GlobalOpt] Perform store->dominated load forwarding for stored once globals.

Reduced TC that fail with -O3 -flto:

int counter = 0;
Jun 22 2022, 9:42 AM · Restricted Project, Restricted Project

Apr 21 2022

AaronLiu added a comment to D119136: [clang] Implement Change scope of lambda trailing-return-type.

When compile the following valid testcase:

void foo()
{
        int x = [x](int y[sizeof x]){return sizeof x;}(0);
}

It will complain:

error: captured variable 'x' cannot appear here
        int x = [x](int y[sizeof x]){return sizeof x;}(0);
                                 ^

The issue is also described in: https://cplusplus.github.io/CWG/issues/2569.html

Is that something you encountered in existing code, or where you trying to write test against this change.
I only implemented the decltype part of the proposed resolution, for reason explained in that link (supporting sizeof(unqual-id) and `noexcept(unqual-id) seems extremely arbitrary).
But hopefully WG21 will come up with a more encompassing fix before we attempt to re-land this change

Apr 21 2022, 1:06 PM · Restricted Project, Restricted Project
AaronLiu added a comment to D119136: [clang] Implement Change scope of lambda trailing-return-type.

When compile the following valid testcase:

void foo()
{
        int x = [x](int y[sizeof x]){return sizeof x;}(0);
}

It will complain:

error: captured variable 'x' cannot appear here
        int x = [x](int y[sizeof x]){return sizeof x;}(0);
                                 ^

The issue is also described in: https://cplusplus.github.io/CWG/issues/2569.html

Apr 21 2022, 11:48 AM · Restricted Project, Restricted Project

Mar 25 2022

AaronLiu accepted D122497: [clang][tests][NFC] Add filescope array initialization test.

LGTM. Thanks!

Mar 25 2022, 1:51 PM · Restricted Project, Restricted Project

Feb 1 2022

AaronLiu added a comment to D118009: [ARM] Make getInstSizeInBytes() use instruction size from InstrInfo.td.

Looks like linking the following test caused many(57 for now) LoP-LE build failures (first build failure: https://lab.llvm.org/buildbot/#/builders/57/builds/14559):

186.515 [21/53/841] Linking CXX executable unittests/Target/ARM/ARMTests
FAILED: unittests/Target/ARM/ARMTests 
: && /home/buildbots/clang.11.0.0/bin/clang++ --gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG  -Wl,--gc-sections unittests/Target/ARM/CMakeFiles/ARMTests.dir/MachineInstrTest.cpp.o unittests/Target/ARM/CMakeFiles/ARMTests.dir/InstSizes.cpp.o  -o unittests/Target/ARM/ARMTests  -Wl,-rpath,/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/lib  lib/libLLVMARMCodeGen.so.14git  lib/libLLVMARMDesc.so.14git  lib/libLLVMARMInfo.so.14git  lib/libLLVMGlobalISel.so.14git  lib/libLLVMMIRParser.so.14git  lib/libLLVMSelectionDAG.so.14git  -lpthread  lib/libgtest_main.so.14git  lib/libgtest.so.14git  -lpthread  lib/libLLVMCodeGen.so.14git  lib/libLLVMTarget.so.14git  lib/libLLVMMC.so.14git  lib/libLLVMSupport.so.14git  -Wl,-rpath-link,/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/lib && :
/opt/rh/devtoolset-7/root/usr/lib/gcc/ppc64le-redhat-linux/7/../../../../bin/ld: unittests/Target/ARM/CMakeFiles/ARMTests.dir/InstSizes.cpp.o: undefined reference to symbol '_ZN4llvm10DataLayout5clearEv'
/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/lib/libLLVMCore.so.14git: error adding symbols: DSO missing from command line
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
289.769 [21/2/892] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/SourceCodeTest.cpp.o
ninja: build stopped: subcommand failed.
program finished with exit code 1
elapsedTime=424.974897
Feb 1 2022, 4:21 PM · Restricted Project

Sep 1 2020

AaronLiu committed rGd7e16ca28f48: [LV] Interleave to expose ILP for small loops with scalar reductions. (authored by AaronLiu).
[LV] Interleave to expose ILP for small loops with scalar reductions.
Sep 1 2020, 12:49 PM
AaronLiu closed D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..
Sep 1 2020, 12:49 PM · Unknown Object (Project), Restricted Project
AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Rebase to the latest master.

Sep 1 2020, 12:33 PM · Unknown Object (Project), Restricted Project

Aug 28 2020

AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Address review comments.

Aug 28 2020, 6:54 AM · Unknown Object (Project), Restricted Project

Aug 26 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Just to double check, wouldn't it be sufficient if loop-unroll would unroll the loop? Or is this not happening? It seems like loop-unroll would unroll the loop in the test-case.

It would unroll the loop in the testcase, but it will not break dependence and expose ILP, will not help performance, and actually unroll hurt performance in this case.

I didn't took a very close look, but wouldnt unrolling generate similar code as interleaving, modulo different order of instructions, but with similar compute instructions trees?

But I think unrolling with runtime trip counts can generate a bit more overhead around the loop than interleaving in this case, so interleaving should be better in that case.

Aug 26 2020, 8:21 AM · Unknown Object (Project), Restricted Project
AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Just to double check, wouldn't it be sufficient if loop-unroll would unroll the loop? Or is this not happening? It seems like loop-unroll would unroll the loop in the test-case.

It would unroll the loop in the testcase, but it will not break dependence and expose ILP, will not help performance, and actually unroll hurt performance in this case.

I didn't took a very close look, but wouldnt unrolling generate similar code as interleaving, modulo different order of instructions, but with similar compute instructions trees?

Also, it seems like the loop already gets interleaved on current trunk (With IC=2) and this patch makes it more aggressive. Might be good to describe how the IC gets boosted by this option.

Aug 26 2020, 8:20 AM · Unknown Object (Project), Restricted Project
AaronLiu added inline comments to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..
Aug 26 2020, 8:19 AM · Unknown Object (Project), Restricted Project

Aug 21 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Just to double check, wouldn't it be sufficient if loop-unroll would unroll the loop? Or is this not happening? It seems like loop-unroll would unroll the loop in the test-case.

Aug 21 2020, 10:55 AM · Unknown Object (Project), Restricted Project
AaronLiu added inline comments to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..
Aug 21 2020, 10:54 AM · Unknown Object (Project), Restricted Project

Aug 20 2020

AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Address review comments.

Aug 20 2020, 12:04 PM · Unknown Object (Project), Restricted Project

Aug 17 2020

AaronLiu added inline comments to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..
Aug 17 2020, 1:48 PM · Unknown Object (Project), Restricted Project

Aug 13 2020

AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Remove SLP comments.

Aug 13 2020, 12:02 PM · Unknown Object (Project), Restricted Project

Aug 12 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Currently there is no SLP involved, and this patch can still give a significant performance improvement for the benchmark.

Aug 12 2020, 7:51 AM · Unknown Object (Project), Restricted Project
AaronLiu retitled D81416: [LV] Interleave to expose ILP for small loops with scalar reductions. from [LV][SLP] Interleave to expose ILP for small loops with scalar reductions. to [LV] Interleave to expose ILP for small loops with scalar reductions..
Aug 12 2020, 7:50 AM · Unknown Object (Project), Restricted Project

Jul 8 2020

AaronLiu removed a reviewer for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions.: nikic.
Jul 8 2020, 9:10 AM · Unknown Object (Project), Restricted Project
AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Thanks! Will keep eyes on it in the future.

Jul 8 2020, 5:41 AM · Unknown Object (Project), Restricted Project

Jul 7 2020

AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

deleted: llvm/test/Transforms/PhaseOrdering/interleave_LV_SLP.ll
deleted: llvm/test/Transforms/PhaseOrdering/interleave_LV_SLP_false.ll
deleted: llvm/test/Transforms/SLPVectorizer/PowerPC/interleave_SLP.ll

Jul 7 2020, 2:37 PM · Unknown Object (Project), Restricted Project
AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

In that case, best understand why LV's cost model claims vectorizing the loop is not profitable, which you and SLP know it is; and ideally fix LV's cost model.
A crash due to forced vectorization sounds like a bug, which best be reported and/or fixed.
If cases with concrete "obstacles" are identified preventing LV from vectorizing a loop but allowing SLP to vectorize (part of) it, after LV interleaves the loop, such obstacles could potentially be used to (further) drive LV to interleave the loop.

Agree, ideally LV's cost model and its vectorization functionality should be improved in the future to be able to vectorize a lot more instructions.
We see some applications keep being crashed, due to some changes in LV and probably being fixed later on, or because of its own weakness in some aspects.
But all the above are beyond of this patch.
Currently, LV and SLP complement each other, and there are cases that LV fails to vectorize (functionally not being able to do it) but SLP succeed.

Hence the term "small loop" should be more specific; as in "vectorizer-min-trip-count" / "TinyTripCountVectorThreshold".

The "small or tiny" values are relative, and will keep on changing. In the situations we see, it is even more dynamic, the exact trip count is not known, but we know that it is relatively small.

Jul 7 2020, 2:15 PM · Unknown Object (Project), Restricted Project
AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

May I suggest we only test what this patch actually changes? This patch adds an option which when enabled allows interleaving of loops with small trip counts and scalar reductions, so it suffices to test exactly that. That should be covered by llvm/test/Transforms/LoopVectorize/PowerPC/interleave_IC.ll. I think the other test cases can be removed. IMHO adding more tests to make sure SLP vectorization happens (and the like) are redundant, add unnecessary maintenance in the future and are beyond the scope of what this patch is trying to do.

Jul 7 2020, 9:15 AM · Unknown Object (Project), Restricted Project
AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

In the application we try, LV refuse to vectorize due to not profitable, but if we force LV to vectorize and it will crash. Apparently there are some obstacles. There are cases that even if LV fails, SLP could succeed.
Yes, the term small loop is a little bit of confusing. For example a loop which has a small number of instructions but has a huge loop trip count, is the loop small or big? In our example, the loop trip count is small, and also the instruction number is small.

Jul 7 2020, 9:13 AM · Unknown Object (Project), Restricted Project

Jun 30 2020

AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Update testcases.

Jun 30 2020, 12:29 PM · Unknown Object (Project), Restricted Project
AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

In this patch we "Interleave to expose ILP". The whole purpose of this patch is to "expose ILP", and the approach is to "Interleave".

  • I worry about that if we remove those testcases, no vectorization(parallelism) results due to this patch can be seen, and people will have no idea where do we "expose ILP"?
  • We have ever discussed that what @spatel suggested adding two testcases under PhaseOrdering to show the baseline results with the option in this patch on and off "do make sense".
  • The purpose of adding lit tests is to catch future regressions, i.e., if someone make the vectorization not working, we will catch it with the testcases we added.
Jun 30 2020, 9:12 AM · Unknown Object (Project), Restricted Project

Jun 26 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Currently all four testcases serve different purposes, and we can clearly see their differences:

  • PhaseOrdering/interleave_LV_SLP_false.ll gives the baseline result, which shows that with the option in this patch off, instructions are not being vectorized.
  • PhaseOrdering/interleave_LV_SLP.ll also gives the baseline result, which shows that with the option in this patch on, instructions are being vectorized. But which vectorizer make it work? We cannot tell.
  • Vectorize/LoopVectorize.cpp tells us that LV cannot vectorize the code, but interleave the instructions to expose ILP.
  • SLPVectorizer/PowerPC/interleave_SLP.ll demonstrates that after interleaving by LV, then SLP captures the opportunities and vectorize the instructions.
Jun 26 2020, 9:17 AM · Unknown Object (Project), Restricted Project
AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Update three testcases, and add one more testcase to this patch.

Jun 26 2020, 9:17 AM · Unknown Object (Project), Restricted Project

Jun 24 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

I think all the four testcases are related to this patch, and I prefer to keep the testcase in this patch. The four testcases all serve different purposes:

  • PhaseOrdering/interleave_LV_SLP.ll shows that with the option in this patch on, instructions were vectorized. But which vectorizer make it work? We cannot tell.
  • Vectorize/LoopVectorize.cpp tells us that LV cannot vectorize the code, but interleave the instructions to expose ILP.
  • SLPVectorizer/PowerPC/interleave_SLP.ll demonstrates that after interleaving by LV, then SLP captures the opportunities and vectorize the instructions.
  • The one you added in PhaseOrdering/interleave-vectorization.ll show that without this patch(equivalently with the option in this patch off), the same testcase both LV and SLP cannot vectorize the instructions.

This still isn't quite what I was hoping for. Can you just add "-interleave-small-loop-scalar-reduction=true" to the RUN lines of the file I added and update the CHECK lines using the script? I want this patch to show *the diff* in IR for the proposed code change. I can't easily tell what is changing by comparing 2 different files.

Jun 24 2020, 2:41 PM · Unknown Object (Project), Restricted Project

Jun 23 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

I add a test: llvm/test/Transforms/PhaseOrdering/interleave_LV_SLP.ll
Please let me know whether that is what you wanted.

Thanks - that was the start of what I requested, but not complete. I've added the test here using auto-generated CHECK lines that show baseline (without this patch) results:
rGdf79443

Please rebase and update that file using the script at llvm/utils/update_test_checks.py.
I don't think we need to duplicate the test in the SLP folder now that we have coverage for this example in PhaseOrdering, but if you think that is still useful, that can be added independently of this patch.

Jun 23 2020, 2:32 PM · Unknown Object (Project), Restricted Project
AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..
Jun 23 2020, 2:32 PM · Unknown Object (Project), Restricted Project

Jun 22 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Ping...

Jun 22 2020, 9:09 AM · Unknown Object (Project), Restricted Project

Jun 17 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

IIUC, we should add a test under test/Transforms/PhaseOrdering with -O2 to show the cooperative effect of the 2 vectorizers rather than a stand-alone SLP test.
If you can push that test with full baseline CHECK lines and then apply this patch and show test diffs, that would make it much easier to tell what is intended with this patch.

Jun 17 2020, 5:17 PM · Unknown Object (Project), Restricted Project
AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Add llvm/test/Transforms/PhaseOrdering/interleave_LV_SLP.ll

Jun 17 2020, 5:17 PM · Unknown Object (Project), Restricted Project

Jun 16 2020

AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

We use PhaseOrdering tests to ensure that the end result of >1 IR pass (usually the entire pipeline of -O* settings) produces the expected result. That may be stretching the meaning of PhaseOrdering, but that would be less fragile than the stand-alone SLP test. This patch isn't changing anything in SLP, so the test you are adding to SLP is independent of this patch, right?

Jun 16 2020, 11:00 AM · Unknown Object (Project), Restricted Project
AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

IIUC, we should add a test under test/Transforms/PhaseOrdering with -O2 to show the cooperative effect of the 2 vectorizers rather than a stand-alone SLP test.
If you can push that test with full baseline CHECK lines and then apply this patch and show test diffs, that would make it much easier to tell what is intended with this patch.

Jun 16 2020, 9:54 AM · Unknown Object (Project), Restricted Project
AaronLiu added inline comments to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..
Jun 16 2020, 9:21 AM · Unknown Object (Project), Restricted Project
AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Update comments to address code review.

Jun 16 2020, 8:16 AM · Unknown Object (Project), Restricted Project
AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Add comments to address code review.

Jun 16 2020, 7:42 AM · Unknown Object (Project), Restricted Project

Jun 15 2020

AaronLiu added inline comments to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..
Jun 15 2020, 1:14 PM · Unknown Object (Project), Restricted Project
AaronLiu added a comment to D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

Ping...

Jun 15 2020, 8:40 AM · Unknown Object (Project), Restricted Project

Jun 10 2020

AaronLiu added a comment to D80175: [PowerPC][MachineCombiner] reassociate fma to expose more ILP.

Can you add a hidden option with init false? You can turn it true later on.
So that people can try with your option off and on? Thanks!

This is not adding a new optimization but is increasing the scope of an existing optimization. So I don't think it is appropriate to add an option to control only this aspect of this. If there is a use case for allowing the user to have fine grained control over which combiner patterns to use, then we can add an enum option. But we need to have justification for why that is needed.

OTOH, I am not opposed to adding an option to turn off the machine combiner (i.e. change the value returned by PPCInstrInfo::useMachineCombiner()) - but that is orthogonal to this patch and can be done separately.

Jun 10 2020, 9:56 AM · Restricted Project
AaronLiu added a comment to D80175: [PowerPC][MachineCombiner] reassociate fma to expose more ILP.

Can you add a hidden option with init false? You can turn it true later on.
So that people can try with your option off and on? Thanks!

Have you met some issues with this being turned on? This is specific for PowerPC, I have verified that on PowerPC there is no deg for the patch on the benchmarks I run. If you just want to see the impact of this patch, I would suggest you comment out the two newly added lines in function PPCInstrInfo::getMachineCombinerPatterns, it will turn off this opt.

Jun 10 2020, 7:04 AM · Restricted Project

Jun 9 2020

AaronLiu added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

I test it on PowerPC, I request it to be tested by the community on different platform. Currently it is disabled by default, and will enable it in the next step.

Jun 9 2020, 3:31 PM · Restricted Project
AaronLiu added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

We have observed some performance regressions, presumably because the vectorized code started to kick-in on short estimated trip count loops (as opposed to skipping vector code and execute scalar code). We'll try following up with cost model tuning. I'm not too surprised if others also hit a similar issue. Overall, though, that's the right direction to head to.

This was fixed?

Jun 9 2020, 2:21 PM · Restricted Project
AaronLiu added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

Accidentally removed the message that I posted above. Re-post here: basically what I want say is to request reviewers for this patch to review another patch D81416 that touch the same file. Thanks!

Jun 9 2020, 12:07 PM · Restricted Project

Jun 8 2020

AaronLiu added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..
Jun 8 2020, 11:25 PM · Restricted Project
AaronLiu added a comment to D80175: [PowerPC][MachineCombiner] reassociate fma to expose more ILP.

Can you add a hidden option with init false? You can turn it true later on.
So that people can try with your option off and on? Thanks!

Jun 8 2020, 11:09 PM · Restricted Project
AaronLiu updated the diff for D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..

A little bit of format change.

Jun 8 2020, 2:25 PM · Unknown Object (Project), Restricted Project
AaronLiu created D81416: [LV] Interleave to expose ILP for small loops with scalar reductions..
Jun 8 2020, 12:07 PM · Unknown Object (Project), Restricted Project