User Details
- User Since
- Sep 9 2019, 7:00 PM (83 w, 3 d)
Wed, Apr 14
Thanks for your review. Hope this patch won't cause too many ABI issues in the future.
Tue, Apr 13
Address Craig's comments
Address Simon's comments
Address Craig's comments
- Rebase;
- Emit unaligned move in ISEL;
- Only do the conversion on AVX machines.
Mon, Apr 12
I'm more concerned about the cases that don't use long double and aren't a matter of missed constant folding. https://godbolt.org/z/qGv54ef34
Thu, Apr 8
Wed, Apr 7
Mon, Apr 5
Address Pengfei's comments
Thu, Apr 1
Ping?
Rebase and avoid using 'byval' parameter.
Tue, Mar 30
Mon, Mar 29
Mar 9 2021
Mar 8 2021
Address Pengfei's comments
Address comments
Mar 7 2021
Address Pengfei's comment.
Rebase and address Pengfei and Craig's comments.
Mar 5 2021
Mar 4 2021
Feb 24 2021
I don't know why the pre-merge checks failed. check-all passes locally for me on Red Hat 8, and I don't have a Debian machine to reproduce the problem.
Address Pengfei and Yuanke's comments. We don't need more tile types.
Feb 23 2021
Adding back 'avx512f' to amx-tile-basic.ll
Feb 22 2021
Dec 28 2020
The threshold above applies to the number of MIs; BB->size() returns the number of instructions in the BB. I committed 31c2b93d83f63ce7f9bb4977f58de2e00bf18e0f to further reduce compile time. Please give it a try.
Dec 27 2020
I can certainly do that, but I think the most reasonable solution is to fix the compile-time issue. Since the compile-time tests I ran earlier did not expose any regression, your test case must be somewhat unusual. Could you find out what is special about it, for example whether the function has too many blocks, or whether some blocks in the function have too many instructions? Thanks.
Dec 22 2020
Yes, we were aware that this patch might introduce compile-time regressions. As you can see in the previous comments, I already tested compile time on X86. Unfortunately, the benchmarks I tested didn't expose any regressions.
Could you please send me the function/IR that regresses, so that I can look into how to fix it? Thanks.
Hi, @shchenz. Several of our OpenCL benchmarks show a large compile-time regression.
For one function alone, the time spent in the Machine code sinking pass increased from 6.0711s to 366.5713s.
Given your algorithm, this patch will clearly increase compile time in some cases.
Dec 16 2020
Nov 20 2020
Nov 17 2020
It allows more than two, right? For example, a {vex}{vex2}{vex3} instruction. I think this should be considered a bug for AT&T.
- Delete IsPrefix parameter, and delete 'break'
- Check prefix, ignoring case
- Delete the IsPrefix parameter and the 'break', so that we won't check the prefix again. I am not sure this is right: AT&T syntax allows two prefixes and uses the last one as the final encoding prefix, which I think may not be the original intent of the design.
- Change the test: check the IR instead of the assembly.
- Made some format adjustments.
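The behaviour discussed above (matching a prefix while ignoring case, and letting the last of several leading prefixes decide the encoding) can be illustrated with a small standalone sketch. This is not the LLVM assembly parser: `EncodingPrefix`, `matchPrefix`, and `lastPrefix` are hypothetical names, and the sketch assumes the prefixes appear back to back with no whitespace between them.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical set of encoding prefixes; not LLVM's actual enum.
enum class EncodingPrefix { Vex, Vex2, Vex3, Evex };

// Map one prefix name to an enum value, ignoring case.
std::optional<EncodingPrefix> matchPrefix(std::string_view Name) {
  std::string Lower(Name);
  std::transform(Lower.begin(), Lower.end(), Lower.begin(),
                 [](unsigned char C) {
                   return static_cast<char>(std::tolower(C));
                 });
  if (Lower == "vex")  return EncodingPrefix::Vex;
  if (Lower == "vex2") return EncodingPrefix::Vex2;
  if (Lower == "vex3") return EncodingPrefix::Vex3;
  if (Lower == "evex") return EncodingPrefix::Evex;
  return std::nullopt;
}

// Consume every leading "{name}" token and return the last recognised one.
// Stops at the first token that is malformed or not a known prefix.
std::optional<EncodingPrefix> lastPrefix(std::string_view Text) {
  std::optional<EncodingPrefix> Result;
  while (!Text.empty() && Text.front() == '{') {
    std::size_t Close = Text.find('}');
    if (Close == std::string_view::npos)
      break;
    std::optional<EncodingPrefix> P = matchPrefix(Text.substr(1, Close - 1));
    if (!P)
      break;
    Result = P;                     // a later prefix overrides an earlier one
    Text.remove_prefix(Close + 1);  // continue after the closing brace
  }
  return Result;
}
```

With this sketch, `lastPrefix("{vex}{vex2}{vex3} vpdpbusd")` yields `EncodingPrefix::Vex3`, which mirrors the "last one wins" behaviour questioned in the comment. Rejecting a second prefix instead would be the alternative if that behaviour is indeed unintended.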
Ping?
Nov 11 2020
Rebase.
Add the '{}' to the prefix when generating IR.
Nov 4 2020
- Address comments;
- Only support parsing vex/vex2/vex3/evex prefix for MASM
Nov 2 2020
Oct 30 2020
Thanks for all of your reviews!
- Move the testcase from avx-vnni/ to test/CodeGen/X86/.
- Refine the RUN line in avx_vnni-intrinsics.ll
Oct 29 2020
Address comments
Oct 28 2020
Ping?
Oct 26 2020
Oct 25 2020
Address comments.
Adding avxvnni to Alder Lake.
- Move commonvnniintrin.h to avx512vlvnniintrin.h.
- Move the testcase avxvnni-builtins.c to the X86 subdirectory.