jtony (Tony Jiang)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 21 2016, 3:07 PM (51 w, 5 d)

Recent Activity

Yesterday

jtony accepted D38705: [PPC CodeGen] Fix the bitreverse.i64 intrinsic..

This patch now LGTM.

Wed, Oct 18, 8:34 AM
jtony added a comment to D38705: [PPC CodeGen] Fix the bitreverse.i64 intrinsic..

Added the comment suggested by jtony.

The test can be updated to something like

for (int i = 0; i < NUM; ++i) {
  sum1 += ReverseBits32(i);
  sum2 += ReverseBits64(i);
}
for (int i = 0; i < NUM; ++i) {
  sum1 -= __builtin_bitreverse32(i);
  sum2 -= __builtin_bitreverse64(i);
}
return sum1 == 0 && sum2 == 0 ? 0 : 1;

But I use git, and the documentation does not mention how to use arc diff to update a file that lives in another repo (./projects/test-suite/).

% find . -name .git
./.git
./projects/libcxx/.git
./projects/test-suite/.git
./projects/compiler-rt/.git
./projects/libcxxabi/.git
./tools/clang/.git
./tools/clang/tools/extra/.git

Wed, Oct 18, 8:32 AM
jtony added inline comments to D38962: [test-suite] Update bitreverse benchmark..
Wed, Oct 18, 8:29 AM

Mon, Oct 16

jtony added inline comments to D38705: [PPC CodeGen] Fix the bitreverse.i64 intrinsic..
Mon, Oct 16, 12:28 PM

Sun, Oct 15

jtony added a comment to D38705: [PPC CodeGen] Fix the bitreverse.i64 intrinsic..

This is my omission in the implementation. All the 1-bit, 2-bit, 4-bit, and byte groups are swapped, but the high word and low word are not, which is the root cause of the problem. The test case projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.c probably didn't catch it because LLVM OPT optimizes ReverseBits64(__builtin_bitreverse64(i)) to just i, so the code I added is never exercised. (I retested it: at -O2 it generates the same result as gcc, but without optimization it generates the wrong result.) I also did some other functional testing on my own machine before committing, but I used the value 0x5555555555555555, which was not general enough, so it didn't catch this subtle bug either. I should have used a more general test case. I suggest strengthening the test case in the benchmark (https://reviews.llvm.org/D35188) to catch this: instead of adding ReverseBits64(__builtin_bitreverse64(i)) to the sum, add ReverseBits64 and __builtin_bitreverse64 to the sum separately, so that LLVM OPT cannot recognize a reversal of a reversal.
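The point about 0x5555555555555555 can be illustrated with a small sketch. It is not the PPC lowering itself: reverse64_no_word_swap is a hypothetical software model of the bug described above (each 32-bit word is reversed, but the two words are not exchanged), and all function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bits of a 32-bit value: swap 1-bit, 2-bit, 4-bit groups,
   then bytes, then the two half-words. */
static uint32_t reverse32(uint32_t x) {
  x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
  x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
  x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
  x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
  return (x >> 16) | (x << 16);
}

/* Correct 64-bit reversal: reverse each word, then swap the two words. */
static uint64_t reverse64_correct(uint64_t x) {
  uint64_t lo = reverse32((uint32_t)x);
  uint64_t hi = reverse32((uint32_t)(x >> 32));
  return (lo << 32) | hi;
}

/* Hypothetical model of the bug: each word is reversed in place,
   but the high and low words are NOT swapped. */
static uint64_t reverse64_no_word_swap(uint64_t x) {
  uint64_t lo = reverse32((uint32_t)x);
  uint64_t hi = reverse32((uint32_t)(x >> 32));
  return (hi << 32) | lo;
}

/* A value whose two words are identical hides the missing swap;
   a value with distinct words exposes it. */
void check_bitreverse_inputs(void) {
  const uint64_t weak = 0x5555555555555555ULL;
  const uint64_t general = 0x0123456789ABCDEFULL;
  assert(reverse64_no_word_swap(weak) == reverse64_correct(weak));
  assert(reverse64_no_word_swap(general) != reverse64_correct(general));
}
```

Any input whose high and low words are equal behaves like 0x5555555555555555 here, which is why summing over a range of distinct values is a stronger check.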

Sun, Oct 15, 7:05 PM

Thu, Oct 12

jtony added inline comments to D38486: [PPC] Implement the heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st..
Thu, Oct 12, 1:42 PM
jtony updated the diff for D38486: [PPC] Implement the heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st..

Address comments from Nemanja.
(1) Factor the common code for handling the D-Form and X-Form LD/ST heuristic into a function.
(2) Fix the missing predicate guards (HasVectorP8(), AddedComplexity=400).

Thu, Oct 12, 1:38 PM

Mon, Oct 2

jtony added inline comments to D38486: [PPC] Implement the heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st..
Mon, Oct 2, 6:33 PM
jtony created D38486: [PPC] Implement the heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st..
Mon, Oct 2, 6:32 PM
jtony abandoned D38099: Fix crashes with -fprofile-use + new pass manager in queens.c from the testsuite (bugzilla bug 33776).

Thanks for your contribution.
Some notes:

  1. Can you please update the patch to include context (see the doc)?
  2. Can you please add a test? I attached one to the PR.
  3. Can you please explain the rationale behind this fix? I'm afraid it's just papering over the problem rather than solving it.
Mon, Oct 2, 6:10 PM

Thu, Sep 21

jtony updated the summary of D38099: Fix crashes with -fprofile-use + new pass manager in queens.c from the testsuite (bugzilla bug 33776).
Thu, Sep 21, 10:13 AM

Wed, Sep 20

jtony created D38099: Fix crashes with -fprofile-use + new pass manager in queens.c from the testsuite (bugzilla bug 33776).
Wed, Sep 20, 2:26 PM

Tue, Sep 19

jtony committed rL313639: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD..
Tue, Sep 19, 9:16 AM
jtony closed D36734: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD. by committing rL313639: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD..
Tue, Sep 19, 9:16 AM
jtony closed D37851: [Power9] Add missing Power9 instructions..
Tue, Sep 19, 9:10 AM
jtony committed rL313636: [Power9] Add missing Power9 instructions..
Tue, Sep 19, 8:24 AM

Sep 18 2017

jtony updated the diff for D37851: [Power9] Add missing Power9 instructions..

Address all the comments from Nemanja.

Sep 18 2017, 2:50 PM
jtony added inline comments to D37851: [Power9] Add missing Power9 instructions..
Sep 18 2017, 2:25 PM

Sep 15 2017

jtony updated the diff for D37851: [Power9] Add missing Power9 instructions..

Delete the p9-instrs.txt file, since all the needed instructions in it have been implemented.

Sep 15 2017, 7:44 AM

Sep 14 2017

jtony created D37851: [Power9] Add missing Power9 instructions..
Sep 14 2017, 6:22 AM

Sep 7 2017

jtony updated the diff for D36734: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD..

Address comments from Hiroshi and rebase on trunk.

Sep 7 2017, 4:34 PM
jtony added inline comments to D36734: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD..
Sep 7 2017, 11:44 AM

Sep 5 2017

jtony committed rL312547: [PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it….
Sep 5 2017, 11:10 AM

Aug 30 2017

jtony updated the diff for D36734: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD..

Refactor the implementation according to Nemanja's suggestion (get rid of the large lambda function replaceLiWithAddi, and inline replaceAddWithCopy)

Aug 30 2017, 6:25 AM

Aug 24 2017

jtony added inline comments to D36734: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD..
Aug 24 2017, 12:29 PM
jtony updated the diff for D36734: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD..

Address the comments from Nemanja.

Aug 24 2017, 12:29 PM

Aug 16 2017

jtony committed rL311010: Add bitreverse LNT benchmark..
Aug 16 2017, 8:09 AM
jtony closed D35188: Add bitreverse LNT benchmark. by committing rL311010: Add bitreverse LNT benchmark..
Aug 16 2017, 8:09 AM

Aug 15 2017

jtony created D36734: [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD..
Aug 15 2017, 6:26 AM

Aug 14 2017

jtony updated the diff for D35188: Add bitreverse LNT benchmark..

Address one more comment from Hal (add a check for compilers that don't support __has_builtin).
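A minimal sketch of such a check, assuming the common guard idiom rather than the exact code in the benchmark: when the compiler does not provide __has_builtin, define it to 0 so the portable fallback is selected. The names reverse_bits32 and reverse_bits32_portable are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compilers without __has_builtin would fail on "#if __has_builtin(...)",
   so define it to 0 for them and fall through to the portable path. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Portable fallback: shift bits out of x and into result, one at a time. */
static uint32_t reverse_bits32_portable(uint32_t x) {
  uint32_t result = 0;
  for (int i = 0; i < 32; ++i) {
    result = (result << 1) | (x & 1u);
    x >>= 1;
  }
  return result;
}

/* Use the builtin only when the compiler reports that it exists. */
static uint32_t reverse_bits32(uint32_t x) {
#if __has_builtin(__builtin_bitreverse32)
  return __builtin_bitreverse32(x);
#else
  return reverse_bits32_portable(x);
#endif
}

/* Both paths must agree, whichever one the compiler selected. */
void check_reverse_bits32(void) {
  assert(reverse_bits32(1u) == 0x80000000u);
  assert(reverse_bits32(0x12345678u) == reverse_bits32_portable(0x12345678u));
}
```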

Aug 14 2017, 2:01 PM

Aug 13 2017

jtony updated the diff for D35188: Add bitreverse LNT benchmark..

Address comments from Hal Finkel.

Aug 13 2017, 6:20 PM

Jul 26 2017

jtony added inline comments to D35907: [StackColoring] Update AliasAnalysis information in stack coloring pass.
Jul 26 2017, 1:31 PM

Jul 24 2017

jtony added a comment to D35188: Add bitreverse LNT benchmark..

Kindly ping.

Jul 24 2017, 1:11 PM

Jul 11 2017

jtony committed rL307691: [PPC] Fix one test case regression for patch https://reviews.llvm.org/D34337..
Jul 11 2017, 12:07 PM
jtony committed rL307672: [PPC] Fix two bugs in frame lowering..
Jul 11 2017, 9:42 AM
jtony closed D34337: [PPC] Fix two bugs in frame lowering. by committing rL307672: [PPC] Fix two bugs in frame lowering..
Jul 11 2017, 9:42 AM

Jul 10 2017

jtony added inline comments to D35188: Add bitreverse LNT benchmark..
Jul 10 2017, 1:16 PM
jtony updated the diff for D35188: Add bitreverse LNT benchmark..

(1) Also test builtin_bitreverse32 and builtin_bitreverse64.
(2) Add the missing newline at the end of the second printf.

Jul 10 2017, 1:16 PM
jtony committed rL307563: [PPC CodeGen] Expand the bitreverse.i64 intrinsic..
Jul 10 2017, 11:11 AM
jtony closed D34908: [PPC CodeGen] Expand the bitreverse.i64 intrinsic. by committing rL307563: [PPC CodeGen] Expand the bitreverse.i64 intrinsic..
Jul 10 2017, 11:11 AM
jtony updated the summary of D35188: Add bitreverse LNT benchmark..
Jul 10 2017, 6:48 AM

Jul 9 2017

jtony created D35188: Add bitreverse LNT benchmark..
Jul 9 2017, 4:18 PM
jtony updated the diff for D34337: [PPC] Fix two bugs in frame lowering..

Address comments from Hiroshi.

Jul 9 2017, 4:01 PM
jtony added inline comments to D35007: [PowerPC] Do not emit displacements for DQ-Form instructions that aren't multiples of 16.
Jul 9 2017, 6:15 AM

Jul 7 2017

jtony committed rL307413: [PPC CodeGen] Expand the bitreverse.i32 intrinsic..
Jul 7 2017, 9:42 AM
jtony closed D33572: [PPC CodeGen] Expand the bitreverse.i32 intrinsic. by committing rL307413: [PPC CodeGen] Expand the bitreverse.i32 intrinsic..
Jul 7 2017, 9:42 AM

Jul 6 2017

jtony added inline comments to D34337: [PPC] Fix two bugs in frame lowering..
Jul 6 2017, 12:58 PM
jtony updated the diff for D34337: [PPC] Fix two bugs in frame lowering..

(1) Refactor the messy red zone guard conditions to make them more readable.
(2) Differentiate the red zone behavior for DarwinABI and Non-DarwinABI (SVR4ABI).
(3) Address comments from Hal.

Jul 6 2017, 11:25 AM
jtony added inline comments to D35027: [PowerPC] Reduce register pressure by not materializing a constant just for use as an index register for X-Form loads/stores.
Jul 6 2017, 11:10 AM

Jul 5 2017

jtony committed rL307174: [Power9] Exploit vector extract with variable index..
Jul 5 2017, 9:55 AM
jtony closed D34032: [Power9] Exploit vector extract with variable index by committing rL307174: [Power9] Exploit vector extract with variable index..
Jul 5 2017, 9:55 AM
jtony committed rL307169: [Power9] Exploit vector integer extend instructions when indices aren't correct..
Jul 5 2017, 9:01 AM
jtony closed D34009: [Power9] Exploit vector integer extend instructions when indices aren't correct by committing rL307169: [Power9] Exploit vector integer extend instructions when indices aren't correct..
Jul 5 2017, 9:01 AM

Jun 30 2017

jtony created D34908: [PPC CodeGen] Expand the bitreverse.i64 intrinsic..
Jun 30 2017, 2:08 PM

Jun 29 2017

jtony added inline comments to D34627: [Power9] Disable removing extra swaps on P9 since it should not be needed..
Jun 29 2017, 9:47 AM

Jun 28 2017

jtony updated the diff for D33572: [PPC CodeGen] Expand the bitreverse.i32 intrinsic..

Address comments from Hal Finkel and add one more IR test case to cover the original situation in Bugzilla (the IR is an equivalent form of the fast bit-reverse, but NOT the intrinsic).

Jun 28 2017, 11:25 AM
jtony retitled D33572: [PPC CodeGen] Expand the bitreverse.i32 intrinsic. from [PPC] Implement fast bit reverse in PPCDAGToDAGISel to [PPC CodeGen] Expand the bitreverse.i32 intrinsic..
Jun 28 2017, 8:54 AM

Jun 26 2017

jtony updated the diff for D33572: [PPC CodeGen] Expand the bitreverse.i32 intrinsic..

Re-implement this patch according to Hal's comments.
Note that this is the first patch of the CodeGen part for the intrinsic llvm.bitreverse.i32.
There will be a follow-up patch to implement the intrinsic llvm.bitreverse.i64,
and another patch to do idiom recognition in llvm opt to generate llvm.bitreverse.

Jun 26 2017, 6:50 PM
jtony commandeered D33572: [PPC CodeGen] Expand the bitreverse.i32 intrinsic..
Jun 26 2017, 6:23 PM

Jun 21 2017

jtony updated the summary of D34337: [PPC] Fix two bugs in frame lowering..
Jun 21 2017, 7:58 AM
jtony updated the diff for D34337: [PPC] Fix two bugs in frame lowering..

Address the comments from Nemanja and Stefan.
Add one test case to test the alignment calculation change.

Jun 21 2017, 6:50 AM

Jun 19 2017

jtony added inline comments to D34337: [PPC] Fix two bugs in frame lowering..
Jun 19 2017, 8:59 AM
jtony updated the summary of D34337: [PPC] Fix two bugs in frame lowering..
Jun 19 2017, 6:22 AM
jtony updated the summary of D34337: [PPC] Fix two bugs in frame lowering..
Jun 19 2017, 6:22 AM
jtony updated the summary of D34337: [PPC] Fix two bugs in frame lowering..
Jun 19 2017, 6:21 AM

Jun 18 2017

jtony created D34337: [PPC] Fix two bugs in frame lowering..
Jun 18 2017, 6:53 PM

Jun 16 2017

jtony added inline comments to D34160: [Power9] Exploit vinserth instruction.
Jun 16 2017, 7:55 AM

Jun 15 2017

jtony added inline comments to D34160: [Power9] Exploit vinserth instruction.
Jun 15 2017, 1:55 PM

Jun 14 2017

jtony committed rL305401: [PPC] Enhance altivec conversion function macros implementation..
Jun 14 2017, 10:24 AM
jtony closed D34092: [PPC] Enhance altivec conversion function macros implementation. by committing rL305401: [PPC] Enhance altivec conversion function macros implementation..
Jun 14 2017, 10:24 AM

Jun 12 2017

jtony committed rL305214: [PowerPC] Match vec_revb builtins to P9 instructions..
Jun 12 2017, 11:25 AM
jtony closed D33690: [PowerPC] Match vec_revb builtins to P9 instructions. by committing rL305214: [PowerPC] Match vec_revb builtins to P9 instructions..
Jun 12 2017, 11:25 AM
jtony committed rL305210: [Power9] Added support for the modsw, moduw, modsd, modud hardware instructions..
Jun 12 2017, 10:59 AM
jtony closed D33940: [Power9] Added support for the modsw, moduw, modsd, modud hardware instructions that are new to P9. by committing rL305210: [Power9] Added support for the modsw, moduw, modsd, modud hardware instructions..
Jun 12 2017, 10:59 AM
jtony retitled D34092: [PPC] Enhance altivec conversion function macros implementation. from [PPC] Check the second parameter of altivec conversion function is literal. to [PPC] Enhance altivec conversion function macros implementation..
Jun 12 2017, 6:40 AM

Jun 11 2017

jtony created D34092: [PPC] Enhance altivec conversion function macros implementation..
Jun 11 2017, 1:44 PM

Jun 8 2017

jtony added inline comments to D33940: [Power9] Added support for the modsw, moduw, modsd, modud hardware instructions that are new to P9..
Jun 8 2017, 11:08 AM
jtony added inline comments to D33690: [PowerPC] Match vec_revb builtins to P9 instructions..
Jun 8 2017, 4:41 AM
jtony updated the diff for D33690: [PowerPC] Match vec_revb builtins to P9 instructions..

Address Kit's comments.

Jun 8 2017, 4:26 AM

Jun 2 2017

jtony updated the summary of D33690: [PowerPC] Match vec_revb builtins to P9 instructions..
Jun 2 2017, 7:36 AM

May 31 2017

jtony added a comment to D33690: [PowerPC] Match vec_revb builtins to P9 instructions..

I am debating internally about suggesting a potential solution for this, but this implementation essentially misses an entire set of complementary shuffle masks. We have a number of instructions that do element-wise reordering and we now have these that do per-element byte reversal. Combining the capabilities covers a lot more shuffles - I am just not positive that these occur enough to warrant the effort.

Here's what I mean:

  • We have an instruction that will do a "rotate-left-by-word" operation on a vector (and a way to emit that instruction)
  • We now have a "reverse-bytes-within-word-elements" operation
  • We don't have a "reverse-bytes-within-each-word-and-rotate-left-by-word", which we can simply do with a two-instruction sequence now

    And of course, the same goes for all other masks we lower to a single instruction. It might be useful for each of them to detect byte-reversal as well. It is non-trivial work, but doesn't sound fundamentally all that hard. Perhaps we should re-design our handling of shuffles at some point and have a robust way to determine what we can lower to a one- or two-instruction sequence on any Subtarget.
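The combined-mask idea can be sketched in plain C. This is only an abstract model over byte indices, with the convention out[i] = src[mask[i]]; it ignores endianness and does not reflect the actual PowerPC instruction semantics or LLVM's shuffle-lowering interfaces. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16u
#define WORD_BYTES 4u

/* Mask for rotating the vector left by `words` 4-byte words. */
static void make_rotate_left_words(uint8_t mask[VEC_BYTES], unsigned words) {
  for (unsigned i = 0; i < VEC_BYTES; ++i)
    mask[i] = (uint8_t)((i + WORD_BYTES * words) % VEC_BYTES);
}

/* Mask for reversing the bytes inside each 4-byte word. */
static void make_reverse_bytes_in_words(uint8_t mask[VEC_BYTES]) {
  for (unsigned i = 0; i < VEC_BYTES; ++i)
    mask[i] = (uint8_t)((i & ~3u) | (3u - (i & 3u)));
}

/* Mask equivalent to applying `first`, then `second`. */
static void compose(uint8_t out[VEC_BYTES], const uint8_t first[VEC_BYTES],
                    const uint8_t second[VEC_BYTES]) {
  for (unsigned i = 0; i < VEC_BYTES; ++i)
    out[i] = first[second[i]];
}

/* Returns the rotate amount in words (0..3) if `mask` equals a word rotate
   followed by a per-word byte reversal, or -1 if it matches none of them. */
static int match_rotate_then_byte_reverse(const uint8_t mask[VEC_BYTES]) {
  uint8_t rot[VEC_BYTES], rev[VEC_BYTES], both[VEC_BYTES];
  make_reverse_bytes_in_words(rev);
  for (unsigned words = 0; words < VEC_BYTES / WORD_BYTES; ++words) {
    make_rotate_left_words(rot, words);
    compose(both, rot, rev);
    if (memcmp(mask, both, VEC_BYTES) == 0)
      return (int)words;
  }
  return -1;
}

/* A mask built from the two primitives is recognized as the two-step pair. */
void check_combined_mask(void) {
  uint8_t rot[VEC_BYTES], rev[VEC_BYTES], both[VEC_BYTES];
  make_rotate_left_words(rot, 1);
  make_reverse_bytes_in_words(rev);
  compose(both, rot, rev);
  assert(match_rotate_then_byte_reverse(both) == 1);
}
```

In this model a mask that no single primitive matches is still recognized as a two-step sequence, which is the kind of detection the comment suggests adding for each single-instruction mask.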
May 31 2017, 7:22 PM
jtony updated the diff for D33690: [PowerPC] Match vec_revb builtins to P9 instructions..

Address comments from Nemanja.

May 31 2017, 7:20 PM
jtony added inline comments to D33690: [PowerPC] Match vec_revb builtins to P9 instructions..
May 31 2017, 12:53 PM
jtony committed rL304298: [PowerPC] Fix a performance bug for PPC::XXPERMDI..
May 31 2017, 6:10 AM
jtony closed D33404: [PowerPC] Fix a performance bug for PPC::XXPERMDI. by committing rL304298: [PowerPC] Fix a performance bug for PPC::XXPERMDI..
May 31 2017, 6:10 AM

May 30 2017

jtony added inline comments to D33656: [PowerPC] Correctly specify the cache line size for Power 7, 8 and 9..
May 30 2017, 6:06 PM
jtony created D33690: [PowerPC] Match vec_revb builtins to P9 instructions..
May 30 2017, 1:27 PM

May 28 2017

jtony added inline comments to D33404: [PowerPC] Fix a performance bug for PPC::XXPERMDI..
May 28 2017, 12:56 PM
jtony updated the diff for D33404: [PowerPC] Fix a performance bug for PPC::XXPERMDI..

Address comments from Hal and Nemanja.

May 28 2017, 12:56 PM

May 25 2017

jtony updated the diff for D33404: [PowerPC] Fix a performance bug for PPC::XXPERMDI..

Address comments from Nemanja and Kit.

May 25 2017, 2:32 PM
jtony added inline comments to D33404: [PowerPC] Fix a performance bug for PPC::XXPERMDI..
May 25 2017, 1:08 PM
jtony closed D33225: [PowerPC] Fix a performance bug for PPC::XXSLDWI..
May 25 2017, 8:47 AM

May 24 2017

jtony committed rL303822: [PowerPC] Fix a performance bug for PPC::XXSLDWI..
May 24 2017, 4:49 PM
jtony committed rL303786: Fix one test case faiulre in commit 303766..
May 24 2017, 11:12 AM
jtony committed rL303766: [PowerPC] Implement vec_xxsldwi builtin..
May 24 2017, 8:54 AM
jtony closed D33236: [PowerPC] Implement vec_xxsldwi builtin. by committing rL303766: [PowerPC] Implement vec_xxsldwi builtin..
May 24 2017, 8:54 AM
jtony committed rL303760: [PowerPC] Implement vec_xxpermdi builtin..
May 24 2017, 8:14 AM
jtony closed D33053: [PowerPC] Implement vec_xxpermdi builtin. by committing rL303760: [PowerPC] Implement vec_xxpermdi builtin..
May 24 2017, 8:13 AM
jtony committed rL303753: Generalize two diagnostic messages to take function name as parameter..
May 24 2017, 7:46 AM

May 23 2017

jtony updated the diff for D33236: [PowerPC] Implement vec_xxsldwi builtin..

Address minor comments from Nemanja and Hiroshi.

May 23 2017, 12:45 PM