
anhtuyen (Anh Tuyen Tran)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 8 2018, 8:35 AM (86 w, 3 d)

Recent Activity

Tue, Jun 9

anhtuyen committed rGe7c5412b3731: [NFC][LV][TEST]: extend pr45679-fold-tail-by-masking.ll with -force-vector… (authored by anhtuyen).
Tue, Jun 9, 11:35 AM
anhtuyen closed D80446: [NFC][LV][TEST]: extend pr45679-fold-tail-by-masking.ll with a run of -force-vector-width=1 -force-vector-interleave=4.
Tue, Jun 9, 11:34 AM · Restricted Project

Sun, Jun 7

anhtuyen updated the diff for D80446: [NFC][LV][TEST]: extend pr45679-fold-tail-by-masking.ll with a run of -force-vector-width=1 -force-vector-interleave=4.

Thank you very much Ayal for your suggestion. I adjusted the file accordingly.

Sun, Jun 7, 5:03 PM · Restricted Project
anhtuyen added a comment to D80446: [NFC][LV][TEST]: extend pr45679-fold-tail-by-masking.ll with a run of -force-vector-width=1 -force-vector-interleave=4.

Hello Ayal @Ayal, when you have some time, could you please review this patch to the testcases? Thank you very much!

Sun, Jun 7, 12:46 PM · Restricted Project

May 22 2020

anhtuyen added inline comments to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.
May 22 2020, 9:38 AM · Restricted Project
anhtuyen created D80446: [NFC][LV][TEST]: extend pr45679-fold-tail-by-masking.ll with a run of -force-vector-width=1 -force-vector-interleave=4.
May 22 2020, 9:38 AM · Restricted Project
anhtuyen committed rG13bf6039c9ae: Title: [LV] Handle Fold-Tail of loops with vectorizarion factor equal to 1 (authored by anhtuyen).
May 22 2020, 6:58 AM
anhtuyen closed D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.
May 22 2020, 6:57 AM · Restricted Project

May 19 2020

anhtuyen updated the diff for D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.

Extend pr45679-fold-tail-by-masking.ll with an additional RUN of -force-vector-width=1 and -force-vector-interleave=4 from https://reviews.llvm.org/D80085.

May 19 2020, 10:30 PM · Restricted Project
anhtuyen added a comment to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.

This looks good to me, thanks! Please wait a day or so if @fhahn has further comments.

As noted, it would now be good to also extend pr45679-fold-tail-by-masking.ll with an additional RUN of -force-vector-width=1 and -force-vector-interleave=4.

It is very kind of you, Ayal @Ayal, and I thank you very much. As you mentioned, I will certainly wait a day or two to see whether Florian and Bardia have further comments.
About pr45679-fold-tail-by-masking.ll, do you want me to add it for completeness?
With VF=1, the testcase does not go through the code changed by this patch (b/c its BackedgeTakenCount is null).

May 19 2020, 9:58 PM · Restricted Project
anhtuyen added a comment to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.

This looks good to me, thanks! Please wait a day or so if @fhahn has further comments.

As noted, it would now be good to also extend pr45679-fold-tail-by-masking.ll with an additional RUN of -force-vector-width=1 and -force-vector-interleave=4.

May 19 2020, 9:26 PM · Restricted Project
anhtuyen added a comment to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.

Hello,
I am writing to ask whether there is anything else in this patch you would like me to address. If not, please consider approving it. A number of testcases I am running are affected, and it would be great to get them passing with this patch. Thank you very much!

May 19 2020, 12:36 PM · Restricted Project

May 17 2020

anhtuyen added inline comments to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.
May 17 2020, 5:02 PM · Restricted Project
anhtuyen added a comment to D80085: [LV] Fix FoldTail under user VF and UF.

LGTM, too. Thanks for extending this feature.

May 17 2020, 5:02 PM · Restricted Project

May 16 2020

anhtuyen added inline comments to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.
May 16 2020, 10:47 PM · Restricted Project
anhtuyen updated the diff for D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.

Similar to the 2nd patch earlier, this patch includes all the changes recommended by the reviewers.
Also, I

  • modified the 2nd function of the LIT test to make it clearer and simpler, and
  • added ORE to confirm which pass has/does not have the impact.
May 16 2020, 9:12 PM · Restricted Project
anhtuyen added inline comments to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.
May 16 2020, 9:12 PM · Restricted Project
anhtuyen added a comment to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.

The issue has also been reported as https://bugs.llvm.org/show_bug.cgi?id=45943

May 16 2020, 4:25 PM · Restricted Project
anhtuyen added inline comments to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.
May 16 2020, 4:25 PM · Restricted Project
anhtuyen added a reviewer for D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1: bmahjour.
May 16 2020, 1:45 PM · Restricted Project
anhtuyen updated the diff for D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.

This patch includes all the changes recommended by the reviewers - thank you very much!
I also added a new LIT test, which uses the ORE to confirm which pass has the impact.
In terms of testing, I have run a number of test-cases with it, and all were successful.

May 16 2020, 1:13 PM · Restricted Project

May 15 2020

anhtuyen added inline comments to D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.
May 15 2020, 4:19 PM · Restricted Project

May 14 2020

anhtuyen added a comment to D78847: [LV] Fix recording of BranchTakenCount for FoldTail.

Based on a discussion with Ayal, I created a quick patch for the type-mismatch issue I reported above.
https://reviews.llvm.org/D79976

May 14 2020, 5:59 PM · Restricted Project
anhtuyen created D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1.
May 14 2020, 5:58 PM · Restricted Project

May 12 2020

anhtuyen added a comment to D78847: [LV] Fix recording of BranchTakenCount for FoldTail.

[snip]

Right, VPWidenCanonicalIVRecipe::execute() also needs to treat VF==1 differently.

I looked at that, too. It still triggers the assert, only at a different location. A little more work is needed.

The above implies setting both VStart to CanonicalIV instead of splatting, and VStep to ConstantInt::get(STy, Part) instead of ConstantVector::get(Indices), when VF==1. Would doing so pass all your tests?

Some of this issue stems from not using the overloaded getBroadcastInstrs().

The more general issues raised are whether to apply foldTail when VF==1 in the absence of masked scalar loads/stores, and/or whether to internally turn foldTail on for small loops (due to cost considerations) when the VF and/or UF are provided externally (bypassing their cost-based selection process).
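
As a rough illustration of the VF==1 suggestion above, here is a toy Python model; it is not LLVM code and uses none of LLVM's API. The function names are hypothetical, and the lane index Part*VF + Lane for the vector case is an assumption about what Indices holds. It only shows the shape of the idea: with VF==1 each unrolled part gets a plain scalar (IV + Part) rather than a one-element vector, so the comparison against the backedge-taken count stays scalar on both sides.

```python
def widen_canonical_iv(iv, vf, part):
    """Toy model of the widened induction value for one unrolled part.

    vf > 1 : a list of vf lanes, modelling splat(iv) + <part*vf + lane>.
    vf == 1: a plain scalar iv + part (no one-element vector is built).
    """
    if vf == 1:
        return iv + part
    return [iv + part * vf + lane for lane in range(vf)]


def tail_mask(iv, vf, part, btc):
    """Toy model of the fold-tail compare: widened IV <= backedge-taken count."""
    widened = widen_canonical_iv(iv, vf, part)
    if vf == 1:
        return widened <= btc  # scalar compared with scalar
    return [lane <= btc for lane in widened]  # lane-wise compare


# VF=1, interleave=4, backedge-taken count 14, starting at iteration 12.
SCALAR_MASKS = [tail_mask(12, 1, part, 14) for part in range(4)]
```

In this model both operands of the compare have the same kind (scalar or per-lane list) by construction, which is the property the assert in the discussion checks for.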

May 12 2020, 4:09 PM · Restricted Project

May 11 2020

anhtuyen added a comment to D78847: [LV] Fix recording of BranchTakenCount for FoldTail.

[snip]

Below is an example to demonstrate that setting VTCMO to TCMO when State->VF == 1 will not help in the case of a loop of VF 1 having a vector loop-bound.

[snip]

In this case, the operand[0] (which is %vec.iv) has type <1 x i64>. The loop-bound, however, will get the type as **i64** instead of the expected **<i64 2305843009213693951>** .

Right, VPWidenCanonicalIVRecipe::execute() also needs to treat VF==1 differently.

May 11 2020, 6:56 AM · Restricted Project

May 10 2020

anhtuyen added a comment to D78847: [LV] Fix recording of BranchTakenCount for FoldTail.

Hello,

[snip]

In this example, the operand[0] (%induction) correctly has type i64, but the loop bound (14) is of vector type <1 x i64>

There might be multiple ways to address this assert failure. I list below a few simple ones for your reference: they might or might not be a good solution at all.

  1. Option 1: Not to generate the icmp instructions for %induction. In the particular case of this testcase, these instructions seem to be redundant.
  2. Option 2: If we are to generate the icmp instructions above, can we set the BackedgeTakenCount to the State depending on the type of the first operand? In cases like this one when the first operand is not a vector type, using Value *TCMO instead of Value *VTCMO might be an option.

    I will open a Bugzilla report and copy its link to this page when my password reset goes through.

    Thanks, Anh

Yes, thanks for catching this!
One quick fix is indeed to set VTCMO to TCMO when State->VF == 1, instead of "splatting" it into a vector of a single element.
Thinking if fold-tail-by-masking should be restricted to work for VF>1 only, given that only vectors (loads/stores) get masked.

I also came up with (and gave up on) that fix last week, because it would not work for a loop whose VF is 1 but whose loop bound is a vector. I will come up with an example shortly to demonstrate this.

May 10 2020, 9:49 PM · Restricted Project
anhtuyen added a comment to D78847: [LV] Fix recording of BranchTakenCount for FoldTail.

Hello,

[snip]

In this example, the operand[0] (%induction) correctly has type i64, but the loop bound (14) is of vector type <1 x i64>

There might be multiple ways to address this assert failure. I list below a few simple ones for your reference: they might or might not be a good solution at all.

  1. Option 1: Not to generate the icmp instructions for %induction. In the particular case of this testcase, these instructions seem to be redundant.
  2. Option 2: If we are to generate the icmp instructions above, can we set the BackedgeTakenCount to the State depending on the type of the first operand? In cases like this one when the first operand is not a vector type, using Value *TCMO instead of Value *VTCMO might be an option.

    I will open a Bugzilla report and copy its link to this page when my password reset goes through.

    Thanks, Anh

Yes, thanks for catching this!
One quick fix is indeed to set VTCMO to TCMO when State->VF == 1, instead of "splatting" it into a vector of a single element.
Thinking if fold-tail-by-masking should be restricted to work for VF>1 only, given that only vectors (loads/stores) get masked.

May 10 2020, 8:13 PM · Restricted Project
anhtuyen added a comment to D78847: [LV] Fix recording of BranchTakenCount for FoldTail.

The new fix delivered in this patch has caused an assert failure with a testcase having a loop with a very small trip count going through fold tail by masking.
The reduced testcase is as follows. The options are: -loop-vectorize -force-vector-interleave=4

May 10 2020, 3:58 PM · Restricted Project

Apr 29 2020

anhtuyen committed rGc7878ad231ee: [VFDatabase] Scalar functions are vector functions with VF =1 (authored by anhtuyen).
Apr 29 2020, 10:44 AM
anhtuyen closed D78054: [VFDatabase] Scalar functions are vector functions with VF =1.
Apr 29 2020, 10:44 AM · Restricted Project

Feb 19 2020

anhtuyen added inline comments to D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
Feb 19 2020, 1:48 PM · Restricted Project, Restricted Project
anhtuyen added inline comments to D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
Feb 19 2020, 1:10 PM · Restricted Project, Restricted Project

Feb 12 2020

anhtuyen committed rGa5b6480d0551: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files (authored by anhtuyen).
Feb 12 2020, 10:06 AM
anhtuyen closed D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.
Feb 12 2020, 10:06 AM · Restricted Project
anhtuyen committed rGdadc214e4d9d: Title: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test (authored by anhtuyen).
Feb 12 2020, 7:56 AM
anhtuyen closed D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
Feb 12 2020, 7:55 AM · Restricted Project, Restricted Project

Feb 11 2020

anhtuyen updated the diff for D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.

Remove llvm/Config/abi-breaking.h

Feb 11 2020, 5:52 AM · Restricted Project

Feb 10 2020

anhtuyen updated the diff for D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.

As pointed out by Michael @Meinersbur, the existence of an empty line between 2 blocks of #include had prevented clang-format from acting properly. I removed the empty lines, and reran clang-format on the affected files.

Feb 10 2020, 5:33 PM · Restricted Project
anhtuyen added inline comments to D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.
Feb 10 2020, 5:18 PM · Restricted Project
anhtuyen updated the diff for D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.

Based on comments from Reid @rnk and Michael @Meinersbur about IWYU, I have used the tool and adjusted the removal/addition of headers to follow the IWYU guidelines. I also decided not to include the suggested llvm/IR/IntrinsicEnums.inc b/c it caused build errors.

Feb 10 2020, 3:28 PM · Restricted Project

Feb 6 2020

anhtuyen added a comment to D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.

Thank you very much Reid @rnk and Michael @Meinersbur for introducing me to that IWYU tool!!!

Feb 6 2020, 7:12 AM · Restricted Project

Feb 3 2020

anhtuyen added inline comments to D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
Feb 3 2020, 6:31 PM · Restricted Project, Restricted Project

Jan 31 2020

anhtuyen updated the diff for D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.

Thank you very much Hubert @hubert.reinterpretcast for your comments. I have changed the script accordingly.

Jan 31 2020, 10:17 AM · Restricted Project, Restricted Project
anhtuyen added inline comments to D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.
Jan 31 2020, 8:47 AM · Restricted Project
anhtuyen added a comment to D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.
In D73498#1848429, @rnk wrote:

I don't have time to review all these headers, but I looked at 3, and 2/3 are used, and would cause build errors if they did not happen to be pulled in by transitive includes. Generally, code should include what it uses, so 2/3 of the changes I looked at seem undesirable.

If you want to meaningfully make the build faster, I would do it like this:

  • extract the command line to recompile the source file of interest (rm lib/.../LoopUnrollAndJam.cpp.o; ninja -v -n lib/.../LoopUnrollAndJam.cpp.o). Paste it in a shell script to easily rerun.
  • compile the file, count the number of lines in the .d file to approximate how many includes it actually needed
  • remove one include from the source file
  • recompile, recount number of lines in .d file. If it did not get smaller, put the header back.
  • repeat.

    In this way, you only ever make changes that actually make the build faster. It is also best to start by focusing on header files instead of cpp files, since they introduce transitive includes. Because of that, removing an include from a header has more impact on the overall build time.
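
    The counting step in the procedure above could be sketched as follows. This is a minimal sketch, assuming the .d file is a Makefile-style dependency list with roughly one dependency per line; the function names and the sample contents are hypothetical and not taken from the LLVM build.

```python
def count_dep_lines(depfile_text: str) -> int:
    """Count non-empty lines in Makefile-style dependency (.d) text.

    The line count only approximates the number of files the
    translation unit pulled in, as described in the steps above.
    """
    return sum(1 for line in depfile_text.splitlines() if line.strip())


def include_removal_helped(before: str, after: str) -> bool:
    """True if the .d text shrank after removing one #include."""
    return count_dep_lines(after) < count_dep_lines(before)


# Hypothetical .d contents before and after removing one #include.
SAMPLE_BEFORE = "foo.o: foo.cpp \\\n  a.h \\\n  b.h \\\n  c.h\n"
SAMPLE_AFTER = "foo.o: foo.cpp \\\n  a.h \\\n  b.h\n"

if __name__ == "__main__":
    print(count_dep_lines(SAMPLE_BEFORE), count_dep_lines(SAMPLE_AFTER))
    print(include_removal_helped(SAMPLE_BEFORE, SAMPLE_AFTER))
```

    If the count does not drop, the removed header was still reached transitively, so the include goes back in.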
Jan 31 2020, 8:47 AM · Restricted Project

Jan 30 2020

anhtuyen added a comment to D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
In D73707#1850870, @rnk wrote:

I think what you are doing is reasonable and practical. I only felt obligated to make the philosophical point that flaky tests are bad, and nobody other than the test owner should have to spend time on them.

In the meantime, I think this is pretty reasonable. Maybe @vitalybuka can help review it.

Jan 30 2020, 5:56 PM · Restricted Project, Restricted Project
anhtuyen added a comment to D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
In D73707#1850128, @rnk wrote:

More than 10 retries seems excessive. For this particular issue, I think it would be reasonable to mark the test ; UNSUPPORTED: powerpc64. Adding more retries will make the powerpc64 bots slower and less useful.

Jan 30 2020, 11:35 AM · Restricted Project, Restricted Project
anhtuyen updated the summary of D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
Jan 30 2020, 6:50 AM · Restricted Project, Restricted Project
anhtuyen updated the summary of D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
Jan 30 2020, 6:49 AM · Restricted Project, Restricted Project
anhtuyen created D73707: [TSAN] Parameterize the hard-coded threshold of deflake in tsan test.
Jan 30 2020, 6:40 AM · Restricted Project, Restricted Project

Jan 28 2020

anhtuyen updated the diff for D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.

Based on a suggestion by David @dmgreen, I restored the headers that are obviously used in the files.
The change is now much smaller than that of the initial patch, but I think it is fine.
Thanks again, David! Please let me know what you think!

Jan 28 2020, 10:58 AM · Restricted Project

Jan 27 2020

anhtuyen edited reviewers for D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files, added: dmgreen; removed: greened.
Jan 27 2020, 2:34 PM · Restricted Project
anhtuyen added a comment to D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.

Thanks, David @dmgreen
You are right that many of the headers in my list are included transitively. Other than AssumptionCache, Dominators, and Debug, can you give me the names of those you would like to be included explicitly even if the compilation can go without them? I will keep them untouched.
By the way, I am aware of what Whitney is working on. I am refactoring the LoopUnrollAndJam to three classes: 1. new pass manager 2. old pass manager 3. the actual implementation, which is used by both 1 and 2.

Jan 27 2020, 2:34 PM · Restricted Project
anhtuyen created D73498: [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files.
Jan 27 2020, 12:33 PM · Restricted Project

Oct 31 2019

anhtuyen added a comment to D69685: Prevent adding lld to test dependency (TEST_DEPS) when lld project is not built.

Thank you very much Evgenii @eugenis for your prompt review and approval!

Oct 31 2019, 4:35 PM · Restricted Project, Restricted Project
anhtuyen created D69685: Prevent adding lld to test dependency (TEST_DEPS) when lld project is not built.
Oct 31 2019, 2:15 PM · Restricted Project, Restricted Project

Sep 9 2019

anhtuyen added a comment to D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.

Hello,
By the end of last week, I had addressed all comments from the reviewers. Please let me know if there is any other issue you would like me to handle so that we can complete the patch. Thank you very much!

Sep 9 2019, 10:20 AM · Restricted Project

Aug 29 2019

anhtuyen updated the diff for D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.

Address @jdoerfert Johannes's comment about not decrementing the statistics.

Aug 29 2019, 4:04 PM · Restricted Project

Aug 28 2019

anhtuyen added inline comments to D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.
Aug 28 2019, 1:55 PM · Restricted Project
anhtuyen updated the diff for D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.

My apologies for the long pause before providing this patch. I was on a long vacation, and was then pulled into some other issues until now. This new patch removes any conflicting attribute currently associated with an argument before adding a read attribute to it. It also updates the statistics accordingly.

Aug 28 2019, 11:28 AM · Restricted Project

Apr 3 2019

anhtuyen edited reviewers for D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument, added: jdoerfert; removed: majnemer.
Apr 3 2019, 12:01 PM · Restricted Project
anhtuyen added inline comments to D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.
Apr 3 2019, 11:53 AM · Restricted Project
anhtuyen updated the diff for D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.

Based on comments from the reviewers, if a pointer argument with attribute WriteOnly has no uses, there is neither a write through nor a read from the pointer argument. In this case, its attribute can safely be strengthened from WriteOnly to ReadNone.

Apr 3 2019, 10:49 AM · Restricted Project

Feb 27 2019

anhtuyen added a comment to D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.

Thank you very much, @rnk and @chandlerc for your comments. Let me change the fix accordingly.

Feb 27 2019, 6:28 AM · Restricted Project

Feb 26 2019

anhtuyen retitled D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument from LLVM: Optimization Pass: Function Attribute: No read-attribute should be added if the argument is WriteOnly to LLVM: Optimization Pass: Function Attribute: nocapture should be added if the argument is WriteOnly.
Feb 26 2019, 1:10 PM · Restricted Project
anhtuyen added a comment to D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.
In D58694#1411172, @rnk wrote:

This seems like the wrong fix, I would expect functionattrs to improve the deduction by removing the writeonly attribute.

Despite its name, the readnone attribute has implied semantics about write operations:

readnone makes more sense when you consider it as a strengthening of readonly, which implies no write operations.

Feb 26 2019, 1:05 PM · Restricted Project
anhtuyen created D58694: LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument.
Feb 26 2019, 12:44 PM · Restricted Project

Nov 16 2018

anhtuyen updated the diff for D54441: [OPENMP] Support relational-op != (not-equal) as one of the canonical forms of random access iterator.
  1. Correct the typo on line 3707 clang/lib/Sema/SemaOpenMP.cpp
  2. Update the testcase: teams_distribute_simd_loop_messages.cpp
Nov 16 2018, 10:10 AM · Restricted Project, Restricted Project, Restricted Project

Nov 12 2018

anhtuyen created D54441: [OPENMP] Support relational-op != (not-equal) as one of the canonical forms of random access iterator.
Nov 12 2018, 12:56 PM · Restricted Project, Restricted Project, Restricted Project