
zjaffal (Zain Jaffal)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 11 2022, 3:52 AM (6 w, 3 d)

Recent Activity

Thu, Sep 22

zjaffal updated the diff for D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.

rebase on top of main

Thu, Sep 22, 9:25 AM · Restricted Project, Restricted Project

Tue, Sep 20

zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

Refactor the code for moving negation to the result. We now preserve the fast-math flags on the created multiplication and negation instructions by passing them from the original multiplication instruction. FNeg is created using Builder instead of UnitaryOperand because the latter caused build failures.
The change in fast-math flag behaviour can be seen in the test_negation_move_to_result_with_fastflags test.

Tue, Sep 20, 6:50 AM · Restricted Project, Restricted Project

Fri, Sep 16

zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

Add condition to only optimize if the negated operand has one use

Fri, Sep 16, 8:00 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133300: [InstCombine] Matrix multiplication negation optimisation.
Fri, Sep 16, 7:56 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133300: [InstCombine] Matrix multiplication negation optimisation.
Fri, Sep 16, 7:38 AM · Restricted Project, Restricted Project
zjaffal requested review of D134048: [ConstraintElimination] Simplify usub(a,b) if a s>=b..
Fri, Sep 16, 7:30 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D134038: [ConstraintElimination] Add initial usub.with.overflow tests..

Fix usub_no_overflow_due_to_or_conds and usub_no_overflow_due_to_or_conds_sub_result_not_used to use the correct exit condition

Fri, Sep 16, 7:05 AM · Restricted Project, Restricted Project
zjaffal requested review of D134044: [ConstraintElimination] Move logic for replacing ssub overflow users (NFC).
Fri, Sep 16, 6:48 AM · Restricted Project, Restricted Project
zjaffal requested review of D134038: [ConstraintElimination] Add initial usub.with.overflow tests..
Fri, Sep 16, 6:26 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

Fix comments and variable names

Fri, Sep 16, 3:02 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133300: [InstCombine] Matrix multiplication negation optimisation.
Fri, Sep 16, 1:47 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133300: [InstCombine] Matrix multiplication negation optimisation.
Fri, Sep 16, 1:38 AM · Restricted Project, Restricted Project

Wed, Sep 14

zjaffal updated the diff for D131964: [AArch64] Add support to loop vectorization for non temporal loads.

Add a check for isLittleEndian

Wed, Sep 14, 8:32 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133695: [InstCombine] Optimize multiplication where both operands are negated.
Wed, Sep 14, 8:03 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133695: [InstCombine] Optimize multiplication where both operands are negated.
Wed, Sep 14, 7:48 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D132559: [AArch64] Add support for 128-bit non temporal loads..

check for little endian target

Wed, Sep 14, 7:16 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133789: [AArch64] Disable nontemproal load for Big Endian.

Rebase on top of main

Wed, Sep 14, 5:30 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133789: [AArch64] Disable nontemproal load for Big Endian.

Apply clang format and add comments

Wed, Sep 14, 3:54 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

Update variable names to match comments

Wed, Sep 14, 3:38 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133695: [InstCombine] Optimize multiplication where both operands are negated.

Variable names match comments

Wed, Sep 14, 3:36 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133695: [InstCombine] Optimize multiplication where both operands are negated.

rename variables

Wed, Sep 14, 3:22 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133695: [InstCombine] Optimize multiplication where both operands are negated.

rename variables

Wed, Sep 14, 3:13 AM · Restricted Project, Restricted Project

Tue, Sep 13

zjaffal requested review of D133789: [AArch64] Disable nontemproal load for Big Endian.
Tue, Sep 13, 10:26 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

use patterns

Tue, Sep 13, 8:26 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133695: [InstCombine] Optimize multiplication where both operands are negated.

use patterns

Tue, Sep 13, 7:32 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133765: [AArch64] Add nontemporal tests for big endian..

add missing flag

Tue, Sep 13, 6:20 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133765: [AArch64] Add nontemporal tests for big endian..

remove redundant checks

Tue, Sep 13, 2:48 AM · Restricted Project, Restricted Project
zjaffal requested review of D133765: [AArch64] Add nontemporal tests for big endian..
Tue, Sep 13, 2:44 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

update transforms

Tue, Sep 13, 2:21 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133695: [InstCombine] Optimize multiplication where both operands are negated.

update transformations

Tue, Sep 13, 2:19 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133287: [InstCombine] Test for matrix multiplication negation optimisation..

Update tests

Tue, Sep 13, 2:18 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133287: [InstCombine] Test for matrix multiplication negation optimisation..
  • [InstCombine] Matrix multiplication negation optimisation
Tue, Sep 13, 2:17 AM · Restricted Project, Restricted Project

Mon, Sep 12

zjaffal retitled D133695: [InstCombine] Optimize multiplication where both operands are negated from [InstCombine] Optmize multiplication where both operands are negated to [InstCombine] Optimize multiplication where both operands are negated.
Mon, Sep 12, 7:06 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

replace dyn_cast with cast

Mon, Sep 12, 7:05 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133287: [InstCombine] Test for matrix multiplication negation optimisation..

Add test cases where we have both operands negated

Mon, Sep 12, 7:02 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.
  1. Add the optimization for two negations into a separate patch
  2. Remove the check for single use for negation operation
  3. Address style comments and spelling mistakes
Mon, Sep 12, 7:01 AM · Restricted Project, Restricted Project
zjaffal requested review of D133695: [InstCombine] Optimize multiplication where both operands are negated.
Mon, Sep 12, 6:45 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
  1. Change Align to use PtrOffset
  2. Make sure that performTBISimplification optimisation can run
  3. Add TokenFactor node
Mon, Sep 12, 3:58 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
Mon, Sep 12, 2:19 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
Mon, Sep 12, 2:05 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
Mon, Sep 12, 1:40 AM · Restricted Project, Restricted Project

Fri, Sep 9

zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.
  • Move the test to a separate patch.
  • In the cases where both operands are negated, we may need to introduce a separate patch to handle that case
Fri, Sep 9, 12:39 PM · Restricted Project, Restricted Project
zjaffal updated the diff for D133287: [InstCombine] Test for matrix multiplication negation optimisation..

Add more test cases where operands are negated

Fri, Sep 9, 12:38 PM · Restricted Project, Restricted Project
zjaffal added inline comments to D133300: [InstCombine] Matrix multiplication negation optimisation.
Fri, Sep 9, 11:25 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.

readd the volatile test

Fri, Sep 9, 8:47 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.

Fix non-temporal clause

Fri, Sep 9, 8:29 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

remove unnecessary check for two negations

Fri, Sep 9, 8:17 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133300: [InstCombine] Matrix multiplication negation optimisation.
Fri, Sep 9, 8:16 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.

Removing checking if both operands are negative at the same time since it is handled by default.

Fri, Sep 9, 8:12 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
  • Add early exit for volatile loads and a test case to check for it.
  • Address some style issues
Fri, Sep 9, 7:17 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133287: [InstCombine] Test for matrix multiplication negation optimisation..

Add tests to check for fast flags

Fri, Sep 9, 5:16 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

Support where two operands are negated

Fri, Sep 9, 5:15 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133287: [InstCombine] Test for matrix multiplication negation optimisation..

Add test cases for second operand and chain multiplication

Fri, Sep 9, 4:32 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133287: [InstCombine] Test for matrix multiplication negation optimisation..
Fri, Sep 9, 2:48 AM · Restricted Project, Restricted Project

Thu, Sep 8

zjaffal updated the diff for D133287: [InstCombine] Test for matrix multiplication negation optimisation..

Add more test cases to address the following:

  1. Multiple matrix multiply operation
  2. Negation on both operands
  3. Negation on the result
Thu, Sep 8, 2:21 PM · Restricted Project, Restricted Project
zjaffal added a comment to D133300: [InstCombine] Matrix multiplication negation optimisation.

What happens if both args are negated? We need a test for that. Maybe that pattern should be handled before this, so we don't have to deal with the complication in this patch?

Tests should also include fast-math-flags in at least some cases, so we can see if those are propagated as expected.

Thu, Sep 8, 8:43 AM · Restricted Project, Restricted Project
zjaffal added a comment to D133300: [InstCombine] Matrix multiplication negation optimisation.

Are we looking to also support -(A * B) -> -A * B with the negation on the cheapest operand? (might need to check what the required fast math flags are)

We should. In particular, there are three places for the negation to go: -A * B = A * -B = -(A * B), and any one of the three might be smaller.
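
As a rough illustration of why any of the three placements might be smaller, the sketch below counts one scalar negation per element of whichever matrix carries the negation, for A of shape M x K and B of shape K x N. The names NegPlacement, negationCost and cheapestNegation are hypothetical and are not taken from the patch. The element-count cost model is an assumption, and the fast-math flag requirements raised in the review are not modelled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only; not part of D133300.
// Where the negation sits in the product of A (M x K) and B (K x N).
enum class NegPlacement { FirstOperand, SecondOperand, Result };

// Assumed cost model: one scalar negation per element of the negated matrix.
std::size_t negationCost(NegPlacement P, std::size_t M, std::size_t K,
                         std::size_t N) {
  switch (P) {
  case NegPlacement::FirstOperand:
    return M * K; // -A * B negates every element of A
  case NegPlacement::SecondOperand:
    return K * N; // A * -B negates every element of B
  case NegPlacement::Result:
    return M * N; // -(A * B) negates every element of the result
  }
  return 0; // not reached for valid enum values
}

// Pick the placement with the fewest negated elements.
// Ties keep the earlier candidate (FirstOperand, then SecondOperand).
NegPlacement cheapestNegation(std::size_t M, std::size_t K, std::size_t N) {
  assert(M > 0 && K > 0 && N > 0 && "matrix dimensions must be non-zero");
  NegPlacement Best = NegPlacement::FirstOperand;
  if (negationCost(NegPlacement::SecondOperand, M, K, N) <
      negationCost(Best, M, K, N))
    Best = NegPlacement::SecondOperand;
  if (negationCost(NegPlacement::Result, M, K, N) <
      negationCost(Best, M, K, N))
    Best = NegPlacement::Result;
  return Best;
}
```

Under this model, a 2 x 8 matrix times an 8 x 2 matrix favours negating the 2 x 2 result, while a 1 x 4 matrix times a 4 x 8 matrix favours negating the first operand.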

Thu, Sep 8, 8:42 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.

Add comments for function

Thu, Sep 8, 8:39 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
Thu, Sep 8, 8:22 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
Thu, Sep 8, 8:15 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.

Fix the issues related to the scalable loads.

Thu, Sep 8, 8:14 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133498: [AArch64] Add test for vscale nontemporal loads larger than 256..

fix test label

Thu, Sep 8, 8:12 AM · Restricted Project, Restricted Project
zjaffal requested review of D133498: [AArch64] Add test for vscale nontemporal loads larger than 256..
Thu, Sep 8, 8:06 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
Thu, Sep 8, 7:17 AM · Restricted Project, Restricted Project

Wed, Sep 7

zjaffal updated the diff for D133300: [InstCombine] Matrix multiplication negation optimisation.

Move code next to FP Intrinsics

Wed, Sep 7, 8:29 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D133287: [InstCombine] Test for matrix multiplication negation optimisation..

Add test case where fneg is used more than once

Wed, Sep 7, 8:28 AM · Restricted Project, Restricted Project
zjaffal requested review of D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load.
Wed, Sep 7, 6:12 AM · Restricted Project, Restricted Project

Mon, Sep 5

zjaffal requested review of D133300: [InstCombine] Matrix multiplication negation optimisation.
Mon, Sep 5, 6:16 AM · Restricted Project, Restricted Project
zjaffal requested review of D133287: [InstCombine] Test for matrix multiplication negation optimisation..
Mon, Sep 5, 3:01 AM · Restricted Project, Restricted Project

Fri, Sep 2

zjaffal updated the diff for D132559: [AArch64] Add support for 128-bit non temporal loads..

Remove support for floating point non temporal loads

Fri, Sep 2, 6:07 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D132559: [AArch64] Add support for 128-bit non temporal loads..

Fix patch for failing tests. The problem originated from 128-bit loads being handled incorrectly if they weren't nontemporal loads. I added two checks to address the issue.
First, if the load is of floating point type we preserve the behaviour provided from

setOperationPromotedToType(ISD::LOAD, VT, PromoteTo);
Fri, Sep 2, 3:25 AM · Restricted Project, Restricted Project

Thu, Sep 1

zjaffal updated the diff for D132559: [AArch64] Add support for 128-bit non temporal loads..

Address some of the failing tests. The problem we have here is that all 128-bit vector loads will go to LowerLOAD function.
Most of the failing test cases were triggered by the following assert

Thu, Sep 1, 1:42 AM · Restricted Project, Restricted Project

Tue, Aug 30

zjaffal added a comment to D132559: [AArch64] Add support for 128-bit non temporal loads..

Running this locally causes many test failures for me. @zainja could you double check all AArch64 codegen tests pass with the patch applied to current main? Also, could you rebase the patch to current main to make the precommit tests run latest main?

Tue, Aug 30, 8:32 AM · Restricted Project, Restricted Project

Aug 24 2022

zjaffal removed reviewers for D132555: [instcombine] Optimise for zero initialisation of integer product: dmgreen, SjoerdMeijer.
Aug 24 2022, 8:06 AM · Restricted Project, Restricted Project
zjaffal added reviewers for D132559: [AArch64] Add support for 128-bit non temporal loads.: dmgreen, SjoerdMeijer.
Aug 24 2022, 8:05 AM · Restricted Project, Restricted Project
zjaffal added reviewers for D132555: [instcombine] Optimise for zero initialisation of integer product: dmgreen, SjoerdMeijer.
Aug 24 2022, 8:03 AM · Restricted Project, Restricted Project
zjaffal added reviewers for D132559: [AArch64] Add support for 128-bit non temporal loads.: hiraditya, kristof.beyls.
Aug 24 2022, 7:55 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D132555: [instcombine] Optimise for zero initialisation of integer product.

edit file name

Aug 24 2022, 7:52 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D132553: [instcombine] Test for zero initialisation optimisation of integer product.

rename file

Aug 24 2022, 7:47 AM · Restricted Project, Restricted Project
zjaffal requested review of D132559: [AArch64] Add support for 128-bit non temporal loads..
Aug 24 2022, 7:17 AM · Restricted Project, Restricted Project
zjaffal requested review of D132555: [instcombine] Optimise for zero initialisation of integer product.
Aug 24 2022, 6:19 AM · Restricted Project, Restricted Project
zjaffal requested review of D132553: [instcombine] Test for zero initialisation optimisation of integer product.
Aug 24 2022, 6:17 AM · Restricted Project, Restricted Project

Aug 16 2022

zjaffal updated the diff for D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled.

Rebase

Aug 16 2022, 12:54 PM · Restricted Project, Restricted Project
zjaffal updated the diff for D131964: [AArch64] Add support to loop vectorization for non temporal loads.

Updating D131964: [AArch64] Add support to loop vectorization for non temporal loads

Aug 16 2022, 9:56 AM · Restricted Project, Restricted Project
zjaffal updated the summary of D131964: [AArch64] Add support to loop vectorization for non temporal loads.
Aug 16 2022, 9:34 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled.

Updating D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled

Aug 16 2022, 9:14 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled.

remove brackets around if statement

Aug 16 2022, 7:04 AM · Restricted Project, Restricted Project
zjaffal requested review of D131964: [AArch64] Add support to loop vectorization for non temporal loads.
Aug 16 2022, 6:36 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131773: [AArch64] Add support for 256-bit non temporal loads.

Fix for failing tests

Aug 16 2022, 4:11 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled.

Updating D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled

Aug 16 2022, 2:54 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131757: [instcombine] Test for zero initialisation optimisation of a product given fast flags.

Updating D131757: [instcombine] Test for zero initialisation optimisation of a product given fast flags

Aug 16 2022, 2:51 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D131773: [AArch64] Add support for 256-bit non temporal loads.
Aug 16 2022, 12:47 AM · Restricted Project, Restricted Project

Aug 15 2022

zjaffal updated the diff for D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled.

Updating D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled

Aug 15 2022, 11:48 AM · Restricted Project, Restricted Project
zjaffal added inline comments to D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled.
Aug 15 2022, 11:47 AM · Restricted Project, Restricted Project
zjaffal retitled D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled from [instcombine] Optimise for zero initialisation of product given finite math in Clang to [instcombine] Optimise for zero initialisation of product given fast flags are enabled.
Aug 15 2022, 10:10 AM · Restricted Project, Restricted Project
zjaffal retitled D131757: [instcombine] Test for zero initialisation optimisation of a product given fast flags from [instcombine] Check for zero initialisation optimisation of a product given fast flags are enabled in clang. to [instcombine] Test for zero initialisation optimisation of a product given fast flags.
Aug 15 2022, 10:09 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131672: [instcombine] Optimise for zero initialisation of product given fast flags are enabled.

Updating D131672: [instcombine] Optimise for zero initialisation of product given finite math in Clang

Aug 15 2022, 9:42 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131757: [instcombine] Test for zero initialisation optimisation of a product given fast flags.

Updating D131757: [instcombine] Check for zero initialisation optimisation of a product given fast flags are enabled in clang.

Aug 15 2022, 9:41 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131757: [instcombine] Test for zero initialisation optimisation of a product given fast flags.

Updating D131757: [instcombine] Check for zero initialisation optimisation of a product given fast flags are enabled in clang.

Aug 15 2022, 9:35 AM · Restricted Project, Restricted Project
zjaffal updated the diff for D131757: [instcombine] Test for zero initialisation optimisation of a product given fast flags.

Updating D131757: [instcombine] Check for zero initialisation optimisation of a product given fast flags are enabled in clang.

Aug 15 2022, 9:29 AM · Restricted Project, Restricted Project