
magabari (Mohammed Agabaria)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 14 2016, 12:21 AM (173 w, 6 d)

Recent Activity

Feb 27 2018

magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

ping ^ 2

Feb 27 2018, 8:27 AM

Feb 12 2018

magabari updated the diff for D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

minimized the patch by removing changes to some lit tests (they will be committed later as an NFC patch)

Feb 12 2018, 7:19 AM

Feb 4 2018

magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

ping

Feb 4 2018, 3:48 AM

Jan 29 2018

magabari updated the diff for D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

fixed craig notes

Jan 29 2018, 11:47 PM
magabari added inline comments to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.
Jan 29 2018, 11:47 PM
magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

So do we still need all the .ll file changes with the nof/mof syntax changes?

Jan 29 2018, 11:04 PM
magabari added a comment to D42353: [Codegen] support of 'nof' flag lowering on X86 target.

fixed craig notes

Jan 29 2018, 7:20 AM
magabari updated the diff for D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

added new pass 'ScalarizeMayOverflowDiv'
updated div attributes in LangRef
and fixed notes given by craig

Jan 29 2018, 4:40 AM
magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

fixed craig and eli notes

Jan 29 2018, 4:10 AM

Jan 25 2018

magabari updated the diff for D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

fixed craig notes
I also agree with Eli Friedman's note; I will upload a fix soon

Jan 25 2018, 7:39 AM
magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

fixed

Jan 25 2018, 7:37 AM

Jan 21 2018

magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

lowering patch of 'nof' flag (for X86) is up:
https://reviews.llvm.org/D42353

Jan 21 2018, 5:07 AM
magabari created D42353: [Codegen] support of 'nof' flag lowering on X86 target.
Jan 21 2018, 4:15 AM

Jan 19 2018

magabari updated the diff for D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

This update addresses the compatibility issues raised in the comments.
It follows the approach described in my previous comment.

Jan 19 2018, 3:50 AM

Jan 17 2018

magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

Before accepting this patch, we really need to see benchmark results. I'm not going to change clang to start emitting non-UB divs if the perf is going to be horrible. We need data.
Otherwise I don't see the need for this poison version of division. Could you elaborate if your plan is to expose this somehow to the application developer?

I'm sorry if these questions have been properly answered in the past. If so, could you please link them here?

In general, the proposed feature allows the compiler to speculate div without worrying too much about div-by-zero and similar cases, so it can, for example, hoist instructions or vectorize predicated sdiv.
We are currently focused on vectorizing predicated div instructions, and our implementation shows around 20-30% improvement on several tests of coremark-pro and denbench.

I believe that in micro benchmarks that can be vectorized you can get nice speedups. The question is what happens end-to-end to regular applications? Do I have a slowdown? Code size increase because now all my divisions are guarded?
You could also guard those vectorizations with checks to ensure sdiv doesn't trap. This increases code size.

Jan 17 2018, 10:25 AM
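
To illustrate the guard discussed in the comment above, here is a minimal Python sketch of a predicated signed division that takes the wide path only when no lane can trap. The helper names (`sdiv_would_trap`, `c_sdiv`, `guarded_div`) are hypothetical and do not come from the patch. Python lists stand in for vector lanes, and 32-bit operands are assumed.

```python
INT32_MIN = -2**31  # assumption: lanes hold 32-bit signed integers


def sdiv_would_trap(a, b):
    """True for the two cases where a hardware sdiv can trap."""
    return b == 0 or (a == INT32_MIN and b == -1)


def c_sdiv(a, b):
    """Signed division that truncates toward zero, as sdiv does."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def guarded_div(xs, ys, mask):
    """Divide xs by ys in the lanes where mask is True; other lanes keep xs."""
    if not any(sdiv_would_trap(a, b) for a, b in zip(xs, ys)):
        # Wide path: every lane is safe, so divide all lanes and then select.
        wide = [c_sdiv(a, b) for a, b in zip(xs, ys)]
        return [q if m else a for q, a, m in zip(wide, xs, mask)]
    # Fallback path: divide only the active lanes, one at a time.
    return [c_sdiv(a, b) if m else a for a, b, m in zip(xs, ys, mask)]
```

The runtime check and the scalar fallback are the extra code that the comment refers to. In this sketch an active lane with a zero divisor still raises `ZeroDivisionError` in the fallback path, which plays the role of the trap.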
magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

Before accepting this patch, we really need to see benchmark results. I'm not going to change clang to start emitting non-UB divs if the perf is going to be horrible. We need data.
Otherwise I don't see the need for this poison version of division. Could you elaborate if your plan is to expose this somehow to the application developer?

I'm sorry if these questions have been properly answered in the past. If so, could you please link them here?

Jan 17 2018, 9:25 AM
magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

I agree with Eli's last comment. In fact, we can solve compatibility in both text and bitcode.
Eli already suggested a way to solve that in bitcode by inverting the bit (set means "mayOverflow" and unset means "NoOverflow").

Jan 17 2018, 4:57 AM
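
The bit inversion mentioned in the comment above can be sketched in Python as follows. The one-bit layout and the names `MAY_OVERFLOW_BIT`, `encode_flags`, and `decode_no_overflow` are assumptions made for illustration; they are not LLVM's actual bitcode encoding.

```python
# Hypothetical flag layout: a single bit, set only when the div may overflow.
MAY_OVERFLOW_BIT = 0x1


def encode_flags(no_overflow):
    """Write the inverted flag: the bit is set only for 'mayOverflow'."""
    return 0 if no_overflow else MAY_OVERFLOW_BIT


def decode_no_overflow(flags):
    """An unset bit means 'NoOverflow', including in files that never wrote it."""
    return (flags & MAY_OVERFLOW_BIT) == 0
```

With this encoding, a file written before the flag existed has the bit unset and therefore reads back as "NoOverflow", so its meaning does not change under a newer reader.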

Jan 16 2018

magabari updated the diff for D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

added scalarization logic in CodeGenPrepare (fixMayOverflowIntegerDiv) for targets that do not support may-overflow div.

Jan 16 2018, 1:44 AM

Jan 13 2018

magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

This looks like something that will be better served by an intrinsic (llvm.safe_(s|u)div or something like that), at least to begin with. Experimental intrinsics are a low-cost preferred way of trying out ideas like this without changing fundamental IR semantics.

Jan 13 2018, 11:09 PM
magabari added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.
  • Without any change in the backend (not even an abort) this will simply miscompile the non-nof version on most targets.
  • The way you are modeling the new flag means that existing bitcode/.ll files will change semantics when read with newer compilers. I'm not sure that is a good idea; at the very least, you have to provide AutoUpgrade logic for it.
Jan 13 2018, 11:04 PM

Jan 11 2018

magabari added reviewers for D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions: craig.topper, DavidKreitzer, hsaito, nlopes, reames, MatzeB, eli.friedman, rengolin, hfinkel.
Jan 11 2018, 4:31 AM
magabari created D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.
Jan 11 2018, 3:29 AM

Nov 20 2017

magabari committed rL318641: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
[LV][X86] Support of AVX2 Gathers code generation and update the LV with this
Nov 20 2017, 12:18 AM
magabari closed D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this by committing rL318641: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
Nov 20 2017, 12:18 AM

Nov 16 2017

magabari updated the diff for D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.

gathers will be allowed for Skylake currently

Nov 16 2017, 4:40 AM
magabari committed rL318385: [TTI][X86] update costs of interleaved load\store of i64\double.
[TTI][X86] update costs of interleaved load\store of i64\double
Nov 16 2017, 1:38 AM
magabari closed D40008: [X86][TTI] update costs of interleaved load\store of i64\double by committing rL318385: [TTI][X86] update costs of interleaved load\store of i64\double.
Nov 16 2017, 1:38 AM

Nov 15 2017

magabari updated the diff for D40008: [X86][TTI] update costs of interleaved load\store of i64\double.

fixed dorit notes

Nov 15 2017, 11:43 PM
magabari updated the diff for D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.

updating masked-intrinsic-cost.ll with the new costs of AVX2 and adding SKL check

Nov 15 2017, 4:59 AM

Nov 14 2017

magabari created D40008: [X86][TTI] update costs of interleaved load\store of i64\double.
Nov 14 2017, 12:34 AM
magabari retitled D40008: [X86][TTI] update costs of interleaved load\store of i64\double from [X86][TTI] update costs of interleaved load to [X86][TTI] update costs of interleaved load\store of i64\double.
Nov 14 2017, 12:34 AM

Nov 6 2017

magabari committed rL317471: [LV][X86] update the cost of interleaving mem. access of floats.
[LV][X86] update the cost of interleaving mem. access of floats
Nov 6 2017, 2:56 AM

Nov 5 2017

magabari committed rL317433: [REVERT][LV][X86] update the cost of interleaving mem. access of floats.
[REVERT][LV][X86] update the cost of interleaving mem. access of floats
Nov 5 2017, 1:37 AM
magabari committed rL317432: [LV][X86] update the cost of interleaving mem. access of floats.
[LV][X86] update the cost of interleaving mem. access of floats
Nov 5 2017, 1:07 AM
magabari closed D39403: [LV][X86] update the cost of interleaving mem. access of floats by committing rL317432: [LV][X86] update the cost of interleaving mem. access of floats.
Nov 5 2017, 1:06 AM
magabari updated the diff for D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
Nov 5 2017, 12:54 AM

Nov 2 2017

magabari updated the diff for D39403: [LV][X86] update the cost of interleaving mem. access of floats.

moving the test to cost model directory (and renaming it)

Nov 2 2017, 12:45 AM

Nov 1 2017

magabari updated subscribers of D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
Nov 1 2017, 6:35 AM
magabari added a comment to D39403: [LV][X86] update the cost of interleaving mem. access of floats.

Didn't @m_zuckerman put the previous tests into llvm/test/Analysis/CostModel/interleaved-*.ll ? Maybe match that?

Nov 1 2017, 6:33 AM
magabari updated the diff for D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
Nov 1 2017, 5:55 AM

Oct 31 2017

magabari added a comment to D39403: [LV][X86] update the cost of interleaving mem. access of floats.

@RKSimon can you take a look at the changes please?

Oct 31 2017, 3:08 AM
magabari updated the diff for D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
Oct 31 2017, 12:34 AM
magabari added a comment to D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.

fixed Elena comments

Oct 31 2017, 12:32 AM

Oct 30 2017

magabari updated the diff for D39403: [LV][X86] update the cost of interleaving mem. access of floats.
Oct 30 2017, 1:52 AM
magabari updated the diff for D39403: [LV][X86] update the cost of interleaving mem. access of floats.

updating diff with the comment fix

Oct 30 2017, 1:50 AM

Oct 29 2017

magabari added a comment to D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.

ping ping

Oct 29 2017, 6:32 AM
magabari created D39403: [LV][X86] update the cost of interleaving mem. access of floats.
Oct 29 2017, 2:56 AM

Oct 23 2017

magabari added a comment to D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.

ping

Oct 23 2017, 10:52 PM

Sep 18 2017

magabari updated the diff for D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
Sep 18 2017, 3:48 AM

Sep 17 2017

magabari committed rL313516: [X86][Codegen] adding masked gathers tests for avx2.
[X86][Codegen] adding masked gathers tests for avx2
Sep 17 2017, 11:51 PM
magabari closed D37800: [X86][Codegen] adding masked gathers tests for avx2 by committing rL313516: [X86][Codegen] adding masked gathers tests for avx2.
Sep 17 2017, 11:51 PM
magabari updated the diff for D37800: [X86][Codegen] adding masked gathers tests for avx2.
Sep 17 2017, 1:19 AM

Sep 13 2017

magabari created D37800: [X86][Codegen] adding masked gathers tests for avx2.
Sep 13 2017, 4:41 AM
magabari committed rL313132: [X86] Adding X86 Processor Families.
[X86] Adding X86 Processor Families
Sep 13 2017, 2:02 AM
magabari closed D35348: Adding all X86 Processor families which can help initializing several uArch properties by committing rL313132: [X86] Adding X86 Processor Families.
Sep 13 2017, 2:02 AM

Sep 10 2017

magabari added a comment to D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.

@magabari What is happening with this?

Sep 10 2017, 5:11 AM

Sep 7 2017

magabari added a comment to D30247: Epilog loop vectorization.

After this patch was posted, there was an RFC discussion regarding approach. We discussed, for example, just adding metadata (or keeping similar state) to restrict VF and rerunning the vectorizer to vectorize the epilogue loop. Can you please summarize that discussion and how it relates to what's here? Did the design of this patch change as a result of that discussion? If not, why not?

Sep 7 2017, 1:44 AM · Restricted Project

Aug 31 2017

magabari updated the diff for D35348: Adding all X86 Processor families which can help initializing several uArch properties.

fixed comment

Aug 31 2017, 1:46 AM
magabari added a comment to D35348: Adding all X86 Processor families which can help initializing several uArch properties.
Aug 31 2017, 1:08 AM
magabari updated the diff for D35348: Adding all X86 Processor families which can help initializing several uArch properties.

fixing Elena comments

Aug 31 2017, 1:04 AM

Aug 27 2017

magabari added a comment to D35348: Adding all X86 Processor families which can help initializing several uArch properties.

Simon, could you please take a look at the changes and see if it's okay now?

Aug 27 2017, 12:01 AM

Aug 21 2017

magabari updated the diff for D35348: Adding all X86 Processor families which can help initializing several uArch properties.
Aug 21 2017, 3:52 AM

Aug 14 2017

magabari added a reviewer for D35348: Adding all X86 Processor families which can help initializing several uArch properties: dorit.
Aug 14 2017, 3:46 AM
magabari added a comment to D35348: Adding all X86 Processor families which can help initializing several uArch properties.

ping ping

Aug 14 2017, 3:45 AM

Aug 1 2017

magabari added a comment to D35348: Adding all X86 Processor families which can help initializing several uArch properties.

ping

Aug 1 2017, 4:20 AM

Jul 27 2017

magabari committed rL309260: [TTI] fixing a bug in the isLegalMaskedScatter API.
[TTI] fixing a bug in the isLegalMaskedScatter API
Jul 27 2017, 3:29 AM
magabari closed D35786: [TTI] fixing a bug in the isLegalMaskedScatter API by committing rL309260: [TTI] fixing a bug in the isLegalMaskedScatter API.
Jul 27 2017, 3:29 AM
magabari updated the diff for D35348: Adding all X86 Processor families which can help initializing several uArch properties.
Jul 27 2017, 12:55 AM
magabari added inline comments to D35348: Adding all X86 Processor families which can help initializing several uArch properties.
Jul 27 2017, 12:54 AM

Jul 24 2017

magabari updated the diff for D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
Jul 24 2017, 1:40 AM
magabari added a comment to D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.

fixed simon notes

Jul 24 2017, 1:39 AM
magabari created D35786: [TTI] fixing a bug in the isLegalMaskedScatter API.
Jul 24 2017, 1:10 AM
magabari updated the diff for D35348: Adding all X86 Processor families which can help initializing several uArch properties.

replacing Generic with Others

Jul 24 2017, 12:39 AM

Jul 23 2017

magabari updated the diff for D35348: Adding all X86 Processor families which can help initializing several uArch properties.

adding ScatterOverhead and updating the max overhead.
max overhead can't be MAX_INT or MAX_UINT because it would overflow in the CM calculation and cause a wrong decision.

Jul 23 2017, 4:39 AM
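
The overflow concern in the update note above can be illustrated with a small Python sketch. The cost formula, the function name `gather_cost`, and the sentinel value 1024 are hypothetical and are not taken from the patch; 32-bit unsigned wraparound is simulated with a mask because Python integers do not overflow.

```python
UINT32_MAX = 2**32 - 1  # stands in for MAX_UINT on a 32-bit unsigned type


def gather_cost(num_elements, overhead, per_element=1):
    """Hypothetical cost: fixed overhead plus a per-element cost, wrapped to 32 bits."""
    return (overhead + num_elements * per_element) & 0xFFFFFFFF


# A "prohibitive" overhead of UINT32_MAX wraps around and looks very cheap.
wrapped = gather_cost(4, UINT32_MAX)

# A large but bounded overhead keeps the total cost large.
bounded = gather_cost(4, 1024)
```

Here `wrapped` comes out as 3 while `bounded` is 1028, so a cost comparison would wrongly prefer the variant that was meant to be ruled out.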
magabari created D35772: [LV][X86] Support of AVX2 Gathers code generation and update the LV with this.
Jul 23 2017, 3:55 AM

Jul 22 2017

magabari added inline comments to D35348: Adding all X86 Processor families which can help initializing several uArch properties.
Jul 22 2017, 11:40 PM

Jul 19 2017

magabari added inline comments to D35348: Adding all X86 Processor families which can help initializing several uArch properties.
Jul 19 2017, 4:20 AM
magabari updated the diff for D35348: Adding all X86 Processor families which can help initializing several uArch properties.

removing unnecessary processor
fixing uint issue

Jul 19 2017, 4:19 AM

Jul 13 2017

magabari updated the diff for D35348: Adding all X86 Processor families which can help initializing several uArch properties.
Jul 13 2017, 4:21 AM
magabari added a comment to D35191: [X86] Adding Fast AVX2 Gather as a subtarget feature.

test case will be provided in my next commit, which supports avx2 gather generation.

Can you put your next patch up for review now and make it dependent on this?

abandoned. please take a look at https://reviews.llvm.org/D35348

Jul 13 2017, 4:12 AM
magabari abandoned D35191: [X86] Adding Fast AVX2 Gather as a subtarget feature.

test case will be provided in my next commit, which supports avx2 gather generation.

Can you put your next patch up for review now and make it dependent on this?

Jul 13 2017, 4:11 AM
magabari created D35348: Adding all X86 Processor families which can help initializing several uArch properties.
Jul 13 2017, 4:08 AM

Jul 11 2017

magabari updated the diff for D35191: [X86] Adding Fast AVX2 Gather as a subtarget feature.
Jul 11 2017, 12:11 AM

Jul 10 2017

magabari added a comment to D35191: [X86] Adding Fast AVX2 Gather as a subtarget feature.

Wouldn't this be better done through X86TTIImpl::getGatherScatterOpCost ?

Jul 10 2017, 3:04 AM
magabari created D35191: [X86] Adding Fast AVX2 Gather as a subtarget feature.
Jul 10 2017, 1:01 AM

Jul 2 2017

magabari committed rL306974: [X86][CM] update add\sub costs of vectors of 64 in X86\SLM arch.
[X86][CM] update add\sub costs of vectors of 64 in X86\SLM arch
Jul 2 2017, 5:16 AM
magabari closed D33983: update add\sub costs of vectors of 64 in X86\SLM arch by committing rL306974: [X86][CM] update add\sub costs of vectors of 64 in X86\SLM arch.
Jul 2 2017, 5:16 AM

Jun 20 2017

magabari added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

ping

Jun 20 2017, 12:15 AM

Jun 15 2017

magabari updated the diff for D33983: update add\sub costs of vectors of 64 in X86\SLM arch.
Jun 15 2017, 1:46 AM
magabari added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

fixed test comment

Jun 15 2017, 1:46 AM

Jun 14 2017

magabari added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

I've added the SLP vectorization tests at rL305151

Simon, I saw that you have added that to the tests.
Is there a need to do something else?

Did the tests change at all with your cost model changes?

Jun 14 2017, 5:25 AM
magabari added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

I've added the SLP vectorization tests at rL305151

Jun 14 2017, 12:21 AM

Jun 7 2017

magabari added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

You might want to add a slm target to the llvm\test\Transforms\SLPVectorizer\X86\arith-add.ll, arith-mul.ll and arith-sub.ll tests as well.

Jun 7 2017, 6:05 AM
magabari created D33983: update add\sub costs of vectors of 64 in X86\SLM arch.
Jun 7 2017, 4:58 AM

May 30 2017

magabari added a comment to D33341: Enable vectorizer-maximize-bandwidth by default..

we're seeing nice improvements but also significant degradations, which we would like to investigate before the patch is committed.

May 30 2017, 12:43 AM

Jan 25 2017

magabari committed rL293040: [X86] enable memory interleaving for X86\SLM arch. .
[X86] enable memory interleaving for X86\SLM arch.
Jan 25 2017, 1:26 AM
magabari closed D28547: enable memory interleaving for X86\SLM arch. by committing rL293040: [X86] enable memory interleaving for X86\SLM arch. .
Jan 25 2017, 1:26 AM

Jan 24 2017

magabari added a comment to D28547: enable memory interleaving for X86\SLM arch..

Did you manage to do any perf tests on SLM hardware?

Jan 24 2017, 9:00 AM
magabari added inline comments to D28547: enable memory interleaving for X86\SLM arch..
Jan 24 2017, 1:10 AM

Jan 11 2017

magabari added reviewers for D28547: enable memory interleaving for X86\SLM arch.: delena, zvi, RKSimon, mkuper.
Jan 11 2017, 1:40 AM