GGanesh (Ganesh Gopalasubramanian)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 25 2016, 3:58 AM (259 w, 21 h)

Recent Activity

Wed, Mar 24

GGanesh added a comment to D94395: [X86] AMD Znver3 Scheduler descriptions and llvm-mca tests.

The FPU pipe counters aren't in the PPR either. I have raised a ticket to get this updated in the PPR. Unfortunately, until the document is updated, these events can't be enabled.

Wed, Mar 24, 7:51 AM · Restricted Project

Fri, Mar 19

GGanesh added a comment to D94395: [X86] AMD Znver3 Scheduler descriptions and llvm-mca tests.

My colleague has submitted the libpfm4 patch to the libpfm4 community. You can use this patch to enable exegesis for znver3. Thank you for the help!

Fri, Mar 19, 11:53 PM · Restricted Project

Tue, Mar 16

GGanesh added a comment to D94395: [X86] AMD Znver3 Scheduler descriptions and llvm-mca tests.

We have exchanged mails with Eranian about libpfm4 support. We will upload the libpfm4 patch shortly (in 3-4 days). The patch will be based on the latest PPR manual (https://www.amd.com/system/files/TechDocs/55898_pub.zip). I believe that will keep things moving so that the numbers and details can be verified with exegesis.

Tue, Mar 16, 1:04 PM · Restricted Project

Feb 26 2021

GGanesh added a comment to D94395: [X86] AMD Znver3 Scheduler descriptions and llvm-mca tests.

I can work on the znver1 and znver2 models as well. I have a znver2 machine. Let me check if I can get a znver1 machine and run exegesis to correct these latency/throughput numbers.

Feb 26 2021, 10:59 AM · Restricted Project

Feb 25 2021

GGanesh added a comment to D94395: [X86] AMD Znver3 Scheduler descriptions and llvm-mca tests.

We have started working on the libpfm patch. We are working on getting it posted very soon! Hopefully, once that is in place, we will be able to measure these numbers on znver3 hardware and correct them accordingly.

Feb 25 2021, 8:14 PM · Restricted Project

Jan 26 2021

GGanesh added a comment to D94395: [X86] AMD Znver3 Scheduler descriptions and llvm-mca tests.

I would like to know whether this patch can be approved now, with the numbers checked later once a CPU is accessible.
Since this extends and tweaks the znver1->znver2 path, I would request that if it is doable.

Jan 26 2021, 7:39 PM · Restricted Project

Jan 20 2021

GGanesh added a comment to D94395: [X86] AMD Znver3 Scheduler descriptions and llvm-mca tests.

Ping!

Jan 20 2021, 9:20 AM · Restricted Project

Jan 12 2021

GGanesh requested review of D94543: [DAGCombine] Optimize pow(X, (2/3)) and Pow(X,(3/2)).
Jan 12 2021, 12:33 PM · Restricted Project

Jan 11 2021

GGanesh added a comment to D94436: [X86] Add the FSRM feature (Fast Short Rep Mov) to Zen3..

The enhancement applies to specific string lengths. I will check it and if need be will submit changes with respect to this prerequisite.

Jan 11 2021, 9:28 PM · Restricted Project
GGanesh requested review of D94395: [X86] AMD Znver3 Scheduler descriptions and llvm-mca tests.
Jan 11 2021, 2:01 AM · Restricted Project

Jan 8 2021

GGanesh committed rG9386483b7142: [X86] Add TLBSYNC, INVLPGB and SNP instructions (authored by GGanesh).
[X86] Add TLBSYNC, INVLPGB and SNP instructions
Jan 8 2021, 9:10 AM
GGanesh closed D94134: [X86] Add TLBSYNC, INVLPGB and SNP instructions.
Jan 8 2021, 9:09 AM · lld, Restricted Project

Jan 6 2021

GGanesh committed rGdbfc1ac4d86c: [X86] Update tests for znver3 (authored by GGanesh).
[X86] Update tests for znver3
Jan 6 2021, 10:24 PM
GGanesh closed D92812: [X86] Update tests for znver3.
Jan 6 2021, 10:24 PM · Restricted Project, Restricted Project, Restricted Project
GGanesh retitled D92812: [X86] Update tests for znver3 from [X86] AMD Znver3 (Family 19H) Enablement to [X86] Update tests for znver3.
Jan 6 2021, 2:29 AM · Restricted Project, Restricted Project, Restricted Project
GGanesh updated the diff for D94134: [X86] Add TLBSYNC, INVLPGB and SNP instructions.

Updated the patch for the review comments from @craig.topper and @pengfei

  1. The instructions are updated for prefix specifiers
  2. Except for pvalidate, all the SNP instructions are valid only in a 64-bit environment. Corrected the test accordingly.
  3. The modes (In64BitMode, In64BitMode) are updated in the instruction description.
Jan 6 2021, 1:53 AM · lld, Restricted Project

Jan 5 2021

GGanesh requested review of D94134: [X86] Add TLBSYNC, INVLPGB and SNP instructions.
Jan 5 2021, 3:51 PM · lld, Restricted Project
GGanesh updated the diff for D92812: [X86] Update tests for znver3.

Updating the patch so that the simplified patch adds only a few missing znver3 tests. The subsequent patches will comprehensively enable other znver3 features.

Jan 5 2021, 1:15 PM · Restricted Project, Restricted Project, Restricted Project

Dec 8 2020

GGanesh added a comment to D92812: [X86] Update tests for znver3.

it looks like a very bad merge imo.

Yep, Thank you! I will post smaller incremental patches.

Dec 8 2020, 12:48 PM · Restricted Project, Restricted Project, Restricted Project

Dec 7 2020

GGanesh requested review of D92812: [X86] Update tests for znver3.
Dec 7 2020, 8:49 PM · Restricted Project, Restricted Project, Restricted Project

Apr 22 2020

GGanesh added a comment to D27028: Add intrinsics for constrained floating point operations.

@andrew.w.kaylor I went through the mailing list thread regarding this change and saw "Eventually, we’ll want to go back and teach specific optimizations to understand the intrinsics so that where possible optimizations can be performed in a manner consistent with dynamic rounding modes and strict exception handling.".
Do you have any references or plans on how to teach specific optimizations about this?

Apr 22 2020, 9:12 AM

Jan 30 2020

GGanesh added a comment to D72032: [llvm-exegesis] Add pfm counters for Zen2 (znver2)..

Good with me! I have moved some stones to get the wikichip URL updated.

Jan 30 2020, 1:33 AM · Restricted Project
GGanesh added inline comments to D72032: [llvm-exegesis] Add pfm counters for Zen2 (znver2)..
Jan 30 2020, 1:14 AM · Restricted Project
GGanesh added inline comments to D72032: [llvm-exegesis] Add pfm counters for Zen2 (znver2)..
Jan 30 2020, 12:36 AM · Restricted Project
GGanesh added a comment to D73172: [X86][Sched] A bunch of fixes to the Zen2 sched model latencies..

Yes looks good to me.

Jan 30 2020, 12:17 AM · Restricted Project

Jan 27 2020

GGanesh added a comment to D73172: [X86][Sched] A bunch of fixes to the Zen2 sched model latencies..

Changes done are obvious. Are you using libpfm with znver2 for verifying these?

Jan 27 2020, 1:37 AM · Restricted Project

Jan 9 2020

GGanesh committed rG3408940f7369: [X86] AMD Znver2 (Rome) Scheduler enablement (authored by GGanesh).
[X86] AMD Znver2 (Rome) Scheduler enablement
Jan 9 2020, 11:20 AM
GGanesh closed D66088: AMD Znver2 (Rome) Scheduler enablement.
Jan 9 2020, 11:20 AM · Restricted Project
GGanesh committed rGd7a715bebba5: [X86] AMD Znver2 (Rome) Scheduler enablement (authored by GGanesh).
[X86] AMD Znver2 (Rome) Scheduler enablement
Jan 9 2020, 6:18 AM

Jan 4 2020

GGanesh added a comment to D72032: [llvm-exegesis] Add pfm counters for Zen2 (znver2)..

RKSimon, could you please commit D66088 on my behalf? I think my GitHub account has not been added.

Jan 4 2020, 6:46 AM · Restricted Project

Dec 31 2019

GGanesh added a comment to D72032: [llvm-exegesis] Add pfm counters for Zen2 (znver2)..

We are checking the libpfm enablement. I can commit D66088 if we are okay without libpfm.

Dec 31 2019, 4:22 AM · Restricted Project

Nov 18 2019

GGanesh added a comment to D66088: AMD Znver2 (Rome) Scheduler enablement.

I agree on having a patch to enable exegesis. Will post that in a couple of days. As mentioned, this is part of the initial plan as well.

Nov 18 2019, 5:14 PM · Restricted Project

Nov 11 2019

GGanesh added a comment to D66088: AMD Znver2 (Rome) Scheduler enablement.

Ping!

Nov 11 2019, 12:04 AM · Restricted Project

Nov 3 2019

GGanesh added a comment to D66088: AMD Znver2 (Rome) Scheduler enablement.

Ping!

Nov 3 2019, 9:47 PM · Restricted Project

Oct 22 2019

GGanesh updated the diff for D66088: AMD Znver2 (Rome) Scheduler enablement.

Updated the patch for review comments.

Oct 22 2019, 3:18 PM · Restricted Project

Oct 16 2019

GGanesh updated the diff for D66088: AMD Znver2 (Rome) Scheduler enablement.

Updated for review comments and latency modifications for the MUL, VZEROUPPER, and CLZERO instructions.

Oct 16 2019, 3:54 PM · Restricted Project
GGanesh updated the diff for D66088: AMD Znver2 (Rome) Scheduler enablement.

The changes for review comments are incorporated.
The latency information for the CLZERO, VZEROUPPER, and MUL instructions is updated.

Oct 16 2019, 10:46 AM · Restricted Project

Aug 18 2019

GGanesh added a comment to D66088: AMD Znver2 (Rome) Scheduler enablement.

Ping!

Aug 18 2019, 10:58 PM · Restricted Project

Aug 12 2019

GGanesh created D66088: AMD Znver2 (Rome) Scheduler enablement.
Aug 12 2019, 6:58 AM · Restricted Project

Feb 26 2019

GGanesh committed rG4f171d276175: [X86] AMD znver2 enablement (authored by GGanesh).
[X86] AMD znver2 enablement
Feb 26 2019, 9:15 AM
GGanesh committed rGe172d7008d0c: [X86] AMD znver2 enablement (authored by GGanesh).
[X86] AMD znver2 enablement
Feb 26 2019, 8:55 AM

Feb 25 2019

GGanesh committed rGf03939fcc3ab: Test commit (remove a blank space) (authored by GGanesh).
Test commit (remove a blank space)
Feb 25 2019, 4:27 AM

Feb 19 2019

GGanesh updated the diff for D58343: Enablement for AMD znver2 architecture - skeleton patch.
Feb 19 2019, 9:04 AM · Restricted Project, Restricted Project
GGanesh updated the diff for D58343: Enablement for AMD znver2 architecture - skeleton patch.
Feb 19 2019, 8:59 AM · Restricted Project, Restricted Project
GGanesh updated the diff for D58343: Enablement for AMD znver2 architecture - skeleton patch.

Addressed the comments from Craig Topper

Feb 19 2019, 3:29 AM · Restricted Project, Restricted Project

Feb 18 2019

GGanesh created D58343: Enablement for AMD znver2 architecture - skeleton patch.
Feb 18 2019, 4:26 AM · Restricted Project, Restricted Project

Jul 23 2018

GGanesh accepted D49392: [NFC][MCA] ZnVer1: add partial-reg-update tests.

LGTM!

Jul 23 2018, 2:50 AM
GGanesh accepted D49393: [NFC][MCA] ZnVer1: Update RegisterFile to identify false dependencies on partially written registers..

I am fine!

Jul 23 2018, 2:47 AM

Jun 19 2018

GGanesh accepted D47676: [X86][Znver1] Specify Register Files, RCU; FP scheduler capacity..
Jun 19 2018, 11:16 PM
GGanesh added a comment to D47676: [X86][Znver1] Specify Register Files, RCU; FP scheduler capacity..

LGTM!

Jun 19 2018, 9:09 PM

Jun 3 2018

GGanesh added inline comments to D47676: [X86][Znver1] Specify Register Files, RCU; FP scheduler capacity..
Jun 3 2018, 10:31 PM

May 3 2018

GGanesh added a comment to D46229: [X86][AVX] Tag VPMOVSX/VPMOVZX ymm instructions as WriteShuffle256..

Sorry! Missed this thread completely!
LGTM!

May 3 2018, 10:41 AM

May 1 2018

GGanesh added a comment to D46314: [X86][AMD][Bulldozer] Fix Bulldozer Model 2 detection..

gcc uses feature bits. That would be a differential ISA with respect to the previous gen. I think the idea is to get the ISA list as close as possible to the underlying arch. When an older version of the compiler that doesn't have the arch enabled gets used, the compiler can fall back to the closest arch which enables the ISA list.

May 1 2018, 9:26 AM

Apr 8 2018

GGanesh added inline comments to D45380: [X86] Add SchedWrites for CMOV and SETCC. Use them to remove InstRWs..
Apr 8 2018, 3:07 AM

Apr 7 2018

GGanesh added a comment to D44841: [X86][Znver1] Remove InstRWs for BLENDVPS/PD.

It shouldn't differ.
The xmm version has a 1-cycle latency and the ymm version has a 2-cycle latency for both AVX and SSE.

Apr 7 2018, 9:50 PM

Mar 29 2018

GGanesh added inline comments to D44972: [X86] Add SchedRW for PMULLD.
Mar 29 2018, 9:20 PM

Mar 27 2018

GGanesh added inline comments to D44924: [X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes.
Mar 27 2018, 6:38 AM

Mar 25 2018

GGanesh added a comment to D44687: [SchedModel] Remove instregex entries that don't match any instructions (WIP).

Looks good to me!

Mar 25 2018, 7:57 PM
GGanesh added inline comments to D44879: [X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881).
Mar 25 2018, 7:57 PM

Mar 21 2018

GGanesh added inline comments to D44687: [SchedModel] Remove instregex entries that don't match any instructions (WIP).
Mar 21 2018, 9:34 PM

Aug 30 2017

GGanesh updated the diff for D36617: AMD Zen Scheduler Model Update.

Updated for review comments from Craig Topper!

Aug 30 2017, 3:54 AM

Aug 22 2017

GGanesh added a comment to D36617: AMD Zen Scheduler Model Update.

Simon! If you are okay with the patch, can you please commit the patch on my behalf!

Aug 22 2017, 8:46 AM
GGanesh updated the diff for D36617: AMD Zen Scheduler Model Update.

Updated as per Javed's comments!

Aug 22 2017, 12:15 AM

Aug 20 2017

GGanesh updated the diff for D36617: AMD Zen Scheduler Model Update.

Updated the patch as per Simon's comments.
Added the FP instruction itineraries, which include SSE4A and SHA instructions.

Aug 20 2017, 12:30 PM

Aug 18 2017

GGanesh added a comment to D36617: AMD Zen Scheduler Model Update.

Yes Simon! I will include the SSE4A instructions and their itineraries in the next patch. I will include tests verifying them as well.
If this patch is okay, can you please commit this patch on my behalf?

Aug 18 2017, 7:10 AM
GGanesh added a comment to D36617: AMD Zen Scheduler Model Update.

Simon, Craig Topper! My next increment is ready. If this patch can be accepted and committed, I will rebase and submit the next patch.
Or should I submit the next patch as an incremental patch with the changes put forth in this patch? Please help!

Aug 18 2017, 3:19 AM

Aug 14 2017

GGanesh updated the diff for D36617: AMD Zen Scheduler Model Update.

Updated for the itineraries of memory variants of the instructions.

Aug 14 2017, 2:41 AM

Aug 11 2017

GGanesh created D36617: AMD Zen Scheduler Model Update.
Aug 11 2017, 7:46 AM

Jul 19 2017

GGanesh added a comment to D35293: AMD znver1 Initial Scheduler model.

Thanks all!

Jul 19 2017, 8:21 AM

Jul 18 2017

GGanesh added a comment to D35293: AMD znver1 Initial Scheduler model.

Simon! If you are fine, can you please commit the patch on my behalf. I am yet to get commit access rights. Probably, after this patch, I will try to get it.

Jul 18 2017, 4:24 AM
GGanesh updated the diff for D35293: AMD znver1 Initial Scheduler model.

Patch update: For newer testcases.

Jul 18 2017, 4:02 AM

Jul 17 2017

GGanesh updated the diff for D35293: AMD znver1 Initial Scheduler model.

Updated as per Javed's review comments!

Jul 17 2017, 3:51 AM

Jul 16 2017

GGanesh updated the diff for D35293: AMD znver1 Initial Scheduler model.

Updated as per the review comments.

Jul 16 2017, 10:56 PM

Jul 12 2017

GGanesh added inline comments to D35293: AMD znver1 Initial Scheduler model.
Jul 12 2017, 12:36 AM
GGanesh created D35293: AMD znver1 Initial Scheduler model.
Jul 12 2017, 12:24 AM

Feb 8 2017

GGanesh added a comment to D29386: Clzero flag addition and inclusion under znver1.

Thank you @craig.topper.

Feb 8 2017, 9:58 PM
GGanesh added a comment to D29385: Clzero intrinsic and its addition under znver1.

@craig.topper If you are okay, can you please commit the changes on my behalf?

Feb 8 2017, 6:38 AM
GGanesh added a comment to D29385: Clzero intrinsic and its addition under znver1.

I think it is okay even if we don't set the mayStore attribute.
I wrote a simple test to check the following

  1. Schedules based on the instruction attribute
  2. Side-effect handling
Feb 8 2017, 4:00 AM

Feb 7 2017

GGanesh updated the diff for D29385: Clzero intrinsic and its addition under znver1.

Updated the test file "x86-32.s" for clzero only test!

Feb 7 2017, 7:56 AM
GGanesh updated the diff for D29386: Clzero flag addition and inclusion under znver1.

Updated the builtins test for "__builtin_ia32_clzero"

Feb 7 2017, 7:54 AM
GGanesh updated the diff for D29386: Clzero flag addition and inclusion under znver1.

Updated for review comments.

Feb 7 2017, 2:46 AM
GGanesh updated the diff for D29385: Clzero intrinsic and its addition under znver1.

Updated for the review comments

Feb 7 2017, 2:45 AM

Feb 1 2017

GGanesh created D29386: Clzero flag addition and inclusion under znver1.
Feb 1 2017, 2:42 AM
GGanesh created D29385: Clzero intrinsic and its addition under znver1.
Feb 1 2017, 2:38 AM

Jan 9 2017

GGanesh added a comment to D28018: AMD family 17h (znver1) enablement.

If okay, can you please commit these on my behalf? I don't have write access.

Jan 9 2017, 12:27 PM
GGanesh added a comment to D28018: AMD family 17h (znver1) enablement.

Yes, true. I mentioned that for the grouping or the order of the features enabled. These initFeatureMap entries are done based on the intrinsics and the CodeGen part.

Jan 9 2017, 12:26 PM
GGanesh updated the diff for D28017: AMD family 17h (znver1) enablement.

Adding znver1 to the following tests:
a. LZCNT
b. Slow SHLD
c. slow unaligned memory

Jan 9 2017, 8:46 AM
GGanesh updated the diff for D28018: AMD family 17h (znver1) enablement.

Fallback to CK_BTVER1 is ok but not to CK_BTVER2. This is not possible because of the partial YMM writes. They have different behavior for znver1 with AVX and their legacy SIMD counterparts. So, as of now, leaving them in alphabetical order.

Jan 9 2017, 8:19 AM

Jan 8 2017

GGanesh added inline comments to D28018: AMD family 17h (znver1) enablement.
Jan 8 2017, 12:33 PM
GGanesh updated the diff for D28017: AMD family 17h (znver1) enablement.

The clzero intrinsic handling and feature addition will be handled as a separate patch.
Added movbe and sse4a into ISA list of znver1.

Jan 8 2017, 8:42 AM
GGanesh updated the diff for D28018: AMD family 17h (znver1) enablement.

The clzero builtins and feature addition will be handled separately in another patch.
SSE4a and movbe are added to the ISA list.

Jan 8 2017, 8:40 AM

Dec 21 2016

GGanesh added a comment to D28017: AMD family 17h (znver1) enablement.

I am preparing a patch that doesn't include the clzero feature.
I will submit a separate patch for the clzero feature.

Dec 21 2016, 7:22 PM
GGanesh updated D28017: AMD family 17h (znver1) enablement.
Dec 21 2016, 3:08 AM
GGanesh updated D28018: AMD family 17h (znver1) enablement.
Dec 21 2016, 3:08 AM
GGanesh retitled D28018: AMD family 17h (znver1) enablement from to AMD family 17h (znver1) enablement.
Dec 21 2016, 3:06 AM
GGanesh retitled D28017: AMD family 17h (znver1) enablement from to AMD family 17h (znver1) enablement.
Dec 21 2016, 3:03 AM

May 17 2016

GGanesh added a comment to D19795: Add new flag and intrinsic support for MWAITX and MONITORX instructions..

Thank you!

May 17 2016, 10:16 PM

May 13 2016

GGanesh updated the diff for D19795: Add new flag and intrinsic support for MWAITX and MONITORX instructions..

Added FeatureMWAITX to bdver4.

May 13 2016, 2:59 AM

May 11 2016

GGanesh updated the diff for D19795: Add new flag and intrinsic support for MWAITX and MONITORX instructions..

Incorporated comments from Simon!

May 11 2016, 11:08 PM

May 9 2016

GGanesh added a comment to D19796: Add new intrinsic support for MONITORX and MWAITX instructions..

PING!

May 9 2016, 2:57 AM