Mon, Oct 14

andreadb committed rGb744abb4f6a9: [X86][BtVer2] Improved latency and throughput of float/vector loads and stores. (authored by andreadb).
[X86][BtVer2] Improved latency and throughput of float/vector loads and stores.
Mon, Oct 14, 4:14 AM
andreadb closed D68871: [X86][BtVer2] Improved latency and throughput of float/vector loads and stores..
Mon, Oct 14, 4:14 AM · Restricted Project

Fri, Oct 11

andreadb added a comment to D68871: [X86][BtVer2] Improved latency and throughput of float/vector loads and stores..

Posted the output from llvm-exegesis for all the affected instructions.

Fri, Oct 11, 8:14 AM · Restricted Project
andreadb created D68871: [X86][BtVer2] Improved latency and throughput of float/vector loads and stores..
Fri, Oct 11, 8:14 AM · Restricted Project
andreadb added a comment to D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel.

I am not convinced that this patch is correct. Isn’t the problem that your model was wrongly marked as complete?

Fri, Oct 11, 3:31 AM · Restricted Project
andreadb added a comment to D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel.

I am not convinced that this patch is correct. Isn’t the problem that your model was wrongly marked as complete?

Fri, Oct 11, 2:46 AM · Restricted Project

Thu, Oct 10

andreadb accepted D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219).

LGTM

Thu, Oct 10, 7:16 AM · Restricted Project
andreadb added inline comments to D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219).
Thu, Oct 10, 4:14 AM · Restricted Project

Wed, Oct 9

andreadb added inline comments to D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219).
Wed, Oct 9, 12:44 PM · Restricted Project
andreadb added a comment to D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219).

Thanks Roman.

Wed, Oct 9, 10:56 AM · Restricted Project

Tue, Oct 8

andreadb committed rG8d6651f7b11e: [MCA][LSUnit] Track loads and stores until retirement. (authored by andreadb).
[MCA][LSUnit] Track loads and stores until retirement.
Tue, Oct 8, 3:46 AM
andreadb closed D68266: [MCA][LSUnit] Track loads and stores until retirement..
Tue, Oct 8, 3:46 AM · Restricted Project

Fri, Oct 4

andreadb added a comment to D68266: [MCA][LSUnit] Track loads and stores until retirement..

Thanks Roman,

Fri, Oct 4, 3:52 AM · Restricted Project

Tue, Oct 1

andreadb created D68266: [MCA][LSUnit] Track loads and stores until retirement..
Tue, Oct 1, 5:08 AM · Restricted Project

Mon, Sep 30

andreadb committed rG2730df2e164b: [MCA] Use references to LSUnitBase in class Scheduler and add helper methods to… (authored by andreadb).
[MCA] Use references to LSUnitBase in class Scheduler and add helper methods to…
Mon, Sep 30, 10:24 AM
andreadb accepted D68190: [llvm-mca] Add a -mattr flag.

LGTM

Mon, Sep 30, 8:47 AM · Restricted Project

Sun, Sep 22

andreadb added inline comments to D67875: [X86] X86DAGToDAGISel::matchBEXTRFromAndImm(): if can't use BEXTR, fallback to BZHI (PR43381).
Sun, Sep 22, 1:36 PM · Restricted Project

Thu, Sep 19

andreadb committed rGe0900f285bb5: [MCA] Improved cost computation for loop carried dependencies in the bottleneck… (authored by andreadb).
[MCA] Improved cost computation for loop carried dependencies in the bottleneck…
Thu, Sep 19, 9:05 AM

Sep 6 2019

andreadb accepted D67192: [X86] Use MOVSX instead of CBW to extend i8 to AX for i8 sdiv..

Looks good to me.
Thanks for the changes in FixupBWInsts. Personally, I was already happy with the previous version of the patch, but this one looks even better.

Sep 6 2019, 3:51 AM · Restricted Project

Sep 2 2019

andreadb committed rG528f68144b7e: [X86][BtVer2] Fix latency and throughput of conditional SIMD store instructions. (authored by andreadb).
[X86][BtVer2] Fix latency and throughput of conditional SIMD store instructions.
Sep 2 2019, 5:34 AM
andreadb added a comment to D66801: [X86][BtVer2] Fix latency and throughput of conditional SIMD store instructions..

LGTM - thanks @andreadb, I think this is the way to go. As ever, it's up to the people responsible for the other models to tweak as necessary; as you said, this is NFC for everything but btver2.

I don't see accurate numbers for these ops on Agner/instlatx64 for any target; I'm curious how they've checked the perf range for different mask register values (although Agner does mention that btver2 is often bad with VMASKMOVPS loads when mask == 0).

@lebedev.ri By the looks of it llvm-exegesis always uses zero registers for those tests - does it alter if you hack in other values?

Sep 2 2019, 2:47 AM · Restricted Project

Aug 30 2019

andreadb updated the diff for D66801: [X86][BtVer2] Fix latency and throughput of conditional SIMD store instructions..

Patch updated.

Aug 30 2019, 10:01 AM · Restricted Project
andreadb added inline comments to D66801: [X86][BtVer2] Fix latency and throughput of conditional SIMD store instructions..
Aug 30 2019, 7:15 AM · Restricted Project

Aug 27 2019

andreadb committed rG2f51a43f8c2b: [Tblgen][MCA] Add the ability to mark groups as LoadQueue and StoreQueue. NFCI (authored by andreadb).
[Tblgen][MCA] Add the ability to mark groups as LoadQueue and StoreQueue. NFCI
Aug 27 2019, 11:30 AM
andreadb closed D66810: [Tblgen][MCA] Add the ability to mark groups as LoadQueue and StoreQueue. NFCI.
Aug 27 2019, 11:29 AM · Restricted Project
andreadb added a comment to D66810: [Tblgen][MCA] Add the ability to mark groups as LoadQueue and StoreQueue. NFCI.

LGTM, nice change! I assume all of the existing LSUnit testing is sufficient.

Aug 27 2019, 11:01 AM · Restricted Project
andreadb created D66810: [Tblgen][MCA] Add the ability to mark groups as LoadQueue and StoreQueue. NFCI.
Aug 27 2019, 10:37 AM · Restricted Project
andreadb created D66801: [X86][BtVer2] Fix latency and throughput of conditional SIMD store instructions..
Aug 27 2019, 6:48 AM · Restricted Project

Aug 25 2019

andreadb abandoned D5356: InstCombine: constant comparison involving ashr is wrongly simplified (PR20945)..
Aug 25 2019, 8:36 AM

Aug 23 2019

andreadb committed rG8e9af64da6c9: [X86][BtVer2] Add a read-advance to every implicit register use of… (authored by andreadb).
[X86][BtVer2] Add a read-advance to every implicit register use of…
Aug 23 2019, 5:21 AM
andreadb committed rG1630f64e2f6c: [X86][BtVer2] Fix latency of ALU RMW instructions. (authored by andreadb).
[X86][BtVer2] Fix latency of ALU RMW instructions.
Aug 23 2019, 4:38 AM
andreadb added a comment to D66636: [X86][BtVer2] Fix latency of ALU RMW instructions..

LGTM - for the ADC/SBB fix I'd recommend adding a WriteADCRMW class instead of adding yet more overrides.

Aug 23 2019, 4:16 AM · Restricted Project
andreadb created D66636: [X86][BtVer2] Fix latency of ALU RMW instructions..
Aug 23 2019, 3:23 AM · Restricted Project

Aug 22 2019

andreadb committed rGc9649eb9dab7: [X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions. (authored by andreadb).
[X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions.
Aug 22 2019, 8:21 AM
andreadb updated the diff for D66547: [X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions..

Patch rebased.

Aug 22 2019, 7:51 AM · Restricted Project
andreadb committed rG589cb004dee7: [MCA] consistently use MCPhysReg instead of unsigned as register type. NFCI (authored by andreadb).
[MCA] consistently use MCPhysReg instead of unsigned as register type. NFCI
Aug 22 2019, 6:32 AM
andreadb committed rGc6744055adf9: [X86][BtVer2] Fix latency and throughput of XCHG and XADD. (authored by andreadb).
[X86][BtVer2] Fix latency and throughput of XCHG and XADD.
Aug 22 2019, 4:33 AM

Aug 21 2019

andreadb updated the diff for D66535: [X86][BtVer2] Fix latency and throughput of XCHG and XADD..

Address review comment.

Aug 21 2019, 12:58 PM · Restricted Project
andreadb created D66547: [X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions..
Aug 21 2019, 11:50 AM · Restricted Project
andreadb updated the diff for D66535: [X86][BtVer2] Fix latency and throughput of XCHG and XADD..

Patch updated.

Aug 21 2019, 8:10 AM · Restricted Project
andreadb updated the summary of D66535: [X86][BtVer2] Fix latency and throughput of XCHG and XADD..
Aug 21 2019, 7:52 AM · Restricted Project
andreadb created D66535: [X86][BtVer2] Fix latency and throughput of XCHG and XADD..
Aug 21 2019, 7:19 AM · Restricted Project

Aug 20 2019

andreadb committed rG2e897a94f587: [X86][BtVer2] Use ReadAfterLd entries for the register operands of CMPXCHG. (authored by andreadb).
[X86][BtVer2] Use ReadAfterLd entries for the register operands of CMPXCHG.
Aug 20 2019, 10:07 AM
andreadb committed rG16111d3795c7: [X86][BtVer2] Fix latency and throughput of atomic INC/DEC/NEG/NOT. (authored by andreadb).
[X86][BtVer2] Fix latency and throughput of atomic INC/DEC/NEG/NOT.
Aug 20 2019, 7:34 AM
andreadb closed D66469: [X86][BtVer2] Fix latency and throughput of atomic INC/DEC/NEG/NOT..
Aug 20 2019, 7:33 AM · Restricted Project
andreadb created D66469: [X86][BtVer2] Fix latency and throughput of atomic INC/DEC/NEG/NOT..
Aug 20 2019, 4:46 AM · Restricted Project
andreadb committed rGb1bdd97a2671: [X86][Btver2] Fix latency and throughput of CMPXCHG instructions. (authored by andreadb).
[X86][Btver2] Fix latency and throughput of CMPXCHG instructions.
Aug 20 2019, 3:25 AM

Aug 19 2019

andreadb updated the diff for D66424: [X86][Btver2] Fix latency and throughput of CMPXCHG instructions..

Patch updated.

Aug 19 2019, 12:21 PM · Restricted Project
andreadb committed rGbf989187c30f: [X86] Move scheduling tests for CMPXCHG to the corresponding resources-x86_64.s… (authored by andreadb).
[X86] Move scheduling tests for CMPXCHG to the corresponding resources-x86_64.s…
Aug 19 2019, 11:20 AM
andreadb updated the diff for D66424: [X86][Btver2] Fix latency and throughput of CMPXCHG instructions..

Address review comment.

Aug 19 2019, 10:18 AM · Restricted Project
andreadb committed rGecbaba672e18: [X86] Added extensive scheduling model tests for all the CMPXCHG variants. NFC (authored by andreadb).
[X86] Added extensive scheduling model tests for all the CMPXCHG variants. NFC
Aug 19 2019, 10:08 AM
andreadb added inline comments to D66424: [X86][Btver2] Fix latency and throughput of CMPXCHG instructions..
Aug 19 2019, 9:54 AM · Restricted Project
andreadb created D66424: [X86][Btver2] Fix latency and throughput of CMPXCHG instructions..
Aug 19 2019, 9:17 AM · Restricted Project

Aug 15 2019

andreadb committed rG3de2f0330f4b: [MCA] Slightly refactor class RetireControlUnit, and add the ability to… (authored by andreadb).
[MCA] Slightly refactor class RetireControlUnit, and add the ability to…
Aug 15 2019, 8:30 AM
andreadb committed rG7aa0dbb664ea: [MCA] Slightly refactor the logic in ResourceManager. NFCI (authored by andreadb).
[MCA] Slightly refactor the logic in ResourceManager. NFCI
Aug 15 2019, 5:41 AM

Aug 9 2019

andreadb committed rG8616a7702636: [MCA] Fix MSVC 19.16 build with libc++ (authored by andreadb).
[MCA] Fix MSVC 19.16 build with libc++
Aug 9 2019, 5:43 AM
andreadb committed rGcbec9af6bfb0: [MCA] Add flag -show-encoding to llvm-mca. (authored by andreadb).
[MCA] Add flag -show-encoding to llvm-mca.
Aug 9 2019, 4:30 AM
andreadb added a comment to D65844: [MCA] Fix MSVC 19.16 build with libc++.

Could you commit this on my behalf? Please edit the comment if you wish.

Aug 9 2019, 4:29 AM · Restricted Project
andreadb added a comment to D65948: [MCA] Add flag -show-encoding to llvm-mca..

Thanks Guillaume.

Aug 9 2019, 4:20 AM · Restricted Project

Aug 8 2019

andreadb added a comment to D65948: [MCA] Add flag -show-encoding to llvm-mca..

Is it already transitively enabled by -all-stats/-all-views, i think not?

No it is not.

TBH I'd prefer that this is disabled by default - it bulks out the instruction view quite a bit

By default - i totally agree. But not by an explicit "please show me everything; no but really please do".

Aug 8 2019, 9:47 AM · Restricted Project
andreadb added a comment to D65948: [MCA] Add flag -show-encoding to llvm-mca..

Is it already transitively enabled by -all-stats/-all-views, i think not?

Aug 8 2019, 9:32 AM · Restricted Project
andreadb updated the diff for D65948: [MCA] Add flag -show-encoding to llvm-mca..

Address review comments.

Aug 8 2019, 9:20 AM · Restricted Project
andreadb added a comment to D65948: [MCA] Add flag -show-encoding to llvm-mca..

Docs missing.

Aug 8 2019, 8:06 AM · Restricted Project
andreadb created D65948: [MCA] Add flag -show-encoding to llvm-mca..
Aug 8 2019, 6:28 AM · Restricted Project
andreadb committed rG987331671f02: [MCA] Remove dependency from InstrBuilder in mca::Context. NFC (authored by andreadb).
[MCA] Remove dependency from InstrBuilder in mca::Context. NFC
Aug 8 2019, 3:31 AM

Aug 7 2019

andreadb accepted D65844: [MCA] Fix MSVC 19.16 build with libc++.

A colleague ( @gbedwell ) pointed out that this is likely to be the same MSVC bug originally reported here:
https://developercommunity.visualstudio.com/content/problem/343550/webkit-an-undefined-class-is-not-allowed-as-an-arg.html.

Aug 7 2019, 4:15 AM · Restricted Project

Aug 5 2019

andreadb committed rG225655f82c3f: [MCA][doc] Add a section for the 'Bottleneck Analysis'. (authored by andreadb).
[MCA][doc] Add a section for the 'Bottleneck Analysis'.
Aug 5 2019, 6:19 AM

Aug 2 2019

andreadb added a comment to D65354: [X86] Let MachineCombiner reassociate adds for ILP.

Skimming through the RegisterPressure class and the approach MachineLICM uses, the mechanics of adding register pressure tracking to MachineCombiner don't seem too bad. I'm fairly confident we can do so without too much work, though I haven't fully worked it through, much less written the code, so that might be a gotcha. Thanks everyone for the pointers.

The question left is what would we use the register pressure *for*? As in, what heuristics would make sense to impose on the machine combiner?

Should we only do transforms which don't increase register pressure? Only increase it by an amount under the register class limit? The latter is tempting, but then we might inhibit other transforms which also want to increase register pressure. What's the right tradeoff? I think register pressure preserving is probably too restrictive, but I'm not sure where the sweet spot is.

Aug 2 2019, 9:50 AM · Restricted Project
andreadb committed rG207e3af5018e: [MCA] Add support for printing immedate values as hex. Also enable lexing of… (authored by andreadb).
[MCA] Add support for printing immedate values as hex. Also enable lexing of…
Aug 2 2019, 3:41 AM

Aug 1 2019

andreadb updated the diff for D65588: [MCA] Add support for printing immedate values as hex. Also enable lexing of masm binary and hex literals..

Addressed review comment.

Aug 1 2019, 10:38 AM · Restricted Project
andreadb added inline comments to D65588: [MCA] Add support for printing immedate values as hex. Also enable lexing of masm binary and hex literals..
Aug 1 2019, 10:19 AM · Restricted Project
andreadb created D65588: [MCA] Add support for printing immedate values as hex. Also enable lexing of masm binary and hex literals..
Aug 1 2019, 9:58 AM · Restricted Project
andreadb accepted D65564: Improve raw_ostream so that you can "write" colors using operator<<.

LGTM too.

Aug 1 2019, 9:15 AM · Restricted Project, Restricted Project

Jun 27 2019

andreadb accepted D63873: [docs][tools] Add missing "program" tags to rst files.

LGTM

Jun 27 2019, 5:53 AM · Restricted Project

Jun 26 2019

andreadb added a comment to D63628: AMD K10 (Barcelona) Initial Scheduler model.

Thanks for taking a look!

Out of curiosity, did you investigate why three benchmarks show a 6% slowdown?

Actually no, i didn't look at that as of this moment.
It doesn't appear to be noise, but those 4 tests are essentially covering
the same codepath, so the reason will be the same for all of them.

Jun 26 2019, 3:50 AM · Restricted Project

Jun 24 2019

andreadb added a comment to D63628: AMD K10 (Barcelona) Initial Scheduler model.

Rather pedantic, but shouldn't the model be called K10 or Fam10 instead of Barcelona, which was just one of the chips in the series?

I'm not sure, actually.

This seemed more consistent, because e.g. there are -mcpu=bdver[1-4],
with bd being bulldozer which was the first chip in the 15h series.
Those aren't -mcpu=k15ver[1-4], and not -mcpu={bulldozer,piledriver,steamroller,excavator},
and following the consistency logic that is already inconsistent, and the latter would be
even more confusing since bulldozer would then refer both to the 15h series
and just a single model in the series, with all other models being different.

This isn't the case here, there is no -mcpu=k10ver2;
-mcpu=fam10h and -mcpu=barcelona are synonyms in X86.td
(i don't know if later chips actually *should* be separate)

So i'm not sure, i can totally rename, but TO ME the current scheme follows a preexisting pattern.

Jun 24 2019, 6:04 AM · Restricted Project
andreadb added a comment to D63628: AMD K10 (Barcelona) Initial Scheduler model.

Out of curiosity, did you investigate why three benchmarks show a 6% slowdown?

Jun 24 2019, 5:33 AM · Restricted Project
andreadb added a comment to D63628: AMD K10 (Barcelona) Initial Scheduler model.

Hi Roman,

Jun 24 2019, 5:21 AM · Restricted Project

Jun 21 2019

andreadb committed rGdd0dc19b1c07: Set an explicit x86 triple for test bottleneck-analysis.s added by my r364045. (authored by andreadb).
Set an explicit x86 triple for test bottleneck-analysis.s added by my r364045.
Jun 21 2019, 7:05 AM
andreadb committed rGaa9b6468bdc9: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of… (authored by andreadb).
[MCA][Bottleneck Analysis] Teach how to compute a critical sequence of…
Jun 21 2019, 6:32 AM

Jun 20 2019

andreadb updated the diff for D63543: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation..

Patch updated.

Jun 20 2019, 9:54 AM · Restricted Project
andreadb added a comment to D63543: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation..

Awesome patch. I'm cool with this as long as Simon's comments are addressed.

Jun 20 2019, 4:15 AM · Restricted Project

Jun 19 2019

andreadb committed rG792510f86949: [llvm-mca][docs] clarify how the quality of the perf report is affected by the… (authored by andreadb).
[llvm-mca][docs] clarify how the quality of the perf report is affected by the…
Jun 19 2019, 9:08 AM
andreadb added a comment to D63556: [llvm-mca][docs] clarify how the quality of the perf report is affected by the quality of the scheduling models..

Thanks for the reviews @lebedev.ri and @mattd

Jun 19 2019, 8:57 AM · Restricted Project
andreadb updated subscribers of D63556: [llvm-mca][docs] clarify how the quality of the perf report is affected by the quality of the scheduling models..

Forgot to add llvm-commits to the subscribers. Apologies for the spam.

Jun 19 2019, 8:17 AM · Restricted Project
andreadb created D63556: [llvm-mca][docs] clarify how the quality of the perf report is affected by the quality of the scheduling models..
Jun 19 2019, 8:02 AM · Restricted Project
andreadb added inline comments to D63543: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation..
Jun 19 2019, 6:20 AM · Restricted Project
andreadb created D63543: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation..
Jun 19 2019, 5:00 AM · Restricted Project

Jun 18 2019

andreadb committed rG3b2f5df12c88: [MCA] Slightly refactor the bottleneck analysis view. NFCI (authored by andreadb).
[MCA] Slightly refactor the bottleneck analysis view. NFCI
Jun 18 2019, 5:58 AM

Jun 17 2019

andreadb accepted D63246: [X86][SSE] Prevent misaligned non-temporal vector load/store combines.

Looks good to me.

Jun 17 2019, 6:09 AM · Restricted Project

Jun 14 2019

andreadb committed rG6b78e4d0a43b: [MCA] Ignore invalid processor resource writes of zero cycles. NFCI (authored by andreadb).
[MCA] Ignore invalid processor resource writes of zero cycles. NFCI
Jun 14 2019, 6:29 AM

Jun 10 2019

andreadb committed rGc650a9084fcb: [llvm-mca] Enable bottleneck analysis when flag -all-views is specified. (authored by andreadb).
[llvm-mca] Enable bottleneck analysis when flag -all-views is specified.
Jun 10 2019, 9:54 AM
andreadb committed rG49d8699ecc57: [MCA] Fix -Wunused-private-field warning after r362933. NFC (authored by andreadb).
[MCA] Fix -Wunused-private-field warning after r362933. NFC
Jun 10 2019, 6:33 AM
andreadb committed rG47db08dbb19c: [MCA] Further refactor the bottleneck analysis view. NFCI. (authored by andreadb).
[MCA] Further refactor the bottleneck analysis view. NFCI.
Jun 10 2019, 5:48 AM
andreadb accepted D63040: [Docs] [llvm-mca] Point out a caveat for using llvm-mca markers in source code..

Looks very good.

Jun 10 2019, 3:29 AM · Restricted Project

Jun 8 2019

andreadb added reviewers for D63040: [Docs] [llvm-mca] Point out a caveat for using llvm-mca markers in source code.: mattd, RKSimon, spatel.

I understand your point.

Jun 8 2019, 1:29 PM · Restricted Project
andreadb added a comment to D63040: [Docs] [llvm-mca] Point out a caveat for using llvm-mca markers in source code..

Hi Max,

Jun 8 2019, 5:31 AM · Restricted Project

Jun 4 2019

andreadb accepted D61764: [LV] Suppress vectorization in some nontemporal cases.

I think this patch looks good.
The new TTI hooks look good, and the change seems conservative enough. But more importantly, it fixes the perf issue reported as PR40759.

Jun 4 2019, 9:50 AM · Restricted Project

Jun 1 2019

andreadb committed rG6a989c358cc7: [MCA][Scheduler] Change how memory instructions are dispatched to the pending… (authored by andreadb).
[MCA][Scheduler] Change how memory instructions are dispatched to the pending…
Jun 1 2019, 8:20 AM

May 31 2019

andreadb committed rG065bd45da9de: [MCA] Remove unused fields from BottleneckAnalysis. NFC (authored by andreadb).
[MCA] Remove unused fields from BottleneckAnalysis. NFC
May 31 2019, 11:01 AM