
Thu, Jun 27

andreadb accepted D63873: [docs][tools] Add missing "program" tags to rst files.

LGTM

Thu, Jun 27, 5:53 AM · Restricted Project

Wed, Jun 26

andreadb added a comment to D63628: AMD K10 (Barcelona) Initial Scheduler model.

Thanks for taking a look!

Out of curiosity, did you investigate why three benchmarks show a 6% slowdown?

Actually no, I haven't looked at that yet.
It doesn't appear to be noise, but those 4 tests essentially cover
the same codepath, so the reason will be the same for all of them.

Wed, Jun 26, 3:50 AM · Restricted Project

Mon, Jun 24

andreadb added a comment to D63628: AMD K10 (Barcelona) Initial Scheduler model.

Rather pedantic, but shouldn't the model be called K10 or Fam10 instead of Barcelona, which was just one of the chips in the series?

I'm not sure, actually.

This seemed more consistent, because e.g. there are -mcpu=bdver[1-4],
with bd being Bulldozer, which was the first chip in the 15h series.
Those aren't -mcpu=k15ver[1-4], and not -mcpu={bulldozer,piledriver,steamroller,excavator}.
The existing naming logic is already inconsistent, and the latter would be
even more confusing, since bulldozer would then refer both to the 15h series
and to a single model in the series, with all other models being different.

This isn't the case here: there is no -mcpu=k10ver2;
-mcpu=fam10h and -mcpu=barcelona are synonyms in X86.td
(I don't know if later chips actually *should* be separate).

So I'm not sure. I can totally rename, but to me the current scheme follows the preexisting pattern.

Mon, Jun 24, 6:04 AM · Restricted Project
andreadb added a comment to D63628: AMD K10 (Barcelona) Initial Scheduler model.

Out of curiosity, did you investigate why three benchmarks show a 6% slowdown?

Mon, Jun 24, 5:33 AM · Restricted Project
andreadb added a comment to D63628: AMD K10 (Barcelona) Initial Scheduler model.

Hi Roman,

Mon, Jun 24, 5:21 AM · Restricted Project

Fri, Jun 21

andreadb committed rGdd0dc19b1c07: Set an explicit x86 triple for test bottleneck-analysis.s added by my r364045. (authored by andreadb).
Set an explicit x86 triple for test bottleneck-analysis.s added by my r364045.
Fri, Jun 21, 7:05 AM
andreadb committed rGaa9b6468bdc9: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of… (authored by andreadb).
[MCA][Bottleneck Analysis] Teach how to compute a critical sequence of…
Fri, Jun 21, 6:32 AM

Thu, Jun 20

andreadb updated the diff for D63543: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation..

Patch updated.

Thu, Jun 20, 9:54 AM · Restricted Project
andreadb added a comment to D63543: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation..

Awesome patch. I'm cool with this as long as Simon's comments are addressed.

Thu, Jun 20, 4:15 AM · Restricted Project

Wed, Jun 19

andreadb committed rG792510f86949: [llvm-mca][docs] clarify how the quality of the perf report is affected by the… (authored by andreadb).
[llvm-mca][docs] clarify how the quality of the perf report is affected by the…
Wed, Jun 19, 9:08 AM
andreadb added a comment to D63556: [llvm-mca][docs] clarify how the quality of the perf report is affected by the quality of the scheduling models..

Thanks for the reviews @lebedev.ri and @mattd

Wed, Jun 19, 8:57 AM · Restricted Project
andreadb updated subscribers of D63556: [llvm-mca][docs] clarify how the quality of the perf report is affected by the quality of the scheduling models..

Forgot to add llvm-commits to the subscribers. Apologies for the spam.

Wed, Jun 19, 8:17 AM · Restricted Project
andreadb created D63556: [llvm-mca][docs] clarify how the quality of the perf report is affected by the quality of the scheduling models..
Wed, Jun 19, 8:02 AM · Restricted Project
andreadb added inline comments to D63543: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation..
Wed, Jun 19, 6:20 AM · Restricted Project
andreadb created D63543: [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation..
Wed, Jun 19, 5:00 AM · Restricted Project

Tue, Jun 18

andreadb committed rG3b2f5df12c88: [MCA] Slightly refactor the bottleneck analysis view. NFCI (authored by andreadb).
[MCA] Slightly refactor the bottleneck analysis view. NFCI
Tue, Jun 18, 5:58 AM

Jun 17 2019

andreadb accepted D63246: [X86][SSE] Prevent misaligned non-temporal vector load/store combines.

Looks good to me.

Jun 17 2019, 6:09 AM · Restricted Project

Jun 14 2019

andreadb committed rG6b78e4d0a43b: [MCA] Ignore invalid processor resource writes of zero cycles. NFCI (authored by andreadb).
[MCA] Ignore invalid processor resource writes of zero cycles. NFCI
Jun 14 2019, 6:29 AM

Jun 10 2019

andreadb committed rGc650a9084fcb: [llvm-mca] Enable bottleneck analysis when flag -all-views is specified. (authored by andreadb).
[llvm-mca] Enable bottleneck analysis when flag -all-views is specified.
Jun 10 2019, 9:54 AM
andreadb committed rG49d8699ecc57: [MCA] Fix -Wunused-private-field warning after r362933. NFC (authored by andreadb).
[MCA] Fix -Wunused-private-field warning after r362933. NFC
Jun 10 2019, 6:33 AM
andreadb committed rG47db08dbb19c: [MCA] Further refactor the bottleneck analysis view. NFCI. (authored by andreadb).
[MCA] Further refactor the bottleneck analysis view. NFCI.
Jun 10 2019, 5:48 AM
andreadb accepted D63040: [Docs] [llvm-mca] Point out a caveat for using llvm-mca markers in source code..

Looks very good.

Jun 10 2019, 3:29 AM · Restricted Project

Jun 8 2019

andreadb added reviewers for D63040: [Docs] [llvm-mca] Point out a caveat for using llvm-mca markers in source code.: mattd, RKSimon, spatel.

I understand your point.

Jun 8 2019, 1:29 PM · Restricted Project
andreadb added a comment to D63040: [Docs] [llvm-mca] Point out a caveat for using llvm-mca markers in source code..

Hi Max,

Jun 8 2019, 5:31 AM · Restricted Project

Jun 4 2019

andreadb accepted D61764: [LV] Suppress vectorization in some nontemporal cases.

I think this patch looks good.
The new TTI hooks look good, and the change seems conservative enough. But more importantly, it fixes the perf issue reported as PR40759.

Jun 4 2019, 9:50 AM · Restricted Project

Jun 1 2019

andreadb committed rG6a989c358cc7: [MCA][Scheduler] Change how memory instructions are dispatched to the pending… (authored by andreadb).
[MCA][Scheduler] Change how memory instructions are dispatched to the pending…
Jun 1 2019, 8:20 AM

May 31 2019

andreadb committed rG065bd45da9de: [MCA] Remove unused fields from BottleneckAnalysis. NFC (authored by andreadb).
[MCA] Remove unused fields from BottleneckAnalysis. NFC
May 31 2019, 11:01 AM
andreadb committed rG312f3a2bbf45: [MCA] Refactor class BottleneckAnalysis. NFCI (authored by andreadb).
[MCA] Refactor class BottleneckAnalysis. NFCI
May 31 2019, 10:19 AM

May 29 2019

andreadb committed rG280ac1fd1dc3: [MCA] Refactor class LSUnit. NFCI (authored by andreadb).
[MCA] Refactor class LSUnit. NFCI
May 29 2019, 4:37 AM

May 26 2019

andreadb committed rGc2493ce4a40b: [MCA][Scheduler] Improved critical memory dependency computation. (authored by andreadb).
[MCA][Scheduler] Improved critical memory dependency computation.
May 26 2019, 12:49 PM
andreadb committed rGa549dd25607d: [MCA] Refactor the logic that computes the critical memory dependency info. NFCI (authored by andreadb).
[MCA] Refactor the logic that computes the critical memory dependency info. NFCI
May 26 2019, 11:40 AM

May 24 2019

andreadb committed rG21977d8e29f8: [MCA] Zero-initialize field CRD in InstructionBase. Also run clang-format on a… (authored by andreadb).
[MCA] Zero-initialize field CRD in InstructionBase. Also run clang-format on a…
May 24 2019, 6:54 AM
andreadb accepted D62360: [X86] Add zero idioms to the haswell, broadwell, and skylake schedule models. Add 256-bit fp xor to sandybridge zero idioms.

LGTM (modulo the changes requested by Simon).

May 24 2019, 4:06 AM · Restricted Project

May 23 2019

andreadb committed rG27b3b5d952c5: [MCA] Add the ability to compute critical register dependency of an instruction. (authored by andreadb).
[MCA] Add the ability to compute critical register dependency of an instruction.
May 23 2019, 9:31 AM
andreadb committed rGdd0d9e01eeaa: [MCA] Introduce class LSUnitBase and let LSUnit derive from it. (authored by andreadb).
[MCA] Introduce class LSUnitBase and let LSUnit derive from it.
May 23 2019, 6:41 AM
andreadb committed rG28afd8dc7112: [MCA] Make the bool conversion operator in class InstRef explicit. NFCI (authored by andreadb).
[MCA] Make the bool conversion operator in class InstRef explicit. NFCI
May 23 2019, 3:48 AM

May 9 2019

andreadb committed rG4e62554bfae2: [MCA] Add support for nested and overlapping region markers (authored by andreadb).
[MCA] Add support for nested and overlapping region markers
May 9 2019, 8:19 AM
andreadb updated the diff for D61676: [MCA] Add support for nested and overlapping region markers.

Patch updated.

May 9 2019, 6:31 AM · Restricted Project
andreadb added a comment to D61676: [MCA] Add support for nested and overlapping region markers.

This looks nice! We should probably also have a test for the case where a user specifies an END before a BEGIN tag.
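
To illustrate the case mentioned above, here is a minimal sketch of a checker that flags an END marker appearing before its BEGIN. It is not llvm-mca's implementation; the function name `check_regions` and the regular expression are assumptions made for illustration, and only the `LLVM-MCA-BEGIN` / `LLVM-MCA-END` comment markers come from llvm-mca itself.

```python
import re

# Hypothetical pattern for illustration: matches the comment markers and an
# optional region name following them.
MARKER = re.compile(r"#\s*LLVM-MCA-(BEGIN|END)\b\s*(\S*)")

def check_regions(asm_text):
    """Return (line_number, message) pairs for END markers with no open BEGIN."""
    open_regions = set()  # names of regions that are currently open
    problems = []
    for lineno, line in enumerate(asm_text.splitlines(), start=1):
        match = MARKER.search(line)
        if not match:
            continue
        kind, name = match.groups()
        if kind == "BEGIN":
            open_regions.add(name)
        elif name in open_regions:
            # Regions may nest or overlap, so any open name can be closed.
            open_regions.discard(name)
        else:
            problems.append((lineno, f"END for '{name}' without a matching BEGIN"))
    return problems
```

Because open regions are tracked by name rather than on a stack, overlapping regions (BEGIN a, BEGIN b, END a, END b) pass, while an END that precedes its BEGIN is reported.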

May 9 2019, 6:14 AM · Restricted Project

May 8 2019

andreadb created D61676: [MCA] Add support for nested and overlapping region markers.
May 8 2019, 5:57 AM · Restricted Project
andreadb committed rGd52a542e4cb6: [MCA] Don't add a name to the default code region. (authored by andreadb).
[MCA] Don't add a name to the default code region.
May 8 2019, 4:03 AM
andreadb committed rG86654dd8a046: [MCA] Slightly refactor CodeRegion.h. NFCI (authored by andreadb).
[MCA] Slightly refactor CodeRegion.h. NFCI
May 8 2019, 3:44 AM
andreadb committed rG69b8b17945f0: [MCA] Remove dead assignment. NFC (authored by andreadb).
[MCA] Remove dead assignment. NFC
May 8 2019, 3:28 AM

May 5 2019

andreadb committed rG0460a3629b25: [MCA] Notify event listeners when instructions transition to the Pending state. (authored by andreadb).
[MCA] Notify event listeners when instructions transition to the Pending state.
May 5 2019, 9:10 AM

May 3 2019

andreadb added a comment to D61472: [X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. Support LEA64_32r properly..

Overall, the patch looks good to me.

May 3 2019, 7:17 AM · Restricted Project, Restricted Project

May 2 2019

andreadb accepted D51160: Adjust MIScheduler to use ProcResource counts.
May 2 2019, 4:59 AM · Restricted Project

May 1 2019

andreadb added a reviewer for D61355: [ThinLTO] Fix unreachable code when parsing summary entries.: wristow.
May 1 2019, 3:37 AM · Restricted Project
andreadb added reviewers for D61355: [ThinLTO] Fix unreachable code when parsing summary entries.: bd1976llvm, kromanova, kbelochapka.
May 1 2019, 3:36 AM · Restricted Project

Apr 30 2019

andreadb accepted D60993: [X86] Initial cleanups on the FixupLEAs pass. Separate Atom LEA creation from other LEA optimizations..

LGTM

Apr 30 2019, 4:28 AM · Restricted Project

Apr 28 2019

andreadb committed rG43003f0fec7f: [MCA] Fix typo in AVX2 gather tests. NFC (authored by andreadb).
[MCA] Fix typo in AVX2 gather tests. NFC
Apr 28 2019, 3:55 AM

Apr 27 2019

andreadb committed rGd77dc9ada206: [MCA] Add field `IsEliminated` to class Instruction. NFCI (authored by andreadb).
[MCA] Add field `IsEliminated` to class Instruction. NFCI
Apr 27 2019, 4:58 AM

Apr 16 2019

andreadb committed rG57cef5867295: [MCA] Moved the bottleneck analysis to its own file. NFCI (authored by andreadb).
[MCA] Moved the bottleneck analysis to its own file. NFCI
Apr 16 2019, 11:02 PM

Apr 11 2019

andreadb committed rG2050dff996a2: [MCA] Remove wrong comments from a test. NFC (authored by andreadb).
[MCA] Remove wrong comments from a test. NFC
Apr 11 2019, 3:14 AM

Apr 10 2019

andreadb accepted D60441: [X86] Make _Int instructions the preferred instructon for the assembly parser and disassembly parser to remove inconsistencies between VEX and EVEX..

Architecturally that read really does exist. It's not a false dependency. That read defines the upper bits of the result. The fact that AVX and SSE were different before this patch seems like a bug. It looks like with your proposed change they would still be different. That doesn't seem right.

Right. Sorry. The upper bits of the result are unmodified for the SSE variants. So yes, it was a bug before, and that read does exist in practice.
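
As a rough illustration of why that read is a true dependency, the sketch below models the SSE form of cvtsi2ss on a register represented as four float lanes. The function name and the list-of-lanes representation are assumptions made for this example only.

```python
def cvtsi2ss_sse(dst_lanes, int_value):
    """Model of the SSE (non-VEX) cvtsi2ss: only the low lane is written.

    dst_lanes is the previous content of the destination register as four
    float lanes (index 0 is the low lane). The upper three lanes of the
    result are copied from it, so the old destination value is an input.
    """
    return [float(int_value)] + list(dst_lanes[1:])

# The result depends on the old destination, not only on the integer source.
old_xmm0 = [1.0, 2.0, 3.0, 4.0]
new_xmm0 = cvtsi2ss_sse(old_xmm0, 7)
```

Since the upper lanes come from the previous register value, the instruction cannot complete until that value is available.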

Apr 10 2019, 4:59 AM · Restricted Project

Apr 9 2019

andreadb added a comment to D60441: [X86] Make _Int instructions the preferred instructon for the assembly parser and disassembly parser to remove inconsistencies between VEX and EVEX..

Architecturally that read really does exist. It's not a false dependency. That read defines the upper bits of the result. The fact that AVX and SSE were different before this patch seems like a bug. It looks like with your proposed change they would still be different. That doesn't seem right.

Apr 9 2019, 11:11 AM · Restricted Project
andreadb added a comment to D60441: [X86] Make _Int instructions the preferred instructon for the assembly parser and disassembly parser to remove inconsistencies between VEX and EVEX..

The reason why there is a regression is because this patch adds an extra input operand to the following instructions:

cvtsi2ssl  %ecx, %xmm0
cvtsi2sdl  %ecx, %xmm0
Apr 9 2019, 9:48 AM · Restricted Project
andreadb added a comment to D60441: [X86] Make _Int instructions the preferred instructon for the assembly parser and disassembly parser to remove inconsistencies between VEX and EVEX..

Many of our instructions have both a _Int form used by intrinsics and a form used by other IR constructs.

Is there any documentation, a comment, a mail thread somewhere that explains why this is the way it is?
I.e. why do those _Int variants need to exist? (Are they temporary, or here to stay forever?)

The mca(?) regression is troubling.

Apr 9 2019, 8:56 AM · Restricted Project

Apr 8 2019

andreadb committed rGf6a60f1f8031: [llvm-mca][scheduler-stats] Print issued micro opcodes per cycle. NFCI (authored by andreadb).
[llvm-mca][scheduler-stats] Print issued micro opcodes per cycle. NFCI
Apr 8 2019, 9:07 AM

Apr 5 2019

andreadb accepted D60286: [x86] make 8-bit shl undesirable.

Looks good to me.

Apr 5 2019, 3:24 AM · Restricted Project

Apr 4 2019

andreadb accepted D60138: [X86] Merge the different SETcc instructions for each condition code into single instructions that store the condition code as an operand..

LGTM

Apr 4 2019, 2:54 AM · Restricted Project

Apr 3 2019

andreadb accepted D60185: [X86] Make the post machine scheduler macrofusion-aware..

Sounds reasonable to me.

Apr 3 2019, 2:32 AM · Restricted Project

Mar 29 2019

andreadb added a comment to D59997: [x86] allow movmsk with 2-element reductions.

llvm-mca numbers are quite accurate for btver2 (see below for the perf results):

Mar 29 2019, 10:47 AM · Restricted Project
andreadb committed rGe074ac60b452: [MCA] Add an experimental MicroOpQueue stage. (authored by andreadb).
[MCA] Add an experimental MicroOpQueue stage.
Mar 29 2019, 5:16 AM
andreadb added inline comments to D59928: [MCA] Add an experimental MicroOpQueue stage..
Mar 29 2019, 4:20 AM · Restricted Project

Mar 28 2019

andreadb updated the diff for D59928: [MCA] Add an experimental MicroOpQueue stage..

Address review comment.

Mar 28 2019, 10:41 AM · Restricted Project
andreadb accepted D59689: [ScheduleDAG] Move `Topo` and `addEdge` to base class..
Mar 28 2019, 9:57 AM · Restricted Project
andreadb added a comment to D59689: [ScheduleDAG] Move `Topo` and `addEdge` to base class..

LGTM

Mar 28 2019, 9:57 AM · Restricted Project
andreadb accepted D59688: [X86] Make post-ra scheduling macrofusion-aware..

Thanks Clement.

Mar 28 2019, 8:12 AM · Restricted Project
andreadb updated the diff for D59928: [MCA] Add an experimental MicroOpQueue stage..

Patch updated.

Mar 28 2019, 8:05 AM · Restricted Project
andreadb added inline comments to D59928: [MCA] Add an experimental MicroOpQueue stage..
Mar 28 2019, 7:52 AM · Restricted Project
andreadb created D59928: [MCA] Add an experimental MicroOpQueue stage..
Mar 28 2019, 6:04 AM · Restricted Project
andreadb accepted D59872: [X86MacroFusion] Handle branch fusion (AMD CPUs)..
Mar 28 2019, 5:13 AM · Restricted Project
andreadb added a comment to D59872: [X86MacroFusion] Handle branch fusion (AMD CPUs)..

Macro-op fusion on Intel processors is essentially a form of branch fusion.
The only difference in practice is that - starting from Sandybridge - a few extra opcodes (mostly arithmetic) other than CMP/TEST can now be fused with branches.
But conceptually, what AMD calls branch fusion is (a form of) macro-op fusion.
Using FeatureMacroFusion and FeatureBranchFusion to refer to the two different forms of macro-op fusion may be a bit misleading... But then, I am not good with names, so I am not sure I am able to suggest better names for those :-( .

Mar 28 2019, 4:26 AM · Restricted Project

Mar 27 2019

andreadb committed rGa194656fa245: [MCA] Fix -Wparentheses warning breaking the -Werror build. (authored by andreadb).
[MCA] Fix -Wparentheses warning breaking the -Werror build.
Mar 27 2019, 9:24 AM
andreadb committed rG333a3264f472: [MCA][Pipeline] Don't visit stages in reverse order when calling method… (authored by andreadb).
[MCA][Pipeline] Don't visit stages in reverse order when calling method…
Mar 27 2019, 8:43 AM

Mar 26 2019

andreadb committed rGddce32e2f3a0: [MCA] Correctly update the UsedResourceGroups mask in the InstrBuilder. (authored by andreadb).
[MCA] Correctly update the UsedResourceGroups mask in the InstrBuilder.
Mar 26 2019, 8:41 AM
andreadb added inline comments to D51160: Adjust MIScheduler to use ProcResource counts.
Mar 26 2019, 5:00 AM · Restricted Project

Mar 25 2019

andreadb added a comment to D59688: [X86] Make post-ra scheduling macrofusion-aware..

I plan to run some experiments today using your patch.

That's great, thanks.

Sorry, I was overoptimistic about my other workload. I don't think I'll get a chance to get any perf numbers anytime soon.

That being said, I tried your patch on a few small examples on some different targets, and results seem good.
For example, before your patch I saw cases where the test/cmp was not emitted before the conditional branch. Your patch seems to fix that "issue" in most cases.

My only concern is that the macro-fusion mutator might be a bit too aggressive for AMD processors.
X86MacroFusion assumes that branch fusion can happen with ADD/SUB/INC/DEC too. That is okay for Intel processors, but not necessarily for AMD processors where branch fusion (as far as I remember) is limited to CMP/TEST opcodes only.
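
The concern can be sketched as a per-vendor predicate. This is only an illustration of the comment above, not the logic in X86MacroFusion: the table, the vendor keys and the function name are hypothetical, and the opcode sets simply restate what the comment says (recalled from memory there, so treat them as unverified).

```python
# Opcode families as described in the comment; real hardware rules depend on
# the exact microarchitecture, operand forms and branch condition.
FUSIBLE_WITH_BRANCH = {
    "intel": {"CMP", "TEST", "ADD", "SUB", "INC", "DEC"},
    "amd": {"CMP", "TEST"},
}

def can_fuse_with_branch(vendor, opcode):
    """Return True if `opcode` may fuse with a following conditional branch."""
    return opcode.upper() in FUSIBLE_WITH_BRANCH[vendor]
```

A mutator that treats every target like the "intel" row would keep ADD/SUB/INC/DEC adjacent to the branch on AMD targets too, which is the over-aggressive behaviour described here.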

That is consistent with what is stated in Agner's microarchitecture manual and the AMD SOG for Piledriver.

Since your patch enables that mutator for targets with FeatureMacroFusion, it would be nice to get some feedback from somebody with access to an AMD target where macro fusion is enabled (Bobcat/Jaguar doesn't do branch fusion). Perhaps @lebedev.ri can run some quick tests on BdVer2?

It will, as usual, depend on whether this happens to affect the hotpath or not.
I did just run my rawspeed benchmark, and i'm not observing any notable non-noise perf changes.

Mar 25 2019, 2:48 PM · Restricted Project
Herald added a project to D46662: [X86] condition branches folding for three-way conditional codes: Restricted Project.
In D46662#1246781, @xur wrote:

Hi Andrea,

Thanks for running this test, and the explanation. Can you run the tests
on Bulldozer/Ryzen? I don't have access to these platforms. If I need to do
this in subtarget way, it would be good to know the performance there.

CC'ing @lebedev.ri and @GGanesh.
They should be able to help you with running those tests on Bulldozer/Ryzen. Unfortunately, I don't have access to those machines.

I *think* this should be fine on bdver2, as per https://www.agner.org/optimize/microarchitecture.pdf:

19.15 Branches and loops
The branch prediction mechanism is described on page 34. There is no longer any
restriction on the number of branches per 16 bytes of code that can be predicted efficiently.
The misprediction penalty is quite high because of a long pipeline.

...
Bench: 4evencases.cc
...
Bench: 15evencases.cc
...
I wouldn't be surprised if instead this patch improves the performance of code on other big AMD cores like Bulldozer/ryzen.

Are these benchmarks available from somewhere? Can i run them

Mar 25 2019, 2:43 PM · Restricted Project
andreadb updated subscribers of D59688: [X86] Make post-ra scheduling macrofusion-aware..

I plan to run some experiments today using your patch.

That's great, thanks.

Mar 25 2019, 7:35 AM · Restricted Project

Mar 22 2019

andreadb added a comment to D59688: [X86] Make post-ra scheduling macrofusion-aware..

Nice patch Clement!

I always wondered why on x86 we only enabled that mutator in the pre-ra scheduler.
In the past, I remember I did some quick experiments with enabling that mutator in the post-RA scheduler. I must admit that I wasn't particularly lucky with the experiments (i.e. I couldn't find significant/promising improvements). But then - again - those were just quick experiments, and I didn't try it on many codebases. If you think you can share some numbers then that would be great.

Thanks Andrea.

Yes, that's essentially what the comment in X86.td says:

"This generally gives a nice performance increase on silvermont, with largely neutral behavior on other contemporary large core processors."

However, that was before the round of scheduling information fixes that Simon & I made based on llvm-exegesis. I wanted to give it another try after that, and from my first experiments it seems that it indeed makes sense to look at it again.
What I have done for now is run our (internal, sorry) main macrobenchmark with post-ra enabled. With the base code I see a consistent regression of 0.5% to 1% depending on metrics. With this patch I see a consistent improvement of 0.5% to 2%.

Mar 22 2019, 6:25 AM · Restricted Project
andreadb added a comment to D59688: [X86] Make post-ra scheduling macrofusion-aware..

Nice patch Clement!

Mar 22 2019, 4:58 AM · Restricted Project

Mar 20 2019

andreadb committed rG624f5deff429: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI (authored by andreadb).
[X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI
Mar 20 2019, 4:21 AM

Mar 19 2019

andreadb updated the diff for D59547: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI.

Patch updated.

Mar 19 2019, 2:40 PM · Restricted Project
andreadb added inline comments to D59547: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI.
Mar 19 2019, 2:28 PM · Restricted Project
andreadb added inline comments to D59547: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI.
Mar 19 2019, 1:26 PM · Restricted Project
andreadb added inline comments to D59547: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI.
Mar 19 2019, 12:52 PM · Restricted Project
andreadb added inline comments to D59547: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI.
Mar 19 2019, 12:29 PM · Restricted Project
andreadb updated the diff for D59547: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI.

Addressed review comments.

Mar 19 2019, 12:13 PM · Restricted Project
andreadb added inline comments to D59547: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI.
Mar 19 2019, 11:05 AM · Restricted Project
andreadb added a comment to D59391: [X86] Add post-isel pseudos for rotate by immediate using SHLD/SHRD.

I've been toying with the idea of using a pseudo for all SHLD/SHRD cases so we can make it easier to select between that and the expanded shift pattern depending on scheduler-model/register-pressure etc. instead of trying to make the decision in DAG with the feature bits. I don't know if this could be a first step towards this? @andreadb Any thoughts?

Mar 19 2019, 10:41 AM · Restricted Project
andreadb created D59547: [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI.
Mar 19 2019, 8:30 AM · Restricted Project

Mar 15 2019

andreadb added a reviewer for D59412: [X86] X86ISelLowering::combineSextInRegCmov(): also handle i8 CMOV's: RKSimon.
Mar 15 2019, 7:29 AM · Restricted Project
andreadb added a comment to D59035: [X86] Promote i8 CMOV's (PR40965).

why can't we simply widen the hands of select, like

define i16 @new(i1 %c) {
  %ret = select i1 %c, i16 117, i16 -19
  ret i16 %ret
}

https://rise4fun.com/Alive/cs8
?
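
The equivalence behind that question can be checked with a small sketch. It assumes the original pattern is an i8 select that is then sign-extended to i16 (the feed does not show the original function, so that is an assumption), and the helper names are made up for this example.

```python
def sext(value, from_bits, to_bits):
    """Sign-extend a from_bits-wide value to to_bits; result is unsigned."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):  # sign bit set: value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)

def select(cond, true_value, false_value):
    """Mirror of the IR select instruction."""
    return true_value if cond else false_value

# Narrow select followed by sext, versus a select on pre-widened constants.
for c in (False, True):
    narrow_then_extend = sext(select(c, 117 & 0xFF, -19 & 0xFF), 8, 16)
    widened_hands = select(c, 117 & 0xFFFF, -19 & 0xFFFF)
    assert narrow_then_extend == widened_hands
```

Under that assumption both forms produce the same 16-bit value for either condition, which is what widening the constant operands relies on.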

Mar 15 2019, 7:20 AM · Restricted Project, Restricted Project
andreadb accepted D59035: [X86] Promote i8 CMOV's (PR40965).

Looks good to me.

Mar 15 2019, 5:22 AM · Restricted Project, Restricted Project

Mar 7 2019

andreadb accepted D59098: [llvm-mca] Emit a message when no bottlenecks are identified..

LGTM if you add flags -all-views=false -summary-view to that new test, and then you regenerate it with the update_mca script.

Mar 7 2019, 11:01 AM · Restricted Project
andreadb accepted D59058: [X86] Model ADC/SBB with immediate 0 more accurately in the Haswell scheduler model.

LGTM

Mar 7 2019, 3:12 AM · Restricted Project
andreadb accepted D59077: [X86] Correct scheduler information for rotate by constant for Haswell, Broadwell, and Skylake..

LGTM

Mar 7 2019, 3:09 AM · Restricted Project

Mar 5 2019

andreadb accepted D58939: [Subtarget] Merge ProcSched and ProcDesc arrays in MCSubtargetInfo into a single array..

LGTM. Very nice refactoring.

Mar 5 2019, 4:34 AM · Restricted Project
andreadb accepted D58938: [Subtarget] Create a separate SubtargetSubtargetKV struct for ProcDesc to remove fields from the stack tables that aren't needed for CPUs.

Looks good to me.

Mar 5 2019, 4:10 AM · Restricted Project