This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Emit a fatal error on inconsistencies in resource units vs cycles.
ClosedPublic

Authored by courbet on May 2 2018, 4:40 AM.

Details

Summary

See PR37310.

For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files.
For more obvious cases, I've ventured a fix.

Some notes:

  • The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit.
  • Exynos is especially fishy.
  • AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice.
  • MIPS has the same issue for DIVs.

Diff Detail

Repository
rL LLVM

Event Timeline

courbet created this revision.May 2 2018, 4:40 AM
courbet edited the summary of this revision. (Show Details)May 2 2018, 4:41 AM
courbet updated this revision to Diff 144859.May 2 2018, 5:12 AM
courbet edited the summary of this revision. (Show Details)

Fix some MIPS scheduling info revealed by the checks.

courbet edited the summary of this revision. (Show Details)May 2 2018, 5:12 AM
courbet added a reviewer: az.May 2 2018, 5:17 AM

Not sure if I want to have to specify resource cycles every time - we should be able to assume a default=1, same for latency and micro-ops. This helps reduce bulkiness of the models considerably. In fact I might go further and maybe suggest a debug warning that values set to the default value could be removed?

But I'd definitely happy with the idea that if ResourceCycles are set it must be equal in size to ResourceUnits, (and similar ambiguities).

andreadb added a subscriber: andreadb.

Hi Clement,

I think that your new check is a bit too strict.
It is okay to have less ResourceCycle elements than resources in ProcResources. It simply means that some resources would implicitly be consumed for 1cy.

Example:

// Reserve A9UnitFP for 2 consecutive cycles.
def A9Write2V4 : SchedWriteRes<[A9UnitFP, A9UnitAGU]> {
  let Latency = 4;
  let ResourceCycles = [2];

It is okay not to have an explicit [2,1] here. The A9UnitAGU should get an implicit 1cy.

Same here:

defm : ZnWriteResPair<WriteIMul, [ZnALU1, ZnMultiplier], 4>;

We shouldn't have to explicitly pass a ResourceCycles to [1,1]. IMHO That should be the implied default.

On the other hand, I totally agree with you that the following case is a bug:

def BWWriteResGroup138 : SchedWriteRes<[BWPort0,BWPort5,BWPort23]> {
  let Latency = 13;
  let NumMicroOps = 4;
  let ResourceCycles = [1,2,1,7];

Here ResourceCycles has more elements than the number of resources specified by the write (i.e. 4 entries versus 3 resources).

The next example is "well-formed", although semantically incorrect. We should have that 70 is associated to SLMFPDivider. However, strictly speaking, it is not wrong to have only a list of two cycles (instead of three).

def SLMriteResGroup13 : SchedWriteRes<[SLM_MEC_RSV,SLM_FPC_RSV0,SLMFPDivider]> {
  let Latency = 74;
  let NumMicroOps = 1;
  let ResourceCycles = [1,70];
}

I am not against introducing/enforcing more constraints. Personally, I would only error if Cycles.size() > PRVec.size(), and not if Cycles.size() != PRVec.size().
This is just my opinion though.
I do recognize though that this would help catching bugs similar to the div from the last example.

courbet updated this revision to Diff 144863.May 2 2018, 5:42 AM
  • revert changes to btver2, use a saner default.

Not sure if I want to have to specify resource cycles every time - we should be able to assume a default=1, same for latency and micro-ops. This helps reduce bulkiness of the models considerably. In fact I might go further and maybe suggest a debug warning that values set to the default value could be removed?

OK, I've changed list<int> Res = [1] to list<int> Res = [] in the definition of JWriteResFpuPair, and added a condition so that the folded version avoids the concat when relying in the default.

Hi Andrea,

I think that your new check is a bit too strict.
It is okay to have less ResourceCycle elements than resources in ProcResources. It simply means that some resources would implicitly be consumed for 1cy.

Yes, I understand that this is the current behaviour. But I think this change shows that relying on this implicit behaviour is way too dangerous as shown by the bugs it has uncovered in MIPS, Exynos (I think), ThunderX, Broadwell, and Silvermont.

Example:

// Reserve A9UnitFP for 2 consecutive cycles.
def A9Write2V4 : SchedWriteRes<[A9UnitFP, A9UnitAGU]> {
  let Latency = 4;
  let ResourceCycles = [2];

It is okay not to have an explicit [2,1] here. The A9UnitAGU should get an implicit 1cy.

It's semantically similar, but not obvious: someone unfamiliar with the schema could think that this is 1 cycle on A9UnitFP and one on A9UnitAGU. I think it's clearer to spell it out explicitly.

Same here:

defm : ZnWriteResPair<WriteIMul, [ZnALU1, ZnMultiplier], 4>;

We shouldn't have to explicitly pass a ResourceCycles to [1,1]. IMHO That should be the implied default.

That's Simon's point in him comment. Agreed, I'm going to apply the same approach.

The next example is "well-formed", although semantically incorrect.

That's the point of the change: it's currently too easy to accidentally write well-formed but incorrect specs.

courbet updated this revision to Diff 144865.May 2 2018, 5:58 AM
  • Allow fully implicit ResourceCycles in ZnXXXPair model. as per Andrea's comment.
  • Update TargetSchedule.td documentation.
courbet updated this revision to Diff 144869.May 2 2018, 6:09 AM
  • Update llvm-mca SLM tests.

Hi Andrea,

I think that your new check is a bit too strict.
It is okay to have less ResourceCycle elements than resources in ProcResources. It simply means that some resources would implicitly be consumed for 1cy.

Yes, I understand that this is the current behaviour. But I think this change shows that relying on this implicit behaviour is way too dangerous as shown by the bugs it has uncovered in MIPS, Exynos (I think), ThunderX, Broadwell, and Silvermont.

Example:

// Reserve A9UnitFP for 2 consecutive cycles.
def A9Write2V4 : SchedWriteRes<[A9UnitFP, A9UnitAGU]> {
  let Latency = 4;
  let ResourceCycles = [2];

It is okay not to have an explicit [2,1] here. The A9UnitAGU should get an implicit 1cy.

It's semantically similar, but not obvious: someone unfamiliar with the schema could think that this is 1 cycle on A9UnitFP and one on A9UnitAGU. I think it's clearer to spell it out explicitly.

My opinion is that the description in TargetSchedule.td is quite clear.

// Optionally, ResourceCycles indicates the number of cycles the
// resource is consumed. Each ResourceCycles item is paired with the
// ProcResource item at the same position in its list. Since
// ResourceCycles are rarely specialized, the list may be
// incomplete. By default, resources are consumed for a single cycle [...]

The next example is "well-formed", although semantically incorrect.

That's the point of the change: it's currently too easy to accidentally write well-formed but incorrect specs.

I agree with your last point. It would definitely force people to be more careful when they write those cycles.

If Simon/other people are happy with it, then I am happy too.

About your patch:
you should probably update the comment in TargetSchedule.td (lines [283:292]). Your patch slightly changes the set of constraints on the resource cycles; I think that one of the statements in that paragraph may not say the truth anymore.

utils/TableGen/SubtargetEmitter.cpp
1123 ↗(On Diff #144863)

Please remove this TODO (or remove your name from the TODO comment).

Hi Andrea,

I think that your new check is a bit too strict.
It is okay to have less ResourceCycle elements than resources in ProcResources. It simply means that some resources would implicitly be consumed for 1cy.

Yes, I understand that this is the current behaviour. But I think this change shows that relying on this implicit behaviour is way too dangerous as shown by the bugs it has uncovered in MIPS, Exynos (I think), ThunderX, Broadwell, and Silvermont.

Example:

// Reserve A9UnitFP for 2 consecutive cycles.
def A9Write2V4 : SchedWriteRes<[A9UnitFP, A9UnitAGU]> {
  let Latency = 4;
  let ResourceCycles = [2];

It is okay not to have an explicit [2,1] here. The A9UnitAGU should get an implicit 1cy.

It's semantically similar, but not obvious: someone unfamiliar with the schema could think that this is 1 cycle on A9UnitFP and one on A9UnitAGU. I think it's clearer to spell it out explicitly.

My opinion is that the description in TargetSchedule.td is quite clear.

// Optionally, ResourceCycles indicates the number of cycles the
// resource is consumed. Each ResourceCycles item is paired with the
// ProcResource item at the same position in its list. Since
// ResourceCycles are rarely specialized, the list may be
// incomplete. By default, resources are consumed for a single cycle [...]

It is, but it clearly does not prevent bugs in practice :)

you should probably update the comment in TargetSchedule.td (lines [283:292]). Your patch slightly changes the set of constraints on the resource cycles; I think that one of the statements in that paragraph may not say the truth anymore.

Yep, this was done in update 3 of this diff

courbet added inline comments.
lib/Target/Mips/MipsScheduleGeneric.td
77 ↗(On Diff #144869)

@sdardis Can you have a look at this one ? Thanks.

sdardis added inline comments.May 2 2018, 6:31 AM
lib/Target/Mips/MipsScheduleGeneric.td
77 ↗(On Diff #144869)

Bug. The divide unit in the generic model is supposed to be separate from the ALU. It's a hold-over from when I developed the generic model, as I cribbed it from the P5600 model with has two units working on the divide according to the model.

This change is correct.

82 ↗(On Diff #144869)

As is this.

Can you pull out the SLM / BW fixes into their own commits please?

lib/Target/X86/X86SchedBroadwell.td
1657 ↗(On Diff #144869)

This fix can be committed now in its own commit.

lib/Target/X86/X86ScheduleSLM.td
393 ↗(On Diff #144869)

This fix (and the llvm-mca test change) can be committed in its own commit.

courbet marked 5 inline comments as done.May 2 2018, 6:35 AM
courbet added inline comments.
lib/Target/Mips/MipsScheduleGeneric.td
77 ↗(On Diff #144869)

Thanks !

lib/Target/X86/X86SchedBroadwell.td
1657 ↗(On Diff #144869)

Will do.

courbet updated this revision to Diff 144876.May 2 2018, 6:42 AM
courbet marked 2 inline comments as done.

Submit MIPS change, rebase.

courbet updated this revision to Diff 144877.May 2 2018, 6:50 AM

Submit SLM+BDW fixes and rebase.

Thanks for this check.
If ResourceCycles is not specified then it should default to 1 . I believe most people write with that interpretation.
If ResourceCycles is specified for at least one resource, it makes sense from correctness POV that it be specified for all (even though it might get verbose at time).

andreadb accepted this revision.May 2 2018, 7:43 AM

Thanks Clement,

It looks good to me now. I only have one minor nit (see below).

utils/TableGen/SubtargetEmitter.cpp
1123 ↗(On Diff #144863)

This comment should be modified or removed.

This revision is now accepted and ready to land.May 2 2018, 7:43 AM
courbet updated this revision to Diff 144883.May 2 2018, 7:43 AM

Get rid of the FIXME to remove the default ResourceCyles.

RKSimon requested changes to this revision.May 2 2018, 7:47 AM

I think it'd be clearer if we could keep Btver2/Znver1 to the Res = [1] default tbh

This revision now requires changes to proceed.May 2 2018, 7:47 AM
courbet marked an inline comment as done.May 2 2018, 7:48 AM

Thanks.

@andreadb, @RKSimon, @javed.absar: I'll submit the X86 and A9 changes as they are safe.

I'll wait for confirmation for Exynos and Thunder for the rest.

I think it'd be clearer if we could keep Btver2/Znver1 to the Res = [1] default tbh

Hm, not sure how I can do that without triggering the check for cases when the user erroneously specified [1] when they wanted to use the default (now []). Did I miss a possibility ?
This makes is no difference for the user, but allows more checking. Best of both worlds ? :)

I think it'd be clearer if we could keep Btver2/Znver1 to the Res = [1] default tbh

Hm, not sure how I can do that without triggering the check for cases when the user erroneously specified [1] when they wanted to use the default (now []). Did I miss a possibility ?
This makes is no difference for the user, but allows more checking. Best of both worlds ? :)

I am bit confused on - "user erroneously specified [1] when they wanted to use the default ". But the default was always [1]

I think it'd be clearer if we could keep Btver2/Znver1 to the Res = [1] default tbh

Hm, not sure how I can do that without triggering the check for cases when the user erroneously specified [1] when they wanted to use the default (now []). Did I miss a possibility ?
This makes is no difference for the user, but allows more checking. Best of both worlds ? :)

Which cases fail?

I think it'd be clearer if we could keep Btver2/Znver1 to the Res = [1] default tbh

Hm, not sure how I can do that without triggering the check for cases when the user erroneously specified [1] when they wanted to use the default (now []). Did I miss a possibility ?
This makes is no difference for the user, but allows more checking. Best of both worlds ? :)

I am bit confused on - "user erroneously specified [1] when they wanted to use the default ". But the default was always [1]

Now that we're checking what the user wrote, we want the following cases to be valid:

def : WriteRes<SchedRW, [P0, P1]> {
    let ResourceCycles = [1, 1];  // explicit
}

def : WriteRes<SchedRW, [P0, P1]> {
    let ResourceCycles = [];  // explicitly use default, same as let ResourceCycles = [1,1]
}

def : WriteRes<SchedRW, [P0, P1]> {
    // implicitly use default, same as let ResourceCycles = [1,1]
}

But this one to be invalid:

def : WriteRes<SchedRW, [P0, P1]> {
    let ResourceCycles = [1];  // Did the user expect [1,1]; or did they make a mistake ? Throw an error.
}

In this specific case we want:

defm : ZnWriteResPair<WriteIMul,   [ZnALU1, ZnMultiplier], 4>;            // (A) ok, [1,1]
defm : ZnWriteResPair<WriteIMul,   [ZnALU1, ZnMultiplier], 4, [1,1]>;  // (B) ok, [1,1]
defm : ZnWriteResPair<WriteIMul,   [ZnALU1, ZnMultiplier], 4, [1]>;     // (C) Error, not sure if this is a mistake or not.

If I keep the default Res = [1] in the definition of ZnWriteResPair, I cannot make a difference between (A) and (C). Using the the default Res = [] allows making the difference and is consistent with what WriteRes uses as a default.

Am I making more sense ?

Ah! OK, I see. Thanks for the clarification.

Ah! OK, I see. Thanks for the clarification.

Yup - me too! LGTM if @javed.absar is happy as well.

evandro added inline comments.May 2 2018, 2:20 PM
lib/Target/AArch64/AArch64SchedExynosM3.td
307 ↗(On Diff #144883)

Please, make this [8, 8]...

311 ↗(On Diff #144883)

[13, 13]...

315 ↗(On Diff #144883)

[19, 19]...

319 ↗(On Diff #144883)

[26, 26]...

391 ↗(On Diff #144883)

[6, 6, 6]...

411 ↗(On Diff #144883)

[6, 1, 1, 6, 1]...

419 ↗(On Diff #144883)

[6, 1, 1, 6, 1, 1]...

425 ↗(On Diff #144883)

[6, 6, 6, 6]...

434 ↗(On Diff #144883)

[1, 3, 1, 3]...

442 ↗(On Diff #144883)

[1, 3, 1, 3, 1, 3]...

451 ↗(On Diff #144883)

[1, 3, 3, 1, 3, 1, 3]...

462 ↗(On Diff #144883)

[1, 3, 3, 1, 3, 1, 3, 1, 3]...

469 ↗(On Diff #144883)

[1, 3, 3, 1, 3]...

evandro added inline comments.May 2 2018, 2:20 PM
lib/Target/AArch64/AArch64SchedExynosM3.td
480 ↗(On Diff #144883)

[1, 3, 3, 1, 3, 1, 3, 1, 3].

courbet updated this revision to Diff 144982.May 2 2018, 11:13 PM

Submit NFC changes to ARM and AMD, rebase.

courbet added inline comments.May 3 2018, 1:53 AM
lib/Target/AArch64/AArch64SchedExynosM3.td
307 ↗(On Diff #144883)

Note that the two following definitions are strictly equivalent:

def M3WriteNEONV   : SchedWriteRes<[M3UnitFDIV, M3UnitFDIV]>  {
  let Latency = 7;
  let NumMicroOps = 1;
  let ResourceCycles = [8,8];
}
def M3WriteNEONV   : SchedWriteRes<[M3UnitFDIV]>  {
  let Latency = 7;
  let NumMicroOps = 1;
  let ResourceCycles = [16];
}

What's the semantics that you're trying to express by splitting into 8*M3UnitFDIV + 8*M3UnitFDIV vs 16*M3UnitFDIV ? The SubtargetEmitter is destroying these semantics anyway.

For reference before this change the definition is equivalent to:

def M3WriteNEONV   : SchedWriteRes<[M3UnitFDIV]>  {
  let Latency = 7;
  let NumMicroOps = 1;
  let ResourceCycles = [9];
}
evandro added inline comments.May 3 2018, 8:10 AM
lib/Target/AArch64/AArch64SchedExynosM3.td
307 ↗(On Diff #144883)

There are two FDIV units in M3:

def M3UnitFDIV : ProcResGroup<[M3UnitFDIV0, M3UnitFDIV1]>;

courbet added inline comments.May 3 2018, 8:28 AM
lib/Target/AArch64/AArch64SchedExynosM3.td
307 ↗(On Diff #144883)

Yes indeed, but this is already covered by the M3UnitFDIV referencing these two.

I'm not familiar with Exynos, what's the semantics that you're trying to express here ?

As written before this change, M3WriteNEONV dispatches one Uop that writes its output with latency 7 and uses either of {M3UnitFDIV0, M3UnitFDIV1} for 9 cycles.
After that change, whatever way ([8,8] or [16]) we write it, M3WriteNEONV dispatches one Uop that writes its output with latency 7 and uses either of {M3UnitFDIV0, M3UnitFDIV1} for 16 cycles.

Maybe that's the case, but since that sounds weird, I just want to confirm.

Does FDIV use both the resources for 8 cycles? Writing [8,8] would imply that.

Does FDIV use both the resources for 8 cycles? Writing [8,8] would imply that.

For FDIVv4f32, yes.

evandro added inline comments.May 3 2018, 9:38 AM
lib/Target/AArch64/AArch64SchedExynosM3.td
307 ↗(On Diff #144883)

I mean that with M3WriteNEONV, for FDIVv4f32, one uop is dispatched to both [M3UnitFDIV0, M3UnitFDIV1] (don't ask me how this is done inside the machine) and each used up for 8 cycles.

311 ↗(On Diff #144883)

And here I mean that with M3WriteNEONV, for FDIVv2f64, one uop is dispatched to both [M3UnitFDIV0, M3UnitFDIV1] and each used up for 13 cycles.

courbet updated this revision to Diff 145165.May 4 2018, 2:27 AM
courbet marked 13 inline comments as done.
  • [ExynosM3] Fix scheduling info.

@evandro Do you have benchmarks to evaluate the impact of this change ?

lib/Target/AArch64/AArch64SchedExynosM3.td
307 ↗(On Diff #144883)

Thanks for the explanation. So you wanted to write:

def M3WriteNEONV   : SchedWriteRes<[M3UnitFDIV0, M3UnitFDIV1]>  {
  let Latency = 7;
  let NumMicroOps = 2;
  let ResourceCycles = [8,8];
}

I've updated the diff with that.

courbet added inline comments.May 4 2018, 2:30 AM
lib/Target/AArch64/AArch64SchedExynosM3.td
376 ↗(On Diff #145165)

This still feels weirdly asymetric. Is that correct ?

@evandro Do you have benchmarks to evaluate the impact of this change ?

Yes. They're not done yet. Will let you know when I get the results.

Thank you.

lib/Target/AArch64/AArch64SchedExynosM3.td
376 ↗(On Diff #145165)

No, it should be [5, 5], thank you.

307 ↗(On Diff #144883)

Thank you.

311 ↗(On Diff #144883)

Err, I meant M3WriteNEONW here. Thank you.

courbet updated this revision to Diff 145427.May 6 2018, 11:25 PM
courbet marked 2 inline comments as done.

Fix a last exynos M3 entry.

@evandro Do you have benchmarks to evaluate the impact of this change ?

Yes. They're not done yet. Will let you know when I get the results.

Any news ?
Thanks.

evandro accepted this revision.May 15 2018, 8:19 AM

@evandro Do you have benchmarks to evaluate the impact of this change ?

Yes. They're not done yet. Will let you know when I get the results.

Any news ?

Sorry, they slipped my mind. I see a minor improvement overall, especially in FP workloads, with a few minor regressions.

Thank you.

@evandro Do you have benchmarks to evaluate the impact of this change ?

Yes. They're not done yet. Will let you know when I get the results.

Any news ?

Sorry, they slipped my mind. I see a minor improvement overall, especially in FP workloads, with a few minor regressions.

Thank you.

OK, thanks. I'll submit this part then.

This revision was not accepted when it landed; it landed in state Needs Review.May 18 2018, 6:17 AM
This revision was automatically updated to reflect the committed changes.
courbet reopened this revision.May 18 2018, 6:21 AM

I accidentally closed this when submitting r332713.

courbet updated this revision to Diff 147494.May 18 2018, 6:24 AM

rebase after r332713

What is still outstanding on this? Have regression tests been done on all the AArch64 cpus @javed.absar @evandro ?

It'd be very useful to get this error reporting added to trunk ASAP as people are taking an interest in improving scheduling right now.

What is still outstanding on this?

A review of ExynosM1 /ThunderX2, as I'm not familiar with these CPUs.

RKSimon added a subscriber: joel_k_jones.

@joel_k_jones Can you comment on these aarch64 model changes please?

A review of ExynosM1 /ThunderX2, as I'm not familiar with these CPUs.

I'm sorry, I completely forgot about M1.

lib/Target/AArch64/AArch64SchedExynosM1.td
329 ↗(On Diff #147494)

[1, 1]

343 ↗(On Diff #147494)

[2, 2, 2]

349 ↗(On Diff #147494)

[2, 1, 1, 1]

356 ↗(On Diff #147494)

[2, 1, 1, 1, 1]

363 ↗(On Diff #147494)

[1, 1, 1, 1, 1]

371 ↗(On Diff #147494)

[1, 1, 1, 1, 1, 1]

377 ↗(On Diff #147494)

[2, 1, 2, 1]

392 ↗(On Diff #147494)

[7, 1, 1, 1, 1]

401 ↗(On Diff #147494)

[1, 7, 1, 7, 1, 1, 1]

412 ↗(On Diff #147494)

[1, 7, 1, 7, 1, 1, 1, 1, 1]

419 ↗(On Diff #147494)

[1, 7, 1, 7, 1]

432 ↗(On Diff #147494)

[1, 7, 1, 7, 1, 1, 1, 1, 1, 1, 1]

courbet updated this revision to Diff 150263.Jun 7 2018, 12:31 AM
courbet marked 12 inline comments as done.

Update ExynosM1.

lib/Target/AArch64/AArch64SchedExynosM1.td
343 ↗(On Diff #147494)

I set NumMicroOps = 6 to be consistent. Is that OK ?

349 ↗(On Diff #147494)

NumMicroOps = 5

356 ↗(On Diff #147494)

NumMicroOps = 6

377 ↗(On Diff #147494)

NumMicroOps = 6

392 ↗(On Diff #147494)

NumMicroOps = 11

401 ↗(On Diff #147494)

ditto

412 ↗(On Diff #147494)

ditto

419 ↗(On Diff #147494)

ditto

432 ↗(On Diff #147494)

ditto

courbet updated this revision to Diff 150267.Jun 7 2018, 12:44 AM
  • fix btver2

More X86 typos crept in while this was pending; I fixed some harmless ones in r334164, but fixing the ones in BtVer2 would actually introduce semantic differences, so I've added it back here.

RKSimon added inline comments.Jun 7 2018, 2:22 AM
lib/Target/X86/X86ScheduleBtVer2.td
371 ↗(On Diff #150267)

These are 2 uops not 4, btver2 splits the issue and execution resource.

courbet marked an inline comment as done.Jun 7 2018, 2:28 AM
courbet added inline comments.
lib/Target/X86/X86ScheduleBtVer2.td
371 ↗(On Diff #150267)

OK cool, thanks. So this is an NFC.

RKSimon added inline comments.Jun 7 2018, 6:36 AM
lib/Target/X86/X86ScheduleBtVer2.td
371 ↗(On Diff #150267)

No, you need to set it back to 2

courbet updated this revision to Diff 150321.Jun 7 2018, 6:46 AM
courbet marked an inline comment as done.

Rebase after r334178.

courbet added inline comments.Jun 7 2018, 6:46 AM
lib/Target/X86/X86ScheduleBtVer2.td
371 ↗(On Diff #150267)

Sorry, yes, that's what I meant; I just did not re-export to phabricator. I submitted r334178 with 2 here.

@evandro Does AArch64SchedExynosM1.td and AArch64SchedThunderX2T99.td look correct now please?

Update ExynosM1.

No, the number of uops was correct. Don't ask how, but some uops can be dispatched to different units at the same time. Please, restore the original number, unless it creates any error. Then, we can talk.

@evandro Does AArch64SchedExynosM1.td and AArch64SchedThunderX2T99.td look correct now please?

Except for my remark above on M1, the changes to X2T99 seem reasonable to me. But I am not familiar with this Cavium processor. It'd be better if someone from Cavium would chime in, but their silence, after so long, might be taken as consent.

courbet updated this revision to Diff 150680.Jun 11 2018, 12:12 AM

revert NumMicroOps changes to ExynosM1.

courbet updated this revision to Diff 150682.Jun 11 2018, 12:37 AM

Rebase after submitting ExynosM1 changes as r334391.

I'll submit the cavium changes in a couple of days unless there are reservations.
@joel_k_jones any comments ?

RKSimon accepted this revision.Jun 13 2018, 2:26 AM

LGTM - thank @courbet

This revision is now accepted and ready to land.Jun 13 2018, 2:26 AM

Thanks all for the reviews.

This revision was automatically updated to reflect the committed changes.