Page MenuHomePhabricator

Add Marvell ThunderX3T110 support
ClosedPublic

Authored by wxz2020 on Apr 14 2020, 10:08 AM.

Details

Summary

This is the first checkin to support Marvell ThunderX3T110.

Initial definition of the micro-ops of the instructions in ThunderX3T110 is included.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Herald added a project: Restricted Project. · View Herald TranscriptApr 14 2020, 10:08 AM
wxz2020 updated this revision to Diff 257876.Apr 15 2020, 3:43 PM

The newly added two files were not accepted by the pre-merge check. Re worked on the diff file and uploaded it now.

wxz2020 updated this revision to Diff 259605.Apr 23 2020, 9:27 AM

Resubmit as the previous one was rejected by test plan changes

wxz2020 updated this revision to Diff 259737.Apr 23 2020, 3:26 PM

fix a format issus.

Fix a format issue.

ktkachov added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.td
849–857 ↗(On Diff #259737)

IIRC these instructions are deliberately allowed in pre-armv8.3 targets because they are encoded in the NOP-space and can be deployed on pre-armv8.3 targets

llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td
10

Typo in the processor name?

wxz2020 marked 2 inline comments as done.Apr 24 2020, 8:43 AM
wxz2020 added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.td
849–857 ↗(On Diff #259737)

I will do some research on this.

llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td
10

Sure, thanks for spotting it. I will have it fixed.

wxz2020 updated this revision to Diff 260680.Apr 28 2020, 10:11 AM

Remove non-official target options. We now only support -mcpu=thunderx3t110

Typo fix.

Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2020, 10:11 AM
wxz2020 marked an inline comment as done.Apr 28 2020, 1:41 PM
wxz2020 added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.td
849–857 ↗(On Diff #259737)

According to the documents, pointer authenticatoin got officially supporoted starting from armv8.3.

chill added a subscriber: chill.Apr 29 2020, 12:57 AM
chill added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.td
849–857 ↗(On Diff #259737)

These instructions are executed as NOP on pre v8.3-A architectures. It allows you to have a single compatible binary that works correctly on pre v8.3-a (ofc, without pointer authentication), as well as on
8.3-a and later cores, with pointer authentication.

Please, remove the predicates.

DavidSpickett added inline comments.
clang/lib/Driver/ToolChains/Arch/AArch64.cpp
89 ↗(On Diff #260680)

Remove stray change.

llvm/include/llvm/Support/AArch64TargetParser.def
167–171

Is this change also correcting the options for the x2t99? I think that should be a separate patch if so.

DavidSpickett added inline comments.Apr 29 2020, 1:57 AM
llvm/include/llvm/Support/AArch64TargetParser.def
168

Add a tests for this in llvm/unittests/Support/TargetParserTest.cpp (there are existing ones for "thunderx2t99").
Might need to update "NumAArch64CPUArchs" too.

SjoerdMeijer added inline comments.
llvm/lib/Target/AArch64/AArch64.td
856

HasV8_3aOps implies HasV8_2aOps, which implies HasV8_1aOps.
So you can just remove HasV8_1aOps?

llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td
14

I don't intend to check the numbers here, but just curious if there's an optimisation guide if people are curious?

llvm/test/CodeGen/AArch64/loop-micro-op-buffer-size-t110.ll
1 ↗(On Diff #260680)

Couple of question about this test.
Looks like you're both testing output on stdout and stderr, and your checks for this are interleaved. If I am not mistaken, the behaviour of this can vary on different platforms, and thus may fail on different platforms.

But more importantly, what is this test exactly testing? The file name gives away a hint of course, but I don't see yet how this interact with loop unrolling. Is the loop unroller looking at MaxInterleaveFactor that is set here, but is that what we are testing here?

25 ↗(On Diff #260680)

I guess there won't be another define %val, it will be %val6, so this CHECK-NOT will never match even if there's another add?

wxz2020 marked 7 inline comments as done.May 1 2020, 10:52 AM
wxz2020 added inline comments.
clang/lib/Driver/ToolChains/Arch/AArch64.cpp
89 ↗(On Diff #260680)

Removed

llvm/include/llvm/Support/AArch64TargetParser.def
167–171

I will have it removed. irrelevant to this change.

168

OK, we will have it fixed

llvm/lib/Target/AArch64/AArch64InstrInfo.td
849–857 ↗(On Diff #259737)

Agree. Will remove it.

llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td
14

We will release a document about the details soon.

llvm/test/CodeGen/AArch64/loop-micro-op-buffer-size-t110.ll
25 ↗(On Diff #260680)

This test is for our internal use. Sorry, Will have it removed.

wxz2020 updated this revision to Diff 261497.May 1 2020, 10:55 AM
wxz2020 marked an inline comment as done.

fixed all the feedback suggestions. Thanks,

In email Wei asked for help about he following error message:

error message from tblgen
Included from /home/wei/project/tx3/llvm-project/llvm/lib/Target/AArch64/AArch64.td:439:
/home/wei/project/tx3/llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.td:961:5: error: 'CycloneModel' lacks information for 'AUTDZA'

def DZA  : SignAuthZero<prefix_z, 0b10, !strconcat(asm, "dza")>;
^
In the end:
Incomplete schedule models found.

·         Consider setting 'CompleteModel = 0' while developing new models.
·         Pseudo instructions can be marked with 'hasNoSchedulingInfo = 1'.
·         Instructions should usually have Sched<[...]> as a superclass, you may temporarily use an empty list.
·         Instructions related to unsupported features can be excluded with list<Predicate> UnsupportedFeatures = [HasA,..,HasY]; in the processor model.
error: Incomplete schedule model

And the comment from the person on the email was to define the instruction. The instruction is defined, as evidenced when table gen is run to produce the record-list. The error message can be suppressed by defining CompleteModele = 0, but that isn't correct, as there are models for those instructions.

However, I now note that at least In the output that Wei captured, it isn't complaining about the ThunderX3T110 model, but about Cyclone.

In email Wei asked for help about he following error message:

error message from tblgen
Included from /home/wei/project/tx3/llvm-project/llvm/lib/Target/AArch64/AArch64.td:439:
/home/wei/project/tx3/llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.td:961:5: error: 'CycloneModel' lacks information for 'AUTDZA'

def DZA  : SignAuthZero<prefix_z, 0b10, !strconcat(asm, "dza")>;
^
In the end:
Incomplete schedule models found.

·         Consider setting 'CompleteModel = 0' while developing new models.
·         Pseudo instructions can be marked with 'hasNoSchedulingInfo = 1'.
·         Instructions should usually have Sched<[...]> as a superclass, you may temporarily use an empty list.
·         Instructions related to unsupported features can be excluded with list<Predicate> UnsupportedFeatures = [HasA,..,HasY]; in the processor model.
error: Incomplete schedule model

And the comment from the person on the email was to define the instruction. The instruction is defined, as evidenced when table gen is run to produce the record-list. The error message can be suppressed by defining CompleteModele = 0, but that isn't correct, as there are models for those instructions.

However, I now note that at least In the output that Wei captured, it isn't complaining about the ThunderX3T110 model, but about Cyclone.

Hi Joel,
Thanks for your message, and apologies for not replying earlier to the updates; missed the notification.
I am now taking a closer at some changes I hadn't looked at. In the mean, some drive-by comments:

It is okay'ish to set CompleteModel = 0 if you're not interested in describing all instructions. The other way is of course to add the missing instruction to the model.

But first things first. @chill commented earlier about predicates. I also don't think you need to add HasV8_3a for the reason Momchill described, and there is still one left AArch64InstrInfo.td.

And while I am still looking into this, let me ask the question what I am wondering: why do you need to change the other scheduling models? These instructions are always supported, as a NOP, or otherwise.

It is okay'ish to set CompleteModel = 0 if you're not interested in describing all instructions. The other way is of course to add the missing instruction to the model.

We are trying to describe all instructions. The issue Wei is having is that the pointer auth instructions are being called out in the error message, but their timing information is defined—see the very end of AArch64SchedThunderX3T110.td for AUTDZA.

But first things first. @chill commented earlier about predicates. I also don't think you need to add HasV8_3a for the reason Momchill described, and there is still one left AArch64InstrInfo.td.

And while I am still looking into this, let me ask the question what I am wondering: why do you need to change the other scheduling models? These instructions are always supported, as a NOP, or otherwise.

The changes to UnsupportedFeatures that was in every model has been taken out. I'm not sure why UnsupportedFeatures exists at all, but it isn't just AArch64 that has this.

Should the latest version be put up even if it fails on table-gen? That would make things easier to review.

wxz2020 marked an inline comment as done.May 6 2020, 7:53 AM

I tried to set CompleteModel = 0 on line 25 in file llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td

The same error is still there.

llvm/lib/Target/AArch64/AArch64InstrInfo.td
849–857 ↗(On Diff #259737)

One questions here, we want to model these instructions in our machine model, scroll down to line 1992-1997, in
llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td below.

However, if we remove this predicate here, tblgen will complain the following errors for all the instructions defined in line 1992-1997:

error message from tblgen

Included from /home/wei/project/tx3/llvm-project/llvm/lib/Target/AArch64/AArch64.td:439:
/home/wei/project/tx3/llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.td:961:5: error: 'CycloneModel' lacks information for 'AUTDZA'

def DZA  : SignAuthZero<prefix_z, 0b10, !strconcat(asm, "dza")>;
^

In the end:
Incomplete schedule models found.

  • Consider setting 'CompleteModel = 0' while developing new models.
  • Pseudo instructions can be marked with 'hasNoSchedulingInfo = 1'.
  • Instructions should usually have Sched<[...]> as a superclass, you may temporarily use an empty list.
  • Instructions related to unsupported features can be excluded with list<Predicate> UnsupportedFeatures = [HasA,..,HasY]; in the processor model.

error: Incomplete schedule model

Anyone has clue why we have this problem? How should I model these instructions in the .md file.

Thanks,

SjoerdMeijer added a comment.EditedMay 6 2020, 11:02 AM

Since debugging on phabricator is a bit difficult, I took the patch and had a little play. Now I actually remember seeing this before. I think the way this works is that when you describe new instructions (PA in this case) that other models don't have, they start complaining about incomplete models when they have set let CompleteModel = 1

I got things compiling with these changes:

  1. leave all existing models untouched, revert those changes,
  1. remove that predicate HasPA that you added,
  1. Change this:

def SVEUnsupported : AArch64Unsupported {
let F = [HasSVE, HasSVE2, HasSVE2AES, HasSVE2SM4, HasSVE2SHA3,

  • HasSVE2BitPerm]; + HasSVE2BitPerm, HasPA]; }
  1. In your new model set:

let CompleteModel = 0;
list<Predicate> UnsupportedFeatures = SVEUnsupported.F;

  1. This is where the problem is, comment out these descriptions in your model:

def : InstRW<[THX3T110Write_7Cyc_I1],
(instrs AUTDA, AUTDB, AUTDZA, AUTDZB, AUTIA, AUTIA1716, AUTIASP,
AUTIAZ, AUTIB, AUTIB1716, AUTIBSP, AUTIBZ, AUTIZA, AUTIZB,
PACDA, PACDB, PACDZA, PACDZB, PACGA, PACIA, PACIA1716,
PACIASP, PACIAZ, PACIB, PACIB1716, PACIBSP, PACIBZ,
PACIZA, PACIZB, XPACD, XPACI, XPACLRI)>;

I have never really looked into these instructions, and quick grep doesn't show any descriptions for some. For example, AUTDA doesn't exist. The others look aliases, perhaps they need a different treatment, or a regexp, I don't know, but perhaps you can take it from here.

I can confirm your findings, basically as long as we comment out the new instructions, it will be OK to build.

Agree with your changes 1), 2) and 3).

  1. this seems does not matter to build, as long as we comment out the new instructions in 5), setting it "1" or "0", both are OK. they all can be compiled
  1. if we comment out these instructions, then we will not be able to schedule them efficiently in the instruction scheduler. I know in other models or earlier versions, they could be treated as NOP, but in our new machine, they do have a timing effect, 7 cycles to finish. Any way to represent this in .md?

Thanks a lot for the help,

I don't think it makes sense to combine two unrelated things SVE and PA support into a combined thing. Since we already have UnsupportedFeatures in every sub-target .td file, I think it would be better to instead have:

def PAUnsupported : AArch64Unsupported {
  let F = [HasPA];
}

and modify each .td file to have

list<Predicate> UnsupportedFeatures = !listconcat(SVEUnsupported.F, PAUnsupported.F);
list<Predicate> UnsupportedFeatures = !listconcat(SVEUnsupported.F, PAUnsupported.F);

Agree. Methinks that this is easier to read.

Agreed about the UnsupportedFeatures list. But I just wanted to unblock this work, create a first version that compiles, so you can pick it up and clean things further up.

wxz2020 updated this revision to Diff 262665.May 7 2020, 8:26 AM

I put every people's feedbacks into the code and upload them here.

I think we can use it as the start point for TX3, and add more when we move on.

Thank you all for great help,

I will put the "PAUnsupported" predicate later once this got passed.

Let's separate out HasPA from SVEFeatures now while we are at it, probably it's more work to do this as a follow up, and after that this looks good to me.

Bonus points for adding some llvm-mca tests, see the llvm-project/llvm/test/tools/llvm-mca/ directory for some inspiration how this latencies can be checked, but that seems like something for follow-up.

llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td
1993

just remove them then instead of commenting them out.

wxz2020 updated this revision to Diff 262876.May 8 2020, 7:42 AM

Removed the PA related instructions from the .md file as suggested.

I think I made all the changes mentioned in the feedbacks. Please take sometime to review the current version. We are eager to get the first thunderx3t110 checked in.

There are a few things we plan to do after the 1st version:

  1. add more tests as suggested in the review
  2. make changes to the predicate for clarity and easy readability
  3. solve the PA instruction issue

Let me know if there is anything missing. Thanks,

I don't think it makes sense to combine two unrelated things SVE and PA support into a combined thing. Since we already have UnsupportedFeatures in every sub-target .td file, I think it would be better to instead have:

def PAUnsupported : AArch64Unsupported {
  let F = [HasPA];
}

and modify each .td file to have

list<Predicate> UnsupportedFeatures = !listconcat(SVEUnsupported.F, PAUnsupported.F);

Can you add what Joel suggested? I don't see the point of doing this as a follow up. This very simple thing is done half right here, so is best fixed here as there is no reason to leave this for another time; following up on this and talking about this is more work than just doing it.

Sure I will add Joel's suggestion and other fixes later.

Thanks,

wxz2020 updated this revision to Diff 263070.May 10 2020, 7:36 AM
  1. Added the predicates as suggested
  2. brought back the TargetParserTest.cpp which was missed last time
wxz2020 updated this revision to Diff 263079.May 10 2020, 12:20 PM

Fix the format issue in TargetParserTest.cpp

wxz2020 updated this revision to Diff 263081.May 10 2020, 12:58 PM

fix a typo, thunderx3t110, not thunderx3t100 in TargetParserTest.cpp

SjoerdMeijer added a comment.EditedMay 11 2020, 12:48 AM

This looks good now, but sorry, one more request: I've just noticed a Clang driver test is missing. Can you add a test for this to clang/test/Driver/aarch64-cpus.c? And related to this, the relevant tests to llvm/unittests/Support/TargetParserTest.cpp?

wxz2020 updated this revision to Diff 263543.May 12 2020, 3:41 PM
  1. in clang/test/Driver/aarch64-cpus.c, expanded the the same tests for thunderx2t99 to thunderx3t110
  2. in llvm/unittest/Support/TargetParserTest.cpp, fix the missing target features

So far all the existing issues are fixed according to review feedbacks

Thanks

wxz2020 updated this revision to Diff 263582.May 12 2020, 6:05 PM

Correct the wrong version of llvm/unittests/Support/TargetParserTest.cpp which causes the test failure.

SjoerdMeijer accepted this revision.May 13 2020, 12:39 AM

Thanks, LGTM

This revision is now accepted and ready to land.May 13 2020, 12:39 AM

Forgot to ask/add: can you commit this, do you have commit rights?

Joel will help me to commit it. Thanks,

This revision was automatically updated to reflect the committed changes.

I have a question here. Our customers need the TX3 support urgently, they asked me to put this TX3 support to LLVM 10.0.1 and if there is 9.0.2 release planned, they also need it there.

How can I do this? To put this checkin to LLVM 10.0.1 branch and 9.0.2 branch?

Is it possible? What steps I should follow?

Thanks,

SjoerdMeijer added a comment.EditedMay 19 2020, 8:52 AM

Best to ask the release manager. Today Tom Stellard posted a message to the llvm dev list with subject:

[llvm-dev] LLVM 10.0.1-rc1 release update

So best to reply to this(*) asap with your query if it's possible to get this onto the branch.

(*) http://lists.llvm.org/pipermail/llvm-dev/2020-May/141670.html

Thanks, Just sent Tom an email.