Page MenuHomePhabricator

Backend for NEC SX-Aurora
ClosedPublic

Authored by simoll on Oct 17 2019, 5:05 AM.

Details

Summary

This patch registers the 've' target: the NEC SX-Aurora TSUBASA Vector Engine.

This follows up on the announcement on llvm-dev: https://lists.llvm.org/pipermail/llvm-dev/2019-April/131580.html
We have a poster & lightning talk at the upcoming DevMtg: http://llvm.org/devmtg/2019-10/talk-abstracts.html#lit6

  • Testing
    • Reference implementation is tested continuously at NEC.
    • We are planning to setup a buildslave reporting to LLVM buildmaster.
  • Code owner of the VE target
    • Simon Moll

A rough roadmap:

  1. Registers, encodings, scalar instructions, clang tooling, ..
  2. Vector instruction support through target-specific intrinsics.
  3. Support for LLVM-VP as it becomes available in LLVM upstream.

Diff Detail

Event Timeline

simoll created this revision.Oct 17 2019, 5:05 AM
efocht added a subscriber: efocht.Oct 17 2019, 5:28 AM
bcain added a subscriber: bcain.Oct 18 2019, 6:19 AM
bcain added a comment.Oct 18 2019, 7:07 AM

Shouldn't the first patch add to LLVM_TARGETS_TO_BUILD or LLVM_EXPERIMENTAL_TARGETS_TO_BUILD?

llvm/test/CodeGen/VE/target_support.ll
2

not not can't be right, can it? Just run llc directly.

simoll marked an inline comment as done.Oct 22 2019, 5:15 PM

Shouldn't the first patch add to LLVM_TARGETS_TO_BUILD or LLVM_EXPERIMENTAL_TARGETS_TO_BUILD?

This patch adds VE as an experimental target, which is not build by default. If users want VE support, they add it to LLVM_EXPERIMENTAL_TARGETS_TO_BUILD when they configure the build.

llvm/test/CodeGen/VE/target_support.ll
2

The 'not not' makes sure the test does not crash when 'llc' crashes (which is anticipated in a successful run). We need this because if the tests succeeds, the 'llc' call crashes trying to create initialize VE codegen. The test fails when 'llc' reports that 've' is not a valid target.

asl added a subscriber: asl.Oct 22 2019, 5:25 PM
asl added inline comments.
llvm/test/CodeGen/VE/target_support.ll
2

Can't you just check the output of llc --version then? Expecting llc to crash is quite ugly...

simoll updated this revision to Diff 226140.Oct 23 2019, 8:06 AM
simoll marked 3 inline comments as done.

Thanks for your feedback! Fixes in this update:

  • Using llc --version in the target test.
  • comment cleanup.
rengolin added a subscriber: rengolin.

This plan has all the requirements for adding a new target (http://llvm.org/docs/DeveloperPolicy.html#new-targets), mainly: ISA document, current implementation, code owner, buildbots. You just need to make sure it's marked as *experimental*, as stated in the policy.

The Github repo mentioned (branch develop) is a bit hard to make out what are the changes necessary, as it has only merges from other branches. Given that we'll have to judge what the next series of patches will be, and that you'll have to upstream them in order, it would be good if they were more or less organised in the way they will be upstreamed in the source. This will also help you rebase changes that are requested in the reviews, which weren't originally in your repo, and test them before continuing the upstreaming process.

Other than that, it would be nice if other code owners could review and approve.

arsenm added a subscriber: arsenm.Oct 27 2019, 2:40 PM

Could use some unit tests in TripleTest

llvm/lib/Target/LLVMBuild.txt
39

Targets usually start out as experimental targets, not added to the default build list (although IMO this is more trouble than it's worth in practice)

llvm/lib/Target/VE/MCTargetDesc/VEMCTargetDesc.h
17–19

Unnecessary includes? Although I guess this is just a stub

craig.topper added inline comments.Oct 27 2019, 9:03 PM
llvm/lib/Support/Triple.cpp
1473

It looks like this list was supposed to be alphabetized, but then renderscript32/64 was added out of order? Should we just put this back in order?

1565

Same here

simoll planned changes to this revision.Oct 28 2019, 5:42 AM
simoll marked 3 inline comments as done.

Could use some unit tests in TripleTest

Ok. Skimming through the tests there, it seems one parser test for a ve-X-Y triple should be enough, right?

llvm/lib/Support/Triple.cpp
1473

I guess the order of backend names is not a critical issue. I'd bring the target lists back into alphabetical order in a quick follow-up after this patch.

llvm/lib/Target/LLVMBuild.txt
39

It is my understanding that a target is experimental, if it is not added by default to LLVM_TARGETS_TO_BUILD (it is not in LLVM_ALL_TARGETS of `llvm/CMakeLists.txt).

If this line is removed the cmake configure step crashes when the target is added to LLVM_EXPERIMENTAL_TARGETS_TO_BUILD.

llvm-build: error: invalid target to enable: 'VE' (not in project)
llvm/lib/Target/VE/MCTargetDesc/VEMCTargetDesc.h
17–19

Yep. You are looking at a close-to-minimal slice of the current backend source on github. Should i strip these includes? (to add them later?)

simoll updated this revision to Diff 226671.Oct 28 2019, 8:44 AM

Thanks for your comments!

Added the suggested TripleTest.

This plan has all the requirements for adding a new target (http://llvm.org/docs/DeveloperPolicy.html#new-targets), mainly: ISA document, current implementation, code owner, buildbots. You just need to make sure it's marked as *experimental*, as stated in the policy.

Great! How do we mark the target to make it clear it's experimental?

The Github repo mentioned (branch develop) is a bit hard to make out what are the changes necessary, as it has only merges from other branches. Given that we'll have to judge what the next series of patches will be, and that you'll have to upstream them in order, it would be good if they were more or less organised in the way they will be upstreamed in the source. This will also help you rebase changes that are requested in the reviews, which weren't originally in your repo, and test them before continuing the upstreaming process.

That's a good point. We probably won't stick to the existing commits but repackage the github code into a linear sequence of testable patches for review. We will go for instruction encodings, register decl and scalar codegen first, then extend to vector intrinsics and finally to LLVM-VP.

How much of that do you need to see before this patch here can be approved? I am asking because as we are refactoring the out-of-tree patches for upstreaming, it'd ease the pain of rebasing if some of the changes outside of lib/Target/VE could already go in. Beyond this patch, the critical changes are:

  • TableGen:
    • IntrinsicEmitter (Additional IIT_Info enum entries).
    • CodeGenTarget (MVT enum entries).
  • Clang: VE toolchain.

Later, we will add target-specific vector intrinsics to Clang and LLVM.

Other than that, it would be nice if other code owners could review and approve.

Thanks! :)

Great! How do we mark the target to make it clear it's experimental?

There used to be different registers in CMake and some structs, I can't remember. There are some comments in here that may help, but ultimately, you want to make sure that your target is only built if explicitly named, so we don't break bots.

How much of that do you need to see before this patch here can be approved? I am asking because as we are refactoring the out-of-tree patches for upstreaming, it'd ease the pain of rebasing if some of the changes outside of lib/Target/VE could already go in.

Ideally you'd like to at least show the next steps (in the order you named). This could be a tentative Phab review, or a branch on your Github repo that shows it, but it would have to end up in Phab in a way that can be merged pretty soon.

What we're trying to avoid here is to have a long period (potentially across releases) with just the stub in and no real code. The current patch is just a marker and it would be good to see you're on the right path early on, so make any strong refactory before starting the patch set.

Beyond this patch, the critical changes are:

  • TableGen:
    • IntrinsicEmitter (Additional IIT_Info enum entries).
    • CodeGenTarget (MVT enum entries).
  • Clang: VE toolchain.

    Later, we will add target-specific vector intrinsics to Clang and LLVM.

I'm also interested in how you're going to connect to the rest of the toolchain. Are you emitting direct binaries (are they ELF-like?) or assembly that gets lowered by a proprietary tool (like NVidia)?

Also, what is the subset of IR that can work in your target (if any) and will Clang be responsible for filtering it or do we barf in the middle-end?

I'm expecting the loop vectoriser to be of interest, because if all we have is intrinsics, than it's easier to write assembly directly. :)

--renato

simoll added a comment.EditedOct 28 2019, 12:42 PM

Great! How do we mark the target to make it clear it's experimental?

There used to be different registers in CMake and some structs, I can't remember. There are some comments in here that may help, but ultimately, you want to make sure that your target is only built if explicitly named, so we don't break bots.

Okay. I'll double check cmake only configures the build with the target when its explicitly added to LLVM_EXPERIMENTAl_TARGETS_TO_BUILD. Some pointers to the structs in question would be very helpful.

How much of that do you need to see before this patch here can be approved? I am asking because as we are refactoring the out-of-tree patches for upstreaming, it'd ease the pain of rebasing if some of the changes outside of lib/Target/VE could already go in.

Ideally you'd like to at least show the next steps (in the order you named). This could be a tentative Phab review, or a branch on your Github repo that shows it, but it would have to end up in Phab in a way that can be merged pretty soon.

What we're trying to avoid here is to have a long period (potentially across releases) with just the stub in and no real code. The current patch is just a marker and it would be good to see you're on the right path early on, so make any strong refactory before starting the patch set.

I understand. We will repackage the github code into digestible commits and put them on phab. I'll get back to you when the patch sets leading up to full scalar codegen are ready.

Beyond this patch, the critical changes are:

  • TableGen:
    • IntrinsicEmitter (Additional IIT_Info enum entries).
    • CodeGenTarget (MVT enum entries).
  • Clang: VE toolchain.

    Later, we will add target-specific vector intrinsics to Clang and LLVM.

I'm also interested in how you're going to connect to the rest of the toolchain. Are you emitting direct binaries (are they ELF-like?) or assembly that gets lowered by a proprietary tool (like NVidia)?

Also, what is the subset of IR that can work in your target (if any) and will Clang be responsible for filtering it or do we barf in the middle-end?

Any code that would run on your CPU really :) The full application binary (ELF, btw) runs on the card by default and dispatches systemcalls to a proxy process on the host. The github code already implements full scalar instruction codegen (and there are VE implementations for libcxx/libcxxabi/libunwind, ..).

I'm expecting the loop vectoriser to be of interest, because if all we have is intrinsics, than it's easier to write assembly directly. :)

Well, we already use the target-specific intrinsics for hand-written Tensorflow kernels. However, we are planning to implement standard vector instructions and LLVM-VP once its available upstream. The Region Vectorizer is already capable of emitting VP intrinsics. The current loop vectorizer would need to be extended to support tail loop predication through setting the active vector length to be useful for the VE (Otw, eg for packed f32 mode, there would be up to 511 scalarized remainder iterations, ..).

I understand. We will repackage the github code into digestible commits and put them on phab. I'll get back to you when the patch sets leading up to full scalar codegen are ready.

Thanks!

Any code that would run on your CPU really :) The full application binary (ELF, btw) runs on the card by default and dispatches systemcalls to a proxy process on the host. The github code already implements full scalar instruction codegen (and there are VE implementations for libcxx/libcxxabi/libunwind, ..).

Awesome! Hopefully the callbacks into CPU are rare enough in optimised code that it doesn't get to be a bottleneck.

Well, we already use the target-specific intrinsics for hand-written Tensorflow kernels. However, we are planning to implement standard vector instructions and LLVM-VP once its available upstream. The Region Vectorizer is already capable of emitting VP intrinsics. The current loop vectorizer would need to be extended to support tail loop predication through setting the active vector length to be useful for the VE (Otw, eg for packed f32 mode, there would be up to 511 scalarized remainder iterations, ..).

Interesting. I'm guessing VPlan would be a must-have to get decent performance in this target.

simoll updated this revision to Diff 233296.Dec 11 2019, 2:56 AM
  • removed vector types from DataLayout

We've sliced up the backend into 5 different patches. The first three patches leave you with functional scalar code generation for the VE target:

  1. VE,#0 64bit data for SjLj - is a preparatory patch for exception handling.
  2. VE,#1 Scalar code generation - full scalar code generation.
  3. VE,#2 Clang toolchain for SX-Aurora - makes the VE target available in Clang.
  4. VE,#3 Runtime libraries - basic adaptations of the libcxx, libcxxabi,libunwind, compiler-rt, openmp runtime libraries.
  5. VE,#4 Target vector intrinsics - Target-specific vector intrinsics.

I am currently preparing a patch to register our staging buildbot.
As announced on llvm-dev there are some more patches for the test-suite in the pipe to set it up for the VE target.

arsenm added inline comments.Dec 11 2019, 7:36 AM
llvm/include/llvm/ADT/Triple.h
481

Stray change?

simoll marked an inline comment as done.Dec 11 2019, 7:59 AM
simoll added inline comments.
llvm/include/llvm/ADT/Triple.h
481

Not sure what you mean, there is no change in this line?

khchen added a subscriber: khchen.Dec 12 2019, 8:42 AM
simoll added a project: Restricted Project.Dec 13 2019, 12:53 AM

The child patches show the next steps and attach a staging buildbot for the VE target as requested.
Please let us know how to proceed from here.

arsenm accepted this revision.Dec 21 2019, 1:26 AM

LGTM

llvm/lib/Target/VE/VETargetMachine.cpp
31

I think just having a single string would be better unless you're planning on adding checks to control these, but it's not important

This revision is now accepted and ready to land.Dec 21 2019, 1:26 AM
simoll marked an inline comment as done.Tue, Jan 7, 1:08 AM

Thanks for the review. This patch has been simmering for a while. I'll wait until Thursday to commit this to allow others to chime in.

llvm/lib/Target/VE/VETargetMachine.cpp
31

We will add vector type alignments in the forthcoming patches. With the DL string spread out over multiple lines those appear more cleanly in the diff.

Jim added a subscriber: Jim.Tue, Jan 7, 1:17 AM
simoll updated this revision to Diff 236975.Thu, Jan 9, 1:02 AM

Rebased

This revision was automatically updated to reflect the committed changes.
simoll added a comment.Thu, Jan 9, 3:27 AM

Thanks! Sorry for the collateral damage.