This is an archive of the discontinued LLVM Phabricator instance.

Add Connex vector processor back end
Needs Review, Public

Authored by alexsusu on Mar 31 2019, 5:16 PM.

Details

Summary
Connex is an established, almost 30-year-old, very wide research vector processor (see, for example, http://users.dcae.pub.ro/~gstefan/2ndLevel/connex.html) with between 32 and 4096 lanes; the lane count is easily changed at synthesis time.
A very interesting feature is that the Connex processor has a local banked vector memory (each lane has its own local memory), which achieves 1-cycle latency for both direct and indirect loads and stores; this implies a very high memory bandwidth.

The Connex vector processor has 16-bit signed integer Execution Units in each lane. It efficiently emulates (by inlining the emulation subroutines in the instruction selection pass) 32-bit int and IEEE 754-2008 compliant 16-bit floating point (Clang type _Float16, C for ARM __fp16, LLVM IR half type). The emulation subroutines are in the lib/Target/Connex/Select_*_OpincaaCodeGen.h files, which are included in the ConnexISelDAGToDAG.cpp module, in the ConnexDAGToDAGISel::Select() method. These emulation subroutines can be easily adjusted, for example to increase performance by sacrificing f16 accuracy; drop me an email to ask how you can do it. (They currently total almost 1 MB of C++ code.)
The Connex vector processor does not currently support the float, double, or 64-bit integer types.
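
To make the idea of emulating 32-bit integers on 16-bit lanes concrete, here is a minimal sketch of a 32-bit addition built only from 16-bit additions and a carry. It is not taken from the Select_*_OpincaaCodeGen.h files; the names U32Halves, split32, join32 and add32 are hypothetical, and unsigned 16-bit halves are assumed so that wrap-around is well defined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: a 32-bit value kept as two 16-bit halves,
// roughly as a 16-bit lane would have to hold it.
struct U32Halves {
  uint16_t lo;
  uint16_t hi;
};

// Split a 32-bit value into its low and high 16-bit halves.
U32Halves split32(uint32_t v) {
  return {static_cast<uint16_t>(v & 0xFFFFu), static_cast<uint16_t>(v >> 16)};
}

// Reassemble the 32-bit value from its halves.
uint32_t join32(U32Halves h) {
  return (static_cast<uint32_t>(h.hi) << 16) | h.lo;
}

// 32-bit addition expressed with 16-bit additions only.
U32Halves add32(U32Halves a, U32Halves b) {
  U32Halves r;
  r.lo = static_cast<uint16_t>(a.lo + b.lo);
  // If the low half wrapped around, it is smaller than either operand.
  uint16_t carry = (r.lo < a.lo) ? 1 : 0;
  r.hi = static_cast<uint16_t>(a.hi + b.hi + carry);
  return r;
}
```

An emulated operation like this costs several 16-bit instructions per lane, which is presumably why the subroutines are inlined during instruction selection rather than called.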

More precisely, the back end targets a low-power variant of the Connex processor, used as an accelerator. The working compiler is described at https://sites.google.com/site/alexsusu/myfilecabinet/OpincaaLLVM_TR_UPB.pdf .

Note that currently our back end targets only our Connex Opincaa assembler (very easy to learn and use) available at https://gitlab.dcae.pub.ro/research/ConnexRelated/opincaa/ .
The Connex Opincaa assembler allows running code that is agnostic to both the Connex vector length and the host (CPU).

The ISA of the Connex vector processor is available at https://gitlab.dcae.pub.ro/research/ConnexRelated/opincaa/blob/master/ConnexISA.pdf .
The Connex vector processor also has an open-source C++ simulator, available at the same location: https://gitlab.dcae.pub.ro/research/ConnexRelated/opincaa/ .

The mailing list for the Connex processor and tools is: https://groups.google.com/forum/#!forum/connex-tools .

An interesting feature is that, in order to support recovering from the instruction selection pass's SelectionDAG back to the original source (C) code, we require adding a simple data structure in include/llvm/CodeGen/SelectionDAG.h (and helper methods in related files) that links each SDValue to the LLVM IR Value object it was translated from (the map is keyed by the Value):
   DenseMap<const Value*, SDValue> *crtNodeMapPtr
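
As a rough illustration of the shape of such a map, the sketch below uses std::unordered_map and two simplified stand-in structs instead of llvm::DenseMap, llvm::Value and llvm::SDValue. The names IRValue, DagValue, NodeMap and findSource are hypothetical and do not come from the patch; the reverse lookup is only one assumed way the recovery could be done.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Simplified stand-ins for llvm::Value and llvm::SDValue (hypothetical).
struct IRValue {
  std::string name;
};
struct DagValue {
  int nodeId;
};

// Same key/value orientation as the declaration quoted above:
// IR value pointer -> DAG value produced for it.
using NodeMap = std::unordered_map<const IRValue*, DagValue>;

// Recover the IR value a DAG node came from by scanning the map.
// Returns nullptr when the node was not recorded.
const IRValue* findSource(const NodeMap& map, int nodeId) {
  for (const auto& entry : map) {
    if (entry.second.nodeId == nodeId)
      return entry.first;
  }
  return nullptr;
}
```

Because the map is keyed by the IR value, going from a DAG node back to its source is a linear scan in this sketch; a second map in the opposite direction would trade memory for faster lookups.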

The Connex back end is 3 years old. We published one academic paper on it at a CGO workshop: https://dl.acm.org/citation.cfm?id=3306166 . However, we are still adding features to the back end.

Small note: the Connex back end is rather small and builds fast (in ~3-5 min, single-threaded on a decent machine). In Apr 2019 its built objects totaled 71,168K, while those of the smallest LLVM back end, MSP430, totaled 63,387K, and those of the biggest ones, X86 and AMDGPU, totaled 359,736K and 488,309K respectively.

An important thing is that I think the test/MC/Connex folder should not be populated for this patch, because the Connex back end can generate only assembly code, which must then be processed by the special Opincaa assembler, and that assembler is not integrated in LLVM. I've seen other back ends do a similar thing, such as the NVPTX back end, which doesn't support object file generation. The Connex back end also doesn't support object file generation.
The eBPF+Connex-S processor has the same ABI as the eBPF processor it extends, except that Connex-S natively supports only 16-bit integers and can access the banked vector memory only by line (so Connex-S can't perform unaligned accesses).

The Connex processor is currently implemented in FPGA, but has also been implemented in silicon:
    - an older version for HDTV: Gheorghe M. Stefan, "The CA1024: A Massively Parallel Processor for Cost-Effective HDTV", 2006 (http://users.dcae.pub.ro/~gstefan/2ndLevel/images/connex_v4.ppt)
    - M. Malita and Gheorghe M. Stefan, "Map-scan Node Accelerator for Big-data"
    - Gheorghe M. Stefan and Mihaela Malita, "Can One-Chip Parallel Computing Be Liberated From Ad Hoc Solutions? A Computation Model Based Approach and Its Implementation"

Diff Detail

Repository
rL LLVM

Event Timeline

alexsusu created this revision. Mar 31 2019, 5:16 PM
alexsusu edited the summary of this revision. Mar 31 2019, 6:17 PM
alexsusu updated this revision to Diff 193199. Apr 1 2019, 4:40 PM

Added 2 more files. I still have to add about 50 more files.

alexsusu updated this revision to Diff 194976. Apr 12 2019, 3:40 PM
alexsusu edited the summary of this revision.

A few corrections to ConnexInstrInfoVec.td .

alexsusu edited the summary of this revision. Apr 12 2019, 7:38 PM
alexsusu edited the summary of this revision.
alexsusu edited the summary of this revision. Apr 12 2019, 7:53 PM
alexsusu edited the summary of this revision. Apr 12 2019, 8:08 PM
alexsusu updated this revision to Diff 195044. Apr 13 2019, 4:46 PM

More refactoring of the .td TableGen files. Also added more source files.

alexsusu updated this revision to Diff 195239. Apr 15 2019, 1:49 PM
alexsusu edited the summary of this revision.

Added all source files for the Connex back end.
Followed coding standards from https://llvm.org/docs/CodingStandards.html .

alexsusu updated this revision to Diff 195252. Apr 15 2019, 3:10 PM
alexsusu edited the summary of this revision.
alexsusu added a reviewer: llvm.org.

Added myself as maintainer of the Connex backend in CODE_OWNERS.TXT.

alexsusu edited the summary of this revision. Apr 15 2019, 5:08 PM
alexsusu added reviewers: jpienaar, asb.
asb added a comment. Apr 18 2019, 5:49 AM

Some initial feedback:

  • This patch is pretty huge which makes it pretty hard to meaningfully review
  • There seem to be effectively no tests. I'd expect test/CodeGen/Connex and test/MC/Connex to be reasonably well populated
  • Plenty of code commented out or with date-based comments that don't match our style, e.g. // 2018_*
  • Files have the old license header rather than the new one
alexsusu updated this revision to Diff 195950. Apr 19 2019, 6:06 PM
alexsusu edited the summary of this revision.

Made a first round of corrections following Alex Bradbury's review.

In D60052#1471560, @asb wrote:

Some initial feedback:

  • This patch is pretty huge which makes it pretty hard to meaningfully review
  • There seem to be effectively no tests. I'd expect test/CodeGen/Connex and test/MC/Connex to be reasonably well populated
  • Plenty of code commented out or with date-based comments that don't match our style, e.g. // 2018_*
  • Files have the old license header rather than the new one

Hi, Alex.

I addressed a few of your concerns and I'm working on the others.

An important thing is that I think test/MC/Connex should not be populated for this patch, because the Connex back end can generate only assembly code, which must then be processed by the special Opincaa assembler, and that assembler is not integrated in LLVM. I've seen other back ends do a similar thing, such as the NVPTX back end, which doesn't support object file generation. The Connex back end also doesn't support object file generation.

Thank you,
  Alex Susu
alexsusu updated this revision to Diff 196163. Apr 22 2019, 6:10 PM

Added all required files for the back end (a few were missing).

ormris removed a subscriber: ormris. Apr 23 2019, 9:36 AM

Just quickly reviewing the target-independent changes.

In general, commented-out code should be cleaned up.

Could you explain the need for crtNodeMapPtr a bit more? In general, IR instructions should be lowered to SelectionDAG nodes in a way that doesn't require referring back to the original Instruction afterwards.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9092–9093

If you have operations with multiple results you need to custom-legalize, your backend should just override LowerOperationWrapper.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
41

Stray include?

arsenm added a subscriber: arsenm. Apr 23 2019, 2:19 PM

There's a lot of clutter that makes this hard to review. All of the extra debug, special ifdef blocks, and commented out code should be removed.

This should also be split into smaller patches. For example, the triple patches, then adding the basic target machine definition, and then MC parts before moving on to codegen and optimizations

include/llvm/ADT/Triple.h
56

The triple patches are usually committed as a first, separate patch

include/llvm/CodeGen/SelectionDAG.h
273–277

This probably should be dropped

1226–1231

This should be a separate patch (or you could just use the ArrayRef version)

lib/Target/Connex/ConnexAsmPrinter.cpp
52 ↗(On Diff #196163)

This should be after all includes

53–58 ↗(On Diff #196163)

Sort includes

60–63 ↗(On Diff #196163)

You should not need this, also no global

164 ↗(On Diff #196163)

isInlineAsm() (also for others)

225–226 ↗(On Diff #196163)

A large number of the debug statements seem like they're just noise for committing

234 ↗(On Diff #196163)

No c string functions

957 ↗(On Diff #196163)

You shouldn't be looking at the .data() on a StringRef as if it were a C string like this; you should just use the StringRef directly.
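
A minimal sketch of the point above, using std::string_view as a stand-in for llvm::StringRef (both are non-owning pointer-plus-length views, so data() is not guaranteed to be null-terminated at the view's end). The helpers firstToken and isMnemonic and the sample text are hypothetical and are not taken from ConnexAsmPrinter.cpp.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: return the text before the first space.
// The result is a view into `line`; no copy and no terminator is added.
std::string_view firstToken(std::string_view line) {
  return line.substr(0, line.find(' '));
}

// Hypothetical helper: compare through the view itself, which respects
// the view's length, instead of passing text.data() to a C string routine.
bool isMnemonic(std::string_view text, std::string_view expected) {
  return text == expected;
}
```

Passing the token's data() pointer to a C string function would keep reading past the token until the next null byte, whereas the length-aware comparison stops at the end of the view.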

lib/Target/Connex/ConnexAsmPrinterLoopNests.h
1–2 ↗(On Diff #196163)

I don't really understand anything that's going on in this file. You shouldn't have file IO, or globals

lib/Target/Connex/ConnexConfig.h
4

These macros mostly seem like a waste of effort

lib/Target/Connex/ConnexHazardRecognizers.cpp
304 ↗(On Diff #196163)

Fewer TODOs

308 ↗(On Diff #196163)

No random special ifdef blocks

lib/Target/Connex/ConnexInstrInfo.h
52 ↗(On Diff #196163)

No macro for this

lib/Target/Connex/ConnexSelectionDAGInfo.cpp
113 ↗(On Diff #196163)

static const

175 ↗(On Diff #196163)

static const

201 ↗(On Diff #196163)

More junk macros

lib/Target/Connex/ConnexTargetMachine.cpp
898 ↗(On Diff #196163)

This needs to be broken down into smaller functions

lib/Target/Connex/ConnexTargetTransformInfo.h
117 ↗(On Diff #196163)

Noisy debug

lib/Target/Connex/RecoverFromLlvmIR.h
1–2 ↗(On Diff #196163)

I don't know what this file is trying to accomplish, but it is a separate patch from the backend

lib/Target/Connex/Select_ADDi32_OpincaaCodeGen.h
9–18 ↗(On Diff #196163)

There shouldn't be any generated code. Generated selection should come from TableGen, with some manual code in *ISelDAGToDAG

lib/Target/Connex/TargetInfo/Makefile
1 ↗(On Diff #196163)

I thought Makefiles were all gone?

alexsusu marked 2 inline comments as done. Apr 25 2019, 4:07 PM
This comment was removed by alexsusu.
luismarques added inline comments.
CMakeLists.txt
324

New targets are generally considered experimental and not added to the default build list. You would generally build them by adding them to the cmake definition LLVM_EXPERIMENTAL_TARGETS_TO_BUILD.

alexsusu updated this revision to Diff 198100. May 3 2019, 3:14 PM
alexsusu marked 26 inline comments as done and an inline comment as not done.
alexsusu edited the summary of this revision.
alexsusu edited reviewers, added: arsenm, efriedma, luismarques; removed: jpienaar.

Addressed most of the review comments from efriedma, arsenm, luismarques (and asb).

alexsusu updated this revision to Diff 198358. May 6 2019, 3:52 PM
alexsusu marked 2 inline comments as done.
alexsusu edited the summary of this revision.

Added relevant tests in test/CodeGen/Connex.

arsenm added a comment. May 8 2019, 5:33 AM

This still needs a lot of work before it will be in a committable state. All of the parts touching generic code need to be reviewed and committed separately at an absolute minimum. The backend also could use some breaking up.

In particular the Value*/SDNode map doesn't in general make sense. What is it used for? It's certainly not the correct solution to whatever problem you're solving with it.

include/llvm/Analysis/ScalarEvolutionExpander.h
264–272 ↗(On Diff #198358)

SCEV changes need to be split to separate patches

include/llvm/CodeGen/SelectionDAG.h
273–277

This still needs to be dropped

lib/CodeGen/LiveRangeCalc.cpp
25–38 ↗(On Diff #198358)

This needs to be removed

lib/CodeGen/RegAllocGreedy.cpp
1223–1224 ↗(On Diff #198358)

Needs to be removed

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
1472–1473

Needs to be removed

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
85

Ditto

lib/Target/Connex/ConnexAsmPrinter.cpp
79–86 ↗(On Diff #198358)

These various places linking to other documentation should be removed. Doxygen generates the appropriate links

104–117 ↗(On Diff #198358)

Remove commented out code

780–782 ↗(On Diff #198358)

More macros to remove

lib/Target/Connex/Select_SUBi32_OpincaaCodeGen.h
1–6 ↗(On Diff #198358)

This file needs to be dropped. There should be no committed, generated code. You should have TableGen emit this, or manually write the code in ISelDAGToDAG

arsenm resigned from this revision. Feb 13 2020, 2:54 PM

Feel free to re-add me on a new revision

luismarques resigned from this revision. Feb 1 2021, 7:17 AM