Connex is an established, almost 30-year-old, very wide research vector processor (see, for example, http://users.dcae.pub.ro/~gstefan/2ndLevel/connex.html) with between 32 and 4096 lanes; the lane count is easily changed at synthesis time. A notable feature is that the Connex processor has a local banked vector memory (each lane has its own local memory), which achieves 1-cycle latency for direct and indirect loads and stores, so the memory bandwidth is very high.

The Connex vector processor has 16-bit signed integer Execution Units in each lane. It efficiently emulates (by inlining the emulation subroutines in the instruction selection pass) 32-bit int and IEEE 754-2008 compliant 16-bit floating point (Clang type _Float16, C for ARM __fp16, LLVM IR half type). The emulation subroutines are in the lib/Target/Connex/Select_*_OpincaaCodeGen.h files, which are included in the ConnexISelDAGToDAG.cpp module, in the ConnexDAGToDAGISel::Select() method. These emulation subroutines can easily be adjusted, for example to increase performance by sacrificing f16 accuracy; drop me an email to ask how to do it. (They currently total almost 1 MB of C++ code.) The Connex vector processor does not currently support the float, double, or 64-bit integer types.

More precisely, the back end targets the Connex processor used as an accelerator, a low-power variant of the Connex processor. The working compiler is described at https://sites.google.com/site/alexsusu/myfilecabinet/OpincaaLLVM_TR_UPB.pdf . Note that our back end currently targets only our Connex Opincaa assembler (very easy to learn and use), available at https://gitlab.dcae.pub.ro/research/ConnexRelated/opincaa/ . The Connex Opincaa assembler allows running code that is agnostic of both the Connex vector length and the host (CPU). The ISA of the Connex vector processor is available at https://gitlab.dcae.pub.ro/research/ConnexRelated/opincaa/blob/master/ConnexISA.pdf .
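To make the idea of emulating 32-bit integers on 16-bit execution units concrete, here is a minimal scalar sketch of a 32-bit addition built from 16-bit additions and an explicit carry. It is not taken from the Select_*_OpincaaCodeGen.h files; the names U32Halves, split32, join32 and add32 are hypothetical, and unsigned 16-bit halves are used here (an assumption) so that wraparound is well defined in C++.

```cpp
#include <cassert>
#include <cstdint>

// A 32-bit value stored as two 16-bit halves, roughly as a lane with only
// 16-bit execution units would have to hold it (hypothetical layout).
struct U32Halves {
  uint16_t lo;
  uint16_t hi;
};

U32Halves split32(uint32_t v) {
  return {static_cast<uint16_t>(v & 0xFFFFu), static_cast<uint16_t>(v >> 16)};
}

uint32_t join32(U32Halves h) {
  return (static_cast<uint32_t>(h.hi) << 16) | h.lo;
}

// 32-bit addition using only 16-bit additions plus an explicit carry.
U32Halves add32(U32Halves a, U32Halves b) {
  uint16_t lo = static_cast<uint16_t>(a.lo + b.lo);
  // The low half wrapped around exactly when the result is smaller than
  // one of the operands.
  uint16_t carry = lo < a.lo ? 1 : 0;
  uint16_t hi = static_cast<uint16_t>(a.hi + b.hi + carry);
  return {lo, hi};
}

// Small usage check: the carry out of the low half reaches the high half.
void checkAdd32Examples() {
  assert(join32(add32(split32(0x0001FFFFu), split32(1u))) == 0x00020000u);
  assert(join32(add32(split32(0xFFFFFFFFu), split32(1u))) == 0u);
}
```

On a real vector target each of these steps would be a per-lane vector operation, which is why inlining such sequences during instruction selection can produce a fairly large amount of code.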
The Connex vector processor also has an open source C++ simulator, available at the same location: https://gitlab.dcae.pub.ro/research/ConnexRelated/opincaa/ . The mailing list for the Connex processor and tools is: https://groups.google.com/forum/#!forum/connex-tools .

An interesting feature is that, in order to support recovering from the instruction selection pass's SelectionDAG back to the original source (C) code, we require adding a simple data structure in include/llvm/CodeGen/SelectionDAG.h (and helper methods in related files) that associates an SDValue with the LLVM IR Value object it was translated from: DenseMap<const Value*, SDValue> *crtNodeMapPtr

The Connex back end is 3 years old. We published 1 academic paper on it at a CGO workshop: https://dl.acm.org/citation.cfm?id=3306166 . However, we are still adding features to the back end.

Small note: the Connex back end is rather small and builds fast (in ~3-5 mins, single-threaded on a decent machine; in Apr 2019 the built objects total 71,168K, while the smallest LLVM back end, MSP430, has 63,387K and the biggest ones are X86 with 359,736K and AMDGPU with 488,309K).

An important thing is that I think the test/MC/Connex folder should not be populated for this patch, because the Connex back end can generate only assembly code that must be consumed by the special Opincaa assembler, which is not integrated in LLVM. I've seen other back ends do a similar thing, such as the NVPTX back end, which doesn't support object file generation. The Connex back end also doesn't support object file generation.

The eBPF+ConnexS processor has the same ABI as the eBPF processor it extends, except that Connex-S natively supports only 16-bit integers and can access the banked vector memory only by line (so Connex-S can't perform unaligned accesses).

The Connex processor is currently implemented in FPGA, but it has also been implemented in silicon:
- an older version for HDTV: Gheorghe M. Stefan, "The CA1024: A Massively Parallel Processor for Cost-Effective HDTV", 2006 (http://users.dcae.pub.ro/~gstefan/2ndLevel/images/connex_v4.ppt)
- M. Malita and Gheorghe M. Stefan, "Map-scan Node Accelerator for Big-data"
- Gheorghe M. Stefan and Mihaela Malita, "Can One-Chip Parallel Computing Be Liberated From Ad Hoc Solutions? A Computation Model Based Approach and Its Implementation"
Some initial feedback:
- This patch is pretty huge which makes it pretty hard to meaningfully review
- There seem to be effectively no tests. I'd expect test/CodeGen/Connex and test/MC/Connex to be reasonably well populated
- Plenty of code commented out or with date-based comments that don't match our style, e.g. // 2018_*
- Files have the old license header rather than the new one
I addressed a few of your concerns and I'm working on the others. An important thing is that I think test/MC/Connex should not be populated for this patch, because the Connex back end can generate only assembly code that must be consumed by the special Opincaa assembler, which is not integrated in LLVM. I've seen other back ends do a similar thing, such as the NVPTX back end, which doesn't support object file generation. The Connex back end also doesn't support object file generation.
Thank you,
Alex Susu
Just quickly reviewing the target-independent changes.
In general, commented-out code should be cleaned up.
Could you explain the need for crtNodeMapPtr a bit more? In general, IR instructions should be lowered to SelectionDAG nodes in a way that doesn't require referring back to the original Instruction afterwards.
If you have operations with multiple results you need to custom-legalize, your backend should just override LowerOperationWrapper.
There's a lot of clutter that makes this hard to review. All of the extra debug, special ifdef blocks, and commented out code should be removed.
This should also be split into smaller patches. For example, the triple patches, then adding the basic target machine definition, and then MC parts before moving on to codegen and optimizations
The triple patches are usually committed as a first, separate patch
This probably should be dropped
This should be a separate patch (or you could just use the ArrayRef version)
This should be after all includes
You should not need this, also no global
isInlineAsm() (also for others)
A large number of the debug statements seem like they're just noise for committing
No c string functions
You shouldn't be looking at the .data() of a StringRef as if it were a C string like this; you should just use the StringRef directly.
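A minimal sketch of why this matters, using std::string_view (C++17) as a stand-in for llvm::StringRef so that it compiles without LLVM; that substitution, the function names, and the "ADD R1, R2" input are all assumptions made for illustration. Both types are a pointer plus a length, so the buffer behind .data() is not guaranteed to end where the view ends.

```cpp
#include <cassert>
#include <cstring>
#include <string_view>

// Take the text before the first space, e.g. "ADD" from "ADD R1, R2".
// The result is a view into the same buffer and is not NUL-terminated
// at its own end.
std::string_view firstToken(std::string_view line) {
  return line.substr(0, line.find(' '));
}

// Preferred: compare through the view, which respects its length.
bool isOpcode(std::string_view name, std::string_view expected) {
  return name == expected;
}

// Fragile: strcmp ignores the view's length and keeps reading the
// underlying buffer until it finds a NUL character.
bool isOpcodeCStyle(std::string_view name, const char *expected) {
  return std::strcmp(name.data(), expected) == 0;
}

// Small usage check on a hypothetical instruction line.
void checkFirstToken() {
  std::string_view tok = firstToken("ADD R1, R2");
  assert(isOpcode(tok, "ADD"));        // compares exactly 3 characters
  assert(!isOpcodeCStyle(tok, "ADD")); // sees "ADD R1, R2" and fails
}
```

Here the C-style comparison merely gives the wrong answer because the underlying literal happens to be NUL-terminated; with a view into a buffer that has no terminator it would read out of bounds.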
I don't really understand anything that's going on in this file. You shouldn't have file IO, or globals
These macros mostly seem like a waste of effort
No random special ifdef blocks
No macro for this
More junk macros
This needs to be broken down into smaller functions
I don't know what this file is trying to accomplish, but it is a separate patch from the backend
There shouldn't be any generated code. Generated selection should come from table gen, with some manual code in *ISelDAGToDAG
I thought Makefiles were all gone?
New targets are generally considered experimental and not added to the default build list. You would generally build them by adding them to the cmake definition LLVM_EXPERIMENTAL_TARGETS_TO_BUILD.
This still needs a lot of work before it will be in a committable state. All of the parts touching generic code need to be reviewed and committed separately at an absolute minimum. The backend also could use some breaking up.
In particular the Value*/SDNode map doesn't in general make sense. What is it used for? It's certainly not the correct solution to whatever problem you're solving with it.
SCEV changes need to be split to separate patches
This still needs to be dropped
This needs to be removed
Needs to be removed
Needs to be removed
These various places linking to other documentation should be removed. Doxygen generates the appropriate links
Remove commented out code
More macros to remove
This file needs to be dropped. There should be no committed, generated code. You should have table gen emit this, or manually write the code in ISelDAGToDAG