Page MenuHomePhabricator

AAP Backend

Authored by edward-jones on Aug 20 2015, 4:39 AM.



This patch introduces the AAP backend to LLVM.

AAP is a Harvard architecture, with features representative of a large range of deeply embedded microprocessors. It aims to aid the development and maintenance of other out-of-tree deeply embedded systems.

Diff Detail

Event Timeline

edward-jones retitled this revision from to AAP Backend.
edward-jones updated this object.
edward-jones added a subscriber: llvm-commits.
jfb added a subscriber: jfb.Aug 21 2015, 1:50 PM

High-level comments:

We maintain an out-of-tree target for a DSP - using a Harvard architecture with multiple address spaces - which makes this target interesting.

However, our target has other special features as well. For example, 16-bit bytes. Do you have a feeling for how hard it will be to modify and use your target for testing the LLVM framework for such support?

I havn't looked at your code, just tried to use it. Comments:

  • The code don't build with clang-3.7 without warnings (which I think it should, partly according to LLVM coding guidelines).
  • I tested the backend, using llvm-stress, and found problems quickly. I will send a separate email.
  • Where do I find tools to simulate the target? This will make it possible to execute more tests.

At the moment it would be relatively easy to change the architecture, and we're looking to add any features which it may be useful to have in an in-tree embedded system. I don't believe it would be too difficult to have 16-bit bytes, and will discuss with the others.

I hope to have an updated for this patch shortly, which should have fixed all of the warnings which clang emits, as well as made some minor updates to the architecture. I've also added some more basic tests, and run the back-end through clang format to tidy things up. We have run llvm-stress, but have not yet fixed issues that cropped up.

For anyone who wishes to try and build the architecture, the other components of the toolchain are available on the Embecosm github page.

This updates the architecture to the latest revision, and adds bug fixes and new tests. The backend has also been tidied up, and run through clang-format.

This updates the LLVM support of AAP to the latest top of tree. Added in this patch are various fixes to stabilize the AAP backend, as well as a few more tests.

Hi Edward,

Sorry I didn't reply sooner. I really should have taken a more active role here.

I've got a bunch of comments below (mostly fairly trivial, but with a few questions in there too, and quite possibly some where I should be told to get lost).

But on larger issues, what are your plans for maintaining this backend? Do you have someone lined up to help out with the work in keeping it going, advising on breakages, reviewing patches (sorry, we're not really in a position to be throwing stones here) etc?

Also, do you plan to setup some kind of runtime build bot, or at have someone running regular tests on the code generated?

It didn't quite fit into any inline area on the diff, but we could probably do with some disassembly tests too.




I take it this is part of the workaround for function pointers being larger than others, but I didn't quite see how all the pieces fit together in LLVM, and didn't see any tests in this area.

This feels like a hack though, could you try to explain a bit more why it's needed?


Range-based adaptors?


You should probably be using isInt<N> from MathExtras.h for the signed one. And I'd actually suggest adding something like "bool isIntUInt<N>(int64_t)" too. We really shouldn't be duplicating these helpers all over LLVM.


Apart from this one you seem to be using isImm for unsigned immediates and isOff for signed ones.

I'd sort of suggest isUImm and isSImm or something more explicit, but can certainly live with Imm vs Off if it's consistently applied in the backend.


This also seems misnamed, even if nothing else changes.


This comment doesn't match the alternative output ("#whatever"). Also, the code above suggests it should be "$r1".


R7 differs between the two.


I think it's worth commenting on the decisions here: R1 = SP, and trying to have about 66% callee-saved with a possibly varying number of registers.


These are the default implementations. They can probably be skipped for now.


These should probably be with the others.


I think this should be an assertion. It makes it clearer that Offset is always set, which is a requirement for returning true.


Thisn't the second check always true here? As far as I can see if SelectAddr succeeds then it always creates a TargetConstant.


I'm surprised to see so many types in this list (including below) when only MVT::i16 (and possibly MVT::Other) is actually legal.

The type legaliser mostly just goes about its business without checking these, so you can almost certainly drop most of the i1 & i8 entries.

I'd also be very surprised if more than one of the MVT::i16 and MVT::Other cases actually applied in most cases. For example a SELECT should never have an MVT::Other input or result.

It's more complicated for other functions (e.g. setLoadExtAction), so you should expect more of those to be needed.


An array is usually simpler here. You don't actually need any SmallVector properties, just that the type is convertible to an ArrayRef<SDValue>. So you more often see code like:

SDValue Ops[] = { Chain, DAG.getConstant(TargetCC, dl, MVT::i16), ... };

Here and throughout, the LLVM style is currently that variables start with capital letters.


Isn't this duplicating getBranchOpForCondition? If not, a comment about why it's special would be useful.


More default implementations here. They can probably just be removed until you actually want to do something interesting.


More unity in naming would be good here too.


Why does this need to have a custom inserter? It would seem better to get patterns matching each CC into the actual instructions.

Generally, custom inserters are a last resort for instructions that need to mangle the CFG (like your select implementation).


I don't think I understand these. By convention you're using R0 as the LR, but that's already covered by $rB. And the instruction itself doesn't do anything to R1 (=SP) does it?


Commented code. Should probably be removed.


We've got register sequences for this kind of thing now:

(add (sequence "R%u", 0, 63))

Trivial point, but I think this will give an unused variable warning outside Debug mode.

Slightly less trivially, does that check only apply to encodeImm6? There's quite a bit of duplication here; many of the encode* functions look like they could defer to an encodeImmN, passing the fixup needed.


We normally try to check that the roundtrip has succeeded in this kind of MC test, as well as the actual bitwise encoding produced (via -show-encoding). Surprisingly many things can go wrong there (swapping registers around, decoding random bits of an instruction as an operand, ...).

Checking the actual instruction name too isn't absolutely awful, but does feel like a bit of an implementation detail.

Hi Tim,

Very sorry, the update notification had become lost in my email. Thank you very much for your comments, I general agree with all of your points so I'll make the appropriate changes and update the patch. I'll definitely look at adding more tests too, since that appears to be an area where we are lacking.

In terms of maintaining the backend; it is a long term project for us at Embecosm. At the moment it is me and Simon Cook who are doing work actively on AAP.

Initially we are looking at doing nightly regression runs of the GCC regression tests, however we want to set up a public buildbot too.

Thank you,


This is unrelated to long function pointers. In DWARF, AAP has an address length of 4 bytes, as the upper bits are used for flag to identify debug, code and data. This change is needed so that we can emit addresses of the correct length.


The method names correspond to the immediates used in the InstrInfo.d file, hence the use of 'Off' instead of 'SImm'. It definitely makes sense to rename them though. isImm16 was being (ab)used so that we would match both 0xffff when zext and -1 when sext in assembly, and I need to address that in a more elegant way.


Yes, I definitely need to clear these up.


I'll have a look at this. I have a feeling there is a reason that we use a custom inserted, and if that's the case at the very least I need to document the reason.

This adds fixes suggested by Tim and rolls the backend forward to the top-of-tree

beanz added a subscriber: beanz.Mar 29 2016, 12:50 PM

I'm not qualified to comment on the backend itself, but a few notes on the build system changes.

First and foremost, please don't add Makefiles to LLVM. I spent a lot of time getting rid of them, and we no longer support autoconf.

I additionally have one small CMake comment inline.


210 ↗(On Diff #51931)

New backends should't be added to the default backend list. They should be added as experimental only.

Experimental backends are not specified in the CMake explicitly, you enable them by specifying them on the CMake command line with the variable LLVM_EXPERIMENTAL_TARGETS_TO_BUILD.

This strips out the Makefiles, and removes AAP from the targets to build.

edward-jones abandoned this revision.Aug 17 2016, 1:24 AM

We are closing this patch in favor of a soon to be submitted updated patch.