This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add Tiny Code Model for AArch64
ClosedPublic

Authored by dmgreen on Jul 23 2018, 8:36 AM.

Details

Summary

This adds the plumbing for the Tiny code model for the AArch64 backend. This, instead of loading addresses through the normal ADRP;ADD pair used in the Small model, uses a single ADR. The 21 bit range of an ADR means that the code and its statically defined symbols need to be within 1MB of each other.

This makes it mostly interesting for embedded applications where we want to fit as much as we can in as small a space as possible.

Diff Detail

Repository
rL LLVM

Event Timeline

dmgreen created this revision.Jul 23 2018, 8:36 AM

This seems mostly straightforward.

Is tiny supported with PIC? Please add tests.

Do other in-tree targets correctly reject -code-model=tiny?

lib/Target/AArch64/AArch64Subtarget.h
315 ↗(On Diff #156782)

Please don't use an default parameter like this; just use two separate helpers instead.

I think that it is important to support position independent code in the tiny code model, there was a thread on llvm-dev in May where someone couldn't 4k align his position independent code on a bare-metal AArch64 platform (http://lists.llvm.org/pipermail/llvm-dev/2018-May/123465.html) A tiny code model that avoids ADRP could be a solution there. A PR was raised to implement it https://bugs.llvm.org/show_bug.cgi?id=37543 . If we can get position independent support added then it will be worth closing that PR. I guess it could be added in a follow up patch though.

Some other observations:

  • It will be worth adding assembler support for adr with respect to a relocated literal, I don't think it does right now. Certainly worth adding a test.
  • I seem to remember in the thread that supporting tiny in machO might be difficult due to lack of available relocations. May be worth making sure that selecting tiny code model on MachO is an error and then you can avoid testing for non-MachO in multiple places.
  • TLS could also be altered in the tiny code model. AFAIK GNU ld supports relaxation of at least some of the tiny code model TLS implementations. The GCC version I had only seemed to alter its initial-exec implementation though. Again I think TLS support could be added in a follow up patch.
lib/Target/AArch64/AArch64ISelLowering.cpp
3942 ↗(On Diff #156782)

Typo: GOT relocations instead of git?

Thanks guys

This seems mostly straightforward.

Is tiny supported with PIC? Please add tests.

Do other in-tree targets correctly reject -code-model=tiny?

Will do

Some other observations:

Cheers for the comments (and the code). I forgot to mention the things that this doesn't yet handle. TLS is the main one. This will currently assert for TLS for ELF, in the same way as for the Large codemodel, and if that is removed/ignored will create code as with the Small implementation. I plan to look into this, but as you say that might be better in a followup.

I have some code for fastisel but removed it as I wasn't sure it was worth-while. That will fall back to ISel at the moment.

I think I mentioned somewhere in the code that as weak references need to go through a GOT (so they can be 0), and from what I can tell there are no GOT relocations on ADR's, the code here currently produces an ADRP;ADD pair with the relocations. GCC produces a constant pool access or LDR's in PIC, so this may change.

I'll see about adding PIC, filling in the gaps and run some extra testing.

dmgreen updated this revision to Diff 157959.Jul 30 2018, 8:19 AM
dmgreen added reviewers: peter.smith, efriedma.

Now with more code. I will go through later and try to pull sensible parts of this out into separate patches. The llvm-test-suite all either pass or fail to link both with and without -fPIC (although more fail to link without PIC that I would have expected).

TLS still uses the same implementation as Small at the moment. This, as far as I can tell, is the same thing that gcc does.

Thanks to Peter for some pointers.

nhaehnle removed a subscriber: nhaehnle.Jul 30 2018, 8:21 AM

One example where gcc (7.1 in my case) will generate different code for tls is:

__thread int foo __attribute__((tls_model("initial-exec")));

int func(void) {
    return foo;
}

aarch64-linux-gnu-gcc -S tls.c -O1 -mcmodel=small -o -

func:
	adrp	x0, :gottprel:.LANCHOR0
	ldr	x0, [x0, #:gottprel_lo12:.LANCHOR0]
	mrs	x1, tpidr_el0
	ldr	w0, [x1, x0]
	ret
...
	.section	.tbss,"awT",@nobits
	.align	2
	.set	.LANCHOR0,. + 0
	.type	foo, %object
	.size	foo, 4
foo:
	.zero	4

whereas with -mcmodel=tiny

func:
	mrs	x1, tpidr_el0
	ldr	x0, :gottprel:.LANCHOR0
	add	x0, x0, x1
	ldr	w0, [x0]
	ret

I've not seen any differences in the other TLS models.

I think it is probably fine to keep TLS access using the small code model, at least initially. I wouldn't expect too many embedded systems to implement TLS and if they did they could probably use the local-exec model at compile time or have the linker relax to local-exec. Haven't had a chance to look at the rest of the patch yet, will hope to do so shortly.

dmgreen updated this revision to Diff 158529.Aug 1 2018, 6:52 AM

Thanks Peter.

I've split out a couple of obvious parts into separate commits. Let me know if anything else looks like it deserves the same treatment.

Generally looking good to me. Apologies for the delay.

As a general nit, the patch has been a bit inconsistent about preserving/removing curly braces after if/else statements with only a single statement in the body. It may be better to either preserve all as they are or change all of them; not a critical comment by any means.

It would be good to add a test in MC for a relocated ADR instruction with the :got: modifier. It would also be good to add a test to check what happens when someone attempts to use TLS with the tiny code model.

lib/Target/AArch64/AArch64ISelLowering.cpp
4073 ↗(On Diff #158529)

I think that this should be a user-visible error message as it is not obvious to a user why TLS might not be supported.

Another possibility is to just warn and use the small code model for TLS. This would cause problems for people that must avoid ADRP due to position independent requirements on 4k page alignment, but I think that could be a small minority of people.

4795 ↗(On Diff #158529)

You may be able to simplify this as getTargetMachine().getCodeModel() == CodeModel::Tiny implies ELF as getEffectiveCodeModel so you shouldn't need to test if the Subtarget->isTargetMachO(). Probably not a lot in it though.

dmgreen updated this revision to Diff 160658.Aug 14 2018, 12:03 PM
dmgreen marked 2 inline comments as done.

Generally looking good to me. Apologies for the delay.

Thanks for the review

As a general nit, the patch has been a bit inconsistent about preserving/removing curly braces after if/else statements with only a single statement in the body. It may be better to either preserve all as they are or change all of them; not a critical comment by any means.

If in doubt, follow what was already there. I'll try to keep things as they were.

It would be good to add a test in MC for a relocated ADR instruction with the :got: modifier.

Added some test for adr/adrp errors. There are others in MC to cover testing adr with a label. I changed the tryParseAdrLabel a little to make the error's better.

It would also be good to add a test to check what happens when someone attempts to use TLS with the tiny code model.

Hmm.. So I was expecting this would keep working the same as small model. I believe that would be less efficient but functionally correct (providing you are not un-aligning page boundaries). In adding some tests I can see that there are some code differences though (because it goes through LOADgot). I think the example in arm64-tls-execs is correct, and the other tests here produce the same code (as does your example from above). I've not tested this very well though, and I'm not sure what else would be required. As far as I can tell it should be OK, but producing an error for tiny+tls, same as large+tls, is an option.

lib/Target/AArch64/AArch64ISelLowering.cpp
4073 ↗(On Diff #158529)

I agree the behaviour here for Large isn't so great. It will assert, or just use the small code here in release builds without asserts. I've changed that to a fatal error.

I'm not sure how to add a warning for tiny+tls.

Hmm.. So I was expecting this would keep working the same as small model. I believe that would be less efficient but functionally correct (providing you are not un-aligning page boundaries). In adding some tests I can see that there are some code differences though (because it goes through LOADgot). I think the example in arm64-tls-execs is correct, and the other tests here produce the same code (as does your example from above). I've not tested this very well though, and I'm not sure what else would be required. As far as I can tell it should be OK, but producing an error for tiny+tls, same as large+tls, is an option.

I've been testing out some examples with tls + tiny [+ pic]. As far as I can tell from the code I've ran, everything is working as expected.

From what I can see, it should just be the InitialExec model that does a LOADgot which should match the GCC behaviour with respect to TLS; so I'm happy with that for this patch. I think we can add better support for tiny code model TLS later if there is a need for it. I don't have any more comments myself, if there are no other comments from the other reviewers by the end of the week I'll LGTM.

lib/Target/AArch64/AArch64ISelLowering.cpp
4073 ↗(On Diff #158529)

Ok. Probably worth just adding a comment/FIXME that the tiny code model uses the small code model TLS sequences and that some optimization is possible.

peter.smith accepted this revision.Aug 21 2018, 6:38 AM

Looks good to me. I've not seen any other comments and I've no further comments.

This revision is now accepted and ready to land.Aug 21 2018, 6:38 AM
dmgreen updated this revision to Diff 161754.Aug 21 2018, 10:25 AM
dmgreen edited the summary of this revision. (Show Details)

Thanks Peter. This change just adds a FIXME for tiny+tls.

Mind taking a look at the hopefully trivial clang part of this in D49674? I can then commit the two parts together.

This revision was automatically updated to reflect the committed changes.