This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Implement execute-only support in CodeGen
ClosedPublic

Authored by prakhar on Dec 6 2016, 2:59 AM.

Details

Summary

[ARM] Implement execute-only support in CodeGen

This implements execute-only support for ARM code generation, which
prevents the compiler from generating data accesses to code sections.
The following changes are involved:

  • Add the SubtargetFeature "execute-only" to the ARM code generator.
  • Add the clang flag "-mexecute-only" as well as the GCC-compatible alias "-mpure-code" to enable this feature.
  • When enabled, literal pools are replaced with MOVW/MOVT instructions, with VMOV used in addition for floating-point literals. As the MOVT instruction is required, execute-only support is only available in Thumb mode for targets supporting ARMv8-M baseline or Thumb2.
  • Jump tables are placed in data sections when in execute-only mode.
  • The execute-only text section is assigned section ID 0, and is marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'. This also overrides selection of ELF sections for globals.

Event Timeline

prakhar updated this revision to Diff 80398.Dec 6 2016, 2:59 AM
prakhar retitled this revision from to [ARM] Implement execute-only support in CodeGen.
prakhar updated this object.
prakhar added reviewers: t.p.northover, rengolin.
prakhar added a subscriber: llvm-commits.

Hi Prakhar,

Thanks for working on this, it's a nice functionality. I have a bunch of inline comments, but some generic ones, too.

Hence, execute-only support is only available for targets that support MOVW/MOVT (ARMv7 and above).

Actually, MOVT was introduced in ARMv6T2, but execute-only seems to be an M thing, mostly ARMv7M and ARMv8M. Maybe advertise it that way instead of MOVT-dependent.

I wonder how compatible this is with the GCC implementation. Is this how GCC lays out the ELF object? Can we link GCC objects with LLVM ones and expect perfect cooperation (ex. globals & symbols)?

I'm also adding more people to review this with experience in ELF and constant pools.

cheers,
--renato

include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
36

Initialising with 0 is a bit obvious, but 1 is not. Can you add a comment, please?

include/llvm/MC/SectionKind.h
32

No TABs, please. Use clang-format on your patch to make sure it adheres to the coding standard.

119

To me it seems ExecuteOnly is orthogonal to isText. The problem is that we're using ReadOnly on the same solution space.

I think we should keep the accessors as simple as possible and use isText() || isExecuteOnly() in code.

lib/CodeGen/TargetLoweringObjectFileImpl.cpp
318

Does this mean there is only one pure section on each module?

lib/Target/ARM/ARMInstrThumb2.td
3386

I'm worried about this hijack. Does that mean it won't be used by Thumb2 (v6T2+) any more?

lib/Target/ARM/ARMSubtarget.cpp
179

I'm worried about the silent behaviour here.

If movt is disabled by some other part or user flag, than this should be an error, not a silent change.

181

This assert is wrong. It should be something like:

assert(hasV8MBaselineOps() && "Execute-only supported by ARMv8M only");

And it should only be an assert if there are no combination of user flags that can get you here.

If there are, this should be an error.

lib/Target/ARM/ARMTargetObjectFile.cpp
49

Where is the TextSection initialised if it's not pure? Shouldn't this move to the same place?

test/CodeGen/ARM/constantfp.ll
146

There are plenty of other tests in this file that need to make sure no loads from labels occurred. Please, add the relevant checks.

test/CodeGen/ARM/execute-only-big-stack-frame.ll
11

Whenever adding a NOT pattern, you have to be as generic as possible. It's possible that a label load with a different pattern shows up and you won't pick it up.

Just use:

CHECK-SUBW-ADDW-NOT: ldr
peter.smith edited edge metadata.Dec 6 2016, 5:42 AM

I've not looked over the details of the code. I can add some comments on compatibility and to a certain extent naming.

The ARM Compiler 6 documentation for execute-only http://infocenter.arm.com/help/topic/com.arm.doc.dui0774g/vvi1452508645836.html says Link Time Optimization does not honor the armclang -mexecute-only option. If you use the armclang -flto or -Omax options, then the compiler cannot generate execute-only code. Does this still apply, and if it does what should the action be, error message, user beware?

Personally I think it would be better to use the GCC pure-code naming conventions for the test cases and comments. The execute-only name has been favoured by the commercial compilers such as ARM Compiler 5 and ARM Compiler 6. However I think that the majority of the people looking through the code base will be familiar with the GCC pure-code name. Standardising on pure-code would make it easier to equate the support in GCC and llvm and enhance expectations of compatibility. I think that this only means changing comments to refer to pure-code rather than execute-only and changing the file names of the tests to use pure-code rather than execute-only. Or perhaps a comment that equates execute-only with pure-code. I think it is valuable to have both -mexecute-only and -mpure-code as a migration path for both commercial and GCC users.

I've not good a hugely strong opinion about this, I think execute-only is a far better name but I think familiarity with GCC might help more.

To support purecode a toolchain needs linker support. In effect if a section has the SHF_ARM_PURECODE flag it is making the claim that the contents of the section must not be read at run-time. If a program segment contains only SHF_ARM_PURECODE sections then the linker removes the PF_R flag from the ELF file. For compatibility it doesn't matter if the sections come from llvm or gcc, as long as they set the flag they will be treated the same by a linker. There is support in the Version 5-2016-q3-update of the GNU ARM Embedded Toolchain

References:

There is a small whitepaper https://community.arm.com/docs/DOC-11666 on Pure Code in GCC

prakhar marked 7 inline comments as done.Dec 8 2016, 4:17 AM

Actually, MOVT was introduced in ARMv6T2, but execute-only seems to be an M thing, mostly ARMv7M and ARMv8M. Maybe advertise it that way instead of MOVT-dependent.

Execute-only code can be technically generated for any target that supports the MOVT instruction (so Thumb2 and v8-M Baseline targets, which includes v6T2), but is intended for M-class targets that have memory controllers supporting execute-only regions. I'll update the commit message with this.

I wonder how compatible this is with the GCC implementation. Is this how GCC lays out the ELF object? Can we link GCC objects with LLVM ones and expect perfect cooperation (ex. globals & symbols)?

At the object level, there should be no issues with GCC compatibility and the objects should link without issue. For assembly code, LLVM does not yet support GAS's syntax of allowing numeric flags to be used, so I've created patch D27451 for this.

The ARM Compiler 6 documentation for execute-only http://infocenter.arm.com/help/topic/com.arm.doc.dui0774g/vvi1452508645836.html says Link Time Optimization does not honor the armclang -mexecute-only option. If you use the armclang -flto or -Omax options, then the compiler cannot generate execute-only code. Does this still apply, and if it does what should the action be, error message, user beware?

armclang does not appear to generate any warning or error if -flto is specified alongside -mexecute-only.

Personally I think it would be better to use the GCC pure-code naming conventions for the test cases and comments. The execute-only name has been favoured by the commercial compilers such as ARM Compiler 5 and ARM Compiler 6. However I think that the majority of the people looking through the code base will be familiar with the GCC pure-code name. Standardising on pure-code would make it easier to equate the support in GCC and llvm and enhance expectations of compatibility. I think that this only means changing comments to refer to pure-code rather than execute-only and changing the file names of the tests to use pure-code rather than execute-only. Or perhaps a comment that equates execute-only with pure-code. I think it is valuable to have both -mexecute-only and -mpure-code as a migration path for both commercial and GCC users.

Execute-only has already been used in AC5 and AC6 as well as in IAR's compiler, and is a more intuitive name whose purpose is immediately understood. The -mpure-code flag is maintained as an alias for -mexecute-only in clang for compatibility in GCC, and the section header flag is referred to as SHF_ARM_PURECODE as is the case with GCC. The fact that the feature is named execute-only within LLVM is purely an implementation detail, and shouldn't be an issue barring any significant opposition. I can add a source code comment in to explain this if necessary.

include/llvm/MC/SectionKind.h
32

This isn't a tab character. Unless you meant that ExecuteOnly should be placed at the same indent level as Text?

119

As implemented here, an execute-only section is essentially a specialised text section for all purposes, with the added restriction that reads should not be made from it. I'm not sure the two are orthogonal.

lib/CodeGen/TargetLoweringObjectFileImpl.cpp
318

Yes, that's the intention.

lib/Target/ARM/ARMInstrThumb2.td
3386

Thumb2 is a superset of V8MBaselineOps, so this won't be an issue.

lib/Target/ARM/ARMSubtarget.cpp
179

Yes, this should be an error. I've changed this to report an error if NoMovt is enabled or the target doesn't support MOVT. I'll also add a diagnostic to clang for this.

lib/Target/ARM/ARMTargetObjectFile.cpp
49

We cannot change the flags of an existing section, so we create a new text section with the flags set in addition to the default one.

rengolin added inline comments.Dec 8 2016, 4:49 AM
include/llvm/MC/SectionKind.h
32

yup

lib/Target/ARM/ARMTargetObjectFile.cpp
49

Oh, I see. I think this deserves a nice comment. :)

prakhar updated this revision to Diff 80748.Dec 8 2016, 5:36 AM
prakhar marked 3 inline comments as done.
prakhar edited edge metadata.

Additional tests, error reporting, and explanatory comments

Right, I'm happy with the changes from the code side, but there were some concerns on IRC if we should support this at all.

We seem to have no ELF code owner, but Jim is the MC code owner, so I'm adding him here to review the generic parts.

I forgot who had concerns on IRC and my history doesn't go that far, so I'll leave for people that really care about this to make their voices heard on this review.

cheers,
--renato

grosbach edited edge metadata.Dec 8 2016, 10:57 AM

Right, I'm happy with the changes from the code side, but there were some concerns on IRC if we should support this at all.

Seems a reasonable thing to support in general. Were the concerns about supporting execute-only conceptually or about the design of this approach?

include/llvm/MC/SectionKind.h
119

Yeesh. This file is a bit of a mess. I'd much rather see ExecuteOnly be an attribute of Text rather than a completely distinct Kind. That's not the way the rest of the stuff is represented though, leading to the comment indentation mess Renato commented on.

Bottom line, this is fine as-is here, I'm just grumbling about the overall file. :)

lib/Target/ARM/ARM.td
267 ↗(On Diff #80748)

Why use a Feature flag for this? This isn't an attribute of the CPU, but rather a code generation option, right?

Would a function attribute be sufficient? At first look, it seems all the decisions to be made are about how to handle any given function's constant data. That would have the added benefit of making LTO "just work" since function attributes propagate naturally.

lib/Target/ARM/ARMSubtarget.cpp
179

The error should be detected earlier and a proper diagnostic issued if at all possible. Using report_fatal_error() for user-facing diagnostics is an option of last resort and an indicator of a layering failure elsewhere.

Seems a reasonable thing to support in general. Were the concerns about supporting execute-only conceptually or about the design of this approach?

About the concept of unreadable code areas in general, and what benefits this really brings to security and performance of deeply embedded devices.

I honestly think this is a feature that ARM hardware needs support and I'm not a hardware expert to decide what features people need on such an individualised market. With the change itself being reasonably non-intrusive, I can't see why we'd block such a feature on those grounds only.

cheers,
--renato

include/llvm/MC/SectionKind.h
119

Yeah, so, after I saw *all* the other options, I realised this split was the least of our worries. :)

My knowledge of this code is limited, so I'm not sure I could suggest a way to make this separation consistent everywhere else, and I'd certainly dislike proposing something ARM-specific here.

lib/Target/ARM/ARM.td
267 ↗(On Diff #80748)

Oh, I missed that one! Indeed, this is a codegen attribute, which is dependent on the platform, but not enough to be a sub-target feature.

I think the confusion is that Clang emits "-target-feature" "+execute-only", but those are different things.

lib/Target/ARM/ARMSubtarget.cpp
179

There is an error in Clang (D27450) that detects this. Is there a better way to do this in LLVM itself?

Seems a reasonable thing to support in general. Were the concerns about supporting execute-only conceptually or about the design of this approach?

About the concept of unreadable code areas in general, and what benefits this really brings to security and performance of deeply embedded devices.

Gotcha. My recollection is that it's generally not to valuable in isolation, but can be used in combination with other measures to provide some legitimate mitigations. Usual caveats apply. I'm not a security expert, blah blah. :)

I honestly think this is a feature that ARM hardware needs support and I'm not a hardware expert to decide what features people need on such an individualised market. With the change itself being reasonably non-intrusive, I can't see why we'd block such a feature on those grounds only.

Agreed.

lib/Target/ARM/ARMSubtarget.cpp
179

If there's a legitimate user-facing way to get here and this be the first time we can diagnose the error, we can use the emitError() method on the LLVMContext to complain about it. If, OTOH, getting here in that state indicates a bug earlier in the pipeline (probably true?), then perhaps this could just be an assert().

rengolin added inline comments.Dec 8 2016, 1:57 PM
lib/Target/ARM/ARMSubtarget.cpp
179

There shouldn't be a user-facing way to get here, though it may be possible possible with the right combination of flags on llc/llvm-mc.

A quick assert would be of the same value as the one above, I'm ok with that, too.

grosbach added inline comments.Dec 12 2016, 2:41 PM
lib/Target/ARM/ARMSubtarget.cpp
179

SGTM. I don't think it's too horrible if llc/llvm-mc can trigger asserts. Not ideal, but those aren't (supposed to be) user-facing tools at all.

prakhar added inline comments.Dec 13 2016, 3:03 AM
lib/Target/ARM/ARM.td
267 ↗(On Diff #80748)

If I understand the concept correctly, then it sounds like execute-only is something that relates to codegen rather than an attribute of a CPU. How would I go about implementing a codegen option, and is there an example I can follow for this?

lib/Target/ARM/ARMSubtarget.cpp
179

It's possible to reach this point with llc/llvm-mc by specifying +execute-only and +no-movt for -target-feature. As I'll be adding clang diagnostics for -mno-movt being enabled alongside -mexecute-only, I'll revert this back to an assert.

prakhar updated this revision to Diff 81215.Dec 13 2016, 5:17 AM
prakhar edited edge metadata.

Changed execute-only/no-movt error back to an assert

prakhar updated this object.Dec 13 2016, 5:25 AM
prakhar set the repository for this revision to rL LLVM.
prakhar updated this object.
prakhar updated this revision to Diff 81218.Dec 13 2016, 5:30 AM

Fixed mistake with assert

rengolin added inline comments.Dec 13 2016, 6:59 AM
lib/Target/ARM/ARM.td
267 ↗(On Diff #80748)

Have a look at UseFusedMulOps in ARMSubtarget.cpp. There are other examples in the CPP files that you can also take from.

prakhar updated this revision to Diff 81398.Dec 14 2016, 9:10 AM

Make execute-only a codegen option instead of a subtarget feature

rengolin accepted this revision.Dec 14 2016, 9:27 AM
rengolin edited edge metadata.

Thanks for all the changes. LGTM now.

This revision is now accepted and ready to land.Dec 14 2016, 9:27 AM
prakhar closed this revision.Dec 15 2016, 12:57 AM

Closed by commit rL289784

MaskRay added inline comments.
lib/MC/MCParser/ELFAsmParser.cpp
297

I am adding an error for non-aarch32 targets in D148386.

Herald added a project: Restricted Project. · View Herald Transcript