This is an archive of the discontinued LLVM Phabricator instance.

[lld][ARM][2/3]Big Endian support - Word invariant support
ClosedPublic

Authored by simpal01 on Dec 16 2022, 3:23 AM.

Details

Summary

Changes:

  • Adding BE32 big endian Support for Arm.
  • Replace the writele and readle with their endian-aware versions.
  • Adding test cases for the big-endian be32 arm configuration.

    Patch by: Milosz Plichta. This patch merges all the changes from this patch https://reviews.llvm.org/D140203 as well.

Diff Detail

Event Timeline

M-Plichta created this revision.Dec 16 2022, 3:23 AM
Herald added a project: Restricted Project. · View Herald Transcript
M-Plichta requested review of this revision.Dec 16 2022, 3:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 16 2022, 3:23 AM

It will be worth adding to the description of the difference between BE32 and BE8. BE32 is the legacy way of handling big-endian on Arm systems prior to ArmV6. In this model all relocatable object files are BigEndian, and all the instructions and data are BigEndian. In ArmV6 the byte-invariant BE8 mode was added, in the output ELF file (executable/shared library) this is the same model as AArch64 where the ELF file and Data are big-endian but instructions are little-endian. In contrast to AArch64 relocatable object files are identical in BE8 and BE32, the linker is expected to endian reverse big-endian instructions from big-endian to little-endian. In modern Arm systems BE32 support has been removed, leaving only BE8. However BE32 is a necessary pre-requisite to support BE8.

For BE8 support we'll have to endian reverse the instructions. For non-SyntheticSections we'll have to use the mapping symbols to do that during or more likely after writeTo(). For SyntheticSections we have a choice, we can write them big-endian with mapping symbols, and then do the same endian reversal as for non-SyntheticSections; or we can have the SyntheticSections force little-endian writes for instructions.

I notice that ARMErrataFix.cpp hard-codes reading of little-endian instructions. This won't work on a big-endian relocatable object. In practice all the systems that use ARMErrataFix.cpp are likely to be little-endian so it may not be necessary to support it for big-endian at this point. It would be good to error out with an unsupported error message.

I suspect that some reviewers will prefer to have the tests in the same review as the patch as this is making functional changes.

Not had a chance to go over all the small changes, but on a quick scan I couldn't spot anything obviously wrong. The more difficult bit will be spotting anything missing.

I'll be on vacation for three weeks. Happy for others to review in my absence. Otherwise I can pick it back up next year.

MaskRay added inline comments.Dec 17 2022, 7:11 PM
lld/ELF/Options.td
39 ↗(On Diff #483472)

What do the options do? arm-linux-gnueabihf-ld doesn't support --be8 or --be32.

MaskRay added inline comments.Dec 17 2022, 9:36 PM
lld/ELF/Options.td
39 ↗(On Diff #483472)

If they are really needed, FF< to only support the double-dash form.

lld/docs/ld.lld.1
92 ↗(On Diff #483472)

This description isn't very clear.

M-Plichta updated this revision to Diff 484534.Dec 21 2022, 4:46 AM

Thanks for the description Peter, and for all the comments. Hope this adds enough clarity.

Please answer why --be8 and --be32 are added and what the references are.

lld/ELF/ARMErrataFix.cpp
181 ↗(On Diff #484534)

don't capitalize. see other diagnostics. this incompatibility check should be added to Driver.cpp see " may not be used together"

lld/ELF/Arch/ARM.cpp
921

Add newline at end of file

lld/ELF/Driver.cpp
1085

If not yet supported, just delete this option. Are use cases using this option still useful?

Please answer why --be8 and --be32 are added and what the references are.

Some documentation for --be8: https://sourceware.org/binutils/docs/ld/ARM.html and https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
As far as I'm aware, --be32 is deprecated but needed to be added for backwards compatibility.

The --be8 and --be32 were added at the same time because this patch is a precursor for the patch that adds --be8 support. You're right though, it does not belong in this patch.

Thanks.

Please answer why --be8 and --be32 are added and what the references are.

Some documentation for --be8: https://sourceware.org/binutils/docs/ld/ARM.html and https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
As far as I'm aware, --be32 is deprecated but needed to be added for backwards compatibility.

The --be8 and --be32 were added at the same time because this patch is a precursor for the patch that adds --be8 support. You're right though, it does not belong in this patch.

Thanks.

Removing the options and this patch is fine.

simpal01 commandeered this revision.Feb 16 2023, 7:15 AM
simpal01 added a reviewer: M-Plichta.
simpal01 added subscribers: Milosz-Plichta, simpal01.

I am taking over this change from the original author @Milosz-Plichta with his permission to move the change forward.

MaskRay requested changes to this revision.Feb 16 2023, 9:52 AM

Please answer why --be8 and --be32 are added and what the references are.

Some documentation for --be8: https://sourceware.org/binutils/docs/ld/ARM.html and https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
As far as I'm aware, --be32 is deprecated but needed to be added for backwards compatibility.

The --be8 and --be32 were added at the same time because this patch is a precursor for the patch that adds --be8 support. You're right though, it does not belong in this patch.

Thanks.

Removing the options and this patch is fine.

Request for removal :)

This revision now requires changes to proceed.Feb 16 2023, 9:52 AM
simpal01 updated this revision to Diff 504549.Mar 13 2023, 2:01 AM

Addressed all the comments. I have added all the tests in this same
patch where we are making all the functional changes. I have literally
taken all the changes from https://reviews.llvm.org/D140203 to here.
Also added few more testcases.

simpal01 edited the summary of this revision. (Show Details)Mar 13 2023, 3:30 AM

A few small comments to move an error message to Driver.cpp and to remove the --be32 option. This will mean updating the tests.

I've not had the chance to check through the tests yet, but will do so tomorrow.

Please can you Commandeer https://reviews.llvm.org/D140203 and take the Abandon Revision action. That revision is now redundant as this one includes all of its changes and tests.

lld/ELF/ARMErrataFix.cpp
181 ↗(On Diff #504549)

This comment still needs to be done. I've left a comment to highlight the place in Driver.cpp

lld/ELF/Driver.cpp
347

This would be the place to move the check from ARMErrataFix.cpp.

Message text could be something like --fix-cortex-a8 is not supported on big-endian targets

I also suggest adding an error message for --be32 on emachine != EM_ARM

lld/ELF/Options.td
41 ↗(On Diff #504549)

These comments not done yet.

GNU ld only has the --be8 option, it defaults to --be32. We should remove this command line from upstream and default to --be32 (tests will need altering).

When the --be8 support is added we can add the --be8 switch.

The --be8 is in theory the better default, but if we do go down that route we will need both --be8 and --be32 as GCC and clang drivers will always pass --be8.

lld/docs/ld.lld.1
91 ↗(On Diff #504549)

I think we should take this out given that I'm suggesting taking out --be32 to match the GNU ld interface.

When we add --be8 we'll need to add that here.

simpal01 updated this revision to Diff 505400.EditedMar 15 2023, 2:06 AM
simpal01 edited the summary of this revision. (Show Details)

This patch now

  • Removes the --be32 command line option and make it default.
  • Added an error check for the --fix-cortex-a8 option on big endian targets
  • Updated the tests accordingly.
This comment was removed by simpal01.
lld/ELF/Driver.cpp
347

Now i have set be32 in Config.h with

**bool be32 = false;**

And again setting this value in setConfigs with

**config->be32 = (k==ELF32BEKIND && m==EM_ARM);**

And when we implement --be8 switch, my plan is to change this to

config->be32 = (k==ELF32BEKIND && m==EM_ARM && !config->be8);

where config->be8 will be

config->be8 = args.hasArg(OPT_be8);

Since i am setting the be32 value only when emachine = EM_ARM, i am not sure if we still need to add an error message for --be32 on emachine!=EM_ARM.

simpal01 marked an inline comment as not done.Mar 15 2023, 2:27 AM
simpal01 marked 5 inline comments as done.Mar 15 2023, 2:29 AM
simpal01 added inline comments.
lld/ELF/Driver.cpp
347

Now i have set be32 in Config.h with

bool be32 = false;

And again setting this value in setConfigs with

config->be32 = (k==ELF32BEKIND && m==EM_ARM);

And when we implement --be8 switch, my plan is to change this to

config->be32 = (k==ELF32BEKIND && m==EM_ARM && !config->be8);

where config->be8 will be

config->be8 = args.hasArg(OPT_be8);

Since i am setting the be32 value only when emachine = EM_ARM, i am not sure if we still need to add an error message for --be32 on emachine!=EM_ARM.

I think I've spotted a problem in .ARM.exidx table compression. I also think we can get rid of config->be32 for this patch as nothing else reads it.

lld/ELF/Config.h
202 ↗(On Diff #505400)

I don't think anything in this patch uses this be32 variable. Given that there will be a --be8 switch and be32 will be the default I think we can do everything we need in a follow up patch with just be8 which is set with config->be8 = args.hasArg(OPT_be8);

lld/ELF/Driver.cpp
1555

As mentioned above I think we can take this out for this patch as config->be32 is never read.

lld/ELF/SyntheticSections.cpp
3412–3413

I think this needs changing, it is probably the reason the .ARM.exidx table is not identical between little and big-endian.

lld/test/ELF/arm-exidx-emit-relocs.s
77

There is an extra table entry in the .ARM.exidx section, which I can't immediately see why that would be the case. The only difference I can see in the RUN: line is -mcpu=cortex-a8 but I wouldn't expect that to make any difference.

At a glance I can see two identical entries at addresses 100d4 and 100dc with unwinding instructions 80978408. Table compression should be able to merge these two adjacent, identical entries.

lld/test/ELF/arm-exidx-sentinel-norelocatable.s
23

Remove extra trailing whitespace

simpal01 updated this revision to Diff 506567.Mar 20 2023, 7:16 AM

This patch:

  • get rid of config->be32 for this patch
  • Fixed the problem found in .ARM.exidx table compression. Now isDuplicateArmExidxSec works for both little and big endians. .ARM.exidx table is now identical between little and big-endian.
simpal01 edited the summary of this revision. (Show Details)Mar 20 2023, 7:17 AM

This looks correct, although I think there are a few places where we can simplify the code to remove the explicit endianness.

lld/ELF/SyntheticSections.cpp
3545

I think we can simplify the function by replacing cantUnwindData with a comment

// A linker generated CANTUNWIND entry is made up of two words
// 0x0 with R_ARM_PREL31 relocation to target.
// 0x1

We can then replace read32le(cantUnwindData+0) with 0x0 and read32le(cantUnwindData+4) with 0x1 in the two places it is used.

3568

Suggest making this // Write Sentinel CANTUNWIND entry.

peter.smith added inline comments.Mar 20 2023, 11:17 AM
lld/ELF/SyntheticSections.cpp
3445

Phab lost my previous comment that I had spent ages on. Sigh...

I think that we can rewrite this so that we don't have to use explicit endianness. Instead of using getDataAs() we use content().getData() and read32().

For example (untested and will need comments)

uint32_t prevUnwind = 1;
if (prev)
  prevUnwind = read32(prev->getContent().getData() + prev->size - 4);
if (isExtabRef(prevUnwind))
  return false;
...
if (cur == nullptr)
  return prevUnwind == 1;

for(uint32_t offset = 4; offset < size; offset += 8) {
  uint32_t curUnwind = read32(getContent().getData() + offset);
  if (isExtabRef(curUnwind) || curUnwind != prevUnwind)
      return false;
}
simpal01 updated this revision to Diff 507658.Mar 23 2023, 2:32 AM

Addressed Peter's review comments.

simpal01 marked 7 inline comments as done.Mar 23 2023, 2:37 AM
peter.smith accepted this revision.Mar 27 2023, 8:24 AM

One tiny change to add some punctuation to a comment. This can be done prior to commit, no need to post another review.

I think that this satisfies MaskRay's request in https://reviews.llvm.org/D140202#4132479 to remove --be8 from this patch. LLD now has the same interface as GNU ld for BE32. Will be worth waiting a day or so to see if there are any additional comments from the US timezone.

lld/ELF/SyntheticSections.cpp
3547

Just remembered the LLD convention for comments. Can you add a colon after two words and put a full stop after 0x1. essentially:

// A linker generated CANTUNWIND entry is made up of two words:
// 0x0 with R_ARM_PREL31 relocation to target.
// 0x1.

My apologies I should have made that clearer in the example.

PPC64 is not templated while supporting both endians. If we use templates for ARM, we can use write32<ELFT::TargetEndianness> to save a config->endianness dispatch.

lld/ELF/Arch/ARM.cpp
187

write32<ELFT::TargetEndianness> will be more efficient and compile to less code.

simpal01 updated this revision to Diff 508966.Mar 28 2023, 5:05 AM

Yeah, It does look like we do not need to add the template to the Arm target classes.
I can not see any places that we use the template parameter. Detemplating the classes
which makes it more readable.

simpal01 marked 2 inline comments as done.Mar 28 2023, 5:07 AM
MaskRay accepted this revision.Mar 28 2023, 9:58 PM

LGTM.

This revision is now accepted and ready to land.Mar 28 2023, 9:58 PM
simpal01 edited the summary of this revision. (Show Details)Mar 29 2023, 1:41 AM
This revision was landed with ongoing or failed builds.Mar 29 2023, 2:21 AM
This revision was automatically updated to reflect the committed changes.