This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC][PC Rel] Implement option to omit Power10 instructions from stubs
ClosedPublic

Authored by Conanap on Jan 13 2021, 11:49 AM.

Details

Summary

Implemented the option to omit Power10 instructions from save stubs via the option --no-power10-stubs or --power10-stubs=no on lld. --power10-stubs= will override the other option. --power10-stubs=auto also exists to use the default behaviour (ie allow Power10 instructions in stubs).

Diff Detail

Event Timeline

Conanap created this revision.Jan 13 2021, 11:49 AM
Conanap requested review of this revision.Jan 13 2021, 11:49 AM
amyk added a subscriber: amyk.Jan 22 2021, 2:09 PM

A few general comments.

lld/ELF/Config.h
74

We have a "yes", but does it need to be here, too?

lld/ELF/Driver.cpp
763

Add documentation for the function.

776

Do we need to handle power10_stubs?

lld/ELF/Options.td
450

nit: maybe use the full word instructions here and in the following lines.

452

Describe the default behaviour?

lld/ELF/Thunks.cpp
920

Unrelated change?

925

Can we maybe try to have a more descriptive name than inst2? (here and in the other places)

lld/test/ELF/ppc64-toc-call-to-pcrel.s
23

No need for the extra #

Conanap updated this revision to Diff 319140.Jan 25 2021, 3:17 PM
Conanap marked 7 inline comments as done.

Fixed some formatting and comments.

Addressed comments

lld/ELF/Config.h
74

After a bit of discussion, since we don't have a concrete implementation for the "yes" option yet, the consensus is to keep it out for now. I'll remove it from the help message as well.

lld/ELF/Driver.cpp
776

That option is handled by the variable NoP10; I'll add a comment to make that more clear.

lld/ELF/Thunks.cpp
920

This change was made so that the commented instruction would better line up with the new instructions added for better readablity.

I have a few comments. Most of them are nits but there is a functional issue as well.

For the testing:
Do we have a test for the PPC64R2SaveStub in the situation where the offset fits in 34 bits?

lld/ELF/Driver.cpp
768

nit:
You can probably get rid of this temp and have the condition down below change to:

if (!args.hasArg(OPT_power10_stubs_eq) &&
    args.hasArg(OPT_no_power10_stubs))
  return P10Stub::No;

Mainly because you only use NoP10 in one place at this point.

lld/ELF/Options.td
445

nit:
This is not checked in getP10StubOpt.
This is no longer an alias for power10-stubs=yes it is an alias for default.

lld/ELF/Thunks.cpp
932

This sequence does not match with what you are using as an offset.
The tocOffset is the difference between the TOC pointer (in r2) and the address of the destination function (the callee). However, the add instructions are using that offset and adding it to zero (addis r12, 0, top of offset) which is not the TOC pointer. At the end of this sequence the register r12 will not have the address of the callee but simply the difference between that address and the TOC pointer.

Since you have a valid r2 and are computing the offset from the TOC pointer you should be adding to that.

935

Same here as above.

971

nit:
The variable d is only used in one place so you can just inline it.
Also, for the future try to avoid single letter variable names.

977

You need to have the off-8 in both cases. The way that this is done is quite confusing beause in one case you subtract 8 in the lambda and in the other you just subtract here inline. I would say just subtract in the computation of off and then you don't have to do it twice in two different places.

Secondly, instead of casting to uint16_t you are better off just using a mask (off & 0xffff).

1010

nit:
Similar to above it is probably better to compute off with the -8 included than to add the adjustment to the lambdas.

1057

I see that these lambdas are used in a lot of places to do the same thing. It might be better to have a static function instead of defining the same lambda three times.

1059

nit: Same as above. Just add compute the offset with the 8 difference here.

lld/test/ELF/ppc64-pcrel-call-to-toc.s
18

nit:
Do the LE version of the test instead of the BE version.

llvm/include/llvm/Object/ELF.h
57

Are these two instruction masks used?

Conanap updated this revision to Diff 319715.Jan 27 2021, 4:34 PM
Conanap marked 12 inline comments as done.

Removed lambda functions and made a static function, included a R2SaveStub test case that covers if it fits in 34 bits with option --no-power10-stubs.

Implement option to omit P10 instructions from stubs

Please mention the option name directly.
Please avoid P10 - the abbreviation is confusing.

lld/ELF/Config.h
74

In this case it can be a simple boolean

lld/ELF/Driver.cpp
767

Delete the comment. it is not useful

770

value

lld/ELF/Thunks.cpp
1006

Delete the stray comment or clarify.

1056

Delete the stray comment or clarify.

1062

space before //

There is one more thing that does not look quite right. It may be an encoding problem for one of the instructions. I found it in the test but you will have to trace it back to the place where it is generated in order to fix it.

lld/ELF/Driver.cpp
770

You may even be able to inline this since it is only used in one place that I can see.

lld/ELF/Options.td
450

nit:
Is this line too long? Or does it just appear too long in my browser. If the length is ok ignore my comment.

lld/ELF/Thunks.cpp
930

nit:
You can mark both as const.
Also below in this same function.

lld/test/ELF/ppc64-pcrel-call-to-toc.s
55

This is not correct.
You should be adding to r11 and not r12 here.
ie.

addis 12, 11, -1

The encoding above may not be correct.

Conanap retitled this revision from [PowerPC][PC Rel] Implement option to omit P10 instructions from stubs to [PowerPC][PC Rel] Implement option to omit Power10 instructions from stubs.Feb 1 2021, 1:12 PM
Conanap updated this revision to Diff 320865.Feb 2 2021, 11:39 AM
Conanap marked 10 inline comments as done.

Addressed comments, fixed an incorrect encoding.

Added a note on one of the functions

lld/ELF/Driver.cpp
765

For this function here, I realize we can inline all the ifs into a giant return statement - is there any opinions on this? I thought the if statements might make this a bit more readable, but if it is preferred that there is only 1 return statement I can make that change as well.

Just nits this time around.

lld/ELF/Driver.cpp
765

Personally I'm happy with it like this. I'm not a fan of huge if statements but I could be persuaded otherwise depending on what other reviewers have to say...

lld/ELF/Options.td
451

nit:
instsructions -> instructions

lld/ELF/Thunks.cpp
936

nit:
addi r12, 2, offset -> addi r12, r2, offset

lld/test/ELF/ppc64-pcrel-call-to-extern.s
110

nit:
Same thing here. Please fix the comment.
0x20148 = 131400 and not 131416

135

nit:
Please fix these comments.
0x10150 does not equal 65888.

160

nit:
This comment too.

0x10030168 - 0x10030010 = 0x158
0x150 = 336

I'm not going to comment on all of these. Please take a look at the comments where numbers have changed and make sure that they are correct.

Conanap updated this revision to Diff 323000.Feb 11 2021, 7:22 AM
Conanap marked 6 inline comments as done.

Fixed up a few comments

stefanp accepted this revision as: stefanp.Feb 16 2021, 3:52 AM

I just have one final nit otherwise LGTM. Feel free to fix that on commit.
Please wait a couple of days and see if @MaskRay has any further comments.

lld/test/ELF/ppc64-pcrel-call-to-extern.s
135

nit:
I think these numbers are still wrong.

0x10030160 - 0x10020010 = 0x10150 not 0x10160

I think it should be: 0x10030170

This revision is now accepted and ready to land.Feb 16 2021, 3:52 AM
amyk added a comment.Feb 16 2021, 2:06 PM

Thanks for answering the questions I previously had on this patch.

lld/ELF/Driver.cpp
763

nit: remove the space after = and before flags.

Conanap updated this revision to Diff 327839.Mar 3 2021, 9:58 AM

Updated 2 test cases and some nits

This revision was landed with ongoing or failed builds.Mar 4 2021, 10:28 AM
This revision was automatically updated to reflect the committed changes.

The used code sequence has 2 bugs so the output will crash. I fixed them in 7198c87f42f6c15d76b127c1c63530e9b4d5dd39 and added my notes to https://maskray.me/blog/2023-02-26-linker-notes-on-power-isa

Herald added a project: Restricted Project. · View Herald TranscriptFeb 27 2023, 4:21 PM