This is an archive of the discontinued LLVM Phabricator instance.

[mips] Expand JAL instructions when PIC is enabled.
ClosedPublic

Authored by dsanders on Nov 12 2014, 9:38 AM.

Details

Summary

This is the correct way to handle JAL instructions when PIC is enabled.

Diff Detail

Event Timeline

tomatabacu updated this revision to Diff 16104.Nov 12 2014, 9:38 AM
tomatabacu retitled this revision from to [mips] Expand JAL instructions when PIC is enabled..
tomatabacu updated this object.
tomatabacu edited the test plan for this revision. (Show Details)
tomatabacu added a reviewer: dsanders.
tomatabacu added a subscriber: Unknown Object (MLST).
tomatabacu updated this revision to Diff 16154.Nov 13 2014, 6:15 AM

-We now emit a correct JALR instruction, which makes us unable to add the R_(MICRO)MIPS_JALR relocation after it.
This is fine because that relocation is just an optimization hint for the linker and is not needed for correctness.
This means that we don't need to implement that relocation any more.

-Added a FIXME which explains this situation.

-Made tests match with the changes mentioned above.

tomatabacu updated this revision to Diff 16156.Nov 13 2014, 6:37 AM

Fix a spelling mistake in a comment.

emaste added a subscriber: emaste.Jan 28 2015, 8:38 AM
tomatabacu updated this revision to Diff 18947.Jan 29 2015, 3:23 AM

Updated tests because of fix to the microMIPS disassembler.
Rebased.

dsanders requested changes to this revision.Jan 29 2015, 8:47 AM
dsanders edited edge metadata.

Sorry for taking so long on these. I had to find out how this is supposed to work before I could comment.

A couple general comments on the patch:

  • There's no mention of -mxgot. I'm happy for -mxgot to be a follow up patch but could you insert assertions/unreachables to guard against it for now.
  • Not required but consider refactoring into smaller functions. It's difficult to see the overall logic at the moment.
  • You'll need to change the commands to use -target-abi
lib/Target/Mips/AsmParser/MipsAsmParser.cpp
1106–1109

Nit: This would be a little tidier as a single statement. We never use JalOpnd, and JalSymExpr again.

1111

I doubt this condition is right, although it's probably close enough to cover a lot of cases.

Things that spring to mind:

  • Are forward-declared locals handled correctly?
  • Are weak symbols handled correctly?
  • What if the symbol is a section name like .text?
1123–1131

These paths should be merged and we should use JalSym in both cases. The .text you see in objdump's output stems from a GAS optimisation to reduce the number of symbols it has to maintain.

Also, use JalSym instead of JalSym.getName() so we don't lose the offset on things like 'jal foo+4'

1138–1144

Same as above

1155

LD for N64

1158–1167

Same as above

1170–1183

Is this doing the right thing for N64?

1186–1189

Nit: JalrInst.setOpcode(inMicroMipsMode() ? Mips::JALR_MM : Mips::JALR);

This revision now requires changes to proceed.Jan 29 2015, 8:47 AM
dsanders added inline comments.Jan 29 2015, 8:48 AM
lib/Target/Mips/AsmParser/MipsAsmParser.cpp
1158–1167

Same as above

(I mean the above JalSym comment, not the N64 one)

tomatabacu updated this revision to Diff 21503.Mar 9 2015, 11:16 AM
tomatabacu edited edge metadata.

Partially addressed review comments.
Changed tests to use -target-abi instead of -mattr.
Removed the .set reorder test (redundant).
Switched away from using llvm-objdump (doesn't work for microMIPS on N64).
Added support for local labels (1b, 2f, etc.).
Updated comments.

Regarding -mxgot, AFAICT it is never checked for in the IAS. There is a command-line option used in llc, but I can't find a way to check for it from MipsAsmParser.
In the meantime, I've added a FIXME comment.

Also, note that offsets do not work (e.g. jal label+8). I can get the value of the offset, but I can't use it as an operand for the ADDIU, in the case of O32,
or store it in the relocation addendum, in the case of N32/N64. I think we would need to improve relocation emission in order to properly handle offsets.

lib/Target/Mips/AsmParser/MipsAsmParser.cpp
1106–1109

Done.

1111

Are forward-declared locals handled correctly?
Unfortunately, they are not (I assume you're talking about doing a "jal fwd_label" before the definition of fwd_label). They end up being treated as global symbols. I'm not sure how to fix this.

Are weak symbols handled correctly?
AFAICT weak symbols are treated the same way as global symbols.

What if the symbol is a section name like .text?
.text should be treated as a local symbol, but there is something weird happening: with llvm-objdump it's treated as a local symbol, but with plain llvm-mc, .text is treated as a global symbol.

1123–1131

Done.

1138–1144

Done.

1155

Fixed.

1158–1167

Done.

1170–1183

Except for not generating LD on N64 (now fixed), I believe so.

1186–1189

Done.

Regarding -mxgot, AFAICT it is never checked for in the IAS. There is a
command-line option used in llc, but I can't find a way to check for it from
MipsAsmParser. In the meantime, I've added a FIXME comment.

Ok. We'll have to add that at some point.

Also, note that offsets do not work (e.g. jal label+8). I can get the value
of the offset, but I can't use it as an operand for the ADDIU, in the case
of O32, or store it in the relocation addendum, in the case of N32/N64.
I think we would need to improve relocation emission in order to properly handle offsets.

I believe the problem is that you are extracting the symbol from the expression, then creating a new expression from the symbol. The original expression should have the offset in it and will result in relocations for label+offset.

lib/Target/Mips/AsmParser/MipsAsmParser.cpp
1111

Are forward-declared locals handled correctly?
Unfortunately, they are not (I assume you're talking about doing a "jal fwd_label" before the definition of fwd_label). They end up being treated as global symbols. I'm not sure how to fix this.

Yes, I'm talking about cases like:

  jal foo
foo:

I've just experimented with it and it turns out that they do work as expected and are treated as local symbols.

Are weak symbols handled correctly?
AFAICT weak symbols are treated the same way as global symbols.

Apart from a duplicate nop which needs fixing, this is behaving the same as GAS too.

What if the symbol is a section name like .text?
.text should be treated as a local symbol, but there is something weird happening: with llvm-objdump it's treated as a local symbol, but with plain llvm-mc, .text is treated as a global symbol.

I see this odd behaviour too. We need llvm-mc to do the right thing for testing purposes so could you look into this? My guess is that there's a difference in how the .text section is created.

1170–1183

Thanks. I've experimented a bit with GAS too and my remaining doubts are now settled. The relocations tend to change for N64 but this is one case where it doesn't.

I think this needs a regen against head as it fails to apply cleanly as of today.

tomatabacu updated this revision to Diff 26329.May 22 2015, 9:25 AM
tomatabacu edited edge metadata.

Rebased on top of D9938 which fixes invalid names for temp symbol operands (e.g. $tmp00, $tmp10).
Added .set nomacro warning & test case.
Added the desired FIXME output for the .text test case.
Rebased.

I have found a potential fix for the discrepancy between the asm and obj outputs when using .text in a PIC relocation,
but I don't have an arch-independent test for it yet and it breaks a lot of NVPTX tests, for some reason.

Rebased (for Sean Bruno's convenience).

tomatabacu planned changes to this revision.Jun 11 2015, 10:59 AM

I have found a potential fix for the discrepancy between the asm and obj outputs when using .text in a PIC relocation,
but I don't have an arch-independent test for it yet and it breaks a lot of NVPTX tests, for some reason.

If it's difficult or a significant amount of work then I'd prefer to get this patch committed and handle that as a follow-up.

Looking back through the comments, I believe the remaining issues are:

  • Duplicated nops
  • The handling of .text
  • Offsets are lost for label+offset

Even without resolving those details this is still a significant step forwards.

Are there any other outstanding issues? If not, I think it's ok to commit whats done so far and fix the smaller details in follow-ups.

tomatabacu updated this revision to Diff 27939.Jun 18 2015, 9:45 AM

Improved the code which gets the JalSym from the original JalExpr.
Moved relocation creation before MCInst creation in order to improve readability. (NFC)
Added FIXME comments for the remaining problems.

Looking back through the comments, I believe the remaining issues are:

  • Duplicated nops

I'm not sure if this is accurate. I've compared IAS and GAS with .set reorder/noreorder and
with/without manually added NOPs and I haven't seen any discrepancies.

  • The handling of .text

I have found a fix for this, but it's a bit ugly and it only touches common code.
If I include it here, it might slow down this patch even more.

  • Offsets are lost for label+offset

They used to be lost, but since I switched the code to use evaluateRelocExpr() offsets will cause a fatal error,
because it tries to inappropriately apply relocations to constants.

I'd rather have this than silently generating the wrong code.

However, evaluateRelocExpr() needs to be fixed.

Some other remaining problems are:

  • no support for the LargeGOT case
  • no support for forward-declared locals

Also, this patch will cause a recently added RuntimeDyld test to fail (ExecutionEngine/RuntimeDyld/Mips/ELF_Mips64r2N64_PIC_relocations.s).
I don't know how to fix this test.

Looking back through the comments, I believe the remaining issues are:

  • Duplicated nops

I'm not sure if this is accurate. I've compared IAS and GAS with .set reorder/noreorder and
with/without manually added NOPs and I haven't seen any discrepancies.

In that case it must have been fixed recently.

  • The handling of .text

I have found a fix for this, but it's a bit ugly and it only touches common code.
If I include it here, it might slow down this patch even more.

I'm ok with leaving that for another patch.

  • Offsets are lost for label+offset

They used to be lost, but since I switched the code to use evaluateRelocExpr() offsets will cause a fatal error,
because it tries to inappropriately apply relocations to constants.

I'd rather have this than silently generating the wrong code.
However, evaluateRelocExpr() needs to be fixed.

I agree that an error is better than silently producing bad code and that evaluateRelocExpr() is badly flawed and needs fixing..

Some other remaining problems are:

  • no support for the LargeGOT case

That's ok for this patch, but following the principle above I'd suggest asserting that LargeGOT is not in use.

  • no support for forward-declared locals

Didn't we establish earlier in the review that these worked as expected? Has that changed?

Also, this patch will cause a recently added RuntimeDyld test to fail (ExecutionEngine/RuntimeDyld/Mips/ELF_Mips64r2N64_PIC_relocations.s).
I don't know how to fix this test.

You need to disable pic for the 'jal foo' on line 44 using '.option pic0' and '.option pic2'. At the moment, it's trying to read the bottom 26-bits of a jal instruction but the new expansion means it's really reading the bottom 26-bits of an lw.

lib/Target/Mips/AsmParser/MipsAsmParser.cpp
1041–1042

Nit: Unnecessary braces

1232

Nit: psudo->pseudo

1239–1244

I'm not sure I understand this bit. Can you explain?

1282

Nit: isABI_N64() -> ABI.ArePtrs64bit()

tomatabacu updated this revision to Diff 28116.Jun 22 2015, 8:41 AM

Fixed the RuntimeDyld test.
Addressed the other review comments.

That's ok for this patch, but following the principle above I'd suggest asserting that LargeGOT is not in use.

I just realized that llvm-mc doesn't even accept -mxgot on the command line (it's also not listed in --help-hidden).
It seems that -mxgot is a hidden option only for llc.

Didn't we establish earlier in the review that these worked as expected? Has that changed?

I'm not sure we did. We're still treating forward-declared local symbols as global, while GAS treats them as local.
Just to be clear, I'm talking the following:

.option pic2
  jal foo
foo:
lib/Target/Mips/AsmParser/MipsAsmParser.cpp
1041–1042

Fixed.

1232

Fixed.

1239–1244

GAS spews all kinds of errors if it is given 2 symbols here, so we should report an error as well.
Also, if we didn't, the IAS would (currently) segfault.

1282

Fixed.

brooks added a subscriber: brooks.Jul 21 2015, 2:15 PM
seanbruno accepted this revision.Aug 16 2015, 4:29 PM
seanbruno added a reviewer: seanbruno.

Thanks for the work on this!

dsanders commandeered this revision.Aug 18 2015, 2:50 AM
dsanders edited reviewers, added: tomatabacu; removed: dsanders.

Commandeered since Toma has finished his placement with us.

In D6231#191632, @dsanders wrote:
That's ok for this patch, but following the principle above I'd suggest asserting that LargeGOT is not in use.
I just realized that llvm-mc doesn't even accept -mxgot on the command line (it's also not listed in --help-hidden).

It seems that -mxgot is a hidden option only for llc.

We'll need to add -mxgot in a later patch. For now, I'll just add a fixme.

Didn't we establish earlier in the review that these worked as expected? Has that changed?

I'm not sure we did. We're still treating forward-declared local symbols as global, while GAS treats them as local.
Just to be clear, I'm talking the following:

.option pic2

jal foo

foo:

I've decided not to worry about the handling of forward-declared local symbols. If we do still have issues, then we can fix them later. For now, it's more important to get Toma's remaining work committed.

lib/Target/Mips/AsmParser/MipsAsmParser.cpp
1239–1244

I understand that Toma was checking for strange assembly like:

jal foo-bar

I agree that this should have a nice error message, but I wonder if this patch is checking in the right place. I think it may be a job for the code that figures out which reloc to use for the expression.

Having looked into this a bit, I think that handling this in the right place is related to representing our unary operators (%hi(), etc.) actual operators in MCExpr nodes. printExpr() ought to blindly print the expression it has whether valid or not, and the reloc selection should handle validity.

For now, I'll add a FIXME comment saying it should be removed once %hi(), etc are MCExpr operators.

This revision is now accepted and ready to land.Aug 18 2015, 2:50 AM
dsanders closed this revision.Aug 18 2015, 9:18 AM