This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Generate address sequences suitable for mcmodel=medium
ClosedPublic

Authored by lewis-revill on Nov 6 2018, 2:49 AM.

Details

Summary

This patch adds an implementation of a PC-relative addressing sequence to be used when -mcmodel=medium is specified. With absolute addressing, a 'medium' codemodel may cause addresses to be out of range. This is because while 'medium' implies a 2 GiB addressing range, this 2 GiB can be at any offset as opposed to 'small', which implies the first 2 GiB only.

Note that LLVM/Clang currently specifies code models differently to GCC, where small and medium imply the same functionality as GCC's medlow and medany respectively.

Diff Detail

Repository
rL LLVM

Event Timeline

lewis-revill created this revision.Nov 6 2018, 2:49 AM
jrtc27 requested changes to this revision.Nov 6 2018, 3:05 AM
jrtc27 added inline comments.
lib/Target/RISCV/RISCVISelLowering.cpp
376

This needs to be an MO_PCREL_LO, surely? (and then modified to refer to the auipc rather than the symbol...)

This revision now requires changes to proceed.Nov 6 2018, 3:05 AM
jrtc27 added a comment.Nov 6 2018, 3:32 AM

It's also worth noting that GCC distinguishes between these two models by calling them medlow and medany, whereas small (what should be medlow) vs medium (ie medany) is somewhat misleading, but the SPARC backend seems to already set a precedent for this divergence (medlow -> small, medmid -> medium ie 44-bit absolute, medany -> large ie 64-bit absolute).

Now, there's an argument to be made whether our medium should continue to use absolute addressing on RV32I. The %hi/%lo pair already gives a 32-bit absolute address (signed, but that only poses a potential issue on RV64I), so we don't gain anything from this sequence. However, on RV64I, if our medium does mean medany, it will need to use PC-relative addressing, so it makes sense to have RV32I match this behaviour (and indeed, GCC also does this).

lewis-revill added inline comments.Nov 6 2018, 3:36 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

I was under the impression that only the auipc symbol needs to be PC-relative, since:

  1. The upper bits are the bits we need to prevent from overflowing
  2. the operation of auipc is to add the upper 20 bits of the symbol relative to the program counter to the program counter itself, putting the result into the destination register. Essentially we've done the same operation as an lui, but via the PC. So the addi should be the same?

Also more importantly I tried that and got a 'relocation truncated to fit'...

I don't quite understand what you mean by 'modified to refer to the auipc rather than the symbol'? Basically the first two SDValues are just building the %pcrel_hi(sym) & %lo(sym) expressions to be used in the actual sequence.

It's also worth noting that GCC distinguishes between these two models by calling them medlow and medany, whereas small (what should be medlow) vs medium (ie medany) is somewhat misleading, but the SPARC backend seems to already set a precedent for this divergence (medlow -> small, medmid -> medium ie 44-bit absolute, medany -> large ie 64-bit absolute).

Now, there's an argument to be made whether our medium should continue to use absolute addressing on RV32I. The %hi/%lo pair already gives a 32-bit absolute address (signed, but that only poses a potential issue on RV64I), so we don't gain anything from this sequence. However, on RV64I, if our medium does mean medany, it will need to use PC-relative addressing, so it makes sense to have RV32I match this behaviour (and indeed, GCC also does this).

I was looking at the old GCC specifications for the code model, when they did use small/medium/large which I presume clang copied (quite hard to find actual documentation on this). Your comment about RV32I makes sense, we literally gain nothing if the PC is 32 bits. The idea of this patch was to enable programs with a 64-bit address space to be compiled correctly on RV64 (assuming they use 'medium').

jrtc27 added inline comments.Nov 6 2018, 3:52 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

No, because you still have the lower bits of the program counter added to the %lo value. What you're trying to do is use auipc to get the 4k-page and then set the lower bits to the offset in the page, but that needs an extra mask if you want to do that, which is pointless when you can add the %pcrel_lo instead and get the right result.

Example:

0x10048: auipc a0, %pcrel_hi(X)
0x1004c: addi a1, a0, %lo(X)

where X is at address 0x20072.

%pcrel_hi(X) at 0x48 will give us upper 20 bits of 0x20072-0x10048=0x1002a, ie 0x10000
Thus a0 gets set to PC+0x10000 = 0x20048
%lo(X) will give us 0x20072&0xFFF = 0x72
Thus a1 gets set to a0+0x72 = 0x200ba, not 0x20072, because we had the extra 0x48 from the low bits of PC at the time of the auipc.

376

As for 'modified to refer to the auipc rather than the symbol', %pcrel_lo is unusual on RISC-V. You would think that you would write something like the following (the +4 is of course extremely important):

.L0: auipc a0, %pcrel_hi(X)
     addi a1, a0, %pcrel_lo(X+4)

But in fact that's not how you do it. On RISC-V, the pair of instructions are tied together so the linker knows which auipc the later addi is consuming, and you instead do this:

.L0: auipc a0, %pcrel_hi(X)
     addi a1, a0, %pcrel_lo(.L0)

Note how the symbol for the %pcrel_lo becomes the label for the auipc rather than the symbol itself. This indirection is eventually removed by the linker once it has done all relaxations, and will turn into the low 12 bits of X-.L0 ie what is normally %pcrel_lo(X+4).

lewis-revill added inline comments.Nov 6 2018, 4:09 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

Thank you for the explanation, I see what you mean now. Now I have to figure out how to implement that.

jrtc27 added inline comments.Nov 6 2018, 4:28 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

Look at the expansion of lla in the assembler, that's exactly what you're trying to do here.

jrtc27 added a comment.EditedNov 6 2018, 5:56 AM

A few other points:

  1. We should be defaulting to the small code model to match GCC (which defaults to medlow)
  2. Other backends only support a subset of the code models; we should only allow small and medium, anything else is an error (look at other backends for how they do it).
  3. The switch on the code model is identical across all three functions and so can be de-duplicated.
rogfer01 added inline comments.
lib/Target/RISCV/RISCVISelLowering.cpp
376

In an earlier attempt to implement local PIC addressing I created a basic block to get the address

https://reviews.llvm.org/D50634

but this may impact negatively later optimisations that work at a basic-block unit. @efriedma suggested to add an id to the auipc instead.

My approach used a pseudo instruction so what one needs in my case is to make sure the label is emitted by AsmPrinter. Under that approach (in case you want to consider this approach) it looks to me we need to extend first MachineInstr to be able to represent that id (e.g. zero if no id). FunctionLoweringInfo could be the provider of these unique ids within a function via SelectionDAG. Then teach AsmPrinter to emit a label when there is an id for a given instruction. Then we need to link the user of the id (in your case addi). That part is less clear to me but I'll be bold here and I believe SelectionDAG::getTargetIndex is enough here (with a MII flag stating that this is a pcrel_lo).

Unfortunately I could not progress on this, yet. Feel free to investigate this avenue for feasibility. Perhaps there is a simpler approach in your case.

(To the best of my knowledge, this way of referencing an earlier instruction for symbol relocations is a bit of a unique thing of RISC-V so I think we're in uncharted land within LLVM)

Hope this helps.

Regards,

lewis-revill added inline comments.Nov 6 2018, 8:01 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

Thank you very much for the background, I should have looked this up before.. I think a pseudo instruction is the right approach, but it should probably be expanded as late as possible? I'm testing whether it would be worth modifying RISCVExpandPseudoInsts to cater for operations other than just atomics.

Maybe then we could still go with the basic block approach since most optimisations will be done already.

lewis-revill planned changes to this revision.Nov 6 2018, 8:04 AM

Adding a pseudo instruction for PC-relative addressing to expand later and addressing the issues raised by @jrtc27

jrtc27 added inline comments.Nov 6 2018, 8:31 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

The latest point to do it is in RISCVAsmPrinter for the pseudo approach, as far as I am aware (and that's what I've been doing for my locally hacked up LLVM). I get uneasy with doing it earlier (ie by using a basic block and expanding somewhere like RISCVExpandPseudoInsts) in case instructions get moved around and the MBB no longer has the auipc as its first instruction, but if there are guarantees that that doesn't happen then that seems fine (although I fail to see the benefit from expanding such a simple instruction then, when doing it in RISCVAsmPrinter is almost exactly the same as what RISCVAsmParser is doing). Others with deeper LLVM-internals knowledge may be able to offer better advice here, but my approach does work.

lewis-revill added inline comments.Nov 6 2018, 9:25 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

The only problem with expanding in RISCVAsmPrinter that I can think of is that we would miss the chance to use the RISCVMergeBaseOffset pass, which I was planning to modify for this patch as well. Maybe there is a way around that. It would certainly make things easier.

rogfer01 added inline comments.Nov 6 2018, 9:47 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

@jrtc27: in my case, for PIC addressing I was a bit reluctant to late-expand the sequences, which I do in my downstream LLVM too, but I think we could do better in upstream: I presume there may be circumstances where due to scheduling we may want to move instructions between the address-forming instructions. My understanding is that a pseudo-instruction that is expanded late is a bit like a black box and would prevent that. I felt a bit uneasy to dismiss the flexibility that the existing linker mechanism provides.

That said perhaps I've got a storm in a teacup and there are no realistic circumstances where this is going to be profitable, so a late expanded pseudo is actually fine. Perhaps this mechanism of relocating to an instruction that has the actual relocation is more suited to the linker and there is no expectation for the compiler to emit non-contiguous sequences. Not sure, to be honest.

Regards,

jrtc27 added inline comments.Nov 6 2018, 11:53 AM
lib/Target/RISCV/RISCVISelLowering.cpp
376

Yeah, it's possible that you'd want to split them up, but as you say given the simplicity of the operations involved it seems unlikely to be necessary, but I'm not an expert on these things. So, in my opinion, we should either be doing the late expansion in RISCVAsmPrinter, or we go all the way and generate separate linked instructions with a labelled auipc straight away, with no pseudos involved; anything in between seems like a half-hearted waste of time.

It's worth noting that GCC will do exactly what LLVM does for 32-bit absolute addressing (separate %hi/%lo instructions, with the addi ..., %lo(sym) merged with the load if you're not just taking the address), but for anything else it will just emit la or lla (depending on the symbol's localness), not even expanding the sequence and leaving it up to GNU as, essentially like our simple expansion solution.

Deduplicate switch statements and error on an unsupported code model. Added a wrapper for loading the PC-relative address, which is expanded in RISCVExpandPseudoInsts. This approach is a compromise between splitting up into auipc and addi too early and blocking optimisations due to an additional basic block and splitting up too late and losing the chance to optimize the black box pseudo instruction.

By expanding at this point there is still the opportunity to do base-offset merges (when implemented) and other late machine-function passes, but the majority of larger optimisations such as outlining will have already taken place.

More changes are needed to fully implement this approach if it is suitable.

lewis-revill planned changes to this revision.Nov 8 2018, 5:36 AM

An issue has occured where the AsmPrinter cannot emit symbols for the new MBB if the parent MBB (and therefore our new MBB) does not correspond to an LLVM IR BasicBlock. Looking into whether this can be fixed.

I fixed the segfault caused when the AsmPrinter attempts to use the BasicBlock to emit symbols which doesn't necessarily exist. This only occurred when symbols were being emitted for an MBB which had AddressTaken set to true, while the modifications for LabelMustBeEmitted allow a label to be emitted without the need for this.

Deduplicate switch statements and error on an unsupported code model. Added a wrapper for loading the PC-relative address, which is expanded in RISCVExpandPseudoInsts. This approach is a compromise between splitting up into auipc and addi too early and blocking optimisations due to an additional basic block and splitting up too late and losing the chance to optimize the black box pseudo instruction.

Why do we need a new wrapper? Can't we re-use PseudoLLA (with isAsmParserOnly modified to 0 of course), as this is precisely an lla?

Deduplicate switch statements and error on an unsupported code model. Added a wrapper for loading the PC-relative address, which is expanded in RISCVExpandPseudoInsts. This approach is a compromise between splitting up into auipc and addi too early and blocking optimisations due to an additional basic block and splitting up too late and losing the chance to optimize the black box pseudo instruction.

Why do we need a new wrapper? Can't we re-use PseudoLLA (with isAsmParserOnly modified to 0 of course), as this is precisely an lla?

That's a good point. I don't think PseudoLLA is a precise match? It fits better with PseudoLA (IE load global address) which isn't implemented yet. It would be an interesting approach.

jrtc27 added a comment.EditedNov 20 2018, 10:05 AM

Deduplicate switch statements and error on an unsupported code model. Added a wrapper for loading the PC-relative address, which is expanded in RISCVExpandPseudoInsts. This approach is a compromise between splitting up into auipc and addi too early and blocking optimisations due to an additional basic block and splitting up too late and losing the chance to optimize the black box pseudo instruction.

Why do we need a new wrapper? Can't we re-use PseudoLLA (with isAsmParserOnly modified to 0 of course), as this is precisely an lla?

That's a good point. I don't think PseudoLLA is a precise match? It fits better with PseudoLA (IE load global address) which isn't implemented yet. It would be an interesting approach.

It depends; la will get you GOT-relative for PIC, whereas lla gets you PC-relative. Everything would end up using PseudoLLA except for global addresses where shouldAssumeDSOLocal returns false, but you could leave that as a TODO.

In my downstream I have a single RISCVISD::WRAPPER_PIC and then I use a target flag to remember whether later we want just PCREL (lla) or GOT based on what shouldAssumeDSOLocal returns.

But I agree that we can do instead what @jrtc27 suggests and then we can add a PseudoLA later (or something that explicitly means "via GOT" because a "la" with -fno-PIC is just a lla),

In my downstream I have a single RISCVISD::WRAPPER_PIC and then I use a target flag to remember whether later we want just PCREL (lla) or GOT based on what shouldAssumeDSOLocal returns.

But I agree that we can do instead what @jrtc27 suggests and then we can add a PseudoLA later (or something that explicitly means "via GOT" because a "la" with -fno-PIC is just a lla),

I don't see why -fno-PIC is relevant. If we have -fno-PIC we *want* whatever we pick to be an lla, so using PseudoLA for anything non-local is precisely what we want.

I've tried re-using PseudoLLA, but I just cannot get around the problem of expanding it in the MC layer. If I try expanding it in RISCVMCCodeEmitter there is no way to get/create an appropriate expression to use for the %pcrel_lo relocation. It would be nice if it was possible to create a <.text+offset> expression for the AUIPC instruction but I just don't know how. Otherwise I have also tried splitting up the instruction earlier, but the AUIPC/ADDI get split up too often to make it feasible. Also there's no real benefit because RISCVMergeBaseOffset cannot work on the %pcrel_lo base symbols without a great deal of modification. @rogfer01 what do you do differently to this patch for the PC-relative case?

Rebased on top of master, and added entry to getInstSizeInBits for PseudoAddrPCRel.

jrtc27 added a comment.Dec 4 2018, 6:51 PM

I've tried re-using PseudoLLA, but I just cannot get around the problem of expanding it in the MC layer. If I try expanding it in RISCVMCCodeEmitter there is no way to get/create an appropriate expression to use for the %pcrel_lo relocation. It would be nice if it was possible to create a <.text+offset> expression for the AUIPC instruction but I just don't know how. Otherwise I have also tried splitting up the instruction earlier, but the AUIPC/ADDI get split up too often to make it feasible. Also there's no real benefit because RISCVMergeBaseOffset cannot work on the %pcrel_lo base symbols without a great deal of modification. @rogfer01 what do you do differently to this patch for the PC-relative case?

Why can't it be expanded in RISCVExpandPseudo? If it works for a new PseudoAddrPCRel, it works if you instead use PseudoLLA. All you need is a find/replace of PseudoAddrPCRel with PseudoLLA (and of course removing all the definitions of PseudoAddrPCRel).

jrtc27 added a comment.Dec 4 2018, 6:58 PM

Rebased on top of master, and added entry to getInstSizeInBits for PseudoAddrPCRel.

[You mean getInstSizeInBytes]

@asb: We run the BranchRelaxation pass *before* RISCVExpandPseudos (addPreEmitPass vs addPreEmitPass2), but the atomics pseudos don't have a size set, nor are they special-cased in getInstSizeInBytes, so will BranchRelaxtion not think they are 0 bytes? Can we just run BranchRelaxation later, or do we need to declare the right size for everything (ugh)?

jrtc27 added inline comments.Dec 5 2018, 2:22 AM
lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
86

This should come at the end of the file just after expandAtomicCmpXchg to match the order in which they're declared above (and once using PseudoLLA instead, be called expandLoadLocalAddress or similar).

I've tried re-using PseudoLLA, but I just cannot get around the problem of expanding it in the MC layer. If I try expanding it in RISCVMCCodeEmitter there is no way to get/create an appropriate expression to use for the %pcrel_lo relocation. It would be nice if it was possible to create a <.text+offset> expression for the AUIPC instruction but I just don't know how. Otherwise I have also tried splitting up the instruction earlier, but the AUIPC/ADDI get split up too often to make it feasible. Also there's no real benefit because RISCVMergeBaseOffset cannot work on the %pcrel_lo base symbols without a great deal of modification. @rogfer01 what do you do differently to this patch for the PC-relative case?

Why can't it be expanded in RISCVExpandPseudo? If it works for a new PseudoAddrPCRel, it works if you instead use PseudoLLA. All you need is a find/replace of PseudoAddrPCRel with PseudoLLA (and of course removing all the definitions of PseudoAddrPCRel).

Ah I see now. What I thought was meant by reusing PseudoLLA was to deduplicate code with a single expansion of the instruction in the MC layer, rather than just to remove this extra pseudo. So if I understand correctly we'd have a PseudoLLA expansion in codegen (RISCVExpandPseudo) and an expansion in the MC layer (RISCVAsmParser)?

My only problem with that approach is that it seems wrong to expand PseudoLLA the same way I am expanding PseudoAddrPCRel, IE allowing the AUIPC operand to be decided by codegen.

Rearranged RISCVExpandPseudoInsts.cpp

lewis-revill marked an inline comment as done.Dec 5 2018, 2:53 PM

My only problem with that approach is that it seems wrong to expand PseudoLLA the same way I am expanding PseudoAddrPCRel, IE allowing the AUIPC operand to be decided by codegen.

I'm not sure to follow here.

I think @jrtc27 means that, instead of adding a new PseudoAddrPCRel and select it from a WrapperPCRel, we could select PseudoLLA and then expand it in RISCVExpandPseudoInsts.

That said, I presume at some point we will want to add codegen for GOT-addressing. We have a few options here but if we reuse the WrapperPCRel and we use a different target flag to record that this is a GOT-relocation, then the expansion of PseudoLLA in the codegen flow and the asm parser flow will be different (the latter always doing %pcrel_hi while the former might be able to do both %pcrel_hi / %got_pcrel_hi). This does not seem ideal to me. Adding another target-specific DAG (e.g. WrapperGOTRel) node is workable but feels unnecessary.

I'd lean towards having PseudoAddrPCRel because it avoids conflating the two entities due to having slightly different treatment (I'm aware in an earlier comment of mine I said using PseudoLLA made sense but now I'm less sure).

Maybe I'm misunderstanding the issue here.

lib/Target/RISCV/RISCVISelLowering.cpp
335

Can these helper functions be made static?

jrtc27 added a comment.Dec 6 2018, 5:27 AM

My only problem with that approach is that it seems wrong to expand PseudoLLA the same way I am expanding PseudoAddrPCRel, IE allowing the AUIPC operand to be decided by codegen.

I'm not sure to follow here.

I think @jrtc27 means that, instead of adding a new PseudoAddrPCRel and select it from a WrapperPCRel, we could select PseudoLLA and then expand it in RISCVExpandPseudoInsts.

That said, I presume at some point we will want to add codegen for GOT-addressing. We have a few options here but if we reuse the WrapperPCRel and we use a different target flag to record that this is a GOT-relocation, then the expansion of PseudoLLA in the codegen flow and the asm parser flow will be different (the latter always doing %pcrel_hi while the former might be able to do both %pcrel_hi / %got_pcrel_hi). This does not seem ideal to me. Adding another target-specific DAG (e.g. WrapperGOTRel) node is workable but feels unnecessary.

But we wouldn't be using PseudoLLA, we'd be using PseudoLA for things that aren't assumed DSO-local. The CodeGen expansion for both would be identical to the AsmParser (well, except that currently CodeGen is operating on MBBs, but it's equivalent).

So my problem is that (if we expand PseudoLLA in codegen in the same way as we do in RISCVAsmParser) then we lose some flexibility for later addressing uses. Currently I can use the wrapper to do (WrapperPCRel %pcrel_hi(sym)), (WrapperPCRel %got_pcrel_hi(sym)), (WrapperPCRel %tls_ie_pcrel_hi(sym)) and (WrapperPCRel %tls_gd_pcrel_hi(sym)), whereas with PseudoLLA I am limited to just %pcrel_hi addressing. The addition of PseudoLA would help greatly, but only for PIC, not for TLS, meaning another wrapper would be required anyway. Maybe this is an acceptable approach to use PseudoLLA and PseudoLA where it fits then add a wrapper later?

lewis-revill marked 2 inline comments as done.Dec 6 2018, 10:44 AM
lewis-revill added inline comments.
lib/Target/RISCV/RISCVISelLowering.cpp
335

They certainly can, thanks!

lewis-revill marked an inline comment as done.

Rebased with dependency and marked helper functions as static.

jrtc27 added a comment.Dec 6 2018, 3:25 PM

So my problem is that (if we expand PseudoLLA in codegen in the same way as we do in RISCVAsmParser) then we lose some flexibility for later addressing uses. Currently I can use the wrapper to do (WrapperPCRel %pcrel_hi(sym)), (WrapperPCRel %got_pcrel_hi(sym)), (WrapperPCRel %tls_ie_pcrel_hi(sym)) and (WrapperPCRel %tls_gd_pcrel_hi(sym)), whereas with PseudoLLA I am limited to just %pcrel_hi addressing. The addition of PseudoLA would help greatly, but only for PIC, not for TLS, meaning another wrapper would be required anyway. Maybe this is an acceptable approach to use PseudoLLA and PseudoLA where it fits then add a wrapper later?

For TLS you'd then use the yet-to-be-implemented PseudoLA_TLS_GD and PseudoLA_TLS_IE (or whatever they end up being called) representing the la.tls.gd and la.tls.ie macros/pseudo-instructions. Anything you could want to use WrapperPCRel will need a corresponding PseudoFOO to exist for the assembly parser, so I still fail to see why we would need this extra wrapper. I really like having the exact correspondence between what CodeGen produces and what the equivalent hand-written assembly would be parsed to.

So my problem is that (if we expand PseudoLLA in codegen in the same way as we do in RISCVAsmParser) then we lose some flexibility for later addressing uses. Currently I can use the wrapper to do (WrapperPCRel %pcrel_hi(sym)), (WrapperPCRel %got_pcrel_hi(sym)), (WrapperPCRel %tls_ie_pcrel_hi(sym)) and (WrapperPCRel %tls_gd_pcrel_hi(sym)), whereas with PseudoLLA I am limited to just %pcrel_hi addressing. The addition of PseudoLA would help greatly, but only for PIC, not for TLS, meaning another wrapper would be required anyway. Maybe this is an acceptable approach to use PseudoLLA and PseudoLA where it fits then add a wrapper later?

For TLS you'd then use the yet-to-be-implemented PseudoLA_TLS_GD and PseudoLA_TLS_IE (or whatever they end up being called) representing the la.tls.gd and la.tls.ie macros/pseudo-instructions. Anything you could want to use WrapperPCRel will need a corresponding PseudoFOO to exist for the assembly parser, so I still fail to see why we would need this extra wrapper. I really like having the exact correspondence between what CodeGen produces and what the equivalent hand-written assembly would be parsed to.

Thanks, this clears things up for me. I understand why that would be the better approach now. I did see that those pseudo instructions are what GCC produces but since they weren't part of the RISCV specifications I didn't think about using them in LLVM. I'll try to make these changes and add patches where necessary.

jrtc27 added a comment.Dec 6 2018, 4:56 PM

So my problem is that (if we expand PseudoLLA in codegen in the same way as we do in RISCVAsmParser) then we lose some flexibility for later addressing uses. Currently I can use the wrapper to do (WrapperPCRel %pcrel_hi(sym)), (WrapperPCRel %got_pcrel_hi(sym)), (WrapperPCRel %tls_ie_pcrel_hi(sym)) and (WrapperPCRel %tls_gd_pcrel_hi(sym)), whereas with PseudoLLA I am limited to just %pcrel_hi addressing. The addition of PseudoLA would help greatly, but only for PIC, not for TLS, meaning another wrapper would be required anyway. Maybe this is an acceptable approach to use PseudoLLA and PseudoLA where it fits then add a wrapper later?

For TLS you'd then use the yet-to-be-implemented PseudoLA_TLS_GD and PseudoLA_TLS_IE (or whatever they end up being called) representing the la.tls.gd and la.tls.ie macros/pseudo-instructions. Anything you could want to use WrapperPCRel will need a corresponding PseudoFOO to exist for the assembly parser, so I still fail to see why we would need this extra wrapper. I really like having the exact correspondence between what CodeGen produces and what the equivalent hand-written assembly would be parsed to.

Thanks, this clears things up for me. I understand why that would be the better approach now. I did see that those pseudo instructions are what GCC produces but since they weren't part of the RISCV specifications I didn't think about using them in LLVM. I'll try to make these changes and add patches where necessary.

Glad things are clear and we're now in agreement! They're specified in the psABI document, but not the assembly manual; go figure. I may try and tidy things up a bit tomorrow on the documentation front...

lewis-revill planned changes to this revision.Dec 10 2018, 4:36 AM

Rebasing and modifying this patch to use and expand PseudoLLA instead of introducing a new wrapper.

Rebased and updated to use and expand PseudoLLA for PC-relative addressing.

Seems fine from my point of view but we'll wait to hear from the others. I'd suggest removing the "WIP" from the subject to reflect the true state.

Oh, probably also worth mentioning the small <-> medlow and medium <-> medany GCC code model mapping in the commit message, and probably a comment too in the source code of getAddr.

lewis-revill retitled this revision from [WIP, RISCV] Generate address sequences suitable for mcmodel=medium to [RISCV] Generate address sequences suitable for mcmodel=medium.
lewis-revill edited the summary of this revision. (Show Details)

Remove unnecessary Flags operand.

jrtc27 added inline comments.Dec 13 2018, 5:05 AM
lib/Target/RISCV/RISCVISelLowering.cpp
371

Minor: comment should say something like %pcrel_lo(auipc) to be accurate.

Rebased and updated to use and expand PseudoLLA for PC-relative addressing.

Ah, now I understand! I always forget one can create target nodes here :) Definitely using PseudoLLA is more straightforward.

Overall the approach looks sensible to me: if RISCVExpandPseudoInsts.cpp is late enough for subword compare-and-exchange pseudos (which have to be carefully emitted, AFAIU) it should be too for the expansion of PseudoLLA (and PseudoLA).

Rebased and updated comment.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2019, 2:01 AM
jrtc27 added a comment.Feb 5 2019, 9:25 AM

Hi Lewis, could you please rebase this after the RV64A patch landed and added PseudoMaskedCmpXchg64 to RISCVExpandPseudoInsts (conflicts with adding PseudoLLA)?

asb accepted this revision.EditedApr 1 2019, 7:33 AM

LGTM, thanks. I agree that later adding PseudoLA_TLS_GD, PseudoLA_TLS_IE etc as necessary is the right path forwards.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 1 2019, 7:42 AM
This revision was automatically updated to reflect the committed changes.