This is an archive of the discontinued LLVM Phabricator instance.

Differential D54143

[RISCV] Generate address sequences suitable for mcmodel=medium
ClosedPublic

Authored by lewis-revill on Nov 6 2018, 2:49 AM.

Details

Reviewers

asb
jrtc27

Commits

rGda20f5ca7452: [RISCV] Generate address sequences suitable for mcmodel=medium
rL357393: [RISCV] Generate address sequences suitable for mcmodel=medium

Summary

This patch adds an implementation of a PC-relative addressing sequence to be used when -mcmodel=medium is specified. With absolute addressing, a 'medium' codemodel may cause addresses to be out of range. This is because while 'medium' implies a 2 GiB addressing range, this 2 GiB can be at any offset as opposed to 'small', which implies the first 2 GiB only.

Note that LLVM/Clang currently specifies code models differently to GCC, where small and medium imply the same functionality as GCC's medlow and medany respectively.

Diff Detail

Repository: rL LLVM

Event Timeline

lewis-revill created this revision. Nov 6 2018, 2:49 AM

Herald added subscribers: llvm-commits, jocewei, PkmX and 16 others. Nov 6 2018, 2:49 AM

jrtc27 requested changes to this revision. Nov 6 2018, 3:05 AM

jrtc27 added inline comments.

lib/Target/RISCV/RISCVISelLowering.cpp
376	This needs to be an `MO_PCREL_LO`, surely? (and then modified to refer to the `auipc` rather than the symbol...)

This revision now requires changes to proceed. Nov 6 2018, 3:05 AM

It's also worth noting that GCC distinguishes between these two models by calling them medlow and medany, whereas small (what should be medlow) vs medium (ie medany) is somewhat misleading, but the SPARC backend seems to already set a precedent for this divergence (medlow -> small, medmid -> medium ie 44-bit absolute, medany -> large ie 64-bit absolute).

Now, there's an argument to be made whether our medium should continue to use absolute addressing on RV32I. The %hi/%lo pair already gives a 32-bit absolute address (signed, but that only poses a potential issue on RV64I), so we don't gain anything from this sequence. However, on RV64I, if our medium does mean medany, it will need to use PC-relative addressing, so it makes sense to have RV32I match this behaviour (and indeed, GCC also does this).
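The range argument above can be sketched in a few lines of C++. This is only an illustration, not code from the patch: `fitsHiLoPair` is a hypothetical helper, and it assumes the usual split of a value into a signed 20-bit upper immediate (as consumed by `lui` or `auipc`) plus a signed 12-bit `addi` immediate. For absolute addressing the value is the address itself; for PC-relative addressing it is the distance from the `auipc`.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): returns true if `value` can be
// formed from a signed 20-bit upper immediate shifted left by 12 plus a
// signed 12-bit low immediate, i.e. by a lui/auipc + addi pair.
constexpr bool fitsHiLoPair(int64_t value) {
  // Adding 0x800 rounds to the nearest 4 KiB multiple, so the remaining
  // low part lands in [-2048, 2047].
  const int64_t t = value + 0x800;
  // Floor division by 4096 that does not rely on shifting negative values.
  const int64_t hi20 = t >= 0 ? t / 4096 : -((-t + 4095) / 4096);
  return hi20 >= -(int64_t{1} << 19) && hi20 <= (int64_t{1} << 19) - 1;
}

// A symbol just above 4 GiB is out of reach for an absolute %hi/%lo pair on
// a 64-bit target, but in reach when addressed relative to a nearby PC.
constexpr int64_t kSymbol = 0x100000072;
constexpr int64_t kPC = 0xFFFFF048;
static_assert(!fitsHiLoPair(kSymbol), "absolute addressing cannot reach it");
static_assert(fitsHiLoPair(kSymbol - kPC), "PC-relative addressing can");
```

On a 32-bit target every address fits the absolute pair, which is the sense in which the PC-relative sequence gains nothing there.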

lewis-revill added inline comments. Nov 6 2018, 3:36 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	I was under the impression that only the `auipc` symbol needs to be PC-relative, since (1) the upper bits are the bits we need to prevent from overflowing, and (2) the operation of `auipc` is to add the upper 20 bits of the symbol relative to the program counter to the program counter itself, putting the result into the destination register. Essentially we've done the same operation as an `lui`, but via the PC. So the `addi` should be the same? Also, more importantly, I tried that and got a 'relocation truncated to fit'... I don't quite understand what you mean by 'modified to refer to the `auipc` rather than the symbol'? Basically the first two SDValues are just building the `%pcrel_hi(sym)` & `%lo(sym)` expressions to be used in the actual sequence.

In D54143#1288531, @jrtc27 wrote:

It's also worth noting that GCC distinguishes between these two models by calling them medlow and medany, whereas small (what should be medlow) vs medium (ie medany) is somewhat misleading, but the SPARC backend seems to already set a precedent for this divergence (medlow -> small, medmid -> medium ie 44-bit absolute, medany -> large ie 64-bit absolute).

Now, there's an argument to be made whether our medium should continue to use absolute addressing on RV32I. The %hi/%lo pair already gives a 32-bit absolute address (signed, but that only poses a potential issue on RV64I), so we don't gain anything from this sequence. However, on RV64I, if our medium does mean medany, it will need to use PC-relative addressing, so it makes sense to have RV32I match this behaviour (and indeed, GCC also does this).

I was looking at the old GCC specifications for the code model, when they did use small/medium/large which I presume clang copied (quite hard to find actual documentation on this). Your comment about RV32I makes sense, we literally gain nothing if the PC is 32 bits. The idea of this patch was to enable programs with a 64-bit address space to be compiled correctly on RV64 (assuming they use 'medium').

jrtc27 added inline comments. Nov 6 2018, 3:52 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	No, because you still have the lower bits of the program counter added to the `%lo` value. What you're trying to do is use `auipc` to get the 4k-page and then set the lower bits to the offset in the page, but that needs an extra mask if you want to do that, which is pointless when you can add the `%pcrel_lo` instead and get the right result. Example:

    0x10048: auipc a0, %pcrel_hi(X)
    0x1004c: addi a1, a0, %lo(X)

where X is at address 0x20072.
%pcrel_hi(X) at 0x48 will give us upper 20 bits of 0x20072-0x10048=0x1002a, ie 0x10000
Thus a0 gets set to PC+0x10000 = 0x20048
%lo(X) will give us 0x20072&0xFFF = 0x72
Thus a1 gets set to a0+0x72 = 0x200ba, not 0x20072, because we had the extra 0x48 from the low bits of PC at the time of the auipc.
376	As for 'modified to refer to the auipc rather than the symbol', `%pcrel_lo` is unusual on RISC-V. You would think that you would write something like the following (the `+4` is of course extremely important):

    .L0: auipc a0, %pcrel_hi(X)
         addi a1, a0, %pcrel_lo(X+4)

But in fact that's not how you do it. On RISC-V, the pair of instructions are tied together so the linker knows which `auipc` the later `addi` is consuming, and you instead do this:

    .L0: auipc a0, %pcrel_hi(X)
         addi a1, a0, %pcrel_lo(.L0)

Note how the symbol for the `%pcrel_lo` becomes the label for the `auipc` rather than the symbol itself. This indirection is eventually removed by the linker once it has done all relaxations, and will turn into the low 12 bits of `X-.L0` ie what is normally `%pcrel_lo(X+4)`.
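The arithmetic in the worked example above (X at 0x20072, `auipc` at 0x10048) can be checked with a small C++ sketch. The names `splitOffset` and `auipc` are hypothetical helpers written for this illustration; they model the immediates only and are not taken from the patch or from the linker.

```cpp
#include <cstdint>

struct HiLo {
  int64_t hi20; // immediate consumed by auipc
  int64_t lo12; // immediate consumed by the following addi
};

// Split a PC-relative byte offset so that (hi20 << 12) + lo12 == offset,
// with lo12 in [-2048, 2047]. The +0x800 compensates for addi sign-extending
// its 12-bit immediate.
constexpr HiLo splitOffset(int64_t offset) {
  const int64_t t = offset + 0x800;
  const int64_t hi20 = t >= 0 ? t / 4096 : -((-t + 4095) / 4096);
  return {hi20, offset - hi20 * 4096};
}

// Model of `auipc rd, imm`: rd = pc + (imm << 12).
constexpr int64_t auipc(int64_t pc, int64_t hi20) { return pc + hi20 * 4096; }

constexpr int64_t kPC = 0x10048; // address of the auipc (.L0)
constexpr int64_t kX = 0x20072;  // address of the symbol
constexpr HiLo kParts = splitOffset(kX - kPC);
constexpr int64_t kA0 = auipc(kPC, kParts.hi20);

static_assert(kA0 == 0x20048, "auipc keeps the low bits of the PC");
// Mixing in the absolute low bits of X double-counts the 0x48 from the PC.
static_assert(kA0 + (kX & 0xFFF) == 0x200ba, "auipc + %lo(X) is wrong");
// Using the low 12 bits of X - .L0 instead lands exactly on X.
static_assert(kA0 + kParts.lo12 == kX, "auipc + %pcrel_lo(.L0) is right");
```

The last assertion is the reason the `%pcrel_lo` operand has to name the `auipc` label: the low part depends on where the `auipc` sits, not only on the symbol.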

lewis-revill added inline comments. Nov 6 2018, 4:09 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	Thank you for the explanation, I see what you mean now. Now I have to figure out how to implement that.

jrtc27 added inline comments. Nov 6 2018, 4:28 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	Look at the expansion of `lla` in the assembler, that's exactly what you're trying to do here.

A few other points:

We should be defaulting to the small code model to match GCC (which defaults to medlow)
Other backends only support a subset of the code models; we should only allow small and medium, anything else is an error (look at other backends for how they do it).
The switch on the code model is identical across all three functions and so can be de-duplicated.
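The last two points can be illustrated with a hedged C++ sketch: one shared helper maps the code model to an addressing sequence and rejects anything else. The enums and `selectAddressSequence` are hypothetical stand-ins invented for this example; the real lowering code uses LLVM's own code model type and error reporting, which are not reproduced here.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for the code model and the two sequences discussed.
enum class CodeModel { Small, Medium, Large };
enum class AddressSequence { AbsoluteHiLo, PCRelative };

// Single place that decides how an address is materialised, so that every
// caller (global, block address, constant pool) shares one switch.
AddressSequence selectAddressSequence(CodeModel CM) {
  switch (CM) {
  case CodeModel::Small: // corresponds to GCC's medlow
    return AddressSequence::AbsoluteHiLo;
  case CodeModel::Medium: // corresponds to GCC's medany
    return AddressSequence::PCRelative;
  default:
    // This sketch throws; a compiler backend would report a fatal error.
    throw std::invalid_argument("unsupported code model for lowering");
  }
}
```

Keeping the decision in one function means a newly supported model only needs one extra case.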

rogfer01 added a subscriber: efriedma. Nov 6 2018, 6:11 AM

rogfer01 added inline comments.

lib/Target/RISCV/RISCVISelLowering.cpp
376	In an earlier attempt to implement local PIC addressing I created a basic block to get the address https://reviews.llvm.org/D50634 but this may impact negatively later optimisations that work at a basic-block unit. @efriedma suggested to add an id to the `auipc` instead. My approach used a pseudo instruction so what one needs in my case is to make sure the label is emitted by `AsmPrinter`.

Under that approach (in case you want to consider this approach) it looks to me we need to extend first `MachineInstr` to be able to represent that id (e.g. zero if no id). `FunctionLoweringInfo` could be the provider of these unique ids within a function via `SelectionDAG`. Then teach `AsmPrinter` to emit a label when there is an id for a given instruction. Then we need to link the user of the id (in your case `addi`). That part is less clear to me but I'll be bold here and I believe `SelectionDAG::getTargetIndex` is enough here (with a MII flag stating that this is a pcrel_lo).

Unfortunately I could not progress on this, yet. Feel free to investigate this avenue for feasibility. Perhaps there is a simpler approach in your case. (To the best of my knowledge, this way of referencing an earlier instruction for symbol relocations is a bit of a unique thing of RISC-V so I think we're in uncharted land within LLVM)

Hope this helps. Regards,

lewis-revill added inline comments. Nov 6 2018, 8:01 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	Thank you very much for the background, I should have looked this up before.. I think a pseudo instruction is the right approach, but it should probably be expanded as late as possible? I'm testing whether it would be worth modifying `RISCVExpandPseudoInsts` to cater for operations other than just atomics. Maybe then we could still go with the basic block approach since most optimisations will be done already.

Adding a pseudo instruction for PC-relative addressing to expand later and addressing the issues raised by @jrtc27

jrtc27 added inline comments. Nov 6 2018, 8:31 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	The latest point to do it is in `RISCVAsmPrinter` for the pseudo approach, as far as I am aware (and that's what I've been doing for my locally hacked up LLVM). I get uneasy with doing it earlier (ie by using a basic block and expanding somewhere like `RISCVExpandPseudoInsts`) in case instructions get moved around and the MBB no longer has the `auipc` as its first instruction, but if there are guarantees that that doesn't happen then that seems fine (although I fail to see the benefit from expanding such a simple instruction then, when doing it in `RISCVAsmPrinter` is almost exactly the same as what `RISCVAsmParser` is doing). Others with deeper LLVM-internals knowledge may be able to offer better advice here, but my approach does work.

lewis-revill added inline comments. Nov 6 2018, 9:25 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	The only problem with expanding in `RISCVAsmPrinter` that I can think of is that we would miss the chance to use the `RISCVMergeBaseOffset` pass, which I was planning to modify for this patch as well. Maybe there is a way around that. It would certainly make things easier.

rogfer01 added inline comments. Nov 6 2018, 9:47 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	@jrtc27: in my case, for PIC addressing I was a bit reluctant to late-expand the sequences, which I do in my downstream LLVM too, but I think we could do better in upstream: I presume there may be circumstances where due to scheduling we may want to move instructions between the address-forming instructions. My understanding is that a pseudo-instruction that is expanded late is a bit like a black box and would prevent that. I felt a bit uneasy to dismiss the flexibility that the existing linker mechanism provides. That said perhaps I've got a storm in a teacup and there are no realistic circumstances where this is going to be profitable, so a late expanded pseudo is actually fine. Perhaps this mechanism of relocating to an instruction that has the actual relocation is more suited to the linker and there is no expectation for the compiler to emit non-contiguous sequences. Not sure, to be honest. Regards,

jrtc27 added inline comments. Nov 6 2018, 11:53 AM

lib/Target/RISCV/RISCVISelLowering.cpp
376	Yeah, it's possible that you'd want to split them up, but as you say given the simplicity of the operations involved it seems unlikely to be necessary, but I'm not an expert on these things. So, in my opinion, we should either be doing the late expansion in `RISCVAsmPrinter`, or we go all the way and generate separate linked instructions with a labelled `auipc` straight away, with no pseudos involved; anything in between seems like a half-hearted waste of time. It's worth noting that GCC will do exactly what LLVM does for 32-bit absolute addressing (separate `%hi`/`%lo` instructions, with the `addi ..., %lo(sym)` merged with the load if you're not just taking the address), but for anything else it will just emit `la` or `lla` (depending on the symbol's localness), not even expanding the sequence and leaving it up to GNU as, essentially like our simple expansion solution.

Deduplicate switch statements and error on an unsupported code model. Added a wrapper for loading the PC-relative address, which is expanded in RISCVExpandPseudoInsts. This approach is a compromise between splitting up into auipc and addi too early (blocking optimisations due to an additional basic block) and splitting up too late (losing the chance to optimize the black box pseudo instruction).

By expanding at this point there is still the opportunity to do base-offset merges (when implemented) and other late machine-function passes, but the majority of larger optimisations such as outlining will have already taken place.

More changes are needed to fully implement this approach if it is suitable.

lewis-revill updated this revision to Diff 172920. Nov 7 2018, 4:05 AM

An issue has occurred where the AsmPrinter cannot emit symbols for the new MBB if the parent MBB (and therefore our new MBB) does not correspond to an LLVM IR BasicBlock. Looking into whether this can be fixed.

I fixed the segfault caused when the AsmPrinter attempts to use the BasicBlock to emit symbols which doesn't necessarily exist. This only occurred when symbols were being emitted for an MBB which had AddressTaken set to true, while the modifications for LabelMustBeEmitted allow a label to be emitted without the need for this.

In D54143#1289972, @lewis-revill wrote:

Deduplicate switch statements and error on an unsupported code model. Added a wrapper for loading the PC-relative address, which is expanded in RISCVExpandPseudoInsts. This approach is a compromise between splitting up into auipc and addi too early (blocking optimisations due to an additional basic block) and splitting up too late (losing the chance to optimize the black box pseudo instruction).

Why do we need a new wrapper? Can't we re-use PseudoLLA (with isAsmParserOnly modified to 0 of course), as this is precisely an lla?

In D54143#1304167, @jrtc27 wrote:

Why do we need a new wrapper? Can't we re-use PseudoLLA (with isAsmParserOnly modified to 0 of course), as this is precisely an lla?

That's a good point. I don't think PseudoLLA is a precise match? It fits better with PseudoLA (IE load global address) which isn't implemented yet. It would be an interesting approach.

In D54143#1304219, @lewis-revill wrote:

That's a good point. I don't think PseudoLLA is a precise match? It fits better with PseudoLA (IE load global address) which isn't implemented yet. It would be an interesting approach.

It depends; la will get you GOT-relative for PIC, whereas lla gets you PC-relative. Everything would end up using PseudoLLA except for global addresses where shouldAssumeDSOLocal returns false, but you could leave that as a TODO.

In my downstream I have a single RISCVISD::WRAPPER_PIC and then I use a target flag to remember whether later we want just PCREL (lla) or GOT based on what shouldAssumeDSOLocal returns.

But I agree that we can do instead what @jrtc27 suggests and then we can add a PseudoLA later (or something that explicitly means "via GOT" because a "la" with -fno-PIC is just a lla).

In D54143#1305469, @rogfer01 wrote:

In my downstream I have a single RISCVISD::WRAPPER_PIC and then I use a target flag to remember whether later we want just PCREL (lla) or GOT based on what shouldAssumeDSOLocal returns.

But I agree that we can do instead what @jrtc27 suggests and then we can add a PseudoLA later (or something that explicitly means "via GOT" because a "la" with -fno-PIC is just a lla),

I don't see why -fno-PIC is relevant. If we have -fno-PIC we *want* whatever we pick to be an lla, so using PseudoLA for anything non-local is precisely what we want.

I've tried re-using PseudoLLA, but I just cannot get around the problem of expanding it in the MC layer. If I try expanding it in RISCVMCCodeEmitter there is no way to get/create an appropriate expression to use for the %pcrel_lo relocation. It would be nice if it was possible to create a <.text+offset> expression for the AUIPC instruction but I just don't know how. Otherwise I have also tried splitting up the instruction earlier, but the AUIPC/ADDI get split up too often to make it feasible. Also there's no real benefit because RISCVMergeBaseOffset cannot work on the %pcrel_lo base symbols without a great deal of modification. @rogfer01 what do you do differently to this patch for the PC-relative case?

Rebased on top of master, and added entry to getInstSizeInBits for PseudoAddrPCRel.

lewis-revill mentioned this in D55303: [RISCV] Add lowering of addressing sequences for PIC. Dec 4 2018, 5:08 PM

lewis-revill added a child revision: D55303: [RISCV] Add lowering of addressing sequences for PIC. Dec 4 2018, 5:16 PM

In D54143#1319494, @lewis-revill wrote:

I've tried re-using PseudoLLA, but I just cannot get around the problem of expanding it in the MC layer. If I try expanding it in RISCVMCCodeEmitter there is no way to get/create an appropriate expression to use for the %pcrel_lo relocation. It would be nice if it was possible to create a <.text+offset> expression for the AUIPC instruction but I just don't know how. Otherwise I have also tried splitting up the instruction earlier, but the AUIPC/ADDI get split up too often to make it feasible. Also there's no real benefit because RISCVMergeBaseOffset cannot work on the %pcrel_lo base symbols without a great deal of modification. @rogfer01 what do you do differently to this patch for the PC-relative case?

Why can't it be expanded in RISCVExpandPseudo? If it works for a new PseudoAddrPCRel, it works if you instead use PseudoLLA. All you need is a find/replace of PseudoAddrPCRel with PseudoLLA (and of course removing all the definitions of PseudoAddrPCRel).

In D54143#1319527, @lewis-revill wrote:

Rebased on top of master, and added entry to getInstSizeInBits for PseudoAddrPCRel.

[You mean getInstSizeInBytes]

@asb: We run the BranchRelaxation pass *before* RISCVExpandPseudos (addPreEmitPass vs addPreEmitPass2), but the atomics pseudos don't have a size set, nor are they special-cased in getInstSizeInBytes, so will BranchRelaxation not think they are 0 bytes? Can we just run BranchRelaxation later, or do we need to declare the right size for everything (ugh)?
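The sizing concern can be sketched as follows, purely as an illustration: any pass that measures distances before pseudos are expanded needs to be told their post-expansion size. The `Opcode` enum, `instSizeInBytes` and `blockSizeInBytes` are hypothetical names for this example, and the only sizes assumed are that base instructions occupy 4 bytes and that an `lla`-style pseudo becomes an `auipc` plus an `addi`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opcode subset; not the real RISC-V opcode enumeration.
enum class Opcode { ADDI, AUIPC, PseudoLLA };

// Size each instruction will occupy once every pseudo has been expanded.
constexpr unsigned instSizeInBytes(Opcode Op) {
  switch (Op) {
  case Opcode::PseudoLLA:
    return 8; // expands later into auipc + addi, 4 bytes each
  case Opcode::ADDI:
  case Opcode::AUIPC:
    return 4;
  }
  return 0; // unknown opcodes would be undercounted, which is the hazard
}

// Total size of a straight-line run of instructions, as a pass estimating
// branch distances would compute it.
inline std::size_t blockSizeInBytes(const std::vector<Opcode> &Insts) {
  std::size_t Size = 0;
  for (Opcode Op : Insts)
    Size += instSizeInBytes(Op);
  return Size;
}
```

If a pseudo were reported as 0 bytes, the estimated distance would be too small and a branch could be judged in range when it is not.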

jrtc27 added inline comments. Dec 5 2018, 2:22 AM

lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
86	This should come at the end of the file just after `expandAtomicCmpXchg` to match the order in which they're declared above (and once using `PseudoLLA` instead, be called `expandLoadLocalAddress` or similar).

In D54143#1319641, @jrtc27 wrote:

Why can't it be expanded in RISCVExpandPseudo? If it works for a new PseudoAddrPCRel, it works if you instead use PseudoLLA. All you need is a find/replace of PseudoAddrPCRel with PseudoLLA (and of course removing all the definitions of PseudoAddrPCRel).

Ah I see now. What I thought was meant by reusing PseudoLLA was to deduplicate code with a single expansion of the instruction in the MC layer, rather than just to remove this extra pseudo. So if I understand correctly we'd have a PseudoLLA expansion in codegen (RISCVExpandPseudo) and an expansion in the MC layer (RISCVAsmParser)?

My only problem with that approach is that it seems wrong to expand PseudoLLA the same way I am expanding PseudoAddrPCRel, IE allowing the AUIPC operand to be decided by codegen.

Rearranged RISCVExpandPseudoInsts.cpp

lewis-revill marked an inline comment as done. Dec 5 2018, 2:53 PM

My only problem with that approach is that it seems wrong to expand PseudoLLA the same way I am expanding PseudoAddrPCRel, IE allowing the AUIPC operand to be decided by codegen.

I'm not sure I follow here.

I think @jrtc27 means that, instead of adding a new PseudoAddrPCRel and select it from a WrapperPCRel, we could select PseudoLLA and then expand it in RISCVExpandPseudoInsts.

That said, I presume at some point we will want to add codegen for GOT-addressing. We have a few options here but if we reuse the WrapperPCRel and we use a different target flag to record that this is a GOT-relocation, then the expansion of PseudoLLA in the codegen flow and the asm parser flow will be different (the latter always doing %pcrel_hi while the former might be able to do both %pcrel_hi / %got_pcrel_hi). This does not seem ideal to me. Adding another target-specific DAG (e.g. WrapperGOTRel) node is workable but feels unnecessary.

I'd lean towards having PseudoAddrPCRel because it avoids conflating the two entities due to having slightly different treatment (I'm aware in an earlier comment of mine I said using PseudoLLA made sense but now I'm less sure).

Maybe I'm misunderstanding the issue here.

lib/Target/RISCV/RISCVISelLowering.cpp
335	Can these helper functions be made `static`?

In D54143#1321182, @rogfer01 wrote:

My only problem with that approach is that it seems wrong to expand PseudoLLA the same way I am expanding PseudoAddrPCRel, IE allowing the AUIPC operand to be decided by codegen.

I'm not sure I follow here.

I think @jrtc27 means that, instead of adding a new PseudoAddrPCRel and select it from a WrapperPCRel, we could select PseudoLLA and then expand it in RISCVExpandPseudoInsts.

That said, I presume at some point we will want to add codegen for GOT-addressing. We have a few options here but if we reuse the WrapperPCRel and we use a different target flag to record that this is a GOT-relocation, then the expansion of PseudoLLA in the codegen flow and the asm parser flow will be different (the latter always doing %pcrel_hi while the former might be able to do both %pcrel_hi / %got_pcrel_hi). This does not seem ideal to me. Adding another target-specific DAG (e.g. WrapperGOTRel) node is workable but feels unnecessary.

But we wouldn't be using PseudoLLA, we'd be using PseudoLA for things that aren't assumed DSO-local. The CodeGen expansion for both would be identical to the AsmParser (well, except that currently CodeGen is operating on MBBs, but it's equivalent).

So my problem is that (if we expand PseudoLLA in codegen in the same way as we do in RISCVAsmParser) then we lose some flexibility for later addressing uses. Currently I can use the wrapper to do (WrapperPCRel %pcrel_hi(sym)), (WrapperPCRel %got_pcrel_hi(sym)), (WrapperPCRel %tls_ie_pcrel_hi(sym)) and (WrapperPCRel %tls_gd_pcrel_hi(sym)), whereas with PseudoLLA I am limited to just %pcrel_hi addressing. The addition of PseudoLA would help greatly, but only for PIC, not for TLS, meaning another wrapper would be required anyway. Maybe this is an acceptable approach to use PseudoLLA and PseudoLA where it fits then add a wrapper later?

lewis-revill added a parent revision: D54029: [RISCV] Properly evaluate fixup_riscv_pcrel_lo12. Dec 6 2018, 10:33 AM

lewis-revill marked 2 inline comments as done. Dec 6 2018, 10:44 AM

lewis-revill added inline comments.

lib/Target/RISCV/RISCVISelLowering.cpp
335	They certainly can, thanks!

Rebased with dependency and marked helper functions as static.

In D54143#1321605, @lewis-revill wrote:

So my problem is that (if we expand PseudoLLA in codegen in the same way as we do in RISCVAsmParser) then we lose some flexibility for later addressing uses. Currently I can use the wrapper to do (WrapperPCRel %pcrel_hi(sym)), (WrapperPCRel %got_pcrel_hi(sym)), (WrapperPCRel %tls_ie_pcrel_hi(sym)) and (WrapperPCRel %tls_gd_pcrel_hi(sym)), whereas with PseudoLLA I am limited to just %pcrel_hi addressing. The addition of PseudoLA would help greatly, but only for PIC, not for TLS, meaning another wrapper would be required anyway. Maybe this is an acceptable approach to use PseudoLLA and PseudoLA where it fits then add a wrapper later?

For TLS you'd then use the yet-to-be-implemented PseudoLA_TLS_GD and PseudoLA_TLS_IE (or whatever they end up being called) representing the la.tls.gd and la.tls.ie macros/pseudo-instructions. Anything you could want to use WrapperPCRel will need a corresponding PseudoFOO to exist for the assembly parser, so I still fail to see why we would need this extra wrapper. I really like having the exact correspondence between what CodeGen produces and what the equivalent hand-written assembly would be parsed to.

In D54143#1322155, @jrtc27 wrote:

For TLS you'd then use the yet-to-be-implemented PseudoLA_TLS_GD and PseudoLA_TLS_IE (or whatever they end up being called) representing the la.tls.gd and la.tls.ie macros/pseudo-instructions. Anything you could want to use WrapperPCRel will need a corresponding PseudoFOO to exist for the assembly parser, so I still fail to see why we would need this extra wrapper. I really like having the exact correspondence between what CodeGen produces and what the equivalent hand-written assembly would be parsed to.

Thanks, this clears things up for me. I understand why that would be the better approach now. I did see that those pseudo instructions are what GCC produces but since they weren't part of the RISCV specifications I didn't think about using them in LLVM. I'll try to make these changes and add patches where necessary.

In D54143#1322247, @lewis-revill wrote:

Thanks, this clears things up for me. I understand why that would be the better approach now. I did see that those pseudo instructions are what GCC produces but since they weren't part of the RISCV specifications I didn't think about using them in LLVM. I'll try to make these changes and add patches where necessary.

Glad things are clear and we're now in agreement! They're specified in the psABI document, but not the assembly manual; go figure. I may try and tidy things up a bit tomorrow on the documentation front...

Rebasing and modifying this patch to use and expand PseudoLLA instead of introducing a new wrapper.

Rebased and updated to use and expand PseudoLLA for PC-relative addressing.

Seems fine from my point of view but we'll wait to hear from the others. I'd suggest removing the "WIP" from the subject to reflect the true state.

Oh, probably also worth mentioning the small <-> medlow and medium <-> medany GCC code model mapping in the commit message, and probably a comment too in the source code of getAddr.

Remove unnecessary Flags operand.

Herald added a subscriber: rkruppe. Dec 13 2018, 1:06 AM

jrtc27 added inline comments. Dec 13 2018, 5:05 AM

lib/Target/RISCV/RISCVISelLowering.cpp
371	Minor: comment should say something like `%pcrel_lo(auipc)` to be accurate.

In D54143#1325501, @lewis-revill wrote:

Rebased and updated to use and expand PseudoLLA for PC-relative addressing.

Ah, now I understand! I always forget one can create target nodes here :) Definitely using PseudoLLA is more straightforward.

Overall the approach looks sensible to me: if RISCVExpandPseudoInsts.cpp is late enough for subword compare-and-exchange pseudos (which have to be carefully emitted, AFAIU) it should be too for the expansion of PseudoLLA (and PseudoLA).

Rebased and updated comment.

lewis-revill edited parent revisions, added: D57240: [RISCV] Don't incorrectly force relocation for %pcrel_lo; removed: D54029: [RISCV] Properly evaluate fixup_riscv_pcrel_lo12. Feb 5 2019, 2:01 AM

Herald added a project: Restricted Project. Feb 5 2019, 2:01 AM

jrtc27 added a child revision: D55560: [RISCV] Attach VK_RISCV_CALL to symbols upon creation. Feb 5 2019, 9:21 AM

Hi Lewis, could you please rebase this after the RV64A patch landed and added PseudoMaskedCmpXchg64 to RISCVExpandPseudoInsts (conflicts with adding PseudoLLA)?

Rebased

LGTM, thanks. I agree that later adding PseudoLA_TLS_GD, PseudoLA_TLS_IE etc as necessary is the right path forwards.

Herald added subscribers: benna, psnobl, jdoerfert. · View Herald TranscriptApr 1 2019, 7:33 AM

This revision was not accepted when it landed; it landed in state Needs Review.Apr 1 2019, 7:42 AM

Closed by commit rL357393: [RISCV] Generate address sequences suitable for mcmodel=medium (authored by asb).

This revision was automatically updated to reflect the committed changes.

luismarques mentioned this in D79635: [RISCV] Split the pseudo instruction splitting pass.Jul 1 2020, 10:22 AM

jrtc27 mentioned this in D92097: [RISCV] Basic jump table lowering.Nov 26 2020, 7:19 AM

Revision Contents

Path	Size
include/llvm/CodeGen/MachineBasicBlock.h	11 lines
lib/CodeGen/AsmPrinter/AsmPrinter.cpp	5 lines
lib/Target/RISCV/RISCVExpandPseudoInsts.cpp	45 lines
lib/Target/RISCV/RISCVISelLowering.h	4 lines
lib/Target/RISCV/RISCVISelLowering.cpp	88 lines
lib/Target/RISCV/RISCVInstrInfo.cpp	1 line
lib/Target/RISCV/RISCVMCInstLower.cpp	6 lines
lib/Target/RISCV/Utils/RISCVBaseInfo.h	1 line
test/CodeGen/RISCV/codemodel-lowering.ll	80 lines

Diff 178015

include/llvm/CodeGen/MachineBasicBlock.h

[109 lines not shown]	private:

/// Indicate that this basic block is entered via an exception handler.		/// Indicate that this basic block is entered via an exception handler.
bool IsEHPad = false;		bool IsEHPad = false;

/// Indicate that this basic block is potentially the target of an indirect		/// Indicate that this basic block is potentially the target of an indirect
/// branch.		/// branch.
bool AddressTaken = false;		bool AddressTaken = false;

		/// Indicate that this basic block needs its symbol be emitted regardless of
		/// whether the flow just falls-through to it.
		bool LabelMustBeEmitted = false;

/// Indicate that this basic block is the entry block of an EH scope, i.e.,		/// Indicate that this basic block is the entry block of an EH scope, i.e.,
/// the block that used to have a catchpad or cleanuppad instruction in the		/// the block that used to have a catchpad or cleanuppad instruction in the
/// LLVM IR.		/// LLVM IR.
bool IsEHScopeEntry = false;		bool IsEHScopeEntry = false;

/// Indicate that this basic block is the entry block of an EH funclet.		/// Indicate that this basic block is the entry block of an EH funclet.
bool IsEHFuncletEntry = false;		bool IsEHFuncletEntry = false;

[28 lines not shown]	public:

/// Test whether this block is potentially the target of an indirect branch.		/// Test whether this block is potentially the target of an indirect branch.
bool hasAddressTaken() const { return AddressTaken; }		bool hasAddressTaken() const { return AddressTaken; }

/// Set this block to reflect that it potentially is the target of an indirect		/// Set this block to reflect that it potentially is the target of an indirect
/// branch.		/// branch.
void setHasAddressTaken() { AddressTaken = true; }		void setHasAddressTaken() { AddressTaken = true; }

		/// Test whether this block must have its label emitted.
		bool hasLabelMustBeEmitted() const { return LabelMustBeEmitted; }

		/// Set this block to reflect that, regardless how we flow to it, we need
		/// its label be emitted.
		void setLabelMustBeEmitted() { LabelMustBeEmitted = true; }

/// Return the MachineFunction containing this basic block.		/// Return the MachineFunction containing this basic block.
const MachineFunction *getParent() const { return xParent; }		const MachineFunction *getParent() const { return xParent; }
MachineFunction *getParent() { return xParent; }		MachineFunction *getParent() { return xParent; }

using instr_iterator = Instructions::iterator;		using instr_iterator = Instructions::iterator;
using const_instr_iterator = Instructions::const_iterator;		using const_instr_iterator = Instructions::const_iterator;
using reverse_instr_iterator = Instructions::reverse_iterator;		using reverse_instr_iterator = Instructions::reverse_iterator;
using const_reverse_instr_iterator = Instructions::const_reverse_iterator;		using const_reverse_instr_iterator = Instructions::const_reverse_iterator;
[773 lines not shown]

lib/CodeGen/AsmPrinter/AsmPrinter.cpp

[2,886 lines not shown]	if (isVerbose()) {
}		}

assert(MLI != nullptr && "MachineLoopInfo should has been computed");		assert(MLI != nullptr && "MachineLoopInfo should has been computed");
emitBasicBlockLoopComments(MBB, MLI, *this);		emitBasicBlockLoopComments(MBB, MLI, *this);
}		}

// Print the main label for the block.		// Print the main label for the block.
if (MBB.pred_empty() \|\|		if (MBB.pred_empty() \|\|
(isBlockOnlyReachableByFallthrough(&MBB) && !MBB.isEHFuncletEntry())) {		(isBlockOnlyReachableByFallthrough(&MBB) && !MBB.isEHFuncletEntry() &&
		!MBB.hasLabelMustBeEmitted())) {
if (isVerbose()) {		if (isVerbose()) {
// NOTE: Want this comment at start of line, don't emit with AddComment.		// NOTE: Want this comment at start of line, don't emit with AddComment.
OutStreamer->emitRawComment(" %bb." + Twine(MBB.getNumber()) + ":",		OutStreamer->emitRawComment(" %bb." + Twine(MBB.getNumber()) + ":",
false);		false);
}		}
} else {		} else {
		if (isVerbose() && MBB.hasLabelMustBeEmitted())
		OutStreamer->AddComment("Label of block must be emitted");
OutStreamer->EmitLabel(MBB.getSymbol());		OutStreamer->EmitLabel(MBB.getSymbol());
}		}
}		}

void AsmPrinter::EmitBasicBlockEnd(const MachineBasicBlock &MBB) {		void AsmPrinter::EmitBasicBlockEnd(const MachineBasicBlock &MBB) {
MCCodePaddingContext Context;		MCCodePaddingContext Context;
setupCodePaddingContext(MBB, Context);		setupCodePaddingContext(MBB, Context);
OutStreamer->EmitCodePaddingBasicBlockEnd(Context);		OutStreamer->EmitCodePaddingBasicBlockEnd(Context);
[212 lines not shown]
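
The AsmPrinter.cpp hunk above changes when a block's label is printed. As a minimal sketch, assuming a hypothetical `BlockInfo` struct that stands in for the four `MachineBasicBlock` queries used in the condition (it is not an LLVM type, and `emitsLabel` is not an LLVM function), the decision can be modelled like this:

```cpp
#include <cassert>

// Hypothetical stand-in for the MachineBasicBlock queries used by the
// condition in the hunk above. Not an LLVM type.
struct BlockInfo {
  bool PredEmpty;                  // MBB.pred_empty()
  bool OnlyReachableByFallthrough; // isBlockOnlyReachableByFallthrough(&MBB)
  bool IsEHFuncletEntry;           // MBB.isEHFuncletEntry()
  bool LabelMustBeEmitted;         // MBB.hasLabelMustBeEmitted()
};

// Returns true when the patched condition takes the "else" branch, i.e. a
// real label is emitted instead of only a "%bb.N:" comment.
bool emitsLabel(const BlockInfo &B) {
  bool CommentOnly =
      B.PredEmpty || (B.OnlyReachableByFallthrough && !B.IsEHFuncletEntry &&
                      !B.LabelMustBeEmitted);
  return !CommentOnly;
}
```

In this model a fall-through-only block gets a label once `LabelMustBeEmitted` is set, which is the case the new `auipc` block relies on. A block with no predecessors still gets only the comment, flag or not, because `PredEmpty` is tested first.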

lib/Target/RISCV/RISCVExpandPseudoInsts.cpp

[49 lines not shown]	bool expandAtomicBinOp(MachineBasicBlock &MBB,
MachineBasicBlock::iterator &NextMBBI);		MachineBasicBlock::iterator &NextMBBI);
bool expandAtomicMinMaxOp(MachineBasicBlock &MBB,		bool expandAtomicMinMaxOp(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI,		MachineBasicBlock::iterator MBBI,
AtomicRMWInst::BinOp, bool IsMasked, int Width,		AtomicRMWInst::BinOp, bool IsMasked, int Width,
MachineBasicBlock::iterator &NextMBBI);		MachineBasicBlock::iterator &NextMBBI);
bool expandAtomicCmpXchg(MachineBasicBlock &MBB,		bool expandAtomicCmpXchg(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI, bool IsMasked,		MachineBasicBlock::iterator MBBI, bool IsMasked,
int Width, MachineBasicBlock::iterator &NextMBBI);		int Width, MachineBasicBlock::iterator &NextMBBI);
		bool expandLoadLocalAddress(MachineBasicBlock &MBB,
		MachineBasicBlock::iterator MBBI,
		MachineBasicBlock::iterator &NextMBBI);
};		};

char RISCVExpandPseudo::ID = 0;		char RISCVExpandPseudo::ID = 0;

bool RISCVExpandPseudo::runOnMachineFunction(MachineFunction &MF) {		bool RISCVExpandPseudo::runOnMachineFunction(MachineFunction &MF) {
TII = static_cast<const RISCVInstrInfo *>(MF.getSubtarget().getInstrInfo());		TII = static_cast<const RISCVInstrInfo *>(MF.getSubtarget().getInstrInfo());
bool Modified = false;		bool Modified = false;
for (auto &MBB : MF)		for (auto &MBB : MF)
[9 lines not shown]	while (MBBI != E) {
MachineBasicBlock::iterator NMBBI = std::next(MBBI);		MachineBasicBlock::iterator NMBBI = std::next(MBBI);
Modified \|= expandMI(MBB, MBBI, NMBBI);		Modified \|= expandMI(MBB, MBBI, NMBBI);
MBBI = NMBBI;		MBBI = NMBBI;
}		}

return Modified;		return Modified;
}		}

bool RISCVExpandPseudo::expandMI(MachineBasicBlock &MBB,		bool RISCVExpandPseudo::expandMI(MachineBasicBlock &MBB,
jrtc27 (Done): This should come at the end of the file just after `expandAtomicCmpXchg` to match the order in which they're declared above (and once using `PseudoLLA` instead, be called `expandLoadLocalAddress` or similar).
MachineBasicBlock::iterator MBBI,		MachineBasicBlock::iterator MBBI,
MachineBasicBlock::iterator &NextMBBI) {		MachineBasicBlock::iterator &NextMBBI) {
switch (MBBI->getOpcode()) {		switch (MBBI->getOpcode()) {
case RISCV::PseudoAtomicLoadNand32:		case RISCV::PseudoAtomicLoadNand32:
return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Nand, false, 32,		return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Nand, false, 32,
NextMBBI);		NextMBBI);
case RISCV::PseudoMaskedAtomicSwap32:		case RISCV::PseudoMaskedAtomicSwap32:
return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Xchg, true, 32,		return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Xchg, true, 32,
[16 lines not shown]	return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::UMax, true, 32,
NextMBBI);		NextMBBI);
case RISCV::PseudoMaskedAtomicLoadUMin32:		case RISCV::PseudoMaskedAtomicLoadUMin32:
return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::UMin, true, 32,		return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::UMin, true, 32,
NextMBBI);		NextMBBI);
case RISCV::PseudoCmpXchg32:		case RISCV::PseudoCmpXchg32:
return expandAtomicCmpXchg(MBB, MBBI, false, 32, NextMBBI);		return expandAtomicCmpXchg(MBB, MBBI, false, 32, NextMBBI);
case RISCV::PseudoMaskedCmpXchg32:		case RISCV::PseudoMaskedCmpXchg32:
return expandAtomicCmpXchg(MBB, MBBI, true, 32, NextMBBI);		return expandAtomicCmpXchg(MBB, MBBI, true, 32, NextMBBI);
		case RISCV::PseudoLLA:
		return expandLoadLocalAddress(MBB, MBBI, NextMBBI);
}		}

return false;		return false;
}		}

static unsigned getLRForRMW32(AtomicOrdering Ordering) {		static unsigned getLRForRMW32(AtomicOrdering Ordering) {
switch (Ordering) {		switch (Ordering) {
default:		default:
[416 lines not shown]	bool RISCVExpandPseudo::expandAtomicCmpXchg(
LivePhysRegs LiveRegs;		LivePhysRegs LiveRegs;
computeAndAddLiveIns(LiveRegs, *LoopHeadMBB);		computeAndAddLiveIns(LiveRegs, *LoopHeadMBB);
computeAndAddLiveIns(LiveRegs, *LoopTailMBB);		computeAndAddLiveIns(LiveRegs, *LoopTailMBB);
computeAndAddLiveIns(LiveRegs, *DoneMBB);		computeAndAddLiveIns(LiveRegs, *DoneMBB);

return true;		return true;
}		}

		bool RISCVExpandPseudo::expandLoadLocalAddress(
		MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
		MachineBasicBlock::iterator &NextMBBI) {
		MachineFunction *MF = MBB.getParent();
		MachineInstr &MI = *MBBI;
		DebugLoc DL = MI.getDebugLoc();

		unsigned DestReg = MI.getOperand(0).getReg();
		const MachineOperand &Symbol = MI.getOperand(1);

		MachineBasicBlock *NewMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());

		// Tell AsmPrinter that we unconditionally want the symbol of this label to be
		// emitted.
		NewMBB->setLabelMustBeEmitted();

		MF->insert(++MBB.getIterator(), NewMBB);

		BuildMI(NewMBB, DL, TII->get(RISCV::AUIPC), DestReg)
		.addDisp(Symbol, 0, RISCVII::MO_PCREL_HI);
		BuildMI(NewMBB, DL, TII->get(RISCV::ADDI), DestReg)
		.addReg(DestReg)
		.addMBB(NewMBB, RISCVII::MO_PCREL_LO);

		// Move all the rest of the instructions to NewMBB.
		NewMBB->splice(NewMBB->end(), &MBB, std::next(MBBI), MBB.end());
		// Update machine-CFG edges.
		NewMBB->transferSuccessorsAndUpdatePHIs(&MBB);
		// Make the original basic block fall-through to the new.
		MBB.addSuccessor(NewMBB);

		// Make sure live-ins are correctly attached to this new basic block.
		LivePhysRegs LiveRegs;
		computeAndAddLiveIns(LiveRegs, *NewMBB);

		NextMBBI = MBB.end();
		MI.eraseFromParent();
		return true;
		}

} // end of anonymous namespace		} // end of anonymous namespace

INITIALIZE_PASS(RISCVExpandPseudo, "riscv-expand-pseudo",		INITIALIZE_PASS(RISCVExpandPseudo, "riscv-expand-pseudo",
RISCV_EXPAND_PSEUDO_NAME, false, false)		RISCV_EXPAND_PSEUDO_NAME, false, false)
namespace llvm {		namespace llvm {

FunctionPass *createRISCVExpandPseudoPass() { return new RISCVExpandPseudo(); }		FunctionPass *createRISCVExpandPseudoPass() { return new RISCVExpandPseudo(); }

} // end of namespace llvm		} // end of namespace llvm
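
The block split performed by `expandLoadLocalAddress` above may be easier to follow on a toy model. The sketch below is not LLVM code: `Block` and `expandLLA` are hypothetical names, instructions are plain strings, and CFG edges and live-ins are ignored. It only illustrates the shape of the transformation: a new labelled block is created, the `auipc`/`addi` pair is placed at its start, the remainder of the original block is moved after the pair, and the pseudo is erased.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy basic block: an optional forced label plus textual instructions.
struct Block {
  std::string Label; // empty when no label is forced
  std::vector<std::string> Insts;
};

// Expand the pseudo at Blocks[BI].Insts[II] (assumed to be "PseudoLLA Rd, Sym")
// into a new block placed directly after Blocks[BI].
void expandLLA(std::vector<Block> &Blocks, std::size_t BI, std::size_t II,
               const std::string &Rd, const std::string &Sym,
               const std::string &NewLabel) {
  Block New;
  New.Label = NewLabel; // the label that %pcrel_lo will refer to
  New.Insts.push_back("auipc " + Rd + ", %pcrel_hi(" + Sym + ")");
  New.Insts.push_back("addi " + Rd + ", " + Rd + ", %pcrel_lo(" + NewLabel +
                      ")");

  std::vector<std::string> &Old = Blocks[BI].Insts;
  const std::ptrdiff_t Pos = static_cast<std::ptrdiff_t>(II);
  // Move everything after the pseudo into the new block.
  New.Insts.insert(New.Insts.end(), Old.begin() + Pos + 1, Old.end());
  // Drop the pseudo and the moved tail from the original block.
  Old.erase(Old.begin() + Pos, Old.end());
  // The original block now falls through into the new one.
  Blocks.insert(Blocks.begin() + static_cast<std::ptrdiff_t>(BI) + 1, New);
}
```

Applied to a single block holding `PseudoLLA a0, G`, `lw a0, 0(a0)`, `ret` with the label `.LBB0_1`, the model yields the same instruction order that the `RV32I-MEDIUM` checks in codemodel-lowering.ll expect.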

lib/Target/RISCV/RISCVISelLowering.h

Show First 20 Lines • Show All 104 Lines • ▼ Show 20 Lines	SDValue LowerReturn(SDValue Chain, CallingConv::ID CallConv, bool IsVarArg,
const SmallVectorImpl<SDValue> &OutVals, const SDLoc &DL,		const SmallVectorImpl<SDValue> &OutVals, const SDLoc &DL,
SelectionDAG &DAG) const override;		SelectionDAG &DAG) const override;
SDValue LowerCall(TargetLowering::CallLoweringInfo &CLI,		SDValue LowerCall(TargetLowering::CallLoweringInfo &CLI,
SmallVectorImpl<SDValue> &InVals) const override;		SmallVectorImpl<SDValue> &InVals) const override;
bool shouldConvertConstantLoadToIntImm(const APInt &Imm,		bool shouldConvertConstantLoadToIntImm(const APInt &Imm,
Type *Ty) const override {		Type *Ty) const override {
return true;		return true;
}		}

		template <class NodeTy>
		SDValue getAddr(NodeTy *N, SelectionDAG &DAG) const;

SDValue lowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;		SDValue lowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;
SDValue lowerBlockAddress(SDValue Op, SelectionDAG &DAG) const;		SDValue lowerBlockAddress(SDValue Op, SelectionDAG &DAG) const;
SDValue lowerConstantPool(SDValue Op, SelectionDAG &DAG) const;		SDValue lowerConstantPool(SDValue Op, SelectionDAG &DAG) const;
SDValue lowerSELECT(SDValue Op, SelectionDAG &DAG) const;		SDValue lowerSELECT(SDValue Op, SelectionDAG &DAG) const;
SDValue lowerVASTART(SDValue Op, SelectionDAG &DAG) const;		SDValue lowerVASTART(SDValue Op, SelectionDAG &DAG) const;
SDValue lowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const;		SDValue lowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const;
SDValue lowerRETURNADDR(SDValue Op, SelectionDAG &DAG) const;		SDValue lowerRETURNADDR(SDValue Op, SelectionDAG &DAG) const;

[20 lines not shown]

lib/Target/RISCV/RISCVISelLowering.cpp

[326 lines not shown]	case ISD::VASTART:
return lowerVASTART(Op, DAG);		return lowerVASTART(Op, DAG);
case ISD::FRAMEADDR:		case ISD::FRAMEADDR:
return lowerFRAMEADDR(Op, DAG);		return lowerFRAMEADDR(Op, DAG);
case ISD::RETURNADDR:		case ISD::RETURNADDR:
return lowerRETURNADDR(Op, DAG);		return lowerRETURNADDR(Op, DAG);
}		}
}		}

		static SDValue getTargetNode(GlobalAddressSDNode *N, SDLoc DL, EVT Ty,
rogfer01 (Done): Can these helper functions be made `static`?
lewis-revill (Author, Done): They certainly can, thanks!
		SelectionDAG &DAG, unsigned Flags) {
		return DAG.getTargetGlobalAddress(N->getGlobal(), DL, Ty, 0, Flags);
		}

		static SDValue getTargetNode(BlockAddressSDNode *N, SDLoc DL, EVT Ty,
		SelectionDAG &DAG, unsigned Flags) {
		return DAG.getTargetBlockAddress(N->getBlockAddress(), Ty, N->getOffset(),
		Flags);
		}

		static SDValue getTargetNode(ConstantPoolSDNode *N, SDLoc DL, EVT Ty,
		SelectionDAG &DAG, unsigned Flags) {
		return DAG.getTargetConstantPool(N->getConstVal(), Ty, N->getAlignment(),
		N->getOffset(), Flags);
		}

		template <class NodeTy>
		SDValue RISCVTargetLowering::getAddr(NodeTy *N, SelectionDAG &DAG) const {
		SDLoc DL(N);
		EVT Ty = getPointerTy(DAG.getDataLayout());

		switch (getTargetMachine().getCodeModel()) {
		default:
		report_fatal_error("Unsupported code model for lowering");
		case CodeModel::Small: {
		// Generate a sequence for accessing addresses within the first 2 GiB of
		// address space. This generates the pattern (addi (lui %hi(sym)) %lo(sym)).
		SDValue AddrHi = getTargetNode(N, DL, Ty, DAG, RISCVII::MO_HI);
		SDValue AddrLo = getTargetNode(N, DL, Ty, DAG, RISCVII::MO_LO);
		SDValue MNHi = SDValue(DAG.getMachineNode(RISCV::LUI, DL, Ty, AddrHi), 0);
		return SDValue(DAG.getMachineNode(RISCV::ADDI, DL, Ty, MNHi, AddrLo), 0);
		}
		case CodeModel::Medium: {
		// Generate a sequence for accessing addresses within any 2GiB range within
		// the address space. This generates the pattern (PseudoLLA sym), which
		// expands to (addi (auipc %pcrel_hi(sym)) %pcrel_lo(sym)).
jrtc27 (Not Done): Minor: comment should say something like `%pcrel_lo(auipc)` to be accurate.
		SDValue Addr = getTargetNode(N, DL, Ty, DAG, 0);
		return SDValue(DAG.getMachineNode(RISCV::PseudoLLA, DL, Ty, Addr), 0);
		}
		}
		}
jrtc27 (Not Done): This needs to be an `MO_PCREL_LO`, surely? (and then modified to refer to the `auipc` rather than the symbol...)
lewis-revill (Author, Not Done): I was under the impression that only the `auipc` symbol needs to be PC-relative, since:

1) The upper bits are the bits we need to prevent from overflowing
2) the operation of `auipc` is to add the upper 20 bits of the symbol relative to the program counter to the program counter itself, putting the result into the destination register.

Essentially we've done the same operation as an `lui`, but via the PC. So the `addi` should be the same? Also more importantly I tried that and got a 'relocation truncated to fit'...

I don't quite understand what you mean by 'modified to refer to the `auipc` rather than the symbol'? Basically the first two SDValues are just building the `%pcrel_hi(sym)` & `%lo(sym)` expressions to be used in the actual sequence.
jrtc27 (Not Done): No, because you still have the lower bits of the program counter added to the `%lo` value. What you're trying to do is use `auipc` to get the 4k-page and then set the lower bits to the offset in the page, but that needs an extra mask if you want to do that, which is pointless when you can add the `%pcrel_lo` instead and get the right result. Example:

    0x10048: auipc a0, %pcrel_hi(X)
    0x1004c: addi a1, a0, %lo(X)

where X is at address 0x20072.

%pcrel_hi(X) at 0x48 will give us upper 20 bits of 0x20072-0x10048=0x1002a, ie 0x10000
Thus a0 gets set to PC+0x10000 = 0x20048
%lo(X) will give us 0x20072&0xFFF = 0x72
Thus a1 gets set to a0+0x72 = 0x200ba, not 0x20072, because we had the extra 0x48 from the low bits of PC at the time of the auipc.
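
The arithmetic in this example can be checked mechanically. The following is a minimal sketch, assuming hypothetical helpers `hi20` and `lo12` that model the `%hi`/`%pcrel_hi` and `%lo`/`%pcrel_lo` operators on a plain integer value (they are not LLVM or linker APIs). The addresses are the ones used in the comment above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical models of the relocation operators, not real APIs.
// The upper part is rounded by adding 0x800 so that adding the sign-extended
// 12-bit lower part afterwards reproduces the exact value.
constexpr int64_t hi20(int64_t V) { return ((V + 0x800) >> 12) << 12; }
constexpr int64_t lo12(int64_t V) { return V - hi20(V); }

constexpr int64_t PC = 0x10048; // address of the auipc
constexpr int64_t X = 0x20072;  // address of the symbol

// auipc a0, %pcrel_hi(X)
constexpr int64_t A0 = PC + hi20(X - PC);

// addi a1, a0, %lo(X): the low bits of PC are counted twice.
constexpr int64_t Wrong = A0 + lo12(X);

// addi a1, a0, <low 12 bits of X - PC>: the PC-relative low part.
constexpr int64_t Right = A0 + lo12(X - PC);

static_assert(A0 == 0x20048, "auipc result from the example");
static_assert(Wrong == 0x200ba, "absolute %lo overshoots by 0x48");
static_assert(Right == X, "PC-relative low part lands on X");
```

The `static_assert`s restate the three values from the comment: `a0` is 0x20048, the `%lo` variant ends at 0x200ba, and the PC-relative variant ends at 0x20072.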
jrtc27 (Not Done): As for 'modified to refer to the auipc rather than the symbol', `%pcrel_lo` is unusual on RISC-V. You would think that you would write something like the following (the `+4` is of course extremely important):

    .L0: auipc a0, %pcrel_hi(X)
         addi a1, a0, %pcrel_lo(X+4)

But in fact that's not how you do it. On RISC-V, the pair of instructions are tied together so the linker knows which `auipc` the later `addi` is consuming, and you instead do this:

    .L0: auipc a0, %pcrel_hi(X)
         addi a1, a0, %pcrel_lo(.L0)

Note how the symbol for the `%pcrel_lo` becomes the label for the `auipc` rather than the symbol itself. This indirection is eventually removed by the linker once it has done all relaxations, and will turn into the low 12 bits of `X-.L0` ie what is normally `%pcrel_lo(X+4)`.
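
One way to picture the indirection described in this comment is a lookup keyed by the `auipc` label. The sketch below is a simplified model, not how any particular linker is implemented: `AuipcSite` and `resolvePcrelLo` are hypothetical names, and relaxation is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical record for one labelled auipc: where the label lives and which
// address its %pcrel_hi operand names.
struct AuipcSite {
  int64_t LabelAddr;  // address of the auipc (e.g. .L0)
  int64_t TargetAddr; // address of the symbol in %pcrel_hi(X)
};

// Resolve %pcrel_lo(Label): find the auipc behind the label, then return the
// sign-extended low 12 bits of (target - label address).
int64_t resolvePcrelLo(const std::map<std::string, AuipcSite> &Sites,
                       const std::string &Label) {
  const AuipcSite &S = Sites.at(Label); // throws if the label has no auipc
  int64_t Offset = S.TargetAddr - S.LabelAddr;
  return ((Offset & 0xfff) ^ 0x800) - 0x800;
}
```

With `.L0` at 0x10048 and `X` at 0x20072 this returns 0x2a, the low 12 bits of `X-.L0`. The address of the `addi` never enters the computation, which is why the operand names the label rather than `X`.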
lewis-revill (Author, Not Done): Thank you for the explanation, I see what you mean now. Now I have to figure out how to implement that.
jrtc27 (Not Done): Look at the expansion of `lla` in the assembler, that's exactly what you're trying to do here.
rogfer01 (Not Done): In an earlier attempt to implement local PIC addressing I created a basic block to get the address https://reviews.llvm.org/D50634 but this may impact negatively later optimisations that work at a basic-block unit. @efriedma suggested to add an id to the `auipc` instead. My approach used a pseudo instruction so what one needs in my case is to make sure the label is emitted by `AsmPrinter`.

Under that approach (in case you want to consider this approach) it looks to me we need to extend first `MachineInstr` to be able to represent that id (e.g. zero if no id). `FunctionLoweringInfo` could be the provider of these unique ids within a function via `SelectionDAG`. Then teach `AsmPrinter` to emit a label when there is an id for a given instruction. Then we need to link the user of the id (in your case `addi`). That part is less clear to me but I'll be bold here and I believe `SelectionDAG::getTargetIndex` is enough here (with a MII flag stating that this is a pcrel_lo).

Unfortunately I could not progress on this, yet. Feel free to investigate this avenue for feasibility. Perhaps there is a simpler approach in your case.

(To the best of my knowledge, this way of referencing an earlier instruction for symbol relocations is a bit of a unique thing of RISC-V so I think we're in uncharted land within LLVM)

Hope this helps.

Regards,
lewis-revill (Author, Not Done): Thank you very much for the background, I should have looked this up before.. I think a pseudo instruction is the right approach, but it should probably be expanded as late as possible? I'm testing whether it would be worth modifying `RISCVExpandPseudoInsts` to cater for operations other than just atomics. Maybe then we could still go with the basic block approach since most optimisations will be done already.
jrtc27 (Not Done): The latest point to do it is in `RISCVAsmPrinter` for the pseudo approach, as far as I am aware (and that's what I've been doing for my locally hacked up LLVM). I get uneasy with doing it earlier (ie by using a basic block and expanding somewhere like `RISCVExpandPseudoInsts`) in case instructions get moved around and the MBB no longer has the `auipc` as its first instruction, but if there are guarantees that that doesn't happen then that seems fine (although I fail to see the benefit from expanding such a simple instruction then, when doing it in `RISCVAsmPrinter` is almost exactly the same as what `RISCVAsmParser` is doing). Others with deeper LLVM-internals knowledge may be able to offer better advice here, but my approach does work.
lewis-revill (Author, Not Done): The only problem with expanding in `RISCVAsmPrinter` that I can think of is that we would miss the chance to use the `RISCVMergeBaseOffset` pass, which I was planning to modify for this patch as well. Maybe there is a way around that. It would certainly make things easier.
rogfer01 (Not Done): @jrtc27: in my case, for PIC addressing I was a bit reluctant to late-expand the sequences, which I do in my downstream LLVM too, but I think we could do better in upstream: I presume there may be circumstances where due to scheduling we may want to move instructions between the address-forming instructions. My understanding is that a pseudo-instruction that is expanded late is a bit like a black box and would prevent that. I felt a bit uneasy to dismiss the flexibility that the existing linker mechanism provides.

That said perhaps I've got a storm in a teacup and there are no realistic circumstances where this is going to be profitable, so a late expanded pseudo is actually fine. Perhaps this mechanism of relocating to an instruction that has the actual relocation is more suited to the linker and there is no expectation for the compiler to emit non-contiguous sequences. Not sure, to be honest.

Regards,
jrtc27 (Not Done): Yeah, it's possible that you'd want to split them up, but as you say given the simplicity of the operations involved it seems unlikely to be necessary, but I'm not an expert on these things. So, in my opinion, we should either be doing the late expansion in `RISCVAsmPrinter`, or we go all the way and generate separate linked instructions with a labelled `auipc` straight away, with no pseudos involved; anything in between seems like a half-hearted waste of time.

It's worth noting that GCC will do exactly what LLVM does for 32-bit absolute addressing (separate `%hi`/`%lo` instructions, with the `addi ..., %lo(sym)` merged with the load if you're not just taking the address), but for anything else it will just emit `la` or `lla` (depending on the symbol's localness), not even expanding the sequence and leaving it up to GNU as, essentially like our simple expansion solution.
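
The range comments on the two `getAddr` cases above come down to which quantity has to fit into a 20-bit upper part plus a 12-bit lower part: the absolute address for `CodeModel::Small`, or the distance from the `auipc` for `CodeModel::Medium`. A rough sketch follows, assuming a 64-bit address space in which both immediates are sign-extended. The local `CodeModel` enum, `fitsHiLoPair`, and `reachable` are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

enum class CodeModel { Small, Medium }; // local stand-in, not llvm::CodeModel

// True when V can be written as (sign-extended 20-bit value << 12) plus a
// sign-extended 12-bit value, i.e. what lui+addi or auipc+addi can add.
bool fitsHiLoPair(int64_t V) {
  const int64_t TwoGiB = int64_t(1) << 31;
  return V >= -TwoGiB - 0x800 && V < TwoGiB - 0x800;
}

// Small constrains the absolute address; Medium constrains the distance
// between the auipc at PC and the symbol.
bool reachable(CodeModel M, int64_t PC, int64_t Sym) {
  return M == CodeModel::Small ? fitsHiLoPair(Sym) : fitsHiLoPair(Sym - PC);
}
```

Under these assumptions, a symbol at 0x100000000 referenced from code at 0x100000100 is out of range for the absolute sequence but well within range for the PC-relative one.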

SDValue RISCVTargetLowering::lowerGlobalAddress(SDValue Op,		SDValue RISCVTargetLowering::lowerGlobalAddress(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
SDLoc DL(Op);		SDLoc DL(Op);
EVT Ty = Op.getValueType();		EVT Ty = Op.getValueType();
GlobalAddressSDNode *N = cast<GlobalAddressSDNode>(Op);		GlobalAddressSDNode *N = cast<GlobalAddressSDNode>(Op);
const GlobalValue *GV = N->getGlobal();
int64_t Offset = N->getOffset();		int64_t Offset = N->getOffset();
MVT XLenVT = Subtarget.getXLenVT();		MVT XLenVT = Subtarget.getXLenVT();

if (isPositionIndependent())		if (isPositionIndependent())
report_fatal_error("Unable to lowerGlobalAddress");		report_fatal_error("Unable to lowerGlobalAddress");

		SDValue Addr = getAddr(N, DAG);

// In order to maximise the opportunity for common subexpression elimination,		// In order to maximise the opportunity for common subexpression elimination,
// emit a separate ADD node for the global address offset instead of folding		// emit a separate ADD node for the global address offset instead of folding
// it in the global address node. Later peephole optimisations may choose to		// it in the global address node. Later peephole optimisations may choose to
// fold it back in when profitable.		// fold it back in when profitable.
SDValue GAHi = DAG.getTargetGlobalAddress(GV, DL, Ty, 0, RISCVII::MO_HI);
SDValue GALo = DAG.getTargetGlobalAddress(GV, DL, Ty, 0, RISCVII::MO_LO);
SDValue MNHi = SDValue(DAG.getMachineNode(RISCV::LUI, DL, Ty, GAHi), 0);
SDValue MNLo =
SDValue(DAG.getMachineNode(RISCV::ADDI, DL, Ty, MNHi, GALo), 0);
if (Offset != 0)		if (Offset != 0)
return DAG.getNode(ISD::ADD, DL, Ty, MNLo,		return DAG.getNode(ISD::ADD, DL, Ty, Addr,
DAG.getConstant(Offset, DL, XLenVT));		DAG.getConstant(Offset, DL, XLenVT));
return MNLo;		return Addr;
}		}

SDValue RISCVTargetLowering::lowerBlockAddress(SDValue Op,		SDValue RISCVTargetLowering::lowerBlockAddress(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
SDLoc DL(Op);
EVT Ty = Op.getValueType();
BlockAddressSDNode *N = cast<BlockAddressSDNode>(Op);		BlockAddressSDNode *N = cast<BlockAddressSDNode>(Op);
const BlockAddress *BA = N->getBlockAddress();
int64_t Offset = N->getOffset();

if (isPositionIndependent())		if (isPositionIndependent())
report_fatal_error("Unable to lowerBlockAddress");		report_fatal_error("Unable to lowerBlockAddress");

SDValue BAHi = DAG.getTargetBlockAddress(BA, Ty, Offset, RISCVII::MO_HI);		return getAddr(N, DAG);
SDValue BALo = DAG.getTargetBlockAddress(BA, Ty, Offset, RISCVII::MO_LO);
SDValue MNHi = SDValue(DAG.getMachineNode(RISCV::LUI, DL, Ty, BAHi), 0);
SDValue MNLo =
SDValue(DAG.getMachineNode(RISCV::ADDI, DL, Ty, MNHi, BALo), 0);
return MNLo;
}		}

SDValue RISCVTargetLowering::lowerConstantPool(SDValue Op,		SDValue RISCVTargetLowering::lowerConstantPool(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
SDLoc DL(Op);
EVT Ty = Op.getValueType();
ConstantPoolSDNode *N = cast<ConstantPoolSDNode>(Op);		ConstantPoolSDNode *N = cast<ConstantPoolSDNode>(Op);
const Constant *CPA = N->getConstVal();
int64_t Offset = N->getOffset();
unsigned Alignment = N->getAlignment();

if (!isPositionIndependent()) {		if (isPositionIndependent())
SDValue CPAHi =
DAG.getTargetConstantPool(CPA, Ty, Alignment, Offset, RISCVII::MO_HI);
SDValue CPALo =
DAG.getTargetConstantPool(CPA, Ty, Alignment, Offset, RISCVII::MO_LO);
SDValue MNHi = SDValue(DAG.getMachineNode(RISCV::LUI, DL, Ty, CPAHi), 0);
SDValue MNLo =
SDValue(DAG.getMachineNode(RISCV::ADDI, DL, Ty, MNHi, CPALo), 0);
return MNLo;
} else {
report_fatal_error("Unable to lowerConstantPool");		report_fatal_error("Unable to lowerConstantPool");
}
		return getAddr(N, DAG);
}		}

SDValue RISCVTargetLowering::lowerSELECT(SDValue Op, SelectionDAG &DAG) const {		SDValue RISCVTargetLowering::lowerSELECT(SDValue Op, SelectionDAG &DAG) const {
SDValue CondV = Op.getOperand(0);		SDValue CondV = Op.getOperand(0);
SDValue TrueV = Op.getOperand(1);		SDValue TrueV = Op.getOperand(1);
SDValue FalseV = Op.getOperand(2);		SDValue FalseV = Op.getOperand(2);
SDLoc DL(Op);		SDLoc DL(Op);
MVT XLenVT = Subtarget.getXLenVT();		MVT XLenVT = Subtarget.getXLenVT();
[1,327 lines not shown]

lib/Target/RISCV/RISCVInstrInfo.cpp

[433 lines not shown]	unsigned RISCVInstrInfo::getInstSizeInBytes(const MachineInstr &MI) const {
default: { return get(Opcode).getSize(); }		default: { return get(Opcode).getSize(); }
case TargetOpcode::EH_LABEL:		case TargetOpcode::EH_LABEL:
case TargetOpcode::IMPLICIT_DEF:		case TargetOpcode::IMPLICIT_DEF:
case TargetOpcode::KILL:		case TargetOpcode::KILL:
case TargetOpcode::DBG_VALUE:		case TargetOpcode::DBG_VALUE:
return 0;		return 0;
case RISCV::PseudoCALL:		case RISCV::PseudoCALL:
case RISCV::PseudoTAIL:		case RISCV::PseudoTAIL:
		case RISCV::PseudoLLA:
return 8;		return 8;
case TargetOpcode::INLINEASM: {		case TargetOpcode::INLINEASM: {
const MachineFunction &MF = *MI.getParent()->getParent();		const MachineFunction &MF = *MI.getParent()->getParent();
const auto &TM = static_cast<const RISCVTargetMachine &>(MF.getTarget());		const auto &TM = static_cast<const RISCVTargetMachine &>(MF.getTarget());
return getInlineAsmLength(MI.getOperand(0).getSymbolName(),		return getInlineAsmLength(MI.getOperand(0).getSymbolName(),
*TM.getMCAsmInfo());		*TM.getMCAsmInfo());
}		}
}		}
}		}

lib/Target/RISCV/RISCVMCInstLower.cpp

[37 lines not shown]	case RISCVII::MO_None:
Kind = RISCVMCExpr::VK_RISCV_None;		Kind = RISCVMCExpr::VK_RISCV_None;
break;		break;
case RISCVII::MO_LO:		case RISCVII::MO_LO:
Kind = RISCVMCExpr::VK_RISCV_LO;		Kind = RISCVMCExpr::VK_RISCV_LO;
break;		break;
case RISCVII::MO_HI:		case RISCVII::MO_HI:
Kind = RISCVMCExpr::VK_RISCV_HI;		Kind = RISCVMCExpr::VK_RISCV_HI;
break;		break;
		case RISCVII::MO_PCREL_LO:
		Kind = RISCVMCExpr::VK_RISCV_PCREL_LO;
		break;
		case RISCVII::MO_PCREL_HI:
		Kind = RISCVMCExpr::VK_RISCV_PCREL_HI;
		break;
}		}

const MCExpr *ME =		const MCExpr *ME =
MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, Ctx);		MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, Ctx);

if (!MO.isJTI() && !MO.isMBB() && MO.getOffset())		if (!MO.isJTI() && !MO.isMBB() && MO.getOffset())
ME = MCBinaryExpr::createAdd(		ME = MCBinaryExpr::createAdd(
ME, MCConstantExpr::create(MO.getOffset(), Ctx), Ctx);		ME, MCConstantExpr::create(MO.getOffset(), Ctx), Ctx);
[55 lines not shown]

lib/Target/RISCV/Utils/RISCVBaseInfo.h

[45 lines not shown]	enum {

InstFormatMask = 31		InstFormatMask = 31
};		};

enum {		enum {
MO_None,		MO_None,
MO_LO,		MO_LO,
MO_HI,		MO_HI,
		MO_PCREL_LO,
MO_PCREL_HI,		MO_PCREL_HI,
};		};
} // namespace RISCVII		} // namespace RISCVII

// Describes the predecessor/successor bits used in the FENCE instruction.		// Describes the predecessor/successor bits used in the FENCE instruction.
namespace RISCVFenceField {		namespace RISCVFenceField {
enum FenceField {		enum FenceField {
I = 8,		I = 8,

test/CodeGen/RISCV/codemodel-lowering.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=riscv32 -mattr=+f -code-model=small -verify-machineinstrs < %s \
				; RUN: | FileCheck %s -check-prefix=RV32I-SMALL
				; RUN: llc -mtriple=riscv32 -mattr=+f -code-model=medium -verify-machineinstrs < %s \
				; RUN: | FileCheck %s -check-prefix=RV32I-MEDIUM

				; Check lowering of globals
				@G = global i32 0

				define i32 @lower_global(i32 %a) nounwind {
				; RV32I-SMALL-LABEL: lower_global:
				; RV32I-SMALL: # %bb.0:
				; RV32I-SMALL-NEXT: lui a0, %hi(G)
				; RV32I-SMALL-NEXT: lw a0, %lo(G)(a0)
				; RV32I-SMALL-NEXT: ret
				;
				; RV32I-MEDIUM-LABEL: lower_global:
				; RV32I-MEDIUM: # %bb.0:
				; RV32I-MEDIUM-NEXT: .LBB0_1: # Label of block must be emitted
				; RV32I-MEDIUM-NEXT: auipc a0, %pcrel_hi(G)
				; RV32I-MEDIUM-NEXT: addi a0, a0, %pcrel_lo(.LBB0_1)
				; RV32I-MEDIUM-NEXT: lw a0, 0(a0)
				; RV32I-MEDIUM-NEXT: ret
				%1 = load volatile i32, i32* @G
				ret i32 %1
				}

				; Check lowering of blockaddresses

				@addr = global i8* null

				define void @lower_blockaddress() nounwind {
				; RV32I-SMALL-LABEL: lower_blockaddress:
				; RV32I-SMALL: # %bb.0:
				; RV32I-SMALL-NEXT: lui a0, %hi(addr)
				; RV32I-SMALL-NEXT: addi a1, zero, 1
				; RV32I-SMALL-NEXT: sw a1, %lo(addr)(a0)
				; RV32I-SMALL-NEXT: ret
				;
				; RV32I-MEDIUM-LABEL: lower_blockaddress:
				; RV32I-MEDIUM: # %bb.0:
				; RV32I-MEDIUM-NEXT: .LBB1_1: # Label of block must be emitted
				; RV32I-MEDIUM-NEXT: auipc a0, %pcrel_hi(addr)
				; RV32I-MEDIUM-NEXT: addi a0, a0, %pcrel_lo(.LBB1_1)
				; RV32I-MEDIUM-NEXT: addi a1, zero, 1
				; RV32I-MEDIUM-NEXT: sw a1, 0(a0)
				; RV32I-MEDIUM-NEXT: ret
				store volatile i8* blockaddress(@lower_blockaddress, %block), i8** @addr
				ret void

				block:
				unreachable
				}

				; Check lowering of constantpools

				define float @lower_constantpool(float %a) nounwind {
				; RV32I-SMALL-LABEL: lower_constantpool:
				; RV32I-SMALL: # %bb.0:
				; RV32I-SMALL-NEXT: fmv.w.x ft0, a0
				; RV32I-SMALL-NEXT: lui a0, %hi(.LCPI2_0)
				; RV32I-SMALL-NEXT: addi a0, a0, %lo(.LCPI2_0)
				; RV32I-SMALL-NEXT: flw ft1, 0(a0)
				; RV32I-SMALL-NEXT: fadd.s ft0, ft0, ft1
				; RV32I-SMALL-NEXT: fmv.x.w a0, ft0
				; RV32I-SMALL-NEXT: ret
				;
				; RV32I-MEDIUM-LABEL: lower_constantpool:
				; RV32I-MEDIUM: # %bb.0:
				; RV32I-MEDIUM-NEXT: .LBB2_1: # Label of block must be emitted
				; RV32I-MEDIUM-NEXT: auipc a1, %pcrel_hi(.LCPI2_0)
				; RV32I-MEDIUM-NEXT: addi a1, a1, %pcrel_lo(.LBB2_1)
				; RV32I-MEDIUM-NEXT: flw ft0, 0(a1)
				; RV32I-MEDIUM-NEXT: fmv.w.x ft1, a0
				; RV32I-MEDIUM-NEXT: fadd.s ft0, ft1, ft0
				; RV32I-MEDIUM-NEXT: fmv.x.w a0, ft0
				; RV32I-MEDIUM-NEXT: ret
				%1 = fadd float %a, 1.0
				ret float %1
				}
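
The RV32I-MEDIUM checks above all use the same pair: `auipc` with `%pcrel_hi(sym)`, then `addi` with `%pcrel_lo(label)`, where the label marks the `auipc` itself rather than the symbol. The following is a minimal sketch of the arithmetic behind that split on RV32. The helper names (`pcrelHi20`, `pcrelLo12`, `materialize`, `checkSplit`) are hypothetical and are not part of this patch. `Offset` stands for `sym - PC_of_auipc`, which is why `%pcrel_lo` has to refer back to the `auipc`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; these names do not appear in the patch.

// Value for the 20-bit immediate of auipc (%pcrel_hi).
// Adding 0x800 rounds up whenever the low 12 bits will sign-extend to a negative value.
constexpr uint32_t pcrelHi20(uint32_t Offset) {
  return ((Offset + 0x800u) >> 12) & 0xFFFFFu;
}

// Value for the signed 12-bit immediate of addi (%pcrel_lo).
constexpr int32_t pcrelLo12(uint32_t Offset) {
  return static_cast<int32_t>(Offset & 0xFFFu) - ((Offset & 0x800u) ? 0x1000 : 0);
}

// Models auipc followed by addi on RV32: PC + (hi << 12) + sext(lo), modulo 2^32.
constexpr uint32_t materialize(uint32_t PC, uint32_t Offset) {
  return PC + (pcrelHi20(Offset) << 12) + static_cast<uint32_t>(pcrelLo12(Offset));
}

// Runtime check that the two halves recombine to PC + Offset.
inline void checkSplit(uint32_t PC, uint32_t Offset) {
  assert(materialize(PC, Offset) == PC + Offset);
}

// Compile-time spot checks: a positive offset with a negative low part,
// and a small negative offset (-4).
static_assert(materialize(0x1000u, 0x12345FFFu) == 0x12346FFFu, "hi/lo must recombine");
static_assert(materialize(0x2000u, 0xFFFFFFFCu) == 0x1FFCu, "negative offsets must recombine");
```

For an offset of `0x12345FFF`, the sketch yields a high part of `0x12346` and a low part of `-1`. The `addi` immediate is sign-extended, so the high part is rounded up to compensate.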
