This is an archive of the discontinued LLVM Phabricator instance.

I know GNU as does this but I don't see why that's a good idea nor why we should; if you use a conditional branch to a label that's too far away then that's your problem and you should have written non-broken code. Having it insert the long branch form is too much magic for me...

HsiangKai added a reviewer: jrtc27. Aug 30 2021, 5:53 PM

In D108961#2973678, @jrtc27 wrote:

I know GNU as does this but I don't see why that's a good idea nor why we should; if you use a conditional branch to a label that's too far away then that's your problem and you should have written non-broken code. Having it insert the long branch form is too much magic for me...

Yes, I'm in two minds about this. I'm usually in favour of trying to match GNU tools behaviour in order to minimise porting issues / provide a more predictable user experience. But fighting against that, I'm really not a fan of these "magic" transformations either.

The fear with adding anything like this is of course adding a very rarely executed path that ends up having bugs in obscure cases. As such, if we were to go in this direction I'd be keen to see more extensive test coverage.

In D108961#2979562, @asb wrote:

In D108961#2973678, @jrtc27 wrote:

I know GNU as does this but I don't see why that's a good idea nor why we should; if you use a conditional branch to a label that's too far away then that's your problem and you should have written non-broken code. Having it insert the long branch form is too much magic for me...

Yes, I'm in two minds about this. I'm usually in favour of trying to match GNU tools behaviour in order to minimise porting issues / provide a more predictable user experience. But fighting against that, I'm really not a fan of these "magic" transformations either.

The fear with adding anything like this is of course adding a very rarely executed path that ends up having bugs in obscure cases. As such, if we were to go in this direction I'd be keen to see more extensive test coverage.

FWIW, I *think* the origin of this stems from:

(a) GCC/binutils divide
(b) GCC uses pseudos in its output (even uses li in some cases, though often mixed in with other shifts and adds for decomposing immediates, bit odd)
(c) GCC has limited knowledge of compression

so GCC doesn't actually know how far away things are and thus can't pick the right branch all the time. Fixing it up in the assembler was presumably the easier solution rather than fixing GCC to emit correct assembly. That then has the unfortunate effect of becoming something hand-written assembly relies on (e.g. bbl was, as was FreeBSD in one place very early on before LLVM was mature enough to build it).

In D108961#2979635, @jrtc27 wrote:

FWIW, I *think* the origin of this stems from:

(a) GCC/binutils divide
(b) GCC uses pseudos in its output (even uses li in some cases, though often mixed in with other shifts and adds for decomposing immediates, bit odd)
(c) GCC has limited knowledge of compression

so GCC doesn't actually know how far away things are and thus can't pick the right branch all the time. Fixing it up in the assembler was presumably the easier solution rather than fixing GCC to emit correct assembly. That then has the unfortunate effect of becoming something hand-written assembly relies on (e.g. bbl was, as was FreeBSD in one place very early on before LLVM was mature enough to build it).

That's a good point re the origin of this kind of magic. It probably strengthens the argument in my mind for supporting this - being able to take .s generated by GCC and assemble it with clang/LLVM is a good thing (and we should probably be doing more such testing).

The description of GCC is not fully correct. GCC can also estimate jump ranges and, during compilation, convert a conditional branch into an inverted conditional branch plus a jump. GCC itself doesn't emit any compressed instructions, so its instruction distance estimate is over-conservative.

So when do we need this magic?

GCC doesn't have an integrated assembler like LLVM, so the instruction length of inline asm can only be estimated very roughly.
This is mostly needed for hand-written assembly files. RISC-V conditional branches have a very short range compared to other RISC ISAs: +-32K for MIPS, +-32M for ARM32, and +-32K (TBZ/TBNZ) to +-1M (CBZ/CBNZ) for AArch64, but only +-2K for RISC-V. That makes it much easier to hit an out-of-range conditional branch on RISC-V. Of course we can ask programmers to convert such branches themselves, but because RISC-V has relaxation the out-of-range error is not reported until link time, which confuses many users ("relocation truncated to fit"???). This magic can prevent that confusion in most cases.

In D108961#2979927, @kito-cheng wrote:

So when do we need this magic?

GCC doesn't have an integrated assembler like LLVM, so the instruction length of inline asm can only be estimated very roughly.

Is that not a problem for every architecture though?

This is mostly needed for hand-written assembly files. RISC-V conditional branches have a very short range compared to other RISC ISAs: +-32K for MIPS, +-32M for ARM32, and +-32K (TBZ/TBNZ) to +-1M (CBZ/CBNZ) for AArch64, but only +-2K for RISC-V. That makes it much easier to hit an out-of-range conditional branch on RISC-V. Of course we can ask programmers to convert such branches themselves, but because RISC-V has relaxation the out-of-range error is not reported until link time, which confuses many users ("relocation truncated to fit"???). This magic can prevent that confusion in most cases.

There's nothing stopping an assembler from detecting that a label will/won't/might be out of range for a given branch by doing best-case and worst-case analysis on the instruction sequence and emitting an error in the case where it's definitely going to be out of bounds however much you relax, and a warning when it's unsure because it depends on how much relaxation there is (which is likely to be relatively rare, normally these cases are _way_ out of bounds for user-provided assembly).
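The best-case/worst-case analysis described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `BranchReach`, `fitsBranch`, and `classifyBranch` are hypothetical names, not LLVM or binutils APIs, and the two offsets are assumed to be computed elsewhere (one with everything between branch and label shrunk as far as possible, one with nothing shrunk).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical three-way verdict; not an existing LLVM or binutils type.
enum class BranchReach { InRange, MaybeOutOfRange, OutOfRange };

// A RISC-V conditional branch encodes a signed 13-bit, 2-byte-aligned offset.
constexpr bool fitsBranch(int64_t Offset) {
  return Offset >= -4096 && Offset <= 4094;
}

// BestCase:  offset assuming everything in between shrinks as far as it can.
// WorstCase: offset assuming nothing in between shrinks.
// Assumes both offsets have the same sign and |BestCase| <= |WorstCase|.
inline BranchReach classifyBranch(int64_t BestCase, int64_t WorstCase) {
  if (fitsBranch(WorstCase))
    return BranchReach::InRange; // fits however little is relaxed
  if (!fitsBranch(BestCase))
    return BranchReach::OutOfRange; // cannot fit however much is relaxed: error
  return BranchReach::MaybeOutOfRange; // depends on relaxation: warning
}

// Small usage example.
inline void checkClassifyExamples() {
  assert(classifyBranch(100, 200) == BranchReach::InRange);
  assert(classifyBranch(4000, 5000) == BranchReach::MaybeOutOfRange);
  assert(classifyBranch(-5000, -6000) == BranchReach::OutOfRange);
}
```

The middle verdict is the case the comment expects to be rare for user-provided assembly, since such branches are usually far out of range.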

So when do we need this magic?

GCC doesn't have an integrated assembler like LLVM, so the instruction length of inline asm can only be estimated very roughly.

Is that not a problem for every architecture though?

Yeah, but RISC-V hits this more frequently because of the shorter conditional branch range. I believe an integrated assembler is the right way to resolve this, but sadly GCC doesn't have one :(

This is mostly needed for hand-written assembly files. RISC-V conditional branches have a very short range compared to other RISC ISAs: +-32K for MIPS, +-32M for ARM32, and +-32K (TBZ/TBNZ) to +-1M (CBZ/CBNZ) for AArch64, but only +-2K for RISC-V. That makes it much easier to hit an out-of-range conditional branch on RISC-V. Of course we can ask programmers to convert such branches themselves, but because RISC-V has relaxation the out-of-range error is not reported until link time, which confuses many users ("relocation truncated to fit"???). This magic can prevent that confusion in most cases.

There's nothing stopping an assembler from detecting that a label will/won't/might be out of range for a given branch by doing best-case and worst-case analysis on the instruction sequence and emitting an error in the case where it's definitely going to be out of bounds however much you relax, and a warning when it's unsure because it depends on how much relaxation there is (which is likely to be relatively rare, normally these cases are _way_ out of bounds for user-provided assembly).

Technically that's a feasible solution, but I think this is more of a philosophical issue :p

In D108961#2980097, @kito-cheng wrote:

So when do we need this magic?

GCC doesn't have an integrated assembler like LLVM, so the instruction length of inline asm can only be estimated very roughly.

Is that not a problem for every architecture though?

Yeah, but RISC-V hits this more frequently because of the shorter conditional branch range. I believe an integrated assembler is the right way to resolve this, but sadly GCC doesn't have one :(

I don't particularly like adding magic to RISC-V just because it hits an issue shared by many architectures more often than them. If I can break every architecture other than RISC-V with a sufficiently-large blob of inline assembly then it seems to me like GCC needs fixing, not that RISC-V should have a special hacky workaround just for it that gets exposed to and relied upon by users?

This is mostly needed for hand-written assembly files. RISC-V conditional branches have a very short range compared to other RISC ISAs: +-32K for MIPS, +-32M for ARM32, and +-32K (TBZ/TBNZ) to +-1M (CBZ/CBNZ) for AArch64, but only +-2K for RISC-V. That makes it much easier to hit an out-of-range conditional branch on RISC-V. Of course we can ask programmers to convert such branches themselves, but because RISC-V has relaxation the out-of-range error is not reported until link time, which confuses many users ("relocation truncated to fit"???). This magic can prevent that confusion in most cases.

There's nothing stopping an assembler from detecting that a label will/won't/might be out of range for a given branch by doing best-case and worst-case analysis on the instruction sequence and emitting an error in the case where it's definitely going to be out of bounds however much you relax, and a warning when it's unsure because it depends on how much relaxation there is (which is likely to be relatively rare, normally these cases are _way_ out of bounds for user-provided assembly).

Technically that's a feasible solution, but I think this is more of a philosophical issue :p

Some objections from me.

x86 does have such a relaxation (breaking WYSIWYG), but it is not something that any other RISC architecture I know of does.
WYSIWYG matters a bit because aligned branches for loops may have a performance advantage. Unpredictable assembler relaxation can break that.
(When the Intel Jump Conditional Code erratum was discussed for x86, it was said that certain instructions have a "side effect after exactly one or two instruction", so arbitrarily replacing one instruction carries some risk.)
Thumb-1 has a short range (+-256) as well, but it doesn't use assembler relaxation:

# thumb: Error: branch out of range
.thumb
beq .Lfoo
.rept 129
nop
.endr
.Lfoo:

RISCV already uses lib/CodeGen/BranchRelaxation.cpp on the compiler side, so I am not sure why we need this assembler side solution.

https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html#Size-of-an-asm says
"It does this by counting the number of instructions in the pattern of the asm "
Multiple RISC ports use this. If GCC RISCV doesn't do this yet, it should be fixed.

For completeness, such long branches are typically not a performance bottleneck.
Even if linker relaxation can shorten the distance a bit and allow more short-form branches,
it may not be worth the complexity.

Will the BranchRelaxation pass account for .insn directives in inline asm correctly? Or will we need to do additional work for that, associated with D108602?

In D108961#2980290, @craig.topper wrote:

Will the BranchRelaxation pass account for .insn directives in inline asm correctly? Or will we need to do additional work for that, associated with D108602?

I haven't looked at it closely yet. BranchRelaxation.cpp uses TargetInstrInfo::getInstSizeInBytes to compute the size.

craig.topper edited the summary of this revision. Sep 2 2021, 10:42 AM

https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html#Size-of-an-asm says
"It does this by counting the number of instructions in the pattern of the asm "
Multiple RISC ports use this. If GCC RISCV doesn't do this yet, it should be fixed.

That's not a target-dependent feature, and it's only a rough estimate. Unless GCC has something like the MC layer, it will never be accurate.

For completeness, such long branches are typically not a performance bottleneck.
Even if linker relaxation can shorten the distance a bit and allow more short-form branches,
it may not be worth the complexity.

RISC-V doesn't have a special relocation to handle this relaxation, and binutils doesn't handle it. It could be recognized by an R_RISCV_BRANCH + R_RISCV_JAL pair, but as you said, it might be complicated and not worth it.

It is not a new feature for RISC-V. We already have c.beqz to beq, c.bnez to bne, c.j to jal, etc. This patch just extends the relaxation patterns to conditional branches. If MC relaxation is not a good idea, should we remove all the patterns in the relaxInstruction target hook?
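For readers unfamiliar with the pattern being added, the rewrite of an out-of-range conditional branch into an inverted branch plus a jump (as described earlier in the thread) might look roughly like this. It is a text-level sketch only: `invertBranch` and `relaxLongBranch` are hypothetical helpers, and the real patch operates on MCInst objects rather than strings.

```cpp
#include <cassert>
#include <string>

// Returns the mnemonic that tests the opposite condition, or "" if the
// mnemonic is not one of the six base conditional branches.
inline std::string invertBranch(const std::string &Mnemonic) {
  if (Mnemonic == "beq") return "bne";
  if (Mnemonic == "bne") return "beq";
  if (Mnemonic == "blt") return "bge";
  if (Mnemonic == "bge") return "blt";
  if (Mnemonic == "bltu") return "bgeu";
  if (Mnemonic == "bgeu") return "bltu";
  return "";
}

// Hypothetical helper: rewrites "b<cc> rs1, rs2, target" into an inverted
// branch that skips over an unconditional jump to the real target.
inline std::string relaxLongBranch(const std::string &Mnemonic,
                                   const std::string &Rs1,
                                   const std::string &Rs2,
                                   const std::string &Target) {
  // The inverted branch and the jal are 4 bytes each, so an offset of 8 from
  // the branch lands just past the jal.
  return invertBranch(Mnemonic) + " " + Rs1 + ", " + Rs2 + ", 8\n" +
         "jal zero, " + Target + "\n";
}

// Small usage example.
inline void checkRelaxExample() {
  assert(relaxLongBranch("bne", "a0", "a1", "target") ==
         "beq a0, a1, 8\njal zero, target\n");
}
```

The jump reaches much further than a conditional branch, at the cost of one extra instruction on the taken path.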

In D108961#2985805, @HsiangKai wrote:

It is not a new feature for RISC-V. We already have c.beqz to beq, c.bnez to bne, c.j to jal, etc. This patch just extends the relaxation patterns to conditional branches. If MC relaxation is not a good idea, should we remove all the patterns in the relaxInstruction target hook?

My understanding is that's more a quirk of our implementation and is there to un-compress branches that get erroneously compressed (since a bare symbol is a valid operand for compressed branches), but yes, that does currently cause things to behave a bit strangely and ideally wouldn't be there.

binutils also does this for at least Xtensa http://sourceware.org/binutils/docs/as/Xtensa-Branch-Relaxation.html

There was this pull request to document the GNU behavior in the riscv-asm-manual https://github.com/riscv-non-isa/riscv-asm-manual/pull/58/files

The inline assembly counting in LLVM is known to be incomplete; see https://bugs.llvm.org/show_bug.cgi?id=42539

I think maybe .insn will work with the current implementation of getInlineAsmLength, because it mostly just counts the number of lines and multiplies by MaxInstLength, with a special case for the .space directive.
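To illustrate the "count lines, multiply by MaxInstLength" idea, here is a deliberately simplified sketch. It is not LLVM's getInlineAsmLength: `estimateInlineAsmBytes` is a hypothetical function, and it omits the .space special case and statement separators mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Conservative per-line estimate: every non-blank line is assumed to hold one
// instruction of the maximum length. Hypothetical helper, not LLVM code.
inline std::size_t estimateInlineAsmBytes(const std::string &Asm,
                                          std::size_t MaxInstLength) {
  std::size_t Bytes = 0;
  std::size_t Pos = 0;
  while (Pos <= Asm.size()) {
    std::size_t End = Asm.find('\n', Pos);
    if (End == std::string::npos)
      End = Asm.size();
    // Count the line only if it contains something other than blanks.
    std::size_t First = Asm.find_first_not_of(" \t", Pos);
    if (First != std::string::npos && First < End)
      Bytes += MaxInstLength;
    Pos = End + 1;
  }
  return Bytes;
}

// Small usage example.
inline void checkEstimateExamples() {
  assert(estimateInlineAsmBytes("nop\nnop", 4) == 8);
  // A repeated block is counted as three lines, far less than what an
  // assembler would emit for one hundred nops.
  assert(estimateInlineAsmBytes(".rept 100\nnop\n.endr", 4) == 12);
}
```

The second check shows why directives that expand to many instructions make a per-line estimate an undercount rather than an upper bound.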

In D108961#2987466, @craig.topper wrote:

binutils also does this for at least Xtensa http://sourceware.org/binutils/docs/as/Xtensa-Branch-Relaxation.html

There was this pull request to document the GNU behavior in the riscv-asm-manual https://github.com/riscv-non-isa/riscv-asm-manual/pull/58/files

The inline assembly counting in LLVM is known to be incomplete; see https://bugs.llvm.org/show_bug.cgi?id=42539

I think maybe .insn will work with the current implementation of getInlineAsmLength, because it mostly just counts the number of lines and multiplies by MaxInstLength, with a special case for the .space directive.

If we deal with special directives such as .rept in getInlineAsmLength, we should be able to pick the correct branch instruction according to the estimated offset. I agree that the compiler should be responsible for generating correct code rather than depending on MC to fix it. However, hand-written assembly code is another story. How about turning MC relaxation off by default and letting users turn it on if they want the compiler to take care of these transformations for them?

I found that some pseudo instructions are expanded to multiple MI instructions in RISCVExpandPseudoInsts.cpp, and the BranchRelaxation pass runs before the expandPseudo pass. Could this cause the fixup value to be out of range for some branch instructions? If so, maybe adding MC-layer branch relaxation can deal with it.

In D108961#3003855, @StephenFan wrote:

I found that some pseudo instructions are expanded to multiple MI instructions in RISCVExpandPseudoInsts.cpp, and the BranchRelaxation pass runs before the expandPseudo pass. Could this cause the fixup value to be out of range for some branch instructions? If so, maybe adding MC-layer branch relaxation can deal with it.

Not unless RISCVInstrInfo::getInstSizeInBytes has missing or incorrect cases.

In D108961#3003867, @jrtc27 wrote:

In D108961#3003855, @StephenFan wrote:

I found that some pseudo instructions are expanded to multiple MI instructions in RISCVExpandPseudoInsts.cpp, and the BranchRelaxation pass runs before the expandPseudo pass. Could this cause the fixup value to be out of range for some branch instructions? If so, maybe adding MC-layer branch relaxation can deal with it.

Not unless RISCVInstrInfo::getInstSizeInBytes has missing or incorrect cases.

Got it. Thanks!

Hello, we are currently trying to implement an LLVM backend on RISC-V for the GraalVM Native Image project, and we have run into some issues that are solved by this patch. In summary, we produce LLVM bitcode from Java code and feed it to llc. The problem is that we cannot really control the size of the functions, which can produce branches to out-of-range locations. However, this issue seems stale and controversial. Is there any news to share?


In D108961#3648216, @Zeavee wrote:

Hello, we are currently trying to implement an LLVM backend on RISC-V for the GraalVM Native Image project, and we have run into some issues that are solved by this patch. In summary, we produce LLVM bitcode from Java code and feed it to llc. The problem is that we cannot really control the size of the functions, which can produce branches to out-of-range locations. However, this issue seems stale and controversial. Is there any news to share?

llc has branchrelaxation, doesn't it solve your problem?

Zeavee added a comment. Jul 14 2022, 7:44 AM

This comment was removed by Zeavee.

liaolucy added a subscriber: liaolucy. Jul 14 2022, 6:18 PM

craig.topper commandeered this revision. Jan 27 2023, 1:52 PM

craig.topper edited reviewers, added: HsiangKai; removed: craig.topper.

Rebase code. Still waiting on build to check tests.

craig.topper added a reviewer: reames. Jan 27 2023, 1:53 PM

Fix build error

@reames has discovered that CodeGen's BranchRelaxation doesn't understand how RISC-V emits alignment directives when relaxation is enabled. Align directives always emit (align - mininstsize) bytes of NOPs. The linker is responsible for cleaning up the extra NOPs to meet the requested alignment. BranchRelaxation doesn't know this and undercounts the size of the alignment NOPs. This shows up when -falign-loops is used, for example.

Adding this relaxation to the assembler is the quickest fix and matches the GNU assembler anyway. Though it still wouldn't handle the case that requires a branch to become indirect.
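The alignment undercount described above can be made concrete with a little arithmetic. The sketch below is illustrative only: both function names are hypothetical, and the comparison assumes a size estimator that charges only the padding needed at an already-known offset.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of NOPs emitted for an alignment directive when linker relaxation is
// enabled, following the description above: (align - mininstsize).
constexpr uint64_t relaxedAlignPadding(uint64_t Align, uint64_t MinInstSize) {
  return Align > MinInstSize ? Align - MinInstSize : 0;
}

// Padding an estimator would assume if it knew the exact offset and expected
// only the bytes needed to reach the next aligned address.
constexpr uint64_t exactAlignPadding(uint64_t Offset, uint64_t Align) {
  return (Align - Offset % Align) % Align;
}

// 16-byte alignment with 4-byte minimum instructions reserves 12 bytes,
// and 14 bytes when 2-byte compressed instructions are allowed.
static_assert(relaxedAlignPadding(16, 4) == 12, "padding without compression");
static_assert(relaxedAlignPadding(16, 2) == 14, "padding with compression");
// An already-aligned offset needs no padding under the exact model, so the
// two models can differ by the full (align - mininstsize).
static_assert(exactAlignPadding(32, 16) == 0, "already aligned");

// Small usage example.
inline void checkAlignExamples() {
  assert(relaxedAlignPadding(16, 4) - exactAlignPadding(32, 16) == 12);
}
```

A gap of this size on each aligned loop header is enough to push a branch that was estimated to be just in range past the limit.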

lit tests are clean with this patch

Harbormaster completed remote builds in B210469: Diff 492898. Jan 27 2023, 3:44 PM

As @craig.topper mentioned above, I stumbled into this issue last week when debugging an LTO failure. The exact issue I hit was because BranchRelaxation doesn't account for the padding required in RISCV's alignment directive. (Which is different from every other target.) We could, of course, patch the particular issue in BranchRelaxation, but I want to make an argument for reversing course on this patch as well. If we'd landed this patch, my hard to debug LTO correctness issue would have been a minor code quality issue at worst, and I'd not have lost a week of my life. :)

First, we are currently incompatible with the GNU toolchain. LLVM's assembler will not successfully assemble programs which 'as' will. This shows up in hand-written assembly and also, in theory at the moment, in output from gcc.

target:
	bne a0, a1, target
.rep 1024
	nop
.endr
	bne a0, a1, target

Personally, I consider this a very strong reason to accept this patch (or some variant thereof). To the point where I almost hesitate to bring up other lines of arguments because I'm worried they'll distract from the core point of cross toolchain compatibility. We *need* cross toolchain compatibility; any exceptions to this need to be *very* strongly justified, and I don't consider any of the arguments made on this thread to date to come anywhere close to that bar.
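The arithmetic behind the assembly example above can be checked with a short sketch. It assumes that every instruction is 4 bytes (no compressed nops) and that `target` sits at address 0; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: uncompressed instructions, 4 bytes each.
constexpr int64_t kInstBytes = 4;

// Address of the second branch: one branch at `target`, then NopCount nops.
constexpr int64_t secondBranchAddress(int64_t NopCount) {
  return kInstBytes + NopCount * kInstBytes;
}

// Offset from the second branch back to `target` (address 0).
constexpr int64_t backwardOffset(int64_t NopCount) {
  return -secondBranchAddress(NopCount);
}

// Range of a RISC-V conditional branch immediate.
constexpr bool fitsConditionalBranch(int64_t Offset) {
  return Offset >= -4096 && Offset <= 4094;
}

// With 1024 nops the backward branch needs an offset of -4100, just past the
// limit; with 1023 nops it needs exactly -4096 and still fits.
static_assert(backwardOffset(1024) == -4100, "offset for 1024 nops");
static_assert(!fitsConditionalBranch(backwardOffset(1024)), "out of range");
static_assert(fitsConditionalBranch(backwardOffset(1023)), "in range");

// Small usage example.
inline void checkOffsetExample() {
  assert(fitsConditionalBranch(0)); // the first branch targets itself
}
```

If the nops were assembled as 2-byte compressed instructions instead, the distance would roughly halve and the second branch would be in range, which is why the assumption matters.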

Second, I don't believe the correctness invariant relied on by BranchRelaxation is a good idea. At the moment, we *must* get all instruction sizes used in branch relaxation to be upper bounds on the actual bytes generated. We've now had multiple examples where this was not upheld in edge cases, and I don't see anything in the code which particularly enforces said invariant. I've posted a couple of patches to add a few relevant asserts here and there, but this is a cross-cutting invariant with broad implications. If we can remove it, that removes a major correctness risk.

There was an argument made on thread previously that this patch adds rarely executed behavior which is likely to bit rot. I think this argument gets things completely backwards. The fundamental case requiring relaxation has to be implemented somewhere, and the assembler implementation is actually the easier to test of the alternatives. The assembler change has dramatically less "blast radius" than the compiler BranchRelaxation approach.

Now, I do need to acknowledge that there's a related branch relaxation case which the assembler *can't* easily handle. Specifically, when we cross out of the +/- 2MB range, we need a scratch register to materialize the long branch. To my knowledge, GCC doesn't handle this case at all.

Third, I think we actually have potentially significant missed optimizations here. BranchRelaxation is inherently conservative. As one example, @craig.topper tells me that every branch is modeled as 4 bytes, even if the branch target would allow the branch to be compressible. As such, for branch-dense code (e.g. loop nests) we can end up relaxing branches which don't need to be relaxed. I haven't (yet) seen a case compelling enough on its own to address, but in theory, such cases are certainly possible. Since we have a correctness need to change schemes here anyway, we might as well fix the optimization problem at the same time.

reames added inline comments. Jan 30 2023, 8:54 AM

llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
199	Is this right? BEQ is not the same as BEQZ? Isn't there a missing zero check here? Ah, no, that's checked in the caller. Could you adjust naming or comments to make that clear please? In fact, so much of the caller differs for this case, maybe it should just be an if/else block there?
229	Doing this when we're not going to use the compressed instruction seems like an odd canonicalization from the assembler. Not objecting, just wondering why.
276	Can you precommit a change which pulls this out and converts these branches to a switch? Doing so reduces the delta.
llvm/test/MC/RISCV/long-conditional-jump.s
7	Use check-prefixes to common all but the couple which actually differ?

craig.topper added inline comments. Jan 30 2023, 9:33 AM

llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
229	Probably because it made it easier to write the expression for UseComperssBr as a single condition. I'll rework it.

craig.topper mentioned this in rG025c92077d39: [RISCV] Replace multiple ifs with a switch. NFC. Jan 30 2023, 9:54 AM

Rebase after adding switch to encodeInstruction. I'll work on other comments next.

Harbormaster completed remote builds in B210801: Diff 493341. Jan 30 2023, 10:44 AM

In D108961#4090884, @reames wrote:

As @craig.topper mentioned above, I stumbled into this issue last week when debugging an LTO failure. The exact issue I hit was because BranchRelaxation doesn't account for the padding required in RISCV's alignment directive. (Which is different from every other target.) We could, of course, patch the particular issue in BranchRelaxation, but I want to make an argument for reversing course on this patch as well. If we'd landed this patch, my hard to debug LTO correctness issue would have been a minor code quality issue at worst, and I'd not have lost a week of my life. :)

I'm very sorry for that, I know from personal experience how galling it is to spend a long time debugging something to later find a patch that fixed it had already been proposed. I don't personally have a reliable system for tracking patches that don't get re-pinged by their authors, but there was a bug report for this issue here, so definitely a failure there...

There was an argument made on thread previously that this patch adds rarely executed behavior which is likely to bit rot. I think this argument gets things completely backwards. The fundamental case requiring relaxation has to be implemented somewhere, and the assembler implementation is actually the easier to test of the alternatives. The assembler change has dramatically less "blast radius" than the compiler BranchRelaxation approach.

I wanted to clarify this point - my concern was adding a rarely executed and insufficiently tested code path (which has in the past also been the source of hard to debug issues).

Address review comments

craig.topper added inline comments. Jan 30 2023, 11:35 AM

llvm/test/MC/RISCV/long-conditional-jump.s
7	There doesn't appear to be much common. The hex constants are different in every case but the first.

llvm/test/MC/RISCV/long-conditional-jump.s
7	You are correct here. Ignore me.

Code wise, LGTM. Please wait on landing until we settle the high level approach questions.

Harbormaster completed remote builds in B210827: Diff 493371. Jan 30 2023, 12:15 PM

Code changes LGTM as well. I can't think of other cases worth testing either (but of course, more eyes always welcome there).

Personally, I vote +1.
We ran into some issues when porting Android/Rust to RISC-V and eventually found that they can be fixed by this patch.

We discussed this today at the RISCV sync up meeting. Unfortunately, @jrtc27 wasn't in attendance.

Out of the attendees, @asb, @craig.topper, and I spoke in support of landing this. @luismarques remembered being hesitant, and was going to revisit.

I'm going to reach out to @jrtc27 to discuss, and see if her previous opposition still stands given new information. (edit: Reached out via email on 2/2/23)

luke957 removed a subscriber: luke957. Feb 2 2023, 9:55 PM

To summarise my position after chatting things through with Philip, I'm in favour of landing this as-is^. For compiler-generated code, I would personally rather it not be something relied upon, but I understand the reasons it does from a practical perspective and that others have differing philosophical positions. My bigger concern was with hand-written assembly, especially with .option norelax (and possibly .option norvc) where there is the expectation that what you write is exactly what you get at the byte level, not just semantically (c.f. Linux's alternatives and static branches, which make use of that). For compatibility with GNU as on real-world code we do unfortunately need to support this, and so having it on by default is the way we have to go. I would, however, like to see both assemblers provide a standard way to turn this off in future, and perhaps an eventual path for compilers to explicitly opt-in to using this such that the default could be changed many years down the road (and thus not "surprise" developers), but that's all future work that isn't blocking for any of this.

^ NB: This is not a code review though, I haven't studied the changes themselves, hence making this just a comment and not an "official" approval

Thanks @jrtc27 - marking this explicitly as LGTM now the remaining concerns have I think all been addressed.

This revision is now accepted and ready to land. Feb 3 2023, 10:21 AM

MaskRay added inline comments. Feb 3 2023, 10:43 AM

llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
153	`isInt<13>(Offset)`. `(Offset > 4094 || Offset < -4096);` is not different from `(Offset > 4095 || Offset < -4096);`
llvm/test/MC/RISCV/long-conditional-jump.s
2	`< %s` => `%s`
5	`-triple=riscv64` Consistently use `=` or separator.
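The first inline comment above rests on branch offsets always being even, which can be checked with a short sketch. `fitsSigned` is a hypothetical stand-in for `llvm::isInt` so that the example compiles without LLVM headers; the other names are likewise illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::isInt<N>: true if X fits in an N-bit signed integer.
template <unsigned N> constexpr bool fitsSigned(int64_t X) {
  return X >= -(int64_t(1) << (N - 1)) && X < (int64_t(1) << (N - 1));
}

// The hand-written range check from the patch under review.
constexpr bool needsRelaxManual(int64_t Offset) {
  return Offset > 4094 || Offset < -4096;
}

// The suggested formulation in terms of a 13-bit signed range.
constexpr bool needsRelaxIsInt(int64_t Offset) {
  return !fitsSigned<13>(Offset);
}

// Branch targets are 2-byte aligned, so only even offsets can occur.
inline bool agreeOnEvenOffsets() {
  for (int64_t Offset = -8192; Offset <= 8192; Offset += 2)
    if (needsRelaxManual(Offset) != needsRelaxIsInt(Offset))
      return false;
  return true;
}

// Small usage example.
inline void checkRangeForms() {
  assert(agreeOnEvenOffsets());
  // The two forms differ only at the odd offset 4095, which cannot occur.
  assert(needsRelaxManual(4095) && !needsRelaxIsInt(4095));
}
```

Since the two checks agree on every offset that can actually arise, the choice between them is a readability question rather than a behavioral one.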

craig.topper added inline comments. Feb 3 2023, 11:09 AM

llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
153	Seems like this comment would apply to line 168 and 172 as well?

Address @MaskRay's comments

This revision was landed with ongoing or failed builds. Feb 3 2023, 12:34 PM

Closed by commit rG98117f1a743c: [RISCV] MC relaxation for out-of-range conditional branch. (authored by craig.topper).

This revision was automatically updated to reflect the committed changes.

craig.topper added a commit: rG98117f1a743c: [RISCV] MC relaxation for out-of-range conditional branch..

Harbormaster completed remote builds in B211774: Diff 494699. Feb 3 2023, 12:53 PM

jobnoorman mentioned this in D154958: [RISCV][MC] Relax conditional branches to unresolved symbols. Jul 11 2023, 6:21 AM

jobnoorman mentioned this in D155953: [RISCV][MC] Add CLI option to disable branch relaxation. Jul 21 2023, 6:57 AM

jobnoorman mentioned this in rGafb2e9f44c13: [RISCV][MC] Add CLI option to disable branch relaxation. Jul 28 2023, 1:42 AM

jobnoorman mentioned this in rG856745c5ebe5: [RISCV][MC] Relax conditional branches to unresolved symbols. Jul 28 2023, 1:56 AM

Revision Contents

Path	Size
llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp	35 lines
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp	92 lines
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp	6 lines
llvm/lib/Target/RISCV/RISCVInstrInfo.td	20 lines
llvm/test/MC/RISCV/fixups-diagnostics.s	3 lines
llvm/test/MC/RISCV/long-conditional-jump.s	93 lines
llvm/test/MC/RISCV/rv64-relax-all.s	3 lines

Diff 369591

llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp

Show First 20 Lines • Show All 137 Lines • ▼ Show 20 Lines
}		}

bool RISCVAsmBackend::fixupNeedsRelaxationAdvanced(const MCFixup &Fixup,		bool RISCVAsmBackend::fixupNeedsRelaxationAdvanced(const MCFixup &Fixup,
bool Resolved,		bool Resolved,
uint64_t Value,		uint64_t Value,
const MCRelaxableFragment *DF,		const MCRelaxableFragment *DF,
const MCAsmLayout &Layout,		const MCAsmLayout &Layout,
const bool WasForced) const {		const bool WasForced) const {
		int64_t Offset = int64_t(Value);
		unsigned Kind = Fixup.getTargetKind();

		// We only do conditional branch relaxation when the symbol is resolved.
		// For conditional branch, the immediate must be in the range
		// [-4096, 4094].
		if (Kind == RISCV::fixup_riscv_branch)
		return Resolved && (Offset > 4094 || Offset < -4096);
		MaskRay [Not Done]: `isInt<13>(Offset)`. `(Offset > 4094 || Offset < -4096);` is not different from `(Offset > 4095 || Offset < -4096);`
		craig.topper (Author) [Done]: Seems like this comment would apply to line 168 and 172 as well?

// Return true if the symbol is actually unresolved.		// Return true if the symbol is actually unresolved.
// Resolved could be always false when shouldForceRelocation return true.		// Resolved could be always false when shouldForceRelocation return true.
// We use !WasForced to indicate that the symbol is unresolved and not forced		// We use !WasForced to indicate that the symbol is unresolved and not forced
// by shouldForceRelocation.		// by shouldForceRelocation.
if (!Resolved && !WasForced)		if (!Resolved && !WasForced)
return true;		return true;

int64_t Offset = int64_t(Value);		switch (Kind) {
switch (Fixup.getTargetKind()) {
default:		default:
return false;		return false;
case RISCV::fixup_riscv_rvc_branch:		case RISCV::fixup_riscv_rvc_branch:
// For compressed branch instructions the immediate must be		// For compressed branch instructions the immediate must be
// in the range [-256, 254].		// in the range [-256, 254].
return Offset > 254 || Offset < -256;		return Offset > 254 || Offset < -256;
case RISCV::fixup_riscv_rvc_jump:		case RISCV::fixup_riscv_rvc_jump:
// For compressed jump instructions the immediate must be		// For compressed jump instructions the immediate must be
Show All 30 Lines	case RISCV::C_J:
Res.addOperand(Inst.getOperand(0));		Res.addOperand(Inst.getOperand(0));
break;		break;
case RISCV::C_JAL:		case RISCV::C_JAL:
// c.jal $imm -> jal X1, $imm.		// c.jal $imm -> jal X1, $imm.
Res.setOpcode(RISCV::JAL);		Res.setOpcode(RISCV::JAL);
Res.addOperand(MCOperand::createReg(RISCV::X1));		Res.addOperand(MCOperand::createReg(RISCV::X1));
Res.addOperand(Inst.getOperand(0));		Res.addOperand(Inst.getOperand(0));
break;		break;
		case RISCV::BEQ:
		case RISCV::BNE:
		case RISCV::BLT:
		case RISCV::BGE:
		case RISCV::BLTU:
		case RISCV::BGEU:
		Res.setOpcode(getRelaxedOpcode(Inst.getOpcode()));
		Res.addOperand(Inst.getOperand(0));
		Res.addOperand(Inst.getOperand(1));
		Res.addOperand(Inst.getOperand(2));
		break;
}		}
Inst = std::move(Res);		Inst = std::move(Res);
}		}

bool RISCVAsmBackend::relaxDwarfLineAddr(MCDwarfLineAddrFragment &DF,		bool RISCVAsmBackend::relaxDwarfLineAddr(MCDwarfLineAddrFragment &DF,
MCAsmLayout &Layout,		MCAsmLayout &Layout,
bool &WasRelaxed) const {		bool &WasRelaxed) const {
MCContext &C = Layout.getAssembler().getContext();		MCContext &C = Layout.getAssembler().getContext();
▲ Show 20 Lines • Show All 130 Lines • ▼ Show 20 Lines	default:
return Op;		return Op;
case RISCV::C_BEQZ:		case RISCV::C_BEQZ:
return RISCV::BEQ;		return RISCV::BEQ;
case RISCV::C_BNEZ:		case RISCV::C_BNEZ:
return RISCV::BNE;		return RISCV::BNE;
case RISCV::C_J:		case RISCV::C_J:
case RISCV::C_JAL: // fall through.		case RISCV::C_JAL: // fall through.
return RISCV::JAL;		return RISCV::JAL;
		case RISCV::BEQ:
		return RISCV::PseudoLongBEQ;
		case RISCV::BNE:
		return RISCV::PseudoLongBNE;
		case RISCV::BLT:
		return RISCV::PseudoLongBLT;
		case RISCV::BGE:
		return RISCV::PseudoLongBGE;
		case RISCV::BLTU:
		return RISCV::PseudoLongBLTU;
		case RISCV::BGEU:
		return RISCV::PseudoLongBGEU;
}		}
}		}

bool RISCVAsmBackend::mayNeedRelaxation(const MCInst &Inst,		bool RISCVAsmBackend::mayNeedRelaxation(const MCInst &Inst,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
return getRelaxedOpcode(Inst.getOpcode()) != Inst.getOpcode();		return getRelaxedOpcode(Inst.getOpcode()) != Inst.getOpcode();
}		}
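
The inline discussion about the branch range check can be illustrated with a small standalone sketch. It is not part of the patch: `fitsSigned` is a hypothetical stand-in for `llvm::isInt`, and the helper names are made up for illustration. It compares the explicit-bounds form from the diff with the `isInt<13>` form suggested in review, under the assumption that branch offsets are always 2-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::isInt<N>: true if X fits in an N-bit
// signed integer.
template <unsigned N> constexpr bool fitsSigned(int64_t X) {
  static_assert(N > 0 && N < 64, "width must be in (0, 64)");
  return X >= -(INT64_C(1) << (N - 1)) && X < (INT64_C(1) << (N - 1));
}

// Explicit-bounds form, as written in the diff.
constexpr bool outOfRangeExplicit(int64_t Offset) {
  return Offset > 4094 || Offset < -4096;
}

// Form suggested in the review comment.
constexpr bool outOfRangeIsInt(int64_t Offset) {
  return !fitsSigned<13>(Offset);
}

// Returns true if both predicates agree on every even offset in [Lo, Hi].
inline bool formsAgreeOnEvenOffsets(int64_t Lo, int64_t Hi) {
  assert(Lo <= Hi && "empty range");
  for (int64_t Off = Lo; Off <= Hi; ++Off) {
    if (Off % 2 != 0)
      continue; // Odd offsets are rejected elsewhere as misaligned.
    if (outOfRangeExplicit(Off) != outOfRangeIsInt(Off))
      return false;
  }
  return true;
}
```

The two forms only disagree at the odd offset 4095, which a 2-byte-aligned branch target cannot produce, so the shorter `isInt<13>` spelling appears to be equivalent in practice.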

▲ Show 20 Lines • Show All 290 Lines • Show Last 20 Lines

llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp

Show First 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	public:
void expandFunctionCall(const MCInst &MI, raw_ostream &OS,		void expandFunctionCall(const MCInst &MI, raw_ostream &OS,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const;		const MCSubtargetInfo &STI) const;

void expandAddTPRel(const MCInst &MI, raw_ostream &OS,		void expandAddTPRel(const MCInst &MI, raw_ostream &OS,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const;		const MCSubtargetInfo &STI) const;

		void expandLongCondBr(const MCInst &MI, raw_ostream &OS,
		SmallVectorImpl<MCFixup> &Fixups,
		const MCSubtargetInfo &STI) const;

/// TableGen'erated function for getting the binary encoding for an		/// TableGen'erated function for getting the binary encoding for an
/// instruction.		/// instruction.
uint64_t getBinaryCodeForInstr(const MCInst &MI,		uint64_t getBinaryCodeForInstr(const MCInst &MI,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const;		const MCSubtargetInfo &STI) const;

/// Return binary encoding of operand. If the machine operand requires		/// Return binary encoding of operand. If the machine operand requires
/// relocation, record the relocation and return zero.		/// relocation, record the relocation and return zero.
▲ Show 20 Lines • Show All 111 Lines • ▼ Show 20 Lines	void RISCVMCCodeEmitter::expandAddTPRel(const MCInst &MI, raw_ostream &OS,
MCInst TmpInst = MCInstBuilder(RISCV::ADD)		MCInst TmpInst = MCInstBuilder(RISCV::ADD)
.addOperand(DestReg)		.addOperand(DestReg)
.addOperand(SrcReg)		.addOperand(SrcReg)
.addOperand(TPReg);		.addOperand(TPReg);
uint32_t Binary = getBinaryCodeForInstr(TmpInst, Fixups, STI);		uint32_t Binary = getBinaryCodeForInstr(TmpInst, Fixups, STI);
support::endian::write(OS, Binary, support::little);		support::endian::write(OS, Binary, support::little);
}		}

		static unsigned getInvertedBranchOp(unsigned BrOp, bool UseCompressedBr) {
		switch (BrOp) {
		default:
		llvm_unreachable("Unexpected branch opcode!");
		case RISCV::PseudoLongBEQ:
		if (UseCompressedBr)
		reames [Not Done]: Is this right? BEQ is not the same as BEQZ? Isn't there a missing zero check here? Ah, no, that's checked in the caller. Could you adjust naming or comments to make that clear please? In fact, so much of the caller differs for this case, maybe it should just be an if/else block there?
		return RISCV::C_BNEZ;
		return RISCV::BNE;
		case RISCV::PseudoLongBNE:
		if (UseCompressedBr)
		return RISCV::C_BEQZ;
		return RISCV::BEQ;
		case RISCV::PseudoLongBLT:
		return RISCV::BGE;
		case RISCV::PseudoLongBGE:
		return RISCV::BLT;
		case RISCV::PseudoLongBLTU:
		return RISCV::BGEU;
		case RISCV::PseudoLongBGEU:
		return RISCV::BLTU;
		}
		}

		// Expand PseudoLongBxx to an inverted conditional branch and an unconditional
		// jump.
		void RISCVMCCodeEmitter::expandLongCondBr(const MCInst &MI, raw_ostream &OS,
		SmallVectorImpl<MCFixup> &Fixups,
		const MCSubtargetInfo &STI) const {
		MCOperand SrcReg1 = MI.getOperand(0);
		MCOperand SrcReg2 = MI.getOperand(1);
		MCOperand SrcSymbol = MI.getOperand(2);
		unsigned Opcode = MI.getOpcode();
		bool IsEqTest =
		(Opcode == RISCV::PseudoLongBNE) || (Opcode == RISCV::PseudoLongBEQ);

		if (IsEqTest && SrcReg1.getReg() == RISCV::X0)
		reames [Not Done]: Doing this when we're not going to use the compressed instruction seems like an odd canonicalization from the assembler. Not objecting, just wondering why.
		craig.topper (Author) [Done]: Probably because it made it easier to write the expression for UseCompressedBr as a single condition. I'll rework it.
		std::swap(SrcReg1, SrcReg2);

		// Emit an inverted conditional branch to skip the following jump.
		bool UseCompressedBr =
		STI.getFeatureBits()[RISCV::FeatureStdExtC] &&
		(RISCV::X8 <= SrcReg1.getReg() && SrcReg1.getReg() <= RISCV::X15) &&
		(SrcReg2.getReg() == RISCV::X0) && IsEqTest;
		auto TmpInst =
		MCInstBuilder(getInvertedBranchOp(MI.getOpcode(), UseCompressedBr))
		.addOperand(SrcReg1);
		if (UseCompressedBr)
		TmpInst.addImm(6);
		else
		TmpInst.addOperand(SrcReg2).addImm(8);
		uint32_t Binary = getBinaryCodeForInstr(TmpInst, Fixups, STI);
		uint32_t Offset;
		if (UseCompressedBr) {
		support::endian::write<uint16_t>(OS, Binary, support::little);
		Offset = 2;
		} else {
		support::endian::write(OS, Binary, support::little);
		Offset = 4;
		}

		// Emit an unconditional jump to the destination.
		TmpInst = MCInstBuilder(RISCV::JAL).addReg(RISCV::X0).addOperand(SrcSymbol);
		Binary = getBinaryCodeForInstr(TmpInst, Fixups, STI);
		support::endian::write(OS, Binary, support::little);

		Fixups.clear();
		if (SrcSymbol.isExpr()) {
		Fixups.push_back(MCFixup::create(Offset, SrcSymbol.getExpr(),
		MCFixupKind(RISCV::fixup_riscv_jal),
		MI.getLoc()));
		}
		}
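
As a reading aid, the expansion logic above can be summarised in a standalone sketch. This is not the patch's code: `CondBr`, `LongBranchPlan` and `planLongBranch` are hypothetical names, and registers are modelled as plain numbers 0..31 rather than `MCOperand`s. It follows the diff as posted: invert the condition, use a compressed branch only for an equality test of x8..x15 against x0 when the C extension is enabled, and place the `jal` fixup right after the inverted branch.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins for the six RISCV::PseudoLongBxx opcodes.
enum class CondBr { BEQ, BNE, BLT, BGE, BLTU, BGEU };

// Returns the opposite condition, mirroring getInvertedBranchOp in the diff.
inline CondBr invertCondition(CondBr Op) {
  switch (Op) {
  case CondBr::BEQ:  return CondBr::BNE;
  case CondBr::BNE:  return CondBr::BEQ;
  case CondBr::BLT:  return CondBr::BGE;
  case CondBr::BGE:  return CondBr::BLT;
  case CondBr::BLTU: return CondBr::BGEU;
  case CondBr::BGEU: return CondBr::BLTU;
  }
  assert(false && "unexpected branch condition");
  return Op;
}

struct LongBranchPlan {
  CondBr Inverted;         // condition of the emitted short branch
  bool Compressed;         // true if c.beqz/c.bnez can be used
  unsigned SkipBytes;      // immediate of the inverted branch (6 or 8)
  unsigned JalFixupOffset; // byte offset of the jal inside the expansion
};

// Sketch of the decisions made in expandLongCondBr.
inline LongBranchPlan planLongBranch(CondBr Op, unsigned Rs1, unsigned Rs2,
                                     bool HasC) {
  assert(Rs1 < 32 && Rs2 < 32 && "register numbers are x0..x31");
  const bool IsEqTest = Op == CondBr::BEQ || Op == CondBr::BNE;
  // Equality is symmetric, so move x0 into the second operand slot.
  if (IsEqTest && Rs1 == 0)
    std::swap(Rs1, Rs2);
  const bool Compressed =
      HasC && IsEqTest && Rs1 >= 8 && Rs1 <= 15 && Rs2 == 0;
  return {invertCondition(Op), Compressed, Compressed ? 6u : 8u,
          Compressed ? 2u : 4u};
}
```

For example, `planLongBranch(CondBr::BEQ, 10, 0, true)` models `beqz a0, far` with `+c` and yields an inverted `BNE` in compressed form with a skip of 6 bytes, which matches the `c.bnez a0` expectations in the new test.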

void RISCVMCCodeEmitter::encodeInstruction(const MCInst &MI, raw_ostream &OS,		void RISCVMCCodeEmitter::encodeInstruction(const MCInst &MI, raw_ostream &OS,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
verifyInstructionPredicates(MI,		verifyInstructionPredicates(MI,
computeAvailableFeatures(STI.getFeatureBits()));		computeAvailableFeatures(STI.getFeatureBits()));

const MCInstrDesc &Desc = MCII.get(MI.getOpcode());		const MCInstrDesc &Desc = MCII.get(MI.getOpcode());
// Get byte count of instruction.		// Get byte count of instruction.
unsigned Size = Desc.getSize();		unsigned Size = Desc.getSize();
		unsigned Opcode = MI.getOpcode();
		reames [Not Done]: Can you precommit a change which pulls this out and converts these branches to a switch? Doing so reduces the delta.

// RISCVInstrInfo::getInstSizeInBytes hard-codes the number of expanded		// RISCVInstrInfo::getInstSizeInBytes hard-codes the number of expanded
// instructions for each pseudo, and must be updated when adding new pseudos		// instructions for each pseudo, and must be updated when adding new pseudos
// or changing existing ones.		// or changing existing ones.
if (MI.getOpcode() == RISCV::PseudoCALLReg ||		if (Opcode == RISCV::PseudoCALLReg || Opcode == RISCV::PseudoCALL ||
MI.getOpcode() == RISCV::PseudoCALL ||		Opcode == RISCV::PseudoTAIL || Opcode == RISCV::PseudoJump) {
MI.getOpcode() == RISCV::PseudoTAIL ||
MI.getOpcode() == RISCV::PseudoJump) {
expandFunctionCall(MI, OS, Fixups, STI);		expandFunctionCall(MI, OS, Fixups, STI);
MCNumEmitted += 2;		MCNumEmitted += 2;
return;		return;
}		}

if (MI.getOpcode() == RISCV::PseudoAddTPRel) {		if (MI.getOpcode() == RISCV::PseudoAddTPRel) {
expandAddTPRel(MI, OS, Fixups, STI);		expandAddTPRel(MI, OS, Fixups, STI);
MCNumEmitted += 1;		MCNumEmitted += 1;
return;		return;
}		}

		if (Opcode == RISCV::PseudoLongBEQ || Opcode == RISCV::PseudoLongBNE ||
		Opcode == RISCV::PseudoLongBLT || Opcode == RISCV::PseudoLongBGE ||
		Opcode == RISCV::PseudoLongBLTU || Opcode == RISCV::PseudoLongBGEU) {
		expandLongCondBr(MI, OS, Fixups, STI);
		MCNumEmitted += 2;
		return;
		}

switch (Size) {		switch (Size) {
default:		default:
llvm_unreachable("Unhandled encodeInstruction length!");		llvm_unreachable("Unhandled encodeInstruction length!");
case 2: {		case 2: {
uint16_t Bits = getBinaryCodeForInstr(MI, Fixups, STI);		uint16_t Bits = getBinaryCodeForInstr(MI, Fixups, STI);
support::endian::write<uint16_t>(OS, Bits, support::little);		support::endian::write<uint16_t>(OS, Bits, support::little);
break;		break;
}		}
▲ Show 20 Lines • Show All 184 Lines • Show Last 20 Lines

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp

Show First 20 Lines • Show All 782 Lines • ▼ Show 20 Lines	unsigned RISCVInstrInfo::getInstSizeInBytes(const MachineInstr &MI) const {
case RISCV::PseudoCALLReg:		case RISCV::PseudoCALLReg:
case RISCV::PseudoCALL:		case RISCV::PseudoCALL:
case RISCV::PseudoJump:		case RISCV::PseudoJump:
case RISCV::PseudoTAIL:		case RISCV::PseudoTAIL:
case RISCV::PseudoLLA:		case RISCV::PseudoLLA:
case RISCV::PseudoLA:		case RISCV::PseudoLA:
case RISCV::PseudoLA_TLS_IE:		case RISCV::PseudoLA_TLS_IE:
case RISCV::PseudoLA_TLS_GD:		case RISCV::PseudoLA_TLS_GD:
		case RISCV::PseudoLongBEQ:
		case RISCV::PseudoLongBNE:
		case RISCV::PseudoLongBLT:
		case RISCV::PseudoLongBGE:
		case RISCV::PseudoLongBLTU:
		case RISCV::PseudoLongBGEU:
return 8;		return 8;
case RISCV::PseudoAtomicLoadNand32:		case RISCV::PseudoAtomicLoadNand32:
case RISCV::PseudoAtomicLoadNand64:		case RISCV::PseudoAtomicLoadNand64:
return 20;		return 20;
case RISCV::PseudoMaskedAtomicSwap32:		case RISCV::PseudoMaskedAtomicSwap32:
case RISCV::PseudoMaskedAtomicLoadAdd32:		case RISCV::PseudoMaskedAtomicLoadAdd32:
case RISCV::PseudoMaskedAtomicLoadSub32:		case RISCV::PseudoMaskedAtomicLoadSub32:
return 28;		return 28;
▲ Show 20 Lines • Show All 869 Lines • Show Last 20 Lines

llvm/lib/Target/RISCV/RISCVInstrInfo.td

	Show First 20 Lines • Show All 1,007 Lines • ▼ Show 20 Lines

	def : BccPat<SETEQ, BEQ>;			def : BccPat<SETEQ, BEQ>;
	def : BccPat<SETNE, BNE>;			def : BccPat<SETNE, BNE>;
	def : BccPat<SETLT, BLT>;			def : BccPat<SETLT, BLT>;
	def : BccPat<SETGE, BGE>;			def : BccPat<SETGE, BGE>;
	def : BccPat<SETULT, BLTU>;			def : BccPat<SETULT, BLTU>;
	def : BccPat<SETUGE, BGEU>;			def : BccPat<SETUGE, BGEU>;

				class LongBccPseudo : Pseudo<(outs),
				(ins GPR:$rs1, GPR:$rs2, simm21_lsb0_jal:$imm20),
				[]> {
				let Size = 8;
				let isBarrier = 1;
				let isBranch = 1;
				let hasSideEffects = 0;
				let mayStore = 0;
				let mayLoad = 0;
				let isAsmParserOnly = 1;
				let hasNoSchedulingInfo = 1;
				}

				def PseudoLongBEQ : LongBccPseudo;
				def PseudoLongBNE : LongBccPseudo;
				def PseudoLongBLT : LongBccPseudo;
				def PseudoLongBGE : LongBccPseudo;
				def PseudoLongBLTU : LongBccPseudo;
				def PseudoLongBGEU : LongBccPseudo;

	let isBarrier = 1, isBranch = 1, isTerminator = 1 in			let isBarrier = 1, isBranch = 1, isTerminator = 1 in
	def PseudoBR : Pseudo<(outs), (ins simm21_lsb0_jal:$imm20), [(br bb:$imm20)]>,			def PseudoBR : Pseudo<(outs), (ins simm21_lsb0_jal:$imm20), [(br bb:$imm20)]>,
	PseudoInstExpansion<(JAL X0, simm21_lsb0_jal:$imm20)>;			PseudoInstExpansion<(JAL X0, simm21_lsb0_jal:$imm20)>;

	let isBarrier = 1, isBranch = 1, isIndirectBranch = 1, isTerminator = 1 in			let isBarrier = 1, isBranch = 1, isIndirectBranch = 1, isTerminator = 1 in
	def PseudoBRIND : Pseudo<(outs), (ins GPRJALR:$rs1, simm12:$imm12), []>,			def PseudoBRIND : Pseudo<(outs), (ins GPRJALR:$rs1, simm12:$imm12), []>,
	PseudoInstExpansion<(JALR X0, GPR:$rs1, simm12:$imm12)>;			PseudoInstExpansion<(JALR X0, GPR:$rs1, simm12:$imm12)>;

	▲ Show 20 Lines • Show All 327 Lines • Show Last 20 Lines

llvm/test/MC/RISCV/fixups-diagnostics.s

	# RUN: not llvm-mc -triple riscv32 -filetype obj < %s -o /dev/null 2>&1 | FileCheck %s			# RUN: not llvm-mc -triple riscv32 -filetype obj < %s -o /dev/null 2>&1 | FileCheck %s

	jal a0, far_distant # CHECK: :[[@LINE]]:3: error: fixup value out of range			jal a0, far_distant # CHECK: :[[@LINE]]:3: error: fixup value out of range
	jal a0, unaligned # CHECK: :[[@LINE]]:3: error: fixup value must be 2-byte aligned			jal a0, unaligned # CHECK: :[[@LINE]]:3: error: fixup value must be 2-byte aligned

	beq a0, a1, distant # CHECK: :[[@LINE]]:3: error: fixup value out of range
	blt t0, t1, unaligned # CHECK: :[[@LINE]]:3: error: fixup value must be 2-byte aligned			blt t0, t1, unaligned # CHECK: :[[@LINE]]:3: error: fixup value must be 2-byte aligned

	.byte 0			.byte 0
	unaligned:			unaligned:
	.byte 0			.byte 0
	.byte 0			.byte 0
	.byte 0			.byte 0

	.space 1<<12
	distant:
	.space 1<<20			.space 1<<20
	far_distant:			far_distant:

llvm/test/MC/RISCV/long-conditional-jump.s

This file was added.

				# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \
				# RUN: | llvm-objdump -d -M no-aliases - \
				MaskRay [Not Done]: `< %s` => `%s`
				# RUN: | FileCheck --check-prefix=CHECK-INST %s
				# RUN: llvm-mc -filetype=obj -triple riscv64 -mattr=+c < %s \
				# RUN: | llvm-objdump -d -M no-aliases - \
				MaskRay [Not Done]: `-triple=riscv64` Consistently use `=` or ` ` separator.
				# RUN: | FileCheck --check-prefix=CHECK-INST-C %s

				reames [Not Done]: Use check-prefixes to common all but the couple which actually differ?
				craig.topper (Author) [Done]: There doesn't appear to be much common. The hex constants are different in every case but the first.
				reames [Not Done]: You are correct here. Ignore me.
				.text
				.p2align 3
				.type test,@function
				test:
				# CHECK-INST: beq a0, a1, 0x8
				# CHECK-INST-NEXT: jal zero, 0x1458
				# CHECK-INST-C: beq a0, a1, 0x8
				# CHECK-INST-C-NEXT: jal zero, 0x1458
				bne a0, a1, .L1
				.fill 1300, 4, 0
				.L1:
				ret
				# CHECK-INST: bne a0, a1, 0x1464
				# CHECK-INST-NEXT: jal zero, 0x28b4
				# CHECK-INST-C: bne a0, a1, 0x1462
				# CHECK-INST-C-NEXT: jal zero, 0x28b2
				beq a0, a1, .L2
				.fill 1300, 4, 0
				.L2:
				ret
				# CHECK-INST: bge a0, a1, 0x28c0
				# CHECK-INST-NEXT: jal zero, 0x3d10
				# CHECK-INST-C: bge a0, a1, 0x28bc
				# CHECK-INST-C-NEXT: jal zero, 0x3d0c
				blt a0, a1, .L3
				.fill 1300, 4, 0
				.L3:
				ret
				# CHECK-INST: blt a0, a1, 0x3d1c
				# CHECK-INST-NEXT: jal zero, 0x516c
				# CHECK-INST-C: blt a0, a1, 0x3d16
				# CHECK-INST-C-NEXT: jal zero, 0x5166
				bge a0, a1, .L4
				.fill 1300, 4, 0
				.L4:
				ret
				# CHECK-INST: bgeu a0, a1, 0x5178
				# CHECK-INST-NEXT: jal zero, 0x65c8
				# CHECK-INST-C: bgeu a0, a1, 0x5170
				# CHECK-INST-C-NEXT: jal zero, 0x65c0
				bltu a0, a1, .L5
				.fill 1300, 4, 0
				.L5:
				ret
				# CHECK-INST: bltu a0, a1, 0x65d4
				# CHECK-INST-NEXT: jal zero, 0x7a24
				# CHECK-INST-C: bltu a0, a1, 0x65ca
				# CHECK-INST-C-NEXT: jal zero, 0x7a1a
				bgeu a0, a1, .L6
				.fill 1300, 4, 0
				.L6:
				ret
				# CHECK-INST: bne a0, zero, 0x7a30
				# CHECK-INST-NEXT: jal zero, 0x8e80
				# CHECK-INST-C: c.bnez a0, 0x7a22
				# CHECK-INST-C-NEXT: jal zero, 0x8e72
				beqz a0, .L7
				.fill 1300, 4, 0
				.L7:
				ret
				# CHECK-INST: bne a0, zero, 0x8e8c
				# CHECK-INST-NEXT: jal zero, 0xa2dc
				# CHECK-INST-C: c.bnez a0, 0x8e7a
				# CHECK-INST-C-NEXT: jal zero, 0xa2ca
				beq x0, a0, .L8
				.fill 1300, 4, 0
				.L8:
				ret
				# CHECK-INST: beq a0, zero, 0xa2e8
				# CHECK-INST-NEXT: jal zero, 0xb738
				# CHECK-INST-C: c.beqz a0, 0xa2d2
				# CHECK-INST-C-NEXT: jal zero, 0xb722
				bnez a0, .L9
				.fill 1300, 4, 0
				.L9:
				ret
				# CHECK-INST: beq a6, zero, 0xb744
				# CHECK-INST-NEXT: jal zero, 0xcb94
				# CHECK-INST-C: beq a6, zero, 0xb72c
				# CHECK-INST-C-NEXT: jal zero, 0xcb7c
				bnez x16, .L10
				.fill 1300, 4, 0
				.L10:
				ret
				.Lfunc_end0:
				.size test, .Lfunc_end0-test
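
The hex constants in the CHECK-INST lines can be cross-checked with a little arithmetic. The sketch below is only an illustration, not part of the test: the names are made up, and it covers only the plain `riscv64` run (no `+c`), where every block is assumed to be an 8-byte relaxed branch, 5200 bytes of `.fill`, and a 4-byte `ret`.

```cpp
#include <cassert>
#include <cstdint>

// Sizes assumed for the non-compressed (CHECK-INST) run of the test.
constexpr uint64_t LongBranchBytes = 8;  // inverted branch (4) + jal (4)
constexpr uint64_t FillBytes = 1300 * 4; // .fill 1300, 4, 0
constexpr uint64_t RetBytes = 4;         // uncompressed ret

struct ExpectedTargets {
  uint64_t BranchTarget; // where the inverted branch lands (skips the jal)
  uint64_t JalTarget;    // address of the .L<n> label
};

// Computes the addresses that objdump should print for block Block
// (0-based, in file order).
inline ExpectedTargets expectedTargets(unsigned Block) {
  assert(Block < 10 && "the test has ten blocks");
  const uint64_t Start = Block * (LongBranchBytes + FillBytes + RetBytes);
  const uint64_t AfterExpansion = Start + LongBranchBytes;
  return {AfterExpansion, AfterExpansion + FillBytes};
}
```

Block 0 gives 0x8 and 0x1458, and block 9 gives 0xb744 and 0xcb94, matching the first and last CHECK-INST pairs. The `+c` run differs because `ret` shrinks to 2 bytes and the zero-compare cases use a 2-byte `c.beqz`/`c.bnez`.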

llvm/test/MC/RISCV/rv64-relax-all.s

	# RUN: llvm-mc -filetype=obj -triple riscv64 -mattr=+c %s | llvm-objdump -d -M no-aliases --no-show-raw-insn - | FileCheck %s --check-prefix=INSTR			# RUN: llvm-mc -filetype=obj -triple riscv64 -mattr=+c %s | llvm-objdump -d -M no-aliases --no-show-raw-insn - | FileCheck %s --check-prefix=INSTR

	# RUN: llvm-mc -filetype=obj -triple riscv64 -mattr=+c %s --mc-relax-all | llvm-objdump -d -M no-aliases --no-show-raw-insn - | FileCheck %s --check-prefix=RELAX-INSTR			# RUN: llvm-mc -filetype=obj -triple riscv64 -mattr=+c %s --mc-relax-all | llvm-objdump -d -M no-aliases --no-show-raw-insn - | FileCheck %s --check-prefix=RELAX-INSTR

	## Check the instructions are relaxed correctly			## Check the instructions are relaxed correctly

	NEAR:			NEAR:

	# INSTR: c.beqz a0, 0x0 <NEAR>			# INSTR: c.beqz a0, 0x0 <NEAR>
	# RELAX-INSTR: beq a0, zero, 0x0 <NEAR>			# RELAX-INSTR: c.bnez a0, 0x6
				# RELAX-INSTR-NEXT:jal zero, 0x0 <NEAR>
	c.beqz a0, NEAR			c.beqz a0, NEAR

	# INSTR: c.j 0x0 <NEAR>			# INSTR: c.j 0x0 <NEAR>
	# RELAX-INSTR: jal zero, 0x0 <NEAR>			# RELAX-INSTR: jal zero, 0x0 <NEAR>
	c.j NEAR			c.j NEAR
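
One plausible reading of why the RELAX-INSTR expectation for `c.beqz` changes: with `--mc-relax-all`, relaxation is assumed to be applied repeatedly until `mayNeedRelaxation` returns false (that loop is not shown in this diff), so `c.beqz` now goes through `beq` and ends up as the long form. A standalone sketch with hypothetical names, covering only a subset of the opcodes in `getRelaxedOpcode`:

```cpp
#include <cassert>

// Hypothetical stand-ins for a few RISCV opcodes from getRelaxedOpcode.
enum class Op { C_BEQZ, C_BNEZ, C_J, C_JAL, BEQ, BNE, JAL, LongBEQ, LongBNE };

// One relaxation step, following the mapping in the diff.
inline Op relaxedOpcode(Op O) {
  switch (O) {
  case Op::C_BEQZ: return Op::BEQ;
  case Op::C_BNEZ: return Op::BNE;
  case Op::C_J:
  case Op::C_JAL:  return Op::JAL;
  case Op::BEQ:    return Op::LongBEQ;
  case Op::BNE:    return Op::LongBNE;
  default:         return O; // already in its widest form
  }
}

inline bool mayNeedRelaxation(Op O) { return relaxedOpcode(O) != O; }

// Assumed relax-all behaviour: keep relaxing until a fixed point is reached.
inline Op relaxFully(Op O) {
  unsigned Steps = 0;
  while (mayNeedRelaxation(O)) {
    O = relaxedOpcode(O);
    ++Steps;
    assert(Steps <= 4 && "relaxation chain should be short");
  }
  return O;
}
```

Under that assumption, `relaxFully(Op::C_BEQZ)` ends at `LongBEQ`, which the code emitter then expands to `c.bnez a0, 0x6` followed by `jal zero, NEAR`, while `c.j` still stops at `jal`.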

[RISCV] MC relaxation for out-of-range conditional branch.ClosedPublic

Details

Diff Detail

Event Timeline