This is an archive of the discontinued LLVM Phabricator instance.


Differential D41949

[RISCV] implement li pseudo instruction
ClosedPublic

Authored by niosHD on Jan 11 2018, 6:52 AM.


Details

Reviewers

asb

Commits

rG480b7bc90686: [RISCV] implement li pseudo instruction
rL330224: [RISCV] implement li pseudo instruction

Summary

The implementation follows the MIPS backend and expands the
pseudo instruction directly during asm parsing. As a result, only
real MC instructions are emitted to the MCStreamer. Additionally,
PseudoLI instructions are emitted during codegen. The actual
expansion to real instructions is performed during MI to MC lowering
and is similar to the expansion performed by the GNU Assembler.

Event Timeline

niosHD created this revision. Jan 11 2018, 6:52 AM

Herald added subscribers: sabuasal, apazos, jordy.potman.lists and 5 others. Jan 11 2018, 6:52 AM

Hi Mario - as this is marked as WIP I've added a few initial comments rather than given a 100% complete review.

It's a shame we need to have RISCVInstrInfo::movImm32 as well as the expansion introduced here - but of course one produces MachineInstr and the other produces MCInstr.

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
978–980	Locals are normally capitalised in the LLVM coding style.
lib/Target/RISCV/RISCVInstrFormats.td
108–109	Do you think having these properties inferred might be a bit 'magic'? I'm not really decided one way or another myself, but it does seem a bit non-obvious.
test/MC/RISCV/rvi-aliases-valid.s
36	Having CHECK-INST and CHECK-ALIAS makes sense in this file. Adding in CHECK to the mix makes it a little confusing. Maybe the li cases that don't 'round-trip' belong in a separate test file? e.g. to match gnu as behaviour we'd expect `li x3, 0x80` to be printed by objdump, but `li x4, 0x800` would be expanded to lui+addiw and will never appear in objdump output.
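
To make the round-trip distinction concrete, here is a minimal sketch (not code from the patch) that classifies an immediate by whether it fits the 12-bit signed field of a single `addi`. The helper names `fitsSImm12` and `needsMultiInstExpansion` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper: true if Imm fits the 12-bit signed immediate field of
// ADDI, so that the li can be encoded as a single instruction.
constexpr bool fitsSImm12(int64_t Imm) { return Imm >= -2048 && Imm <= 2047; }

// Hypothetical helper: true if the li has to be expanded into more than one
// real instruction, in which case it cannot come back as a single li.
constexpr bool needsMultiInstExpansion(int64_t Imm) { return !fitsSImm12(Imm); }

// The two cases mentioned in the comment above.
static_assert(fitsSImm12(0x80), "li x3, 0x80 is a single instruction");
static_assert(needsMultiInstExpansion(0x800), "li x4, 0x800 is expanded");
```

Under this model, only the first group can be checked with CHECK-INST/CHECK-ALIAS style round-trip lines; the second group needs a check on the expanded sequence.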

Hi Alex, thank you for your comments!

As mentioned two weeks ago, I also think that it would be nice if we can share the code that synthesizes immediates between the assembler and codegen. I plan to experiment with getting this working in the next week. The idea that I plan to investigate is to delay the generation of the immediates until the MC layer is reached. Basically, emitting PseudoLI machine instructions in RISCVInstrInfo::movImm32 and perform the actual expansion manually during MI to MC lowering.

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
978–980	Right, I really need to read the coding style again. Will be fixed in the next iteration.
lib/Target/RISCV/RISCVInstrFormats.td
108–109	Well, the motivation for the magic was simply to keep the patch minimal. I at first intended to introduce a new `AsmPseudo` class. However, I decided against it because deriving from Pseudo did not feel particularly clean given that a real `AsmPseudo` is typically not a `CodeGenPseudo`. With the mental model that a Pseudo is either an `AsmPseudo` or a `CodeGenPseudo`, inferring the type depending on opcodestr is not that bad. Anyway, I got the code wrong. The `isAsmParserOnly` assignment should be negated (i.e., `!if(!eq(opcodestr,""), 0, 1)`). Regarding alternatives, for me, the cleanest option would be to introduce a new `PseudoBaseClass` and derive an `AsmPseudo` and a `CodeGenPseudo` from it. (Naming suggestions are welcome.)
test/MC/RISCV/rvi-aliases-valid.s
36	If you are not strongly against it I would prefer to keep all pseudo instruction test cases together in the respective `...aliases-valid` and `...aliases-invalid` files. We already have many files in the RISCV MC test directory and I am hesitant to add even more without real need. I expect that the remaining pseudo instructions most likely will not properly roundtrip either and certainly do not want to add new test files for every individual instruction. Also, as you noted, some `li` instructions will roundtrip depending on the specified immediate. Splitting the files based on this property would introduce even more test files given that RV32 and RV64 need separate tests too.

In D41949#974406, @niosHD wrote:

Hi Alex, thank you for your comments!

As mentioned two weeks ago, I also think that it would be nice if we can share the code that synthesizes immediates between the assembler and codegen. I plan to experiment with getting this working in the next week. The idea that I plan to investigate is to delay the generation of the immediates until the MC layer is reached. Basically, emitting PseudoLI machine instructions in RISCVInstrInfo::movImm32 and perform the actual expansion manually during MI to MC lowering.

Yes, I think using PseudoLI in codegen could make sense, and as you say this could allow the reuse of a common helper function.

lib/Target/RISCV/RISCVInstrFormats.td
108–109	Could you just override isCodeGenOnly/isAsmParserOnly when necessary: let isAsmParserOnly = 1 in def FooInst : Pseudo<....>

asb added inline comments. Jan 17 2018, 5:55 AM

test/MC/RISCV/rvi-aliases-valid.s
36	Yes, I see your concern. My main problem is that when there was just CHECK-INST and CHECK-ALIAS it was fairly obvious what the different check lines meant. A comment in the file that explains the different check lines might make it easier on the reader. I suppose `CHECK-EXPAND` might be a little more descriptive, seeing as we're verifying that the pseudoinstruction expands to the expected multi-instruction sequence?

Addresses all of Alex's comments (thank you) and integrates PseudoLI emission into CodeGen.

More comments are welcome. Especially opinions about the correct location (and name) for the emitLoadImm method which is currently simply copied to both users.

Razer6 added a subscriber: Razer6. Jan 19 2018, 6:56 AM

In D41949#980499, @niosHD wrote:

Addresses all of Alex's comments (thank you) and integrates PseudoLI emission into CodeGen.

More comments are welcome. Especially opinions about the correct location (and name) for the emitLoadImm method which is currently simply copied to both users.

It's not immediately obvious exactly where it should go. It looks like Mips just ended up duplicating logic. Finding somewhere for it in RISCVDesc seems like it would be reasonable. Does anyone else have any thoughts?

Moved the shared emitLoadImm method into the RISCVDesc library as a free function. Additionally, the Size of the pseudo instruction has been set to 8 (worst case for RV32) to ensure proper branch relaxation. Finally, the pattern for 32-bit immediate integers has been updated to generate PseudoLI and the tests have been updated accordingly.

I am going to add support for 64-bit constants next in order to get rid of the WIP tag. Nevertheless, comments are of course still welcome.

Herald added subscribers: hintonda, mgorny. Jan 31 2018, 4:39 AM

I'm liking the look of this, looking forward to giving a final review when you're happy to remove the WIP tag. Thanks!

apazos added inline comments. Feb 1 2018, 10:48 AM

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
600	You changed getSTI() -> STI, was it intentional?
lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.cpp
35	extra {}
39	extra {}
lib/Target/RISCV/RISCVAsmPrinter.cpp
74	Can't we return the new instruction from this function and reuse the EmitToStreamer call below? This way we reduce the number of places where compression calls must be inserted when instruction compression at the MC level is enabled.

Thank you Ana for your comments!

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
600	No, good catch, although I am not sure if it is better to use `getSTI` compared to directly accessing STI. (Probably a matter of taste.) On a closer look, actually, the whole `MCSubtargetInfo` operand of `processInstruction` seems to be redundant and can be removed given that we can access the STI within the method as well. Still, is it preferred to access `STI` via `getSTI` in the AsmParser?
lib/Target/RISCV/RISCVAsmPrinter.cpp
74	Theoretically yes, but isn't compression done in the `EmitInstruction` of the streamer? The code here is basically a custom MI to MC lowering. It uses the same `EmitInstruction` function which is also used by the generated `emitPseudoExpansionLowering` internally. Maybe I am missing something, but assuming that the MC compression works in conjunction with pseudo expansion, I expect that it also works for the current code.

Added support for handling 64-bit immediate values.

I think this is ready for review.

Herald added a subscriber: kito-cheng. Feb 6 2018, 9:06 AM

Fixed some typos in the comments.

I just stumbled across a difference between the binutils assembler and my current li implementation regarding accepted immediate values.

The following snippet shows the issue:

% cat li.S                                                                      
li t0, 0x80000000
li t1, -2147483648
li t2, 3147483648
li t3, -3147483648

% riscv32-unknown-elf-gcc -o li.o -c li.S && riscv32-unknown-elf-objdump -d li.o
[...]
00000000 <.text>:
   0:   800002b7                lui     t0,0x80000
   4:   80000337                lui     t1,0x80000
   8:   bb9ad3b7                lui     t2,0xbb9ad
   c:   a0038393                addi    t2,t2,-1536 # 0xbb9aca00
  10:   44653e37                lui     t3,0x44653
  14:   600e0e13                addi    t3,t3,1536 # 0x44653600

While it may be reasonable to accept the first three li instructions, accepting the fourth one definitely does not feel correct. It looks to me as if the immediate verification of the binutils assembler accepts everything that can theoretically be represented as a 32-bit value, potentially even as a purely negative number. My current implementation verifies that the immediate is a 32-bit signed integer and therefore only accepts the second li instruction in the above example. Should we also be more relaxed regarding immediate verification, or should this be considered a binutils bug?

Addressed the discovered defect regarding the immediate of the li instruction. In RV32 mode we now accept either a signed or an unsigned 32-bit value. In RV64 mode we accept basically everything that fits into 64-bit.
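
As an illustration of the relaxed rule described above, the following sketch models the acceptance check as a plain range test. It is an assumption about the intent, not the code in the patch: the function name `isAcceptedLiImm` is hypothetical, and the immediate is assumed to have already been parsed into an `int64_t`.

```cpp
#include <cstdint>

// Hypothetical model of the relaxed check: on RV32 the immediate may be
// either a signed or an unsigned 32-bit value; on RV64 anything that fits
// into 64 bits is accepted.
constexpr bool isAcceptedLiImm(int64_t Imm, bool IsRV64) {
  if (IsRV64)
    return true; // every int64_t value fits into 64 bits
  constexpr int64_t MinS32 = -2147483648LL; // smallest signed 32-bit value
  constexpr int64_t MaxU32 = 4294967295LL;  // largest unsigned 32-bit value
  return Imm >= MinS32 && Imm <= MaxU32;
}

// The four immediates from the li.S snippet, checked in RV32 mode.
static_assert(isAcceptedLiImm(0x80000000LL, false), "unsigned 32-bit");
static_assert(isAcceptedLiImm(-2147483648LL, false), "signed 32-bit");
static_assert(isAcceptedLiImm(3147483648LL, false), "unsigned 32-bit");
static_assert(!isAcceptedLiImm(-3147483648LL, false), "out of range");
```

With this rule the first three instructions of the snippet are accepted and the fourth is rejected, which differs from binutils only in the last case.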

Herald added a subscriber: shiva0217. Feb 23 2018, 9:31 AM

Missing testcase for "li a0, foo".

lib/Target/RISCV/RISCVInstrInfo.td
401	I'm not sure it's a good idea to make code generation use this pseudo-instruction; you'll miss optimization opportunities, like MachineCSE of lui instructions.
test/MC/RISCV/rv64i-aliases-valid.s
94	This seems a little unfortunate... given you can load an arbitrary 32-bit immediate in two instructions, you should be able to load a 64-bit immediate in six instructions ("hi << 32 | lo"). But I guess that requires a second register?

Thank you for your comments Eli!

Added li t4, foo test and fixed error message for RV64.

lib/Target/RISCV/RISCVInstrInfo.td
401	Indeed, me neither. I also raised this concern in one of our weekly sync up calls and the consensus was to go with the Pseudo instruction for now. However, I am definitely not opposed to expanding the respective immediate loads early into machine instructions.
test/MC/RISCV/rv64i-aliases-valid.s
94	Correct, with a second register, 6 instructions would be sufficient. Unfortunately, using a second register is, at least for the assembler, not an option. On the other hand, during codegen I think we should invest these two (virtual) registers. Additionally, in the long term, loading the constant from a constant pool should be evaluated given that it could be even more efficient. (assuming RV64I: 64-bit constant + 1 load + at most 2 instructions for the address calculation)
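
The arithmetic behind the two-register variant can be sketched as follows. This is only an illustration under stated assumptions: each half is assumed to be materialised as a sign-extended 32-bit value (two instructions each), followed by one shift and one add, which gives the six instructions mentioned above. The names `Halves`, `splitForTwoRegs` and `recombine` are hypothetical.

```cpp
#include <cstdint>

struct Halves {
  int64_t Hi; // value for the first register, before the shift by 32
  int64_t Lo; // value for the second register (sign-extended 32-bit)
};

// Split Value so that (Hi << 32) + Lo == Value modulo 2^64. Because Lo is
// sign-extended, Hi has to absorb a carry whenever bit 31 of Value is set.
constexpr Halves splitForTwoRegs(uint64_t Value) {
  uint64_t Low = Value & 0xFFFFFFFFu;
  int64_t Lo = Low >= 0x80000000u ? (int64_t)Low - 0x100000000LL : (int64_t)Low;
  // Value - Lo is a multiple of 2^32, so the shift loses no information.
  uint64_t HiBits = (Value - (uint64_t)Lo) >> 32;
  int64_t Hi =
      HiBits >= 0x80000000u ? (int64_t)HiBits - 0x100000000LL : (int64_t)HiBits;
  return {Hi, Lo};
}

// Models "slli hi, hi, 32" followed by "add hi, hi, lo".
constexpr uint64_t recombine(Halves H) {
  return ((uint64_t)H.Hi << 32) + (uint64_t)H.Lo;
}

static_assert(splitForTwoRegs(0x180000000ULL).Hi == 2, "carry into Hi");
static_assert(recombine(splitForTwoRegs(0x180000000ULL)) == 0x180000000ULL,
              "round trip");
```

The example shows why a plain "hi << 32 | lo" is not quite enough once the low half is sign-extended: for `0x180000000` the upper register has to hold 2 rather than 1.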

Hi Mario, sorry for the delay in more review comments. The vast majority of my comments are very minor nits - I think the main one is to take a closer look at the comments for emitRISCVLoadImm. Cleaning that up should make it easier to review that bit of logic. Thanks!

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
960	It would be nice to add a comment documenting the purpose of processInstruction
972	You might as well use SignExtend64 from MathExtras here.
lib/Target/RISCV/MCTargetDesc/CMakeLists.txt
8	Sort alphabetically
lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.cpp
27	LLVM coding standards suggest just using `static` for single functions https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
31	Can we not avoid this and use findFirstSet from MathExtras? Unless I'm missing something, you could just mask out the first 12 bits at the call-site and so avoid the need for the 'StartOffset' parameter.
54	No need to mask the value passed to SignExtend64. Although it does no harm, I'd recommend changing to `SignExtend64<12>(Value)`.
55	Just `unsigned` is more usual in the LLVM tree
64	LLVM is somewhat conservative when it comes to the use of auto. Given that there's not much saving in space, I'd be explicit and use `unsigned` here.
72	Should have something like `&& "Target must be 64-bit to support a >32-bit constant"` or whatever phrasing you prefer
74–75	The comment describing how emitting 32-bit constants works was fantastic - it would be nice to expand this comment to a similar level of detail. The comment doesn't quite seem to match the behaviour either, as in the implementation emitRISCVLoadImm is called recursively before emitting any other instructions.
78	Do you rely on this being an arithmetic right shift? I'm not 100% sure if the C++ standard guarantees that.
lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.h
25	Write this as `unsigned DestReg`.
lib/Target/RISCV/RISCVInstrInfo.td
401	If I recall correctly, @kparzysz reported that based on his experience there was probably little to gain.
test/MC/RISCV/rv64i-aliases-valid.s
94	For what it's worth, the RV64I codegen patches (not yet merged) do just use two registers and six instructions - but this is done in a dumb way that fails to recognise cases where <6 instructions can be used. Fully agree that it will be worth looking at using the constant pool.
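
Since several of the comments above concern the recursive structure of the 64-bit expansion, here is a simplified model of a single-register expansion together with a tiny simulator that checks the result. It is written for illustration only and is not the code under review: the `Inst` struct, the opcode strings and all function names are assumptions, and the real `emitRISCVLoadImm` may differ in detail.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct Inst {
  std::string Opc; // "lui", "addi", "addiw" or "slli"
  int64_t Imm;
};

// Sign-extend the low Bits bits of X (1 <= Bits <= 64). The final conversion
// relies on two's complement, which C++20 guarantees.
inline int64_t signExtend(uint64_t X, unsigned Bits) {
  uint64_t Sign = 1ULL << (Bits - 1);
  uint64_t Mask = Bits == 64 ? ~0ULL : (1ULL << Bits) - 1;
  return (int64_t)(((X & Mask) ^ Sign) - Sign);
}

inline void buildLoadImm(int64_t Val, std::vector<Inst> &Seq) {
  int64_t Lo12 = signExtend((uint64_t)Val, 12);
  if (Val >= INT32_MIN && Val <= INT32_MAX) {
    // 32-bit case: optional LUI for the upper 20 bits, then ADDI/ADDIW.
    int64_t Hi20 = (int64_t)((((uint64_t)Val + 0x800) >> 12) & 0xFFFFF);
    if (Hi20)
      Seq.push_back({"lui", Hi20});
    if (Lo12 || Hi20 == 0)
      Seq.push_back({Hi20 ? "addiw" : "addi", Lo12});
    return;
  }
  // 64-bit case: strip the low 12 bits and all trailing zeros, materialise
  // the remaining upper part recursively, then shift it back and add Lo12.
  uint64_t Rest = ((uint64_t)Val - (uint64_t)Lo12) >> 12; // never zero here
  unsigned Shift = 12;
  while ((Rest & 1) == 0) {
    Rest >>= 1;
    ++Shift;
  }
  buildLoadImm(signExtend(Rest, 64 - Shift), Seq); // emitted first
  Seq.push_back({"slli", Shift});
  if (Lo12)
    Seq.push_back({"addi", Lo12}); // decided early, emitted last
}

// Executes a sequence on one 64-bit register that starts out as zero.
inline uint64_t simulate(const std::vector<Inst> &Seq) {
  uint64_t Reg = 0;
  for (const Inst &I : Seq) {
    if (I.Opc == "lui")
      Reg = (uint64_t)signExtend((uint64_t)I.Imm << 12, 32);
    else if (I.Opc == "addi")
      Reg += (uint64_t)I.Imm;
    else if (I.Opc == "addiw")
      Reg = (uint64_t)signExtend(Reg + (uint64_t)I.Imm, 32);
    else if (I.Opc == "slli")
      Reg <<= I.Imm;
  }
  return Reg;
}

inline uint64_t materialise(int64_t Val) {
  std::vector<Inst> Seq;
  buildLoadImm(Val, Seq);
  return simulate(Seq);
}
```

In this model the trailing ADDI is determined before the recursive call but appended after it returns, which is the ordering the comment on lines 74–75 asks to have documented.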

Hi Alex, thank you very much for your comments!

I will address all of them in my next revision. Unfortunately, I am really busy at the moment and will not be able to join the sync up call tomorrow. However, I expect that I can provide the new revision at the beginning of next week.

Best,
Mario

lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.cpp
31	I will have a look, with an additional if at the call site it should probably work. If I remember correctly, having this function is more or less a remnant of an older revision of the patch where Value was checked for 0 and -1.
74–75	I will try to expand/improve the comment to make it clearer. The basic idea was to convey that, at this point, it is already fixed that an ADDI is going to be emitted (hence scheduled). The actual emission, on the other hand, is performed after the recursive call returns. However, I obviously failed at expressing this and will try again. ;)
78	Yes, I rely on that and it seems indeed not guaranteed by the standard. I can add an additional `SignExtend64` call to make it clear. However, the implementation of `SignExtend64` relies on right shifts being arithmetic too.
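
For reference, both operations can be written without right-shifting a negative value at all. The sketch below is one possible formulation and not what `SignExtend64` in MathExtras does; the names `signExtendPortable` and `ashr` are hypothetical. Before C++20 the result of right-shifting a negative signed value is implementation-defined, whereas C++20 defines it as an arithmetic shift.

```cpp
#include <cstdint>

// Sign-extend the low B bits of X using unsigned arithmetic only.
template <unsigned B> constexpr int64_t signExtendPortable(uint64_t X) {
  static_assert(B >= 1 && B <= 64, "bit width out of range");
  constexpr uint64_t Sign = uint64_t(1) << (B - 1);
  constexpr uint64_t Mask =
      B == 64 ? ~uint64_t(0) : (uint64_t(1) << (B % 64)) - 1;
  uint64_t R = ((X & Mask) ^ Sign) - Sign; // wraps modulo 2^64
  // Map the unsigned result to int64_t without an out-of-range conversion.
  return R <= uint64_t(INT64_MAX) ? int64_t(R) : -int64_t(~R) - 1;
}

// Arithmetic right shift by S (0 <= S < 64) that only ever shifts
// non-negative values, so it rounds towards negative infinity everywhere.
constexpr int64_t ashr(int64_t V, unsigned S) {
  return V >= 0 ? V >> S : ~(~V >> S);
}

static_assert(signExtendPortable<12>(0xEEF) == -273, "bit 11 set");
static_assert(signExtendPortable<12>(0x7FF) == 2047, "bit 11 clear");
static_assert(ashr(-4097, 12) == -2, "rounds towards negative infinity");
```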

I rebased the patch and addressed all comments. Thank you again for the feedback.

Thanks Mario. I think this is looking good to land now.

Are you planning a follow-up patch that will show li in disassembly and for generated assembly in simple cases? (matching binutils more closely).

I haven't looked into it more closely, but I do note a minor codegen change for float-mem.ll which results in an extra instruction:

 ; Ensure that 1 is added to the high 20 bits if bit 11 of the low part is 1
 define float @flw_fsw_constant(float %a) nounwind {
 ; RV32IF-LABEL: flw_fsw_constant:
 ; RV32IF:       # %bb.0:
 ; RV32IF-NEXT:    fmv.w.x ft0, a0
 ; RV32IF-NEXT:    lui a0, 912092
-; RV32IF-NEXT:    flw ft1, -273(a0)
+; RV32IF-NEXT:    addi a0, a0, -273
+; RV32IF-NEXT:    flw ft1, 0(a0)
 ; RV32IF-NEXT:    fadd.s ft0, ft0, ft1
-; RV32IF-NEXT:    fsw ft0, -273(a0)
+; RV32IF-NEXT:    fsw ft0, 0(a0)
 ; RV32IF-NEXT:    fmv.x.w a0, ft0
 ; RV32IF-NEXT:    ret
   %1 = inttoptr i32 3735928559 to float*
   %2 = load volatile float, float* %1
   %3 = fadd float %a, %2
   store float %3, float* %1
   ret float %3
 }
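
The rule stated in the test comment ("1 is added to the high 20 bits if bit 11 of the low part is 1") can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the patch; `HiLo` and `splitImm32` are hypothetical names.

```cpp
#include <cstdint>

struct HiLo {
  uint32_t Hi20; // operand of LUI
  int32_t Lo12;  // sign-extended operand of ADDI (or load/store offset)
};

// Split a 32-bit constant into the LUI and ADDI operands. ADDI sign-extends
// its immediate, so adding 0x800 before the shift compensates whenever
// bit 11 of the constant is set.
constexpr HiLo splitImm32(uint32_t Imm) {
  int32_t Lo12 = int32_t(Imm & 0xFFF);
  if (Lo12 >= 0x800)
    Lo12 -= 0x1000;
  uint32_t Hi20 = ((Imm + 0x800u) >> 12) & 0xFFFFF;
  return {Hi20, Lo12};
}

// 3735928559 is the address used by flw_fsw_constant above.
static_assert(splitImm32(3735928559u).Hi20 == 912092, "lui a0, 912092");
static_assert(splitImm32(3735928559u).Lo12 == -273, "offset -273");
```

The two values match the `lui a0, 912092` and the `-273` offset in the test output above, both before and after the change.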

This revision is now accepted and ready to land. Mar 22 2018, 5:34 AM

In D41949#1045516, @asb wrote:

Thanks Mario. I think this is looking good to land now.

Perfect, thank you for the great feedback!

Are you planning a follow-up patch that will show li in disassembly and for generated assembly in simple cases? (matching binutils more closely).

Yes, I will look into it. Doing the same as binutils should be reasonably simple.

I haven't looked into it more closely, but I do note a minor codegen change for float-mem.ll which results in an extra instruction:

Good catch, I missed that codegen change. Seems like the ADDI was previously merged into the FLW. Given that we can solely use PseudoLI for constants we probably only miss a simplification pattern or a simple peephole optimisation. I can have a look but given that I currently do not use floating point instructions it may take some time.

In D41949#1046704, @niosHD wrote:

In D41949#1045516, @asb wrote:

I haven't looked into it more closely, but I do note a minor codegen change for float-mem.ll which results in an extra instruction:

Good catch, I missed that codegen change. Seems like the ADDI was previously merged into the FLW. Given that we can solely use PseudoLI for constants we probably only miss a simplification pattern or a simple peephole optimisation. I can have a look but given that I currently do not use floating point instructions it may take some time.

I'm mainly surprised that we're seeing this codegen change for floating point but not integer loads/stores. I'll try to take a closer look at it before committing, but it's not something that should block this patch anyway.

zzheng added a subscriber: zzheng. Apr 12 2018, 3:54 PM

Rebased on master as Mandeep requested via email.

Currently there are two open "problems" with this patch:

The doPeepholeLoadStoreADDI peephole optimisation cannot currently deal with the PseudoLI instruction, which results in the codegen regression that Alex already detected (see test/CodeGen/RISCV/mem.ll, test/CodeGen/RISCV/float-mem.ll, test/CodeGen/RISCV/double-mem.ll). I am not sure yet whether it is better to extend the current optimisation or to introduce a new one, given that it requires updating the memory instruction as well as the PseudoLI instruction.
The compression support, which landed in the meantime, is not yet integrated into RISCVMCPseudoExpansion. I did a quick experiment and it seems to be easy though. Should I add it to this patch or post a new one?

Best,
Mario

Extended peephole optimisation to fix introduced codegen regression.

I'd do the compressed changes in a different patch. Thanks for updating the peephole RISCVISelDAGToDAG, I'll review that bit ASAP and then commit. At a first look, it seems to handle this exactly as I would expect.

sabuasal added inline comments. Apr 13 2018, 6:52 PM

lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.cpp
45	Hi Mario @niosHD, thanks for the patch, this looks nice. Just a small note about your comment addressing compression, in case you want to update it in the future like @asb suggested. Since you are calling your function (emitRISCVLoadImm) from the InstPrinter (RISCVAsmPrinter::EmitInstruction), the standard way to emit the instruction is by calling EmitToStreamer in your AsmPrinter. In other back-ends this will call AsmPrinter::EmitToStreamer. In RISCV we define our own EmitToStreamer, so all you have to do to support compression is call your AsmPrinter->EmitToStreamer().

sabuasal added inline comments. Apr 13 2018, 6:58 PM

lib/Target/RISCV/RISCVAsmPrinter.cpp
74	I believe I addressed this in my other comment but I actually just saw this comment you had! The way "emitPseudoExpansionLowering" emits the instruction is "EmitToStreamer(OutStreamer, TmpInst);". This way it preserves any behavior in the XXXASMPrinter it is called from. You can check that in any inc file "XXXXGenMCPseudoLowering.inc"

In D41949#1067297, @asb wrote:

I'd do the compressed changes in a different patch. Thanks for updating the peephole RISCVISelDAGToDAG, I'll review that bit ASAP and then commit. At a first look, it seems to handle this exactly as I would expect.

Agreed! I will add compression in a different patch. Considering the inline discussion with Ana and Sameer, any opinion on what is the cleanest way to add compression?

lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.cpp
45	Hi Sameer @sabuasal, thank you for the hint, but I do not think that calling `AsmPrinter::EmitToStreamer` is easily possible. `emitRISCVLoadImm` takes an `MCStreamer` as input because it is available in both the `RISCVAsmParser` and the `RISCVAsmPrinter`, where it is called from. My current compression prototype therefore simply adds the same compression code that has been added to the `RISCVAsmParser` and the `RISCVAsmPrinter` to `emitRISCVLoadImm` (via a static helper function in `RISCVMCPseudoExpansion.cpp`). However, I am not particularly fond of this duplication and am open to alternative ideas.
lib/Target/RISCV/RISCVAsmPrinter.cpp
74	(see above) Returning the instructions, as Ana suggested in the first comment, would be an alternative to adding compression to the `RISCVMCPseudoExpansion`. However, I am still not sure if it is idiomatic for the llvm code base to return a list of instructions from such a function. Further opinions are welcome!
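
To give the "return the instructions" alternative a concrete shape, here is a rough sketch using stand-in types. None of this is LLVM API: `Opcode`, `SimpleInst`, `buildLoadImmSeq` and `emitSeq` are hypothetical, and the Hi20/Lo12 operands are assumed to have been computed by the caller.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for MCInst and its opcodes.
enum class Opcode { LUI, ADDI };
struct SimpleInst {
  Opcode Opc;
  int64_t Imm;
};

// The expansion only decides WHAT to emit and returns the sequence.
inline std::vector<SimpleInst> buildLoadImmSeq(int64_t Hi20, int64_t Lo12) {
  std::vector<SimpleInst> Seq;
  if (Hi20 != 0)
    Seq.push_back({Opcode::LUI, Hi20});
  if (Lo12 != 0 || Hi20 == 0)
    Seq.push_back({Opcode::ADDI, Lo12});
  return Seq;
}

// Each caller decides HOW to emit by passing its own callback, so a
// compression step would live in the caller's emit path rather than in the
// shared expansion. Returns the number of instructions emitted.
template <typename EmitFn>
unsigned emitSeq(const std::vector<SimpleInst> &Seq, EmitFn Emit) {
  unsigned Count = 0;
  for (const SimpleInst &I : Seq) {
    Emit(I);
    ++Count;
  }
  return Count;
}
```

In such a design the asm parser and the asm printer would each supply a callback that forwards to their existing emit function, so neither the streamer nor the compression logic has to be known to the shared helper. Whether returning a small vector is idiomatic enough for the tree is exactly the open question raised above.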

Thanks again Mario. I've reviewed the new RISCVISelDAGToDAG changes and just have a minor comment. This is also looking good when testing with the torture suite. I'll commit this as soon as you can confirm my minor query.

I'll think more about compression handling. If you already have something that works, it might be worth just posting that so we have something concrete to discuss.

lib/Target/RISCV/RISCVISelDAGToDAG.cpp
196 ↗	(On Diff #142423)	Perhaps I'm missing something obvious, but shouldn't this be 'Hi52'?

Updated patch to fix variable names.

In D41949#1069751, @asb wrote:

I'll think more about compression handling. If you already have something that works, it might be worth just posting that so we have something concrete to discuss.

Great, I will post it and then we can continue the discussion in the new review thread.

lib/Target/RISCV/RISCVISelDAGToDAG.cpp
196 ↗	(On Diff #142423)	Indeed, good catch! This is a stupid copy and paste error which, to my embarrassment, originates from the new `emitRISCVLoadImm` function... I'll fix this immediately and refresh the patch.

Closed by commit rL330224: [RISCV] implement li pseudo instruction (authored by asb). Apr 17 2018, 3:00 PM

This revision was automatically updated to reflect the committed changes.

Hi,

This patch causes repeated LUI generation for the following test case :

void foo (int num, int* addr) {
  addr[0] = num*4097;
  addr[1] = num*4098;
  addr[2] = num*4099;
  addr[3] = num*4100;
}

Without patch:

lui     a0, 1
addi    a3, a0, 2
mul     a3, a1, a3
sw      a3, 4(a2)
addi    a3, a0, 1
mul     a3, a1, a3
sw      a3, 0(a2)
addi    a3, a0, 3
mul     a3, a1, a3
sw      a3, 8(a2)
addi    a0, a0, 4
mul     a0, a1, a0
sw      a0, 12(a2)
ret

with patch:

lui     a0, 1
addi    a0, a0, 2
mul     a0, a1, a0
sw      a0, 4(a2)
lui     a0, 1           # repeated lui
addi    a0, a0, 1
mul     a0, a1, a0
sw      a0, 0(a2)
lui     a0, 1           # repeated lui
addi    a0, a0, 3
mul     a0, a1, a0
sw      a0, 8(a2)
lui     a0, 1           # repeated lui
addi    a0, a0, 4
mul     a0, a1, a0
sw      a0, 12(a2)
ret

I think this is because we are hiding the %hi part of the immediate from the SelectionDAG, so it doesn't optimize it away.

Sorry for the late reply. I thought you were going to hold off committing this till the compression issue is addressed.

Hi Sameer,

thank you for reporting the issue. Eli already predicted that we lose CSE for LUI due to the use of the pseudo instruction. I was therefore already kind of expecting a missed optimisation of this form. On the plus side, considering that we only emit 32-bit constants in the codegen path, I am pretty confident that the LUI duplication is as bad as it gets. Still, we definitely should fix this issue.

Unfortunately, I do not see an easy fix as long as we stick with emitting pseudo instructions during codegen. When I introduced this in January it was a pure win given that it improved code quality and de-duplicated code. However, this may have to be revisited now given that the backend has been improved considerably in the meantime. To be perfectly honest, I would probably take a step backward in this situation and remove the automatic emission of the PseudoLI instruction from the codegen path again. I still think it is desirable in the long run to share the calculation of the individual immediate values and shift constants (if needed) between the codegen path and the MC layer. However, the current approach does not seem to be the right one. What do you guys think?

I'm just working through this now. I'll play around with the options and update this thread within the next few hours.

Ok, I've had a good think about this issue. I was slightly over-eager in committing this last night. Something like PseudoLI seems necessary for the more complex materialisation logic required for 64-bit immediates in RV64, but we can do without for RV32. I've weighed up whether to revert and revise, or to make changes post commit.

I suggest the following:

I'll update test/CodeGen/RISCV/imm.ll so it contains tests for imm32_hi20_only which are the primary benefit of this patch for codegen
I'll improve testing of the codegen->compression path so that we have tests that would pick up any change in the status of compression of code sequences for materialising constants
I'll add a test for common subexpression elimination of code sequences for materializing constants, similar to Sameer's example
I'll restore the previous imm32 pattern, add a new pattern for immediates with lo12 == 0 and remove the pattern using PseudoLI. This retains the codegen improvements and will fix the issues with compressed instruction emission.
I'll revert the RISCVISelDAGToDAG changes as that code path is now dead. I think it's still worth having PseudoLI available to the backend as it's already an improvement over the old movImm32 code.

I have patches written for all the above, and will get going with committing them.

Herald added a subscriber: edward-jones. Apr 18 2018, 9:22 AM

Thanks Mario and Alex for the patch and addressing the code size concerns.
We see a code size increase of < 1% overall, but for one particular test, SPEC2006/bzip2, the increase is 12%.

asb mentioned this in rL330288: [RISCV] Expand codegen -> compression sanity checks and move to a single file. Apr 18 2018, 1:20 PM

asb mentioned this in rL330291: [RISCV] Add imm-cse.ll test case. Apr 18 2018, 1:28 PM

In the end I decided to temporarily revert this patch. I've committed patches that fill in the holes identified in testing, and added the straight-forward patch for imm32 with lo12=0.

Better tests for imm32_hi20_only (rL330274)
Improved testing of the codegen -> compression codepath (rL330288)
Test for common subexpression elimination when materialising immediates (rL330291)
Add pattern for immediates with zero for the lower 12 bits (rL330293)

I think the best path forwards for this patch is to start by getting a version of it landed that only adds support in the MC layer for li. Would you be happy to make that change Mario? Apologies for the hassle.

As you say, it would be good to share logic between MC and codegen for materialising constants, especially the more complex logic for 64-bit constants. But let's get the MC bit landed and then we can revisit.

Thank you Mario for all of your work on this, and thanks Ana and Sameer for the feedback.

niosHD mentioned this in D46118: [RISCV] AsmParser support for the li pseudo instruction. Apr 26 2018, 6:20 AM

asb mentioned this in rL334203: [RISCV] AsmParser support for the li pseudo instruction. Jun 7 2018, 8:40 AM

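Revision Contents

Path	Size
lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp	42 lines
lib/Target/RISCV/MCTargetDesc/CMakeLists.txt	1 line
lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.h	29 lines
lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.cpp	92 lines
lib/Target/RISCV/RISCVAsmPrinter.cpp	11 lines
lib/Target/RISCV/RISCVInstrFormats.td	4 lines
lib/Target/RISCV/RISCVInstrInfo.cpp	13 lines
lib/Target/RISCV/RISCVInstrInfo.td	24 lines
test/CodeGen/RISCV/bswap-ctlz-cttz-ctpop.ll	159 lines
test/CodeGen/RISCV/calling-conv.ll	18 lines
test/CodeGen/RISCV/mem.ll	4 lines
test/CodeGen/RISCV/vararg.ll	16 lines
test/MC/RISCV/rv32i-aliases-invalid.s	2 lines
test/MC/RISCV/rv32i-aliases-valid.s	60 lines
test/MC/RISCV/rv64i-aliases-valid.s	86 lines
test/MC/RISCV/rvi-aliases-valid.s	5 lines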

Diff 133026

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp

//===-- RISCVAsmParser.cpp - Parse RISCV assembly to MCInst instructions --===//		//===-- RISCVAsmParser.cpp - Parse RISCV assembly to MCInst instructions --===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "MCTargetDesc/RISCVBaseInfo.h"		#include "MCTargetDesc/RISCVBaseInfo.h"
#include "MCTargetDesc/RISCVMCExpr.h"		#include "MCTargetDesc/RISCVMCExpr.h"
		#include "MCTargetDesc/RISCVMCPseudoExpansion.h"
#include "MCTargetDesc/RISCVMCTargetDesc.h"		#include "MCTargetDesc/RISCVMCTargetDesc.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/MC/MCContext.h"		#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCExpr.h"		#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCInst.h"		#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCParser/MCAsmLexer.h"		#include "llvm/MC/MCParser/MCAsmLexer.h"
#include "llvm/MC/MCParser/MCParsedAsmOperand.h"		#include "llvm/MC/MCParser/MCParsedAsmOperand.h"
#include "llvm/MC/MCParser/MCTargetAsmParser.h"		#include "llvm/MC/MCParser/MCTargetAsmParser.h"
#include "llvm/MC/MCRegisterInfo.h"		#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/MC/MCStreamer.h"		#include "llvm/MC/MCStreamer.h"
#include "llvm/MC/MCSubtargetInfo.h"		#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
		#include "llvm/Support/MathExtras.h"
#include "llvm/Support/TargetRegistry.h"		#include "llvm/Support/TargetRegistry.h"

		#include <limits>

using namespace llvm;		using namespace llvm;

namespace {		namespace {
struct RISCVOperand;		struct RISCVOperand;

class RISCVAsmParser : public MCTargetAsmParser {		class RISCVAsmParser : public MCTargetAsmParser {
SMLoc getLoc() const { return getParser().getTok().getLoc(); }		SMLoc getLoc() const { return getParser().getTok().getLoc(); }
bool isRV64() const { return getSTI().hasFeature(RISCV::Feature64Bit); }		bool isRV64() const { return getSTI().hasFeature(RISCV::Feature64Bit); }
Show All 11 Lines	class RISCVAsmParser : public MCTargetAsmParser {

bool ParseRegister(unsigned &RegNo, SMLoc &StartLoc, SMLoc &EndLoc) override;		bool ParseRegister(unsigned &RegNo, SMLoc &StartLoc, SMLoc &EndLoc) override;

bool ParseInstruction(ParseInstructionInfo &Info, StringRef Name,		bool ParseInstruction(ParseInstructionInfo &Info, StringRef Name,
SMLoc NameLoc, OperandVector &Operands) override;		SMLoc NameLoc, OperandVector &Operands) override;

bool ParseDirective(AsmToken DirectiveID) override;		bool ParseDirective(AsmToken DirectiveID) override;

		bool processInstruction(MCInst &Inst, SMLoc IDLoc, MCStreamer &Out);

// Auto-generated instruction matching functions		// Auto-generated instruction matching functions
#define GET_ASSEMBLER_HEADER		#define GET_ASSEMBLER_HEADER
#include "RISCVGenAsmMatcher.inc"		#include "RISCVGenAsmMatcher.inc"

OperandMatchResultTy parseImmediate(OperandVector &Operands);		OperandMatchResultTy parseImmediate(OperandVector &Operands);
OperandMatchResultTy parseRegister(OperandVector &Operands,		OperandMatchResultTy parseRegister(OperandVector &Operands,
bool AllowParens = false);		bool AllowParens = false);
OperandMatchResultTy parseMemOpBaseReg(OperandVector &Operands);		OperandMatchResultTy parseMemOpBaseReg(OperandVector &Operands);
▲ Show 20 Lines • Show All 139 Lines • ▼ Show 20 Lines	bool isFRMArg() const {
if (!SVal \|\| SVal->getKind() != MCSymbolRefExpr::VK_None)		if (!SVal \|\| SVal->getKind() != MCSymbolRefExpr::VK_None)
return false;		return false;

StringRef Str = SVal->getSymbol().getName();		StringRef Str = SVal->getSymbol().getName();

return RISCVFPRndMode::stringToRoundingMode(Str) != RISCVFPRndMode::Invalid;		return RISCVFPRndMode::stringToRoundingMode(Str) != RISCVFPRndMode::Invalid;
}		}

		bool isSImmXLen() const {
		int64_t Imm;
		RISCVMCExpr::VariantKind VK;
		if (!isImm())
		return false;
		bool IsConstantImm = evaluateConstantImm(Imm, VK);
		bool IsInRange = isRV64() ? true : isInt<32>(Imm);
		return IsConstantImm && IsInRange && VK == RISCVMCExpr::VK_RISCV_None;
		}
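
Annotation: the range rule that `isSImmXLen` applies can be sketched outside LLVM as follows. This is a minimal sketch, not the patch's code: `fitsSigned` and `isSImmXLenSketch` are hypothetical names standing in for `isInt<32>` and the operand predicate, and the constant-evaluation and variant-kind checks are left out.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::isInt<N>: true if X fits in an N-bit
// two's complement integer.
template <unsigned N> static bool fitsSigned(int64_t X) {
  static_assert(N > 0 && N < 64, "width must be between 1 and 63");
  return X >= -(int64_t(1) << (N - 1)) && X < (int64_t(1) << (N - 1));
}

// Hypothetical model of the range part of isSImmXLen: every int64_t is
// accepted on RV64, while RV32 only accepts values that fit in 32 bits.
static bool isSImmXLenSketch(int64_t Imm, bool IsRV64) {
  return IsRV64 || fitsSigned<32>(Imm);
}
```

Under this rule an out-of-range diagnostic can only occur on RV32, which is what the assert in the `Match_InvalidSImmXLen` case relies on.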

bool isUImmLog2XLen() const {		bool isUImmLog2XLen() const {
int64_t Imm;		int64_t Imm;
RISCVMCExpr::VariantKind VK;		RISCVMCExpr::VariantKind VK;
if (!isImm())		if (!isImm())
return false;		return false;
if (!evaluateConstantImm(Imm, VK) \|\| VK != RISCVMCExpr::VK_RISCV_None)		if (!evaluateConstantImm(Imm, VK) \|\| VK != RISCVMCExpr::VK_RISCV_None)
return false;		return false;
return (isRV64() && isUInt<6>(Imm)) \|\| isUInt<5>(Imm);		return (isRV64() && isUInt<6>(Imm)) \|\| isUInt<5>(Imm);
▲ Show 20 Lines • Show All 359 Lines • ▼ Show 20 Lines	bool RISCVAsmParser::MatchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode,
uint64_t &ErrorInfo,		uint64_t &ErrorInfo,
bool MatchingInlineAsm) {		bool MatchingInlineAsm) {
MCInst Inst;		MCInst Inst;

switch (MatchInstructionImpl(Operands, Inst, ErrorInfo, MatchingInlineAsm)) {		switch (MatchInstructionImpl(Operands, Inst, ErrorInfo, MatchingInlineAsm)) {
default:		default:
break;		break;
case Match_Success:		case Match_Success:
Inst.setLoc(IDLoc);		return processInstruction(Inst, IDLoc, Out);
		apazos (Not Done): You changed getSTI() -> STI, was it intentional?
		niosHD (Author, Not Done): No, good catch, although I am not sure whether using `getSTI` is better than accessing STI directly. (Probably a matter of taste.) On a closer look, the whole `MCSubtargetInfo` operand of `processInstruction` seems to be redundant and can be removed, given that we can access the STI within the method as well. Still, is it preferred to access `STI` via `getSTI` in the AsmParser?
Out.EmitInstruction(Inst, getSTI());
return false;
case Match_MissingFeature:		case Match_MissingFeature:
return Error(IDLoc, "instruction use requires an option to be enabled");		return Error(IDLoc, "instruction use requires an option to be enabled");
case Match_MnemonicFail:		case Match_MnemonicFail:
return Error(IDLoc, "unrecognized instruction mnemonic");		return Error(IDLoc, "unrecognized instruction mnemonic");
case Match_InvalidOperand: {		case Match_InvalidOperand: {
SMLoc ErrorLoc = IDLoc;		SMLoc ErrorLoc = IDLoc;
if (ErrorInfo != ~0U) {		if (ErrorInfo != ~0U) {
if (ErrorInfo >= Operands.size())		if (ErrorInfo >= Operands.size())
return Error(ErrorLoc, "too few operands for instruction");		return Error(ErrorLoc, "too few operands for instruction");

ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();		ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();
if (ErrorLoc == SMLoc())		if (ErrorLoc == SMLoc())
ErrorLoc = IDLoc;		ErrorLoc = IDLoc;
}		}
return Error(ErrorLoc, "invalid operand for instruction");		return Error(ErrorLoc, "invalid operand for instruction");
}		}
		case Match_InvalidSImmXLen:
		assert(!isRV64() && "immediate can not be out of range on RV64");
		return generateImmOutOfRangeError(Operands, ErrorInfo,
		std::numeric_limits<int32_t>::min(),
		std::numeric_limits<int32_t>::max());
case Match_InvalidUImmLog2XLen:		case Match_InvalidUImmLog2XLen:
if (isRV64())		if (isRV64())
return generateImmOutOfRangeError(Operands, ErrorInfo, 0, (1 << 6) - 1);		return generateImmOutOfRangeError(Operands, ErrorInfo, 0, (1 << 6) - 1);
return generateImmOutOfRangeError(Operands, ErrorInfo, 0, (1 << 5) - 1);		return generateImmOutOfRangeError(Operands, ErrorInfo, 0, (1 << 5) - 1);
case Match_InvalidUImmLog2XLenNonZero:		case Match_InvalidUImmLog2XLenNonZero:
if (isRV64())		if (isRV64())
return generateImmOutOfRangeError(Operands, ErrorInfo, 1, (1 << 6) - 1);		return generateImmOutOfRangeError(Operands, ErrorInfo, 1, (1 << 6) - 1);
return generateImmOutOfRangeError(Operands, ErrorInfo, 1, (1 << 5) - 1);		return generateImmOutOfRangeError(Operands, ErrorInfo, 1, (1 << 5) - 1);
▲ Show 20 Lines • Show All 322 Lines • ▼ Show 20 Lines	bool RISCVAsmParser::classifySymbolRef(const MCExpr *Expr,
Addend = AddendExpr->getValue();		Addend = AddendExpr->getValue();
if (BE->getOpcode() == MCBinaryExpr::Sub)		if (BE->getOpcode() == MCBinaryExpr::Sub)
Addend = -Addend;		Addend = -Addend;

// It's some symbol reference + a constant addend		// It's some symbol reference + a constant addend
return Kind != RISCVMCExpr::VK_RISCV_Invalid;		return Kind != RISCVMCExpr::VK_RISCV_Invalid;
}		}

		bool RISCVAsmParser::processInstruction(MCInst &Inst, SMLoc IDLoc,
		asb (Done): It would be nice to add a comment documenting the purpose of processInstruction.
		MCStreamer &Out) {
		Inst.setLoc(IDLoc);

		switch (Inst.getOpcode()) {
		case RISCV::PseudoLI: {
		const MCOperand &DstRegOp = Inst.getOperand(0);
		const MCOperand &ImmOp = Inst.getOperand(1);
		emitRISCVLoadImm(DstRegOp.getReg(), ImmOp.getImm(), Out, STI);
		return false;
		}
		}

		asb (Done): You might as well use SignExtend64 from MathExtras here.
		Out.EmitInstruction(Inst, *STI);
		return false;
		}

bool RISCVAsmParser::ParseDirective(AsmToken DirectiveID) { return true; }		bool RISCVAsmParser::ParseDirective(AsmToken DirectiveID) { return true; }

extern "C" void LLVMInitializeRISCVAsmParser() {		extern "C" void LLVMInitializeRISCVAsmParser() {
RegisterMCAsmParser<RISCVAsmParser> X(getTheRISCV32Target());		RegisterMCAsmParser<RISCVAsmParser> X(getTheRISCV32Target());
		asb (Done): Locals are normally capitalised in the LLVM coding style.
		niosHD (Author, Done): Right, I really need to read the coding style again. Will be fixed in the next iteration.
RegisterMCAsmParser<RISCVAsmParser> Y(getTheRISCV64Target());		RegisterMCAsmParser<RISCVAsmParser> Y(getTheRISCV64Target());
}		}

lib/Target/RISCV/MCTargetDesc/CMakeLists.txt

	add_llvm_library(LLVMRISCVDesc			add_llvm_library(LLVMRISCVDesc
	RISCVAsmBackend.cpp			RISCVAsmBackend.cpp
	RISCVELFObjectWriter.cpp			RISCVELFObjectWriter.cpp
	RISCVMCAsmInfo.cpp			RISCVMCAsmInfo.cpp
	RISCVMCCodeEmitter.cpp			RISCVMCCodeEmitter.cpp
	RISCVMCExpr.cpp			RISCVMCExpr.cpp
	RISCVMCTargetDesc.cpp			RISCVMCTargetDesc.cpp
				RISCVMCPseudoExpansion.cpp
				asb (Done): Sort alphabetically.
	)			)

lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.h

This file was added.

				//===-- RISCVMCPseudoExpansion.h - RISCV MC Pseudo Expansion ----- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				/// This file describes helpers to expand pseudo MC instructions that are usable
				/// in the AsmParser and the AsmPrinter.
				//
				//===----------------------------------------------------------------------===//
				#ifndef LLVM_LIB_TARGET_RISCV_MCTARGETDESC_RISCVMCPSEUDOEXPANSION_H
				#define LLVM_LIB_TARGET_RISCV_MCTARGETDESC_RISCVMCPSEUDOEXPANSION_H

				#include <cstdint>

				namespace llvm {

				class MCStreamer;
				class MCSubtargetInfo;

				void emitRISCVLoadImm(unsigned int DestReg, int64_t Value, MCStreamer &Out,
				const MCSubtargetInfo *STI);
				asb (Done): Write this as `unsigned DestReg`.

				} // namespace llvm

				#endif

lib/Target/RISCV/MCTargetDesc/RISCVMCPseudoExpansion.cpp

This file was added.

				//===-- RISCVMCPseudoExpansion.cpp - RISCV MC Pseudo Expansion ------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// This file provides helpers to expand pseudo MC instructions that are usable
				/// in the AsmParser and the AsmPrinter.
				///
				//===----------------------------------------------------------------------===//

				#include "RISCVMCPseudoExpansion.h"
				#include "RISCVMCTargetDesc.h"
				#include "llvm/MC/MCInstBuilder.h"
				#include "llvm/MC/MCStreamer.h"
				#include "llvm/MC/MCSubtargetInfo.h"
				#include "llvm/Support/MathExtras.h"

				#include <cassert>

				using namespace llvm;

				namespace {

				asb (Done): LLVM coding standards suggest just using `static` for single functions: https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
				// Scan the value to find the first non zero bit, i.e., the bit that has to be
				// emitted and is not solely the result of zero extension due to SLLI.
				int findFirstSetBit(int64_t Value, int StartOffset = 0) {
				Value >>= StartOffset;
				asb (Done): Can we not avoid this and use findFirstSet from MathExtras? Unless I'm missing something, you could just mask out the first 12 bits at the call site and so avoid the need for the 'StartOffset' parameter.
				niosHD (Author, Done): I will have a look; with an additional `if` at the call site it should probably work. If I remember correctly, this function is more or less a remnant of an older revision of the patch where Value was checked for 0 and -1.
				if (Value == 0)
				return StartOffset;
				return StartOffset + countTrailingZeros((uint64_t)Value, ZB_Undefined);
				}
				apazos (Done): extra {}

				} // namespace
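
Annotation: the suggestion to mask out the low 12 bits at the call site, instead of passing a `StartOffset`, can be sketched as below. This is a standalone sketch under stated assumptions: `countTrailingZeros64`, `findFirstSetBitWithOffset` and `firstSetBitFrom12` are hypothetical names, and a plain loop stands in for the MathExtras helper so that the snippet compiles without LLVM.

```cpp
#include <cstdint>

// Hypothetical loop-based replacement for a count-trailing-zeros helper.
// X must be nonzero.
static int countTrailingZeros64(uint64_t X) {
  int N = 0;
  while ((X & 1) == 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// Models the helper from the patch: skip StartOffset bits, then count.
// The shift is done on the unsigned representation.
static int findFirstSetBitWithOffset(int64_t Value, int StartOffset) {
  uint64_t Shifted = static_cast<uint64_t>(Value) >> StartOffset;
  if (Shifted == 0)
    return StartOffset;
  return StartOffset + countTrailingZeros64(Shifted);
}

// Call-site variant: clear the low 12 bits and count directly, with one
// extra check for the case where nothing above bit 11 is set.
static int firstSetBitFrom12(int64_t Value) {
  uint64_t Masked = static_cast<uint64_t>(Value) & ~uint64_t(0xFFF);
  if (Masked == 0)
    return 12;
  return countTrailingZeros64(Masked);
}
```

Both functions return the same index for an offset of 12; the extra zero check in the second one corresponds to the "additional if at the call site" mentioned in the reply.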

				void llvm::emitRISCVLoadImm(unsigned int DestReg, int64_t Value,
				apazos (Done): extra {}
				MCStreamer &Out, const MCSubtargetInfo *STI) {
				if (isInt<32>(Value)) {
				// Emits the MC instructions for loading a 32-bit constant into a register.
				//
				// Depending on the active bits in the immediate Value v, the following
				// instruction sequences are emitted:
				sabuasal (Not Done): Hi Mario @niosHD, thanks for the patch, this looks nice. Just a small note on your comment about compression, in case you want to update it in the future as @asb suggested. Since you are calling your function (emitRISCVLoadImm) from the InstPrinter (RISCVAsmPrinter::EmitInstruction), the standard way to emit the instruction is to call EmitToStreamer in your AsmPrinter. In other back-ends this calls AsmPrinter::EmitToStreamer; in RISCV we define our own EmitToStreamer, so all you have to do to support compression is call your AsmPrinter->EmitToStreamer().
				niosHD (Author, Not Done): Hi Sameer @sabuasal, thank you for the hint, but I do not think that calling `AsmPrinter::EmitToStreamer` is easily possible. `emitRISCVLoadImm` takes an `MCStreamer` as input because it is available in both the `RISCVAsmParser` and the `RISCVAsmPrinter`, which are the two places it is called from. My current compression prototype therefore simply adds the same compression code that has been added to the `RISCVAsmParser` and the `RISCVAsmPrinter` to `emitRISCVLoadImm` (via a static helper function in `RISCVMCPseudoExpansion.cpp`). However, I am not particularly fond of this duplication and am open to alternative ideas.
				//
				// v == 0 : ADDI(W)
				// v[0,12) != 0 && v[12,32) == 0 : ADDI(W)
				// v[0,12) == 0 && v[12,32) != 0 : LUI
				// v[0,32) != 0 : LUI+ADDI(W)
				//
				int64_t Hi20 = ((Value + 0x800) >> 12) & 0xFFFFF;
				int64_t Lo12 = SignExtend64<12>(Value & 0xFFF);
				unsigned int SrcReg = RISCV::X0;
				asb (Done): No need to mask the value passed to SignExtend64. Although it does no harm, I'd recommend changing to `SignExtend64<12>(Value)`.

				asb (Done): Just `unsigned` is more usual in the LLVM tree.
				if (Hi20) {
				Out.EmitInstruction(
				MCInstBuilder(RISCV::LUI).addReg(DestReg).addImm(Hi20), *STI);
				SrcReg = DestReg;
				}

				if (Lo12 \|\| Hi20 == 0) {
				auto AddiOpcode =
				STI->hasFeature(RISCV::Feature64Bit) ? RISCV::ADDIW : RISCV::ADDI;
				asb (Done): LLVM is somewhat conservative when it comes to the use of auto. Given that there's not much saving in space, I'd be explicit and use `unsigned` here.
				Out.EmitInstruction(
				MCInstBuilder(AddiOpcode).addReg(DestReg).addReg(SrcReg).addImm(Lo12),
				*STI);
				}
				return;
				}
				assert(STI->hasFeature(RISCV::Feature64Bit));

				asb (Done): Should have something like `&& "Target must be 64-bit to support a >32-bit constant"` or whatever phrasing you prefer.
				// If more than 32 bits have to be emitted, schedule an ADDI instruction which
				// handles 12 bits and deal with the remaining bits recursively.
				int64_t Hi = (Value + 0x800);
				asb (Done): The comment describing how emitting 32-bit constants works was fantastic; it would be nice to expand this comment to a similar level of detail. The comment doesn't quite seem to match the behaviour either, as in the implementation emitRISCVLoadImm is called recursively before emitting any other instructions.
				niosHD (Author, Done): I will try to expand/improve the comment to make it clearer. The basic idea was to convey that, at this point, it is already fixed that an ADDI is going to be emitted (hence scheduled). The actual emission, on the other hand, is performed after the recursive call returns. However, I obviously failed at expressing this and will try again. ;)
				int LsbIndex = findFirstSetBit(Hi, 12);
				Hi >>= LsbIndex;

				asb (Done): Do you rely on this being an arithmetic right shift? I'm not 100% sure if the C++ standard guarantees that.
				niosHD (Author, Done): Yes, I rely on that, and it indeed does not seem to be guaranteed by the standard. I can add an additional `SignExtend64` call to make it clear. However, the implementation of `SignExtend64` relies on right shifts being arithmetic too.
				emitRISCVLoadImm(DestReg, Hi, Out, STI);

				Out.EmitInstruction(MCInstBuilder(RISCV::SLLI)
				.addReg(DestReg)
				.addReg(DestReg)
				.addImm(LsbIndex),
				*STI);

				int64_t Lo12 = SignExtend64<12>(Value & 0xFFF);
				if (Lo12)
				Out.EmitInstruction(
				MCInstBuilder(RISCV::ADDI).addReg(DestReg).addReg(DestReg).addImm(Lo12),
				*STI);
				}
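
Annotation: the expansion above can be checked with a small standalone model. The following is a sketch under stated assumptions, not the committed implementation: `SketchInst`, `signExtend`, `expandLoadImm` and `simulate` are hypothetical names, instructions are recorded in a vector instead of being sent to an `MCStreamer`, and all shifts are done on unsigned values so the model does not depend on arithmetic right shifts or on signed overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an emitted instruction; not an LLVM type.
struct SketchInst {
  std::string Opcode; // "LUI", "ADDI", "ADDIW" or "SLLI"
  bool SrcIsX0;       // true if the source operand is the zero register
  int64_t Imm;
};

// Sign-extend the low Bits bits of X using unsigned arithmetic only.
static int64_t signExtend(uint64_t X, unsigned Bits) {
  uint64_t SignBit = uint64_t(1) << (Bits - 1);
  X &= (SignBit << 1) - 1;
  return static_cast<int64_t>((X ^ SignBit) - SignBit);
}

static void expandLoadImm(int64_t Value, bool IsRV64,
                          std::vector<SketchInst> &Out) {
  int64_t Lo12 = signExtend(static_cast<uint64_t>(Value), 12);
  if (Value >= INT32_MIN && Value <= INT32_MAX) {
    // 32-bit case: LUI for the upper 20 bits, ADDI(W) for the lower 12.
    uint64_t Hi20 = ((static_cast<uint64_t>(Value) + 0x800) >> 12) & 0xFFFFF;
    if (Hi20)
      Out.push_back({"LUI", false, static_cast<int64_t>(Hi20)});
    if (Lo12 || Hi20 == 0)
      Out.push_back({IsRV64 ? "ADDIW" : "ADDI", Hi20 == 0, Lo12});
    return;
  }
  assert(IsRV64 && "constants wider than 32 bits need a 64-bit target");
  // Remove the low 12 bits (with wraparound), drop the trailing zeros, load
  // the remaining upper part recursively, then shift it back and add Lo12.
  uint64_t Hi = static_cast<uint64_t>(Value) - static_cast<uint64_t>(Lo12);
  unsigned Shift = 0;
  while ((Hi & 1) == 0) {
    Hi >>= 1;
    ++Shift;
  }
  expandLoadImm(signExtend(Hi, 64 - Shift), IsRV64, Out);
  Out.push_back({"SLLI", false, static_cast<int64_t>(Shift)});
  if (Lo12)
    Out.push_back({"ADDI", false, Lo12});
}

// Interpret the recorded sequence on one destination register.
static int64_t simulate(const std::vector<SketchInst> &Insts, bool IsRV64) {
  uint64_t Reg = 0xDEADBEEF; // arbitrary stale register contents
  for (const SketchInst &I : Insts) {
    uint64_t Src = I.SrcIsX0 ? 0 : Reg;
    uint64_t Imm = static_cast<uint64_t>(I.Imm);
    if (I.Opcode == "LUI")
      Reg = static_cast<uint64_t>(signExtend(Imm << 12, 32));
    else if (I.Opcode == "ADDI")
      Reg = Src + Imm;
    else if (I.Opcode == "ADDIW")
      Reg = static_cast<uint64_t>(signExtend(Src + Imm, 32));
    else // SLLI
      Reg = Src << I.Imm;
  }
  // On RV32 only the low 32 bits of the modelled register exist.
  return IsRV64 ? static_cast<int64_t>(Reg) : signExtend(Reg, 32);
}
```

In this model, `0x12345678` expands to LUI followed by ADDI on RV32, and `INT64_MAX` on RV64 expands to ADDIW, SLLI, ADDI. The model also shows why the 32-bit path uses ADDIW on RV64: for `0x7FFFFFFF`, LUI produces a sign-extended negative value, and only the 32-bit wrapping add brings the result back to the intended constant.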

lib/Target/RISCV/RISCVAsmPrinter.cpp

	//===-- RISCVAsmPrinter.cpp - RISCV LLVM assembly writer ------------------===//			//===-- RISCVAsmPrinter.cpp - RISCV LLVM assembly writer ------------------===//
	//			//
	// The LLVM Compiler Infrastructure			// The LLVM Compiler Infrastructure
	//			//
	// This file is distributed under the University of Illinois Open Source			// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.			// License. See LICENSE.TXT for details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// This file contains a printer that converts from our internal representation			// This file contains a printer that converts from our internal representation
	// of machine-dependent LLVM code to the RISCV assembly language.			// of machine-dependent LLVM code to the RISCV assembly language.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "RISCV.h"
	#include "InstPrinter/RISCVInstPrinter.h"			#include "InstPrinter/RISCVInstPrinter.h"
				#include "MCTargetDesc/RISCVMCPseudoExpansion.h"
				#include "RISCV.h"
	#include "RISCVTargetMachine.h"			#include "RISCVTargetMachine.h"
	#include "llvm/CodeGen/AsmPrinter.h"			#include "llvm/CodeGen/AsmPrinter.h"
	#include "llvm/CodeGen/MachineConstantPool.h"			#include "llvm/CodeGen/MachineConstantPool.h"
	#include "llvm/CodeGen/MachineFunctionPass.h"			#include "llvm/CodeGen/MachineFunctionPass.h"
	#include "llvm/CodeGen/MachineInstr.h"			#include "llvm/CodeGen/MachineInstr.h"
	#include "llvm/CodeGen/MachineModuleInfo.h"			#include "llvm/CodeGen/MachineModuleInfo.h"
	#include "llvm/MC/MCAsmInfo.h"			#include "llvm/MC/MCAsmInfo.h"
	#include "llvm/MC/MCInst.h"			#include "llvm/MC/MCInst.h"
	Show All 37 Lines
	// instructions) auto-generated.			// instructions) auto-generated.
	#include "RISCVGenMCPseudoLowering.inc"			#include "RISCVGenMCPseudoLowering.inc"

	void RISCVAsmPrinter::EmitInstruction(const MachineInstr *MI) {			void RISCVAsmPrinter::EmitInstruction(const MachineInstr *MI) {
	// Do any auto-generated pseudo lowerings.			// Do any auto-generated pseudo lowerings.
	if (emitPseudoExpansionLowering(*OutStreamer, MI))			if (emitPseudoExpansionLowering(*OutStreamer, MI))
	return;			return;

				if (MI->getOpcode() == RISCV::PseudoLI) {
				const MachineOperand &DstRegOp = MI->getOperand(0);
				const MachineOperand &ImmOp = MI->getOperand(1);
				emitRISCVLoadImm(DstRegOp.getReg(), ImmOp.getImm(), *OutStreamer,
				apazos (Not Done): Can't we return the new instruction from this function and reuse the EmitToStreamer call below? This way we reduce the number of places where compression calls have to be inserted when instruction compression at the MC level is enabled.
				niosHD (Author, Not Done): Theoretically yes, but isn't compression done in the `EmitInstruction` of the streamer? The code here is basically a custom MI to MC lowering. It uses the same `EmitInstruction` function which is also used internally by the generated `emitPseudoExpansionLowering`. Maybe I am missing something, but assuming that MC compression works in conjunction with pseudo expansion, I expect that it also works for the current code.
				sabuasal (Not Done): I believe I addressed this in my other comment, but I only just saw this comment of yours! The way "emitPseudoExpansionLowering" emits the instruction is "EmitToStreamer(OutStreamer, TmpInst);". This way it preserves any behavior of the XXXAsmPrinter it is called from. You can check that in any "XXXXGenMCPseudoLowering.inc" file.
				niosHD (Author, Not Done): (see above) Returning the instructions, as Ana suggested in the first comment, would be an alternative to adding compression to the `RISCVMCPseudoExpansion`. However, I am still not sure if it is idiomatic for the LLVM code base to return a list of instructions from such a function. Further opinions are welcome!
				&getSubtargetInfo());
				return;
				}

	MCInst TmpInst;			MCInst TmpInst;
	LowerRISCVMachineInstrToMCInst(MI, TmpInst, *this);			LowerRISCVMachineInstrToMCInst(MI, TmpInst, *this);
	EmitToStreamer(*OutStreamer, TmpInst);			EmitToStreamer(*OutStreamer, TmpInst);
	}			}

	bool RISCVAsmPrinter::PrintAsmOperand(const MachineInstr *MI, unsigned OpNo,			bool RISCVAsmPrinter::PrintAsmOperand(const MachineInstr *MI, unsigned OpNo,
	unsigned AsmVariant,			unsigned AsmVariant,
	const char *ExtraCode, raw_ostream &OS) {			const char *ExtraCode, raw_ostream &OS) {
	▲ Show 20 Lines • Show All 50 Lines • Show Last 20 Lines

lib/Target/RISCV/RISCVInstrFormats.td

Show First 20 Lines • Show All 96 Lines • ▼ Show 20 Lines	class RVInst<dag outs, dag ins, string opcodestr, string argstr,
dag InOperandList = ins;		dag InOperandList = ins;
let AsmString = opcodestr # "\t" # argstr;		let AsmString = opcodestr # "\t" # argstr;
let Pattern = pattern;		let Pattern = pattern;

let TSFlags{4-0} = format.Value;		let TSFlags{4-0} = format.Value;
}		}

// Pseudo instructions		// Pseudo instructions
class Pseudo<dag outs, dag ins, list<dag> pattern>		class Pseudo<dag outs, dag ins, list<dag> pattern, string opcodestr = "", string argstr = "">
: RVInst<outs, ins, "", "", pattern, InstFormatPseudo> {		: RVInst<outs, ins, opcodestr, argstr, pattern, InstFormatPseudo> {
let isPseudo = 1;		let isPseudo = 1;
let isCodeGenOnly = 1;		let isCodeGenOnly = 1;
}		}
		asb (Done): Do you think having these properties inferred might be a bit 'magic'? I'm not really decided one way or another myself, but it does seem a bit non-obvious.
		niosHD (Author, Done): Well, the motivation for the magic was simply to keep the patch minimal. At first I intended to introduce a new `AsmPseudo` class. However, I decided against it because deriving from Pseudo did not feel particularly clean, given that a real `AsmPseudo` is typically not a `CodeGenPseudo`. With the mental model that a Pseudo is either an `AsmPseudo` or a `CodeGenPseudo`, inferring the type from opcodestr is not that bad. Anyway, I got the code wrong. The `isAsmParserOnly` assignment should be negated (i.e., `!if(!eq(opcodestr,""), 0, 1)`). Regarding alternatives, for me the cleanest option would be to introduce a new `PseudoBaseClass` and derive an `AsmPseudo` and a `CodeGenPseudo` from it. (Naming suggestions are welcome.)
		asb (Done): Could you just override isCodeGenOnly/isAsmParserOnly when necessary: `let isAsmParserOnly = 1 in def FooInst : Pseudo<....>`

// Instruction formats are listed in the order they appear in the RISC-V		// Instruction formats are listed in the order they appear in the RISC-V
// instruction set manual (R, I, S, B, U, J) with sub-formats (e.g. RVInstR4,		// instruction set manual (R, I, S, B, U, J) with sub-formats (e.g. RVInstR4,
// RVInstRAtomic) sorted alphabetically.		// RVInstRAtomic) sorted alphabetically.

class RVInstR<bits<7> funct7, bits<3> funct3, RISCVOpcode opcode, dag outs,		class RVInstR<bits<7> funct7, bits<3> funct3, RISCVOpcode opcode, dag outs,
dag ins, string opcodestr, string argstr>		dag ins, string opcodestr, string argstr>
: RVInst<outs, ins, opcodestr, argstr, [], InstFormatR> {		: RVInst<outs, ins, opcodestr, argstr, [], InstFormatR> {
▲ Show 20 Lines • Show All 168 Lines • Show Last 20 Lines

lib/Target/RISCV/RISCVInstrInfo.cpp

	Show First 20 Lines • Show All 77 Lines • ▼ Show 20 Lines
	}			}

	void RISCVInstrInfo::movImm32(MachineBasicBlock &MBB,			void RISCVInstrInfo::movImm32(MachineBasicBlock &MBB,
	MachineBasicBlock::iterator MBBI,			MachineBasicBlock::iterator MBBI,
	const DebugLoc &DL, unsigned DstReg, uint64_t Val,			const DebugLoc &DL, unsigned DstReg, uint64_t Val,
	MachineInstr::MIFlag Flag) const {			MachineInstr::MIFlag Flag) const {
	assert(isInt<32>(Val) && "Can only materialize 32-bit constants");			assert(isInt<32>(Val) && "Can only materialize 32-bit constants");

	// TODO: If the value can be materialized using only one instruction, only			BuildMI(MBB, MBBI, DL, get(RISCV::PseudoLI), DstReg)
	// insert a single instruction.			.addImm(Val)

	uint64_t Hi20 = ((Val + 0x800) >> 12) & 0xfffff;
	uint64_t Lo12 = SignExtend64<12>(Val);
	BuildMI(MBB, MBBI, DL, get(RISCV::LUI), DstReg)
	.addImm(Hi20)
	.setMIFlag(Flag);
	BuildMI(MBB, MBBI, DL, get(RISCV::ADDI), DstReg)
	.addReg(DstReg, RegState::Kill)
	.addImm(Lo12)
	.setMIFlag(Flag);			.setMIFlag(Flag);
	}			}

	// The contents of values added to Cond are not examined outside of			// The contents of values added to Cond are not examined outside of
	// RISCVInstrInfo, giving us flexibility in what to push to it. For RISCV, we			// RISCVInstrInfo, giving us flexibility in what to push to it. For RISCV, we
	// push BranchOpcode, Reg1, Reg2.			// push BranchOpcode, Reg1, Reg2.
	static void parseCondBranch(MachineInstr &LastInst, MachineBasicBlock *&Target,			static void parseCondBranch(MachineInstr &LastInst, MachineBasicBlock *&Target,
	SmallVectorImpl<MachineOperand> &Cond) {			SmallVectorImpl<MachineOperand> &Cond) {
	▲ Show 20 Lines • Show All 264 Lines • Show Last 20 Lines

lib/Target/RISCV/RISCVInstrInfo.td

Show All 37 Lines	def RetFlag : SDNode<"RISCVISD::RET_FLAG", SDTNone,
[SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;		[SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;
def SelectCC : SDNode<"RISCVISD::SELECT_CC", SDT_RISCVSelectCC,		def SelectCC : SDNode<"RISCVISD::SELECT_CC", SDT_RISCVSelectCC,
[SDNPInGlue]>;		[SDNPInGlue]>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Operand and SDNode transformation definitions.		// Operand and SDNode transformation definitions.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		class ImmXLenAsmOperand<string prefix, string suffix = ""> : AsmOperandClass {
		let Name = prefix # "ImmXLen" # suffix;
		let RenderMethod = "addImmOperands";
		let DiagnosticType = !strconcat("Invalid", Name);
		}

class ImmAsmOperand<string prefix, int width, string suffix> : AsmOperandClass {		class ImmAsmOperand<string prefix, int width, string suffix> : AsmOperandClass {
let Name = prefix # "Imm" # width # suffix;		let Name = prefix # "Imm" # width # suffix;
let RenderMethod = "addImmOperands";		let RenderMethod = "addImmOperands";
let DiagnosticType = !strconcat("Invalid", Name);		let DiagnosticType = !strconcat("Invalid", Name);
}		}

class SImmAsmOperand<int width, string suffix = "">		class SImmAsmOperand<int width, string suffix = "">
: ImmAsmOperand<"S", width, suffix> {		: ImmAsmOperand<"S", width, suffix> {
▲ Show 20 Lines • Show All 63 Lines • ▼ Show 20 Lines
// A 21-bit signed immediate where the least significant bit is zero.		// A 21-bit signed immediate where the least significant bit is zero.
def simm21_lsb0 : Operand<OtherVT> {		def simm21_lsb0 : Operand<OtherVT> {
let ParserMatchClass = SImmAsmOperand<21, "Lsb0">;		let ParserMatchClass = SImmAsmOperand<21, "Lsb0">;
let EncoderMethod = "getImmOpValueAsr1";		let EncoderMethod = "getImmOpValueAsr1";
let DecoderMethod = "decodeSImmOperandAndLsl1<21>";		let DecoderMethod = "decodeSImmOperandAndLsl1<21>";
}		}

// A parameterized register class alternative to i32imm/i64imm from Target.td.		// A parameterized register class alternative to i32imm/i64imm from Target.td.
def ixlenimm : Operand<XLenVT>;		def ixlenimm : Operand<XLenVT> {
		let ParserMatchClass = ImmXLenAsmOperand<"S">;
		}

// Standalone (codegen-only) immleaf patterns.		// Standalone (codegen-only) immleaf patterns.
def simm32 : ImmLeaf<XLenVT, [{return isInt<32>(Imm);}]>;		def simm32 : ImmLeaf<XLenVT, [{return isInt<32>(Imm);}]>;

// Addressing modes.		// Addressing modes.
// Necessary because a frameindex can't be matched directly in a pattern.		// Necessary because a frameindex can't be matched directly in a pattern.
def AddrFI : ComplexPattern<iPTR, 1, "SelectAddrFI", [frameindex], []>;		def AddrFI : ComplexPattern<iPTR, 1, "SelectAddrFI", [frameindex], []>;

▲ Show 20 Lines • Show All 247 Lines • ▼ Show 20 Lines

// TODO la		// TODO la
// TODO lb lh lw		// TODO lb lh lw
// TODO RV64I: ld		// TODO RV64I: ld
// TODO sb sh sw		// TODO sb sh sw
// TODO RV64I: sd		// TODO RV64I: sd

def : InstAlias<"nop", (ADDI X0, X0, 0)>;		def : InstAlias<"nop", (ADDI X0, X0, 0)>;
// TODO li
		// Note that a size of 8 is currently correct because only 32-bit immediates
		// are emitted as PseudoLI during codegen. Emitting larger constants as
		// PseudoLI is probably not the best idea anyway given that up to
		// 8 32-bit instructions are needed to generate an arbitrary 64-bit immediate.
		efriedma (Not Done): I'm not sure it's a good idea to make code generation use this pseudo-instruction; you'll miss optimization opportunities, like MachineCSE of lui instructions.
		niosHD (Author, Not Done): Indeed, me neither. I also raised this concern in one of our weekly sync-up calls, and the consensus was to go with the Pseudo instruction for now. However, I am definitely not opposed to expanding the respective immediate loads early into machine instructions.
		asb (Not Done): If I recall correctly, @kparzysz reported that, based on his experience, there was probably little to gain.
		let hasSideEffects = 0, mayLoad = 0, mayStore = 0, Size = 8,
		isCodeGenOnly = 0, isAsmParserOnly = 1 in
		def PseudoLI : Pseudo<(outs GPR:$rd), (ins ixlenimm:$imm), [],
		"li", "$rd, $imm">;

def : InstAlias<"mv $rd, $rs", (ADDI GPR:$rd, GPR:$rs, 0)>;		def : InstAlias<"mv $rd, $rs", (ADDI GPR:$rd, GPR:$rs, 0)>;
def : InstAlias<"not $rd, $rs", (XORI GPR:$rd, GPR:$rs, -1)>;		def : InstAlias<"not $rd, $rs", (XORI GPR:$rd, GPR:$rs, -1)>;
def : InstAlias<"neg $rd, $rs", (SUB GPR:$rd, X0, GPR:$rs)>;		def : InstAlias<"neg $rd, $rs", (SUB GPR:$rd, X0, GPR:$rs)>;

let Predicates = [IsRV64] in {		let Predicates = [IsRV64] in {
def : InstAlias<"negw $rd, $rs", (SUBW GPR:$rd, X0, GPR:$rs)>;		def : InstAlias<"negw $rd, $rs", (SUBW GPR:$rd, X0, GPR:$rs)>;
def : InstAlias<"sext.w $rd, $rs", (ADDIW GPR:$rd, GPR:$rs, 0)>;		def : InstAlias<"sext.w $rd, $rs", (ADDIW GPR:$rd, GPR:$rs, 0)>;
} // Predicates = [IsRV64]		} // Predicates = [IsRV64]
[86 lines not shown]

def IsOrAdd: PatFrag<(ops node:$A, node:$B), (or node:$A, node:$B), [{		def IsOrAdd: PatFrag<(ops node:$A, node:$B), (or node:$A, node:$B), [{
return isOrEquivalentToAdd(N);		return isOrEquivalentToAdd(N);
}]>;		}]>;

/// Immediates		/// Immediates

def : Pat<(simm12:$imm), (ADDI X0, simm12:$imm)>;		def : Pat<(simm12:$imm), (ADDI X0, simm12:$imm)>;
// TODO: Add a pattern for immediates with all zeroes in the lower 12 bits.		def : Pat<(simm32:$imm), (PseudoLI imm:$imm)>;
def : Pat<(simm32:$imm), (ADDI (LUI (HI20 imm:$imm)), (LO12Sext imm:$imm))>;
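
For readers following the immediate patterns, here is a minimal C++ sketch of how a 32-bit `li` can be split into at most two 4-byte instructions, which is why Size = 8 is sufficient for PseudoLI as long as only 32-bit immediates reach it. This is an illustration only, not the code from RISCVMCPseudoExpansion.cpp; the helper name `expandLi32` and its string output are assumptions made for this example. The low 12 bits are sign-extended because addi sign-extends its immediate, and the upper 20 bits are adjusted to compensate, which appears to be the same split the removed HI20/LO12Sext pattern expresses.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of the patch): expand `li rd, imm` for a
// 32-bit immediate into textual RISC-V instructions.
std::vector<std::string> expandLi32(const std::string &rd, int32_t imm) {
  // Low 12 bits, sign-extended: addi sign-extends its 12-bit immediate.
  int32_t lo12 = imm & 0xFFF;
  if (lo12 >= 0x800)
    lo12 -= 0x1000;

  // Upper 20 bits, compensated for the sign of lo12. Unsigned arithmetic
  // keeps the subtraction well defined when it wraps.
  uint32_t hi20 =
      (static_cast<uint32_t>(imm) - static_cast<uint32_t>(lo12)) >> 12;

  std::vector<std::string> insts;
  if (hi20 == 0) {
    // imm fits in a signed 12-bit field: one addi from the zero register.
    insts.push_back("addi " + rd + ", zero, " + std::to_string(lo12));
    return insts;
  }
  insts.push_back("lui " + rd + ", " + std::to_string(hi20));
  if (lo12 != 0) // skip the addi when the low 12 bits are all zero
    insts.push_back("addi " + rd + ", " + rd + ", " + std::to_string(lo12));
  return insts;
}
```

Under these assumptions, `expandLi32("a1", 0x55555555)` gives `lui a1, 349525` followed by `addi a1, a1, 1365`, and `expandLi32("a2", 0x00FF0000)` gives a lone `lui a2, 4080`. The second case is where the updated tests drop the redundant `mv` that the old two-instruction pattern produced.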

/// Simple arithmetic operations		/// Simple arithmetic operations

def : PatGprGpr<add, ADD>;		def : PatGprGpr<add, ADD>;
def : PatGprSimm12<add, ADDI>;		def : PatGprSimm12<add, ADDI>;
def : PatGprGpr<sub, SUB>;		def : PatGprGpr<sub, SUB>;
def : PatGprGpr<or, OR>;		def : PatGprGpr<or, OR>;
def : PatGprSimm12<or, ORI>;		def : PatGprSimm12<or, ORI>;
[151 lines not shown]

test/CodeGen/RISCV/bswap-ctlz-cttz-ctpop.ll

	[13 lines not shown]

	define i16 @test_bswap_i16(i16 %a) nounwind {			define i16 @test_bswap_i16(i16 %a) nounwind {
	; RV32I-LABEL: test_bswap_i16:			; RV32I-LABEL: test_bswap_i16:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a1, 4080			; RV32I-NEXT: slli a1, a0, 8
	; RV32I-NEXT: mv a1, a1			; RV32I-NEXT: lui a2, 4080
	; RV32I-NEXT: slli a2, a0, 8			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: and a1, a2, a1
	; RV32I-NEXT: slli a0, a0, 24			; RV32I-NEXT: slli a0, a0, 24
	; RV32I-NEXT: or a0, a0, a1			; RV32I-NEXT: or a0, a0, a1
	; RV32I-NEXT: srli a0, a0, 16			; RV32I-NEXT: srli a0, a0, 16
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%tmp = call i16 @llvm.bswap.i16(i16 %a)			%tmp = call i16 @llvm.bswap.i16(i16 %a)
	ret i16 %tmp			ret i16 %tmp
	}			}

	define i32 @test_bswap_i32(i32 %a) nounwind {			define i32 @test_bswap_i32(i32 %a) nounwind {
	; RV32I-LABEL: test_bswap_i32:			; RV32I-LABEL: test_bswap_i32:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a1, 16			; RV32I-NEXT: srli a1, a0, 8
	; RV32I-NEXT: addi a1, a1, -256			; RV32I-NEXT: lui a2, 16
	; RV32I-NEXT: srli a2, a0, 8			; RV32I-NEXT: addi a2, a2, -256
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: srli a2, a0, 24			; RV32I-NEXT: srli a2, a0, 24
	; RV32I-NEXT: or a1, a1, a2			; RV32I-NEXT: or a1, a1, a2
	; RV32I-NEXT: lui a2, 4080			; RV32I-NEXT: slli a2, a0, 8
	; RV32I-NEXT: mv a2, a2			; RV32I-NEXT: lui a3, 4080
	; RV32I-NEXT: slli a3, a0, 8			; RV32I-NEXT: and a2, a2, a3
	; RV32I-NEXT: and a2, a3, a2
	; RV32I-NEXT: slli a0, a0, 24			; RV32I-NEXT: slli a0, a0, 24
	; RV32I-NEXT: or a0, a0, a2			; RV32I-NEXT: or a0, a0, a2
	; RV32I-NEXT: or a0, a0, a1			; RV32I-NEXT: or a0, a0, a1
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%tmp = call i32 @llvm.bswap.i32(i32 %a)			%tmp = call i32 @llvm.bswap.i32(i32 %a)
	ret i32 %tmp			ret i32 %tmp
	}			}

	define i64 @test_bswap_i64(i64 %a) nounwind {			define i64 @test_bswap_i64(i64 %a) nounwind {
	; RV32I-LABEL: test_bswap_i64:			; RV32I-LABEL: test_bswap_i64:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a2, 16
	; RV32I-NEXT: addi a3, a2, -256
	; RV32I-NEXT: srli a2, a1, 8			; RV32I-NEXT: srli a2, a1, 8
				; RV32I-NEXT: lui a3, 16
				; RV32I-NEXT: addi a3, a3, -256
	; RV32I-NEXT: and a2, a2, a3			; RV32I-NEXT: and a2, a2, a3
	; RV32I-NEXT: srli a4, a1, 24			; RV32I-NEXT: srli a4, a1, 24
	; RV32I-NEXT: or a2, a2, a4			; RV32I-NEXT: or a2, a2, a4
	; RV32I-NEXT: lui a4, 4080			; RV32I-NEXT: slli a4, a1, 8
	; RV32I-NEXT: mv a4, a4			; RV32I-NEXT: lui a5, 4080
	; RV32I-NEXT: slli a5, a1, 8			; RV32I-NEXT: and a4, a4, a5
	; RV32I-NEXT: and a5, a5, a4
	; RV32I-NEXT: slli a1, a1, 24			; RV32I-NEXT: slli a1, a1, 24
	; RV32I-NEXT: or a1, a1, a5			; RV32I-NEXT: or a1, a1, a4
	; RV32I-NEXT: or a2, a1, a2			; RV32I-NEXT: or a2, a1, a2
	; RV32I-NEXT: srli a1, a0, 8			; RV32I-NEXT: srli a1, a0, 8
	; RV32I-NEXT: and a1, a1, a3			; RV32I-NEXT: and a1, a1, a3
	; RV32I-NEXT: srli a3, a0, 24			; RV32I-NEXT: srli a3, a0, 24
	; RV32I-NEXT: or a1, a1, a3			; RV32I-NEXT: or a1, a1, a3
	; RV32I-NEXT: slli a3, a0, 8			; RV32I-NEXT: slli a3, a0, 8
	; RV32I-NEXT: and a3, a3, a4			; RV32I-NEXT: and a3, a3, a5
	; RV32I-NEXT: slli a0, a0, 24			; RV32I-NEXT: slli a0, a0, 24
	; RV32I-NEXT: or a0, a0, a3			; RV32I-NEXT: or a0, a0, a3
	; RV32I-NEXT: or a1, a0, a1			; RV32I-NEXT: or a1, a0, a1
	; RV32I-NEXT: mv a0, a2			; RV32I-NEXT: mv a0, a2
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	[9 lines not shown]
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: andi a1, a0, 255			; RV32I-NEXT: andi a1, a0, 255
	; RV32I-NEXT: beqz a1, .LBB3_2			; RV32I-NEXT: beqz a1, .LBB3_2
	; RV32I-NEXT: # %bb.1: # %cond.false			; RV32I-NEXT: # %bb.1: # %cond.false
	; RV32I-NEXT: addi a1, a0, -1			; RV32I-NEXT: addi a1, a0, -1
	; RV32I-NEXT: not a0, a0			; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: lui a1, 349525			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: addi a1, a1, 1365			; RV32I-NEXT: lui a2, 349525
	; RV32I-NEXT: srli a2, a0, 1			; RV32I-NEXT: addi a2, a2, 1365
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui a1, 209715
	; RV32I-NEXT: addi a1, a1, 819			; RV32I-NEXT: addi a1, a1, 819
	; RV32I-NEXT: and a2, a0, a1			; RV32I-NEXT: and a2, a0, a1
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: add a0, a2, a0			; RV32I-NEXT: add a0, a2, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui a1, 61681
	; RV32I-NEXT: addi a1, a1, -241			; RV32I-NEXT: addi a1, a1, -241
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
				; RV32I-NEXT: lui a1, %hi(__mulsi3)
				; RV32I-NEXT: addi a2, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 4112			; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi a1, a1, 257			; RV32I-NEXT: addi a1, a1, 257
	; RV32I-NEXT: lui a2, %hi(__mulsi3)
	; RV32I-NEXT: addi a2, a2, %lo(__mulsi3)
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: srli a0, a0, 24			; RV32I-NEXT: srli a0, a0, 24
	; RV32I-NEXT: j .LBB3_3			; RV32I-NEXT: j .LBB3_3
	; RV32I-NEXT: .LBB3_2:			; RV32I-NEXT: .LBB3_2:
	; RV32I-NEXT: addi a0, zero, 8			; RV32I-NEXT: addi a0, zero, 8
	; RV32I-NEXT: .LBB3_3: # %cond.end			; RV32I-NEXT: .LBB3_3: # %cond.end
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	[13 lines not shown]
	; RV32I-NEXT: lui a1, 16			; RV32I-NEXT: lui a1, 16
	; RV32I-NEXT: addi a1, a1, -1			; RV32I-NEXT: addi a1, a1, -1
	; RV32I-NEXT: and a1, a0, a1			; RV32I-NEXT: and a1, a0, a1
	; RV32I-NEXT: beqz a1, .LBB4_2			; RV32I-NEXT: beqz a1, .LBB4_2
	; RV32I-NEXT: # %bb.1: # %cond.false			; RV32I-NEXT: # %bb.1: # %cond.false
	; RV32I-NEXT: addi a1, a0, -1			; RV32I-NEXT: addi a1, a0, -1
	; RV32I-NEXT: not a0, a0			; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: lui a1, 349525			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: addi a1, a1, 1365			; RV32I-NEXT: lui a2, 349525
	; RV32I-NEXT: srli a2, a0, 1			; RV32I-NEXT: addi a2, a2, 1365
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui a1, 209715
	; RV32I-NEXT: addi a1, a1, 819			; RV32I-NEXT: addi a1, a1, 819
	; RV32I-NEXT: and a2, a0, a1			; RV32I-NEXT: and a2, a0, a1
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: add a0, a2, a0			; RV32I-NEXT: add a0, a2, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui a1, 61681
	; RV32I-NEXT: addi a1, a1, -241			; RV32I-NEXT: addi a1, a1, -241
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
				; RV32I-NEXT: lui a1, %hi(__mulsi3)
				; RV32I-NEXT: addi a2, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 4112			; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi a1, a1, 257			; RV32I-NEXT: addi a1, a1, 257
	; RV32I-NEXT: lui a2, %hi(__mulsi3)
	; RV32I-NEXT: addi a2, a2, %lo(__mulsi3)
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: srli a0, a0, 24			; RV32I-NEXT: srli a0, a0, 24
	; RV32I-NEXT: j .LBB4_3			; RV32I-NEXT: j .LBB4_3
	; RV32I-NEXT: .LBB4_2:			; RV32I-NEXT: .LBB4_2:
	; RV32I-NEXT: addi a0, zero, 16			; RV32I-NEXT: addi a0, zero, 16
	; RV32I-NEXT: .LBB4_3: # %cond.end			; RV32I-NEXT: .LBB4_3: # %cond.end
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	[10 lines not shown]
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: beqz a0, .LBB5_2			; RV32I-NEXT: beqz a0, .LBB5_2
	; RV32I-NEXT: # %bb.1: # %cond.false			; RV32I-NEXT: # %bb.1: # %cond.false
	; RV32I-NEXT: addi a1, a0, -1			; RV32I-NEXT: addi a1, a0, -1
	; RV32I-NEXT: not a0, a0			; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: lui a1, 349525			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: addi a1, a1, 1365			; RV32I-NEXT: lui a2, 349525
	; RV32I-NEXT: srli a2, a0, 1			; RV32I-NEXT: addi a2, a2, 1365
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui a1, 209715
	; RV32I-NEXT: addi a1, a1, 819			; RV32I-NEXT: addi a1, a1, 819
	; RV32I-NEXT: and a2, a0, a1			; RV32I-NEXT: and a2, a0, a1
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: add a0, a2, a0			; RV32I-NEXT: add a0, a2, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui a1, 61681
	; RV32I-NEXT: addi a1, a1, -241			; RV32I-NEXT: addi a1, a1, -241
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
				; RV32I-NEXT: lui a1, %hi(__mulsi3)
				; RV32I-NEXT: addi a2, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 4112			; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi a1, a1, 257			; RV32I-NEXT: addi a1, a1, 257
	; RV32I-NEXT: lui a2, %hi(__mulsi3)
	; RV32I-NEXT: addi a2, a2, %lo(__mulsi3)
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: srli a0, a0, 24			; RV32I-NEXT: srli a0, a0, 24
	; RV32I-NEXT: j .LBB5_3			; RV32I-NEXT: j .LBB5_3
	; RV32I-NEXT: .LBB5_2:			; RV32I-NEXT: .LBB5_2:
	; RV32I-NEXT: addi a0, zero, 32			; RV32I-NEXT: addi a0, zero, 32
	; RV32I-NEXT: .LBB5_3: # %cond.end			; RV32I-NEXT: .LBB5_3: # %cond.end
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	[17 lines not shown]
	; RV32I-NEXT: srli a1, a0, 2			; RV32I-NEXT: srli a1, a0, 2
	; RV32I-NEXT: or a0, a0, a1			; RV32I-NEXT: or a0, a0, a1
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: or a0, a0, a1			; RV32I-NEXT: or a0, a0, a1
	; RV32I-NEXT: srli a1, a0, 8			; RV32I-NEXT: srli a1, a0, 8
	; RV32I-NEXT: or a0, a0, a1			; RV32I-NEXT: or a0, a0, a1
	; RV32I-NEXT: srli a1, a0, 16			; RV32I-NEXT: srli a1, a0, 16
	; RV32I-NEXT: or a0, a0, a1			; RV32I-NEXT: or a0, a0, a1
				; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: lui a1, 349525			; RV32I-NEXT: lui a1, 349525
	; RV32I-NEXT: addi a1, a1, 1365			; RV32I-NEXT: addi a1, a1, 1365
	; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: srli a2, a0, 1			; RV32I-NEXT: srli a2, a0, 1
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a2, a1
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui a1, 209715
	; RV32I-NEXT: addi a1, a1, 819			; RV32I-NEXT: addi a1, a1, 819
	; RV32I-NEXT: and a2, a0, a1			; RV32I-NEXT: and a2, a0, a1
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: add a0, a2, a0			; RV32I-NEXT: add a0, a2, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui a1, 61681
	; RV32I-NEXT: addi a1, a1, -241			; RV32I-NEXT: addi a1, a1, -241
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
				; RV32I-NEXT: lui a1, %hi(__mulsi3)
				; RV32I-NEXT: addi a2, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 4112			; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi a1, a1, 257			; RV32I-NEXT: addi a1, a1, 257
	; RV32I-NEXT: lui a2, %hi(__mulsi3)
	; RV32I-NEXT: addi a2, a2, %lo(__mulsi3)
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: srli a0, a0, 24			; RV32I-NEXT: srli a0, a0, 24
	; RV32I-NEXT: j .LBB6_3			; RV32I-NEXT: j .LBB6_3
	; RV32I-NEXT: .LBB6_2:			; RV32I-NEXT: .LBB6_2:
	; RV32I-NEXT: addi a0, zero, 32			; RV32I-NEXT: addi a0, zero, 32
	; RV32I-NEXT: .LBB6_3: # %cond.end			; RV32I-NEXT: .LBB6_3: # %cond.end
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	[18 lines not shown]
	; RV32I-NEXT: sw s7, 12(sp)			; RV32I-NEXT: sw s7, 12(sp)
	; RV32I-NEXT: sw s8, 8(sp)			; RV32I-NEXT: sw s8, 8(sp)
	; RV32I-NEXT: addi s0, sp, 48			; RV32I-NEXT: addi s0, sp, 48
	; RV32I-NEXT: mv s2, a1			; RV32I-NEXT: mv s2, a1
	; RV32I-NEXT: mv s3, a0			; RV32I-NEXT: mv s3, a0
	; RV32I-NEXT: addi a0, s3, -1			; RV32I-NEXT: addi a0, s3, -1
	; RV32I-NEXT: not a1, s3			; RV32I-NEXT: not a1, s3
	; RV32I-NEXT: and a0, a1, a0			; RV32I-NEXT: and a0, a1, a0
	; RV32I-NEXT: lui a1, 349525
	; RV32I-NEXT: addi s5, a1, 1365
	; RV32I-NEXT: srli a1, a0, 1			; RV32I-NEXT: srli a1, a0, 1
				; RV32I-NEXT: lui s5, 349525
				; RV32I-NEXT: addi s5, s5, 1365
	; RV32I-NEXT: and a1, a1, s5			; RV32I-NEXT: and a1, a1, s5
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui s6, 209715
	; RV32I-NEXT: addi s6, a1, 819			; RV32I-NEXT: addi s6, s6, 819
	; RV32I-NEXT: and a1, a0, s6			; RV32I-NEXT: and a1, a0, s6
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, s6			; RV32I-NEXT: and a0, a0, s6
	; RV32I-NEXT: add a0, a1, a0			; RV32I-NEXT: add a0, a1, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi s4, a1, 257
	; RV32I-NEXT: lui a1, %hi(__mulsi3)			; RV32I-NEXT: lui a1, %hi(__mulsi3)
	; RV32I-NEXT: addi s7, a1, %lo(__mulsi3)			; RV32I-NEXT: addi s7, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui s8, 61681
	; RV32I-NEXT: addi s8, a1, -241			; RV32I-NEXT: addi s8, s8, -241
	; RV32I-NEXT: and a0, a0, s8			; RV32I-NEXT: and a0, a0, s8
				; RV32I-NEXT: lui s4, 4112
				; RV32I-NEXT: addi s4, s4, 257
	; RV32I-NEXT: mv a1, s4			; RV32I-NEXT: mv a1, s4
	; RV32I-NEXT: jalr s7			; RV32I-NEXT: jalr s7
	; RV32I-NEXT: mv s1, a0			; RV32I-NEXT: mv s1, a0
	; RV32I-NEXT: addi a0, s2, -1			; RV32I-NEXT: addi a0, s2, -1
	; RV32I-NEXT: not a1, s2			; RV32I-NEXT: not a1, s2
	; RV32I-NEXT: and a0, a1, a0			; RV32I-NEXT: and a0, a1, a0
	; RV32I-NEXT: srli a1, a0, 1			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: and a1, a1, s5			; RV32I-NEXT: and a1, a1, s5
	[37 lines not shown]
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: addi a1, a0, -1			; RV32I-NEXT: addi a1, a0, -1
	; RV32I-NEXT: not a0, a0			; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: lui a1, 349525			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: addi a1, a1, 1365			; RV32I-NEXT: lui a2, 349525
	; RV32I-NEXT: srli a2, a0, 1			; RV32I-NEXT: addi a2, a2, 1365
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui a1, 209715
	; RV32I-NEXT: addi a1, a1, 819			; RV32I-NEXT: addi a1, a1, 819
	; RV32I-NEXT: and a2, a0, a1			; RV32I-NEXT: and a2, a0, a1
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: add a0, a2, a0			; RV32I-NEXT: add a0, a2, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui a1, 61681
	; RV32I-NEXT: addi a1, a1, -241			; RV32I-NEXT: addi a1, a1, -241
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
				; RV32I-NEXT: lui a1, %hi(__mulsi3)
				; RV32I-NEXT: addi a2, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 4112			; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi a1, a1, 257			; RV32I-NEXT: addi a1, a1, 257
	; RV32I-NEXT: lui a2, %hi(__mulsi3)
	; RV32I-NEXT: addi a2, a2, %lo(__mulsi3)
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: srli a0, a0, 24			; RV32I-NEXT: srli a0, a0, 24
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%tmp = call i8 @llvm.cttz.i8(i8 %a, i1 true)			%tmp = call i8 @llvm.cttz.i8(i8 %a, i1 true)
	ret i8 %tmp			ret i8 %tmp
	}			}

	define i16 @test_cttz_i16_zero_undef(i16 %a) nounwind {			define i16 @test_cttz_i16_zero_undef(i16 %a) nounwind {
	; RV32I-LABEL: test_cttz_i16_zero_undef:			; RV32I-LABEL: test_cttz_i16_zero_undef:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: addi a1, a0, -1			; RV32I-NEXT: addi a1, a0, -1
	; RV32I-NEXT: not a0, a0			; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: lui a1, 349525			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: addi a1, a1, 1365			; RV32I-NEXT: lui a2, 349525
	; RV32I-NEXT: srli a2, a0, 1			; RV32I-NEXT: addi a2, a2, 1365
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui a1, 209715
	; RV32I-NEXT: addi a1, a1, 819			; RV32I-NEXT: addi a1, a1, 819
	; RV32I-NEXT: and a2, a0, a1			; RV32I-NEXT: and a2, a0, a1
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: add a0, a2, a0			; RV32I-NEXT: add a0, a2, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui a1, 61681
	; RV32I-NEXT: addi a1, a1, -241			; RV32I-NEXT: addi a1, a1, -241
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
				; RV32I-NEXT: lui a1, %hi(__mulsi3)
				; RV32I-NEXT: addi a2, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 4112			; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi a1, a1, 257			; RV32I-NEXT: addi a1, a1, 257
	; RV32I-NEXT: lui a2, %hi(__mulsi3)
	; RV32I-NEXT: addi a2, a2, %lo(__mulsi3)
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: srli a0, a0, 24			; RV32I-NEXT: srli a0, a0, 24
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%tmp = call i16 @llvm.cttz.i16(i16 %a, i1 true)			%tmp = call i16 @llvm.cttz.i16(i16 %a, i1 true)
	ret i16 %tmp			ret i16 %tmp
	}			}

	define i32 @test_cttz_i32_zero_undef(i32 %a) nounwind {			define i32 @test_cttz_i32_zero_undef(i32 %a) nounwind {
	; RV32I-LABEL: test_cttz_i32_zero_undef:			; RV32I-LABEL: test_cttz_i32_zero_undef:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: addi a1, a0, -1			; RV32I-NEXT: addi a1, a0, -1
	; RV32I-NEXT: not a0, a0			; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: lui a1, 349525			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: addi a1, a1, 1365			; RV32I-NEXT: lui a2, 349525
	; RV32I-NEXT: srli a2, a0, 1			; RV32I-NEXT: addi a2, a2, 1365
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui a1, 209715
	; RV32I-NEXT: addi a1, a1, 819			; RV32I-NEXT: addi a1, a1, 819
	; RV32I-NEXT: and a2, a0, a1			; RV32I-NEXT: and a2, a0, a1
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: add a0, a2, a0			; RV32I-NEXT: add a0, a2, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui a1, 61681
	; RV32I-NEXT: addi a1, a1, -241			; RV32I-NEXT: addi a1, a1, -241
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
				; RV32I-NEXT: lui a1, %hi(__mulsi3)
				; RV32I-NEXT: addi a2, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 4112			; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi a1, a1, 257			; RV32I-NEXT: addi a1, a1, 257
	; RV32I-NEXT: lui a2, %hi(__mulsi3)
	; RV32I-NEXT: addi a2, a2, %lo(__mulsi3)
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: srli a0, a0, 24			; RV32I-NEXT: srli a0, a0, 24
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%tmp = call i32 @llvm.cttz.i32(i32 %a, i1 true)			%tmp = call i32 @llvm.cttz.i32(i32 %a, i1 true)
	ret i32 %tmp			ret i32 %tmp
	[14 lines not shown]
	; RV32I-NEXT: sw s7, 12(sp)			; RV32I-NEXT: sw s7, 12(sp)
	; RV32I-NEXT: sw s8, 8(sp)			; RV32I-NEXT: sw s8, 8(sp)
	; RV32I-NEXT: addi s0, sp, 48			; RV32I-NEXT: addi s0, sp, 48
	; RV32I-NEXT: mv s2, a1			; RV32I-NEXT: mv s2, a1
	; RV32I-NEXT: mv s3, a0			; RV32I-NEXT: mv s3, a0
	; RV32I-NEXT: addi a0, s3, -1			; RV32I-NEXT: addi a0, s3, -1
	; RV32I-NEXT: not a1, s3			; RV32I-NEXT: not a1, s3
	; RV32I-NEXT: and a0, a1, a0			; RV32I-NEXT: and a0, a1, a0
	; RV32I-NEXT: lui a1, 349525
	; RV32I-NEXT: addi s5, a1, 1365
	; RV32I-NEXT: srli a1, a0, 1			; RV32I-NEXT: srli a1, a0, 1
				; RV32I-NEXT: lui s5, 349525
				; RV32I-NEXT: addi s5, s5, 1365
	; RV32I-NEXT: and a1, a1, s5			; RV32I-NEXT: and a1, a1, s5
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui s6, 209715
	; RV32I-NEXT: addi s6, a1, 819			; RV32I-NEXT: addi s6, s6, 819
	; RV32I-NEXT: and a1, a0, s6			; RV32I-NEXT: and a1, a0, s6
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, s6			; RV32I-NEXT: and a0, a0, s6
	; RV32I-NEXT: add a0, a1, a0			; RV32I-NEXT: add a0, a1, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi s4, a1, 257
	; RV32I-NEXT: lui a1, %hi(__mulsi3)			; RV32I-NEXT: lui a1, %hi(__mulsi3)
	; RV32I-NEXT: addi s7, a1, %lo(__mulsi3)			; RV32I-NEXT: addi s7, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui s8, 61681
	; RV32I-NEXT: addi s8, a1, -241			; RV32I-NEXT: addi s8, s8, -241
	; RV32I-NEXT: and a0, a0, s8			; RV32I-NEXT: and a0, a0, s8
				; RV32I-NEXT: lui s4, 4112
				; RV32I-NEXT: addi s4, s4, 257
	; RV32I-NEXT: mv a1, s4			; RV32I-NEXT: mv a1, s4
	; RV32I-NEXT: jalr s7			; RV32I-NEXT: jalr s7
	; RV32I-NEXT: mv s1, a0			; RV32I-NEXT: mv s1, a0
	; RV32I-NEXT: addi a0, s2, -1			; RV32I-NEXT: addi a0, s2, -1
	; RV32I-NEXT: not a1, s2			; RV32I-NEXT: not a1, s2
	; RV32I-NEXT: and a0, a1, a0			; RV32I-NEXT: and a0, a1, a0
	; RV32I-NEXT: srli a1, a0, 1			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: and a1, a1, s5			; RV32I-NEXT: and a1, a1, s5
	[34 lines not shown]

	define i32 @test_ctpop_i32(i32 %a) nounwind {			define i32 @test_ctpop_i32(i32 %a) nounwind {
	; RV32I-LABEL: test_ctpop_i32:			; RV32I-LABEL: test_ctpop_i32:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a1, 349525			; RV32I-NEXT: srli a1, a0, 1
	; RV32I-NEXT: addi a1, a1, 1365			; RV32I-NEXT: lui a2, 349525
	; RV32I-NEXT: srli a2, a0, 1			; RV32I-NEXT: addi a2, a2, 1365
	; RV32I-NEXT: and a1, a2, a1			; RV32I-NEXT: and a1, a1, a2
	; RV32I-NEXT: sub a0, a0, a1			; RV32I-NEXT: sub a0, a0, a1
	; RV32I-NEXT: lui a1, 209715			; RV32I-NEXT: lui a1, 209715
	; RV32I-NEXT: addi a1, a1, 819			; RV32I-NEXT: addi a1, a1, 819
	; RV32I-NEXT: and a2, a0, a1			; RV32I-NEXT: and a2, a0, a1
	; RV32I-NEXT: srli a0, a0, 2			; RV32I-NEXT: srli a0, a0, 2
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
	; RV32I-NEXT: add a0, a2, a0			; RV32I-NEXT: add a0, a2, a0
	; RV32I-NEXT: srli a1, a0, 4			; RV32I-NEXT: srli a1, a0, 4
	; RV32I-NEXT: add a0, a0, a1			; RV32I-NEXT: add a0, a0, a1
	; RV32I-NEXT: lui a1, 61681			; RV32I-NEXT: lui a1, 61681
	; RV32I-NEXT: addi a1, a1, -241			; RV32I-NEXT: addi a1, a1, -241
	; RV32I-NEXT: and a0, a0, a1			; RV32I-NEXT: and a0, a0, a1
				; RV32I-NEXT: lui a1, %hi(__mulsi3)
				; RV32I-NEXT: addi a2, a1, %lo(__mulsi3)
	; RV32I-NEXT: lui a1, 4112			; RV32I-NEXT: lui a1, 4112
	; RV32I-NEXT: addi a1, a1, 257			; RV32I-NEXT: addi a1, a1, 257
	; RV32I-NEXT: lui a2, %hi(__mulsi3)
	; RV32I-NEXT: addi a2, a2, %lo(__mulsi3)
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: srli a0, a0, 24			; RV32I-NEXT: srli a0, a0, 24
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%1 = call i32 @llvm.ctpop.i32(i32 %a)			%1 = call i32 @llvm.ctpop.i32(i32 %a)
	ret i32 %1			ret i32 %1
	}			}
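
The lui/addi pairs in the CHECK lines above are easier to review once folded back into the constants they build. The sketch below assumes plain RV32 semantics (lui places its operand in bits 31..12, addi adds a sign-extended 12-bit value); `materialize` is a hypothetical name used only for this example and does not appear in the patch.

```cpp
#include <cstdint>

// Hypothetical helper: the 32-bit value left in rd after
//   lui  rd, hi20
//   addi rd, rd, lo12
// Unsigned arithmetic models the wrap-around of a 32-bit register.
constexpr uint32_t materialize(uint32_t hi20, int32_t lo12) {
  return (hi20 << 12) + static_cast<uint32_t>(lo12);
}

// Bit masks used by the ctpop/cttz expansions in the tests.
static_assert(materialize(349525, 1365) == 0x55555555u, "pairs-of-bits mask");
static_assert(materialize(209715, 819) == 0x33333333u, "nibble-halves mask");
static_assert(materialize(61681, -241) == 0x0F0F0F0Fu, "low-nibble mask");
static_assert(materialize(4112, 257) == 0x01010101u, "byte-sum multiplier");

// Byte masks used by the bswap expansions; the second needs no addi at all.
static_assert(materialize(16, -256) == 0x0000FF00u, "second byte");
static_assert(materialize(4080, 0) == 0x00FF0000u, "third byte");
```

The negative addi operands (-241, -256) are the reason the lui operand is one higher than the upper 20 bits of the target constant: the sign-extended low part subtracts from the value lui produced.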

test/CodeGen/RISCV/calling-conv.ll

	[50 lines not shown]

	define i32 @caller_scalars() nounwind {			define i32 @caller_scalars() nounwind {
	; RV32I-LABEL: caller_scalars:			; RV32I-LABEL: caller_scalars:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a0, 262464
	; RV32I-NEXT: mv a6, a0
	; RV32I-NEXT: lui a0, %hi(callee_scalars)			; RV32I-NEXT: lui a0, %hi(callee_scalars)
	; RV32I-NEXT: addi a7, a0, %lo(callee_scalars)			; RV32I-NEXT: addi a7, a0, %lo(callee_scalars)
	; RV32I-NEXT: addi a0, zero, 1			; RV32I-NEXT: addi a0, zero, 1
	; RV32I-NEXT: addi a1, zero, 2			; RV32I-NEXT: addi a1, zero, 2
	; RV32I-NEXT: addi a3, zero, 3			; RV32I-NEXT: addi a3, zero, 3
	; RV32I-NEXT: addi a4, zero, 4			; RV32I-NEXT: addi a4, zero, 4
				; RV32I-NEXT: lui a6, 262464
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: mv a5, zero			; RV32I-NEXT: mv a5, zero
	; RV32I-NEXT: jalr a7			; RV32I-NEXT: jalr a7
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%1 = call i32 @callee_scalars(i32 1, i64 2, i32 3, i32 4, double 5.000000e+00)			%1 = call i32 @callee_scalars(i32 1, i64 2, i32 3, i32 4, double 5.000000e+00)
	[38 lines not shown]

	define i32 @caller_large_scalars() nounwind {			define i32 @caller_large_scalars() nounwind {
	; RV32I-LABEL: caller_large_scalars:			; RV32I-LABEL: caller_large_scalars:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -48			; RV32I-NEXT: addi sp, sp, -48
	; RV32I-NEXT: sw ra, 44(sp)			; RV32I-NEXT: sw ra, 44(sp)
	; RV32I-NEXT: sw s0, 40(sp)			; RV32I-NEXT: sw s0, 40(sp)
	; RV32I-NEXT: addi s0, sp, 48			; RV32I-NEXT: addi s0, sp, 48
				; RV32I-NEXT: lui a0, 524272
				; RV32I-NEXT: sw a0, -36(s0)
	; RV32I-NEXT: sw zero, -40(s0)			; RV32I-NEXT: sw zero, -40(s0)
	; RV32I-NEXT: sw zero, -44(s0)			; RV32I-NEXT: sw zero, -44(s0)
	; RV32I-NEXT: sw zero, -48(s0)			; RV32I-NEXT: sw zero, -48(s0)
	; RV32I-NEXT: sw zero, -12(s0)			; RV32I-NEXT: sw zero, -12(s0)
	; RV32I-NEXT: sw zero, -16(s0)			; RV32I-NEXT: sw zero, -16(s0)
	; RV32I-NEXT: sw zero, -20(s0)			; RV32I-NEXT: sw zero, -20(s0)
	; RV32I-NEXT: addi a0, zero, 1			; RV32I-NEXT: addi a0, zero, 1
	; RV32I-NEXT: sw a0, -24(s0)			; RV32I-NEXT: sw a0, -24(s0)
	; RV32I-NEXT: lui a0, 524272
	; RV32I-NEXT: mv a0, a0
	; RV32I-NEXT: sw a0, -36(s0)
	; RV32I-NEXT: lui a0, %hi(callee_large_scalars)			; RV32I-NEXT: lui a0, %hi(callee_large_scalars)
	; RV32I-NEXT: addi a2, a0, %lo(callee_large_scalars)			; RV32I-NEXT: addi a2, a0, %lo(callee_large_scalars)
	; RV32I-NEXT: addi a0, s0, -24			; RV32I-NEXT: addi a0, s0, -24
	; RV32I-NEXT: addi a1, s0, -48			; RV32I-NEXT: addi a1, s0, -48
	; RV32I-NEXT: jalr a2			; RV32I-NEXT: jalr a2
	; RV32I-NEXT: lw s0, 40(sp)			; RV32I-NEXT: lw s0, 40(sp)
	; RV32I-NEXT: lw ra, 44(sp)			; RV32I-NEXT: lw ra, 44(sp)
	; RV32I-NEXT: addi sp, sp, 48			; RV32I-NEXT: addi sp, sp, 48
	▲ Show 20 Lines • Show All 46 Lines • ▼ Show 20 Lines
	; RV32I-NEXT: addi sp, sp, -64			; RV32I-NEXT: addi sp, sp, -64
	; RV32I-NEXT: sw ra, 60(sp)			; RV32I-NEXT: sw ra, 60(sp)
	; RV32I-NEXT: sw s0, 56(sp)			; RV32I-NEXT: sw s0, 56(sp)
	; RV32I-NEXT: addi s0, sp, 64			; RV32I-NEXT: addi s0, sp, 64
	; RV32I-NEXT: addi a0, s0, -48			; RV32I-NEXT: addi a0, s0, -48
	; RV32I-NEXT: sw a0, 4(sp)			; RV32I-NEXT: sw a0, 4(sp)
	; RV32I-NEXT: addi a0, zero, 9			; RV32I-NEXT: addi a0, zero, 9
	; RV32I-NEXT: sw a0, 0(sp)			; RV32I-NEXT: sw a0, 0(sp)
				; RV32I-NEXT: lui a0, 524272
				; RV32I-NEXT: sw a0, -36(s0)
	; RV32I-NEXT: sw zero, -40(s0)			; RV32I-NEXT: sw zero, -40(s0)
	; RV32I-NEXT: sw zero, -44(s0)			; RV32I-NEXT: sw zero, -44(s0)
	; RV32I-NEXT: sw zero, -48(s0)			; RV32I-NEXT: sw zero, -48(s0)
	; RV32I-NEXT: sw zero, -12(s0)			; RV32I-NEXT: sw zero, -12(s0)
	; RV32I-NEXT: sw zero, -16(s0)			; RV32I-NEXT: sw zero, -16(s0)
	; RV32I-NEXT: sw zero, -20(s0)			; RV32I-NEXT: sw zero, -20(s0)
	; RV32I-NEXT: addi a0, zero, 8			; RV32I-NEXT: addi a0, zero, 8
	; RV32I-NEXT: sw a0, -24(s0)			; RV32I-NEXT: sw a0, -24(s0)
	; RV32I-NEXT: lui a0, 524272
	; RV32I-NEXT: mv a0, a0
	; RV32I-NEXT: sw a0, -36(s0)
	; RV32I-NEXT: lui a0, %hi(callee_large_scalars_exhausted_regs)			; RV32I-NEXT: lui a0, %hi(callee_large_scalars_exhausted_regs)
	; RV32I-NEXT: addi t0, a0, %lo(callee_large_scalars_exhausted_regs)			; RV32I-NEXT: addi t0, a0, %lo(callee_large_scalars_exhausted_regs)
	; RV32I-NEXT: addi a0, zero, 1			; RV32I-NEXT: addi a0, zero, 1
	; RV32I-NEXT: addi a1, zero, 2			; RV32I-NEXT: addi a1, zero, 2
	; RV32I-NEXT: addi a2, zero, 3			; RV32I-NEXT: addi a2, zero, 3
	; RV32I-NEXT: addi a3, zero, 4			; RV32I-NEXT: addi a3, zero, 4
	; RV32I-NEXT: addi a4, zero, 5			; RV32I-NEXT: addi a4, zero, 5
	; RV32I-NEXT: addi a5, zero, 6			; RV32I-NEXT: addi a5, zero, 6
	▲ Show 20 Lines • Show All 287 Lines • ▼ Show 20 Lines
	; RV32I-NEXT: addi a0, a0, -1967			; RV32I-NEXT: addi a0, a0, -1967
	; RV32I-NEXT: sw a0, -24(s0)			; RV32I-NEXT: sw a0, -24(s0)
	; RV32I-NEXT: lui a0, 964690			; RV32I-NEXT: lui a0, 964690
	; RV32I-NEXT: addi a0, a0, -328			; RV32I-NEXT: addi a0, a0, -328
	; RV32I-NEXT: sw a0, -28(s0)			; RV32I-NEXT: sw a0, -28(s0)
	; RV32I-NEXT: lui a0, 335544			; RV32I-NEXT: lui a0, 335544
	; RV32I-NEXT: addi a0, a0, 1311			; RV32I-NEXT: addi a0, a0, 1311
	; RV32I-NEXT: sw a0, -32(s0)			; RV32I-NEXT: sw a0, -32(s0)
	; RV32I-NEXT: lui a0, 688509
	; RV32I-NEXT: addi a5, a0, -2048
	; RV32I-NEXT: lui a0, %hi(callee_aligned_stack)			; RV32I-NEXT: lui a0, %hi(callee_aligned_stack)
	; RV32I-NEXT: addi t0, a0, %lo(callee_aligned_stack)			; RV32I-NEXT: addi t0, a0, %lo(callee_aligned_stack)
	; RV32I-NEXT: addi a0, zero, 1			; RV32I-NEXT: addi a0, zero, 1
	; RV32I-NEXT: addi a1, zero, 11			; RV32I-NEXT: addi a1, zero, 11
	; RV32I-NEXT: addi a2, s0, -32			; RV32I-NEXT: addi a2, s0, -32
	; RV32I-NEXT: addi a3, zero, 12			; RV32I-NEXT: addi a3, zero, 12
	; RV32I-NEXT: addi a4, zero, 13			; RV32I-NEXT: addi a4, zero, 13
				; RV32I-NEXT: lui a5, 688509
				; RV32I-NEXT: addi a5, a5, -2048
	; RV32I-NEXT: addi a6, zero, 4			; RV32I-NEXT: addi a6, zero, 4
	; RV32I-NEXT: addi a7, zero, 14			; RV32I-NEXT: addi a7, zero, 14
	; RV32I-NEXT: jalr t0			; RV32I-NEXT: jalr t0
	; RV32I-NEXT: lw s0, 56(sp)			; RV32I-NEXT: lw s0, 56(sp)
	; RV32I-NEXT: lw ra, 60(sp)			; RV32I-NEXT: lw ra, 60(sp)
	; RV32I-NEXT: addi sp, sp, 64			; RV32I-NEXT: addi sp, sp, 64
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%1 = call i32 @callee_aligned_stack(i32 1, i32 11,			%1 = call i32 @callee_aligned_stack(i32 1, i32 11,
	▲ Show 20 Lines • Show All 97 Lines • ▼ Show 20 Lines
	define fp128 @callee_large_scalar_ret() nounwind {			define fp128 @callee_large_scalar_ret() nounwind {
	; RV32I-LABEL: callee_large_scalar_ret:			; RV32I-LABEL: callee_large_scalar_ret:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a1, 524272			; RV32I-NEXT: lui a1, 524272
	; RV32I-NEXT: mv a1, a1
	; RV32I-NEXT: sw a1, 12(a0)			; RV32I-NEXT: sw a1, 12(a0)
	; RV32I-NEXT: sw zero, 8(a0)			; RV32I-NEXT: sw zero, 8(a0)
	; RV32I-NEXT: sw zero, 4(a0)			; RV32I-NEXT: sw zero, 4(a0)
	; RV32I-NEXT: sw zero, 0(a0)			; RV32I-NEXT: sw zero, 0(a0)
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	▲ Show 20 Lines • Show All 81 Lines • Show Last 20 Lines

test/CodeGen/RISCV/mem.ll

	Show First 20 Lines • Show All 264 Lines • ▼ Show 20 Lines
	define i32 @lw_sw_constant(i32 %a) nounwind {			define i32 @lw_sw_constant(i32 %a) nounwind {
	; TODO: the addi should be folded in to the lw/sw			; TODO: the addi should be folded in to the lw/sw
	; RV32I-LABEL: lw_sw_constant:			; RV32I-LABEL: lw_sw_constant:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a1, 912092			; RV32I-NEXT: lui a2, 912092
	; RV32I-NEXT: addi a2, a1, -273			; RV32I-NEXT: addi a2, a2, -273
	; RV32I-NEXT: lw a1, 0(a2)			; RV32I-NEXT: lw a1, 0(a2)
	; RV32I-NEXT: sw a0, 0(a2)			; RV32I-NEXT: sw a0, 0(a2)
	; RV32I-NEXT: mv a0, a1			; RV32I-NEXT: mv a0, a1
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%1 = inttoptr i32 3735928559 to i32*			%1 = inttoptr i32 3735928559 to i32*
	%2 = load volatile i32, i32* %1			%2 = load volatile i32, i32* %1
	store i32 %a, i32* %1			store i32 %a, i32* %1
	ret i32 %2			ret i32 %2
	}			}

test/CodeGen/RISCV/vararg.ll

	Show First 20 Lines • Show All 118 Lines • ▼ Show 20 Lines

	define void @va1_caller() nounwind {			define void @va1_caller() nounwind {
	; RV32I-LABEL: va1_caller:			; RV32I-LABEL: va1_caller:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a0, 261888
	; RV32I-NEXT: mv a3, a0
	; RV32I-NEXT: lui a0, %hi(va1)			; RV32I-NEXT: lui a0, %hi(va1)
	; RV32I-NEXT: addi a0, a0, %lo(va1)			; RV32I-NEXT: addi a0, a0, %lo(va1)
				; RV32I-NEXT: lui a3, 261888
	; RV32I-NEXT: addi a4, zero, 2			; RV32I-NEXT: addi a4, zero, 2
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: jalr a0			; RV32I-NEXT: jalr a0
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	; Pass a double, as a float would be promoted by a C/C++ frontend			; Pass a double, as a float would be promoted by a C/C++ frontend
	▲ Show 20 Lines • Show All 81 Lines • ▼ Show 20 Lines

	define void @va2_caller() nounwind {			define void @va2_caller() nounwind {
	; RV32I-LABEL: va2_caller:			; RV32I-LABEL: va2_caller:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a0, 261888
	; RV32I-NEXT: mv a3, a0
	; RV32I-NEXT: lui a0, %hi(va2)			; RV32I-NEXT: lui a0, %hi(va2)
	; RV32I-NEXT: addi a0, a0, %lo(va2)			; RV32I-NEXT: addi a0, a0, %lo(va2)
				; RV32I-NEXT: lui a3, 261888
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: jalr a0			; RV32I-NEXT: jalr a0
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%1 = call double (i8, ...) @va2(i8 undef, double 1.000000e+00)			%1 = call double (i8, ...) @va2(i8 undef, double 1.000000e+00)
	ret void			ret void
	▲ Show 20 Lines • Show All 89 Lines • ▼ Show 20 Lines

	define void @va3_caller() nounwind {			define void @va3_caller() nounwind {
	; RV32I-LABEL: va3_caller:			; RV32I-LABEL: va3_caller:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: sw s0, 8(sp)			; RV32I-NEXT: sw s0, 8(sp)
	; RV32I-NEXT: addi s0, sp, 16			; RV32I-NEXT: addi s0, sp, 16
	; RV32I-NEXT: lui a0, 261888
	; RV32I-NEXT: mv a2, a0
	; RV32I-NEXT: lui a0, 262144
	; RV32I-NEXT: mv a5, a0
	; RV32I-NEXT: lui a0, %hi(va3)			; RV32I-NEXT: lui a0, %hi(va3)
	; RV32I-NEXT: addi a3, a0, %lo(va3)			; RV32I-NEXT: addi a3, a0, %lo(va3)
	; RV32I-NEXT: addi a0, zero, 2			; RV32I-NEXT: addi a0, zero, 2
				; RV32I-NEXT: lui a2, 261888
				; RV32I-NEXT: lui a5, 262144
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: mv a4, zero			; RV32I-NEXT: mv a4, zero
	; RV32I-NEXT: jalr a3			; RV32I-NEXT: jalr a3
	; RV32I-NEXT: lw s0, 8(sp)			; RV32I-NEXT: lw s0, 8(sp)
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%1 = call double (i32, double, ...) @va3(i32 2, double 1.000000e+00, double 2.000000e+00)			%1 = call double (i32, double, ...) @va3(i32 2, double 1.000000e+00, double 2.000000e+00)
	▲ Show 20 Lines • Show All 122 Lines • ▼ Show 20 Lines
	; RV32I-NEXT: addi a0, a0, -1967			; RV32I-NEXT: addi a0, a0, -1967
	; RV32I-NEXT: sw a0, -24(s0)			; RV32I-NEXT: sw a0, -24(s0)
	; RV32I-NEXT: lui a0, 964690			; RV32I-NEXT: lui a0, 964690
	; RV32I-NEXT: addi a0, a0, -328			; RV32I-NEXT: addi a0, a0, -328
	; RV32I-NEXT: sw a0, -28(s0)			; RV32I-NEXT: sw a0, -28(s0)
	; RV32I-NEXT: lui a0, 335544			; RV32I-NEXT: lui a0, 335544
	; RV32I-NEXT: addi a0, a0, 1311			; RV32I-NEXT: addi a0, a0, 1311
	; RV32I-NEXT: sw a0, -32(s0)			; RV32I-NEXT: sw a0, -32(s0)
	; RV32I-NEXT: lui a0, 688509
	; RV32I-NEXT: addi a6, a0, -2048
	; RV32I-NEXT: lui a0, %hi(va5_aligned_stack_callee)			; RV32I-NEXT: lui a0, %hi(va5_aligned_stack_callee)
	; RV32I-NEXT: addi a5, a0, %lo(va5_aligned_stack_callee)			; RV32I-NEXT: addi a5, a0, %lo(va5_aligned_stack_callee)
	; RV32I-NEXT: addi a0, zero, 1			; RV32I-NEXT: addi a0, zero, 1
	; RV32I-NEXT: addi a1, zero, 11			; RV32I-NEXT: addi a1, zero, 11
	; RV32I-NEXT: addi a2, s0, -32			; RV32I-NEXT: addi a2, s0, -32
	; RV32I-NEXT: addi a3, zero, 12			; RV32I-NEXT: addi a3, zero, 12
	; RV32I-NEXT: addi a4, zero, 13			; RV32I-NEXT: addi a4, zero, 13
				; RV32I-NEXT: lui a6, 688509
				; RV32I-NEXT: addi a6, a6, -2048
	; RV32I-NEXT: addi a7, zero, 4			; RV32I-NEXT: addi a7, zero, 4
	; RV32I-NEXT: jalr a5			; RV32I-NEXT: jalr a5
	; RV32I-NEXT: lw s0, 56(sp)			; RV32I-NEXT: lw s0, 56(sp)
	; RV32I-NEXT: lw ra, 60(sp)			; RV32I-NEXT: lw ra, 60(sp)
	; RV32I-NEXT: addi sp, sp, 64			; RV32I-NEXT: addi sp, sp, 64
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	%1 = call i32 (i32, ...) @va5_aligned_stack_callee(i32 1, i32 11,			%1 = call i32 (i32, ...) @va5_aligned_stack_callee(i32 1, i32 11,
	fp128 0xLEB851EB851EB851F400091EB851EB851, i32 12, i32 13, i64 20000000000,			fp128 0xLEB851EB851EB851F400091EB851EB851, i32 12, i32 13, i64 20000000000,
	Show All 37 Lines

test/MC/RISCV/rv32i-aliases-invalid.s

	# RUN: not llvm-mc %s -triple=riscv32 -riscv-no-aliases 2>&1 | FileCheck %s			# RUN: not llvm-mc %s -triple=riscv32 -riscv-no-aliases 2>&1 | FileCheck %s
	# RUN: not llvm-mc %s -triple=riscv32 2>&1 | FileCheck %s			# RUN: not llvm-mc %s -triple=riscv32 2>&1 | FileCheck %s

	# TODO ld			# TODO ld
	# TODO sd			# TODO sd

				li x0, 0x100000000 # CHECK: :[[@LINE]]:8: error: immediate must be an integer in the range [-2147483648, 2147483647]

	negw x1, x2 # CHECK: :[[@LINE]]:1: error: instruction use requires an option to be enabled			negw x1, x2 # CHECK: :[[@LINE]]:1: error: instruction use requires an option to be enabled
	sext.w x3, x4 # CHECK: :[[@LINE]]:1: error: instruction use requires an option to be enabled			sext.w x3, x4 # CHECK: :[[@LINE]]:1: error: instruction use requires an option to be enabled

test/MC/RISCV/rv32i-aliases-valid.s

	# RUN: llvm-mc %s -triple=riscv32 -riscv-no-aliases \			# RUN: llvm-mc %s -triple=riscv32 -riscv-no-aliases \
	# RUN: | FileCheck -check-prefixes=CHECK-INST %s			# RUN: | FileCheck -check-prefixes=CHECK-EXPAND,CHECK-INST %s
	# RUN: llvm-mc %s -triple=riscv32 \			# RUN: llvm-mc %s -triple=riscv32 \
	# RUN: | FileCheck -check-prefixes=CHECK-ALIAS %s			# RUN: | FileCheck -check-prefixes=CHECK-EXPAND,CHECK-ALIAS %s
	# RUN: llvm-mc -filetype=obj -triple riscv32 < %s \			# RUN: llvm-mc -filetype=obj -triple riscv32 < %s \
	# RUN: | llvm-objdump -riscv-no-aliases -d - \			# RUN: | llvm-objdump -riscv-no-aliases -d - \
	# RUN: | FileCheck -check-prefixes=CHECK-INST %s			# RUN: | FileCheck -check-prefixes=CHECK-EXPAND,CHECK-INST %s
	# RUN: llvm-mc -filetype=obj -triple riscv32 < %s \			# RUN: llvm-mc -filetype=obj -triple riscv32 < %s \
	# RUN: | llvm-objdump -d - \			# RUN: | llvm-objdump -d - \
	# RUN: | FileCheck -check-prefixes=CHECK-ALIAS %s			# RUN: | FileCheck -check-prefixes=CHECK-EXPAND,CHECK-ALIAS %s

				# The following check prefixes are used in this test:
				# CHECK-INST.....Match the canonical instr (tests alias to instr. mapping)
				# CHECK-ALIAS....Match the alias (tests instr. to alias mapping)
				# CHECK-EXPAND...Match canonical instr. unconditionally (tests alias expansion)

				# CHECK-INST: addi a0, zero, 0
				# CHECK-ALIAS: mv a0, zero
				li x10, 0
				# CHECK-EXPAND: addi a0, zero, 1
				li x10, 1
				# CHECK-EXPAND: addi a0, zero, -1
				li x10, -1
				# CHECK-EXPAND: addi a0, zero, 2047
				li x10, 2047
				# CHECK-EXPAND: addi a0, zero, -2047
				li x10, -2047
				# CHECK-EXPAND: lui a1, 1
				# CHECK-EXPAND: addi a1, a1, -2048
				li x11, 2048
				# CHECK-EXPAND: addi a1, zero, -2048
				li x11, -2048
				# CHECK-EXPAND: lui a1, 1
				# CHECK-EXPAND: addi a1, a1, -2047
				li x11, 2049
				# CHECK-EXPAND: lui a1, 1048575
				# CHECK-EXPAND: addi a1, a1, 2047
				li x11, -2049
				# CHECK-EXPAND: lui a1, 1
				# CHECK-EXPAND: addi a1, a1, -1
				li x11, 4095
				# CHECK-EXPAND: lui a1, 1048575
				# CHECK-EXPAND: addi a1, a1, 1
				li x11, -4095
				# CHECK-EXPAND: lui a2, 1
				li x12, 4096
				# CHECK-EXPAND: lui a2, 1048575
				li x12, -4096
				# CHECK-EXPAND: lui a2, 1
				# CHECK-EXPAND: addi a2, a2, 1
				li x12, 4097
				# CHECK-EXPAND: lui a2, 1048575
				# CHECK-EXPAND: addi a2, a2, -1
				li x12, -4097
				# CHECK-EXPAND: lui a2, 524288
				# CHECK-EXPAND: addi a2, a2, -1
				li x12, 2147483647
				# CHECK-EXPAND: lui a2, 524288
				# CHECK-EXPAND: addi a2, a2, 1
				li x12, -2147483647
				# CHECK-EXPAND: lui a2, 524288
				li x12, -2147483648

	# CHECK-INST: csrrs t4, 3202, zero			# CHECK-INST: csrrs t4, 3202, zero
	# CHECK-ALIAS: rdinstreth t4			# CHECK-ALIAS: rdinstreth t4
	rdinstreth x29			rdinstreth x29
	# CHECK-INST: csrrs s11, 3200, zero			# CHECK-INST: csrrs s11, 3200, zero
	# CHECK-ALIAS: rdcycleh s11			# CHECK-ALIAS: rdcycleh s11
	rdcycleh x27			rdcycleh x27
	# CHECK-INST: csrrs t3, 3201, zero			# CHECK-INST: csrrs t3, 3201, zero
	# CHECK-ALIAS: rdtimeh t3			# CHECK-ALIAS: rdtimeh t3
	rdtimeh x28			rdtimeh x28
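
The RV32 `CHECK-EXPAND` expectations above are consistent with the usual `lui`/`addi` split: `addi` sign-extends its 12-bit immediate, so the upper 20 bits have to be rounded up whenever bit 11 of the constant is set. The following is a minimal Python sketch of that split, not the patch's C++ code; the helper names `sign_extend`, `split_li32`, and `expand_li32` are hypothetical and chosen only for illustration.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split_li32(imm):
    """Return (hi20, lo12) such that (hi20 << 12) + lo12 == imm modulo 2**32."""
    lo12 = sign_extend(imm, 12)
    # Subtracting the signed low part rounds the upper part up when bit 11 is set.
    hi20 = ((imm - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


def expand_li32(rd, imm):
    """Return the instruction strings a `li rd, imm` would expand to on RV32."""
    hi20, lo12 = split_li32(imm)
    if hi20 == 0:
        # The constant fits in a signed 12-bit immediate: a single addi suffices.
        return [f"addi {rd}, zero, {lo12}"]
    seq = [f"lui {rd}, {hi20}"]
    if lo12 != 0:
        seq.append(f"addi {rd}, {rd}, {lo12}")
    return seq


if __name__ == "__main__":
    for value in (2047, 2048, -2049, 4096, 2147483647, -2147483648):
        print(value, expand_li32("a1", value))
```

Under these assumptions, `expand_li32("a1", 2048)` yields `lui a1, 1` followed by `addi a1, a1, -2048`, which is the sequence the test expects for `li x11, 2048`.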

test/MC/RISCV/rv64i-aliases-valid.s

	# RUN: llvm-mc %s -triple=riscv64 -riscv-no-aliases \			# RUN: llvm-mc %s -triple=riscv64 -riscv-no-aliases \
	# RUN: | FileCheck -check-prefix=CHECK-INST %s			# RUN: | FileCheck -check-prefixes=CHECK-EXPAND,CHECK-INST %s
	# RUN: llvm-mc %s -triple=riscv64 \			# RUN: llvm-mc %s -triple=riscv64 \
	# RUN: | FileCheck -check-prefix=CHECK-ALIAS %s			# RUN: | FileCheck -check-prefixes=CHECK-EXPAND,CHECK-ALIAS %s
	# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \			# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \
	# RUN: | llvm-objdump -riscv-no-aliases -d - \			# RUN: | llvm-objdump -riscv-no-aliases -d - \
	# RUN: | FileCheck -check-prefix=CHECK-INST %s			# RUN: | FileCheck -check-prefixes=CHECK-EXPAND,CHECK-INST %s
	# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \			# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \
	# RUN: | llvm-objdump -d - \			# RUN: | llvm-objdump -d - \
	# RUN: | FileCheck -check-prefix=CHECK-ALIAS %s			# RUN: | FileCheck -check-prefixes=CHECK-EXPAND,CHECK-ALIAS %s

				# The following check prefixes are used in this test:
				# CHECK-INST.....Match the canonical instr (tests alias to instr. mapping)
				# CHECK-ALIAS....Match the alias (tests instr. to alias mapping)
				# CHECK-EXPAND...Match canonical instr. unconditionally (tests alias expansion)

	# TODO ld			# TODO ld
	# TODO sd			# TODO sd

				# CHECK-INST: addiw a0, zero, 0
				# CHECK-ALIAS: sext.w a0, zero
				li x10, 0
				# CHECK-EXPAND: addiw a0, zero, 1
				li x10, 1
				# CHECK-EXPAND: addiw a0, zero, -1
				li x10, -1
				# CHECK-EXPAND: addiw a0, zero, 2047
				li x10, 2047
				# CHECK-EXPAND: addiw a0, zero, -2047
				li x10, -2047
				# CHECK-EXPAND: lui a1, 1
				# CHECK-EXPAND: addiw a1, a1, -2048
				li x11, 2048
				# CHECK-EXPAND: addiw a1, zero, -2048
				li x11, -2048
				# CHECK-EXPAND: lui a1, 1
				# CHECK-EXPAND: addiw a1, a1, -2047
				li x11, 2049
				# CHECK-EXPAND: lui a1, 1048575
				# CHECK-EXPAND: addiw a1, a1, 2047
				li x11, -2049
				# CHECK-EXPAND: lui a1, 1
				# CHECK-EXPAND: addiw a1, a1, -1
				li x11, 4095
				# CHECK-EXPAND: lui a1, 1048575
				# CHECK-EXPAND: addiw a1, a1, 1
				li x11, -4095
				# CHECK-EXPAND: lui a2, 1
				li x12, 4096
				# CHECK-EXPAND: lui a2, 1048575
				li x12, -4096
				# CHECK-EXPAND: lui a2, 1
				# CHECK-EXPAND: addiw a2, a2, 1
				li x12, 4097
				# CHECK-EXPAND: lui a2, 1048575
				# CHECK-EXPAND: addiw a2, a2, -1
				li x12, -4097
				# CHECK-EXPAND: lui a2, 524288
				# CHECK-EXPAND: addiw a2, a2, -1
				li x12, 2147483647
				# CHECK-EXPAND: lui a2, 524288
				# CHECK-EXPAND: addiw a2, a2, 1
				li x12, -2147483647
				# CHECK-EXPAND: lui a2, 524288
				li x12, -2147483648

				# CHECK-EXPAND: addiw t0, zero, 1
				# CHECK-EXPAND: slli t0, t0, 32
				li t0, 0x100000000
				# CHECK-EXPAND: addiw t1, zero, -1
				# CHECK-EXPAND: slli t1, t1, 63
				li t1, 0x8000000000000000
				# CHECK-EXPAND: lui t2, 9321
				# CHECK-EXPAND: addiw t2, t2, -1329
				# CHECK-EXPAND: slli t2, t2, 35
				li t2, 0x1234567800000000
				# CHECK-EXPAND: addiw t3, zero, 7
				# CHECK-EXPAND: slli t3, t3, 36
				# CHECK-EXPAND: addi t3, t3, 11
				# CHECK-EXPAND: slli t3, t3, 24
				# CHECK-EXPAND: addi t3, t3, 15
				li t3, 0x700000000B00000F
				# CHECK-EXPAND: lui t4, 583
				# CHECK-EXPAND: addiw t4, t4, -1875
				# CHECK-EXPAND: slli t4, t4, 14
				# CHECK-EXPAND: addi t4, t4, -947
				# CHECK-EXPAND: slli t4, t4, 12
				# CHECK-EXPAND: addi t4, t4, 1511
				# CHECK-EXPAND: slli t4, t4, 13
				# CHECK-EXPAND: addi t4, t4, -272
				li t4, 0x123456789abcdef0

	# CHECK-INST: subw t6, zero, ra			# CHECK-INST: subw t6, zero, ra
	# CHECK-ALIAS: negw t6, ra			# CHECK-ALIAS: negw t6, ra
				efriedma: This seems a little unfortunate... given you can load an arbitrary 32-bit immediate in two instructions, you should be able to load a 64-bit immediate in six instructions ("hi << 32 | lo"). But I guess that requires a second register?
				niosHD (author): Correct, with a second register, 6 instructions would be sufficient. Unfortunately, using a second register is, at least for the assembler, not an option. On the other hand, during codegen I think we should invest these two (virtual) registers. Additionally, in the long term, loading the constant from a constant pool should be evaluated, given that it could be even more efficient (assuming RV64I: 64-bit constant + 1 load + at most 2 instructions for the address calculation).
				asb: For what it's worth, the RV64I codegen patches (not yet merged) do just use two registers and six instructions, but this is done in a dumb way that fails to recognise cases where <6 instructions can be used. Fully agree that it will be worth looking at using the constant pool.
	negw x31, x1			negw x31, x1
	# CHECK-INST: addiw t6, ra, 0			# CHECK-INST: addiw t6, ra, 0
	# CHECK-ALIAS: sext.w t6, ra			# CHECK-ALIAS: sext.w t6, ra
	sext.w x31, x1			sext.w x31, x1
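
The inline thread above on `li t4, 0x123456789abcdef0` mentions a six-instruction alternative that uses a second register. The following is a minimal Python sketch of one way such a sequence could be built and checked; it is neither what this patch emits nor the RV64I codegen patches asb refers to. The names `load32`, `li64_two_regs`, and `run`, and the choice of `add` rather than `or` to combine the halves, are assumptions made for illustration. Because `lui` and `addiw` leave a sign-extended 32-bit value in the register on RV64, the upper half is adjusted by the sign of the lower half before shifting.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def load32(reg, value):
    """Always emit lui + addiw for a sign-extended 32-bit value (no shortcuts)."""
    lo12 = sign_extend(value, 12)
    hi20 = ((value - lo12) >> 12) & 0xFFFFF
    return [("lui", reg, hi20), ("addiw", reg, reg, lo12)]


def li64_two_regs(rd, tmp, imm):
    """Build a fixed six-instruction sequence that loads imm into rd using tmp."""
    imm &= MASK64
    lo = sign_extend(imm, 32)               # value left in tmp after lui + addiw
    hi = sign_extend((imm - lo) >> 32, 32)  # compensates for a negative lo
    return (load32(rd, hi) + [("slli", rd, rd, 32)]
            + load32(tmp, lo) + [("add", rd, rd, tmp)])


def run(seq):
    """Tiny RV64 model of the four opcodes used above; returns the register file."""
    regs = {}
    for op, rd, *args in seq:
        if op == "lui":
            regs[rd] = sign_extend(args[0] << 12, 32) & MASK64
        elif op == "addiw":
            regs[rd] = sign_extend(regs[args[0]] + args[1], 32) & MASK64
        elif op == "slli":
            regs[rd] = (regs[args[0]] << args[1]) & MASK64
        elif op == "add":
            regs[rd] = (regs[args[0]] + regs[args[1]]) & MASK64
    return regs


if __name__ == "__main__":
    seq = li64_two_regs("t4", "t5", 0x123456789ABCDEF0)
    for insn in seq:
        print(insn)
    print(hex(run(seq)["t4"]))
```

This sketch always emits six instructions, even for constants that need fewer, which is the trade-off discussed in the thread; the single-register expansion tested above needs eight instructions for `0x123456789abcdef0` but never clobbers a scratch register.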

test/MC/RISCV/rvi-aliases-valid.s

	Show All 13 Lines
	# RUN: | FileCheck -check-prefix=CHECK-ALIAS %s			# RUN: | FileCheck -check-prefix=CHECK-ALIAS %s
	# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \			# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \
	# RUN: | llvm-objdump -d -riscv-no-aliases - \			# RUN: | llvm-objdump -d -riscv-no-aliases - \
	# RUN: | FileCheck -check-prefix=CHECK-INST %s			# RUN: | FileCheck -check-prefix=CHECK-INST %s
	# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \			# RUN: llvm-mc -filetype=obj -triple riscv64 < %s \
	# RUN: | llvm-objdump -d - \			# RUN: | llvm-objdump -d - \
	# RUN: | FileCheck -check-prefix=CHECK-ALIAS %s			# RUN: | FileCheck -check-prefix=CHECK-ALIAS %s

				# The following check prefixes are used in this test:
				# CHECK-INST.....Match the canonical instr (tests alias to instr. mapping)
				# CHECK-ALIAS....Match the alias (tests instr. to alias mapping)

	# TODO la			# TODO la
	# TODO lb lh lw			# TODO lb lh lw
	# TODO sb sh sw			# TODO sb sh sw

	# CHECK-INST: addi zero, zero, 0			# CHECK-INST: addi zero, zero, 0
	# CHECK-ALIAS: nop			# CHECK-ALIAS: nop
	nop			nop
	# TODO li
	# CHECK-INST: addi t6, zero, 0			# CHECK-INST: addi t6, zero, 0
	# CHECK-ALIAS: mv t6, zero			# CHECK-ALIAS: mv t6, zero
	mv x31, zero			mv x31, zero
	# CHECK-INST: xori t6, ra, -1			# CHECK-INST: xori t6, ra, -1
				asb: Having CHECK-INST and CHECK-ALIAS makes sense in this file. Adding in CHECK to the mix makes it a little confusing. Maybe the li cases that don't 'round-trip' belong in a separate test file? e.g. to match gnu as behaviour we'd expect `li x3, 0x80` to be printed by objdump, but `li x4, 0x800` would be expanded to lui+addiw and will never appear in objdump output.
				niosHD (author): If you are not strongly against it I would prefer to keep all pseudo instruction test cases together in the respective `...aliases-valid` and `...aliases-invalid` files. We already have many files in the RISCV MC test directory and I am hesitant to add even more without real need. I expect that the remaining pseudo instructions most likely will not properly roundtrip either and certainly do not want to add new test files for every individual instruction. Also, as you noted, some `li` instructions will roundtrip depending on the specified immediate. Splitting the files based on this property would introduce even more test files given that RV32 and RV64 need separate tests too.
				asb: Yes, I see your concern. My main problem is that when there was just CHECK-INST and CHECK-ALIAS it was fairly obvious what the different check lines meant. A comment in the file that explains the different check lines might make it easier on the reader. I suppose `CHECK-EXPAND` might be a little more descriptive, seeing as we're verifying that the pseudoinstruction expands to the expected multi-instruction sequence?
	# CHECK-ALIAS: not t6, ra			# CHECK-ALIAS: not t6, ra
	not x31, x1			not x31, x1
	# CHECK-INST: sub t6, zero, ra			# CHECK-INST: sub t6, zero, ra
	# CHECK-ALIAS: neg t6, ra			# CHECK-ALIAS: neg t6, ra
	neg x31, x1			neg x31, x1
	# CHECK-INST: sltiu t6, ra, 1			# CHECK-INST: sltiu t6, ra, 1
	# CHECK-ALIAS: seqz t6, ra			# CHECK-ALIAS: seqz t6, ra
	seqz x31, x1			seqz x31, x1
	▲ Show 20 Lines • Show All 104 Lines • Show Last 20 Lines

Revision Contents

Diff 133026
