This is an archive of the discontinued LLVM Phabricator instance.

Differential D46018

[GlobalISel][IRTranslator] Split aggregates during IR translation
Closed, Public

Authored by aemerson on Apr 24 2018, 8:54 AM.

Details

Reviewers

dsanders
aditya_nandakumar
qcolombet
bogner
volkan
rtereshin
t.p.northover
javed.absar
rovka

Commits

rG0d6a26dffcf3: [GlobalISel][IRTranslator] Split aggregates during IR translation.
rL332449: [GlobalISel][IRTranslator] Split aggregates during IR translation.

Summary

We currently handle all aggregates by creating one large LLT, and letting the legalizer deal with splitting them up. However using this approach means that we can't support big endian code correctly.

This patch changes the way that the IRTranslator deals with aggregate values, by splitting them up into their constituent element values. To do this, parts of the translator need to be modified to deal with multiple VRegs for a single Value.

A new Value to VReg mapper is introduced to help keep compile time under control. Currently there is no measurable impact on CTMark, despite the extra code being generated in some cases.

Patch is based on the original work of Tim Northover.

Diff Detail

Repository: rL LLVM

Event Timeline

aemerson created this revision. Apr 24 2018, 8:54 AM

Herald added a reviewer: javed.absar. Apr 24 2018, 8:54 AM

Herald added subscribers: kristof.beyls, rovka.

aemerson edited the summary of this revision. Apr 24 2018, 8:57 AM

aemerson added a reviewer: rovka.

rtereshin added inline comments. Apr 26 2018, 8:39 PM

include/llvm/CodeGen/GlobalISel/IRTranslator.h
78	This looks like over 128 bytes per `Value` or more. How does memory consumption change with this patch? If it ends up being a problem, we might reuse the `MachineInstr`s approach with storing machine memory operands. It's a similar pattern - in the absolute majority of cases we have just a one to one mapping (in the case of value to vregs, or one to zero in the case of machine instruction to memory operands), and only sometimes is it one to many.

aemerson added inline comments. Apr 29 2018, 2:33 PM

include/llvm/CodeGen/GlobalISel/IRTranslator.h
78	I haven't measured memory consumption, but changing the SmallVector capacities to 1 has no impact on the compile time on CTMark so I'll do that. Can you be more specific about the MachineInstr thing? Which bit of code are you referring to? I think at capacities of 1 this should be fine.

rtereshin added inline comments. Apr 30 2018, 11:56 AM

include/llvm/CodeGen/GlobalISel/IRTranslator.h
78	> I haven't measured memory consumption, but changing the SmallVector capacities to 1 has no impact on the compile time on CTMark so I'll do that.

Makes sense.

I came up with 128+ bytes as follows:

1. Every SmallVector has `3 * sizeof(intptr_t)` worth of overhead (size, capacity, pointer to the underlying array). It looks like we have 2 of them per `Value`, so 6 `intptr_t`s of overhead.
2. We also have 2 separate maps that in the very best case consume 4 `intptr_t`s per `Value` on top of that, more likely about twice that if `DenseMap`s keep the load factor around 0.5. I didn't check that, but AFAIK it's a common value.
3. The actual data consume 2 `intptr_t`s for 4 `unsigned`s and 4 `intptr_t`s for 4 `uint64_t`s on 64-bit platforms.

Overall it's optimistically `(10 + 6) * sizeof(intptr_t)` = 128+ bytes per `Value` on 64-bit platforms.

Reducing the default `SmallVector` sizes to 1 changes only (3), to 1 (due to alignment) + 1 `intptr_t`s, with overall memory consumption `(10 + 2) * sizeof(intptr_t)` = 96+ bytes per `Value` on 64-bit platforms.

If we put a vreg and its offset in the same `pair` and have a single `SmallVector` and a single map, it will go down as follows:

1. 3 `intptr_t`s of overhead per `Value` due to `SmallVector`.
2. 2+ `intptr_t`s of overhead per `Value` due to the pointer to pointer map.
3. 2 `intptr_t`s of actual payload, assuming the default size of `SmallVector` being 1.

Overall `(5 + 2) * sizeof(intptr_t)` = 56+ bytes per `Value` on 64-bit platforms. If the memory consumption is not a problem, that should be more than enough I think.

The memory operands trick I was referring to lives roughly here:

https://github.com/llvm-mirror/llvm/blob/2a793f6500a1a77cb7186549a6f9245bea847cf5/include/llvm/CodeGen/MachineInstr.h#L107-L114
https://github.com/llvm-mirror/llvm/blob/2a793f6500a1a77cb7186549a6f9245bea847cf5/lib/CodeGen/MachineInstr.cpp#L318-L332
https://github.com/llvm-mirror/llvm/blob/2a793f6500a1a77cb7186549a6f9245bea847cf5/include/llvm/CodeGen/MachineInstr.h#L1304-L1313

It won't save much here on top of the "just one SmallVector and one map" suggestion, basically just the `capacity` part of `SmallVector`, AFAICT. We definitely don't need it if the memory consumption is not an issue here.

All of this is just a mere suggestion, we can keep it as it is and worry about it later if it becomes an apparent problem, I think.

UPD. I took a look at the `DenseMap` implementation, and it appears to me that the expected value of its load factor is 9/16, so it takes approximately twice as much memory as estimated above. Therefore estimations change from 128 -> 96 -> 56 to 160 -> 128 -> 72.

aemerson added inline comments. May 8 2018, 6:49 AM

include/llvm/CodeGen/GlobalISel/IRTranslator.h
78	Thanks for the analysis. I don't think we can have just a single SmallVector because we need the ability to dynamically grow each set of vregs and/or offsets, so storing one SmallVector for each Value is better in terms of the design despite the extra memory cost.

Hi Amara,

I think I'm about half-way through, didn't touch the {pack,unpack}Regs functions and the whole story with interfacing with legacy call-lowering yet.

This is great work, thank you for doing this!

Roman

lib/CodeGen/GlobalISel/IRTranslator.cpp
121	Maybe this way it's just a bit more straightforward:

    for (unsigned I = 0, E = STy->getNumElements(); I != E; ++I)
      computeValueLLTs(DL, *STy->getElementType(I), ValueTys, Offsets,
                       StartingOffset + SL->getElementOffset(I));

It's alright as it is, of course, just in case you would like the version above better yourself.
128	I think we're trying to stick to capitalized local variables' names, including loop induction variables.
157	Looks like technically for aggregate constants `getOrCreateVRegs` has worst case complexity of `O(N^2)`, for constants like `{ i8 0, { i8 0, { i8 0, ... } } }`, but this is probably not easy to avoid and such constants hopefully don't happen often.
173	This piece in general is not exactly trivial, perhaps it makes sense to add test cases having multiple deeply nested aggregate constants sharing some parts so `getOrCreateVRegs` would visit an actual non-tree DAG while traversing them.
176	`VRegs->front()` maybe
474	Same note as for `insertvalue`, do we really need to build these copies here?
508	Let's say we have a large aggregate that eventually maps to N vregs, and also N `insertvalue`s each of which replaces just a single item (vreg). Will we end up generating N^2 `COPY`s for the IR of initial size `O(N)` here? Granted, this probably happens very rarely in practice, but what if we don't `getOrCreateVRegs(U)`, but rather just `get(U)` (no implicit `MIR.createGenericRegister` calls), resize, and assign either `*InsertedIt++` or `SrcRegs[i]` to it directly w/o building any explicit `COPY`s at all? We're just remapping values back and forth here during translation, with always constant indices, so the whole thing could be copy-propagated on the fly avoiding quite a bit of MIR-churn.

rtereshin added inline comments. May 9 2018, 12:04 AM

include/llvm/CodeGen/GlobalISel/IRTranslator.h
78	I didn't mean the same `SmallVector` for all values in the function, I meant one `SmallVector` for each `Value`, just as you say, but one (of `std::pair<unsigned, int64_t>`s, for instance), instead of two per each value (for offsets and vregs themselves).
lib/CodeGen/GlobalISel/IRTranslator.cpp
467	For better or for worse, `getIndexedOffsetInType` returns `int64_t`, not `uint64_t`.
494	This logic of getting indices from values is quite repeated here and in `extractvalue`, do you think it makes sense to refactor it out as a separate function?
1275	This could be inlined as `MIRBuilder.buildInstr(TargetOpcode::G_PHI, Reg);`

aemerson added inline comments. May 10 2018, 1:23 PM

include/llvm/CodeGen/GlobalISel/IRTranslator.h
78	That's possible, I'd rather do it in a follow up patch.
lib/CodeGen/GlobalISel/IRTranslator.cpp
128	To me using a capital I normally indicates an iterator while a lower case 'i' is assumed to be an integer type, and we already use lower case 'i' idiomatically everywhere including GISel.
494	Could do.
508	I don't know what you mean by `get(U)`. What would it return?

Also, I looked briefly into https://bugs.llvm.org/show_bug.cgi?id=37397 recently, and I tried to apply this patch and see what it would do. It generated considerably more code (final assembly) comparing to itself and FastISel both (I'd eyeball the difference as 3x) and didn't seem to fix the issue with initializing upper bits of a boolean value.

Generally, I feel concerned about the quality (runtime performance) of generated code with this approach.

include/llvm/CodeGen/GlobalISel/IRTranslator.h
78	Another thing to notice is that apparently all the offsets depend on the value's type, not the value itself, therefore they could be cached (separately from vregs this time) by using the value type as a key, not the value itself.
lib/CodeGen/GlobalISel/IRTranslator.cpp
128	@bogner It's a valid point:

    > grep -Ern 'for.* i\s=\s\d' ./ --include='.cpp' --include='.h' --exclude-dir=build | wc -l
    17473
    > grep -Ern 'for.* I\s=\s\d' ./ --include='.cpp' --include='.h' --exclude-dir=build | wc -l
    1617
508	It was supposed to be more like `allocateVRegs(U)`. What I mean here is that let's say we only allocate the required number of vregs for `U` w/o initializing them with anything, and then just assign to a `DstRegs[i]` either `*InsertedIt++` or `SrcRegs[i]` depending on the offset, like this:

    // `VMap.insertVRegs(U)` + initializing offsets
    MutableArrayRef<unsigned> DstRegs = allocateVRegs(U);
    // getting just initialized offsets
    ArrayRef<uint64_t> DstOffsets = VMap.getOffsets(U);
    ArrayRef<unsigned> SrcRegs = getOrCreateVRegs(Src);
    ArrayRef<unsigned> InsertedRegs = getOrCreateVRegs(U.getOperand(1));
    auto InsertedIt = InsertedRegs.begin();
    for (unsigned i = 0; i < DstRegs.size(); ++i) {
      if (DstOffsets[i] >= Offset && InsertedIt != InsertedRegs.end())
        DstRegs[i] = *InsertedIt++;
      else
        DstRegs[i] = SrcRegs[i];
    }

avoiding building any copies. Also, if the offsets are cached by value type (as suggested in a comment above), `allocateVRegs(U)` won't visit `U` at all, as `Src` is already processed (assuming we IR Translate top to bottom) and we have all the offsets cached for `U->getType()` already and `U` and `Src` have the same type.

bogner added inline comments. May 10 2018, 2:58 PM

lib/CodeGen/GlobalISel/IRTranslator.cpp
128	The coding standards certainly imply that variables should all be capitalized, though they don't call out loop variables explicitly: http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

I tend to stick to capitalizing in llvm here, if only because I find it very hard to talk about code if both "I" and "i" end up being in scope at the same time. In any case, if lower case "i" is meant to be canonical someone should really patch the coding standards to say so.

In D46018#1095200, @rtereshin wrote:

Also, I looked briefly into https://bugs.llvm.org/show_bug.cgi?id=37397 recently, and I tried to apply this patch and see what it would do. It generated considerably more code (final assembly) comparing to itself and FastISel both (I'd eyeball the difference as 3x) and didn't seem to fix the issue with initializing upper bits of a boolean value.

Generally, I feel concerned about the quality (runtime performance) of generated code with this approach.

My thoughts were that we'd take a compile time hit with this, although in the end it was mostly neutral, but the code size would regress as you saw. The idea was to have a pre-legalizer combiner (Aditya had a prototype demo of one to show how the combiner API would work). Either that, or we have a clean up phase at the end of IRTranslator where we eliminate G_INSERT and G_EXTRACT pairs, perhaps with some caching of potentially redundant sets of these instructions due to pack/unpack regs as translation is happening, and then a quick pass to eliminate them without searching the whole function.

To support my notes on performance and memory consumption side of things I have tried GlobalISel and FastISel / SelectionDAG ISel before and after this patch on the following test case

test.ll (56 KB)

(~1k (1,000) lines of LLVM IR):

GlobalISel before: Total: 55 ms / 3.5k lines of assembly, IRTranslator: 0.5M / 0.5 ms / 3k of MIR lines, Legalizer: 40M / 30.5 ms / 137k of MIR lines
GlobalISel after: Total: 920 ms / 8k lines of assembly, IRTranslator: 315M / 250 ms / 2,110k of MIR lines, Legalizer: 0.06M / 12 ms / 2,110k of MIR lines
FastISel / Selection DAG ISel: Total: 7,400 ms / 4k lines of assembly, AArch64 Instruction Selection: 2M / 7,340 ms / 4k of MIR lines

In all cases instruction selection is the worst offender in terms of memory consumption.

It’s nice to see that FastISel doesn’t really manage to select that test case: it falls back to SelectionDAG ISel, which operates in -O0 mode and ends up working for much, much longer while still producing poor assembly. It would behave the same way for x86-64. If tried in -O1 mode it would burn the same amount of time, but produce just 16 lines of assembly.

However, the memory consumption side of the story looks pretty bad for GlobalISel even before this patch, and looks much worse after. It could be noticed also, that memory consumption grows as O(N^2) with the size of the input LLVM IR.

I don’t think it’s acceptable.

I've made the changes to not generate copies for the insertvalue/extractvalue, but how did you measure the memory consumption of the individual passes? With the change it now produces 4k lines of assembly on your test case, vs 3.5k assembly. The MIR is much larger but that's because of the extra GEPs and G_CONSTANTs. Compile time is now comparable to without this change.

In D46018#1098046, @aemerson wrote:

I've made the changes to not generate copies for the insertvalue/extractvalue

That's great! Thanks!

but how did you measure the memory consumption of the individual passes?

-time-passes option of llc has a neat modifier: -track-memory. llc -time-passes -track-memory will report the diff in memory footprint for each pass as a difference before and after (so it could be negative if a pass frees a lot of memory). My guess is that this also means it doesn't report the peak memory consumption if the peak happens mid-pass, which is rather unfortunate. But I'm not sure about that one.

Another way to look at things on a grand scale - no individual passes data - is to do llc -time-passes -time-compilations=1000 (or so) so llc loops for a long time and just watch the memory data in top. top seems to be largely unable to catch spikes in memory consumption as well, unfortunately; it's probably a sampling tool, so it misses brief increases in memory footprint. Or at least it feels that way. Or I'm using it wrong. Please share any ideas about all of this.

With the change it now produces 4k lines of assembly on your test case, vs 3.5k assembly. The MIR is much larger but that's because of the extra GEPs and G_CONSTANTs. Compile time is now comparable to without this change.

Awesome, could you please update the diff here at Phabricator?

Thanks!
Roman

Thanks. -time-passes shows that IRTranslator is now using around 2MB. It's also taking most of the compile time vs other passes like your initial analysis showed, but we trade off the legalizer doing much less work.

I haven't run this version of the patch through much testing yet.

Some review comments not addressed yet, will add a test requested in a later revision.

In D46018#1098606, @aemerson wrote:

Thanks. -time-passes shows that IRTranslator is now using around 2MB. It's also taking most of the compile time vs other passes like your initial analysis showed, but we trade off the legalizer doing much less work.

I haven't run this version of the patch through much testing yet.

Looks like there is a higher precision way of measuring the memory consumption for the whole process on macOS: time -l. Given that with this specific test IRTranslator was allocating much more memory than anything else, I think it also makes sense to measure the entire llc compile from outside:

command time -l ./bin/llc -O0 -global-isel=true -global-isel-abort=2 -mtriple aarch64-- test.ll -o /dev/null -time-passes

And the results are great:

[GlobalISel Before]
49.5M / 68ms

[FastISel / SelectionDAG ISel]
57.3M / 7,250ms

[GlobalISel After / Original Patch]
337.7M / 740ms

[GlobalISel After / Updated Patch]
29.5M / 40ms

(memory figure is "maximum resident set size" as shown by time -l, the time figure is "User+System / Total" as shown by -time-passes)

I think it's a huge win and we can stop here as far as performance and memory consumption are concerned. Thank you for the update!

lib/CodeGen/GlobalISel/IRTranslator.cpp
409	`Offsets[i]` here needs to be in bytes as well.
413	Ditto, `MinAlign` expects everything in bytes as far as I can tell.
438	Ditto, `MachinePointerInfo` expects everything in bytes.
442	Ditto.

Hi Amara,

Alright, I'm done with the review.

General notes:

there is insufficient test coverage for cases like PHIs and nested / shared / repeated aggregate constants
there is code that is likely to be dead
pretty much every use-case of computeValueLLTs looks like a hack, tbh, maybe it could be helped if we actually start mapping Types instead of Values to offsets and LLTs.

None of it is a showstopper IMO though, please address what seems to be reasonable to get addressed in the initial patch and we will be good to go with this.

Thanks!
Roman

lib/CodeGen/GlobalISel/IRTranslator.cpp
154	Just `Regs.resize(SplitTys.size())` may be a little cleaner.
912	Indentation is off
965	The following may work a little nicer here:

    unsigned Res = IsSplitType
                       ? MRI->createGenericVirtualRegister(getLLTForType(CI.getType(), DL))
                       : getOrCreateVReg(CI);
1254	It looks like `getOrCreateVReg` (singular) is already implemented so it could be used here.
1307	There are two `i`s in scope here. Do we have a test covering this?
1324	This function is rather a hack, but I guess it's not a big deal.
1462–1469	I can't see how this condition could be true. And none of the tests fail if an assertion is inserted under this condition. If it is possible, could you please add a test? Thanks!
1481	could be replaced with range-based for.
1486	How could this not be true? As above, an assertion in the `else` branch shows `VRegs` is always empty here as far as the existing tests' coverage goes.
1490	`ArgIt` is incremented in both branches, maybe better to put it at the end of the loop.

aivchenk added a subscriber: aivchenk. May 15 2018, 1:51 AM

rtereshin added inline comments. May 15 2018, 2:11 AM

lib/CodeGen/GlobalISel/IRTranslator.cpp

1307

I think I've got a couple of tests for PHIs. They both seem to translate correctly:

test_phi_diamond.ll
test_phi_loop.ll

The test_phi_loop.ll one translates rather beautifully IMO:

body:             |
  bb.1.entry:
    liveins: $w0

    %0:_(s32) = COPY $w0
    %5:_(s32) = G_CONSTANT i32 1
    %7:_(s32) = G_CONSTANT i32 0
    %9:_(s64) = G_CONSTANT i64 0
    %10:_(s64) = G_CONSTANT i64 1

  bb.2.loop:
    %1:_(s32) = G_PHI %0(s32), %bb.1, %6(s32), %bb.2
    %2:_(s64) = G_PHI %9(s64), %bb.1, %3(s64), %bb.2
    %3:_(s64) = G_PHI %10(s64), %bb.1, %4(s64), %bb.2
    %4:_(s64) = G_ADD %2, %3
    %6:_(s32) = G_SUB %1, %5
    %8:_(s1) = G_ICMP intpred(sle), %1(s32), %7
    G_BRCOND %8(s1), %bb.3
    G_BR %bb.2

  bb.3.exit:
    $x0 = COPY %2(s64)
    RET_ReallyLR implicit $x0

Nested / shared / repeated aggregate constants & multi-level index tests: test_cons.ll - also seems to be translated correctly:

./bin/llc -O0 -global-isel=true -global-isel-abort=2 -mtriple aarch64--  -stop-after instruction-select test_cons.ll -o - -simplify-mir -verify-machineinstrs

body:             |
  bb.1.entry:
    liveins: $x0

    %0:gpr64sp = COPY $x0
    %34:gpr32 = MOVi32imm 10
    %33:gpr32 = MOVi32imm 20
    %32:gpr32 = MOVi32imm 50
    STRBBui %34, %0, 0 :: (store 1 into %ir.dst)
    STRBBui %33, %0, 1 :: (store 1 into %ir.dst + 1)
    STRBBui %34, %0, 2 :: (store 1 into %ir.dst + 2)
    STRBBui %33, %0, 3 :: (store 1 into %ir.dst + 3)
    STRBBui %32, %0, 4 :: (store 1 into %ir.dst + 4)
    STRBBui %34, %0, 5 :: (store 1 into %ir.dst + 5)
    STRBBui %33, %0, 6 :: (store 1 into %ir.dst + 6)
    STRBBui %33, %0, 7 :: (store 1 into %ir.dst + 7)
    STRBBui %34, %0, 0 :: (store 1 into %ir.dst)
    STRBBui %33, %0, 1 :: (store 1 into %ir.dst + 1)
    STRBBui %34, %0, 2 :: (store 1 into %ir.dst + 2)
    STRBBui %33, %0, 3 :: (store 1 into %ir.dst + 3)
    STRBBui %33, %0, 4 :: (store 1 into %ir.dst + 4)
    STRBBui %34, %0, 5 :: (store 1 into %ir.dst + 5)
    STRBBui %33, %0, 6 :: (store 1 into %ir.dst + 6)
    STRBBui %33, %0, 7 :: (store 1 into %ir.dst + 7)
    RET_ReallyLR

(this is already changed to pass values in bytes (instead of bits) as offset and align parameters to pointer info of memory operands)

I think I've addressed most of the issues. I've tried to clean up the computeValueLLTs function a little bit, still perhaps not ideal. Dead code has been removed (I think an earlier revision of the patch required it for the tests to pass). Also added the tests you wrote, thanks.

In D46018#1100254, @aemerson wrote:

I think I've addressed most of the issues. I've tried to clean up the computeValueLLTs function a little bit, still perhaps not ideal. Dead code has been removed (I think an earlier revision of the patch required it for the tests to pass). Also added the tests you wrote, thanks.

Hi Amara,

I think it's ready to get in, thank you for working with me on this one!

Roman

This revision is now accepted and ready to land. May 15 2018, 3:39 PM

Closed by commit rL332449: [GlobalISel][IRTranslator] Split aggregates during IR translation. (authored by aemerson). May 16 2018, 3:35 AM

This revision was automatically updated to reflect the committed changes.

rovka mentioned this in D63549: [GlobalISel] Accept multiple vregs in lowerFormalArgs. Jun 19 2019, 6:54 AM

rovka mentioned this in D63550: [GlobalISel] Accept multiple vregs for lowerCall's result. Jun 19 2019, 6:57 AM

rovka mentioned this in D63551: [GlobalISel] Accept multiple vregs for lowerCall's arguments. Jun 19 2019, 6:59 AM

rovka mentioned this in rL364510: [GlobalISel] Accept multiple vregs in lowerFormalArgs. Jun 27 2019, 1:55 AM

rovka mentioned this in rGc3dbe2397792: [GlobalISel] Accept multiple vregs in lowerFormalArgs.

rovka mentioned this in rG8138996128cd: [GlobalISel] Accept multiple vregs for lowerCall's result. Jun 27 2019, 2:21 AM

rovka mentioned this in rL364511: [GlobalISel] Accept multiple vregs for lowerCall's result.

rovka mentioned this in rG43fb5ae50c53: [GlobalISel] Accept multiple vregs for lowerCall's args.

rovka mentioned this in rL364512: [GlobalISel] Accept multiple vregs for lowerCall's args.

Revision Contents

Path                                                            Size
include/llvm/CodeGen/GlobalISel/IRTranslator.h                  118 lines
lib/CodeGen/GlobalISel/IRTranslator.cpp                         393 lines
lib/Target/AArch64/AArch64CallLowering.cpp                      3 lines
lib/Target/ARM/ARMCallLowering.cpp                              7 lines
test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll               21 lines
test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll           163 lines
test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll          18 lines
test/CodeGen/AArch64/GlobalISel/call-translator.ll              70 lines
test/CodeGen/AArch64/GlobalISel/irtranslator-exceptions.ll      7 lines
test/CodeGen/AArch64/GlobalISel/legalize-exceptions.ll          16 lines
test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll                 13 lines
test/CodeGen/ARM/GlobalISel/arm-param-lowering.ll               115 lines

Diff 143756

include/llvm/CodeGen/GlobalISel/IRTranslator.h

Show All 18 Lines
#ifndef LLVM_CODEGEN_GLOBALISEL_IRTRANSLATOR_H		#ifndef LLVM_CODEGEN_GLOBALISEL_IRTRANSLATOR_H
#define LLVM_CODEGEN_GLOBALISEL_IRTRANSLATOR_H		#define LLVM_CODEGEN_GLOBALISEL_IRTRANSLATOR_H

#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"		#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
#include "llvm/CodeGen/GlobalISel/Types.h"		#include "llvm/CodeGen/GlobalISel/Types.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
		#include "llvm/Support/Allocator.h"
#include "llvm/IR/Intrinsics.h"		#include "llvm/IR/Intrinsics.h"
#include <memory>		#include <memory>
#include <utility>		#include <utility>

namespace llvm {		namespace llvm {

class AllocaInst;		class AllocaInst;
class BasicBlock;		class BasicBlock;
Show All 23 Lines
class IRTranslator : public MachineFunctionPass {		class IRTranslator : public MachineFunctionPass {
public:		public:
static char ID;		static char ID;

private:		private:
/// Interface used to lower the everything related to calls.		/// Interface used to lower the everything related to calls.
const CallLowering *CLI;		const CallLowering *CLI;

/// Mapping of the values of the current LLVM IR function		/// This class contains the mapping between the Values to vreg related data.
/// to the related virtual registers.		class ValueToVRegInfo {
ValueToVReg ValToVReg;		public:
		ValueToVRegInfo() = default;

		using VRegListT = SmallVector<unsigned, 4>;
		using OffsetListT = SmallVector<uint64_t, 4>;

		using const_vreg_iterator =
		DenseMap<const Value *, VRegListT *>::const_iterator;
		using const_offset_iterator =
		DenseMap<const Value *, OffsetListT *>::const_iterator;

		inline const_vreg_iterator vregs_end() const { return ValToVRegs.end(); }

		VRegListT *getVRegs(const Value &V) {
		auto It = ValToVRegs.find(&V);
		if (It != ValToVRegs.end())
		return It->second;

		return insertVRegs(V);
		}

		OffsetListT *getOffsets(const Value &V) {
		auto It = ValToVRegOffsets.find(&V);
		if (It != ValToVRegOffsets.end())
		return It->second;

		return insertOffsets(V);
		}

		const_vreg_iterator findVRegs(const Value &V) const {
		return ValToVRegs.find(&V);
		}

		bool contains(const Value &V) const {
		return ValToVRegs.find(&V) != ValToVRegs.end();
		}

		void reset() {
		ValToVRegs.clear();
		ValToVRegOffsets.clear();
		VRegAlloc.DestroyAll();
		OffsetAlloc.DestroyAll();
		}
		private:
		VRegListT *insertVRegs(const Value &V) {
		assert(ValToVRegs.find(&V) == ValToVRegs.end() && "Value already exists");

		// We placement new using our fast allocator since we never try to free
		// the vectors until translation is finished.
		auto *VRegList = new (VRegAlloc.Allocate()) VRegListT();
		ValToVRegs[&V] = VRegList;
		return VRegList;
		}

		OffsetListT *insertOffsets(const Value &V) {
		assert(ValToVRegOffsets.find(&V) == ValToVRegOffsets.end() &&
		"Value already exists");

		auto *OffsetList = new (OffsetAlloc.Allocate()) OffsetListT();
		ValToVRegOffsets[&V] = OffsetList;
		return OffsetList;
		}

		SpecificBumpPtrAllocator<VRegListT> VRegAlloc;
		SpecificBumpPtrAllocator<OffsetListT> OffsetAlloc;

		// We store pointers to vectors here since references may be invalidated
		// while we hold them if we stored the vectors directly.
		DenseMap<const Value *, VRegListT *> ValToVRegs;
		DenseMap<const Value *, OffsetListT *> ValToVRegOffsets;
		};
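The reason given above for storing pointers to the vectors (references into the map may be invalidated while they are held) can be illustrated with a small standard-library sketch. `StableRegMap` and `RegList` are hypothetical names; the real class uses `DenseMap` with a bump allocator, whereas this sketch assumes `std::unordered_map` and heap allocation purely for illustration.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

using RegList = std::vector<unsigned>;

// Maps a key to a separately allocated list. The map may reorganise its own
// storage as it grows, but each list stays at a fixed address.
class StableRegMap {
public:
  // Returns the list for Key, creating an empty one on first use. The
  // returned pointer stays valid until reset(), however many keys are added.
  RegList *getOrInsert(int Key) {
    std::unique_ptr<RegList> &Slot = Lists[Key];
    if (!Slot)
      Slot.reset(new RegList());
    assert(Slot && "list must exist after insertion");
    return Slot.get();
  }

  bool contains(int Key) const { return Lists.count(Key) != 0; }

  // Drops every list at once, mirroring a per-function reset.
  void reset() { Lists.clear(); }

private:
  std::unordered_map<int, std::unique_ptr<RegList>> Lists;
};
```

A caller can hold the pointer returned for one key while inserting many other keys; the pointer still refers to the same list afterwards.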

		/// Mapping of the values of the current LLVM IR function to the related
		/// virtual registers and offsets. Code relies on iterators to these
		/// structures being stable during insertion so DenseMap (among others) is not
		/// appropriate.
		ValueToVRegInfo VMap;

// N.b. it's not completely obvious that this will be sufficient for every		// N.b. it's not completely obvious that this will be sufficient for every
// LLVM IR construct (with "invoke" being the obvious candidate to mess up our		// LLVM IR construct (with "invoke" being the obvious candidate to mess up our
// lives.		// lives.
DenseMap<const BasicBlock *, MachineBasicBlock *> BBToMBB;		DenseMap<const BasicBlock *, MachineBasicBlock *> BBToMBB;

// One BasicBlock can be translated to multiple MachineBasicBlocks. For such		// One BasicBlock can be translated to multiple MachineBasicBlocks. For such
// BasicBlocks translated to multiple MachineBasicBlocks, MachinePreds retains		// BasicBlocks translated to multiple MachineBasicBlocks, MachinePreds retains
// a mapping between the edges arriving at the BasicBlock to the corresponding		// a mapping between the edges arriving at the BasicBlock to the corresponding
// created MachineBasicBlocks. Some BasicBlocks that get translated to a		// created MachineBasicBlocks. Some BasicBlocks that get translated to a
// single MachineBasicBlock may also end up in this Map.		// single MachineBasicBlock may also end up in this Map.
using CFGEdge = std::pair<const BasicBlock *, const BasicBlock *>;		using CFGEdge = std::pair<const BasicBlock *, const BasicBlock *>;
DenseMap<CFGEdge, SmallVector<MachineBasicBlock *, 1>> MachinePreds;		DenseMap<CFGEdge, SmallVector<MachineBasicBlock *, 1>> MachinePreds;

// List of stubbed PHI instructions, for values and basic blocks to be filled		// List of stubbed PHI instructions, for values and basic blocks to be filled
// in once all MachineBasicBlocks have been created.		// in once all MachineBasicBlocks have been created.
SmallVector<std::pair<const PHINode *, MachineInstr *>, 4> PendingPHIs;		SmallVector<std::pair<const PHINode *, SmallVector<MachineInstr *, 1>>, 4>
		PendingPHIs;

/// Record of what frame index has been allocated to specified allocas for		/// Record of what frame index has been allocated to specified allocas for
/// this function.		/// this function.
DenseMap<const AllocaInst *, int> FrameIndices;		DenseMap<const AllocaInst *, int> FrameIndices;

/// \name Methods for translating form LLVM IR to MachineInstr.		/// \name Methods for translating form LLVM IR to MachineInstr.
/// \see ::translate for general information on the translate methods.		/// \see ::translate for general information on the translate methods.
/// @{		/// @{

/// Translate \p Inst into its corresponding MachineInstr instruction(s).		/// Translate \p Inst into its corresponding MachineInstr instruction(s).
/// Insert the newly translated instruction(s) right where the CurBuilder		/// Insert the newly translated instruction(s) right where the CurBuilder
/// is set.		/// is set.
///		///
/// The general algorithm is:		/// The general algorithm is:
/// 1. Look for a virtual register for each operand or		/// 1. Look for a virtual register for each operand or
/// create one.		/// create one.
/// 2 Update the ValToVReg accordingly.		/// 2 Update the VMap accordingly.
/// 2.alt. For constant arguments, if they are compile time constants,		/// 2.alt. For constant arguments, if they are compile time constants,
/// produce an immediate in the right operand and do not touch		/// produce an immediate in the right operand and do not touch
/// ValToReg. Actually we will go with a virtual register for each		/// ValToReg. Actually we will go with a virtual register for each
/// constants because it may be expensive to actually materialize the		/// constants because it may be expensive to actually materialize the
/// constant. Moreover, if the constant spans on several instructions,		/// constant. Moreover, if the constant spans on several instructions,
/// CSE may not catch them.		/// CSE may not catch them.
/// => Update ValToVReg and remember that we saw a constant in Constants.		/// => Update ValToVReg and remember that we saw a constant in Constants.
/// We will materialize all the constants in finalize.		/// We will materialize all the constants in finalize.
... (30 lines not shown)	private:
bool translateOverflowIntrinsic(const CallInst &CI, unsigned Op,		bool translateOverflowIntrinsic(const CallInst &CI, unsigned Op,
MachineIRBuilder &MIRBuilder);		MachineIRBuilder &MIRBuilder);

bool translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,		bool translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,
MachineIRBuilder &MIRBuilder);		MachineIRBuilder &MIRBuilder);

bool translateInlineAsm(const CallInst &CI, MachineIRBuilder &MIRBuilder);		bool translateInlineAsm(const CallInst &CI, MachineIRBuilder &MIRBuilder);

		// FIXME: temporary function to expose previous interface to call lowering
		// until it is refactored.
		/// Combines all component registers of \p V into a single scalar with size
		/// "max(Offsets) + last size".
		unsigned packRegs(const Value &V, MachineIRBuilder &MIRBuilder);

		void unpackRegs(const Value &V, unsigned Src, MachineIRBuilder &MIRBuilder);

		/// Returns true if the value should be split into multiple LLTs.
		/// If \p PopulateOffsets is true then the Value->offsets map will be modified
		/// based on the splitting information.
		bool valueIsSplit(const Value &V, bool PopulateOffsets = false);

/// Translate call instruction.		/// Translate call instruction.
/// \pre \p U is a call instruction.		/// \pre \p U is a call instruction.
bool translateCall(const User &U, MachineIRBuilder &MIRBuilder);		bool translateCall(const User &U, MachineIRBuilder &MIRBuilder);

bool translateInvoke(const User &U, MachineIRBuilder &MIRBuilder);		bool translateInvoke(const User &U, MachineIRBuilder &MIRBuilder);

bool translateLandingPad(const User &U, MachineIRBuilder &MIRBuilder);		bool translateLandingPad(const User &U, MachineIRBuilder &MIRBuilder);

... (219 lines not shown)	private:
std::unique_ptr<OptimizationRemarkEmitter> ORE;		std::unique_ptr<OptimizationRemarkEmitter> ORE;

// * Insert all the code needed to materialize the constants		// * Insert all the code needed to materialize the constants
// at the proper place. E.g., Entry block or dominator block		// at the proper place. E.g., Entry block or dominator block
// of each constant depending on how fancy we want to be.		// of each constant depending on how fancy we want to be.
// * Clear the different maps.		// * Clear the different maps.
void finalizeFunction();		void finalizeFunction();

/// Get the VReg that represents \p Val.		/// Get the VRegs that represent \p Val.
/// If such VReg does not exist, it is created.		/// Non-aggregate types have just one corresponding VReg and the list can be
unsigned getOrCreateVReg(const Value &Val);		/// used as a single "unsigned". Aggregates get flattened. If such VRegs do
		/// not exist, they are created.
		ArrayRef<unsigned> getOrCreateVRegs(const Value &Val);

		unsigned getOrCreateVReg(const Value &Val) {
		auto Regs = getOrCreateVRegs(Val);
		if (Regs.empty())
		return 0;
		assert(Regs.size() == 1 &&
		"attempt to get single VReg for aggregate or void");
		return Regs[0];
		}

/// Get the frame index that represents \p Val.		/// Get the frame index that represents \p Val.
/// If such VReg does not exist, it is created.		/// If such VReg does not exist, it is created.
int getOrCreateFrameIndex(const AllocaInst &AI);		int getOrCreateFrameIndex(const AllocaInst &AI);

/// Get the alignment of the given memory operation instruction. This will		/// Get the alignment of the given memory operation instruction. This will
/// either be the explicitly specified value or the ABI-required alignment for		/// either be the explicitly specified value or the ABI-required alignment for
/// the type being accessed (according to the Module's DataLayout).		/// the type being accessed (according to the Module's DataLayout).
... (50 lines not shown)

lib/CodeGen/GlobalISel/IRTranslator.cpp

... (100 lines not shown)	IRTranslator::IRTranslator() : MachineFunctionPass(ID) {
initializeIRTranslatorPass(*PassRegistry::getPassRegistry());		initializeIRTranslatorPass(*PassRegistry::getPassRegistry());
}		}

void IRTranslator::getAnalysisUsage(AnalysisUsage &AU) const {		void IRTranslator::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<TargetPassConfig>();		AU.addRequired<TargetPassConfig>();
MachineFunctionPass::getAnalysisUsage(AU);		MachineFunctionPass::getAnalysisUsage(AU);
}		}

unsigned IRTranslator::getOrCreateVReg(const Value &Val) {		static void computeValueLLTs(const DataLayout &DL, Type &Ty,
unsigned &ValReg = ValToVReg[&Val];		SmallVectorImpl<LLT> &ValueTys,
		SmallVectorImpl<uint64_t> &Offsets,
if (ValReg)		uint64_t StartingOffset = 0) {
return ValReg;		// Given a struct type, recursively traverse the elements.
		if (StructType *STy = dyn_cast<StructType>(&Ty)) {
		const StructLayout *SL = DL.getStructLayout(STy);
		for (StructType::element_iterator EB = STy->element_begin(),
		EI = EB,
		EE = STy->element_end();
		EI != EE; ++EI)
		computeValueLLTs(DL, **EI, ValueTys, Offsets,
		StartingOffset + SL->getElementOffset(EI - EB));
		rtereshin [Done]: Maybe this way it's just a bit more straightforward:
```
for (unsigned I = 0, E = STy->getNumElements(); I != E; ++I)
  computeValueLLTs(DL, *STy->getElementType(I), ValueTys, Offsets,
                   StartingOffset + SL->getElementOffset(I));
```
		It's alright as it is, of course; this is just in case you like the version above better yourself.
		return;
		}
		// Given an array type, recursively traverse the elements.
		if (ArrayType *ATy = dyn_cast<ArrayType>(&Ty)) {
		Type *EltTy = ATy->getElementType();
		uint64_t EltSize = DL.getTypeAllocSize(EltTy);
		for (unsigned i = 0, e = ATy->getNumElements(); i != e; ++i)
		rtereshin [Not Done]: I think we're trying to stick to capitalized names for local variables, including loop induction variables.
		aemerson (author) [Not Done]: To me, a capital I normally indicates an iterator while a lower case 'i' is assumed to be an integer type, and we already use lower case 'i' idiomatically everywhere, including GISel.
		rtereshin [Not Done]: @bogner It's a valid point:
```
> grep -Ern 'for.* i\s=\s\d' ./ --include='*.cpp' --include='*.h' --exclude-dir=build | wc -l
17473
> grep -Ern 'for.* I\s=\s\d' ./ --include='*.cpp' --include='*.h' --exclude-dir=build | wc -l
1617
```
		bogner [Not Done]: The coding standards certainly imply that variables should all be capitalized, though they don't call out loop variables explicitly: http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly I tend to stick to capitalizing in llvm here, if only because I find it very hard to talk about code if both "I" and "i" end up being in scope at the same time. In any case, if lower case "i" is meant to be canonical, someone should really patch the coding standards to say so.
		computeValueLLTs(DL, *EltTy, ValueTys, Offsets,
		StartingOffset + i * EltSize);
		return;
		}
		// Interpret void as zero return values.
		if (Ty.isVoidTy())
		return;
		// Base case: we can get an LLT for this LLVM IR type.
		ValueTys.push_back(getLLTForType(Ty, DL));
		Offsets.push_back(StartingOffset * 8);
		}
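The recursive walk in `computeValueLLTs` above can be mimicked on a toy type tree, which makes the byte-to-bit conversion of the offsets easy to check by hand. This is a sketch under stated assumptions: `ToyType` and `flatten` are hypothetical, member offsets are supplied explicitly instead of coming from a struct layout, and arrays are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy type: either a scalar leaf of SizeInBytes, or a struct
// whose members sit at explicit byte offsets.
struct ToyType {
  uint64_t SizeInBytes = 0;             // meaningful for leaves only
  std::vector<const ToyType *> Members; // empty for a leaf
  std::vector<uint64_t> MemberOffsets;  // byte offset of each member
};

// Appends one entry per leaf: its size in bits and its offset in bits from
// the start of the outermost type.
void flatten(const ToyType &Ty, std::vector<uint64_t> &LeafBits,
             std::vector<uint64_t> &OffsetsInBits,
             uint64_t StartingOffset = 0) {
  assert(Ty.Members.size() == Ty.MemberOffsets.size());
  if (!Ty.Members.empty()) {
    // Struct: recurse into each member, accumulating the byte offset.
    for (std::size_t I = 0, E = Ty.Members.size(); I != E; ++I)
      flatten(*Ty.Members[I], LeafBits, OffsetsInBits,
              StartingOffset + Ty.MemberOffsets[I]);
    return;
  }
  // Leaf: record size and offset, both converted from bytes to bits.
  LeafBits.push_back(Ty.SizeInBytes * 8);
  OffsetsInBits.push_back(StartingOffset * 8);
}
```

For a struct holding a 4-byte leaf at offset 0 and, at offset 4, a nested struct with a 1-byte leaf at 0 and a 4-byte leaf at 4, the walk yields leaf sizes 32, 8, 32 at bit offsets 0, 32, 64.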

		ArrayRef<unsigned> IRTranslator::getOrCreateVRegs(const Value &Val) {
		auto VRegsIt = VMap.findVRegs(Val);
		if (VRegsIt != VMap.vregs_end())
		return *VRegsIt->second;

		if (Val.getType()->isVoidTy())
		return *VMap.getVRegs(Val);

		// Create entry for this type.
		auto *VRegs = VMap.getVRegs(Val);
		auto *Offsets = VMap.getOffsets(Val);

// Fill ValRegsSequence with the sequence of registers
// we need to concat together to produce the value.
assert(Val.getType()->isSized() &&		assert(Val.getType()->isSized() &&
"Don't know how to create an empty vreg");		"Don't know how to create an empty vreg");
		rtereshin [Not Done]: Just `Regs.resize(SplitTys.size())` may be a little cleaner.
unsigned VReg =
MRI->createGenericVirtualRegister(getLLTForType(Val.getType(), DL));
ValReg = VReg;

if (auto CV = dyn_cast<Constant>(&Val)) {		SmallVector<LLT, 4> SplitTys;
bool Success = translate(*CV, VReg);		computeValueLLTs(DL, *Val.getType(), SplitTys, *Offsets);
		rtereshin [Not Done]: Technically, for aggregate constants `getOrCreateVRegs` has a worst-case complexity of `O(N^2)`, for constants like `{ i8 0, { i8 0, { i8 0, ... } } }`, but this is probably not easy to avoid, and such constants hopefully don't occur often.

		if (!isa<Constant>(Val)) {
		for (auto Ty : SplitTys)
		VRegs->push_back(MRI->createGenericVirtualRegister(Ty));
		return *VRegs;
		}

		if (Val.getType()->isAggregateType()) {
		// UndefValue, ConstantAggregateZero
		auto &C = cast<Constant>(Val);
		unsigned Idx = 0;
		while (auto Elt = C.getAggregateElement(Idx++)) {
		auto EltRegs = getOrCreateVRegs(*Elt);
		std::copy(EltRegs.begin(), EltRegs.end(), std::back_inserter(*VRegs));
		}
		} else {
		rtereshin [Not Done]: This piece in general is not exactly trivial. Perhaps it makes sense to add test cases with multiple deeply nested aggregate constants that share some parts, so that `getOrCreateVRegs` visits an actual non-tree DAG while traversing them.
		assert(SplitTys.size() == 1 && "unexpectedly split LLT");
		VRegs->push_back(MRI->createGenericVirtualRegister(SplitTys[0]));
		bool Success = translate(cast<Constant>(Val), (*VRegs)[0]);
		rtereshin [Not Done]: `VRegs->front()` maybe
if (!Success) {		if (!Success) {
OptimizationRemarkMissed R("gisel-irtranslator", "GISelFailure",		OptimizationRemarkMissed R("gisel-irtranslator", "GISelFailure",
MF->getFunction().getSubprogram(),		MF->getFunction().getSubprogram(),
&MF->getFunction().getEntryBlock());		&MF->getFunction().getEntryBlock());
R << "unable to translate constant: " << ore::NV("Type", Val.getType());		R << "unable to translate constant: " << ore::NV("Type", Val.getType());
reportTranslationError(MF, TPC, *ORE, R);		reportTranslationError(MF, TPC, *ORE, R);
return VReg;		return *VRegs;
}		}
}		}

return VReg;		return *VRegs;
}		}

int IRTranslator::getOrCreateFrameIndex(const AllocaInst &AI) {		int IRTranslator::getOrCreateFrameIndex(const AllocaInst &AI) {
if (FrameIndices.find(&AI) != FrameIndices.end())		if (FrameIndices.find(&AI) != FrameIndices.end())
return FrameIndices[&AI];		return FrameIndices[&AI];

unsigned ElementSize = DL->getTypeStoreSize(AI.getAllocatedType());		unsigned ElementSize = DL->getTypeStoreSize(AI.getAllocatedType());
unsigned Size =		unsigned Size =
... (94 lines not shown)
bool IRTranslator::translateRet(const User &U, MachineIRBuilder &MIRBuilder) {		bool IRTranslator::translateRet(const User &U, MachineIRBuilder &MIRBuilder) {
const ReturnInst &RI = cast<ReturnInst>(U);		const ReturnInst &RI = cast<ReturnInst>(U);
const Value *Ret = RI.getReturnValue();		const Value *Ret = RI.getReturnValue();
if (Ret && DL->getTypeStoreSize(Ret->getType()) == 0)		if (Ret && DL->getTypeStoreSize(Ret->getType()) == 0)
Ret = nullptr;		Ret = nullptr;
// The target may mess up with the insertion point, but		// The target may mess up with the insertion point, but
// this is not important as a return is the last instruction		// this is not important as a return is the last instruction
// of the block anyway.		// of the block anyway.
return CLI->lowerReturn(MIRBuilder, Ret, !Ret ? 0 : getOrCreateVReg(*Ret));
		// FIXME: this interface should simplify when CallLowering gets adapted to
		// multiple VRegs per Value.
		unsigned VReg = Ret ? packRegs(*Ret, MIRBuilder) : 0;
		return CLI->lowerReturn(MIRBuilder, Ret, VReg);
}		}

bool IRTranslator::translateBr(const User &U, MachineIRBuilder &MIRBuilder) {		bool IRTranslator::translateBr(const User &U, MachineIRBuilder &MIRBuilder) {
const BranchInst &BrInst = cast<BranchInst>(U);		const BranchInst &BrInst = cast<BranchInst>(U);
unsigned Succ = 0;		unsigned Succ = 0;
if (!BrInst.isUnconditional()) {		if (!BrInst.isUnconditional()) {
// We want a G_BRCOND to the true BB followed by an unconditional branch.		// We want a G_BRCOND to the true BB followed by an unconditional branch.
unsigned Tst = getOrCreateVReg(*BrInst.getCondition());		unsigned Tst = getOrCreateVReg(*BrInst.getCondition());
... (82 lines not shown)	bool IRTranslator::translateLoad(const User &U, MachineIRBuilder &MIRBuilder) {

auto Flags = LI.isVolatile() ? MachineMemOperand::MOVolatile		auto Flags = LI.isVolatile() ? MachineMemOperand::MOVolatile
: MachineMemOperand::MONone;		: MachineMemOperand::MONone;
Flags |= MachineMemOperand::MOLoad;		Flags |= MachineMemOperand::MOLoad;

if (DL->getTypeStoreSize(LI.getType()) == 0)		if (DL->getTypeStoreSize(LI.getType()) == 0)
return true;		return true;

unsigned Res = getOrCreateVReg(LI);		ArrayRef<unsigned> Regs = getOrCreateVRegs(LI);
unsigned Addr = getOrCreateVReg(*LI.getPointerOperand());		ArrayRef<uint64_t> Offsets = *VMap.getOffsets(LI);
		unsigned Base = getOrCreateVReg(*LI.getPointerOperand());

		for (unsigned i = 0; i < Regs.size(); ++i) {
		unsigned Addr = 0;
		MIRBuilder.materializeGEP(Addr, Base, LLT::scalar(64), Offsets[i] / 8);

		MachinePointerInfo Ptr(LI.getPointerOperand(), Offsets[i]);
		rtereshin [Not Done]: `Offsets[i]` here needs to be in bytes as well.
		unsigned BaseAlign = getMemOpAlignment(LI);
		auto MMO = MF->getMachineMemOperand(
		Ptr, Flags, (MRI->getType(Regs[i]).getSizeInBits() + 7) / 8,
		MinAlign(BaseAlign, Offsets[i]), AAMDNodes(), nullptr,
		rtereshin [Not Done]: Ditto, `MinAlign` expects everything in bytes as far as I can tell.
		LI.getSyncScopeID(), LI.getOrdering());
		MIRBuilder.buildLoad(Regs[i], Addr, *MMO);
		}

MIRBuilder.buildLoad(
Res, Addr,
*MF->getMachineMemOperand(MachinePointerInfo(LI.getPointerOperand()),
Flags, DL->getTypeStoreSize(LI.getType()),
getMemOpAlignment(LI), AAMDNodes(), nullptr,
LI.getSyncScopeID(), LI.getOrdering()));
return true;		return true;
}		}
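The review comments on `translateLoad` above point out that the offsets are stored in bits while the pointer info and `MinAlign` expect bytes. A small sketch shows why that matters for the computed alignment. `minAlign` below is a re-implementation written for illustration, assumed to match the intended semantics (the largest power of two dividing both operands); `fieldAlign` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Largest power of two that divides both A and B, i.e. the alignment that
// can still be assumed after adding an offset to an aligned base.
uint64_t minAlign(uint64_t A, uint64_t B) {
  assert((A | B) != 0 && "at least one operand must be non-zero");
  uint64_t Bits = A | B;
  return Bits & (~Bits + 1); // isolate the lowest set bit
}

// Alignment of a field located OffsetInBits past a base that is aligned to
// BaseAlign bytes. The offset is converted to bytes before combining.
uint64_t fieldAlign(uint64_t BaseAlign, uint64_t OffsetInBits) {
  assert(OffsetInBits % 8 == 0 && "sketch assumes byte-aligned fields");
  return minAlign(BaseAlign, OffsetInBits / 8);
}
```

With an 8-byte-aligned base and a field 32 bits in, combining the alignment with the bit offset gives `minAlign(8, 32) == 8`, whereas the byte offset gives `fieldAlign(8, 32) == 4`. Passing bits therefore overstates the alignment of the second field.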

bool IRTranslator::translateStore(const User &U, MachineIRBuilder &MIRBuilder) {		bool IRTranslator::translateStore(const User &U, MachineIRBuilder &MIRBuilder) {
const StoreInst &SI = cast<StoreInst>(U);		const StoreInst &SI = cast<StoreInst>(U);
auto Flags = SI.isVolatile() ? MachineMemOperand::MOVolatile		auto Flags = SI.isVolatile() ? MachineMemOperand::MOVolatile
: MachineMemOperand::MONone;		: MachineMemOperand::MONone;
Flags |= MachineMemOperand::MOStore;		Flags |= MachineMemOperand::MOStore;

if (DL->getTypeStoreSize(SI.getValueOperand()->getType()) == 0)		if (DL->getTypeStoreSize(SI.getValueOperand()->getType()) == 0)
return true;		return true;

unsigned Val = getOrCreateVReg(*SI.getValueOperand());		ArrayRef<unsigned> Vals = getOrCreateVRegs(*SI.getValueOperand());
unsigned Addr = getOrCreateVReg(*SI.getPointerOperand());		ArrayRef<uint64_t> Offsets = *VMap.getOffsets(*SI.getValueOperand());
		unsigned Base = getOrCreateVReg(*SI.getPointerOperand());
MIRBuilder.buildStore(
Val, Addr,		for (unsigned i = 0; i < Vals.size(); ++i) {
*MF->getMachineMemOperand(		unsigned Addr = 0;
MachinePointerInfo(SI.getPointerOperand()), Flags,		MIRBuilder.materializeGEP(Addr, Base, LLT::scalar(64), Offsets[i] / 8);
DL->getTypeStoreSize(SI.getValueOperand()->getType()),
getMemOpAlignment(SI), AAMDNodes(), nullptr, SI.getSyncScopeID(),		MachinePointerInfo Ptr(SI.getPointerOperand(), Offsets[i]);
		rtereshin [Not Done]: Ditto, `MachinePointerInfo` expects everything in bytes.
SI.getOrdering()));		unsigned BaseAlign = getMemOpAlignment(SI);
		auto MMO = MF->getMachineMemOperand(
		Ptr, Flags, (MRI->getType(Vals[i]).getSizeInBits() + 7) / 8,
		MinAlign(BaseAlign, Offsets[i]), AAMDNodes(), nullptr,
		rtereshin [Not Done]: Ditto.
		SI.getSyncScopeID(), SI.getOrdering());
		MIRBuilder.buildStore(Vals[i], Addr, *MMO);
		}
return true;		return true;
}		}

bool IRTranslator::translateExtractValue(const User &U,		bool IRTranslator::translateExtractValue(const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
const Value *Src = U.getOperand(0);		const Value *Src = U.getOperand(0);
Type *Int32Ty = Type::getInt32Ty(U.getContext());		Type *Int32Ty = Type::getInt32Ty(U.getContext());
SmallVector<Value *, 1> Indices;

// If Src is a single element ConstantStruct, translate extractvalue
// to that element to avoid inserting a cast instruction.
if (auto CS = dyn_cast<ConstantStruct>(Src))
if (CS->getNumOperands() == 1) {
unsigned Res = getOrCreateVReg(*CS->getOperand(0));
ValToVReg[&U] = Res;
return true;
}

// getIndexedOffsetInType is designed for GEPs, so the first index is the		// getIndexedOffsetInType is designed for GEPs, so the first index is the
// usual array element rather than looking into the actual aggregate.		// usual array element rather than looking into the actual aggregate.
		SmallVector<Value *, 1> Indices;
Indices.push_back(ConstantInt::get(Int32Ty, 0));		Indices.push_back(ConstantInt::get(Int32Ty, 0));

if (const ExtractValueInst *EVI = dyn_cast<ExtractValueInst>(&U)) {		if (const ExtractValueInst *EVI = dyn_cast<ExtractValueInst>(&U)) {
for (auto Idx : EVI->indices())		for (auto Idx : EVI->indices())
Indices.push_back(ConstantInt::get(Int32Ty, Idx));		Indices.push_back(ConstantInt::get(Int32Ty, Idx));
} else {		} else {
for (unsigned i = 1; i < U.getNumOperands(); ++i)		for (unsigned i = 1; i < U.getNumOperands(); ++i)
Indices.push_back(U.getOperand(i));		Indices.push_back(U.getOperand(i));
}		}

uint64_t Offset = 8 * DL->getIndexedOffsetInType(Src->getType(), Indices);		uint64_t Offset = 8 * DL->getIndexedOffsetInType(Src->getType(), Indices);
		rtereshin [Done]: For better or for worse, `getIndexedOffsetInType` returns `int64_t`, not `uint64_t`.
		ArrayRef<unsigned> SrcRegs = getOrCreateVRegs(*Src);
		ArrayRef<uint64_t> Offsets = *VMap.getOffsets(*Src);
		unsigned Idx = std::lower_bound(Offsets.begin(), Offsets.end(), Offset) -
		Offsets.begin();

unsigned Res = getOrCreateVReg(U);		for (auto DstReg : getOrCreateVRegs(U))
MIRBuilder.buildExtract(Res, getOrCreateVReg(*Src), Offset);		MIRBuilder.buildCopy(DstReg, SrcRegs[Idx++]);
		rtereshin [Done]: Same note as for `insertvalue`: do we really need to build these copies here?

return true;		return true;
}		}
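The new `translateExtractValue` above finds the first source register for the extracted element by binary-searching the sorted offset list. A standalone sketch of that lookup, using `std::vector` in place of `ArrayRef` (`firstLeafAt` is a hypothetical name):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first leaf whose bit offset is >= Offset.
// Offsets must be sorted ascending, which a front-to-back flattening of an
// aggregate produces naturally.
std::size_t firstLeafAt(const std::vector<uint64_t> &Offsets,
                        uint64_t Offset) {
  assert(std::is_sorted(Offsets.begin(), Offsets.end()));
  return static_cast<std::size_t>(
      std::lower_bound(Offsets.begin(), Offsets.end(), Offset) -
      Offsets.begin());
}
```

For offsets `{0, 32, 64, 96}`, an extract at bit offset 64 starts at index 2; an offset past the last leaf returns the list size, so a caller would need to treat that as "not found".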

bool IRTranslator::translateInsertValue(const User &U,		bool IRTranslator::translateInsertValue(const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
const Value *Src = U.getOperand(0);		const Value *Src = U.getOperand(0);
Type *Int32Ty = Type::getInt32Ty(U.getContext());		Type *Int32Ty = Type::getInt32Ty(U.getContext());
SmallVector<Value *, 1> Indices;		SmallVector<Value *, 1> Indices;

// getIndexedOffsetInType is designed for GEPs, so the first index is the		// getIndexedOffsetInType is designed for GEPs, so the first index is the
// usual array element rather than looking into the actual aggregate.		// usual array element rather than looking into the actual aggregate.
Indices.push_back(ConstantInt::get(Int32Ty, 0));		Indices.push_back(ConstantInt::get(Int32Ty, 0));

if (const InsertValueInst *IVI = dyn_cast<InsertValueInst>(&U)) {		if (const InsertValueInst *IVI = dyn_cast<InsertValueInst>(&U)) {
for (auto Idx : IVI->indices())		for (auto Idx : IVI->indices())
Indices.push_back(ConstantInt::get(Int32Ty, Idx));		Indices.push_back(ConstantInt::get(Int32Ty, Idx));
} else {		} else {
for (unsigned i = 2; i < U.getNumOperands(); ++i)		for (unsigned i = 2; i < U.getNumOperands(); ++i)
Indices.push_back(U.getOperand(i));		Indices.push_back(U.getOperand(i));
		rtereshin [Done]: This logic of getting indices from values is repeated here and in `extractvalue`. Do you think it makes sense to factor it out into a separate function?
		aemerson (author) [Not Done]: Could do.
}		}

uint64_t Offset = 8 * DL->getIndexedOffsetInType(Src->getType(), Indices);		uint64_t Offset = 8 * DL->getIndexedOffsetInType(Src->getType(), Indices);
		ArrayRef<unsigned> DstRegs = getOrCreateVRegs(U);
unsigned Res = getOrCreateVReg(U);		ArrayRef<uint64_t> DstOffsets = *VMap.getOffsets(U);
unsigned Inserted = getOrCreateVReg(*U.getOperand(1));		ArrayRef<unsigned> SrcRegs = getOrCreateVRegs(*Src);
MIRBuilder.buildInsert(Res, getOrCreateVReg(*Src), Inserted, Offset);		ArrayRef<unsigned> InsertedRegs = getOrCreateVRegs(*U.getOperand(1));
		auto InsertedIt = InsertedRegs.begin();

		for (unsigned i = 0; i < DstRegs.size(); ++i) {
		if (DstOffsets[i] >= Offset && InsertedIt != InsertedRegs.end())
		MIRBuilder.buildCopy(DstRegs[i], *InsertedIt++);
		else
		MIRBuilder.buildCopy(DstRegs[i], SrcRegs[i]);
		rtereshin [Done]: Let's say we have a large aggregate that eventually maps to N vregs, and also N `insertvalue`s, each of which replaces just a single item (vreg). Will we end up generating N^2 `COPY`s for IR of initial size `O(N)` here? Granted, this probably happens very rarely in practice, but what if we don't call `getOrCreateVRegs(U)`, but rather just `get(U)` (no implicit `MIR.createGenericRegister` calls), resize, and assign either `InsertedIt++` or `SrcRegs[i]` to it directly w/o building any explicit `COPY`s at all? We're just remapping values back and forth here during translation, always with constant indices, so the whole thing could be copy-propagated on the fly, avoiding quite a bit of MIR churn.
		aemerson (author) [Not Done]: I don't know what you mean by `get(U)`. What would it return?
		rtereshin [Done]: It was supposed to be more like `allocateVRegs(U)`. What I mean is that we only allocate the required number of vregs for `U` w/o initializing them with anything, and then just assign to `DstRegs[i]` either `*InsertedIt++` or `SrcRegs[i]` depending on the offset, like this:
		    MutableArrayRef<unsigned> DstRegs = allocateVRegs(U); // `VMap.insertVRegs(U)` + initializing offsets
		    ArrayRef<uint64_t> DstOffsets = *VMap.getOffsets(U); // getting just initialized offsets
		    ArrayRef<unsigned> SrcRegs = getOrCreateVRegs(*Src);
		    ArrayRef<unsigned> InsertedRegs = getOrCreateVRegs(*U.getOperand(1));
		    auto InsertedIt = InsertedRegs.begin();
		    for (unsigned i = 0; i < DstRegs.size(); ++i) {
		      if (DstOffsets[i] >= Offset && InsertedIt != InsertedRegs.end())
		        DstRegs[i] = *InsertedIt++;
		      else
		        DstRegs[i] = SrcRegs[i];
		    }
		This avoids building any copies. Also, if the offsets are cached by value type (as suggested in a comment above), `allocateVRegs(U)` won't visit `U` at all: `Src` is already processed (assuming we IR-translate top to bottom), so the offsets for `U->getType()` are already cached, and `U` and `Src` have the same type.
		}

return true;		return true;
}		}
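The copy-free remapping proposed in the review thread on `translateInsertValue` can be tried out in isolation. The sketch below models registers as plain `unsigned` values and builds the result list by reuse instead of emitting copies. `remapInsert` is a hypothetical helper, and the selection rule is the one from the loop in the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the register list for an insertvalue result by reusing existing
// registers: leaves at or after Offset take the inserted registers until
// those run out, and every other leaf keeps its source register.
std::vector<unsigned> remapInsert(const std::vector<unsigned> &SrcRegs,
                                  const std::vector<uint64_t> &Offsets,
                                  const std::vector<unsigned> &InsertedRegs,
                                  uint64_t Offset) {
  assert(SrcRegs.size() == Offsets.size());
  std::vector<unsigned> DstRegs(SrcRegs.size());
  auto InsertedIt = InsertedRegs.begin();
  for (std::size_t I = 0; I < DstRegs.size(); ++I) {
    if (Offsets[I] >= Offset && InsertedIt != InsertedRegs.end())
      DstRegs[I] = *InsertedIt++;
    else
      DstRegs[I] = SrcRegs[I];
  }
  return DstRegs;
}
```

Inserting two registers at bit offset 32 into a four-leaf aggregate replaces the second and third entries and leaves the first and last untouched, with no new registers or copies created.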

bool IRTranslator::translateSelect(const User &U,		bool IRTranslator::translateSelect(const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
unsigned Res = getOrCreateVReg(U);
unsigned Tst = getOrCreateVReg(*U.getOperand(0));		unsigned Tst = getOrCreateVReg(*U.getOperand(0));
unsigned Op0 = getOrCreateVReg(*U.getOperand(1));		ArrayRef<unsigned> ResRegs = getOrCreateVRegs(U);
unsigned Op1 = getOrCreateVReg(*U.getOperand(2));		ArrayRef<unsigned> Op0Regs = getOrCreateVRegs(*U.getOperand(1));
MIRBuilder.buildSelect(Res, Tst, Op0, Op1);		ArrayRef<unsigned> Op1Regs = getOrCreateVRegs(*U.getOperand(2));

		for (unsigned i = 0; i < ResRegs.size(); ++i)
		MIRBuilder.buildSelect(ResRegs[i], Tst, Op0Regs[i], Op1Regs[i]);

return true;		return true;
}		}

bool IRTranslator::translateBitCast(const User &U,		bool IRTranslator::translateBitCast(const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
// If we're bitcasting to the source type, we can reuse the source vreg.		// If we're bitcasting to the source type, we can reuse the source vreg.
if (getLLTForType(U.getOperand(0)->getType(), DL) ==		if (getLLTForType(U.getOperand(0)->getType(), DL) ==
getLLTForType(U.getType(), DL)) {		getLLTForType(U.getType(), DL)) {
// Get the source vreg now, to avoid invalidating ValToVReg.
unsigned SrcReg = getOrCreateVReg(*U.getOperand(0));		unsigned SrcReg = getOrCreateVReg(*U.getOperand(0));
unsigned &Reg = ValToVReg[&U];		auto &Regs = *VMap.getVRegs(U);
// If we already assigned a vreg for this bitcast, we can't change that.		// If we already assigned a vreg for this bitcast, we can't change that.
// Emit a copy to satisfy the users we already emitted.		// Emit a copy to satisfy the users we already emitted.
if (Reg)		if (!Regs.empty())
MIRBuilder.buildCopy(Reg, SrcReg);		MIRBuilder.buildCopy(Regs[0], SrcReg);
else		else {
Reg = SrcReg;		Regs.push_back(SrcReg);
		VMap.getOffsets(U)->push_back(0);
		}
return true;		return true;
}		}
return translateCast(TargetOpcode::G_BITCAST, U, MIRBuilder);		return translateCast(TargetOpcode::G_BITCAST, U, MIRBuilder);
}		}

bool IRTranslator::translateCast(unsigned Opcode, const User &U,		bool IRTranslator::translateCast(unsigned Opcode, const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
unsigned Op = getOrCreateVReg(*U.getOperand(0));		unsigned Op = getOrCreateVReg(*U.getOperand(0));
[134 lines not shown]	void IRTranslator::getStackGuard(unsigned DstReg,
*MemRefs =		*MemRefs =
MF->getMachineMemOperand(MPInfo, Flags, DL->getPointerSizeInBits() / 8,		MF->getMachineMemOperand(MPInfo, Flags, DL->getPointerSizeInBits() / 8,
DL->getPointerABIAlignment(0));		DL->getPointerABIAlignment(0));
MIB.setMemRefs(MemRefs, MemRefs + 1);		MIB.setMemRefs(MemRefs, MemRefs + 1);
}		}

bool IRTranslator::translateOverflowIntrinsic(const CallInst &CI, unsigned Op,		bool IRTranslator::translateOverflowIntrinsic(const CallInst &CI, unsigned Op,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
LLT Ty = getLLTForType(CI.getOperand(0)->getType(), DL);		ArrayRef<unsigned> ResRegs = getOrCreateVRegs(CI);
LLT s1 = LLT::scalar(1);
unsigned Width = Ty.getSizeInBits();
unsigned Res = MRI->createGenericVirtualRegister(Ty);
unsigned Overflow = MRI->createGenericVirtualRegister(s1);
auto MIB = MIRBuilder.buildInstr(Op)		auto MIB = MIRBuilder.buildInstr(Op)
.addDef(Res)		.addDef(ResRegs[0])
.addDef(Overflow)		.addDef(ResRegs[1])
.addUse(getOrCreateVReg(*CI.getOperand(0)))		.addUse(getOrCreateVReg(*CI.getOperand(0)))
.addUse(getOrCreateVReg(*CI.getOperand(1)));		.addUse(getOrCreateVReg(*CI.getOperand(1)));

if (Op == TargetOpcode::G_UADDE || Op == TargetOpcode::G_USUBE) {		if (Op == TargetOpcode::G_UADDE || Op == TargetOpcode::G_USUBE) {
unsigned Zero = getOrCreateVReg(		unsigned Zero = getOrCreateVReg(
*Constant::getNullValue(Type::getInt1Ty(CI.getContext())));		*Constant::getNullValue(Type::getInt1Ty(CI.getContext())));
MIB.addUse(Zero);		MIB.addUse(Zero);
}		}

MIRBuilder.buildSequence(getOrCreateVReg(CI), {Res, Overflow}, {0, Width});
return true;		return true;
}		}

bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,		bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
switch (ID) {		switch (ID) {
default:		default:
break;		break;
[190 lines not shown]	bool IRTranslator::translateInlineAsm(const CallInst &CI,

MIRBuilder.buildInstr(TargetOpcode::INLINEASM)		MIRBuilder.buildInstr(TargetOpcode::INLINEASM)
.addExternalSymbol(IA.getAsmString().c_str())		.addExternalSymbol(IA.getAsmString().c_str())
.addImm(ExtraInfo);		.addImm(ExtraInfo);

return true;		return true;
}		}

		unsigned IRTranslator::packRegs(const Value &V,
		MachineIRBuilder &MIRBuilder) {
		rtereshin (Not Done): Indentation is off.
		ArrayRef<unsigned> Regs = getOrCreateVRegs(V);
		ArrayRef<uint64_t> Offsets = *VMap.getOffsets(V);
		LLT BigTy = getLLTForType(V.getType(), DL);

		if (Regs.size() == 1)
		return Regs[0];

		unsigned Dst = MRI->createGenericVirtualRegister(BigTy);
		MIRBuilder.buildUndef(Dst);
		for (unsigned i = 0; i < Regs.size(); ++i) {
		unsigned NewDst = MRI->createGenericVirtualRegister(BigTy);
		MIRBuilder.buildInsert(NewDst, Dst, Regs[i], Offsets[i]);
		Dst = NewDst;
		}
		return Dst;
		}

		void IRTranslator::unpackRegs(const Value &V, unsigned Src,
		MachineIRBuilder &MIRBuilder) {
		ArrayRef<unsigned> Regs = getOrCreateVRegs(V);
		ArrayRef<uint64_t> Offsets = *VMap.getOffsets(V);

		for (unsigned i = 0; i < Regs.size(); ++i)
		MIRBuilder.buildExtract(Regs[i], Src, Offsets[i]);
		}
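The `packRegs`/`unpackRegs` pair above inserts each component register into a wide value at its bit offset, then extracts components back out at the same offsets. As a rough illustration of that round trip, here is a minimal sketch that models the wide value as a single `uint64_t`; `Piece`, `packPieces`, and `extractPiece` are hypothetical names, and the 64-bit limit is an assumption of the sketch, not of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one component: its value, bit offset, and bit width.
struct Piece {
  uint64_t Value;
  unsigned Offset;
  unsigned Width;
};

// Mask with the low Width bits set; Width == 64 is handled separately
// because shifting a 64-bit value by 64 is undefined.
static uint64_t lowMask(unsigned Width) {
  return Width == 64 ? ~0ULL : (1ULL << Width) - 1;
}

// Insert every piece into a wide value at its offset (models a chain of
// insert operations starting from an undefined value, here zero).
uint64_t packPieces(const std::vector<Piece> &Pieces) {
  uint64_t Dst = 0;
  for (const Piece &P : Pieces) {
    assert(P.Width >= 1 && P.Offset + P.Width <= 64 && "piece must fit");
    uint64_t Mask = lowMask(P.Width) << P.Offset;
    Dst = (Dst & ~Mask) | ((P.Value << P.Offset) & Mask);
  }
  return Dst;
}

// Read Width bits starting at Offset (models a single extract operation).
uint64_t extractPiece(uint64_t Src, unsigned Offset, unsigned Width) {
  assert(Width >= 1 && Offset + Width <= 64 && "piece must fit");
  return (Src >> Offset) & lowMask(Width);
}
```

Packing and then extracting with the same offsets returns the original component values, which is the property the call lowering path relies on when it packs arguments and unpacks results.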

bool IRTranslator::translateCall(const User &U, MachineIRBuilder &MIRBuilder) {		bool IRTranslator::translateCall(const User &U, MachineIRBuilder &MIRBuilder) {
const CallInst &CI = cast<CallInst>(U);		const CallInst &CI = cast<CallInst>(U);
auto TII = MF->getTarget().getIntrinsicInfo();		auto TII = MF->getTarget().getIntrinsicInfo();
const Function *F = CI.getCalledFunction();		const Function *F = CI.getCalledFunction();

// FIXME: support Windows dllimport function calls.		// FIXME: support Windows dllimport function calls.
if (F && F->hasDLLImportStorageClass())		if (F && F->hasDLLImportStorageClass())
return false;		return false;

if (CI.isInlineAsm())		if (CI.isInlineAsm())
return translateInlineAsm(CI, MIRBuilder);		return translateInlineAsm(CI, MIRBuilder);

Intrinsic::ID ID = Intrinsic::not_intrinsic;		Intrinsic::ID ID = Intrinsic::not_intrinsic;
if (F && F->isIntrinsic()) {		if (F && F->isIntrinsic()) {
ID = F->getIntrinsicID();		ID = F->getIntrinsicID();
if (TII && ID == Intrinsic::not_intrinsic)		if (TII && ID == Intrinsic::not_intrinsic)
ID = static_cast<Intrinsic::ID>(TII->getIntrinsicID(F));		ID = static_cast<Intrinsic::ID>(TII->getIntrinsicID(F));
}		}

		bool IsSplitType = valueIsSplit(CI);
if (!F || !F->isIntrinsic() || ID == Intrinsic::not_intrinsic) {		if (!F || !F->isIntrinsic() || ID == Intrinsic::not_intrinsic) {
unsigned Res = CI.getType()->isVoidTy() ? 0 : getOrCreateVReg(CI);		unsigned Res;
		if (IsSplitType)
		Res =
		MRI->createGenericVirtualRegister(getLLTForType(CI.getType(), DL));
		else
		Res = getOrCreateVReg(CI);
		rtereshin (Not Done): `unsigned Res = IsSplitType ? MRI->createGenericVirtualRegister(getLLTForType(CI.getType(), DL)) : getOrCreateVReg(CI);` may work a little nicer here.

SmallVector<unsigned, 8> Args;		SmallVector<unsigned, 8> Args;
for (auto &Arg: CI.arg_operands())		for (auto &Arg: CI.arg_operands())
Args.push_back(getOrCreateVReg(*Arg));		Args.push_back(packRegs(*Arg, MIRBuilder));

MF->getFrameInfo().setHasCalls(true);		MF->getFrameInfo().setHasCalls(true);
return CLI->lowerCall(MIRBuilder, &CI, Res, Args, [&]() {		bool Success = CLI->lowerCall(MIRBuilder, &CI, Res, Args, [&]() {
return getOrCreateVReg(*CI.getCalledValue());		return getOrCreateVReg(*CI.getCalledValue());
});		});

		if (IsSplitType)
		unpackRegs(CI, Res, MIRBuilder);
		return Success;
}		}

assert(ID != Intrinsic::not_intrinsic && "unknown intrinsic");		assert(ID != Intrinsic::not_intrinsic && "unknown intrinsic");

if (translateKnownIntrinsic(CI, ID, MIRBuilder))		if (translateKnownIntrinsic(CI, ID, MIRBuilder))
return true;		return true;

unsigned Res = CI.getType()->isVoidTy() ? 0 : getOrCreateVReg(CI);		unsigned Res = 0;
		if (!CI.getType()->isVoidTy()) {
		if (IsSplitType)
		Res =
		MRI->createGenericVirtualRegister(getLLTForType(CI.getType(), DL));
		else
		Res = getOrCreateVReg(CI);
		}
MachineInstrBuilder MIB =		MachineInstrBuilder MIB =
MIRBuilder.buildIntrinsic(ID, Res, !CI.doesNotAccessMemory());		MIRBuilder.buildIntrinsic(ID, Res, !CI.doesNotAccessMemory());

for (auto &Arg : CI.arg_operands()) {		for (auto &Arg : CI.arg_operands()) {
// Some intrinsics take metadata parameters. Reject them.		// Some intrinsics take metadata parameters. Reject them.
if (isa<MetadataAsValue>(Arg))		if (isa<MetadataAsValue>(Arg))
return false;		return false;
MIB.addUse(getOrCreateVReg(*Arg));		MIB.addUse(packRegs(*Arg, MIRBuilder));
}		}

		if (IsSplitType)
		unpackRegs(CI, Res, MIRBuilder);

// Add a MachineMemOperand if it is a target mem intrinsic.		// Add a MachineMemOperand if it is a target mem intrinsic.
const TargetLowering &TLI = *MF->getSubtarget().getTargetLowering();		const TargetLowering &TLI = *MF->getSubtarget().getTargetLowering();
TargetLowering::IntrinsicInfo Info;		TargetLowering::IntrinsicInfo Info;
// TODO: Add a GlobalISel version of getTgtMemIntrinsic.		// TODO: Add a GlobalISel version of getTgtMemIntrinsic.
if (TLI.getTgtMemIntrinsic(Info, CI, *MF, ID)) {		if (TLI.getTgtMemIntrinsic(Info, CI, *MF, ID)) {
uint64_t Size = Info.memVT.getStoreSize();		uint64_t Size = Info.memVT.getStoreSize();
MIB.addMemOperand(MF->getMachineMemOperand(MachinePointerInfo(Info.ptrVal),		MIB.addMemOperand(MF->getMachineMemOperand(MachinePointerInfo(Info.ptrVal),
Info.flags, Size, Info.align));		Info.flags, Size, Info.align));
[27 lines not shown]	bool IRTranslator::translateInvoke(const User &U,
if (!isa<LandingPadInst>(EHPadBB->front()))		if (!isa<LandingPadInst>(EHPadBB->front()))
return false;		return false;

// Emit the actual call, bracketed by EH_LABELs so that the MF knows about		// Emit the actual call, bracketed by EH_LABELs so that the MF knows about
// the region covered by the try.		// the region covered by the try.
MCSymbol *BeginSymbol = Context.createTempSymbol();		MCSymbol *BeginSymbol = Context.createTempSymbol();
MIRBuilder.buildInstr(TargetOpcode::EH_LABEL).addSym(BeginSymbol);		MIRBuilder.buildInstr(TargetOpcode::EH_LABEL).addSym(BeginSymbol);

unsigned Res = I.getType()->isVoidTy() ? 0 : getOrCreateVReg(I);		unsigned Res =
		MRI->createGenericVirtualRegister(getLLTForType(I.getType(), DL));
SmallVector<unsigned, 8> Args;		SmallVector<unsigned, 8> Args;
for (auto &Arg: I.arg_operands())		for (auto &Arg: I.arg_operands())
Args.push_back(getOrCreateVReg(*Arg));		Args.push_back(packRegs(*Arg, MIRBuilder));

if (!CLI->lowerCall(MIRBuilder, &I, Res, Args,		if (!CLI->lowerCall(MIRBuilder, &I, Res, Args,
[&]() { return getOrCreateVReg(*I.getCalledValue()); }))		[&]() { return getOrCreateVReg(*I.getCalledValue()); }))
return false;		return false;

		unpackRegs(I, Res, MIRBuilder);

MCSymbol *EndSymbol = Context.createTempSymbol();		MCSymbol *EndSymbol = Context.createTempSymbol();
MIRBuilder.buildInstr(TargetOpcode::EH_LABEL).addSym(EndSymbol);		MIRBuilder.buildInstr(TargetOpcode::EH_LABEL).addSym(EndSymbol);

// FIXME: track probabilities.		// FIXME: track probabilities.
MachineBasicBlock &EHPadMBB = getMBB(*EHPadBB),		MachineBasicBlock &EHPadMBB = getMBB(*EHPadBB),
&ReturnMBB = getMBB(*ReturnBB);		&ReturnMBB = getMBB(*ReturnBB);
MF->addInvoke(&EHPadMBB, BeginSymbol, EndSymbol);		MF->addInvoke(&EHPadMBB, BeginSymbol, EndSymbol);
MIRBuilder.getMBB().addSuccessor(&ReturnMBB);		MIRBuilder.getMBB().addSuccessor(&ReturnMBB);
[42 lines not shown]	bool IRTranslator::translateLandingPad(const User &U,
assert(Tys.size() == 2 && "Only two-valued landingpads are supported");		assert(Tys.size() == 2 && "Only two-valued landingpads are supported");

// Mark exception register as live in.		// Mark exception register as live in.
unsigned ExceptionReg = TLI.getExceptionPointerRegister(PersonalityFn);		unsigned ExceptionReg = TLI.getExceptionPointerRegister(PersonalityFn);
if (!ExceptionReg)		if (!ExceptionReg)
return false;		return false;

MBB.addLiveIn(ExceptionReg);		MBB.addLiveIn(ExceptionReg);
unsigned VReg = MRI->createGenericVirtualRegister(Tys[0]),		ArrayRef<unsigned> ResRegs = getOrCreateVRegs(LP);
Tmp = MRI->createGenericVirtualRegister(Ty);		MIRBuilder.buildCopy(ResRegs[0], ExceptionReg);
MIRBuilder.buildCopy(VReg, ExceptionReg);
MIRBuilder.buildInsert(Tmp, Undef, VReg, 0);

unsigned SelectorReg = TLI.getExceptionSelectorRegister(PersonalityFn);		unsigned SelectorReg = TLI.getExceptionSelectorRegister(PersonalityFn);
if (!SelectorReg)		if (!SelectorReg)
return false;		return false;

MBB.addLiveIn(SelectorReg);		MBB.addLiveIn(SelectorReg);

// N.b. the exception selector register always has pointer type and may not
// match the actual IR-level type in the landingpad so an extra cast is
// needed.
unsigned PtrVReg = MRI->createGenericVirtualRegister(Tys[0]);		unsigned PtrVReg = MRI->createGenericVirtualRegister(Tys[0]);
MIRBuilder.buildCopy(PtrVReg, SelectorReg);		MIRBuilder.buildCopy(PtrVReg, SelectorReg);
		MIRBuilder.buildCast(ResRegs[1], PtrVReg);

VReg = MRI->createGenericVirtualRegister(Tys[1]);
MIRBuilder.buildInstr(TargetOpcode::G_PTRTOINT).addDef(VReg).addUse(PtrVReg);
MIRBuilder.buildInsert(getOrCreateVReg(LP), Tmp, VReg,
Tys[0].getSizeInBits());
return true;		return true;
}		}

bool IRTranslator::translateAlloca(const User &U,		bool IRTranslator::translateAlloca(const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
auto &AI = cast<AllocaInst>(U);		auto &AI = cast<AllocaInst>(U);

if (AI.isStaticAlloca()) {		if (AI.isStaticAlloca()) {
[73 lines not shown]
}		}

bool IRTranslator::translateInsertElement(const User &U,		bool IRTranslator::translateInsertElement(const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
// If it is a <1 x Ty> vector, use the scalar as it is		// If it is a <1 x Ty> vector, use the scalar as it is
// not a legal vector type in LLT.		// not a legal vector type in LLT.
if (U.getType()->getVectorNumElements() == 1) {		if (U.getType()->getVectorNumElements() == 1) {
unsigned Elt = getOrCreateVReg(*U.getOperand(1));		unsigned Elt = getOrCreateVReg(*U.getOperand(1));
ValToVReg[&U] = Elt;		auto &Regs = *VMap.getVRegs(U);
		if (Regs.empty()) {
		Regs.push_back(Elt);
		VMap.getOffsets(U)->push_back(0);
		} else MIRBuilder.buildCopy(Regs[0], Elt);
return true;		return true;
}		}

unsigned Res = getOrCreateVReg(U);		unsigned Res = getOrCreateVReg(U);
unsigned Val = getOrCreateVReg(*U.getOperand(0));		unsigned Val = getOrCreateVReg(*U.getOperand(0));
unsigned Elt = getOrCreateVReg(*U.getOperand(1));		unsigned Elt = getOrCreateVReg(*U.getOperand(1));
unsigned Idx = getOrCreateVReg(*U.getOperand(2));		unsigned Idx = getOrCreateVReg(*U.getOperand(2));
MIRBuilder.buildInsertVectorElement(Res, Val, Elt, Idx);		MIRBuilder.buildInsertVectorElement(Res, Val, Elt, Idx);
return true;		return true;
}		}

bool IRTranslator::translateExtractElement(const User &U,		bool IRTranslator::translateExtractElement(const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
// If it is a <1 x Ty> vector, use the scalar as it is		// If it is a <1 x Ty> vector, use the scalar as it is
// not a legal vector type in LLT.		// not a legal vector type in LLT.
if (U.getOperand(0)->getType()->getVectorNumElements() == 1) {		if (U.getOperand(0)->getType()->getVectorNumElements() == 1) {
unsigned Elt = getOrCreateVReg(*U.getOperand(0));		unsigned Elt = getOrCreateVReg(*U.getOperand(0));
ValToVReg[&U] = Elt;		auto &Regs = *VMap.getVRegs(U);
		if (Regs.empty()) {
		Regs.push_back(Elt);
		VMap.getOffsets(U)->push_back(0);
		} else {
		MIRBuilder.buildCopy(Regs[0], Elt);
		}
return true;		return true;
}		}
unsigned Res = getOrCreateVReg(U);		unsigned Res = getOrCreateVRegs(U)[0];
unsigned Val = getOrCreateVReg(*U.getOperand(0));		unsigned Val = getOrCreateVRegs(*U.getOperand(0))[0];
unsigned Idx = getOrCreateVReg(*U.getOperand(1));		unsigned Idx = getOrCreateVRegs(*U.getOperand(1))[0];
		rtereshin (Not Done): It looks like `getOrCreateVReg` (singular) is already implemented, so it could be used here.
MIRBuilder.buildExtractVectorElement(Res, Val, Idx);		MIRBuilder.buildExtractVectorElement(Res, Val, Idx);
return true;		return true;
}		}

bool IRTranslator::translateShuffleVector(const User &U,		bool IRTranslator::translateShuffleVector(const User &U,
MachineIRBuilder &MIRBuilder) {		MachineIRBuilder &MIRBuilder) {
MIRBuilder.buildInstr(TargetOpcode::G_SHUFFLE_VECTOR)		MIRBuilder.buildInstr(TargetOpcode::G_SHUFFLE_VECTOR)
.addDef(getOrCreateVReg(U))		.addDef(getOrCreateVReg(U))
.addUse(getOrCreateVReg(*U.getOperand(0)))		.addUse(getOrCreateVReg(*U.getOperand(0)))
.addUse(getOrCreateVReg(*U.getOperand(1)))		.addUse(getOrCreateVReg(*U.getOperand(1)))
.addUse(getOrCreateVReg(*U.getOperand(2)));		.addUse(getOrCreateVReg(*U.getOperand(2)));
return true;		return true;
}		}

bool IRTranslator::translatePHI(const User &U, MachineIRBuilder &MIRBuilder) {		bool IRTranslator::translatePHI(const User &U, MachineIRBuilder &MIRBuilder) {
const PHINode &PI = cast<PHINode>(U);		const PHINode &PI = cast<PHINode>(U);

		SmallVector<MachineInstr *, 4> Insts;
		for (auto Reg : getOrCreateVRegs(PI)) {
auto MIB = MIRBuilder.buildInstr(TargetOpcode::G_PHI);		auto MIB = MIRBuilder.buildInstr(TargetOpcode::G_PHI);
MIB.addDef(getOrCreateVReg(PI));		MIB.addDef(Reg);
		rtereshin (Not Done): This could be inlined as `MIRBuilder.buildInstr(TargetOpcode::G_PHI, Reg);`
		Insts.push_back(MIB.getInstr());
		}

PendingPHIs.emplace_back(&PI, MIB.getInstr());		PendingPHIs.emplace_back(&PI, std::move(Insts));
return true;		return true;
}		}

void IRTranslator::finishPendingPhis() {		void IRTranslator::finishPendingPhis() {
for (std::pair<const PHINode *, MachineInstr *> &Phi : PendingPHIs) {		for (auto &Phi : PendingPHIs) {
const PHINode *PI = Phi.first;		const PHINode *PI = Phi.first;
MachineInstrBuilder MIB(*MF, Phi.second);		ArrayRef<MachineInstr *> ComponentPHIs = Phi.second;

// All MachineBasicBlocks exist, add them to the PHI. We assume IRTranslator		// All MachineBasicBlocks exist, add them to the PHI. We assume IRTranslator
// won't create extra control flow here, otherwise we need to find the		// won't create extra control flow here, otherwise we need to find the
// dominating predecessor here (or perhaps force the weirder IRTranslators		// dominating predecessor here (or perhaps force the weirder IRTranslators
// to provide a simple boundary).		// to provide a simple boundary).
SmallSet<const BasicBlock *, 4> HandledPreds;		SmallSet<const BasicBlock *, 4> HandledPreds;

for (unsigned i = 0; i < PI->getNumIncomingValues(); ++i) {		for (unsigned i = 0; i < PI->getNumIncomingValues(); ++i) {
auto IRPred = PI->getIncomingBlock(i);		auto IRPred = PI->getIncomingBlock(i);
if (HandledPreds.count(IRPred))		if (HandledPreds.count(IRPred))
continue;		continue;

HandledPreds.insert(IRPred);		HandledPreds.insert(IRPred);
unsigned ValReg = getOrCreateVReg(*PI->getIncomingValue(i));		ArrayRef<unsigned> ValRegs = getOrCreateVRegs(*PI->getIncomingValue(i));
for (auto Pred : getMachinePredBBs({IRPred, PI->getParent()})) {		for (auto Pred : getMachinePredBBs({IRPred, PI->getParent()})) {
assert(Pred->isSuccessor(MIB->getParent()) &&		assert(Pred->isSuccessor(ComponentPHIs[0]->getParent()) &&
"incorrect CFG at MachineBasicBlock level");		"incorrect CFG at MachineBasicBlock level");
MIB.addUse(ValReg);		for (unsigned i = 0; i < ValRegs.size(); ++i) {
		MachineInstrBuilder MIB(*MF, ComponentPHIs[i]);
		MIB.addUse(ValRegs[i]);
MIB.addMBB(Pred);		MIB.addMBB(Pred);
		rtereshin (Not Done): There are two `i`s in scope here. Do we have a test covering this?
		rtereshin (Not Done): I think I've got a couple of tests for `PHI`s. They both seem to translate correctly: test_phi_diamond.ll, test_phi_loop.ll. The `test_phi_loop.ll` one translates rather beautifully IMO:
		    body: |
		      bb.1.entry:
		        liveins: $w0
		        %0:_(s32) = COPY $w0
		        %5:_(s32) = G_CONSTANT i32 1
		        %7:_(s32) = G_CONSTANT i32 0
		        %9:_(s64) = G_CONSTANT i64 0
		        %10:_(s64) = G_CONSTANT i64 1
		      bb.2.loop:
		        %1:_(s32) = G_PHI %0(s32), %bb.1, %6(s32), %bb.2
		        %2:_(s64) = G_PHI %9(s64), %bb.1, %3(s64), %bb.2
		        %3:_(s64) = G_PHI %10(s64), %bb.1, %4(s64), %bb.2
		        %4:_(s64) = G_ADD %2, %3
		        %6:_(s32) = G_SUB %1, %5
		        %8:_(s1) = G_ICMP intpred(sle), %1(s32), %7
		        G_BRCOND %8(s1), %bb.3
		        G_BR %bb.2
		      bb.3.exit:
		        $x0 = COPY %2(s64)
		        RET_ReallyLR implicit $x0
}		}
}		}
}		}
}		}
		}
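The inner loop in `finishPendingPhis` adds one incoming (register, predecessor) pair to each component PHI of a split value. A minimal standalone sketch of that step follows, using a distinct index name so that no loop variable is shadowed; `ComponentPhi` and `addIncomingEdge` are hypothetical names for illustration and are not LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model of one component PHI: the register it defines plus a
// list of (incoming register, predecessor block id) pairs.
struct ComponentPhi {
  unsigned Def;
  std::vector<std::pair<unsigned, int>> Incoming;
};

// Append one incoming edge to every component PHI. ValRegs[Comp] is the
// register holding component Comp of the value arriving from block Pred.
void addIncomingEdge(std::vector<ComponentPhi> &Phis,
                     const std::vector<unsigned> &ValRegs, int Pred) {
  assert(Phis.size() == ValRegs.size() && "one register per component");
  for (std::size_t Comp = 0; Comp < ValRegs.size(); ++Comp)
    Phis[Comp].Incoming.emplace_back(ValRegs[Comp], Pred);
}
```

Calling `addIncomingEdge` once per predecessor block leaves every component PHI with the same predecessor list, each paired with its own component register.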

		bool IRTranslator::valueIsSplit(const Value &V, bool PopulateOffsets) {
		SmallVector<LLT, 4> SplitTys;
		SmallVector<uint64_t, 4> TmpOffsets;
		if (PopulateOffsets) {
		SmallVectorImpl<uint64_t> &RealOffsets = *VMap.getOffsets(V);
		computeValueLLTs(DL, V.getType(), SplitTys, RealOffsets);
		} else {
		computeValueLLTs(DL, V.getType(), SplitTys, TmpOffsets);
		}
		return SplitTys.size() > 1;
		}
		rtereshin (Not Done): This function is rather a hack, but I guess it's not a big deal.

bool IRTranslator::translate(const Instruction &Inst) {		bool IRTranslator::translate(const Instruction &Inst) {
CurBuilder.setDebugLoc(Inst.getDebugLoc());		CurBuilder.setDebugLoc(Inst.getDebugLoc());
switch(Inst.getOpcode()) {		switch(Inst.getOpcode()) {
#define HANDLE_INST(NUM, OPCODE, CLASS) \		#define HANDLE_INST(NUM, OPCODE, CLASS) \
case Instruction::OPCODE: return translate##OPCODE(Inst, CurBuilder);		case Instruction::OPCODE: return translate##OPCODE(Inst, CurBuilder);
#include "llvm/IR/Instruction.def"		#include "llvm/IR/Instruction.def"
default:		default:
[43 lines not shown]	bool IRTranslator::translate(const Constant &C, unsigned Reg) {
} else if (auto CE = dyn_cast<ConstantExpr>(&C)) {		} else if (auto CE = dyn_cast<ConstantExpr>(&C)) {
switch(CE->getOpcode()) {		switch(CE->getOpcode()) {
#define HANDLE_INST(NUM, OPCODE, CLASS) \		#define HANDLE_INST(NUM, OPCODE, CLASS) \
case Instruction::OPCODE: return translate##OPCODE(*CE, EntryBuilder);		case Instruction::OPCODE: return translate##OPCODE(*CE, EntryBuilder);
#include "llvm/IR/Instruction.def"		#include "llvm/IR/Instruction.def"
default:		default:
return false;		return false;
}		}
} else if (auto CS = dyn_cast<ConstantStruct>(&C)) {
// Return the element if it is a single element ConstantStruct.
if (CS->getNumOperands() == 1) {
unsigned EltReg = getOrCreateVReg(*CS->getOperand(0));
EntryBuilder.buildCast(Reg, EltReg);
return true;
}
SmallVector<unsigned, 4> Ops;
SmallVector<uint64_t, 4> Indices;
uint64_t Offset = 0;
for (unsigned i = 0; i < CS->getNumOperands(); ++i) {
unsigned OpReg = getOrCreateVReg(*CS->getOperand(i));
Ops.push_back(OpReg);
Indices.push_back(Offset);
Offset += MRI->getType(OpReg).getSizeInBits();
}
EntryBuilder.buildSequence(Reg, Ops, Indices);
} else if (auto CV = dyn_cast<ConstantVector>(&C)) {		} else if (auto CV = dyn_cast<ConstantVector>(&C)) {
if (CV->getNumOperands() == 1)		if (CV->getNumOperands() == 1)
return translate(*CV->getOperand(0), Reg);		return translate(*CV->getOperand(0), Reg);
SmallVector<unsigned, 4> Ops;		SmallVector<unsigned, 4> Ops;
for (unsigned i = 0; i < CV->getNumOperands(); ++i) {		for (unsigned i = 0; i < CV->getNumOperands(); ++i) {
Ops.push_back(getOrCreateVReg(*CV->getOperand(i)));		Ops.push_back(getOrCreateVReg(*CV->getOperand(i)));
}		}
EntryBuilder.buildMerge(Reg, Ops);		EntryBuilder.buildMerge(Reg, Ops);
} else		} else
return false;		return false;

return true;		return true;
}		}

void IRTranslator::finalizeFunction() {		void IRTranslator::finalizeFunction() {
// Release the memory used by the different maps we		// Release the memory used by the different maps we
// needed during the translation.		// needed during the translation.
PendingPHIs.clear();		PendingPHIs.clear();
ValToVReg.clear();		VMap.reset();
FrameIndices.clear();		FrameIndices.clear();
MachinePreds.clear();		MachinePreds.clear();
// MachineIRBuilder::DebugLoc can outlive the DILocation it holds. Clear it		// MachineIRBuilder::DebugLoc can outlive the DILocation it holds. Clear it
// to avoid accessing free’d memory (in runOnMachineFunction) and to avoid		// to avoid accessing free’d memory (in runOnMachineFunction) and to avoid
// destroying it twice (in ~IRTranslator() and ~LLVMContext())		// destroying it twice (in ~IRTranslator() and ~LLVMContext())
EntryBuilder = MachineIRBuilder();		EntryBuilder = MachineIRBuilder();
CurBuilder = MachineIRBuilder();		CurBuilder = MachineIRBuilder();
}		}
[43 lines not shown]	bool IRTranslator::runOnMachineFunction(MachineFunction &CurMF) {
// Make our arguments/constants entry block fallthrough to the IR entry block.		// Make our arguments/constants entry block fallthrough to the IR entry block.
EntryBB->addSuccessor(&getMBB(F.front()));		EntryBB->addSuccessor(&getMBB(F.front()));

// Lower the actual args into this basic block.		// Lower the actual args into this basic block.
SmallVector<unsigned, 8> VRegArgs;		SmallVector<unsigned, 8> VRegArgs;
for (const Argument &Arg: F.args()) {		for (const Argument &Arg: F.args()) {
if (DL->getTypeStoreSize(Arg.getType()) == 0)		if (DL->getTypeStoreSize(Arg.getType()) == 0)
continue; // Don't handle zero sized types.		continue; // Don't handle zero sized types.
VRegArgs.push_back(getOrCreateVReg(Arg));		if (VMap.contains(cast<Value>(Arg))) {
		auto &VRegs = *VMap.getVRegs(cast<Value>(Arg));
		if (!valueIsSplit(Arg) && !Arg.getType()->isVoidTy())
		VRegArgs.push_back(VRegs[0]);
		} else {
		VRegArgs.push_back(MRI->createGenericVirtualRegister(
		getLLTForType(Arg.getType(), DL)));
}		}
		rtereshin (Not Done): I can't see how this condition could be true. And none of the tests fail if an assertion is inserted under this condition. If it is possible, could you please add a test? Thanks!
		}

if (!CLI->lowerFormalArguments(EntryBuilder, F, VRegArgs)) {		if (!CLI->lowerFormalArguments(EntryBuilder, F, VRegArgs)) {
OptimizationRemarkMissed R("gisel-irtranslator", "GISelFailure",		OptimizationRemarkMissed R("gisel-irtranslator", "GISelFailure",
F.getSubprogram(), &F.getEntryBlock());		F.getSubprogram(), &F.getEntryBlock());
R << "unable to lower arguments: " << ore::NV("Prototype", F.getType());		R << "unable to lower arguments: " << ore::NV("Prototype", F.getType());
reportTranslationError(MF, TPC, *ORE, R);		reportTranslationError(MF, TPC, *ORE, R);
return false;		return false;
}		}

		auto ArgIt = F.arg_begin();
		for (unsigned i = 0; i < VRegArgs.size(); ++i) {
		rtereshin (Not Done): Could be replaced with a range-based for.
		// If the argument is an unsplit scalar then don't use unpackRegs to avoid
		// creating redundant copies.
		if (!valueIsSplit(*ArgIt, /* PopulateOffsets */ true)) {
		auto &VRegs = *VMap.getVRegs(cast<Value>(*ArgIt));
		if (VRegs.empty())
		rtereshin (Not Done): How could this not be true? As above, an assertion in the `else` branch shows `VRegs` is always empty here, as far as the existing tests' coverage goes.
		VRegs.push_back(VRegArgs[i]);
		ArgIt++;
		} else {
		unpackRegs(*ArgIt++, VRegArgs[i], EntryBuilder);
		rtereshin (Not Done): `ArgIt` is incremented in both branches; it may be better to put the increment at the end of the loop.
		}
		}

// And translate the function!		// And translate the function!
for (const BasicBlock &BB: F) {		for (const BasicBlock &BB : F) {
MachineBasicBlock &MBB = getMBB(BB);		MachineBasicBlock &MBB = getMBB(BB);
// Set the insertion point of all the following translations to		// Set the insertion point of all the following translations to
// the end of this basic block.		// the end of this basic block.
CurBuilder.setMBB(MBB);		CurBuilder.setMBB(MBB);

for (const Instruction &Inst: BB) {		for (const Instruction &Inst : BB) {
if (translate(Inst))		if (translate(Inst))
continue;		continue;

OptimizationRemarkMissed R("gisel-irtranslator", "GISelFailure",		OptimizationRemarkMissed R("gisel-irtranslator", "GISelFailure",
Inst.getDebugLoc(), &BB);		Inst.getDebugLoc(), &BB);
R << "unable to translate instruction: " << ore::NV("Opcode", &Inst);		R << "unable to translate instruction: " << ore::NV("Opcode", &Inst);

if (ORE->allowExtraAnalysis("gisel-irtranslator")) {		if (ORE->allowExtraAnalysis("gisel-irtranslator")) {
[43 lines not shown]

lib/Target/AArch64/AArch64CallLowering.cpp

	[181 lines not shown]

	void AArch64CallLowering::splitToValueTypes(			void AArch64CallLowering::splitToValueTypes(
	const ArgInfo &OrigArg, SmallVectorImpl<ArgInfo> &SplitArgs,			const ArgInfo &OrigArg, SmallVectorImpl<ArgInfo> &SplitArgs,
	const DataLayout &DL, MachineRegisterInfo &MRI, CallingConv::ID CallConv,			const DataLayout &DL, MachineRegisterInfo &MRI, CallingConv::ID CallConv,
	const SplitArgTy &PerformArgSplit) const {			const SplitArgTy &PerformArgSplit) const {
	const AArch64TargetLowering &TLI = *getTLI<AArch64TargetLowering>();			const AArch64TargetLowering &TLI = *getTLI<AArch64TargetLowering>();
	LLVMContext &Ctx = OrigArg.Ty->getContext();			LLVMContext &Ctx = OrigArg.Ty->getContext();

				if (OrigArg.Ty->isVoidTy())
				return;

	SmallVector<EVT, 4> SplitVTs;			SmallVector<EVT, 4> SplitVTs;
	SmallVector<uint64_t, 4> Offsets;			SmallVector<uint64_t, 4> Offsets;
	ComputeValueVTs(TLI, DL, OrigArg.Ty, SplitVTs, &Offsets, 0);			ComputeValueVTs(TLI, DL, OrigArg.Ty, SplitVTs, &Offsets, 0);

	if (SplitVTs.size() == 1) {			if (SplitVTs.size() == 1) {
	// No splitting to do, but we want to replace the original type (e.g. [1 x			// No splitting to do, but we want to replace the original type (e.g. [1 x
	// double] -> double).			// double] -> double).
	SplitArgs.emplace_back(OrigArg.Reg, SplitVTs[0].getTypeForEVT(Ctx),			SplitArgs.emplace_back(OrigArg.Reg, SplitVTs[0].getTypeForEVT(Ctx),
	[206 lines not shown]

lib/Target/ARM/ARMCallLowering.cpp

[463 lines not shown]	if (!SplitRegs.empty())
MIRBuilder.buildMerge(VRegs[Idx], SplitRegs);		MIRBuilder.buildMerge(VRegs[Idx], SplitRegs);

Idx++;		Idx++;
}		}

if (!MBB.empty())		if (!MBB.empty())
MIRBuilder.setInstr(*MBB.begin());		MIRBuilder.setInstr(*MBB.begin());

return handleAssignments(MIRBuilder, ArgInfos, ArgHandler);		if (!handleAssignments(MIRBuilder, ArgInfos, ArgHandler))
		return false;

		// Move back to the end of the basic block.
		MIRBuilder.setMBB(MBB);
		return true;
}		}

namespace {		namespace {

struct CallReturnHandler : public IncomingValueHandler {		struct CallReturnHandler : public IncomingValueHandler {
CallReturnHandler(MachineIRBuilder &MIRBuilder, MachineRegisterInfo &MRI,		CallReturnHandler(MachineIRBuilder &MIRBuilder, MachineRegisterInfo &MRI,
MachineInstrBuilder MIB, CCAssignFn *AssignFn)		MachineInstrBuilder MIB, CCAssignFn *AssignFn)
: IncomingValueHandler(MIRBuilder, MRI, AssignFn), MIB(MIB) {}		: IncomingValueHandler(MIRBuilder, MRI, AssignFn), MIB(MIB) {}
[108 lines not shown]

test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll

[26 lines not shown]
; FALLBACK-WITH-REPORT-OUT: ldr q0,		; FALLBACK-WITH-REPORT-OUT: ldr q0,
; FALLBACK-WITH-REPORT-OUT-NEXT: bl __fixunstfti		; FALLBACK-WITH-REPORT-OUT-NEXT: bl __fixunstfti
define i128 @ABIi128(i128 %arg1) {		define i128 @ABIi128(i128 %arg1) {
%farg1 = bitcast i128 %arg1 to fp128		%farg1 = bitcast i128 %arg1 to fp128
%res = fptoui fp128 %farg1 to i128		%res = fptoui fp128 %farg1 to i128
ret i128 %res		ret i128 %res
}		}

; It happens that we don't handle ConstantArray instances yet during
; translation. Any other constant would be fine too.

; FALLBACK-WITH-REPORT-ERR: remark: <unknown>:0:0: unable to translate constant: [1 x double] (in function: constant)
; FALLBACK-WITH-REPORT-ERR: warning: Instruction selection used fallback path for constant
; FALLBACK-WITH-REPORT-OUT-LABEL: constant:
; FALLBACK-WITH-REPORT-OUT: fmov d0, #1.0
define [1 x double] @constant() {
ret [1 x double] [double 1.0]
}

; The key problem here is that we may fail to create an MBB referenced by a		; The key problem here is that we may fail to create an MBB referenced by a
; PHI. If so, we cannot complete the G_PHI and mustn't try or bad things		; PHI. If so, we cannot complete the G_PHI and mustn't try or bad things
; happen.		; happen.
; FALLBACK-WITH-REPORT-ERR: remark: <unknown>:0:0: cannot select: G_STORE %6:gpr(s32), %2:gpr(p0) :: (store seq_cst 4 into %ir.addr) (in function: pending_phis)		; FALLBACK-WITH-REPORT-ERR: remark: <unknown>:0:0: cannot select: G_STORE %6:gpr(s32), %2:gpr(p0) :: (store seq_cst 4 into %ir.addr) (in function: pending_phis)
; FALLBACK-WITH-REPORT-ERR: warning: Instruction selection used fallback path for pending_phis		; FALLBACK-WITH-REPORT-ERR: warning: Instruction selection used fallback path for pending_phis
; FALLBACK-WITH-REPORT-OUT-LABEL: pending_phis:		; FALLBACK-WITH-REPORT-OUT-LABEL: pending_phis:
define i32 @pending_phis(i1 %tst, i32 %val, i32* %addr) {		define i32 @pending_phis(i1 %tst, i32 %val, i32* %addr) {
br i1 %tst, label %true, label %false		br i1 %tst, label %true, label %false
▲ Show 20 Lines • Show All 126 Lines • ▼ Show 20 Lines	block:
store <2 x i16> %dummy, <2 x i16>* undef		store <2 x i16> %dummy, <2 x i16>* undef
ret void		ret void

end:		end:
%vec = load <2 x i16>, <2 x i16>* undef		%vec = load <2 x i16>, <2 x i16>* undef
br label %block		br label %block
}		}

; FALLBACK-WITH-REPORT-ERR: remark: <unknown>:0:0: unable to legalize instruction: G_STORE %1:_(s96), %3:_(p0) :: (store 12 into `%struct96* undef`, align 4) (in function: nonpow2_insertvalue_narrowing)
; FALLBACK-WITH-REPORT-ERR: warning: Instruction selection used fallback path for nonpow2_insertvalue_narrowing
; FALLBACK-WITH-REPORT-OUT-LABEL: nonpow2_insertvalue_narrowing:
%struct96 = type { float, float, float }
define void @nonpow2_insertvalue_narrowing(float %a) {
%dummy = insertvalue %struct96 undef, float %a, 0
store %struct96 %dummy, %struct96* undef
ret void
}

; FALLBACK-WITH-REPORT-ERR remark: <unknown>:0:0: unable to legalize instruction: G_STORE %3, %4 :: (store 12 into `i96* undef`, align 16) (in function: nonpow2_add_narrowing)		; FALLBACK-WITH-REPORT-ERR remark: <unknown>:0:0: unable to legalize instruction: G_STORE %3, %4 :: (store 12 into `i96* undef`, align 16) (in function: nonpow2_add_narrowing)
; FALLBACK-WITH-REPORT-ERR: warning: Instruction selection used fallback path for nonpow2_add_narrowing		; FALLBACK-WITH-REPORT-ERR: warning: Instruction selection used fallback path for nonpow2_add_narrowing
; FALLBACK-WITH-REPORT-OUT-LABEL: nonpow2_add_narrowing:		; FALLBACK-WITH-REPORT-OUT-LABEL: nonpow2_add_narrowing:
define void @nonpow2_add_narrowing() {		define void @nonpow2_add_narrowing() {
%a = add i128 undef, undef		%a = add i128 undef, undef
%b = trunc i128 %a to i96		%b = trunc i128 %a to i96
%dummy = add i96 %b, %b		%dummy = add i96 %b, %b
store i96 %dummy, i96* undef		store i96 %dummy, i96* undef

test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll

; CHECK: [[NULL:%[0-9]+]]:_(p0) = G_INTTOPTR [[ZERO]]		; CHECK: [[NULL:%[0-9]+]]:_(p0) = G_INTTOPTR [[ZERO]]
; CHECK: $x0 = COPY [[NULL]]		; CHECK: $x0 = COPY [[NULL]]
define i8* @test_constant_null() {		define i8* @test_constant_null() {
ret i8* null		ret i8* null
}		}

; CHECK-LABEL: name: test_struct_memops		; CHECK-LABEL: name: test_struct_memops
; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x0		; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[VAL:%[0-9]+]]:_(s64) = G_LOAD [[ADDR]](p0) :: (load 8 from %ir.addr, align 4)		; CHECK: [[VAL1:%[0-9]+]]:_(s8) = G_LOAD %0(p0) :: (load 1 from %ir.addr, align 4)
; CHECK: G_STORE [[VAL]](s64), [[ADDR]](p0) :: (store 8 into %ir.addr, align 4)		; CHECK: [[CST1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST1]](s64)
		; CHECK: [[VAL2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 4 from %ir.addr + 32)
		; CHECK: G_STORE [[VAL1]](s8), [[ADDR]](p0) :: (store 1 into %ir.addr, align 4)
		; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST2]](s64)
		; CHECK: G_STORE [[VAL2]](s32), [[GEP2]](p0) :: (store 4 into %ir.addr + 32)
define void @test_struct_memops({ i8, i32 }* %addr) {		define void @test_struct_memops({ i8, i32 }* %addr) {
%val = load { i8, i32 }, { i8, i32 }* %addr		%val = load { i8, i32 }, { i8, i32 }* %addr
store { i8, i32 } %val, { i8, i32 }* %addr		store { i8, i32 } %val, { i8, i32 }* %addr
ret void		ret void
}		}

; CHECK-LABEL: name: test_i1_memops		; CHECK-LABEL: name: test_i1_memops
; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x0		; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x0
▲ Show 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	define float @test_frem(float %arg1, float %arg2) {
ret float %res		ret float %res
}		}

; CHECK-LABEL: name: test_sadd_overflow		; CHECK-LABEL: name: test_sadd_overflow
; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0		; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0
; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1		; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2		; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2
; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_SADDO [[LHS]], [[RHS]]		; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_SADDO [[LHS]], [[RHS]]
; CHECK: [[TMP:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF		; CHECK: G_STORE [[VAL]](s32), [[ADDR]](p0) :: (store 4 into %ir.addr)
; CHECK: [[TMP1:%[0-9]+]]:_(s64) = G_INSERT [[TMP]], [[VAL]](s32), 0		; CHECK: [[CST:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[RES:%[0-9]+]]:_(s64) = G_INSERT [[TMP1]], [[OVERFLOW]](s1), 32		; CHECK: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST]](s64)
; CHECK: G_STORE [[RES]](s64), [[ADDR]](p0)		; CHECK: G_STORE [[OVERFLOW]](s1), [[GEP]](p0) :: (store 1 into %ir.addr + 32, align 4)
declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)		declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)
define void @test_sadd_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %addr) {		define void @test_sadd_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %addr) {
%res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %lhs, i32 %rhs)		%res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %lhs, i32 %rhs)
store { i32, i1 } %res, { i32, i1 }* %addr		store { i32, i1 } %res, { i32, i1 }* %addr
ret void		ret void
}		}

; CHECK-LABEL: name: test_uadd_overflow		; CHECK-LABEL: name: test_uadd_overflow
; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0		; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0
; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1		; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2		; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2
; CHECK: [[ZERO:%[0-9]+]]:_(s1) = G_CONSTANT i1 false		; CHECK: [[ZERO:%[0-9]+]]:_(s1) = G_CONSTANT i1 false
; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_UADDE [[LHS]], [[RHS]], [[ZERO]]		; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_UADDE [[LHS]], [[RHS]], [[ZERO]]
; CHECK: [[TMP:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF		; CHECK: G_STORE [[VAL]](s32), [[ADDR]](p0) :: (store 4 into %ir.addr)
; CHECK: [[TMP1:%[0-9]+]]:_(s64) = G_INSERT [[TMP]], [[VAL]](s32), 0		; CHECK: [[CST:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[RES:%[0-9]+]]:_(s64) = G_INSERT [[TMP1]], [[OVERFLOW]](s1), 32		; CHECK: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST]](s64)
; CHECK: G_STORE [[RES]](s64), [[ADDR]](p0)		; CHECK: G_STORE [[OVERFLOW]](s1), [[GEP]](p0) :: (store 1 into %ir.addr + 32, align 4)
declare { i32, i1 } @llvm.uadd.with.overflow.i32(i32, i32)		declare { i32, i1 } @llvm.uadd.with.overflow.i32(i32, i32)
define void @test_uadd_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %addr) {		define void @test_uadd_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %addr) {
%res = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %lhs, i32 %rhs)		%res = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %lhs, i32 %rhs)
store { i32, i1 } %res, { i32, i1 }* %addr		store { i32, i1 } %res, { i32, i1 }* %addr
ret void		ret void
}		}

; CHECK-LABEL: name: test_ssub_overflow		; CHECK-LABEL: name: test_ssub_overflow
; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0		; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0
; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1		; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2		; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2
; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_SSUBO [[LHS]], [[RHS]]		; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_SSUBO [[LHS]], [[RHS]]
; CHECK: [[TMP:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF		; CHECK: G_STORE [[VAL]](s32), [[ADDR]](p0) :: (store 4 into %ir.subr)
; CHECK: [[TMP1:%[0-9]+]]:_(s64) = G_INSERT [[TMP]], [[VAL]](s32), 0		; CHECK: [[CST:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[RES:%[0-9]+]]:_(s64) = G_INSERT [[TMP1]], [[OVERFLOW]](s1), 32		; CHECK: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST]](s64)
; CHECK: G_STORE [[RES]](s64), [[ADDR]](p0)		; CHECK: G_STORE [[OVERFLOW]](s1), [[GEP]](p0) :: (store 1 into %ir.subr + 32, align 4)
declare { i32, i1 } @llvm.ssub.with.overflow.i32(i32, i32)		declare { i32, i1 } @llvm.ssub.with.overflow.i32(i32, i32)
define void @test_ssub_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %subr) {		define void @test_ssub_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %subr) {
%res = call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %lhs, i32 %rhs)		%res = call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %lhs, i32 %rhs)
store { i32, i1 } %res, { i32, i1 }* %subr		store { i32, i1 } %res, { i32, i1 }* %subr
ret void		ret void
}		}

; CHECK-LABEL: name: test_usub_overflow		; CHECK-LABEL: name: test_usub_overflow
; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0		; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0
; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1		; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2		; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2
; CHECK: [[ZERO:%[0-9]+]]:_(s1) = G_CONSTANT i1 false		; CHECK: [[ZERO:%[0-9]+]]:_(s1) = G_CONSTANT i1 false
; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_USUBE [[LHS]], [[RHS]], [[ZERO]]		; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_USUBE [[LHS]], [[RHS]], [[ZERO]]
; CHECK: [[TMP:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF		; CHECK: G_STORE [[VAL]](s32), [[ADDR]](p0) :: (store 4 into %ir.subr)
; CHECK: [[TMP1:%[0-9]+]]:_(s64) = G_INSERT [[TMP]], [[VAL]](s32), 0		; CHECK: [[CST:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[RES:%[0-9]+]]:_(s64) = G_INSERT [[TMP1]], [[OVERFLOW]](s1), 32		; CHECK: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST]](s64)
; CHECK: G_STORE [[RES]](s64), [[ADDR]](p0)		; CHECK: G_STORE [[OVERFLOW]](s1), [[GEP]](p0) :: (store 1 into %ir.subr + 32, align 4)
declare { i32, i1 } @llvm.usub.with.overflow.i32(i32, i32)		declare { i32, i1 } @llvm.usub.with.overflow.i32(i32, i32)
define void @test_usub_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %subr) {		define void @test_usub_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %subr) {
%res = call { i32, i1 } @llvm.usub.with.overflow.i32(i32 %lhs, i32 %rhs)		%res = call { i32, i1 } @llvm.usub.with.overflow.i32(i32 %lhs, i32 %rhs)
store { i32, i1 } %res, { i32, i1 }* %subr		store { i32, i1 } %res, { i32, i1 }* %subr
ret void		ret void
}		}

; CHECK-LABEL: name: test_smul_overflow		; CHECK-LABEL: name: test_smul_overflow
; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0		; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0
; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1		; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2		; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2
; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_SMULO [[LHS]], [[RHS]]		; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_SMULO [[LHS]], [[RHS]]
; CHECK: [[TMP:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF		; CHECK: G_STORE [[VAL]](s32), [[ADDR]](p0) :: (store 4 into %ir.addr)
; CHECK: [[TMP1:%[0-9]+]]:_(s64) = G_INSERT [[TMP]], [[VAL]](s32), 0		; CHECK: [[CST:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[RES:%[0-9]+]]:_(s64) = G_INSERT [[TMP1]], [[OVERFLOW]](s1), 32		; CHECK: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST]](s64)
; CHECK: G_STORE [[RES]](s64), [[ADDR]](p0)		; CHECK: G_STORE [[OVERFLOW]](s1), [[GEP]](p0) :: (store 1 into %ir.addr + 32, align 4)
declare { i32, i1 } @llvm.smul.with.overflow.i32(i32, i32)		declare { i32, i1 } @llvm.smul.with.overflow.i32(i32, i32)
define void @test_smul_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %addr) {		define void @test_smul_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %addr) {
%res = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 %lhs, i32 %rhs)		%res = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 %lhs, i32 %rhs)
store { i32, i1 } %res, { i32, i1 }* %addr		store { i32, i1 } %res, { i32, i1 }* %addr
ret void		ret void
}		}

; CHECK-LABEL: name: test_umul_overflow		; CHECK-LABEL: name: test_umul_overflow
; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0		; CHECK: [[LHS:%[0-9]+]]:_(s32) = COPY $w0
; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1		; CHECK: [[RHS:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2		; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2
; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_UMULO [[LHS]], [[RHS]]		; CHECK: [[VAL:%[0-9]+]]:_(s32), [[OVERFLOW:%[0-9]+]]:_(s1) = G_UMULO [[LHS]], [[RHS]]
; CHECK: [[TMP:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF		; CHECK: G_STORE [[VAL]](s32), [[ADDR]](p0) :: (store 4 into %ir.addr)
; CHECK: [[TMP1:%[0-9]+]]:_(s64) = G_INSERT [[TMP]], [[VAL]](s32), 0		; CHECK: [[CST:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[RES:%[0-9]+]]:_(s64) = G_INSERT [[TMP1]], [[OVERFLOW]](s1), 32		; CHECK: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST]](s64)
; CHECK: G_STORE [[RES]](s64), [[ADDR]](p0)		; CHECK: G_STORE [[OVERFLOW]](s1), [[GEP]](p0) :: (store 1 into %ir.addr + 32, align 4)
declare { i32, i1 } @llvm.umul.with.overflow.i32(i32, i32)		declare { i32, i1 } @llvm.umul.with.overflow.i32(i32, i32)
define void @test_umul_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %addr) {		define void @test_umul_overflow(i32 %lhs, i32 %rhs, { i32, i1 }* %addr) {
%res = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %lhs, i32 %rhs)		%res = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %lhs, i32 %rhs)
store { i32, i1 } %res, { i32, i1 }* %addr		store { i32, i1 } %res, { i32, i1 }* %addr
ret void		ret void
}		}

; CHECK-LABEL: name: test_extractvalue		; CHECK-LABEL: name: test_extractvalue
; CHECK: [[STRUCT:%[0-9]+]]:_(s128) = G_LOAD		; CHECK: %0:_(p0) = COPY $x0
; CHECK: [[RES:%[0-9]+]]:_(s32) = G_EXTRACT [[STRUCT]](s128), 64		; CHECK: [[LD1:%[0-9]+]]:_(s8) = G_LOAD %0(p0) :: (load 1 from %ir.addr, align 4)
; CHECK: $w0 = COPY [[RES]]		; CHECK: [[CST1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP1:%[0-9]+]]:_(p0) = G_GEP %0, [[CST1]](s64)
		; CHECK: [[LD2:%[0-9]+]]:_(s8) = G_LOAD [[GEP1]](p0) :: (load 1 from %ir.addr + 32, align 4)
		; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
		; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP %0, [[CST2]](s64)
		; CHECK: [[LD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 4 from %ir.addr + 64)
		; CHECK: [[CST3:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
		; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_GEP %0, [[CST3]](s64)
		; CHECK: [[LD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 4 from %ir.addr + 96)
		; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY [[LD3]](s32)
		; CHECK: $w0 = COPY [[COPY]](s32)
%struct.nested = type {i8, { i8, i32 }, i32}		%struct.nested = type {i8, { i8, i32 }, i32}
define i32 @test_extractvalue(%struct.nested* %addr) {		define i32 @test_extractvalue(%struct.nested* %addr) {
%struct = load %struct.nested, %struct.nested* %addr		%struct = load %struct.nested, %struct.nested* %addr
%res = extractvalue %struct.nested %struct, 1, 1		%res = extractvalue %struct.nested %struct, 1, 1
ret i32 %res		ret i32 %res
}		}
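
The updated CHECK lines for test_extractvalue load each leaf of %struct.nested separately, at byte offsets 4, 8 and 12 from the base pointer. The sketch below is one way to picture how a nested aggregate flattens into such (size, offset) pairs. It is not the IRTranslator code from this patch: `Ty`, `Leaf`, `scalar`, `structOf` and `flattenType` are hypothetical names, and it assumes every scalar is naturally aligned (alignment equals its size), which happens to reproduce the offsets seen in this test.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified type model (NOT llvm::Type / llvm::DataLayout).
// A type is either a scalar with a nonzero power-of-two byte size, or a
// struct with at least one field. Empty structs are not modelled.
struct Ty {
  uint64_t SizeBytes = 0;  // meaningful for scalars only
  std::vector<Ty> Fields;  // non-empty for structs
  bool isStruct() const { return !Fields.empty(); }
};

Ty scalar(uint64_t Bytes) { Ty T; T.SizeBytes = Bytes; return T; }
Ty structOf(std::vector<Ty> Fields) { Ty T; T.Fields = std::move(Fields); return T; }

// One leaf (non-aggregate) element of a flattened aggregate.
struct Leaf {
  uint64_t SizeBits;     // width of the scalar, e.g. 8 for i8
  uint64_t OffsetBytes;  // byte offset from the start of the outermost aggregate
};

static uint64_t alignTo(uint64_t V, uint64_t A) {
  assert(A != 0 && "alignment must be nonzero");
  return (V + A - 1) / A * A;
}

// Assumption: scalars are naturally aligned; a struct takes the largest
// alignment of any of its fields.
uint64_t alignOf(const Ty &T) {
  if (!T.isStruct()) return T.SizeBytes;
  uint64_t A = 1;
  for (const Ty &F : T.Fields) A = std::max(A, alignOf(F));
  return A;
}

// Size including inter-field and tail padding.
uint64_t sizeOf(const Ty &T) {
  if (!T.isStruct()) return T.SizeBytes;
  uint64_t Off = 0;
  for (const Ty &F : T.Fields) Off = alignTo(Off, alignOf(F)) + sizeOf(F);
  return alignTo(Off, alignOf(T));
}

// Depth-first walk that records every scalar leaf with its byte offset.
void flatten(const Ty &T, uint64_t Base, std::vector<Leaf> &Out) {
  if (!T.isStruct()) {
    Out.push_back({T.SizeBytes * 8, Base});
    return;
  }
  uint64_t Off = 0;
  for (const Ty &F : T.Fields) {
    Off = alignTo(Off, alignOf(F));
    flatten(F, Base + Off, Out);
    Off += sizeOf(F);
  }
}

std::vector<Leaf> flattenType(const Ty &T) {
  std::vector<Leaf> Out;
  flatten(T, 0, Out);
  return Out;
}
```

Under these assumptions, flattening a model of {i8, { i8, i32 }, i32} gives four leaves: 8 bits at offset 0, 8 bits at offset 4, 32 bits at offset 8 and 32 bits at offset 12. These line up with the s8/s8/s32/s32 G_LOADs and the G_CONSTANT i64 4, 8 and 12 operands in the CHECK lines above.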

; CHECK-LABEL: name: test_extractvalue_agg		; CHECK-LABEL: name: test_extractvalue_agg
; CHECK: [[STRUCT:%[0-9]+]]:_(s128) = G_LOAD		; CHECK: %0:_(p0) = COPY $x0
; CHECK: [[RES:%[0-9]+]]:_(s64) = G_EXTRACT [[STRUCT]](s128), 32		; CHECK: %1:_(p0) = COPY $x1
; CHECK: G_STORE [[RES]]		; CHECK: [[LD1:%[0-9]+]]:_(s8) = G_LOAD %0(p0) :: (load 1 from %ir.addr, align 4)
		; CHECK: [[CST1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP1:%[0-9]+]]:_(p0) = G_GEP %0, [[CST1]](s64)
		; CHECK: [[LD2:%[0-9]+]]:_(s8) = G_LOAD [[GEP1]](p0) :: (load 1 from %ir.addr + 32, align 4)
		; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
		; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP %0, [[CST2]](s64)
		; CHECK: [[LD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 4 from %ir.addr + 64)
		; CHECK: [[CST3:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
		; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_GEP %0, [[CST3]](s64)
		; CHECK: [[LD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 4 from %ir.addr + 96)
		; CHECK: [[LD2COPY:%[0-9]+]]:_(s8) = COPY [[LD2]](s8)
		; CHECK: [[LD3COPY:%[0-9]+]]:_(s32) = COPY [[LD3]](s32)
		; CHECK: G_STORE [[LD2COPY]](s8), %1(p0) :: (store 1 into %ir.addr2, align 4)
		; CHECK: [[CST4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP4:%[0-9]+]]:_(p0) = G_GEP %1, [[CST4]](s64)
		; CHECK: G_STORE [[LD3COPY]](s32), [[GEP4]](p0) :: (store 4 into %ir.addr2 + 32)
define void @test_extractvalue_agg(%struct.nested* %addr, {i8, i32}* %addr2) {		define void @test_extractvalue_agg(%struct.nested* %addr, {i8, i32}* %addr2) {
%struct = load %struct.nested, %struct.nested* %addr		%struct = load %struct.nested, %struct.nested* %addr
%res = extractvalue %struct.nested %struct, 1		%res = extractvalue %struct.nested %struct, 1
store {i8, i32} %res, {i8, i32}* %addr2		store {i8, i32} %res, {i8, i32}* %addr2
ret void		ret void
}		}

; CHECK-LABEL: name: test_insertvalue		; CHECK-LABEL: name: test_insertvalue
; CHECK: [[VAL:%[0-9]+]]:_(s32) = COPY $w1		; CHECK: %0:_(p0) = COPY $x0
; CHECK: [[STRUCT:%[0-9]+]]:_(s128) = G_LOAD		; CHECK: %1:_(s32) = COPY $w1
; CHECK: [[NEWSTRUCT:%[0-9]+]]:_(s128) = G_INSERT [[STRUCT]], [[VAL]](s32), 64		; CHECK: [[LD1:%[0-9]+]]:_(s8) = G_LOAD %0(p0) :: (load 1 from %ir.addr, align 4)
; CHECK: G_STORE [[NEWSTRUCT]](s128),		; CHECK: [[CST1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP1:%[0-9]+]]:_(p0) = G_GEP %0, [[CST1]](s64)
		; CHECK: [[LD2:%[0-9]+]]:_(s8) = G_LOAD [[GEP1]](p0) :: (load 1 from %ir.addr + 32, align 4)
		; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
		; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP %0, [[CST2]](s64)
		; CHECK: [[LD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 4 from %ir.addr + 64)
		; CHECK: [[CST3:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
		; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_GEP %0, [[CST3]](s64)
		; CHECK: [[LD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 4 from %ir.addr + 96)
		; CHECK: [[LD1COPY:%[0-9]+]]:_(s8) = COPY [[LD1]](s8)
		; CHECK: [[LD2COPY:%[0-9]+]]:_(s8) = COPY [[LD2]](s8)
		; CHECK: [[ARGCOPY:%[0-9]+]]:_(s32) = COPY %1(s32)
		; CHECK: [[LD4COPY:%[0-9]+]]:_(s32) = COPY [[LD4]](s32)
		; CHECK: G_STORE [[LD1COPY]](s8), %0(p0) :: (store 1 into %ir.addr, align 4)
		; CHECK: [[CST4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP4:%[0-9]+]]:_(p0) = G_GEP %0, [[CST4]](s64)
		; CHECK: G_STORE [[LD2COPY]](s8), [[GEP4]](p0) :: (store 1 into %ir.addr + 32, align 4)
		; CHECK: [[CST5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
		; CHECK: [[GEP5:%[0-9]+]]:_(p0) = G_GEP %0, [[CST5]](s64)
		; CHECK: G_STORE [[ARGCOPY]](s32), [[GEP5]](p0) :: (store 4 into %ir.addr + 64)
		; CHECK: [[CST6:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
		; CHECK: [[GEP6:%[0-9]+]]:_(p0) = G_GEP %0, [[CST6]](s64)
		; CHECK: G_STORE [[LD4COPY]](s32), [[GEP6]](p0) :: (store 4 into %ir.addr + 96)
define void @test_insertvalue(%struct.nested* %addr, i32 %val) {		define void @test_insertvalue(%struct.nested* %addr, i32 %val) {
%struct = load %struct.nested, %struct.nested* %addr		%struct = load %struct.nested, %struct.nested* %addr
%newstruct = insertvalue %struct.nested %struct, i32 %val, 1, 1		%newstruct = insertvalue %struct.nested %struct, i32 %val, 1, 1
store %struct.nested %newstruct, %struct.nested* %addr		store %struct.nested %newstruct, %struct.nested* %addr
ret void		ret void
}		}

define [1 x i64] @test_trivial_insert([1 x i64] %s, i64 %val) {		define [1 x i64] @test_trivial_insert([1 x i64] %s, i64 %val) {
; CHECK-LABEL: name: test_trivial_insert		; CHECK-LABEL: name: test_trivial_insert
; CHECK: [[STRUCT:%[0-9]+]]:_(s64) = COPY $x0		; CHECK: [[STRUCT:%[0-9]+]]:_(s64) = COPY $x0
; CHECK: [[VAL:%[0-9]+]]:_(s64) = COPY $x1		; CHECK: [[VAL:%[0-9]+]]:_(s64) = COPY $x1
; CHECK: [[RES:%[0-9]+]]:_(s64) = COPY [[VAL]](s64)		; CHECK: [[RES:%[0-9]+]]:_(s64) = COPY [[VAL]](s64)
; CHECK: $x0 = COPY [[RES]]		; CHECK: $x0 = COPY [[RES]]
%res = insertvalue [1 x i64] %s, i64 %val, 0		%res = insertvalue [1 x i64] %s, i64 %val, 0
ret [1 x i64] %res		ret [1 x i64] %res
}		}

define [1 x i8*] @test_trivial_insert_ptr([1 x i8*] %s, i8* %val) {		define [1 x i8*] @test_trivial_insert_ptr([1 x i8*] %s, i8* %val) {
; CHECK-LABEL: name: test_trivial_insert_ptr		; CHECK-LABEL: name: test_trivial_insert_ptr
; CHECK: [[STRUCT:%[0-9]+]]:_(s64) = COPY $x0		; CHECK: [[STRUCT:%[0-9]+]]:_(s64) = COPY $x0
; CHECK: [[VAL:%[0-9]+]]:_(p0) = COPY $x1		; CHECK: [[VAL:%[0-9]+]]:_(p0) = COPY $x1
; CHECK: [[RES:%[0-9]+]]:_(s64) = G_PTRTOINT [[VAL]](p0)		; CHECK: [[VAL2:%[0-9]+]]:_(p0) = COPY [[VAL]]
; CHECK: $x0 = COPY [[RES]]		; CHECK: $x0 = COPY [[VAL2]]
%res = insertvalue [1 x i8*] %s, i8* %val, 0		%res = insertvalue [1 x i8*] %s, i8* %val, 0
ret [1 x i8*] %res		ret [1 x i8*] %res
}		}

; CHECK-LABEL: name: test_insertvalue_agg		; CHECK-LABEL: name: test_insertvalue_agg
; CHECK: [[SMALLSTRUCT:%[0-9]+]]:_(s64) = G_LOAD		; CHECK: %0:_(p0) = COPY $x0
; CHECK: [[STRUCT:%[0-9]+]]:_(s128) = G_LOAD		; CHECK: %1:_(p0) = COPY $x1
; CHECK: [[RES:%[0-9]+]]:_(s128) = G_INSERT [[STRUCT]], [[SMALLSTRUCT]](s64), 32		; CHECK: [[LD1:%[0-9]+]]:_(s8) = G_LOAD %1(p0) :: (load 1 from %ir.addr2, align 4)
; CHECK: G_STORE [[RES]](s128)		; CHECK: [[CST1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP1:%[0-9]+]]:_(p0) = G_GEP %1, [[CST1]](s64)
		; CHECK: [[LD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 4 from %ir.addr2 + 32)
		; CHECK: [[LD3:%[0-9]+]]:_(s8) = G_LOAD %0(p0) :: (load 1 from %ir.addr, align 4)
		; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP %0, [[CST2]](s64)
		; CHECK: [[LD4:%[0-9]+]]:_(s8) = G_LOAD [[GEP2]](p0) :: (load 1 from %ir.addr + 32, align 4)
		; CHECK: [[CST3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
		; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_GEP %0, [[CST3]](s64)
		; CHECK: [[LD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 4 from %ir.addr + 64)
		; CHECK: [[CST4:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
		; CHECK: [[GEP4:%[0-9]+]]:_(p0) = G_GEP %0, [[CST4]](s64)
		; CHECK: [[LD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 4 from %ir.addr + 96)
		; CHECK: [[LD3COPY:%[0-9]+]]:_(s8) = COPY [[LD3]](s8)
		; CHECK: [[LD1COPY:%[0-9]+]]:_(s8) = COPY [[LD1]](s8)
		; CHECK: [[LD2COPY:%[0-9]+]]:_(s32) = COPY [[LD2]](s32)
		; CHECK: [[LD6COPY:%[0-9]+]]:_(s32) = COPY [[LD6]](s32)
		; CHECK: G_STORE [[LD3COPY]](s8), %0(p0) :: (store 1 into %ir.addr, align 4)
		; CHECK: [[CST5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
		; CHECK: [[GEP5:%[0-9]+]]:_(p0) = G_GEP %0, [[CST5]](s64)
		; CHECK: G_STORE [[LD1COPY]](s8), [[GEP5]](p0) :: (store 1 into %ir.addr + 32, align 4)
		; CHECK: [[CST6:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
		; CHECK: [[GEP6:%[0-9]+]]:_(p0) = G_GEP %0, [[CST6]](s64)
		; CHECK: G_STORE [[LD2COPY]](s32), [[GEP6]](p0) :: (store 4 into %ir.addr + 64)
		; CHECK: [[CST7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
		; CHECK: [[GEP7:%[0-9]+]]:_(p0) = G_GEP %0, [[CST7]](s64)
		; CHECK: G_STORE [[LD6COPY]](s32), [[GEP7]](p0) :: (store 4 into %ir.addr + 96)
define void @test_insertvalue_agg(%struct.nested* %addr, {i8, i32}* %addr2) {		define void @test_insertvalue_agg(%struct.nested* %addr, {i8, i32}* %addr2) {
%smallstruct = load {i8, i32}, {i8, i32}* %addr2		%smallstruct = load {i8, i32}, {i8, i32}* %addr2
%struct = load %struct.nested, %struct.nested* %addr		%struct = load %struct.nested, %struct.nested* %addr
%res = insertvalue %struct.nested %struct, {i8, i32} %smallstruct, 1		%res = insertvalue %struct.nested %struct, {i8, i32} %smallstruct, 1
store %struct.nested %res, %struct.nested* %addr		store %struct.nested %res, %struct.nested* %addr
ret void		ret void
}		}


test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll

	; CHECK: {{%.*}}:_(s64) = COPY $x1			; CHECK: {{%.*}}:_(s64) = COPY $x1
	; CHECK: {{%.*}}:_(s64) = COPY $x2			; CHECK: {{%.*}}:_(s64) = COPY $x2
	define void @take_128bit_struct([2 x i64]* %ptr, [2 x i64] %in) {			define void @take_128bit_struct([2 x i64]* %ptr, [2 x i64] %in) {
	store [2 x i64] %in, [2 x i64]* %ptr			store [2 x i64] %in, [2 x i64]* %ptr
	ret void			ret void
	}			}

	; CHECK-LABEL: name: test_split_struct			; CHECK-LABEL: name: test_split_struct
	; CHECK: [[STRUCT:%[0-9]+]]:_(s128) = G_LOAD {{.*}}(p0)			; CHECK: [[LD1:%[0-9]+]]:_(s64) = G_LOAD %0(p0) :: (load 8 from %ir.ptr)
	; CHECK: [[LO:%[0-9]+]]:_(s64) = G_EXTRACT [[STRUCT]](s128), 0			; CHECK: [[CST:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
	; CHECK: [[HI:%[0-9]+]]:_(s64) = G_EXTRACT [[STRUCT]](s128), 64			; CHECK: [[GEP:%[0-9]+]]:_(p0) = G_GEP %0, [[CST]](s64)
				; CHECK: [[LD2:%[0-9]+]]:_(s64) = G_LOAD %3(p0) :: (load 8 from %ir.ptr + 64)
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s128) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s128) = G_INSERT [[IMPDEF]], [[LD1]](s64), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s128) = G_INSERT [[INS1]], [[LD2]](s64), 64
				; CHECK: [[EXT1:%[0-9]+]]:_(s64) = G_EXTRACT [[INS2]](s128), 0
				; CHECK: [[EXT2:%[0-9]+]]:_(s64) = G_EXTRACT [[INS2]](s128), 64

	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF:%[0-9]+]]:_(s64) = G_CONSTANT i64 0			; CHECK: [[OFF:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
	; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF]]			; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF]](s64)
	; CHECK: G_STORE [[LO]](s64), [[ADDR]](p0) :: (store 8 into stack, align 0)			; CHECK: G_STORE [[EXT1]](s64), [[ADDR]](p0) :: (store 8 into stack, align 0)

	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF:%[0-9]+]]:_(s64) = G_CONSTANT i64 8			; CHECK: [[OFF:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
	; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF]]			; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF]]
	; CHECK: G_STORE [[HI]](s64), [[ADDR]](p0) :: (store 8 into stack + 8, align 0)			; CHECK: G_STORE [[EXT2]](s64), [[ADDR]](p0) :: (store 8 into stack + 8, align 0)
	define void @test_split_struct([2 x i64]* %ptr) {			define void @test_split_struct([2 x i64]* %ptr) {
	%struct = load [2 x i64], [2 x i64]* %ptr			%struct = load [2 x i64], [2 x i64]* %ptr
	call void @take_split_struct([2 x i64]* null, i64 1, i64 2, i64 3,			call void @take_split_struct([2 x i64]* null, i64 1, i64 2, i64 3,
	i64 4, i64 5, i64 6,			i64 4, i64 5, i64 6,
	[2 x i64] %struct)			[2 x i64] %struct)
	ret void			ret void
	}			}


test/CodeGen/AArch64/GlobalISel/call-translator.ll

	; CHECK: [[I8:%[0-9]+]]:_(s8) = G_TRUNC [[I8_C]]			; CHECK: [[I8:%[0-9]+]]:_(s8) = G_TRUNC [[I8_C]]
	; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2			; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x2

	; CHECK: [[UNDEF:%[0-9]+]]:_(s192) = G_IMPLICIT_DEF			; CHECK: [[UNDEF:%[0-9]+]]:_(s192) = G_IMPLICIT_DEF
	; CHECK: [[ARG0:%[0-9]+]]:_(s192) = G_INSERT [[UNDEF]], [[DBL]](s64), 0			; CHECK: [[ARG0:%[0-9]+]]:_(s192) = G_INSERT [[UNDEF]], [[DBL]](s64), 0
	; CHECK: [[ARG1:%[0-9]+]]:_(s192) = G_INSERT [[ARG0]], [[I64]](s64), 64			; CHECK: [[ARG1:%[0-9]+]]:_(s192) = G_INSERT [[ARG0]], [[I64]](s64), 64
	; CHECK: [[ARG2:%[0-9]+]]:_(s192) = G_INSERT [[ARG1]], [[I8]](s8), 128			; CHECK: [[ARG2:%[0-9]+]]:_(s192) = G_INSERT [[ARG1]], [[I8]](s8), 128
	; CHECK: [[ARG:%[0-9]+]]:_(s192) = COPY [[ARG2]]			; CHECK: [[ARG:%[0-9]+]]:_(s192) = COPY [[ARG2]]
				; CHECK: [[EXTA0:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s192), 0
	; CHECK: G_STORE [[ARG]](s192), [[ADDR]](p0)			; CHECK: [[EXTA1:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s192), 64
				; CHECK: [[EXTA2:%[0-9]+]]:_(s8) = G_EXTRACT [[ARG]](s192), 128
				; CHECK: G_STORE [[EXTA0]](s64), [[ADDR]](p0) :: (store 8 into %ir.addr)
				; CHECK: [[CST1:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
				; CHECK: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST1]](s64)
				; CHECK: G_STORE [[EXTA1]](s64), [[GEP1]](p0) :: (store 8 into %ir.addr + 64)
				; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
				; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST2]](s64)
				; CHECK: G_STORE [[EXTA2]](s8), [[GEP2]](p0) :: (store 1 into %ir.addr + 128, align 8)
	; CHECK: RET_ReallyLR			; CHECK: RET_ReallyLR
	define void @test_struct_formal({double, i64, i8} %in, {double, i64, i8}* %addr) {			define void @test_struct_formal({double, i64, i8} %in, {double, i64, i8}* %addr) {
	store {double, i64, i8} %in, {double, i64, i8}* %addr			store {double, i64, i8} %in, {double, i64, i8}* %addr
	ret void			ret void
	}			}


	; CHECK-LABEL: name: test_struct_return			; CHECK-LABEL: name: test_struct_return
	; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x0			; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x0
	; CHECK: [[VAL:%[0-9]+]]:_(s192) = G_LOAD [[ADDR]](p0)
				; CHECK: [[LD1:%[0-9]+]]:_(s64) = G_LOAD [[ADDR]](p0) :: (load 8 from %ir.addr)
				; CHECK: [[CST1:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
				; CHECK: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST1]](s64)
				; CHECK: [[LD2:%[0-9]+]]:_(s64) = G_LOAD [[GEP1]](p0) :: (load 8 from %ir.addr + 64)
				; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
				; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST2]](s64)
				; CHECK: [[LD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 4 from %ir.addr + 128, align 8)
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s192) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s192) = G_INSERT [[IMPDEF]], [[LD1]](s64), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s192) = G_INSERT [[INS1]], [[LD2]](s64), 64
				; CHECK: [[VAL:%[0-9]+]]:_(s192) = G_INSERT [[INS2]], [[LD3]](s32), 128

	; CHECK: [[DBL:%[0-9]+]]:_(s64) = G_EXTRACT [[VAL]](s192), 0			; CHECK: [[DBL:%[0-9]+]]:_(s64) = G_EXTRACT [[VAL]](s192), 0
	; CHECK: [[I64:%[0-9]+]]:_(s64) = G_EXTRACT [[VAL]](s192), 64			; CHECK: [[I64:%[0-9]+]]:_(s64) = G_EXTRACT [[VAL]](s192), 64
	; CHECK: [[I32:%[0-9]+]]:_(s32) = G_EXTRACT [[VAL]](s192), 128			; CHECK: [[I32:%[0-9]+]]:_(s32) = G_EXTRACT [[VAL]](s192), 128

	; CHECK: $d0 = COPY [[DBL]](s64)			; CHECK: $d0 = COPY [[DBL]](s64)
	; CHECK: $x0 = COPY [[I64]](s64)			; CHECK: $x0 = COPY [[I64]](s64)
	; CHECK: $w1 = COPY [[I32]](s32)			; CHECK: $w1 = COPY [[I32]](s32)
	; CHECK: RET_ReallyLR implicit $d0, implicit $x0, implicit $w1			; CHECK: RET_ReallyLR implicit $d0, implicit $x0, implicit $w1
	define {double, i64, i32} @test_struct_return({double, i64, i32}* %addr) {			define {double, i64, i32} @test_struct_return({double, i64, i32}* %addr) {
	%val = load {double, i64, i32}, {double, i64, i32}* %addr			%val = load {double, i64, i32}, {double, i64, i32}* %addr
	ret {double, i64, i32} %val			ret {double, i64, i32} %val
	}			}

	; CHECK-LABEL: name: test_arr_call			; CHECK-LABEL: name: test_arr_call
	; CHECK: hasCalls: true			; CHECK: hasCalls: true
	; CHECK: [[ARG:%[0-9]+]]:_(s256) = G_LOAD			; CHECK: %0:_(p0) = COPY $x0
				; CHECK: [[LD1:%[0-9]+]]:_(s64) = G_LOAD %0(p0) :: (load 8 from %ir.addr)
				; CHECK: [[CST1:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
				; CHECK: [[GEP1:%[0-9]+]]:_(p0) = G_GEP %0, [[CST1]](s64)
				; CHECK: [[LD2:%[0-9]+]]:_(s64) = G_LOAD [[GEP1]](p0) :: (load 8 from %ir.addr + 64)
				; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
				; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP %0, [[CST2]](s64)
				; CHECK: [[LD3:%[0-9]+]]:_(s64) = G_LOAD [[GEP2]](p0) :: (load 8 from %ir.addr + 128)
				; CHECK: [[CST3:%[0-9]+]]:_(s64) = G_CONSTANT i64 24
				; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_GEP %0, [[CST3]](s64)
				; CHECK: [[LD4:%[0-9]+]]:_(s64) = G_LOAD [[GEP3]](p0) :: (load 8 from %ir.addr + 192)
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s256) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s256) = G_INSERT [[IMPDEF]], [[LD1]](s64), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s256) = G_INSERT [[INS1]], [[LD2]](s64), 64
				; CHECK: [[INS3:%[0-9]+]]:_(s256) = G_INSERT [[INS2]], [[LD3]](s64), 128
				; CHECK: [[ARG:%[0-9]+]]:_(s256) = G_INSERT [[INS3]], [[LD4]](s64), 192
	; CHECK: [[E0:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s256), 0			; CHECK: [[E0:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s256), 0
	; CHECK: [[E1:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s256), 64			; CHECK: [[E1:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s256), 64
	; CHECK: [[E2:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s256), 128			; CHECK: [[E2:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s256), 128
	; CHECK: [[E3:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s256), 192			; CHECK: [[E3:%[0-9]+]]:_(s64) = G_EXTRACT [[ARG]](s256), 192

	; CHECK: $x0 = COPY [[E0]](s64)			; CHECK: $x0 = COPY [[E0]](s64)
	; CHECK: $x1 = COPY [[E1]](s64)			; CHECK: $x1 = COPY [[E1]](s64)
	; CHECK: $x2 = COPY [[E2]](s64)			; CHECK: $x2 = COPY [[E2]](s64)
	(124 lines not shown)
	; CHECK: {{%.*}}:_(s64) = COPY $x1			; CHECK: {{%.*}}:_(s64) = COPY $x1
	; CHECK: {{%.*}}:_(s64) = COPY $x2			; CHECK: {{%.*}}:_(s64) = COPY $x2
	define void @take_128bit_struct([2 x i64]* %ptr, [2 x i64] %in) {			define void @take_128bit_struct([2 x i64]* %ptr, [2 x i64] %in) {
	store [2 x i64] %in, [2 x i64]* %ptr			store [2 x i64] %in, [2 x i64]* %ptr
	ret void			ret void
	}			}

	; CHECK-LABEL: name: test_split_struct			; CHECK-LABEL: name: test_split_struct
	; CHECK: [[STRUCT:%[0-9]+]]:_(s128) = G_LOAD {{.*}}(p0)			; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x0
	; CHECK: [[LO:%[0-9]+]]:_(s64) = G_EXTRACT [[STRUCT]](s128), 0			; CHECK: [[LO:%[0-9]+]]:_(s64) = G_LOAD %0(p0) :: (load 8 from %ir.ptr)
	; CHECK: [[HI:%[0-9]+]]:_(s64) = G_EXTRACT [[STRUCT]](s128), 64			; CHECK: [[CST:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
				; CHECK: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[ADDR]], [[CST]](s64)
				; CHECK: [[HI:%[0-9]+]]:_(s64) = G_LOAD [[GEP]](p0) :: (load 8 from %ir.ptr + 64)

				; CHECK: [[IMPDEF:%[0-9]+]]:_(s128) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s128) = G_INSERT [[IMPDEF]], [[LO]](s64), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s128) = G_INSERT [[INS1]], [[HI]](s64), 64
				; CHECK: [[EXTLO:%[0-9]+]]:_(s64) = G_EXTRACT [[INS2]](s128), 0
				; CHECK: [[EXTHI:%[0-9]+]]:_(s64) = G_EXTRACT [[INS2]](s128), 64

	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF:%[0-9]+]]:_(s64) = G_CONSTANT i64 0			; CHECK: [[CST2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
	; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF]]			; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[CST2]](s64)
	; CHECK: G_STORE [[LO]](s64), [[ADDR]](p0) :: (store 8 into stack, align 0)			; CHECK: G_STORE [[EXTLO]](s64), [[GEP2]](p0) :: (store 8 into stack, align 0)

	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF:%[0-9]+]]:_(s64) = G_CONSTANT i64 8			; CHECK: [[CST3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
	; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF]]			; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[CST3]](s64)
	; CHECK: G_STORE [[HI]](s64), [[ADDR]](p0) :: (store 8 into stack + 8, align 0)			; CHECK: G_STORE [[EXTHI]](s64), [[GEP3]](p0) :: (store 8 into stack + 8, align 0)
	define void @test_split_struct([2 x i64]* %ptr) {			define void @test_split_struct([2 x i64]* %ptr) {
	%struct = load [2 x i64], [2 x i64]* %ptr			%struct = load [2 x i64], [2 x i64]* %ptr
	call void @take_split_struct([2 x i64]* null, i64 1, i64 2, i64 3,			call void @take_split_struct([2 x i64]* null, i64 1, i64 2, i64 3,
	i64 4, i64 5, i64 6,			i64 4, i64 5, i64 6,
	[2 x i64] %struct)			[2 x i64] %struct)
	ret void			ret void
	}			}

	(16 lines not shown)

test/CodeGen/AArch64/GlobalISel/irtranslator-exceptions.ll

	(13 lines not shown)
	; CHECK: $w0 = COPY			; CHECK: $w0 = COPY
	; CHECK: BL @foo, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit $w0, implicit-def $w0			; CHECK: BL @foo, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit $w0, implicit-def $w0
	; CHECK: {{%[0-9]+}}:_(s32) = COPY $w0			; CHECK: {{%[0-9]+}}:_(s32) = COPY $w0
	; CHECK: EH_LABEL			; CHECK: EH_LABEL
	; CHECK: G_BR %[[GOOD]]			; CHECK: G_BR %[[GOOD]]

	; CHECK: [[BAD]].{{[a-z]+}} (landing-pad):			; CHECK: [[BAD]].{{[a-z]+}} (landing-pad):
	; CHECK: EH_LABEL			; CHECK: EH_LABEL
	; CHECK: [[UNDEF:%[0-9]+]]:_(s128) = G_IMPLICIT_DEF
	; CHECK: [[PTR:%[0-9]+]]:_(p0) = COPY $x0			; CHECK: [[PTR:%[0-9]+]]:_(p0) = COPY $x0
	; CHECK: [[VAL_WITH_PTR:%[0-9]+]]:_(s128) = G_INSERT [[UNDEF]], [[PTR]](p0), 0
	; CHECK: [[SEL_PTR:%[0-9]+]]:_(p0) = COPY $x1			; CHECK: [[SEL_PTR:%[0-9]+]]:_(p0) = COPY $x1
	; CHECK: [[SEL:%[0-9]+]]:_(s32) = G_PTRTOINT [[SEL_PTR]]			; CHECK: [[SEL:%[0-9]+]]:_(s32) = G_PTRTOINT [[SEL_PTR]]
				; CHECK: [[UNDEF:%[0-9]+]]:_(s128) = G_IMPLICIT_DEF
				; CHECK: [[VAL_WITH_PTR:%[0-9]+]]:_(s128) = G_INSERT [[UNDEF]], [[PTR]](p0), 0
	; CHECK: [[PTR_SEL:%[0-9]+]]:_(s128) = G_INSERT [[VAL_WITH_PTR]], [[SEL]](s32), 64			; CHECK: [[PTR_SEL:%[0-9]+]]:_(s128) = G_INSERT [[VAL_WITH_PTR]], [[SEL]](s32), 64
	; CHECK: [[PTR_RET:%[0-9]+]]:_(s64) = G_EXTRACT [[PTR_SEL]](s128), 0			; CHECK: [[PTR_RET:%[0-9]+]]:_(s64) = G_EXTRACT [[PTR_SEL]](s128), 0
	; CHECK: [[SEL_RET:%[0-9]+]]:_(s32) = G_EXTRACT [[PTR_SEL]](s128), 64			; CHECK: [[SEL_RET:%[0-9]+]]:_(s32) = G_EXTRACT [[PTR_SEL]](s128), 64
	; CHECK: $x0 = COPY [[PTR_RET]]			; CHECK: $x0 = COPY [[PTR_RET]]
	; CHECK: $w1 = COPY [[SEL_RET]]			; CHECK: $w1 = COPY [[SEL_RET]]

	; CHECK: [[GOOD]].{{[a-z]+}}:			; CHECK: [[GOOD]].{{[a-z]+}}:
	; CHECK: [[SEL:%[0-9]+]]:_(s32) = G_CONSTANT i32 1			; CHECK: [[SEL:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
	; CHECK: {{%[0-9]+}}:_(s128) = G_INSERT {{%[0-9]+}}, [[SEL]](s32), 64			; CHECK: [[SELCOPY:%[0-9]+]]:_(s32) = COPY [[SEL]](s32)
				; CHECK: {{%[0-9]+}}:_(s128) = G_INSERT {{%[0-9]+}}, [[SELCOPY]](s32), 64

	define { i8*, i32 } @bar() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			define { i8*, i32 } @bar() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	%res32 = invoke i32 @foo(i32 42) to label %continue unwind label %broken			%res32 = invoke i32 @foo(i32 42) to label %continue unwind label %broken


	broken:			broken:
	%ptr.sel = landingpad { i8*, i32 } catch i8* bitcast(i8** @_ZTIi to i8*)			%ptr.sel = landingpad { i8*, i32 } catch i8* bitcast(i8** @_ZTIi to i8*)
	ret { i8*, i32 } %ptr.sel			ret { i8*, i32 } %ptr.sel
	(52 lines not shown)

test/CodeGen/AArch64/GlobalISel/legalize-exceptions.ll

	(10 lines not shown)
	; CHECK: body:			; CHECK: body:
	; CHECK-NEXT: bb.1 (%ir-block.0):			; CHECK-NEXT: bb.1 (%ir-block.0):
	; CHECK: successors: %{{bb.[0-9]+.*}}%[[LP:bb.[0-9]+]]			; CHECK: successors: %{{bb.[0-9]+.*}}%[[LP:bb.[0-9]+]]

	; CHECK: [[LP]].{{[a-z]+}} (landing-pad):			; CHECK: [[LP]].{{[a-z]+}} (landing-pad):
	; CHECK: EH_LABEL			; CHECK: EH_LABEL

	; CHECK: [[PTR:%[0-9]+]]:_(p0) = COPY $x0			; CHECK: [[PTR:%[0-9]+]]:_(p0) = COPY $x0
	; CHECK: [[STRUCT_PTR:%[0-9]+]]:_(s64) = G_PTRTOINT [[PTR]](p0)

	; CHECK: [[SEL_PTR:%[0-9]+]]:_(p0) = COPY $x1			; CHECK: [[SEL_PTR:%[0-9]+]]:_(p0) = COPY $x1
	; CHECK: [[SEL:%[0-9]+]]:_(s32) = G_PTRTOINT [[SEL_PTR]]			; CHECK: [[SEL_PTR_INT:%[0-9]+]]:_(s32) = G_PTRTOINT [[SEL_PTR]](p0)
	; CHECK: [[STRUCT_SEL:%[0-9]+]]:_(s64) = G_INSERT {{%[0-9]+}}, [[SEL]](s32), 0			; CHECK: [[PTRCPY:%[0-9]+]]:_(p0) = COPY [[PTR]](p0)
				; CHECK: G_STORE %9(p0), %0(p0) :: (store 8 into %ir.exn.slot)
	; CHECK: [[PTR:%[0-9]+]]:_(p0) = G_INTTOPTR [[STRUCT_PTR]](s64)			; CHECK: [[SEL_PTR_INT_CPY:%[0-9]+]]:_(s32) = COPY [[SEL_PTR_INT]](s32)
	; CHECK: G_STORE [[PTR]](p0), {{%[0-9]+}}(p0)			; CHECK: G_STORE [[SEL_PTR_INT_CPY]](s32), %1(p0) :: (store 4 into %ir.ehselector.slot)

	; CHECK: [[SEL_TMP:%[0-9]+]]:_(s32) = G_EXTRACT [[STRUCT_SEL]](s64), 0
	; CHECK: [[SEL:%[0-9]+]]:_(s32) = COPY [[SEL_TMP]]
	; CHECK: G_STORE [[SEL]](s32), {{%[0-9]+}}(p0)

	define void @bar() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			define void @bar() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	%exn.slot = alloca i8*			%exn.slot = alloca i8*
	%ehselector.slot = alloca i32			%ehselector.slot = alloca i32
	%1 = invoke i32 @foo(i32 42) to label %continue unwind label %cleanup			%1 = invoke i32 @foo(i32 42) to label %continue unwind label %cleanup

	cleanup:			cleanup:
	%2 = landingpad { i8*, i32 } cleanup			%2 = landingpad { i8*, i32 } cleanup
	(14 lines not shown)

test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll

	(516 lines not shown)

	%struct.v2s32 = type { <2 x i32> }			%struct.v2s32 = type { <2 x i32> }

	define i32 @test_constantstruct_v2s32() {			define i32 @test_constantstruct_v2s32() {
	; CHECK-LABEL: name: test_constantstruct_v2s32			; CHECK-LABEL: name: test_constantstruct_v2s32
	; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1			; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
	; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 2			; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
	; CHECK: [[VEC:%[0-9]+]]:_(<2 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C2]](s32)			; CHECK: [[VEC:%[0-9]+]]:_(<2 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C2]](s32)
	; CHECK: G_EXTRACT_VECTOR_ELT [[VEC]](<2 x s32>)			; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
				; CHECK: [[VECCPY:%[0-9]+]]:_(<2 x s32>) = COPY [[VEC]](<2 x s32>)
				; CHECK: G_EXTRACT_VECTOR_ELT [[VECCPY]](<2 x s32>), [[C3]]
	%vec = extractvalue %struct.v2s32 {<2 x i32><i32 1, i32 2>}, 0			%vec = extractvalue %struct.v2s32 {<2 x i32><i32 1, i32 2>}, 0
	%elt = extractelement <2 x i32> %vec, i32 0			%elt = extractelement <2 x i32> %vec, i32 0
	ret i32 %elt			ret i32 %elt
	}			}

	%struct.v2s32.s32.s32 = type { <2 x i32>, i32, i32 }			%struct.v2s32.s32.s32 = type { <2 x i32>, i32, i32 }

	define i32 @test_constantstruct_v2s32_s32_s32() {			define i32 @test_constantstruct_v2s32_s32_s32() {
	; CHECK-LABEL: name: test_constantstruct_v2s32_s32_s32			; CHECK-LABEL: name: test_constantstruct_v2s32_s32_s32
	; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1			; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
	; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 2			; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
	; CHECK: [[VEC:%[0-9]+]]:_(<2 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C2]](s32)			; CHECK: [[VEC:%[0-9]+]]:_(<2 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C2]](s32)
	; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 3			; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
	; CHECK: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 4			; CHECK: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
	; CHECK: [[C5:%[0-9]+]]:_(s128) = G_IMPLICIT_DEF			; CHECK: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
	; CHECK: [[C6:%[0-9]+]]:_(s128) = G_INSERT [[C5]], [[VEC]](<2 x s32>), 0			; CHECK: [[VECCPY:%[0-9]+]]:_(<2 x s32>) = COPY [[VEC]](<2 x s32>)
	; CHECK: [[C7:%[0-9]+]]:_(s128) = G_INSERT [[C6]], [[C3]](s32), 64			; CHECK: G_EXTRACT_VECTOR_ELT [[VECCPY]](<2 x s32>), [[C5]](s32)
	; CHECK: [[C8:%[0-9]+]]:_(s128) = G_INSERT [[C7]], [[C4]](s32), 96
	; CHECK: [[EXT:%[0-9]+]]:_(<2 x s32>) = G_EXTRACT [[C8]](s128), 0
	; CHECK: G_EXTRACT_VECTOR_ELT [[EXT]](<2 x s32>)
	%vec = extractvalue %struct.v2s32.s32.s32 {<2 x i32><i32 1, i32 2>, i32 3, i32 4}, 0			%vec = extractvalue %struct.v2s32.s32.s32 {<2 x i32><i32 1, i32 2>, i32 3, i32 4}, 0
	%elt = extractelement <2 x i32> %vec, i32 0			%elt = extractelement <2 x i32> %vec, i32 0
	ret i32 %elt			ret i32 %elt
	}			}

test/CodeGen/ARM/GlobalISel/arm-param-lowering.ll

	(24 lines not shown)

	define arm_aapcscc i32* @test_call_simple_stack_params(i32 *%a, i32 %b) {			define arm_aapcscc i32* @test_call_simple_stack_params(i32 *%a, i32 %b) {
	; CHECK-LABEL: name: test_call_simple_stack_params			; CHECK-LABEL: name: test_call_simple_stack_params
	; CHECK-DAG: [[AVREG:%[0-9]+]]:_(p0) = COPY $r0			; CHECK-DAG: [[AVREG:%[0-9]+]]:_(p0) = COPY $r0
	; CHECK-DAG: [[BVREG:%[0-9]+]]:_(s32) = COPY $r1			; CHECK-DAG: [[BVREG:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK: ADJCALLSTACKDOWN 8, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKDOWN 8, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK-DAG: $r0 = COPY [[BVREG]]			; CHECK-DAG: $r0 = COPY [[BVREG]]
	; CHECK-DAG: $r1 = COPY [[AVREG]]			; CHECK-DAG: $r1 = COPY [[AVREG]]
	; CHECK-DAG: $r2 = COPY [[BVREG]]			; CHECK-DAG: $r2 = COPY [[BVREG]]
	; CHECK-DAG: $r3 = COPY [[AVREG]]			; CHECK-DAG: $r3 = COPY [[AVREG]]
	; CHECK: [[SP1:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP1:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0			; CHECK: [[OFF1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
	; CHECK: [[FI1:%[0-9]+]]:_(p0) = G_GEP [[SP1]], [[OFF1]](s32)			; CHECK: [[FI1:%[0-9]+]]:_(p0) = G_GEP [[SP1]], [[OFF1]](s32)
	; CHECK: G_STORE [[BVREG]](s32), [[FI1]](p0){{.*}}store 4			; CHECK: G_STORE [[BVREG]](s32), [[FI1]](p0){{.*}}store 4
	; CHECK: [[SP2:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP2:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF2:%[0-9]+]]:_(s32) = G_CONSTANT i32 4			; CHECK: [[OFF2:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
	; CHECK: [[FI2:%[0-9]+]]:_(p0) = G_GEP [[SP2]], [[OFF2]](s32)			; CHECK: [[FI2:%[0-9]+]]:_(p0) = G_GEP [[SP2]], [[OFF2]](s32)
	(144 lines not shown)
	declare arm_aapcscc [3 x i32] @tiny_int_arrays_target([2 x i32])			declare arm_aapcscc [3 x i32] @tiny_int_arrays_target([2 x i32])

	define arm_aapcscc [3 x i32] @test_tiny_int_arrays([2 x i32] %arr) {			define arm_aapcscc [3 x i32] @test_tiny_int_arrays([2 x i32] %arr) {
	; CHECK-LABEL: name: test_tiny_int_arrays			; CHECK-LABEL: name: test_tiny_int_arrays
	; CHECK: liveins: $r0, $r1			; CHECK: liveins: $r0, $r1
	; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0			; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0
	; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK: [[ARG_ARR:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)			; CHECK: [[ARG_ARR:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)
				; CHECK: [[EXT1:%[0-9]+]]:_(s32) = G_EXTRACT [[ARG_ARR]](s64), 0
				; CHECK: [[EXT2:%[0-9]+]]:_(s32) = G_EXTRACT [[ARG_ARR]](s64), 32
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s64) = G_INSERT [[IMPDEF]], [[EXT1]](s32), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s64) = G_INSERT [[INS1]], [[EXT2]](s32), 32
	; CHECK: ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARG_ARR]](s64)			; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS2]](s64)
	; CHECK: $r0 = COPY [[R0]]			; CHECK: $r0 = COPY [[R0]]
	; CHECK: $r1 = COPY [[R1]]			; CHECK: $r1 = COPY [[R1]]
	; CHECK: BL @tiny_int_arrays_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit-def $r0, implicit-def $r1			; CHECK: BL @tiny_int_arrays_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit-def $r0, implicit-def $r1
	; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0			; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0
	; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK: [[R2:%[0-9]+]]:_(s32) = COPY $r2			; CHECK: [[R2:%[0-9]+]]:_(s32) = COPY $r2
	; CHECK: [[RES_ARR:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32), [[R2]](s32)			; CHECK: [[RES_ARR:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32), [[R2]](s32)
	; CHECK: ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32), [[R2:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[RES_ARR]](s96)			; CHECK: [[EXT3:%[0-9]+]]:_(s32) = G_EXTRACT [[RES_ARR]](s96), 0
				; CHECK: [[EXT4:%[0-9]+]]:_(s32) = G_EXTRACT [[RES_ARR]](s96), 32
				; CHECK: [[EXT5:%[0-9]+]]:_(s32) = G_EXTRACT [[RES_ARR]](s96), 64
				; CHECK: [[IMPDEF2:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF
				; CHECK: [[INS3:%[0-9]+]]:_(s96) = G_INSERT [[IMPDEF2]], [[EXT3]](s32), 0
				; CHECK: [[INS4:%[0-9]+]]:_(s96) = G_INSERT [[INS3]], [[EXT4]](s32), 32
				; CHECK: [[INS5:%[0-9]+]]:_(s96) = G_INSERT [[INS4]], [[EXT5]](s32), 64
				; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32), [[R2:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS5]](s96)
	; FIXME: This doesn't seem correct with regard to the AAPCS docs (which say			; FIXME: This doesn't seem correct with regard to the AAPCS docs (which say
	; that composite types larger than 4 bytes should be passed through memory),			; that composite types larger than 4 bytes should be passed through memory),
	; but it's what DAGISel does. We should fix it in the common code for both.			; but it's what DAGISel does. We should fix it in the common code for both.
	; CHECK: $r0 = COPY [[R0]]			; CHECK: $r0 = COPY [[R0]]
	; CHECK: $r1 = COPY [[R1]]			; CHECK: $r1 = COPY [[R1]]
	; CHECK: $r2 = COPY [[R2]]			; CHECK: $r2 = COPY [[R2]]
	; CHECK: BX_RET 14, $noreg, implicit $r0, implicit $r1, implicit $r2			; CHECK: BX_RET 14, $noreg, implicit $r0, implicit $r1, implicit $r2
	entry:			entry:
	%r = notail call arm_aapcscc [3 x i32] @tiny_int_arrays_target([2 x i32] %arr)			%r = notail call arm_aapcscc [3 x i32] @tiny_int_arrays_target([2 x i32] %arr)
	ret [3 x i32] %r			ret [3 x i32] %r
	}			}

	declare arm_aapcscc void @multiple_int_arrays_target([2 x i32], [2 x i32])			declare arm_aapcscc void @multiple_int_arrays_target([2 x i32], [2 x i32])

	define arm_aapcscc void @test_multiple_int_arrays([2 x i32] %arr0, [2 x i32] %arr1) {			define arm_aapcscc void @test_multiple_int_arrays([2 x i32] %arr0, [2 x i32] %arr1) {
	; CHECK-LABEL: name: test_multiple_int_arrays			; CHECK-LABEL: name: test_multiple_int_arrays
	; CHECK: liveins: $r0, $r1			; CHECK: liveins: $r0, $r1
	; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0			; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0
	; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK: [[R2:%[0-9]+]]:_(s32) = COPY $r2			; CHECK: [[R2:%[0-9]+]]:_(s32) = COPY $r2
	; CHECK: [[R3:%[0-9]+]]:_(s32) = COPY $r3			; CHECK: [[R3:%[0-9]+]]:_(s32) = COPY $r3
	; CHECK: [[ARG_ARR0:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)			; CHECK: [[ARG_ARR0:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)
	; CHECK: [[ARG_ARR1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R2]](s32), [[R3]](s32)			; CHECK: [[ARG_ARR1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R2]](s32), [[R3]](s32)
				; CHECK: [[EXT1:%[0-9]+]]:_(s32) = G_EXTRACT [[ARG_ARR0]](s64), 0
				; CHECK: [[EXT2:%[0-9]+]]:_(s32) = G_EXTRACT [[ARG_ARR0]](s64), 32
				; CHECK: [[EXT3:%[0-9]+]]:_(s32) = G_EXTRACT [[ARG_ARR1]](s64), 0
				; CHECK: [[EXT4:%[0-9]+]]:_(s32) = G_EXTRACT [[ARG_ARR1]](s64), 32
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s64) = G_INSERT [[IMPDEF]], [[EXT1]](s32), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s64) = G_INSERT [[INS1]], [[EXT2]](s32), 32
				; CHECK: [[IMPDEF2:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF
				; CHECK: [[INS3:%[0-9]+]]:_(s64) = G_INSERT [[IMPDEF2]], [[EXT3]](s32), 0
				; CHECK: [[INS4:%[0-9]+]]:_(s64) = G_INSERT [[INS3]], [[EXT4]](s32), 32
	; CHECK: ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARG_ARR0]](s64)			; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS2]](s64)
	; CHECK: [[R2:%[0-9]+]]:_(s32), [[R3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARG_ARR1]](s64)			; CHECK: [[R2:%[0-9]+]]:_(s32), [[R3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS4]](s64)
	; CHECK: $r0 = COPY [[R0]]			; CHECK: $r0 = COPY [[R0]]
	; CHECK: $r1 = COPY [[R1]]			; CHECK: $r1 = COPY [[R1]]
	; CHECK: $r2 = COPY [[R2]]			; CHECK: $r2 = COPY [[R2]]
	; CHECK: $r3 = COPY [[R3]]			; CHECK: $r3 = COPY [[R3]]
	; CHECK: BL @multiple_int_arrays_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit $r3			; CHECK: BL @multiple_int_arrays_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit $r3
	; CHECK: ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: BX_RET 14, $noreg			; CHECK: BX_RET 14, $noreg
	entry:			entry:
	(15 lines not shown)
	; CHECK-DAG: [[R1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK-DAG: [[R1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK-DAG: [[R2:%[0-9]+]]:_(s32) = COPY $r2			; CHECK-DAG: [[R2:%[0-9]+]]:_(s32) = COPY $r2
	; CHECK-DAG: [[R3:%[0-9]+]]:_(s32) = COPY $r3			; CHECK-DAG: [[R3:%[0-9]+]]:_(s32) = COPY $r3
	; CHECK: [[FIRST_STACK_ELEMENT_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[FIRST_STACK_ID]]			; CHECK: [[FIRST_STACK_ELEMENT_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[FIRST_STACK_ID]]
	; CHECK: [[FIRST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_LOAD [[FIRST_STACK_ELEMENT_FI]]{{.*}}load 4 from %fixed-stack.[[FIRST_STACK_ID]]			; CHECK: [[FIRST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_LOAD [[FIRST_STACK_ELEMENT_FI]]{{.*}}load 4 from %fixed-stack.[[FIRST_STACK_ID]]
	; CHECK: [[LAST_STACK_ELEMENT_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[LAST_STACK_ID]]			; CHECK: [[LAST_STACK_ELEMENT_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[LAST_STACK_ID]]
	; CHECK: [[LAST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_LOAD [[LAST_STACK_ELEMENT_FI]]{{.*}}load 4 from %fixed-stack.[[LAST_STACK_ID]]			; CHECK: [[LAST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_LOAD [[LAST_STACK_ELEMENT_FI]]{{.*}}load 4 from %fixed-stack.[[LAST_STACK_ID]]
	; CHECK: [[ARG_ARR:%[0-9]+]]:_(s640) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32), [[R2]](s32), [[R3]](s32), [[FIRST_STACK_ELEMENT]](s32), {{.*}}, [[LAST_STACK_ELEMENT]](s32)			; CHECK: [[ARG_ARR:%[0-9]+]]:_(s640) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32), [[R2]](s32), [[R3]](s32), [[FIRST_STACK_ELEMENT]](s32), {{.*}}, [[LAST_STACK_ELEMENT]](s32)
				; CHECK: [[INS:%[0-9]+]]:_(s640) = G_INSERT {{.*}}, {{.*}}(s32), 608
	; CHECK: ADJCALLSTACKDOWN 64, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKDOWN 64, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32), [[R2:%[0-9]+]]:_(s32), [[R3:%[0-9]+]]:_(s32), [[FIRST_STACK_ELEMENT:%[0-9]+]]:_(s32), {{.*}}, [[LAST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARG_ARR]](s640)			; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32), [[R2:%[0-9]+]]:_(s32), [[R3:%[0-9]+]]:_(s32), [[FIRST_STACK_ELEMENT:%[0-9]+]]:_(s32), {{.*}}, [[LAST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS]](s640)
	; CHECK: $r0 = COPY [[R0]]			; CHECK: $r0 = COPY [[R0]]
	; CHECK: $r1 = COPY [[R1]]			; CHECK: $r1 = COPY [[R1]]
	; CHECK: $r2 = COPY [[R2]]			; CHECK: $r2 = COPY [[R2]]
	; CHECK: $r3 = COPY [[R3]]			; CHECK: $r3 = COPY [[R3]]
	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF_FIRST_ELEMENT:%[0-9]+]]:_(s32) = G_CONSTANT i32 0			; CHECK: [[OFF_FIRST_ELEMENT:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
	; CHECK: [[FIRST_STACK_ARG_ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF_FIRST_ELEMENT]](s32)			; CHECK: [[FIRST_STACK_ARG_ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF_FIRST_ELEMENT]](s32)
	; CHECK: G_STORE [[FIRST_STACK_ELEMENT]](s32), [[FIRST_STACK_ARG_ADDR]]{{.*}}store 4			; CHECK: G_STORE [[FIRST_STACK_ELEMENT]](s32), [[FIRST_STACK_ARG_ADDR]]{{.*}}store 4
	(24 lines not shown)
	; BIG: [[ARR0:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[ARR0_1]](s32), [[ARR0_0]](s32)			; BIG: [[ARR0:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[ARR0_1]](s32), [[ARR0_0]](s32)
	; CHECK: [[ARR1_0:%[0-9]+]]:_(s32) = COPY $r2			; CHECK: [[ARR1_0:%[0-9]+]]:_(s32) = COPY $r2
	; CHECK: [[ARR1_1:%[0-9]+]]:_(s32) = COPY $r3			; CHECK: [[ARR1_1:%[0-9]+]]:_(s32) = COPY $r3
	; LITTLE: [[ARR1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[ARR1_0]](s32), [[ARR1_1]](s32)			; LITTLE: [[ARR1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[ARR1_0]](s32), [[ARR1_1]](s32)
	; BIG: [[ARR1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[ARR1_1]](s32), [[ARR1_0]](s32)			; BIG: [[ARR1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[ARR1_1]](s32), [[ARR1_0]](s32)
	; CHECK: [[ARR2_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[ARR2_ID]]			; CHECK: [[ARR2_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[ARR2_ID]]
	; CHECK: [[ARR2:%[0-9]+]]:_(s64) = G_LOAD [[ARR2_FI]]{{.*}}load 8 from %fixed-stack.[[ARR2_ID]]			; CHECK: [[ARR2:%[0-9]+]]:_(s64) = G_LOAD [[ARR2_FI]]{{.*}}load 8 from %fixed-stack.[[ARR2_ID]]
	; CHECK: [[ARR_MERGED:%[0-9]+]]:_(s192) = G_MERGE_VALUES [[ARR0]](s64), [[ARR1]](s64), [[ARR2]](s64)			; CHECK: [[ARR_MERGED:%[0-9]+]]:_(s192) = G_MERGE_VALUES [[ARR0]](s64), [[ARR1]](s64), [[ARR2]](s64)
				; CHECK: [[EXT1:%[0-9]+]]:_(s64) = G_EXTRACT [[ARR_MERGED]](s192), 0
				; CHECK: [[EXT2:%[0-9]+]]:_(s64) = G_EXTRACT [[ARR_MERGED]](s192), 64
				; CHECK: [[EXT3:%[0-9]+]]:_(s64) = G_EXTRACT [[ARR_MERGED]](s192), 128
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s192) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s192) = G_INSERT [[IMPDEF]], [[EXT1]](s64), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s192) = G_INSERT [[INS1]], [[EXT2]](s64), 64
				; CHECK: [[INS3:%[0-9]+]]:_(s192) = G_INSERT [[INS2]], [[EXT3]](s64), 128
	; CHECK: ADJCALLSTACKDOWN 8, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKDOWN 8, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[ARR0:%[0-9]+]]:_(s64), [[ARR1:%[0-9]+]]:_(s64), [[ARR2:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES [[ARR_MERGED]](s192)			; CHECK: [[ARR0:%[0-9]+]]:_(s64), [[ARR1:%[0-9]+]]:_(s64), [[ARR2:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES [[INS3]](s192)
	; CHECK: [[ARR0_0:%[0-9]+]]:_(s32), [[ARR0_1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARR0]](s64)			; CHECK: [[ARR0_0:%[0-9]+]]:_(s32), [[ARR0_1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARR0]](s64)
	; LITTLE: $r0 = COPY [[ARR0_0]](s32)			; LITTLE: $r0 = COPY [[ARR0_0]](s32)
	; LITTLE: $r1 = COPY [[ARR0_1]](s32)			; LITTLE: $r1 = COPY [[ARR0_1]](s32)
	; BIG: $r0 = COPY [[ARR0_1]](s32)			; BIG: $r0 = COPY [[ARR0_1]](s32)
	; BIG: $r1 = COPY [[ARR0_0]](s32)			; BIG: $r1 = COPY [[ARR0_0]](s32)
	; CHECK: [[ARR1_0:%[0-9]+]]:_(s32), [[ARR1_1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARR1]](s64)			; CHECK: [[ARR1_0:%[0-9]+]]:_(s32), [[ARR1_1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARR1]](s64)
	; LITTLE: $r2 = COPY [[ARR1_0]](s32)			; LITTLE: $r2 = COPY [[ARR1_0]](s32)
	; LITTLE: $r3 = COPY [[ARR1_1]](s32)			; LITTLE: $r3 = COPY [[ARR1_1]](s32)
	; BIG: $r2 = COPY [[ARR1_1]](s32)			; BIG: $r2 = COPY [[ARR1_1]](s32)
	; BIG: $r3 = COPY [[ARR1_0]](s32)			; BIG: $r3 = COPY [[ARR1_0]](s32)
	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[ARR2_OFFSET:%[0-9]+]]:_(s32) = G_CONSTANT i32 0			; CHECK: [[ARR2_OFFSET:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
	; CHECK: [[ARR2_ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[ARR2_OFFSET]](s32)			; CHECK: [[ARR2_ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[ARR2_OFFSET]](s32)
	; CHECK: G_STORE [[ARR2]](s64), [[ARR2_ADDR]](p0){{.*}}store 8			; CHECK: G_STORE [[ARR2]](s64), [[ARR2_ADDR]](p0){{.*}}store 8
	; CHECK: BL @fp_arrays_aapcs_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit $r3, implicit-def $r0, implicit-def $r1			; CHECK: BL @fp_arrays_aapcs_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit $r3, implicit-def $r0, implicit-def $r1
	; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0			; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0
	; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK: [[R_MERGED:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)			; CHECK: [[R_MERGED:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)
	; CHECK: ADJCALLSTACKUP 8, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKUP 8, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[R_MERGED]](s64)			; CHECK: [[EXT4:%[0-9]+]]:_(s32) = G_EXTRACT [[R_MERGED]](s64), 0
				; CHECK: [[EXT5:%[0-9]+]]:_(s32) = G_EXTRACT [[R_MERGED]](s64), 32
				; CHECK: [[IMPDEF2:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF
				; CHECK: [[INS4:%[0-9]+]]:_(s64) = G_INSERT [[IMPDEF2]], [[EXT4]](s32), 0
				; CHECK: [[INS5:%[0-9]+]]:_(s64) = G_INSERT [[INS4]], [[EXT5]](s32), 32
				; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS5]](s64)
	; CHECK: $r0 = COPY [[R0]]			; CHECK: $r0 = COPY [[R0]]
	; CHECK: $r1 = COPY [[R1]]			; CHECK: $r1 = COPY [[R1]]
	; CHECK: BX_RET 14, $noreg, implicit $r0, implicit $r1			; CHECK: BX_RET 14, $noreg, implicit $r0, implicit $r1
	entry:			entry:
	%r = notail call arm_aapcscc [2 x float] @fp_arrays_aapcs_target([3 x double] %arr)			%r = notail call arm_aapcscc [2 x float] @fp_arrays_aapcs_target([3 x double] %arr)
	ret [2 x float] %r			ret [2 x float] %r
	}			}

	(19 lines not shown)
	; CHECK: [[Z1:%[0-9]+]]:_(s64) = G_LOAD [[Z1_FI]]{{.*}}load 8			; CHECK: [[Z1:%[0-9]+]]:_(s64) = G_LOAD [[Z1_FI]]{{.*}}load 8
	; CHECK: [[Z2_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[Z2_ID]]			; CHECK: [[Z2_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[Z2_ID]]
	; CHECK: [[Z2:%[0-9]+]]:_(s64) = G_LOAD [[Z2_FI]]{{.*}}load 8			; CHECK: [[Z2:%[0-9]+]]:_(s64) = G_LOAD [[Z2_FI]]{{.*}}load 8
	; CHECK: [[Z3_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[Z3_ID]]			; CHECK: [[Z3_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[Z3_ID]]
	; CHECK: [[Z3:%[0-9]+]]:_(s64) = G_LOAD [[Z3_FI]]{{.*}}load 8			; CHECK: [[Z3:%[0-9]+]]:_(s64) = G_LOAD [[Z3_FI]]{{.*}}load 8
	; CHECK: [[X_ARR:%[0-9]+]]:_(s192) = G_MERGE_VALUES [[X0]](s64), [[X1]](s64), [[X2]](s64)			; CHECK: [[X_ARR:%[0-9]+]]:_(s192) = G_MERGE_VALUES [[X0]](s64), [[X1]](s64), [[X2]](s64)
	; CHECK: [[Y_ARR:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[Y0]](s32), [[Y1]](s32), [[Y2]](s32)			; CHECK: [[Y_ARR:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[Y0]](s32), [[Y1]](s32), [[Y2]](s32)
	; CHECK: [[Z_ARR:%[0-9]+]]:_(s256) = G_MERGE_VALUES [[Z0]](s64), [[Z1]](s64), [[Z2]](s64), [[Z3]](s64)			; CHECK: [[Z_ARR:%[0-9]+]]:_(s256) = G_MERGE_VALUES [[Z0]](s64), [[Z1]](s64), [[Z2]](s64), [[Z3]](s64)
				; CHECK: [[EXT1:%[0-9]+]]:_(s64) = G_EXTRACT [[X_ARR]](s192), 0
				; CHECK: [[EXT2:%[0-9]+]]:_(s64) = G_EXTRACT [[X_ARR]](s192), 64
				; CHECK: [[EXT3:%[0-9]+]]:_(s64) = G_EXTRACT [[X_ARR]](s192), 128
				; CHECK: [[EXT4:%[0-9]+]]:_(s32) = G_EXTRACT [[Y_ARR]](s96), 0
				; CHECK: [[EXT5:%[0-9]+]]:_(s32) = G_EXTRACT [[Y_ARR]](s96), 32
				; CHECK: [[EXT6:%[0-9]+]]:_(s32) = G_EXTRACT [[Y_ARR]](s96), 64
				; CHECK: [[EXT7:%[0-9]+]]:_(s64) = G_EXTRACT [[Z_ARR]](s256), 0
				; CHECK: [[EXT8:%[0-9]+]]:_(s64) = G_EXTRACT [[Z_ARR]](s256), 64
				; CHECK: [[EXT9:%[0-9]+]]:_(s64) = G_EXTRACT [[Z_ARR]](s256), 128
				; CHECK: [[EXT10:%[0-9]+]]:_(s64) = G_EXTRACT [[Z_ARR]](s256), 192
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s192) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s192) = G_INSERT [[IMPDEF]], [[EXT1]](s64), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s192) = G_INSERT [[INS1]], [[EXT2]](s64), 64
				; CHECK: [[INS3:%[0-9]+]]:_(s192) = G_INSERT [[INS2]], [[EXT3]](s64), 128
				; CHECK: [[IMPDEF2:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF
				; CHECK: [[INS4:%[0-9]+]]:_(s96) = G_INSERT [[IMPDEF2]], [[EXT4]](s32), 0
				; CHECK: [[INS5:%[0-9]+]]:_(s96) = G_INSERT [[INS4]], [[EXT5]](s32), 32
				; CHECK: [[INS6:%[0-9]+]]:_(s96) = G_INSERT [[INS5]], [[EXT6]](s32), 64
				; CHECK: [[IMPDEF3:%[0-9]+]]:_(s256) = G_IMPLICIT_DEF
				; CHECK: [[INS7:%[0-9]+]]:_(s256) = G_INSERT [[IMPDEF3]], [[EXT7]](s64), 0
				; CHECK: [[INS8:%[0-9]+]]:_(s256) = G_INSERT [[INS7]], [[EXT8]](s64), 64
				; CHECK: [[INS9:%[0-9]+]]:_(s256) = G_INSERT [[INS8]], [[EXT9]](s64), 128
				; CHECK: [[INS10:%[0-9]+]]:_(s256) = G_INSERT [[INS9]], [[EXT10]](s64), 192
	; CHECK: ADJCALLSTACKDOWN 32, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKDOWN 32, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[X0:%[0-9]+]]:_(s64), [[X1:%[0-9]+]]:_(s64), [[X2:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES [[X_ARR]](s192)			; CHECK: [[X0:%[0-9]+]]:_(s64), [[X1:%[0-9]+]]:_(s64), [[X2:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES [[INS3]](s192)
	; CHECK: [[Y0:%[0-9]+]]:_(s32), [[Y1:%[0-9]+]]:_(s32), [[Y2:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[Y_ARR]](s96)			; CHECK: [[Y0:%[0-9]+]]:_(s32), [[Y1:%[0-9]+]]:_(s32), [[Y2:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS6]](s96)
	; CHECK: [[Z0:%[0-9]+]]:_(s64), [[Z1:%[0-9]+]]:_(s64), [[Z2:%[0-9]+]]:_(s64), [[Z3:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES [[Z_ARR]](s256)			; CHECK: [[Z0:%[0-9]+]]:_(s64), [[Z1:%[0-9]+]]:_(s64), [[Z2:%[0-9]+]]:_(s64), [[Z3:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES [[INS10]](s256)
	; CHECK: $d0 = COPY [[X0]](s64)			; CHECK: $d0 = COPY [[X0]](s64)
	; CHECK: $d1 = COPY [[X1]](s64)			; CHECK: $d1 = COPY [[X1]](s64)
	; CHECK: $d2 = COPY [[X2]](s64)			; CHECK: $d2 = COPY [[X2]](s64)
	; CHECK: $s6 = COPY [[Y0]](s32)			; CHECK: $s6 = COPY [[Y0]](s32)
	; CHECK: $s7 = COPY [[Y1]](s32)			; CHECK: $s7 = COPY [[Y1]](s32)
	; CHECK: $s8 = COPY [[Y2]](s32)			; CHECK: $s8 = COPY [[Y2]](s32)
	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[Z0_OFFSET:%[0-9]+]]:_(s32) = G_CONSTANT i32 0			; CHECK: [[Z0_OFFSET:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
	Show All 13 Lines
	; CHECK: G_STORE [[Z3]](s64), [[Z3_ADDR]](p0){{.*}}store 8			; CHECK: G_STORE [[Z3]](s64), [[Z3_ADDR]](p0){{.*}}store 8
	; CHECK: BL @fp_arrays_aapcs_vfp_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $d0, implicit $d1, implicit $d2, implicit $s6, implicit $s7, implicit $s8, implicit-def $s0, implicit-def $s1, implicit-def $s2, implicit-def $s3			; CHECK: BL @fp_arrays_aapcs_vfp_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $d0, implicit $d1, implicit $d2, implicit $s6, implicit $s7, implicit $s8, implicit-def $s0, implicit-def $s1, implicit-def $s2, implicit-def $s3
	; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $s0			; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $s0
	; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $s1			; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $s1
	; CHECK: [[R2:%[0-9]+]]:_(s32) = COPY $s2			; CHECK: [[R2:%[0-9]+]]:_(s32) = COPY $s2
	; CHECK: [[R3:%[0-9]+]]:_(s32) = COPY $s3			; CHECK: [[R3:%[0-9]+]]:_(s32) = COPY $s3
	; CHECK: [[R_MERGED:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32), [[R2]](s32), [[R3]](s32)			; CHECK: [[R_MERGED:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32), [[R2]](s32), [[R3]](s32)
	; CHECK: ADJCALLSTACKUP 32, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKUP 32, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32), [[R2:%[0-9]+]]:_(s32), [[R3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[R_MERGED]](s128)			; CHECK: [[EXT11:%[0-9]+]]:_(s32) = G_EXTRACT [[R_MERGED]](s128), 0
				; CHECK: [[EXT12:%[0-9]+]]:_(s32) = G_EXTRACT [[R_MERGED]](s128), 32
				; CHECK: [[EXT13:%[0-9]+]]:_(s32) = G_EXTRACT [[R_MERGED]](s128), 64
				; CHECK: [[EXT14:%[0-9]+]]:_(s32) = G_EXTRACT [[R_MERGED]](s128), 96
				; CHECK: [[IMPDEF4:%[0-9]+]]:_(s128) = G_IMPLICIT_DEF
				; CHECK: [[INS11:%[0-9]+]]:_(s128) = G_INSERT [[IMPDEF4]], [[EXT11]](s32), 0
				; CHECK: [[INS12:%[0-9]+]]:_(s128) = G_INSERT [[INS11]], [[EXT12]](s32), 32
				; CHECK: [[INS13:%[0-9]+]]:_(s128) = G_INSERT [[INS12]], [[EXT13]](s32), 64
				; CHECK: [[INS14:%[0-9]+]]:_(s128) = G_INSERT [[INS13]], [[EXT14]](s32), 96
				; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32), [[R2:%[0-9]+]]:_(s32), [[R3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS14]](s128)
	; CHECK: $s0 = COPY [[R0]]			; CHECK: $s0 = COPY [[R0]]
	; CHECK: $s1 = COPY [[R1]]			; CHECK: $s1 = COPY [[R1]]
	; CHECK: $s2 = COPY [[R2]]			; CHECK: $s2 = COPY [[R2]]
	; CHECK: $s3 = COPY [[R3]]			; CHECK: $s3 = COPY [[R3]]
	; CHECK: BX_RET 14, $noreg, implicit $s0, implicit $s1, implicit $s2, implicit $s3			; CHECK: BX_RET 14, $noreg, implicit $s0, implicit $s1, implicit $s2, implicit $s3
	entry:			entry:
	%r = notail call arm_aapcs_vfpcc [4 x float] @fp_arrays_aapcs_vfp_target([3 x double] %x, [3 x float] %y, [4 x double] %z)			%r = notail call arm_aapcs_vfpcc [4 x float] @fp_arrays_aapcs_vfp_target([3 x double] %x, [3 x float] %y, [4 x double] %z)
	ret [4 x float] %r			ret [4 x float] %r
	Show All 13 Lines
	; CHECK-DAG: [[R1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK-DAG: [[R1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK-DAG: [[R2:%[0-9]+]]:_(s32) = COPY $r2			; CHECK-DAG: [[R2:%[0-9]+]]:_(s32) = COPY $r2
	; CHECK-DAG: [[R3:%[0-9]+]]:_(s32) = COPY $r3			; CHECK-DAG: [[R3:%[0-9]+]]:_(s32) = COPY $r3
	; CHECK: [[FIRST_STACK_ELEMENT_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[FIRST_STACK_ID]]			; CHECK: [[FIRST_STACK_ELEMENT_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[FIRST_STACK_ID]]
	; CHECK: [[FIRST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_LOAD [[FIRST_STACK_ELEMENT_FI]]{{.*}}load 4 from %fixed-stack.[[FIRST_STACK_ID]]			; CHECK: [[FIRST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_LOAD [[FIRST_STACK_ELEMENT_FI]]{{.*}}load 4 from %fixed-stack.[[FIRST_STACK_ID]]
	; CHECK: [[LAST_STACK_ELEMENT_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[LAST_STACK_ID]]			; CHECK: [[LAST_STACK_ELEMENT_FI:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.[[LAST_STACK_ID]]
	; CHECK: [[LAST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_LOAD [[LAST_STACK_ELEMENT_FI]]{{.*}}load 4 from %fixed-stack.[[LAST_STACK_ID]]			; CHECK: [[LAST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_LOAD [[LAST_STACK_ELEMENT_FI]]{{.*}}load 4 from %fixed-stack.[[LAST_STACK_ID]]
	; CHECK: [[ARG_ARR:%[0-9]+]]:_(s768) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32), [[R2]](s32), [[R3]](s32), [[FIRST_STACK_ELEMENT]](s32), {{.*}}, [[LAST_STACK_ELEMENT]](s32)			; CHECK: [[ARG_ARR:%[0-9]+]]:_(s768) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32), [[R2]](s32), [[R3]](s32), [[FIRST_STACK_ELEMENT]](s32), {{.*}}, [[LAST_STACK_ELEMENT]](s32)
				; CHECK: [[INS:%[0-9]+]]:_(s768) = G_INSERT {{.*}}, {{.*}}(s32), 736
	; CHECK: ADJCALLSTACKDOWN 80, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKDOWN 80, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32), [[R2:%[0-9]+]]:_(s32), [[R3:%[0-9]+]]:_(s32), [[FIRST_STACK_ELEMENT:%[0-9]+]]:_(s32), {{.*}}, [[LAST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ARG_ARR]](s768)			; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32), [[R2:%[0-9]+]]:_(s32), [[R3:%[0-9]+]]:_(s32), [[FIRST_STACK_ELEMENT:%[0-9]+]]:_(s32), {{.*}}, [[LAST_STACK_ELEMENT:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS]](s768)
	; CHECK: $r0 = COPY [[R0]]			; CHECK: $r0 = COPY [[R0]]
	; CHECK: $r1 = COPY [[R1]]			; CHECK: $r1 = COPY [[R1]]
	; CHECK: $r2 = COPY [[R2]]			; CHECK: $r2 = COPY [[R2]]
	; CHECK: $r3 = COPY [[R3]]			; CHECK: $r3 = COPY [[R3]]
	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF_FIRST_ELEMENT:%[0-9]+]]:_(s32) = G_CONSTANT i32 0			; CHECK: [[OFF_FIRST_ELEMENT:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
	; CHECK: [[FIRST_STACK_ARG_ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF_FIRST_ELEMENT]](s32)			; CHECK: [[FIRST_STACK_ARG_ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF_FIRST_ELEMENT]](s32)
	; CHECK: G_STORE [[FIRST_STACK_ELEMENT]](s32), [[FIRST_STACK_ARG_ADDR]]{{.*}}store 4			; CHECK: G_STORE [[FIRST_STACK_ELEMENT]](s32), [[FIRST_STACK_ARG_ADDR]]{{.*}}store 4
	; Match the second-to-last offset, so we can get the correct SP for the last element			; Match the second-to-last offset, so we can get the correct SP for the last element
	; CHECK: G_CONSTANT i32 72			; CHECK: G_CONSTANT i32 72
	; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp			; CHECK: [[SP:%[0-9]+]]:_(p0) = COPY $sp
	; CHECK: [[OFF_LAST_ELEMENT:%[0-9]+]]:_(s32) = G_CONSTANT i32 76			; CHECK: [[OFF_LAST_ELEMENT:%[0-9]+]]:_(s32) = G_CONSTANT i32 76
	; CHECK: [[LAST_STACK_ARG_ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF_LAST_ELEMENT]](s32)			; CHECK: [[LAST_STACK_ARG_ADDR:%[0-9]+]]:_(p0) = G_GEP [[SP]], [[OFF_LAST_ELEMENT]](s32)
	; CHECK: G_STORE [[LAST_STACK_ELEMENT]](s32), [[LAST_STACK_ARG_ADDR]]{{.*}}store 4			; CHECK: G_STORE [[LAST_STACK_ELEMENT]](s32), [[LAST_STACK_ARG_ADDR]]{{.*}}store 4
	; CHECK: BL @tough_arrays_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit $r3, implicit-def $r0, implicit-def $r1			; CHECK: BL @tough_arrays_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit $r3, implicit-def $r0, implicit-def $r1
	; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0			; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0
	; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK: [[RES_ARR:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)			; CHECK: [[RES_ARR:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)
	; CHECK: ADJCALLSTACKUP 80, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKUP 80, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[RES_ARR]](s64)			; CHECK: [[EXT1:%[0-9]+]]:_(p0) = G_EXTRACT [[RES_ARR]](s64), 0
				; CHECK: [[EXT2:%[0-9]+]]:_(p0) = G_EXTRACT [[RES_ARR]](s64), 32
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF
				; CHECK: [[INS2:%[0-9]+]]:_(s64) = G_INSERT [[IMPDEF]], [[EXT1]](p0), 0
				; CHECK: [[INS3:%[0-9]+]]:_(s64) = G_INSERT [[INS2]], [[EXT2]](p0), 32
				; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS3]](s64)
	; CHECK: $r0 = COPY [[R0]]			; CHECK: $r0 = COPY [[R0]]
	; CHECK: $r1 = COPY [[R1]]			; CHECK: $r1 = COPY [[R1]]
	; CHECK: BX_RET 14, $noreg, implicit $r0, implicit $r1			; CHECK: BX_RET 14, $noreg, implicit $r0, implicit $r1
	entry:			entry:
	%r = notail call arm_aapcscc [2 x i32*] @tough_arrays_target([6 x [4 x i32]] %arr)			%r = notail call arm_aapcscc [2 x i32*] @tough_arrays_target([6 x [4 x i32]] %arr)
	ret [2 x i32*] %r			ret [2 x i32*] %r
	}			}

	declare arm_aapcscc {i32, i32} @structs_target({i32, i32})			declare arm_aapcscc {i32, i32} @structs_target({i32, i32})

	define arm_aapcscc {i32, i32} @test_structs({i32, i32} %x) {			define arm_aapcscc {i32, i32} @test_structs({i32, i32} %x) {
	; CHECK-LABEL: test_structs			; CHECK-LABEL: test_structs
	; CHECK: liveins: $r0, $r1			; CHECK: liveins: $r0, $r1
	; CHECK-DAG: [[X0:%[0-9]+]]:_(s32) = COPY $r0			; CHECK-DAG: [[X0:%[0-9]+]]:_(s32) = COPY $r0
	; CHECK-DAG: [[X1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK-DAG: [[X1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK: [[X:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[X0]](s32), [[X1]](s32)			; CHECK: [[X:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[X0]](s32), [[X1]](s32)
				; CHECK: [[EXT1:%[0-9]+]]:_(s32) = G_EXTRACT [[X]](s64), 0
				; CHECK: [[EXT2:%[0-9]+]]:_(s32) = G_EXTRACT [[X]](s64), 32
				; CHECK: [[IMPDEF:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF
				; CHECK: [[INS1:%[0-9]+]]:_(s64) = G_INSERT [[IMPDEF]], [[EXT1]](s32), 0
				; CHECK: [[INS2:%[0-9]+]]:_(s64) = G_INSERT [[INS1]], [[EXT2]](s32), 32
	; CHECK: ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[X0:%[0-9]+]]:_(s32), [[X1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[X]](s64)			; CHECK: [[X0:%[0-9]+]]:_(s32), [[X1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS2]](s64)
	; CHECK-DAG: $r0 = COPY [[X0]](s32)			; CHECK-DAG: $r0 = COPY [[X0]](s32)
	; CHECK-DAG: $r1 = COPY [[X1]](s32)			; CHECK-DAG: $r1 = COPY [[X1]](s32)
	; CHECK: BL @structs_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit-def $r0, implicit-def $r1			; CHECK: BL @structs_target, csr_aapcs, implicit-def $lr, implicit $sp, implicit $r0, implicit $r1, implicit-def $r0, implicit-def $r1
	; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0			; CHECK: [[R0:%[0-9]+]]:_(s32) = COPY $r0
	; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1			; CHECK: [[R1:%[0-9]+]]:_(s32) = COPY $r1
	; CHECK: [[R:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)			; CHECK: [[R:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[R0]](s32), [[R1]](s32)
	; CHECK: ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def $sp, implicit $sp			; CHECK: ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
	; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[R]](s64)			; CHECK: [[EXT3:%[0-9]+]]:_(s32) = G_EXTRACT [[R]](s64), 0
				; CHECK: [[EXT4:%[0-9]+]]:_(s32) = G_EXTRACT [[R]](s64), 32
				; CHECK: [[IMPDEF2:%[0-9]+]]:_(s64) = G_IMPLICIT_DEF
				; CHECK: [[INS3:%[0-9]+]]:_(s64) = G_INSERT [[IMPDEF2]], [[EXT3]](s32), 0
				; CHECK: [[INS4:%[0-9]+]]:_(s64) = G_INSERT [[INS3]], [[EXT4]](s32), 32
				; CHECK: [[R0:%[0-9]+]]:_(s32), [[R1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[INS4]](s64)
	; CHECK: $r0 = COPY [[R0]](s32)			; CHECK: $r0 = COPY [[R0]](s32)
	; CHECK: $r1 = COPY [[R1]](s32)			; CHECK: $r1 = COPY [[R1]](s32)
	; CHECK: BX_RET 14, $noreg, implicit $r0, implicit $r1			; CHECK: BX_RET 14, $noreg, implicit $r0, implicit $r1
	%r = notail call arm_aapcscc {i32, i32} @structs_target({i32, i32} %x)			%r = notail call arm_aapcscc {i32, i32} @structs_target({i32, i32} %x)
	ret {i32, i32} %r			ret {i32, i32} %r
	}			}

Revision Contents

Diff 143756

include/llvm/CodeGen/GlobalISel/IRTranslator.h

lib/CodeGen/GlobalISel/IRTranslator.cpp

lib/Target/AArch64/AArch64CallLowering.cpp

lib/Target/ARM/ARMCallLowering.cpp

test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll

test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll

test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll

test/CodeGen/AArch64/GlobalISel/call-translator.ll

test/CodeGen/AArch64/GlobalISel/irtranslator-exceptions.ll

test/CodeGen/AArch64/GlobalISel/legalize-exceptions.ll

test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll

test/CodeGen/ARM/GlobalISel/arm-param-lowering.ll
