This is an archive of the discontinued LLVM Phabricator instance.

Differential D74225

[AIX] Implement formal arguments passed in stack memory
Closed, Public

Authored by ZarkoCA on Feb 7 2020, 8:06 AM.

Details

Reviewers

cebowleratibm
sfertile

Commits

rGd68831266033: [PowerPC][AIX] Implement formal arguments passed in stack memory.

Summary

This patch is the callee side for https://reviews.llvm.org/D73209. It removes the fatal error when we pass more formal arguments than available registers.

Event Timeline

ZarkoCA created this revision. Feb 7 2020, 8:06 AM

Herald added subscribers: llvm-commits, jsji, kbarton and 2 others. Feb 7 2020, 8:06 AM

ZarkoCA edited the summary of this revision. Feb 7 2020, 8:13 AM

Herald added a subscriber: wuzish. Feb 7 2020, 8:13 AM

There's some trailing whitespace that emits warnings when I apply the patch. Probably a good idea to tidy it up. I'll continue to review the content of that patch.

$ git apply D74225.diff.txt
D74225.diff.txt:157: trailing whitespace.
; 32BIT: - { id: 3, type: default, offset: 80, size: 4
D74225.diff.txt:161: trailing whitespace.
; 32BIT: - { id: 7, type: default, offset: 64, size: 4
D74225.diff.txt:179: trailing whitespace.
; 64BIT: - { id: 0, type: default, offset: 168, size: 8
D74225.diff.txt:180: trailing whitespace.
; 64BIT: - { id: 1, type: default, offset: 160, size: 8
D74225.diff.txt:181: trailing whitespace.
; 64BIT: - { id: 2, type: default, offset: 152, size: 8
warning: squelched 10 whitespace errors
warning: 15 lines add whitespace errors.

Did a passthrough of the test changes with a number of comments. I expect you wanted to make some test updates, nonetheless I thought I'd document what I'd like to see in the test coverage. Will review code changes next.

llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1302	suggest using 32BIT-DAG for the liveins.
1311	I suggest 32BIT-DAG and reorder the memory locations in growing offset. IMO it makes it easier to read and understand the test.
1327	DAG.
1336	DAG and reorder.
1346	I would like to see the expected output for the assembly added. We should validate that the instructions emitted locate the arguments in the correct memory location. We should also strive for consistency with the other tests, which have expected assembly.
1374	suggest DAG.
1387	remove blank line or add blank line in the earlier test for the sake of consistency.
1400	I would also like to see the expected assembly here as well.
1401	Additional test requests: (1) one call that passes 1-, 2-, 4- and 8-byte ints and 4- and 8-byte floats in memory (mixed, with all sizes represented); (2) burn the first 8 GPRs on float args, then confirm integer args are passed in memory, e.g. foo(double, double, double, double, char, short, int).

Added assembly tests and fixed white space errors in tests.

This concludes my initial review. Mostly tidy up comments. The logic seems correct so semantically I think it's ok (for the incremental support that it adds.) I'll see if I can break it on the next round.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6957	I think it would be useful to add a !byVal assertion on the args just to make it clear the logic you've implemented is not intended to handle this case.
6959	whitespace. formatting.
6960	I think it's better if this def is placed closer to its use.
6975–6976	Since there doesn't appear to be any necessary handling to sign or zero extend on this code path, should there be an assertion?
6976	I'd rather see this pushed into the respective conditional blocks. It's not used outside so it's cleaner to contain it locally.
6979	I think this should be an assertion. CC_AIX should never push a loc which is neither reg nor mem. You could switch this up a bit: if (VA.isRegLoc()) { ...} assume(VA.isMemLoc) // no need for "if (VA.isMemLoc())"
7023–7024	formatting.
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1271	The convention we've used for cc tests so far is to do expected output for both callee and caller. I'd like to see the caller IR and expected output as well.

ZarkoCA planned changes to this revision. Feb 10 2020, 10:49 AM

ZarkoCA marked an inline comment as done.

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6979	I can change that to an assert. I was only following the precedent in LowerCall_AIX, which uses 'report_fatal_error'. I can change the code to: assert(VA.isRegLoc() || VA.isMemLoc()); if (VA.isRegLoc()) { register code ... } else { assert(VA.isMemLoc()); memory code ... } Does that work?
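
A minimal, standalone sketch of the control flow proposed in the comment above. It is not the patch's code: `ArgLoc` and `lowerOne` are hypothetical stand-ins for `CCValAssign` and the per-argument body of the lowering loop, used only to show where the two assertions sit.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for llvm::CCValAssign; only the location kind matters.
struct ArgLoc {
  enum Kind { Reg, Mem } kind;
  bool isRegLoc() const { return kind == Reg; }
  bool isMemLoc() const { return kind == Mem; }
};

// Returns a label for the lowering path an argument would take.
std::string lowerOne(const ArgLoc &VA) {
  // The calling convention should only produce register or memory locations,
  // so anything else is treated as a programming error, not a user error.
  assert((VA.isRegLoc() || VA.isMemLoc()) && "Unexpected argument location.");

  if (VA.isRegLoc())
    return "copy-from-register";

  // Not a register location, so it must be a memory location.
  assert(VA.isMemLoc());
  return "load-from-stack";
}
```

The second assertion documents the invariant on the fall-through path, so no separate `if (VA.isMemLoc())` test is needed.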

Added new testcases which include both caller and callee IR and assembly expected output.
Addressed comments:

Added ByVal assertion
Replaced no RegLoc/MemLoc fatal error with an assertion
Moved variable definitions closer to where they are used
Fixed formatting

ZarkoCA marked 20 inline comments as done. Feb 12 2020, 7:23 PM

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6957	Good idea, added.
6975–6976	I added `assert(ObjSize <= ArgSize);` is this what you had in mind? Or is a different assert specifically related to sign or zero extension needed?
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1271	Added both caller and callee for all tests.
1311	Agreed, it does look better once done that way.

ZarkoCA added inline comments. Feb 12 2020, 7:23 PM

llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1346	Added assembly expected output.
1401	Thanks, that's a really good testcase. I added it as `mix_callee` below along with the respective caller testcase.

Code semantics look ok. If you can post a revision to address the minor concerns I need to have one more pass-through of the test updates in detail.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7001	suggest continue here, then get rid of the else { }
7005	you already latched VA.getLocVT and VA.getValVT into locals you can reuse.
7007	I find the use of name "ObjSize" and "ArgSize" confusing. Suggest sticking to terminology "ValueSize" and "LocSize".
7010	suggest: // AIX objects are right justified because they are word written to stack memory from their register image.
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1311	suggest 32BIT-LABEL to tag the expected sections.
1350	Suggest adding a LABEL check for the assembly output. Applies multiple.
1351	suggest reordering these lwz closer to the reg use for readability.

Addressed comments and fixed testcases.

ZarkoCA marked 5 inline comments as done. Feb 18 2020, 3:22 AM

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7007	Thanks, I agree it's better calling them something else.

All comments are minor except for the concern on whether or not a truncation node is needed when ValSize < LocSize.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6846	I agree with this change though I plan that it will be committed on another change to correct the caller stackarg patch I authored previously. I think it's best to do this as an independent change with new test coverage for what was missed.
6850	this comment is now in the wrong place. It belongs with line 6844.
7010	updated suggestion: // Objects are right-justified because AIX is big-endian. I tend to avoid the use of the term word because in Power ISA it means 4 bytes. To others it means PtrByteSize.
7029	MinReservedArea is now a misnomer. This is effectively the LSA + PSA size, whatever the PSA size was evaluated to. Min is always 56 in 32-bit and 112 in 64-bit.
7029	Do we need to add a truncate node when ValueSize is less than LocSize? Suggest "ValSize".
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1301	32BIT-LABEL ?
1327	64BIT-LABEL probably better.
1337	64BIT-LABEL
1565	CHECK-LABEL
1580	CHECK-LABEL
1589	CHECK-LABEL
1817	LABEL
1823	LABEL
1832	LABEL
1846	LABEL
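
The size arithmetic behind the comment on line 7029 above (linkage area plus parameter save area, with minimums of 56 and 112 bytes) can be sketched as follows. This is an illustration only: the function names are hypothetical, and the 6-word linkage area is an assumption inferred from the quoted minimums and the 8-word parameter save area minimum mentioned in the diff.

```cpp
#include <algorithm>
#include <cassert>

// Pointer size in bytes for the two AIX modes.
constexpr unsigned ptrByteSize(bool IsPPC64) { return IsPPC64 ? 8u : 4u; }

// Assumption: the linkage area is 6 pointer-sized words (56 - 8*4 = 24 bytes
// in 32-bit mode, 112 - 8*8 = 48 bytes in 64-bit mode).
constexpr unsigned linkageSize(bool IsPPC64) { return 6 * ptrByteSize(IsPPC64); }

// A minimum of 8 words is reserved for the parameter save area (PSA).
constexpr unsigned minParamSaveArea(bool IsPPC64) {
  return 8 * ptrByteSize(IsPPC64);
}

// Hypothetical helper: linkage area plus the PSA, where the PSA is whatever
// the arguments need but never less than the 8-word minimum.
unsigned callerReservedArea(bool IsPPC64, unsigned ParamBytesNeeded) {
  return linkageSize(IsPPC64) +
         std::max(minParamSaveArea(IsPPC64), ParamBytesNeeded);
}
```

With no stack arguments this yields 56 bytes in 32-bit mode and 112 bytes in 64-bit mode, matching the minimums quoted in the review; a 32-bit function needing 40 bytes of parameters would reserve 64.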

ZarkoCA planned changes to this revision. Feb 20 2020, 11:59 AM

ZarkoCA marked 15 inline comments as done.

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6846	I agree, it wasn't meant to be included here initially. I added it when trying to figure out some of the offsets I was seeing. I will keep it in but let the follow up patch which includes this land first before I do any commits of this one.
7029	I think the name is used in all PPC targets because of the call to `FuncInfo->setMinReservedArea`, I agree with the comment though. I changed it to CallerReservedArea. I think if I do that, then we don't even need the comment.
7029	From what I've looked into, no other PPC targets add a truncate node in their respective LowerFormalArguments() functions. I did find it being done on x86 for i1s, however, and poked around to see why the difference is there. As far as I understand, we don't need to do it on PPC since PPCTargetLowering::LowerLOAD() will do that when we set the load node in LowerFormalArguments. The relevant code snippet from PPCTargetLowering::LowerLOAD() is below:
SDValue NewLD = DAG.getExtLoad(ISD::EXTLOAD, dl, getPointerTy(DAG.getDataLayout()), Chain, BasePtr, MVT::i8, MMO);
SDValue Result = DAG.getNode(ISD::TRUNCATE, dl, MVT::i1, NewLD);
I'm fairly sure that's the reason why we don't see truncation of loads from memory in the various PPC LowerFormalArguments() functions, since it's done in LowerLOAD for certain types.
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1301	Sorry, missed those in the initial update.
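
The i1 handling described in the reply on line 7029 above (an extending byte load followed by a truncate) can be modelled outside of SelectionDAG. This is only a sketch of the value semantics; `loadI1` is a hypothetical name and nothing here is LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Model of loading an i1 argument from memory: there is no 1-bit load, so a
// whole byte is read and then truncated to its low bit.
bool loadI1(const uint8_t *Slot) {
  uint8_t Byte = *Slot;    // byte-sized load standing in for the EXTLOAD
  return (Byte & 1u) != 0; // TRUNCATE to i1 keeps only the low bit
}
```

Because truncation keeps only the low bit, a byte holding 2 reads back as false in this model.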

Added Check-Label to testcases.
Changed variable names.
Fixed testcases to improve readability.

cebowleratibm added inline comments. Feb 24 2020, 8:48 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6844–6848	Braces are no longer necessary.
6853	trailing whitespace.
7025	I don't feel that this comment adds to understanding the code. I suggest removing it.
7027	I don't feel that this comment adds to understanding the code. I suggest removing it.
7035	formatting.

ZarkoCA marked 16 inline comments as done. Feb 24 2020, 10:09 AM

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7029	To add further, the special load handling is needed for i1s since we can't do i1-sized loads on Power. We extend to an i8 and then truncate. Also, in 64-bit mode for example, the caller will store parameters sized 4 bytes or less with 8-byte stores, sign- or zero-extended to 8 bytes. For the callee, we adjust the offset where the load happens so that it covers only the 4 bytes containing the argument and then do a 4-byte load from there. There shouldn't be a need to truncate in that case.
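
The offset adjustment described above comes down to one subtraction. A minimal sketch, assuming a big-endian slot in which the value is right-justified; `rightJustifiedOffset` is a hypothetical helper, not a function from the patch.

```cpp
#include <cassert>

// Offset of a ValSize-byte value stored right-justified in a LocSize-byte
// stack slot that starts at LocOffset. On a big-endian target the value
// occupies the highest-addressed bytes of the slot.
unsigned rightJustifiedOffset(unsigned LocOffset, unsigned LocSize,
                              unsigned ValSize) {
  assert(ValSize <= LocSize && "value cannot be larger than its slot");
  return LocOffset + (LocSize - ValSize);
}
```

For example, a 4-byte value in an 8-byte slot at offset 112 would be loaded from offset 116, and a value that fills its slot is loaded from the slot's own offset.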

Fixed trailing whitespace issues when applying the patch.
Fixed formatting and removed redundant comments.

I think we need some changes to ensure frame objects are created with their proper size and not word size. An i64 is passed as two i32s, but we should coalesce these into a single 8-byte object. Likewise, arguments smaller than word size are generating frame objects that are word size.
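
To make the i64 case concrete: in 32-bit mode the two halves occupy adjacent 4-byte slots, and because the target is big-endian the word at the lower offset is the most significant one. A small sketch of how the two words relate to the single 8-byte value (`coalesceI64` is a hypothetical name used only for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Rebuild a 64-bit value from the two 32-bit words it was split into.
// Big-endian layout: the word at the lower stack offset holds the most
// significant 32 bits.
uint64_t coalesceI64(uint32_t WordAtLowerOffset, uint32_t WordAtHigherOffset) {
  return (static_cast<uint64_t>(WordAtLowerOffset) << 32) | WordAtHigherOffset;
}
```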

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7010	I don't see the updated comment reflected in the current patch. I still see: // AIX objects are right justified because they are word written to stack // memory from their register image. and I would prefer this to say // Objects are right-justified because AIX is big-endian.
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1320	When I looked at Linux PPC32 I saw that they used 8-byte frame objects for the long long parameters. I think we should be consistent with the precedent.
1352	I got confused by why there's no load at 56 to account for the msw of the 9th parameter (i64), then realized the result is stored in an i32, so the optimizer is probably getting clever. To keep the test straightforward and avoid confusion, I'd suggest accumulating the result in an i64 so that we'll see (and can validate) the load at 56.
1356	My preference is to see the: add 3, 3, 5 add 3, 3, 6 add 3, 3, 7 ... as a block and coalesce the stack loads next to where they're used.
1382	The ASM32PWR4-DAG expects the add operations but they're omitted in the ASM64PWR4-DAG expected output. I don't strongly prefer either though I strongly prefer consistency.
1409	I usually name the toc load appending ADDR to indicate the register holds the address of the variable, then name the loads after to indicate they hold the value of the variable. Seems the naming you have here is backwards.
1438	Logically I like to see the parameter initializations in order. I find it easier to understand in the approach.
1499	would be a bit nicer to tidy the order of parameter reg inits.
1632	A little bit nicer if these are moved next to their use in the DAG block.
1693	why are all the other STW flagged "killed" and this one not?
1713	I don't think that it's useful to expect the constant pool loads given that the STW matches on SCRATCHREG. There's no validation of where each constant is written and we're not intending to test the constant loads anyways.
1781	REGDADDR
1784	lfd [[REGD:[0-9]+]], 0([[REGDADDR]])
1785	ditto on REGF1 and REGF2
1861	Should be size 2. Though the arg passes in 4 bytes, the object is 2 bytes at offset 58.

In D74225#1894003, @cebowleratibm wrote:

I think we need some changes to ensure frame objects are created with their proper size and not word size.

We have 2 choices for objects smaller than the save slot:

1. Follow XL's behavior and perform a load of the entire save slot, and create frame objects of this size to match. If we chose this we would have to build and insert 'AssertSExt' and 'AssertZExt' nodes and truncates appropriately. The truncates should get cleaned up downstream due to the assert-extended nodes.
2. Perform the offset adjustment and load using the load instruction with the correct extending type, and have frame objects created at the correct offset matching the size of the load.

It seems we are mixing these 2 right now, but I have to play with some smaller tests to make sure my understanding is correct.
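
The two options can be compared on a simulated big-endian save slot. The sketch below is an illustration under assumptions (an 8-byte slot written by the caller with a sign-extended 4-byte value); the helper names are hypothetical and none of this is LLVM code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read Size bytes starting at P as a big-endian unsigned integer.
uint64_t readBE(const uint8_t *P, size_t Size) {
  uint64_t V = 0;
  for (size_t I = 0; I < Size; ++I)
    V = (V << 8) | P[I];
  return V;
}

// Write the low Size bytes of V to P in big-endian order.
void writeBE(uint8_t *P, size_t Size, uint64_t V) {
  for (size_t I = 0; I < Size; ++I)
    P[Size - 1 - I] = static_cast<uint8_t>(V >> (8 * I));
}

// Option 1: load the entire 8-byte save slot, then truncate to 32 bits.
uint32_t loadWholeSlotThenTruncate(const uint8_t *Slot) {
  return static_cast<uint32_t>(readBE(Slot, 8));
}

// Option 2: skip the extension bytes and load only the 4 bytes that hold the
// right-justified value.
uint32_t loadAtAdjustedOffset(const uint8_t *Slot) {
  return static_cast<uint32_t>(readBE(Slot + (8 - 4), 4));
}
```

Both functions return the same 32-bit value for any slot contents, which is why the choice between them is about frame object sizes and the nodes emitted rather than about the value that is read.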

Having the caller and callee semantics in the same test is helpful, but it leads to having really large tests. Can we split the tests into two: one for when everything fits in registers and one for when we have to go to the stack?

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6846	I don't see the LocInfo being used on the MemLoc path for lower formal args, so this shouldn't semantically affect this patch. Is it included because it affects the lit tests in an observable way? If it doesn't, pull it out; if it does, then you can pull this patch into your branch and use just the diff of your changes for this review.
7031	Why don't we have to perform the adjustment for ppc32? IIUC ValSize is promoted to i32 for i16/i8, but what about i1s? They shouldn't be promoted by the generic selection dag code because they are a legal register type on PowerPC. At the very least we need a lit test that includes `i1 zeroext` passed in a stack slot. Sorry if there is one, I scanned through and don't see one but the test changes are 1000 lines long :)

ZarkoCA marked 21 inline comments as done. Feb 27 2020, 5:33 PM

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7010	I missed this, sorry about that.
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1320	I spent a while investigating this, and it looks like PPC32 and 32BIT AIX do the same thing here. Both create two 4-byte fixed frame objects for one i64 int.
AIX fixedStack:
- { id: 0, type: default, offset: 60, size: 4, alignment: 4, stack-id: default, isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
- { id: 1, type: default, offset: 56, size: 4, alignment: 8, stack-id: default, isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
PPC32 fixedStack:
- { id: 0, type: default, offset: 12, size: 4, alignment: 4, stack-id: default, isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
- { id: 1, type: default, offset: 8, size: 4, alignment: 8, stack-id: default, isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
However, if we don't run the IR under opt, I see that it adds allocas and stores for each of the parameters, and this is what I think creates local stack objects for all the parameters, including a size 8 object for the i64 parameter on both AIX and PPC32. On PPC32 and 32BIT AIX all of these local stack objects are at offset 0 with a negative local-offset; the one for the i64 parameter is:
- { id: 8, name: ll9.addr, type: default, offset: 0, size: 8, alignment: 8, stack-id: default, callee-saved-register: '', callee-saved-restored: true, local-offset: -40, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
I'm not sure what the use of these stack objects is, since they are eliminated if I run the IR under opt, which is what I've done to generate the IR for all test cases. I think they are related to lowering of the stack frame but I'm not entirely sure yet.
1352	My initial thought in doing it this way was to show that on AIX the upper word of the i64 holds the most significant values. But I agree, it's confusing and it's better to accumulate in an i64.
1356	I think it's best to remove these as per your comment and my reply below.
1382	This is inconsistent, you're right. I'm leaning toward only showing the loads and stores in these test cases to be more clear about what we are testing. It's also the fastest change to make.
1409	Thanks, fixed now.
1438	Yes, that's better, I agree.
1693	I'm not sure...I found that if I change the value that's held in that register (the double 1.000000e-01) to anything else, the register is then flagged "killed" like the rest. I will need to follow up with some more investigation to find what determines when a register will not be killed in a case like this.
1713	In this case here I'm using SCRATCHREG only to not hard code the registers in which we do the constant pool loads. Hypothetically, they can be any register and I don't want the test to break if any of those scratch registers change at any point in time. I could take them out altogether maybe?
1861	To me it looks like CC_AIX promotes i16 to i32, and the ValVT passed to getLoad() is i32, so I think this is why it does a 4-byte load at offset 56:
// Promote integers if needed.
if (ValVT.getSizeInBits() < RegVT.getSizeInBits())
LocInfo = ArgFlags.isSExt() ? CCValAssign::LocInfo::SExt : CCValAssign::LocInfo::ZExt;
Is this not correct in 32BIT mode? I see PPC32 do the same thing, and XLC on AIX also uses a 4-byte load (lwz) at offset 56 for an i16.
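
The promotion rule quoted in the reply on line 1861 can be restated as a standalone function. This is a sketch for illustration: the `LocInfo` enum and `chooseLocInfo` are simplified, hypothetical stand-ins for `CCValAssign::LocInfo` and the logic inside CC_AIX, and the `Full` default is an assumption about what is used when no promotion is needed.

```cpp
#include <cassert>

// Simplified stand-in for CCValAssign::LocInfo.
enum class LocInfo { Full, SExt, ZExt };

// Values narrower than the register are marked as sign- or zero-extended
// according to the argument's flags; values that already fill the register
// keep the default (assumed to be Full here).
LocInfo chooseLocInfo(unsigned ValBits, unsigned RegBits, bool IsSExt) {
  if (ValBits < RegBits)
    return IsSExt ? LocInfo::SExt : LocInfo::ZExt;
  return LocInfo::Full;
}
```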

Updated comment.
Fixed testcases:

Rearranged IR and assembly of parameters in which they are initialized.
Renamed filecheck variables to follow convention already established in aix-cc-abi.ll

ZarkoCA marked 2 inline comments as done. Feb 27 2020, 5:51 PM

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6846	It's not needed for the lower formal args but this patch includes caller tests which are affected.
7031	Sorry, @sfertile, I missed your comments, and you're right. The i1 zeroext is a case where we'll need to do exactly that. I will remove the IsPPC64 check and add a test for it.

Change the offset for args smaller than loc size on 32BIT too and add a bool stack test case.

In D74225#1894378, @sfertile wrote:

In D74225#1894003, @cebowleratibm wrote:

I think we need some changes to ensure frame objects are created with their proper size and not word size.

We have 2 choices for objects smaller than the save slot:

1. Follow XL's behavior and perform a load of the entire save slot, and create frame objects of this size to match. If we chose this we would have to build and insert 'AssertSExt' and 'AssertZExt' nodes and truncates appropriately. The truncates should get cleaned up downstream due to the assert-extended nodes.

2. Perform the offset adjustment and load using the load instruction with the correct extending type, and have frame objects created at the correct offset matching the size of the load.

It seems we are mixing these 2 right now, but I have to play with some smaller tests to make sure my understanding is correct.

On the callee side, for i32s on 64BIT and i1s in 32/64BIT we create fixed objects the size of the value type and then use the appropriate load (lw or lb). Except for i8 and i16s which I think are extended to fit the reg size. I think this is consistent with LLVM PPC32.

On the caller side we always do a PtrByteSize store.

Having the caller and callee semantics in the same test is helpful, but it leads to having really large tests. Can we split the tests into two: one for when everything fits in registers and one for when we have to go to the stack?

Maybe we can do an NFC patch which splits the tests?

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7031	I made it so we do the adjustment on 32BIT too and added a bool test case.

Fix whitespace errors, sorry for the noise.

Rebase off https://reviews.llvm.org/D75126 and fix expected output of caller side tests based on that patch.

Harbormaster completed remote builds in B48229: Diff 248527. Mar 5 2020, 10:58 AM

LGTM.

This revision is now accepted and ready to land. Mar 9 2020, 7:03 AM

Closed by commit rGd68831266033: [PowerPC][AIX] Implement formal arguments passed in stack memory. (authored by ZarkoCA, committed by sfertile). Mar 12 2020, 9:13 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/lib/Target/PowerPC/PPCISelLowering.cpp	83 lines
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll	993 lines

Diff 244319

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

@@ ... @@ static bool CC_AIX(unsigned ValNo, MVT ValVT, MVT LocVT,
   case MVT::i64:
     // i64 arguments should have been split to i32 for PPC32.
     assert(IsPPC64 && "PPC32 should have split i64 values.");
     LLVM_FALLTHROUGH;
   case MVT::i1:
   case MVT::i32: {
     const unsigned Offset = State.AllocateStack(PtrByteSize, PtrByteSize);
     const MVT RegVT = IsPPC64 ? MVT::i64 : MVT::i32;
     if (unsigned Reg = State.AllocateReg(IsPPC64 ? GPR_64 : GPR_32)) {
       // Promote integers if needed.
       if (ValVT.getSizeInBits() < RegVT.getSizeInBits())
         LocInfo = ArgFlags.isSExt() ? CCValAssign::LocInfo::SExt
                                     : CCValAssign::LocInfo::ZExt;
       State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, RegVT, LocInfo));
     }
     else
       State.addLoc(CCValAssign::getMem(ValNo, ValVT, Offset, RegVT, LocInfo));

     return false;
   }
   case MVT::f32:
   case MVT::f64: {
     // Parameter save area (PSA) is reserved even if the float passes in fpr.
     const unsigned StoreSize = LocVT.getStoreSize();
     // Floats are always 4-byte aligned in the PSA on AIX.
     // This includes f64 in 64-bit mode for ABI compatibility.
@@ ... @@ if (getTargetMachine().Options.GuaranteedTailCallOpt)
     report_fatal_error("Tail call support is unimplemented on AIX.");

   if (useSoftFloat())
     report_fatal_error("Soft float support is unimplemented on AIX.");

   const PPCSubtarget &Subtarget =
       static_cast<const PPCSubtarget &>(DAG.getSubtarget());
   if (Subtarget.hasQPX())
     report_fatal_error("QPX support is not supported on AIX.");

   const bool IsPPC64 = Subtarget.isPPC64();
   const unsigned PtrByteSize = IsPPC64 ? 8 : 4;

   // Assign locations to all of the incoming arguments.
   SmallVector<CCValAssign, 16> ArgLocs;
   MachineFunction &MF = DAG.getMachineFunction();
   CCState CCInfo(CallConv, isVarArg, MF, ArgLocs, *DAG.getContext());

+  const EVT PtrVT = getPointerTy(MF.getDataLayout());
   // Reserve space for the linkage area on the stack.
   const unsigned LinkageSize = Subtarget.getFrameLowering()->getLinkageSize();
-  // On AIX a minimum of 8 words is saved to the parameter save area.
-  const unsigned MinParameterSaveArea = 8 * PtrByteSize;
-  CCInfo.AllocateStack(LinkageSize + MinParameterSaveArea, PtrByteSize);
+  CCInfo.AllocateStack(LinkageSize, PtrByteSize);
   CCInfo.AnalyzeFormalArguments(Ins, CC_AIX);

for (CCValAssign &VA : ArgLocs) {		for (CCValAssign &VA : ArgLocs) {
		EVT ValVT = VA.getValVT();
		MVT LocVT = VA.getLocVT();
		ISD::ArgFlagsTy Flags = Ins[VA.getValNo()].Flags;
		cebowleratibmUnsubmitted Done Reply Inline Actions I'd rather see this pushed into the respective conditional blocks. It's not used outside so it's cleaner to contain it locally. cebowleratibm: I'd rather see this pushed into the respective conditional blocks. It's not used outside so…
		cebowleratibmUnsubmitted Done Reply Inline Actions Since there doesn't appear to be any necessary handling to sign or zero extend on this code path, should there be an assertion? cebowleratibm: Since there doesn't appear to be any necessary handling to sign or zero extend on this code…
		ZarkoCAAuthorUnsubmitted Done Reply Inline Actions I added `assert(ObjSize <= ArgSize);` is this what you had in mind? Or is a different assert specifically related to sign or zero extension needed? ZarkoCA: I added `assert(ObjSize <= ArgSize);` is this what you had in mind? Or is a different assert…
		assert(!Flags.isByVal() &&
		"Passing structure by value is unimplemented for formal arguments.");
		assert((VA.isRegLoc() \|\| VA.isMemLoc()) &&
		cebowleratibmUnsubmitted Done Reply Inline Actions I think this should be an assertion. CC_AIX should never push a loc which is neither reg nor mem. You could switch this up a bit: if (VA.isRegLoc()) { ...} assume(VA.isMemLoc) // no need for "if (VA.isMemLoc())" cebowleratibm: I think this should be an assertion. CC_AIX should never push a loc which is neither reg nor…
		ZarkoCAAuthorUnsubmitted Done Reply Inline Actions I can change that to an assert. I was only following the precedent in LowerCall_AIX which uses 'report_fatal_error'. I can check the code to: assert((VA.isRegLoc() \|\| VA.isMemLoc()) if (VA.isRegloc()){ register code ... }else { assert(VA.isMemLoc()) memory code ... } Does that work? ZarkoCA: I can change that to an assert. I was only following the precedent in LowerCall_AIX which uses…
		"Unexpected location for function call argument.");

if (VA.isMemLoc()) {
// For compatibility with the AIX XL compiler, the float args in the		// For compatibility with the AIX XL compiler, the float args in the
// parameter save area are initialized even if the argument is available		// parameter save area are initialized even if the argument is available
// in register. The caller is required to initialize both the register		// in register. The caller is required to initialize both the register
// and memory, however, the callee can choose to expect it in either. The		// and memory, however, the callee can choose to expect it in either.
// memloc is dismissed here because the argument is retrieved from the		// The memloc is dismissed here because the argument is retrieved from
// register.		// the register.
if (VA.needsCustom())		if (VA.isMemLoc() && VA.needsCustom())
continue;		continue;
report_fatal_error(
"Handling of formal arguments on the stack is unimplemented!");
}

assert(VA.isRegLoc() && "Unexpected argument location.");

EVT ValVT = VA.getValVT();		if (VA.isRegLoc()) {
MVT LocVT = VA.getLocVT();
MVT::SimpleValueType SVT = ValVT.getSimpleVT().SimpleTy;		MVT::SimpleValueType SVT = ValVT.getSimpleVT().SimpleTy;
unsigned VReg =		unsigned VReg =
MF.addLiveIn(VA.getLocReg(), getRegClassForSVT(SVT, IsPPC64));		MF.addLiveIn(VA.getLocReg(), getRegClassForSVT(SVT, IsPPC64));
SDValue ArgValue = DAG.getCopyFromReg(Chain, dl, VReg, LocVT);		SDValue ArgValue = DAG.getCopyFromReg(Chain, dl, VReg, LocVT);
if (ValVT.isScalarInteger() &&		if (ValVT.isScalarInteger() &&
(ValVT.getSizeInBits() < LocVT.getSizeInBits())) {		(ValVT.getSizeInBits() < LocVT.getSizeInBits())) {
ISD::ArgFlagsTy Flags = Ins[VA.getValNo()].Flags;
ArgValue =		ArgValue =
truncateScalarIntegerArg(Flags, ValVT, DAG, ArgValue, LocVT, dl);		truncateScalarIntegerArg(Flags, ValVT, DAG, ArgValue, LocVT, dl);
}		}
InVals.push_back(ArgValue);		InVals.push_back(ArgValue);
		cebowleratibm: Suggest a continue here, then get rid of the else { }.
		} else {
		assert(VA.isMemLoc());
		// Get the extended size of the argument type in stack.
		const unsigned ArgSize = VA.getLocVT().getStoreSize();
		cebowleratibm: You already latched VA.getLocVT and VA.getValVT into locals you can reuse.
		// Get the actual size of the argument type.
		const unsigned ObjSize = VA.getValVT().getStoreSize();
		cebowleratibm: I find the use of the names "ObjSize" and "ArgSize" confusing. Suggest sticking to the terminology "ValueSize" and "LocSize".
		ZarkoCA (Author): Thanks, I agree it's better to call them something else.
		assert(ObjSize <= ArgSize);
		int CurArgOffset = VA.getLocMemOffset();
		// Objects in AIX are right justified.
		cebowleratibm: Suggest: // AIX objects are right justified because they are word written to stack memory from their register image.
		cebowleratibm: Updated suggestion: // Objects are right-justified because AIX is big-endian. I tend to avoid the use of the term "word" because in Power ISA it means 4 bytes. To others it means PtrByteSize.
		cebowleratibm: I don't see the updated comment reflected in the current patch. I still see: // AIX objects are right justified because they are word written to stack // memory from their register image. and I would prefer this to say: // Objects are right-justified because AIX is big-endian.
		ZarkoCA (Author): I missed this, sorry about that.
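The right-justification rule discussed in this thread can be sketched outside of LLVM as follows. This is a minimal stand-alone illustration rather than the patch's code: `rightJustifiedOffset` is a hypothetical helper, and it assumes a big-endian target where a value narrower than its stack slot occupies the highest-addressed bytes of that slot.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): returns the byte offset at which a
// value of ValSize bytes starts inside a LocSize-byte stack slot beginning
// at SlotOffset, assuming big-endian (right-justified) placement.
inline int64_t rightJustifiedOffset(int64_t SlotOffset, unsigned LocSize,
                                    unsigned ValSize) {
  // A value can never be wider than the slot that holds it.
  assert(ValSize <= LocSize && "value cannot be larger than its slot");
  // Skip the padding bytes that precede the value in the slot.
  return SlotOffset + (LocSize - ValSize);
}
```

With the 64-bit offsets used in aix-cc-abi.ll, a 4-byte value in the 8-byte slot at offset 136 would be read from offset 140, which is consistent with the `lwz ..., 140(1)` check in the expected assembly.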
		if (ArgSize < ObjSize)
		CurArgOffset += ArgSize - ObjSize;
		MachineFrameInfo &MFI = MF.getFrameInfo();
		// Potential tail calls could cause overwriting of argument stack slots.
		const bool IsImmutable =
		!(getTargetMachine().Options.GuaranteedTailCallOpt &&
		(CallConv == CallingConv::Fast));
		int FI = MFI.CreateFixedObject(ArgSize, CurArgOffset, IsImmutable);
		SDValue FIN = DAG.getFrameIndex(FI, PtrVT);
		SDValue ArgValue =
		DAG.getLoad(VA.getLocVT(), dl, Chain, FIN, MachinePointerInfo());
		InVals.push_back(ArgValue);
		}
}		}
		cebowleratibm: Formatting.

		cebowleratibm: I don't feel that this comment adds to understanding the code. I suggest removing it.
		// On AIX a minimum of 8 words is saved to the parameter save area.
		const unsigned MinParameterSaveArea = 8 * PtrByteSize;
		cebowleratibm: I don't feel that this comment adds to understanding the code. I suggest removing it.
// Area that is at least reserved in the caller of this function.		// Area that is at least reserved in the caller of this function.
unsigned MinReservedArea = CCInfo.getNextStackOffset();		unsigned MinReservedArea =
		cebowleratibm: MinReservedArea is now a misnomer. This is effectively the LSA + PSA size, whatever the PSA size was evaluated to. The minimum is always 56 in 32-bit and 112 in 64-bit.
		ZarkoCA (Author): I think the name is used in all PPC targets because of the call to `FuncInfo->setMinReservedArea`. I agree with the comment though. I changed it to CallerReservedArea. I think if I do that, then we don't even need the comment.
		cebowleratibm: Do we need to add a truncate node when ValueSize is less than LocSize? Suggest "ValSize".
		ZarkoCA (Author): From what I've looked into, no other PPC targets add a truncate node in their respective LowerFormalArguments() functions. I did find it being done on x86 for i1s, however, and poked around to see why the difference is there. As far as I understand, we don't need to do it on PPC since PPCTargetLowering::LowerLOAD() will do that when we set the load node in LowerFormalArguments. The relevant code snippet from PPCTargetLowering::LowerLOAD() is below:
		    SDValue NewLD = DAG.getExtLoad(ISD::EXTLOAD, dl, getPointerTy(DAG.getDataLayout()), Chain, BasePtr, MVT::i8, MMO);
		    SDValue Result = DAG.getNode(ISD::TRUNCATE, dl, MVT::i1, NewLD);
		I'm fairly sure that's the reason why we don't see truncation of loads from memory in the various PPC LowerFormalArguments() functions, since it's done in LowerLOAD for certain types.
		ZarkoCA (Author): To add further, the special load handling is needed for i1s since we can't do i1-sized loads on Power. We extend to an i8 and then truncate. Also, in 64-bit mode for example, the caller stores parameters of 4 bytes or less with 8-byte stores, sign- or zero-extended to 8 bytes. For the callee, we adjust the offset where the load happens so that it covers only the 4 bytes containing the argument, and then do a 4-byte load from there. There shouldn't be a need to truncate in that case.
		std::max(CCInfo.getNextStackOffset(), LinkageSize + MinParameterSaveArea);
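The minimum sizes quoted in the thread above (56 bytes in 32-bit, 112 bytes in 64-bit) follow from the `std::max` expression. The sketch below models that arithmetic in isolation. `callerReservedArea` is a hypothetical function, and the six-word linkage area is an assumption inferred from the reviewer's figures rather than something stated in the patch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-alone model (not LLVM code) of the reserved-area
// computation. NextStackOffset plays the role of
// CCInfo.getNextStackOffset(), which already includes the linkage area.
inline unsigned callerReservedArea(unsigned NextStackOffset, bool IsPPC64) {
  const unsigned PtrByteSize = IsPPC64 ? 8 : 4;
  // Assumption: the linkage area is six pointer-sized words.
  const unsigned LinkageSize = 6 * PtrByteSize;
  // At least eight pointer-sized words of parameter save area.
  const unsigned MinParameterSaveArea = 8 * PtrByteSize;
  assert(LinkageSize + MinParameterSaveArea > 0);
  return std::max(NextStackOffset, LinkageSize + MinParameterSaveArea);
}
```

Under these assumptions a function with no stack-passed arguments reserves 56 or 112 bytes, while a larger argument list simply keeps its own offset. The 96 and 176 used in the tests correspond to the `ADJCALLSTACKDOWN 96` and `stdu 1, -176(1)` lines for `caller_ints_stack`.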

		sfertile: Why don't we have to perform the adjustment for ppc32? IIUC ValSize is promoted to i32 for i16/i8, but what about i1s? They shouldn't be promoted by the generic selection DAG code because they are a legal register type on PowerPC. At the very least we need a lit test that includes `i1 zeroext` passed in a stack slot. Sorry if there is one, I scanned through and don't see one, but the test changes are 1000 lines long :)
		ZarkoCA (Author): Sorry, @sfertile, I missed your comments and you're right. The i1 zeroext is a case where we'll need to do exactly that. I will remove the IsPPC64 check and add a test for it.
		ZarkoCA (Author): I made it so we do the adjustment on 32BIT too and added a bool test case.
// Set the size that is at least reserved in caller of this function. Tail		// Set the size that is at least reserved in caller of this function. Tail
// call optimized function's reserved stack space needs to be aligned so		// call optimized function's reserved stack space needs to be aligned so
// that taking the difference between two stack areas will result in an		// that taking the difference between two stack areas will result in an
// aligned stack.		// aligned stack.
		cebowleratibm: Formatting.
MinReservedArea =		MinReservedArea =
EnsureStackAlignment(Subtarget.getFrameLowering(), MinReservedArea);		EnsureStackAlignment(Subtarget.getFrameLowering(), MinReservedArea);
PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();		PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
FuncInfo->setMinReservedArea(MinReservedArea);		FuncInfo->setMinReservedArea(MinReservedArea);

return Chain;		return Chain;
}		}


llvm/test/CodeGen/PowerPC/aix-cc-abi.ll

	; ASM64PWR4-DAG: li 7, 5
	; ASM64PWR4-DAG: li 8, 6
	; ASM64PWR4-DAG: li 9, 7
	; ASM64PWR4-DAG: stfd 1, 120(1)
	; ASM64PWR4-DAG: ld 10, 120(1)
	; ASM64PWR4-NEXT: bl .test_stackarg_float3
	; ASM64PWR4-NEXT: nop
	; ASM64PWR4-NEXT: addi 1, 1, 128


				define i32 @test_ints_stack(i32 %i1, i32 %i2, i32 %i3, i32 %i4, i32 %i5, i32 %i6, i32 %i7, i32 %i8, i64 %ll9, i16 signext %s10, i8 zeroext %c11, i32 %ui12, i32 %si13, i64 %ll14, i8 zeroext %uc15, i32 %i16) {
				cebowleratibm: The convention we've used for cc tests so far is to check expected output for both callee and caller. I'd like to see the caller IR and expected output as well.
				ZarkoCA (Author): Added both caller and callee for all tests.
				entry:
				%add = add nsw i32 %i1, %i2
				%add1 = add nsw i32 %add, %i3
				%add2 = add nsw i32 %add1, %i4
				%add3 = add nsw i32 %add2, %i5
				%add4 = add nsw i32 %add3, %i6
				%add5 = add nsw i32 %add4, %i7
				%add6 = add nsw i32 %add5, %i8
				%conv = sext i32 %add6 to i64
				%add7 = add nsw i64 %conv, %ll9
				%conv8 = sext i16 %s10 to i64
				%add9 = add nsw i64 %add7, %conv8
				%conv10 = zext i8 %c11 to i64
				%add11 = add nsw i64 %add9, %conv10
				%conv12 = zext i32 %ui12 to i64
				%add13 = add nsw i64 %add11, %conv12
				%conv14 = sext i32 %si13 to i64
				%add15 = add nsw i64 %add13, %conv14
				%add16 = add nsw i64 %add15, %ll14
				%conv17 = zext i8 %uc15 to i64
				%add18 = add nsw i64 %add16, %conv17
				%conv19 = sext i32 %i16 to i64
				%add20 = add nsw i64 %add18, %conv19
				%conv21 = trunc i64 %add20 to i32
				ret i32 %conv21
				}

				; CHECK-LABEL: name: test_ints_stack

				; 32BIT-DAG: liveins:
				cebowleratibm: 32BIT-LABEL?
				ZarkoCA (Author): Sorry, missed those in the initial update.
				; 32BIT-DAG: - { reg: '$r3', virtual-reg: '' }
				cebowleratibm: Suggest using 32BIT-DAG for the liveins.
				; 32BIT-DAG: - { reg: '$r4', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r5', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r6', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r7', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r8', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r9', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r10', virtual-reg: '' }

				; 32BIT-DAG: fixedStack:
				cebowleratibm: I suggest 32BIT-DAG and reordering the memory locations by growing offset. IMO it makes it easier to read and understand the test.
				ZarkoCA (Author): Agreed, it does look better once done that way.
				cebowleratibm: Suggest 32BIT-LABEL to tag the expected sections.
				; 32BIT-DAG: - { id: 9, type: default, offset: 56, size: 4
				; 32BIT-DAG: - { id: 8, type: default, offset: 60, size: 4
				; 32BIT-DAG: - { id: 7, type: default, offset: 64, size: 4
				; 32BIT-DAG: - { id: 6, type: default, offset: 68, size: 4
				; 32BIT-DAG: - { id: 5, type: default, offset: 72, size: 4
				; 32BIT-DAG: - { id: 4, type: default, offset: 76, size: 4
				; 32BIT-DAG: - { id: 3, type: default, offset: 80, size: 4
				; 32BIT-DAG: - { id: 2, type: default, offset: 84, size: 4
				; 32BIT-DAG: - { id: 1, type: default, offset: 88, size: 4
				cebowleratibm [Not Done]: When I looked at Linux PPC32 I saw that they used 8-byte frame objects for the long long parameters. I think we should be consistent with the precedent.
				ZarkoCA (Author): I spent a while investigating this and it looks like PPC32 and 32BIT AIX do the same thing here. I see that both create two 4-byte fixed frame objects for one i64 int:
				AIX fixedStack:
				- { id: 0, type: default, offset: 60, size: 4, alignment: 4, stack-id: default, isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
				- { id: 1, type: default, offset: 56, size: 4, alignment: 8, stack-id: default, isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
				PPC32 fixedStack:
				- { id: 0, type: default, offset: 12, size: 4, alignment: 4, stack-id: default, isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
				- { id: 1, type: default, offset: 8, size: 4, alignment: 8, stack-id: default, isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
				However, if we don't run the IR under opt, I see that allocas and stores are added for each of the parameters, and I think this is what creates local stack objects for all the parameters, including one of size 8 for the i64 parameter on both AIX and PPC32.
				On PPC32 and 32BIT AIX all of these local stack objects are at offset 0 with a negative local-offset, for example:
				- { id: 8, name: ll9.addr, type: default, offset: 0, size: 8, alignment: 8, stack-id: default, callee-saved-register: '', callee-saved-restored: true, local-offset: -40, debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
				I'm not sure what the use of these stack objects is, since they are eliminated if I run the IR under opt, which is what I've done to generate the IR for all test cases. I think they are related to lowering of the stack frame but I'm not entirely sure yet.
				; 32BIT-DAG: - { id: 0, type: default, offset: 92, size: 4
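The two 4-byte fixed objects at offsets 56 and 60 discussed above hold the two halves of a single i64. The following sketch shows how such a value would be laid out, assuming big-endian word order (most significant word at the lower offset). `splitBigEndianI64`, `WordAt` and `SplitI64` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// One 32-bit word together with the stack offset it would occupy.
struct WordAt {
  int64_t Offset;
  uint32_t Value;
};

// The two halves of a 64-bit argument passed in 32-bit mode.
struct SplitI64 {
  WordAt Hi; // most significant word, lower address
  WordAt Lo; // least significant word, higher address
};

// Hypothetical helper (not an LLVM API): splits V into the two words a
// big-endian 32-bit target would place at SlotOffset and SlotOffset + 4.
inline SplitI64 splitBigEndianI64(int64_t SlotOffset, uint64_t V) {
  const uint32_t Hi = static_cast<uint32_t>(V >> 32);
  const uint32_t Lo = static_cast<uint32_t>(V & 0xffffffffu);
  assert(((static_cast<uint64_t>(Hi) << 32) | Lo) == V);
  return {{SlotOffset, Hi}, {SlotOffset + 4, Lo}};
}
```

For the ninth parameter of `test_ints_stack`, the most significant word would sit at offset 56 and the least significant word at offset 60, so a test that truncates the sum to i32 may only ever need the load from 60.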

				; 32BIT-DAG: body: |
				; 32BIT-DAG: bb.0.entry:
				; 32BIT-NEXT: liveins: $r3, $r4, $r5, $r6, $r7, $r8, $r9, $r10

				; 64BIT-DAG: liveins:
				cebowleratibm: DAG.
				cebowleratibm: 64BIT-LABEL is probably better.
				; 64BIT-DAG: - { reg: '$x3', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x4', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x5', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x6', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x7', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x8', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x9', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x10', virtual-reg: '' }

				cebowleratibm: DAG and reorder.
				; 64BIT-DAG: fixedStack:
				cebowleratibm: 64BIT-LABEL
				; 64BIT-DAG: - { id: 7, type: default, offset: 112, size: 8
				; 64BIT-DAG: - { id: 6, type: default, offset: 120, size: 8
				; 64BIT-DAG: - { id: 5, type: default, offset: 128, size: 8
				; 64BIT-DAG: - { id: 4, type: default, offset: 136, size: 8
				; 64BIT-DAG: - { id: 3, type: default, offset: 144, size: 8
				; 64BIT-DAG: - { id: 2, type: default, offset: 152, size: 8
				; 64BIT-DAG: - { id: 1, type: default, offset: 160, size: 8
				; 64BIT-DAG: - { id: 0, type: default, offset: 168, size: 8
				; 64BIT-DAG: body: |
				cebowleratibm: I would like to see the expected output for the assembly added. We should validate that the instructions emitted locate the arguments in the correct memory location. We should also strive for consistency with the other tests, which have expected assembly.
				ZarkoCA (Author): Added assembly expected output.
				; 64BIT-DAG: bb.0.entry:
				; 64BIT-NEXT: liveins: $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10

				; ASM32PWR4-DAG: add 3, 3, 4
				cebowleratibm: Suggest adding a LABEL check for the assembly output. Applies in multiple places.
				; ASM32PWR4-DAG: lwz [[REG1:[0-9]+]], 60(1)
				cebowleratibm: Suggest reordering these lwz closer to the reg use for readability.
				; ASM32PWR4-DAG: add 3, 3, 5
				cebowleratibm: I got confused by why there's no load at 56 to account for the msw of the 9th parameter (i64), then realized the result is stored in an i32, so the optimizer is probably getting clever. To keep the test straightforward and avoid confusion, I'd suggest accumulating the result in an i64 so that we'll see (and can validate) the load at 56.
				ZarkoCA (Author): My initial thought in doing it this way was to show that on AIX the upper word of the i64 holds the most significant values. But I agree, it's confusing and it's better to accumulate in an i64.
				; ASM32PWR4-DAG: add 3, 3, 6
				; ASM32PWR4-DAG: add 3, 3, 7
				; ASM32PWR4-DAG: lwz [[REG2:[0-9]+]], 64(1)
				; ASM32PWR4-DAG: add 3, 3, 8
				cebowleratibm: My preference is to see the: add 3, 3, 5 add 3, 3, 6 add 3, 3, 7 ... as a block and coalesce the stack loads next to where they're used.
				ZarkoCA (Author): I think it's best to remove these as per your comment and my reply below.
				; ASM32PWR4-DAG: add 3, 3, 9
				; ASM32PWR4-DAG: lwz [[REG3:[0-9]+]], 68(1)
				; ASM32PWR4-DAG: add 3, 3, 10
				; ASM32PWR4-DAG: add 3, 3, [[REG1]]
				; ASM32PWR4-DAG: add 3, 3, [[REG2]]
				; ASM32PWR4-DAG: lwz [[REG4:[0-9]+]], 72(1)
				; ASM32PWR4-DAG: add 3, 3, [[REG3]]
				; ASM32PWR4-DAG: lwz [[REG5:[0-9]+]], 76(1)
				; ASM32PWR4-DAG: add 3, 3, [[REG4]]
				; ASM32PWR4-DAG: lwz [[REG6:[0-9]+]], 84(1)
				; ASM32PWR4-DAG: add 3, 3, [[REG5]]
				; ASM32PWR4-DAG: lwz [[REG7:[0-9]+]], 88(1)
				; ASM32PWR4-DAG: add 3, 3, [[REG6]]
				; ASM32PWR4-DAG: lwz [[REG8:[0-9]+]], 92(1)
				; ASM32PWR4-DAG: add 3, 3, [[REG7]]
				; ASM32PWR4-DAG: add 3, 3, [[REG8]]

				; ASM64PWR4-DAG: add 3, 3, 4
				cebowleratibm: Suggest DAG.
				; ASM64PWR4-DAG: ld [[REG1:[0-9]+]], 112(1)
				; ASM64PWR4-DAG: add 3, 3, 5
				; ASM64PWR4-DAG: add 3, 3, 6
				; ASM64PWR4-DAG: add 3, 3, 7
				; ASM64PWR4-DAG: ld [[REG2:[0-9]+]], 120(1)
				; ASM64PWR4-DAG: add 3, 3, 8
				; ASM64PWR4-DAG: add 3, 3, 9
				; ASM64PWR4-DAG: add 3, 3, 10
				cebowleratibm: The ASM32PWR4-DAG checks expect the add operations but they're omitted in the ASM64PWR4-DAG expected output. I don't strongly prefer either, though I strongly prefer consistency.
				ZarkoCA (Author): This is inconsistent, you're right. I'm leaning toward only showing the loads and stores in these test cases to be more clear about what we are testing. It's also the fastest change to make.
				; ASM64PWR4-DAG: std 30, -8(1)
				; ASM64PWR4-DAG: extsw 3, 3
				; ASM64PWR4-DAG: add 3, 3, [[REG1]]
				; ASM64PWR4-DAG: ld [[REG3:[0-9]+]], 128(1)
				; ASM64PWR4-DAG: add 3, 3, [[REG2]]
				cebowleratibm: Remove the blank line, or add a blank line in the earlier test, for the sake of consistency.
				; ASM64PWR4-DAG: lwz [[REG4:[0-9]+]], 140(1)
				; ASM64PWR4-DAG: add 3, 3, [[REG3]]
				; ASM64PWR4-DAG: lwa [[REG5:[0-9]+]], 148(1)
				; ASM64PWR4-DAG: add 3, 3, [[REG4]]
				; ASM64PWR4-DAG: add 3, 3, [[REG5]]
				; ASM64PWR4-DAG: ld [[REG6:[0-9]+]], 152(1)
				; ASM64PWR4-DAG: ld [[REG7:[0-9]+]], 160(1)
				; ASM64PWR4-DAG: add 3, 3, [[REG6]]
				; ASM64PWR4-DAG: lwa [[REG8:[0-9]+]], 172(1)
				; ASM64PWR4-DAG: add 3, 3, [[REG7]]
				; ASM64PWR4-DAG: add 3, 3, [[REG8]]
				; ASM64PWR4-DAG: ld 30, -8(1)

				cebowleratibm: I would also like to see the expected assembly here as well.
				@ll1 = common global i64 0, align 8
				cebowleratibm: Additional test requests:
				- 1 call that passes 1, 2, 4, 8 byte ints and 4, 8 byte floats in memory (mixed, with all sizes represented).
				- Burn the first 8 GPRs on float args, then confirm integer args are passed in memory. Example: foo(double, double, double, double, char, short, int)
				ZarkoCA (Author): Thanks, that's a really good testcase. I added it as `mix_callee` below along with the respective caller testcase.
				@si1 = common global i16 0, align 2
				@ch = common global i8 0, align 1
				@ui = common global i32 0, align 4
				@sint = common global i32 0, align 4
				@ll2 = common global i64 0, align 8
				@uc1 = common global i8 0, align 1
				@i1 = common global i32 0, align 4
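The reviewer's `foo(double, double, double, double, char, short, int)` request relies on floating-point arguments consuming the GPR-shadowed parameter slots. The sketch below is a deliberately simplified model of that slot assignment, not the CC_AIX implementation: `assignSlots` and `ArgSlot` are hypothetical, the six-word linkage area is an assumption, and an argument is flagged as in memory purely by whether its first slot lies past the first eight pointer-sized slots.

```cpp
#include <cassert>
#include <vector>

// Where an argument's parameter save area slot begins, and whether that
// slot lies beyond the eight slots shadowed by GPRs.
struct ArgSlot {
  unsigned Offset;
  bool InMemory;
};

// Hypothetical, simplified model (not CC_AIX): every argument takes a whole
// number of pointer-sized slots, and floating-point arguments consume slots
// exactly like integers of the same size.
inline std::vector<ArgSlot> assignSlots(const std::vector<unsigned> &ArgSizes,
                                        bool IsPPC64) {
  const unsigned PtrByteSize = IsPPC64 ? 8 : 4;
  const unsigned LinkageSize = 6 * PtrByteSize; // assumption: six words
  const unsigned NumGPRSlots = 8;
  std::vector<ArgSlot> Result;
  unsigned NextSlot = 0;
  for (unsigned Size : ArgSizes) {
    assert(Size > 0 && "arguments must have a non-zero size");
    // Round the argument size up to whole slots.
    const unsigned Slots = (Size + PtrByteSize - 1) / PtrByteSize;
    Result.push_back(
        {LinkageSize + NextSlot * PtrByteSize, NextSlot >= NumGPRSlots});
    NextSlot += Slots;
  }
  return Result;
}
```

In this model, calling `assignSlots({8, 8, 8, 8, 1, 2, 4}, false)` uses all eight slots on the four doubles in 32-bit mode, so the char, short and int start at offsets 56, 60 and 64. In 64-bit mode the same signature fits in seven slots, so a longer argument list would be needed to push integers onto the stack.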

				cebowleratibm: I usually name the TOC load by appending ADDR to indicate the register holds the address of the variable, then name the loads after it to indicate they hold the value of the variable. It seems the naming you have here is backwards.
				ZarkoCA (Author): Thanks, fixed now.
				define void @caller_ints_stack() {
				entry:
				%0 = load i64, i64* @ll1, align 8
				%1 = load i16, i16* @si1, align 2
				%2 = load i8, i8* @ch, align 1
				%3 = load i32, i32* @ui, align 4
				%4 = load i32, i32* @sint, align 4
				%5 = load i64, i64* @ll2, align 8
				%6 = load i8, i8* @uc1, align 1
				%7 = load i32, i32* @i1, align 4
				%call = call i32 @test_ints_stack(i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i64 %0, i16 signext %1, i8 zeroext %2, i32 %3, i32 %4, i64 %5, i8 zeroext %6, i32 %7)
				ret void
				}

				; CHECK-LABEL: name: caller_ints_stack

				; 32BIT-DAG: renamable $r[[REGLL1ADDR:[0-9]+]] = LWZtoc @ll1, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r[[REGSADDR:[0-9]+]] = LWZtoc @si1, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r[[REGCADDR:[0-9]+]] = LWZtoc @ch, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r[[REGUIADDR:[0-9]+]] = LWZtoc @ui, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r[[REGSIADDR:[0-9]+]] = LWZtoc @sint, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r[[REGLL2ADDR:[0-9]+]] = LWZtoc @ll2, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r[[REGUCADDR:[0-9]+]] = LWZtoc @uc1, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r[[REGIADDR:[0-9]+]] = LWZtoc @i1, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r11 = LWZ 0, renamable $r[[REGLL1ADDR]] :: (dereferenceable load 4 from @ll1, align 8)
				; 32BIT-DAG: renamable $r3 = LWZ 4, killed renamable $r[[REGLL1ADDR]] :: (dereferenceable load 4 from @ll1 + 4)
				; 32BIT-DAG: renamable $r4 = LHA 0, killed renamable $r[[REGSADDR]] :: (dereferenceable load 2 from @si1)
				; 32BIT-DAG: renamable $r5 = LBZ 0, killed renamable $r[[REGCADDR]] :: (dereferenceable load 1 from @ch)
				; 32BIT-DAG: renamable $r6 = LWZ 0, killed renamable $r[[REGUIADDR]] :: (dereferenceable load 4 from @ui)
				cebowleratibm: Logically I like to see the parameter initializations in order. I find it easier to understand that way.
				ZarkoCA (Author): Yes, that's better, I agree.
				; 32BIT-DAG: renamable $r7 = LWZ 0, killed renamable $r[[REGSIADDR]] :: (dereferenceable load 4 from @sint)
				; 32BIT-DAG: renamable $r12 = LWZ 0, renamable $r[[REGLL2ADDR]] :: (dereferenceable load 4 from @ll2, align 8)
				; 32BIT-DAG: renamable $r8 = LWZ 4, killed renamable $r[[REGLL2ADDR]] :: (dereferenceable load 4 from @ll2 + 4)
				; 32BIT-DAG: renamable $r9 = LBZ 0, killed renamable $r[[REGUCADDR]] :: (dereferenceable load 1 from @uc1)
				; 32BIT-DAG: renamable $r10 = LWZ 0, killed renamable $r[[REGIADDR]] :: (dereferenceable load 4 from @i1)
				; 32BIT-NEXT: ADJCALLSTACKDOWN 96, 0, implicit-def dead $r1, implicit $r1
				; 32BIT-DAG: STW killed renamable $r11, 56, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r3, 60, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r4, 64, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r5, 68, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r6, 72, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r7, 76, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r12, 80, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r8, 84, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r9, 88, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r10, 92, $r1 :: (store 4)
				; 32BIT-DAG: $r3 = LI 1
				; 32BIT-DAG: $r4 = LI 2
				; 32BIT-DAG: $r5 = LI 3
				; 32BIT-DAG: $r6 = LI 4
				; 32BIT-DAG: $r7 = LI 5
				; 32BIT-DAG: $r8 = LI 6
				; 32BIT-DAG: $r9 = LI 7
				; 32BIT-DAG: $r10 = LI 8
				; 32BIT-NEXT: BL_NOP <mcsymbol .test_ints_stack>, csr_aix32, implicit-def dead $lr, implicit $rm, implicit $r3, implicit $r4, implicit $r5, implicit $r6, implicit $r7, implicit $r8, implicit $r9, implicit $r10, implicit $r2, implicit-def $r1, implicit-def dead $r3
				; 32BIT-NEXT: ADJCALLSTACKUP 96, 0, implicit-def dead $r1, implicit $r1

				; CHECKASM-LABEL: .caller_ints_stack:

				; ASM32PWR4: mflr 0
				; ASM32PWR4-DAG: stw 0, 8(1)
				; ASM32PWR4-DAG: stwu 1, -96(1)
				; ASM32PWR4-DAG: lwz [[REG1:[0-9]+]], LC10(2)
				; ASM32PWR4-DAG: lwz [[REG2:[0-9]+]], LC11(2)
				; ASM32PWR4-DAG: lwz [[REG3:[0-9]+]], LC12(2)
				; ASM32PWR4-DAG: lwz [[REG4:[0-9]+]], LC13(2)
				; ASM32PWR4-DAG: lwz [[REG5:[0-9]+]], LC14(2)
				; ASM32PWR4-DAG: lwz [[REG6:[0-9]+]], LC15(2)
				; ASM32PWR4-DAG: lwz [[REG7:[0-9]+]], LC16(2)
				; ASM32PWR4-DAG: lwz [[REG8:[0-9]+]], LC17(2)
				; ASM32PWR4-DAG: lha 5, 0([[REG1]])
				; ASM32PWR4-DAG: lwz 11, 0([[REG7]])
				; ASM32PWR4-DAG: lwz 7, 4([[REG7]])
				; ASM32PWR4-DAG: lbz 4, 0([[REG2]])
				; ASM32PWR4-DAG: lwz 3, 0([[REG8]])
				; ASM32PWR4-DAG: lwz 6, 0([[REG3]])
				; ASM32PWR4-DAG: lwz 9, 0([[REG4]])
				; ASM32PWR4-DAG: lwz 8, 4([[REG4]])
				; ASM32PWR4-DAG: lbz 10, 0([[REG5]])
				; ASM32PWR4-DAG: lwz 12, 0([[REG6]])
				; ASM32PWR4-DAG: stw 11, 56(1)
				; ASM32PWR4-DAG: stw 7, 60(1)
				; ASM32PWR4-DAG: stw 5, 64(1)
				; ASM32PWR4-DAG: stw 4, 68(1)
				; ASM32PWR4-DAG: stw 3, 72(1)
				; ASM32PWR4-DAG: stw 6, 76(1)
				; ASM32PWR4-DAG: stw 9, 80(1)
				; ASM32PWR4-DAG: stw 8, 84(1)
				; ASM32PWR4-DAG: stw 10, 88(1)
				; ASM32PWR4-DAG: stw 12, 92(1)
				; ASM32PWR4-DAG: li 3, 1
				cebowleratibm: It would be a bit nicer to tidy the order of the parameter reg inits.
				; ASM32PWR4-DAG: li 4, 2
				; ASM32PWR4-DAG: li 5, 3
				; ASM32PWR4-DAG: li 6, 4
				; ASM32PWR4-DAG: li 7, 5
				; ASM32PWR4-DAG: li 9, 7
				; ASM32PWR4-DAG: li 8, 6
				; ASM32PWR4-DAG: li 10, 8
				; ASM32PWR4-DAG: bl .test_ints_stack
				; ASM32PWR4-DAG: nop
				; ASM32PWR4-DAG: addi 1, 1, 96
				; ASM32PWR4-DAG: lwz 0, 8(1)
				; ASM32PWR4-NEXT: mtlr 0
				; ASM32PWR4-NEXT: blr

				; ASM64PWR4: mflr 0
				; ASM64PWR4-DAG: std 0, 16(1)
				; ASM64PWR4-DAG: stdu 1, -176(1)
				; ASM64PWR4-DAG: ld [[REG1:[0-9]+]], LC9(2)
				; ASM64PWR4-DAG: ld [[REG2:[0-9]+]], LC10(2)
				; ASM64PWR4-DAG: ld [[REG3:[0-9]+]], LC11(2)
				; ASM64PWR4-DAG: ld [[REG4:[0-9]+]], LC12(2)
				; ASM64PWR4-DAG: ld [[REG5:[0-9]+]], LC13(2)
				; ASM64PWR4-DAG: ld [[REG6:[0-9]+]], LC14(2)
				; ASM64PWR4-DAG: ld [[REG7:[0-9]+]], LC15(2)
				; ASM64PWR4-DAG: ld [[REG8:[0-9]+]], LC16(2)
				; ASM64PWR4-DAG: lha 7, 0([[REG1]])
				; ASM64PWR4-DAG: lbz 5, 0([[REG2]])
				; ASM64PWR4-DAG: ld 6, 0([[REG3]])
				; ASM64PWR4-DAG: lbz 8, 0([[REG4]])
				; ASM64PWR4-DAG: lwz 9, 0([[REG5]])
				; ASM64PWR4-DAG: ld 11, 0([[REG6]])
				; ASM64PWR4-DAG: lwz 3, 0([[REG7]])
				; ASM64PWR4-DAG: lwz 4, 0([[REG8]])
				; ASM64PWR4-DAG: std 11, 112(1)
				; ASM64PWR4-DAG: stw 7, 120(1)
				; ASM64PWR4-DAG: stw 5, 128(1)
				; ASM64PWR4-DAG: stw 3, 136(1)
				; ASM64PWR4-DAG: stw 4, 144(1)
				; ASM64PWR4-DAG: std 6, 152(1)
				; ASM64PWR4-DAG: stw 8, 160(1)
				; ASM64PWR4-DAG: stw 9, 168(1)
				; ASM64PWR4-DAG: li 3, 1
				; ASM64PWR4-DAG: li 4, 2
				; ASM64PWR4-DAG: li 5, 3
				; ASM64PWR4-DAG: li 6, 4
				; ASM64PWR4-DAG: li 7, 5
				; ASM64PWR4-DAG: li 8, 6
				; ASM64PWR4-DAG: li 9, 7
				; ASM64PWR4-DAG: li 10, 8
				; ASM64PWR4-NEXT: bl .test_ints_stack
				; ASM64PWR4-NEXT: nop
				; ASM64PWR4-NEXT: addi 1, 1, 176
				; ASM64PWR4-NEXT: ld 0, 16(1)
				; ASM64PWR4-NEXT: mtlr 0
				; ASM64PWR4-NEXT: blr

				define double @test_fpr_stack(double %d1, double %d2, double %d3, double %d4, double %d5, double %d6, double %d7, double %d8, double %d9, double %s10, double %l11, double %d12, double %d13, float %f14, double %d15, float %f16) {
				entry:
				%add = fadd double %d1, %d2
				%add1 = fadd double %add, %d3
				%add2 = fadd double %add1, %d4
				%add3 = fadd double %add2, %d5
				%add4 = fadd double %add3, %d6
				%add5 = fadd double %add4, %d7
				%add6 = fadd double %add5, %d8
				%add7 = fadd double %add6, %d9
				cebowleratibm (Done): CHECK-LABEL
				%add8 = fadd double %add7, %s10
				%add9 = fadd double %add8, %l11
				%add10 = fadd double %add9, %d12
				%add11 = fadd double %add10, %d13
				%add12 = fadd double %add11, %d13
				%conv = fpext float %f14 to double
				%add13 = fadd double %add12, %conv
				%add14 = fadd double %add13, %d15
				%conv15 = fpext float %f16 to double
				%add16 = fadd double %add14, %conv15
				ret double %add16
				}

				; CHECK-LABEL: name: test_fpr_stack{{.*}}

				cebowleratibm (Done): CHECK-LABEL
				; CHECK-DAG: liveins:
				; CHECK-DAG: - { reg: '$f1', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f2', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f3', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f4', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f5', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f6', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f7', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f8', virtual-reg: '' }
				cebowleratibm (Done): CHECK-LABEL
				; CHECK-DAG: - { reg: '$f9', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f10', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f11', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f12', virtual-reg: '' }
				; CHECK-DAG: - { reg: '$f13', virtual-reg: '' }

				; CHECK: fixedStack:
				; 32BIT-DAG: - { id: 2, type: default, offset: 128, size: 4
				; 32BIT-DAG: - { id: 1, type: default, offset: 132, size: 8
				; 32BIT-DAG: - { id: 0, type: default, offset: 140, size: 4

				; 64BIT-DAG: - { id: 2, type: default, offset: 152, size: 4
				; 64BIT-DAG: - { id: 1, type: default, offset: 160, size: 8
				; 64BIT-DAG: - { id: 0, type: default, offset: 168, size: 4
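Note on the fixedStack offsets above: they can be reproduced with a little arithmetic. The sketch below is a simplified model, not the code in PPCISelLowering.cpp. It assumes a linkage area of six pointer-sized words (24 bytes in 32-bit mode, 48 bytes in 64-bit mode) and that every scalar argument, whether or not it is also passed in a register, reserves a whole number of pointer-sized words in the parameter save area. The helper name `stack_offsets` is hypothetical, and vectors and by-value aggregates are not modelled.

```python
def stack_offsets(arg_sizes, pointer_size):
    """Return the parameter save area offset of each argument.

    arg_sizes: size in bytes of each scalar argument, in declaration order.
    pointer_size: 4 for 32-bit mode, 8 for 64-bit mode.
    """
    linkage_area = 6 * pointer_size  # assumed: 24 bytes (32-bit), 48 bytes (64-bit)
    offset = linkage_area
    offsets = []
    for size in arg_sizes:
        offsets.append(offset)
        # Round the argument up to a whole number of pointer-sized words.
        words = -(-size // pointer_size)
        offset += words * pointer_size
    return offsets


# test_fpr_stack: thirteen doubles, then float, double, float.
fpr_stack = [8] * 13 + [4, 8, 4]
print(stack_offsets(fpr_stack, 4)[-3:])  # [128, 132, 140]
print(stack_offsets(fpr_stack, 8)[-3:])  # [152, 160, 168]
```

The last three results line up with the 32BIT and 64BIT fixedStack entries for %f14, %d15 and %f16 in this test.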

				; CHECK: body: \|
				; CHECK-DAG: bb.0.entry:
				; CHECK-DAG: liveins: $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9, $f10, $f11, $f12, $f13

				; CHECKASM-LABEL: .test_fpr_stack:
				; ASM32PWR4: fadd 0, 1, 2
				; ASM32PWR4-DAG: lfs [[REG1:[0-9]+]], 128(1)
				; ASM32PWR4-DAG: fadd 0, 0, 3
				; ASM32PWR4-DAG: lfd [[REG2:[0-9]+]], 132(1)
				; ASM32PWR4-DAG: fadd 0, 0, 4
				; ASM32PWR4-DAG: fadd 0, 0, 5
				; ASM32PWR4-DAG: fadd 0, 0, 6
				; ASM32PWR4-DAG: fadd 0, 0, 7
				; ASM32PWR4-DAG: fadd 0, 0, 8
				; ASM32PWR4-DAG: fadd 0, 0, 9
				; ASM32PWR4-DAG: fadd 0, 0, 10
				; ASM32PWR4-DAG: fadd 0, 0, 11
				; ASM32PWR4-DAG: fadd 0, 0, 12
				; ASM32PWR4-DAG: fadd 0, 0, 13
				; ASM32PWR4-DAG: fadd 0, 0, 13
				; ASM32PWR4-DAG: fadd 0, 0, [[REG1]]
				; ASM32PWR4-DAG: lfs [[REG3:[0-9]+]], 140(1)
				; ASM32PWR4-DAG: fadd 0, 0, [[REG2]]
				; ASM32PWR4-DAG: fadd 1, 0, [[REG3]]

				; ASM64PWR4: fadd 0, 1, 2
				; ASM64PWR4-DAG: lfs [[REG1:[0-9]+]], 152(1)
				; ASM64PWR4-DAG: fadd 0, 0, 3
				cebowleratibm (Done): It would be a little nicer if these were moved next to their use in the DAG block.
				; ASM64PWR4-DAG: lfd [[REG2:[0-9]+]], 160(1)
				; ASM64PWR4-DAG: fadd 0, 0, 4
				; ASM64PWR4-DAG: fadd 0, 0, 5
				; ASM64PWR4-DAG: fadd 0, 0, 6
				; ASM64PWR4-DAG: fadd 0, 0, 7
				; ASM64PWR4-DAG: fadd 0, 0, 8
				; ASM64PWR4-DAG: fadd 0, 0, 9
				; ASM64PWR4-DAG: fadd 0, 0, 10
				; ASM64PWR4-DAG: fadd 0, 0, 11
				; ASM64PWR4-DAG: fadd 0, 0, 12
				; ASM64PWR4-DAG: fadd 0, 0, 13
				; ASM64PWR4-DAG: fadd 0, 0, 13
				; ASM64PWR4-DAG: fadd 0, 0, [[REG1]]
				; ASM64PWR4-DAG: lfs [[REG3:[0-9]+]], 168(1)
				; ASM64PWR4-DAG: fadd 0, 0, [[REG2]]
				; ASM64PWR4-DAG: fadd 1, 0, [[REG3]]

				@f14 = common global float 0.000000e+00, align 4
				@d15 = common global double 0.000000e+00, align 8
				@f16 = common global float 0.000000e+00, align 4

				define void @caller_fpr_stack() {
				entry:
				%0 = load float, float* @f14, align 4
				%1 = load double, double* @d15, align 8
				%2 = load float, float* @f16, align 4
				%call = call double @test_fpr_stack(double 1.000000e-01, double 2.000000e-01, double 3.000000e-01, double 4.000000e-01, double 5.000000e-01, double 6.000000e-01, double 0x3FE6666666666666, double 8.000000e-01, double 9.000000e-01, double 1.000000e-01, double 1.100000e-01, double 1.200000e-01, double 1.300000e-01, float %0, double %1, float %2)
				ret void
				}

				; 32BIT-DAG: ADJCALLSTACKDOWN 144, 0, implicit-def dead $r1, implicit $r1
				; 32BIT-DAG: STW killed renamable $r6, 56, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r5, 60, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r8, 64, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r7, 68, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r6, 72, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r5, 76, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r8, 80, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r7, 84, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r6, 88, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r5, 92, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r8, 96, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r7, 100, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r6, 104, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r5, 108, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r8, 112, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r7, 116, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r5, 120, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r5, 124, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r3, 128, $r1 :: (store 4)
				; 32BIT-DAG: renamable $r[[REGF1:[0-9]+]] = LWZtoc @f14, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r3 = LWZ 0, killed renamable $r[[REGF1]] :: (load 4 from @f14)
				; 32BIT-DAG: STFD killed renamable $f0, 132, $r1 :: (store 8)
				; 32BIT-DAG: renamable $r[[REGD:[0-9]+]] = LWZtoc @d15, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $f0 = LFD 0, killed renamable $r[[REGD]] :: (dereferenceable load 8 from @d15)
				; 32BIT-DAG: STW killed renamable $r4, 140, $r1 :: (store 4)
				; 32BIT-DAG: renamable $r[[REGF2:[0-9]+]] = LWZtoc @f16, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r4 = LWZ 0, killed renamable $r[[REGF2]] :: (load 4 from @f16)
				; 32BIT-DAG: renamable $r6 = LWZtoc %const.0, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r7 = LWZtoc %const.1, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r8 = LWZtoc %const.2, $r2 :: (load 4 from got)
				cebowleratibm (Not Done): Why are all the other STWs flagged "killed" and this one not?
				ZarkoCA (Done): I'm not sure. I found that if I change the value held in that register (the double 1.000000e-01) to anything else, the register is then flagged "killed" like the rest. I will need to investigate further to find out what determines when a register is not killed in a case like this.
				; 32BIT-DAG: renamable $r5 = LWZtoc %const.3, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r6 = LWZtoc %const.4, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r7 = LWZtoc %const.5, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r8 = LWZtoc %const.6, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r5 = LWZtoc %const.7, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r6 = LWZtoc %const.8, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r7 = LWZtoc %const.9, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r8 = LWZtoc %const.10, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r5 = LWZtoc %const.11, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $f1 = LFD 0, killed renamable $r5 :: (load 8 from constant-pool)
				; 32BIT-DAG: renamable $f2 = LFD 0, killed renamable $r6 :: (load 8 from constant-pool)
				; 32BIT-DAG: renamable $f3 = LFD 0, killed renamable $r7 :: (load 8 from constant-pool)
				; 32BIT-DAG: renamable $f4 = LFD 0, killed renamable $r8 :: (load 8 from constant-pool)
				; 32BIT-DAG: renamable $f5 = LFS 0, killed renamable $r5 :: (load 4 from constant-pool)
				; 32BIT-DAG: renamable $f6 = LFD 0, killed renamable $r5 :: (load 8 from constant-pool)
				; 32BIT-DAG: renamable $f7 = LFD 0, killed renamable $r6 :: (load 8 from constant-pool)
				; 32BIT-DAG: renamable $f8 = LFD 0, killed renamable $r7 :: (load 8 from constant-pool)
				; 32BIT-DAG: renamable $f9 = LFD 0, killed renamable $r8 :: (load 8 from constant-pool)
				; 32BIT-DAG: $f10 = COPY renamable $f1
				; 32BIT-DAG: renamable $f11 = LFD 0, killed renamable $r6 :: (load 8 from constant-pool)
				cebowleratibm (Not Done): I don't think it's useful to expect the constant pool loads, given that the STW matches on SCRATCHREG. There's no validation of where each constant is written, and we're not intending to test the constant loads anyway.
				ZarkoCA (Done): In this case I'm using SCRATCHREG only to avoid hard-coding the registers in which we do the constant pool loads. Hypothetically, they can be any register, and I don't want the test to break if any of those scratch registers change at some point. Maybe I could take them out altogether?
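Note on the SCRATCHREG exchange above: a FileCheck pattern of the form `[[NAME:regex]]` defines (or redefines) a variable, while `[[NAME]]` requires the text to equal the value captured earlier. A variable that is redefined on every line and never referenced therefore behaves like a plain wildcard. The sketch below mimics only that one behaviour with Python's `re` module. `match_line` is a hypothetical helper, not part of FileCheck, and it ignores CHECK-DAG ordering and every other FileCheck feature.

```python
import re


def match_line(pattern, line, variables):
    """Match one simplified FileCheck-style pattern against one line.

    [[NAME:regex]] captures text and (re)defines NAME in `variables`.
    [[NAME]] must equal the text NAME currently holds.
    """
    parts, names, pos = [], [], 0
    for m in re.finditer(r"\[\[(\w+)(?::(.*?))?\]\]", pattern):
        parts.append(re.escape(pattern[pos:m.start()]))
        name, body = m.group(1), m.group(2)
        if body is None:
            # Use of an existing variable: match its value literally.
            parts.append(re.escape(variables[name]))
        else:
            # Definition: capture whatever the regex matches.
            parts.append("(" + body + ")")
            names.append(name)
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    found = re.search("".join(parts), line)
    if found is None:
        return False
    for name, value in zip(names, found.groups()):
        variables[name] = value
    return True


env = {}
# Redefining SCRATCHREG on each line accepts any register number.
print(match_line("stw [[SCRATCHREG:[0-9]+]], 56(1)", "stw 6, 56(1)", env))  # True
print(match_line("stw [[SCRATCHREG:[0-9]+]], 60(1)", "stw 5, 60(1)", env))  # True
# A define-then-use pair does constrain the register.
match_line("lwz [[VAL:[0-9]+]], 0(3)", "lwz 7, 0(3)", env)
print(match_line("stw [[VAL]], 128(1)", "stw 9, 128(1)", env))  # False
```

Under this model the SCRATCHREG stores only pin down the stack offsets, whereas the REGF1NEW, REGDNEW and REGF2NEW lines in the test tie a load to the store that consumes it.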
				; 32BIT-DAG: renamable $f12 = LFD 0, killed renamable $r7 :: (load 8 from constant-pool)
				; 32BIT-DAG: renamable $f13 = LFD 0, killed renamable $r8 :: (load 8 from constant-pool)
				; 32BIT-NEXT: BL_NOP <mcsymbol .test_fpr_stack>, csr_aix32, implicit-def dead $lr, implicit $rm, implicit $f1, implicit $f2, implicit $f3, implicit $f4, implicit $f5, implicit $f6, implicit $f7, implicit $f8, implicit $f9, implicit killed $f10, implicit $f11, implicit $f12, implicit $f13, implicit $r2, implicit-def $r1, implicit-def dead $f1
				; 32BIT-NEXT: ADJCALLSTACKUP 144, 0, implicit-def dead $r1, implicit $r1

				; CHECKASM-LABEL: .caller_fpr_stack:

				; ASM32PWR4: mflr 0
				; ASM32PWR4-DAG: stw 0, 8(1)
				; ASM32PWR4-DAG: stwu 1, -144(1)
				; ASM32PWR4-DAG: lwz [[REGD:[0-9]+]], LC18(2)
				; ASM32PWR4-DAG: lwz [[REGF1:[0-9]+]], LC19(2)
				; ASM32PWR4-DAG: lwz [[REGF2:[0-9]+]], LC20(2)
				; ASM32PWR4-DAG: lfd [[REGDNEW:[0-9]+]], 0([[REGD]])
				; ASM32PWR4-DAG: lwz [[REGF1NEW:[0-9]+]], 0([[REGF1]])
				; ASM32PWR4-DAG: lwz [[REGF2NEW:[0-9]+]], 0([[REGF2]])
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 56(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 60(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 64(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 68(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 72(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 76(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 80(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 84(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 88(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 92(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 96(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 100(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 108(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 104(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 112(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 116(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 120(1)
				; ASM32PWR4-DAG: stw [[SCRATCHREG:[0-9]+]], 124(1)
				; ASM32PWR4-DAG: stw [[REGF1NEW]], 128(1)
				; ASM32PWR4-DAG: stfd [[REGDNEW]], 132(1)
				; ASM32PWR4-DAG: stw [[REGF2NEW]], 140(1)
				; ASM32PWR4-NEXT: bl .test_fpr_stack

				; ASM64PWR4: mflr 0
				; ASM64PWR4-DAG: std 0, 16(1)
				; ASM64PWR4-DAG: stdu 1, -176(1)
				; ASM64PWR4-DAG: ld [[REGF1:[0-9]+]], LC17(2)
				; ASM64PWR4-DAG: ld [[REGD:[0-9]+]], LC18(2)
				; ASM64PWR4-DAG: ld [[REGF2:[0-9]+]], LC19(2)
				; ASM64PWR4-DAG: lwz [[REGF1NEW:[0-9]+]], 0([[REGF1]])
				; ASM64PWR4-DAG: ld [[REGDNEW:[0-9]+]], 0([[REGD]])
				; ASM64PWR4-DAG: lwz [[REGF2NEW:[0-9]+]], 0([[REGF2]])
				; ASM64PWR4-DAG: std [[SCRATCHREG:[0-9]+]], 112(1)
				; ASM64PWR4-DAG: std [[SCRATCHREG:[0-9]+]], 120(1)
				; ASM64PWR4-DAG: std [[SCRATCHREG:[0-9]+]], 128(1)
				; ASM64PWR4-DAG: std [[SCRATCHREG:[0-9]+]], 136(1)
				; ASM64PWR4-DAG: std [[SCRATCHREG:[0-9]+]], 144(1)
				; ASM64PWR4-DAG: stw [[REGF1NEW]], 152(1)
				; ASM64PWR4-DAG: std [[REGDNEW]], 160(1)
				; ASM64PWR4-DAG: stw [[REGF2NEW]], 168(1)
				; ASM64PWR4-NEXT: bl .test_fpr_stack

				define i32 @mix_callee(double %d1, double %d2, double %d3, double %d4, i8 zeroext %c1, i16 signext %s1, i64 %ll1, i32 %i1, i32 %i2, i32 %i3) {
				entry:
				%add = fadd double %d1, %d2
				%add1 = fadd double %add, %d3
				%add2 = fadd double %add1, %d4
				%conv = zext i8 %c1 to i32
				%conv3 = sext i16 %s1 to i32
				%add4 = add nsw i32 %conv, %conv3
				%conv5 = sext i32 %add4 to i64
				%add6 = add nsw i64 %conv5, %ll1
				cebowleratibm (Done): REGDADDR
				%conv7 = sext i32 %i1 to i64
				%add8 = add nsw i64 %add6, %conv7
				%conv9 = sext i32 %i2 to i64
				cebowleratibm (Done): lfd [[REGD:[0-9]+]], 0([[REGDADDR]])
				%add10 = add nsw i64 %add8, %conv9
				cebowleratibm (Done): Ditto for REGF1 and REGF2.
				%conv11 = sext i32 %i3 to i64
				%add12 = add nsw i64 %add10, %conv11
				%conv13 = trunc i64 %add12 to i32
				%conv14 = sitofp i32 %conv13 to double
				%add15 = fadd double %conv14, %add2
				%conv16 = fptosi double %add15 to i32
				ret i32 %conv16
				}

				; 32BIT-DAG: liveins:
				; 32BIT-DAG: - { reg: '$f1', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f2', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f3', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f4', virtual-reg: '' }

				; 32BIT-DAG: fixedStack:
				; 32BIT-DAG: - { id: 6, type: default, offset: 56, size: 4
				; 32BIT-DAG: - { id: 5, type: default, offset: 60, size: 4
				; 32BIT-DAG: - { id: 4, type: default, offset: 64, size: 4
				; 32BIT-DAG: - { id: 3, type: default, offset: 68, size: 4
				; 32BIT-DAG: - { id: 2, type: default, offset: 72, size: 4
				; 32BIT-DAG: - { id: 1, type: default, offset: 76, size: 4
				; 32BIT-DAG: - { id: 0, type: default, offset: 80, size: 4

				; 32BIT-DAG: body: \|
				; 32BIT-DAG: bb.0.entry:
				; 32BIT-NEXT: liveins: $f1, $f2, $f3, $f4

				; 64BIT: liveins:
				; 64BIT: - { reg: '$f1', virtual-reg: '' }
				; 64BIT-NEXT: - { reg: '$f2', virtual-reg: '' }
				; 64BIT-NEXT: - { reg: '$f3', virtual-reg: '' }
				cebowleratibm (Done): LABEL
				; 64BIT-NEXT: - { reg: '$f4', virtual-reg: '' }
				; 64BIT-NEXT: - { reg: '$x7', virtual-reg: '' }
				; 64BIT-NEXT: - { reg: '$x8', virtual-reg: '' }
				; 64BIT-NEXT: - { reg: '$x9', virtual-reg: '' }
				; 64BIT-NEXT: - { reg: '$x10', virtual-reg: '' }

				cebowleratibm (Done): LABEL
				; 64BIT: fixedStack:
				; 64BIT-DAG: - { id: 1, type: default, offset: 112, size: 8
				; 64BIT-DAG: - { id: 0, type: default, offset: 120, size: 8

				; 64BIT: body: \|
				; 64BIT-NEXT: bb.0.entry:
				; 64BIT-NEXT: liveins: $f1, $f2, $f3, $f4, $x7, $x8, $x9, $x10

				; CHECKASM-LABEL: .mix_callee
				cebowleratibm (Done): LABEL

				; ASM32PWR4: stwu 1, -48(1)
				; ASM32PWR4-DAG: lwz [[REG1:[0-9]+]], 104(1)
				; ASM32PWR4-DAG: lwz [[REG2:[0-9]+]], 108(1)
				; ASM32PWR4-DAG: lwz [[REG3:[0-9]+]], 116(1)
				; ASM32PWR4-DAG: lwz [[REG4:[0-9]+]], 120(1)
				; ASM32PWR4-DAG: lwz [[REG5:[0-9]+]], 124(1)
				; ASM32PWR4-DAG: lwz [[REG6:[0-9]+]], 128(1)
				; ASM32PWR4-DAG: lwz 5, LC33(2)
				; ASM32PWR4-DAG: lis 8, 17200
				; ASM32PWR4-DAG: lfs 0, 0(5)
				; ASM32PWR4-DAG: fadd 1, 1, 2
				; ASM32PWR4-DAG: fadd 1, 1, 3
				; ASM32PWR4-DAG: add 4, 5, 4
				cebowleratibm (Done): LABEL
				; ASM32PWR4-DAG: fadd 1, 1, 4
				; ASM32PWR4-DAG: add 3, 4, 3
				; ASM32PWR4-DAG: add 3, 3, 6
				; ASM32PWR4-DAG: stw 8, 32(1)
				; ASM32PWR4-DAG: add 3, 3, 7
				; ASM32PWR4-DAG: add 3, 3, 8
				; ASM32PWR4-DAG: xoris 3, 3, 32768
				; ASM32PWR4-DAG: stw 3, 36(1)
				; ASM32PWR4-DAG: addi 3, 1, 44
				; ASM32PWR4-DAG: lfd 2, 32(1)
				; ASM32PWR4-DAG: fsub 0, 2, 0
				; ASM32PWR4-DAG: fadd 0, 0, 1
				; ASM32PWR4-DAG: fctiwz 0, 0
				; ASM32PWR4-DAG: stfiwx 0, 0, 3
				; ASM32PWR4-DAG: lwz 3, 44(1)
				cebowleratibm (Not Done): This should be size 2. Although the argument is passed in 4 bytes, the object is 2 bytes at offset 58.
				ZarkoCA (Done): To me it looks like CC_AIX promotes i16 to i32, and the ValVT passed to getLoad() is i32, so I think this is why it does a 4-byte load at offset 56.
				    // Promote integers if needed.
				    if (ValVT.getSizeInBits() < RegVT.getSizeInBits())
				      LocInfo = ArgFlags.isSExt() ? CCValAssign::LocInfo::SExt
				                                  : CCValAssign::LocInfo::ZExt;
				Is this not correct in 32BIT mode? I see PPC32 do the same thing, and XLC on AIX also uses a 4-byte load (lwz) at offset 56 for an i16.
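Note on offsets 56 and 58 above: the two descriptions are different views of the same bytes. On a big-endian target, a 2-byte value that has been extended into a 4-byte slot sits in the last two bytes of that slot, so a 4-byte load at offset 56 and a 2-byte load at offset 58 read the same number, provided the caller stored the extended value. The sketch below only illustrates that byte layout with Python's `struct` module. The helper name `load_i16_arg` is hypothetical, and the sketch does not settle which size the fixedStack entry ought to record.

```python
import struct


def load_i16_arg(value, slot_offset=56):
    """Model a sign-extended i16 argument stored in a 4-byte big-endian slot.

    Returns (wide, narrow, narrow_offset): the result of a 4-byte load from
    the start of the slot, the result of a 2-byte load from the object's own
    address, and that address.
    """
    slot = struct.pack(">i", value)             # caller stores the extended word
    wide = struct.unpack(">i", slot)[0]         # 4-byte load at slot_offset
    narrow = struct.unpack(">h", slot[2:4])[0]  # 2-byte load, two bytes further in
    return wide, narrow, slot_offset + 2


print(load_i16_arg(2))   # (2, 2, 58)
print(load_i16_arg(-3))  # (-3, -3, 58)
```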
				; ASM32PWR4-NEXT: addi 1, 1, 48
				; ASM32PWR4-NEXT: blr

				; ASM64PWR-DAG: ld [[REG1:[0-9]+]], 112(1)
				; ASM64PWR-DAG: ld [[REG2:[0-9]+]], 120(1)
				; ASM64PWR-DAG: add 4, 7, 8
				; ASM64PWR-DAG: fadd 0, 1, 2
				; ASM64PWR-DAG: add 4, 4, 9
				; ASM64PWR-DAG: fadd 0, 0, [[REG1]]
				; ASM64PWR-DAG: add 4, 4, 10
				; ASM64PWR-DAG: add 3, 4, 3
				; ASM64PWR-DAG: add 3, 3, [[REG2]]
				; ASM64PWR-DAG: fadd 0, 0, 4
				; ASM64PWR-DAG: extsw 3, 3
				; ASM64PWR-DAG: std 3, -16(1)
				; ASM64PWR-DAG: addi 3, 1, -4
				; ASM64PWR-DAG: lfd 1, -16(1)
				; ASM64PWR-DAG: fcfid 1, 1
				; ASM64PWR-DAG: fadd 0, 1, 0
				; ASM64PWR-DAG: fctiwz 0, 0
				; ASM64PWR-DAG: stfiwx 0, 0, 3
				; ASM64PWR-DAG: lwz 3, -4(1)
				; ASM64PWR-DAG: blr

				define void @caller_mix() {
				entry:
				%call = call i32 @mix_callee(double 1.000000e-01, double 2.000000e-01, double 3.000000e-01, double 4.000000e-01, i8 zeroext 1, i16 signext 2, i64 30000000, i32 40, i32 50, i32 60)
				ret void
				}

				; CHECK-LABEL: name: caller_mix
				; 64BIT: ADJCALLSTACKDOWN 128, 0, implicit-def dead $r1, implicit $r1
				; 64BIT-DAG: STW8 killed renamable $x[[REG1ADDR: [0-9]+]], 112, $x1 :: (store 4)
				; 64BIT-DAG: STW8 killed renamable $x[[REG2ADDR: [0-9]+]], 120, $x1 :: (store 4)
				; 64BIT-DAG: renamable $x3 = LDtocCPT %const.0, $x2 :: (load 8 from got)
				; 64BIT-DAG: renamable $x4 = LDtocCPT %const.1, $x2 :: (load 8 from got)
				; 64BIT-DAG: renamable $x3 = LDtocCPT %const.2, $x2 :: (load 8 from got)
				; 64BIT-DAG: renamable $x4 = LDtocCPT %const.3, $x2 :: (load 8 from got)
				; 64BIT-DAG: renamable $f1 = LFD 0, killed renamable $x[[REG1ADDR]] :: (load 8 from constant-pool)
				; 64BIT-DAG: renamable $f2 = LFD 0, killed renamable $x[[REG2ADDR]] :: (load 8 from constant-pool)
				; 64BIT-DAG: renamable $f3 = LFD 0, killed renamable $x3 :: (load 8 from constant-pool)
				; 64BIT-DAG: renamable $f4 = LFD 0, killed renamable $x4 :: (load 8 from constant-pool)
				; 64BIT-DAG: renamable $x[[SCRATCHREG: [0-9]+]] = LI8 50
				; 64BIT-DAG: renamable $x[[SCRATCHREG: [0-9]+]] = LI8 60
				; 64BIT-DAG: renamable $x[[SCRATCHREG: [0-9]+]] = LIS8 457
				; 64BIT-DAG: $x7 = LI8 1
				; 64BIT-DAG: $x8 = LI8 2
				; 64BIT-DAG: $x10 = LI8 40
				; 64BIT-DAG: renamable $x[[SCRATCHREG: [0-9]+]] = ORI8 killed renamable $x[[SCRATCHREG: [0-9]+]], 50048
				; 64BIT-NEXT: BL8_NOP <mcsymbol .mix_callee>, csr_aix64, implicit-def dead $lr8, implicit $rm, implicit $f1, implicit $f2, implicit $f3, implicit $f4, implicit killed $x7, implicit killed $x8, implicit $x9, implicit killed $x10, implicit $x2, implicit-def $r1, implicit-def dead $x3
				; 64BIT-NEXT: ADJCALLSTACKUP 128, 0, implicit-def dead $r1, implicit $r1
				; 64BIT-NEXT: BLR8 implicit $lr8, implicit $rm

				; CHECKASM-LABEL: .caller_mix
				; ASM32PWR4: mflr 0
				; ASM32PWR4-DAG: stw [[REG1:[0-9]+]], 56(1)
				; ASM32PWR4-DAG: stw [[REG2:[0-9]+]], 60(1)
				; ASM32PWR4-DAG: stw [[REG3:[0-9]+]], 64(1)
				; ASM32PWR4-DAG: stw [[REG4:[0-9]+]], 68(1)
				; ASM32PWR4-DAG: stw [[REG5:[0-9]+]], 72(1)
				; ASM32PWR4-DAG: stw [[REG6:[0-9]+]], 76(1)
				; ASM32PWR4-DAG: stw [[REG7:[0-9]+]], 80(1)
				; ASM32PWR4-DAG: stw 0, 8(1)
				; ASM32PWR4-DAG: stwu 1, -96(1)
				; ASM32PWR4-DAG: li 3, 60
				; ASM32PWR4-DAG: li 3, 50
				; ASM32PWR4-DAG: li 3, 40
				; ASM32PWR4-DAG: li 3, 0
				; ASM32PWR4-DAG: li 3, 2
				; ASM32PWR4-DAG: lwz 3, LC34(2)
				; ASM32PWR4-DAG: lfd 1, 0(3)
				; ASM32PWR4-DAG: lwz 3, LC35(2)
				; ASM32PWR4-DAG: lfd 2, 0(3)
				; ASM32PWR4-DAG: lwz 3, LC36(2)
				; ASM32PWR4-DAG: lfd 3, 0(3)
				; ASM32PWR4-DAG: lwz 3, LC37(2)
				; ASM32PWR4-DAG: lfd 4, 0(3)
				; ASM32PWR4-DAG: li 3, 1
				; ASM32PWR4-DAG: lis 3, 457
				; ASM32PWR4-DAG: ori 3, 3, 50048
				; ASM32PWR4-DAG: bl .mix_callee
				; ASM32PWR4-DAG: nop
				; ASM32PWR4-DAG: addi 1, 1, 96
				; ASM32PWR4-DAG: lwz 0, 8(1)
				; ASM32PWR4-DAG: mtlr 0
				; ASM32PWR4-DAG: blr

				; ASM64PWR4: mflr 0
				; ASM64PWR4-DAG: std 0, 16(1)
				; ASM64PWR4-DAG: stdu 1, -128(1)
				; ASM64PWR4-DAG: stw [[REG1:[0-9]+]], 112(1)
				; ASM64PWR4-DAG: stw [[REG2:[0-9]+]], 120(1)
				; ASM64PWR4-DAG: ld 3, LC32(2)
				; ASM64PWR4-DAG: ld 4, LC33(2)
				; ASM64PWR4-DAG: lis 5, 457
				; ASM64PWR4-DAG: li 7, 1
				; ASM64PWR4-DAG: ori 9, 5, 50048
				; ASM64PWR4-DAG: li 8, 2
				; ASM64PWR4-DAG: lfd 1, 0(3)
				; ASM64PWR4-DAG: ld 3, LC34(2)
				; ASM64PWR4-DAG: li 10, 40
				; ASM64PWR4-DAG: lfd 2, 0(4)
				; ASM64PWR4-DAG: ld 4, LC35(2)
				; ASM64PWR4-DAG: lfd 3, 0(3)
				; ASM64PWR4-DAG: li 3, 60
				; ASM64PWR4-DAG: lfd 4, 0(4)
				; ASM64PWR4-DAG: li 4, 50
				; ASM64PWR4-DAG: bl .mix_callee
				; ASM64PWR4-DAG: nop
				; ASM64PWR4-DAG: addi 1, 1, 128
				; ASM64PWR4-DAG: ld 0, 16(1)
				; ASM64PWR4-DAG: mtlr 0
				; ASM64PWR4-DAG: blr


				define i32 @mix_floats(i32 %i1, i32 %i2, i32 %i3, i32 %i4, i32 %i5, i32 %i6, i32 %i7, i32 %i8, double %d1, double %d2, double %d3, double %d4, double %d5, double %d6, double %d7, double %d8, double %d9, double %d10, double %d11, double %d12, double %d13, double %d14) {
				entry:
				%add = add nsw i32 %i1, %i2
				%add1 = add nsw i32 %add, %i3
				%add2 = add nsw i32 %add1, %i4
				%add3 = add nsw i32 %add2, %i5
				%add4 = add nsw i32 %add3, %i6
				%add5 = add nsw i32 %add4, %i7
				%add6 = add nsw i32 %add5, %i8
				%conv = sitofp i32 %add6 to double
				%add7 = fadd double %conv, %d1
				%add8 = fadd double %add7, %d2
				%add9 = fadd double %add8, %d3
				%add10 = fadd double %add9, %d4
				%add11 = fadd double %add10, %d5
				%add12 = fadd double %add11, %d6
				%add13 = fadd double %add12, %d7
				%add14 = fadd double %add13, %d8
				%add15 = fadd double %add14, %d9
				%add16 = fadd double %add15, %d10
				%add17 = fadd double %add16, %d11
				%add18 = fadd double %add17, %d12
				%add19 = fadd double %add18, %d13
				%add20 = fadd double %add19, %d14
				%conv21 = fptosi double %add20 to i32
				ret i32 %conv21
				}
				; CHECK-LABEL: mix_floats

				; 32BIT-DAG: liveins:
				; 32BIT-DAG: - { reg: '$r3', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r4', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r5', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r6', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r7', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r8', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r9', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$r10', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f1', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f2', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f3', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f4', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f5', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f6', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f7', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f8', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f9', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f10', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f11', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f12', virtual-reg: '' }
				; 32BIT-DAG: - { reg: '$f13', virtual-reg: '' }

				; 32BIT-DAG: fixedStack:
				; 32BIT-DAG: - { id: 0, type: default, offset: 160, size: 8

				; 32BIT-DAG: body: \|
				; 32BIT-DAG: bb.0.entry:
				; 32BIT-DAG: liveins: $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9, $f10, $f11, $f12, $f13, $r3, $r4, $r5, $r6, $r7, $r8, $r9, $r10

				; 64BIT-DAG: liveins:
				; 64BIT-DAG: - { reg: '$x3', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x4', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x5', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x6', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x7', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x8', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x9', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$x10', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f1', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f2', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f3', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f4', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f5', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f6', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f7', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f8', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f9', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f10', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f11', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f12', virtual-reg: '' }
				; 64BIT-DAG: - { reg: '$f13', virtual-reg: '' }

				; 64BIT-DAG: fixedStack:
				; 64BIT-DAG: - { id: 0, type: default, offset: 216, size: 8

				; 64BIT-DAG: body: \|
				; 64BIT-DAG: bb.0.entry:
				; 64BIT-DAG: liveins: $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9, $f10, $f11, $f12, $f13, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10

				; CHECKASM-LABEL: .mix_floats:
				; ASM32PWR4: stwu 1, -48(1)
				; ASM32PWR4-DAG: lfd [[REG1:[0-9]+]], 208(1)
				; ASM32PWR4-DAG: add 3, 3, 4
				; ASM32PWR4-DAG: lwz 4, LC38(2)
				; ASM32PWR4-DAG: lis 11, 17200
				; ASM32PWR4-DAG: stfd 31, 40(1)
				; ASM32PWR4-DAG: add 3, 3, 5
				; ASM32PWR4-DAG: add 3, 3, 6
				; ASM32PWR4-DAG: add 3, 3, 7
				; ASM32PWR4-DAG: stw 11, 24(1)
				; ASM32PWR4-DAG: add 3, 3, 8
				; ASM32PWR4-DAG: add 3, 3, 9
				; ASM32PWR4-DAG: add 3, 3, 10
				; ASM32PWR4-DAG: lfs 0, 0(4)
				; ASM32PWR4-DAG: xoris 3, 3, 32768
				; ASM32PWR4-DAG: stw 3, 28(1)
				; ASM32PWR4-DAG: addi 3, 1, 36
				; ASM32PWR4-DAG: lfd 31, 24(1)
				; ASM32PWR4-DAG: fsub 0, 31, 0
				; ASM32PWR4-DAG: fadd 0, 0, 1
				; ASM32PWR4-DAG: fadd 0, 0, 2
				; ASM32PWR4-DAG: fadd 0, 0, 3
				; ASM32PWR4-DAG: fadd 0, 0, 4
				; ASM32PWR4-DAG: fadd 0, 0, 5
				; ASM32PWR4-DAG: fadd 0, 0, 6
				; ASM32PWR4-DAG: fadd 0, 0, 7
				; ASM32PWR4-DAG: fadd 0, 0, 8
				; ASM32PWR4-DAG: fadd 0, 0, 9
				; ASM32PWR4-DAG: fadd 0, 0, 10
				; ASM32PWR4-DAG: fadd 0, 0, 11
				; ASM32PWR4-DAG: fadd 0, 0, 12
				; ASM32PWR4-DAG: fadd 0, 0, 13
				; ASM32PWR4-DAG: fadd 0, 0, [[REG1]]
				; ASM32PWR4-DAG: fctiwz 0, 0
				; ASM32PWR4-DAG: stfiwx 0, 0, 3
				; ASM32PWR4-DAG: lwz 3, 36(1)
				; ASM32PWR4-DAG: lfd 31, 40(1)
				; ASM32PWR4-DAG: addi 1, 1, 48
				; ASM32PWR4-NEXT: blr

				; ASM64PWR4: add 3, 3, 4
				; ASM64PWR4-DAG: add 3, 3, 5
				; ASM64PWR4-DAG: add 3, 3, 6
				; ASM64PWR4-DAG: add 3, 3, 7
				; ASM64PWR4-DAG: add 3, 3, 8
				; ASM64PWR4-DAG: add 3, 3, 9
				; ASM64PWR4-DAG: add 3, 3, 10
				; ASM64PWR4-DAG: extsw 3, 3
				; ASM64PWR4-DAG: std 3, -16(1)
				; ASM64PWR4-DAG: addi 3, 1, -4
				; ASM64PWR4-DAG: lfd 0, -16(1)
				; ASM64PWR4-DAG: fcfid 0, 0
				; ASM64PWR4-DAG: fadd 0, 0, 1
				; ASM64PWR4-DAG: lfd [[REG1:[0-9]+]], 216(1)
				; ASM64PWR4-DAG: fadd 0, 0, 2
				; ASM64PWR4-DAG: fadd 0, 0, 3
				; ASM64PWR4-DAG: fadd 0, 0, 4
				; ASM64PWR4-DAG: fadd 0, 0, 5
				; ASM64PWR4-DAG: fadd 0, 0, 6
				; ASM64PWR4-DAG: fadd 0, 0, 7
				; ASM64PWR4-DAG: fadd 0, 0, 8
				; ASM64PWR4-DAG: fadd 0, 0, 9
				; ASM64PWR4-DAG: fadd 0, 0, 10
				; ASM64PWR4-DAG: fadd 0, 0, 11
				; ASM64PWR4-DAG: fadd 0, 0, 12
				; ASM64PWR4-DAG: fadd 0, 0, 13
				; ASM64PWR4-DAG: fadd 0, 0, [[REG1]]
				; ASM64PWR4-DAG: fctiwz 0, 0
				; ASM64PWR4-DAG: stfiwx 0, 0, 3
				; ASM64PWR4-DAG: lwz 3, -4(1)
				; ASM64PWR4-DAG: blr

				define void @mix_floats_caller() {
				entry:
				%call = call i32 @mix_floats(i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, double 1.000000e-01, double 2.000000e-01, double 3.000000e-01, double 4.000000e-01, double 5.000000e-01, double 6.000000e-01, double 0x3FE6666666666666, double 8.000000e-01, double 9.000000e-01, double 1.000000e+00, double 1.100000e+00, double 1.200000e+00, double 1.300000e+00, double 1.400000e+00)
				ret void
				}

				; CHECK-LABEL: mix_floats_caller

				; 32BIT-DAG: ADJCALLSTACKDOWN 168, 0, implicit-def dead $r1, implicit $r1
				; 32BIT-DAG: STW killed renamable $r7, 56, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r6, 60, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r8, 64, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r6, 68, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r10, 72, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r9, 76, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r11, 80, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r6, 84, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r4, 88, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r3, 92, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r4, 96, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r9, 100, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r5, 104, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r3, 108, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r7, 112, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r6, 116, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r10, 120, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r5, 128, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW renamable $r8, 124, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r3, 132, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r11, 136, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r6, 140, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r4, 144, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r9, 148, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r5, 152, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r8, 156, $r1 :: (store 4)
				; 32BIT-DAG: STW killed renamable $r12, 160, $r1 :: (store 4, align 8)
				; 32BIT-DAG: STW killed renamable $r3, 164, $r1 :: (store 4)
				; 32BIT-DAG: renamable $r10 = LWZtoc %const.0, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r6 = LWZtoc %const.1, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r11 = LWZtoc %const.2, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r9 = LWZtoc %const.3, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r4 = LWZtoc %const.4, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r8 = LWZtoc %const.5, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r5 = LWZtoc %const.6, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r3 = LWZtoc %const.7, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r10 = LWZtoc %const.8, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r6 = LWZtoc %const.9, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r11 = LWZtoc %const.10, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r9 = LWZtoc %const.11, $r2 :: (load 4 from got)
				; 32BIT-DAG: renamable $r4 = LWZtoc %const.12, $r2 :: (load 4 from got)
				; 32BIT-DAG: $r3 = LI 1
				; 32BIT-DAG: $r4 = LI 2
				; 32BIT-DAG: $r5 = LI 3
				; 32BIT-DAG: $r6 = LI 4
				; 32BIT-DAG: $r7 = LI 5
				; 32BIT-DAG: $r8 = LI 6
				; 32BIT-DAG: $r9 = LI 7
				; 32BIT-DAG: $r10 = LI 8
				; 32BIT-NEXT: BL_NOP <mcsymbol .mix_floats>, csr_aix32, implicit-def dead $lr, implicit $rm, implicit $r3, implicit $r4, implicit $r5, implicit $r6, implicit $r7, implicit $r8, implicit $r9, implicit $r10, implicit $f1, implicit $f2, implicit $f3, implicit $f4, implicit $f5, implicit $f6, implicit $f7, implicit $f8, implicit $f9, implicit $f10, implicit $f11, implicit $f12, implicit $f13, implicit $r2, implicit-def $r1, implicit-def dead $r3
				; 32BIT-NEXT: ADJCALLSTACKUP 168, 0, implicit-def dead $r1, implicit $r1

				; CHECKASM-LABEL: .mix_floats_caller:

				; ASM32PWR4: mflr 0
				; ASM32PWR4-DAG: stw 0, 8(1)
				; ASM32PWR4-DAG: stwu 1, -176(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 56(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 60(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 64(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 68(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 72(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 76(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 80(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 84(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 88(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 92(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 96(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 100(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 104(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 108(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 112(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 116(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 120(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 124(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 128(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 132(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 136(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 140(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 144(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 148(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 152(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 156(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 160(1)
				; ASM32PWR4-DAG: stw [[REG:[0-9]+]], 164(1)
				; ASM32PWR4-NEXT: bl .mix_floats

				; ASM64PWR4: mflr 0
				; ASM64PWR4-DAG: std 0, 16(1)
				; ASM64PWR4-DAG: stdu 1, -240(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 112(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 120(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 128(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 136(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 144(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 152(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 160(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 168(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 176(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 184(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 192(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 200(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 208(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 216(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 224(1)
				; ASM64PWR4-DAG: std [[REG:[0-9]+]], 232(1)
				; ASM64PWR4-NEXT: bl .mix_floats
				; ASM64PWR4-DAG: nop
				; ASM64PWR4-DAG: ld [[REGF1:[0-9]+]], 232(1)
				; ASM64PWR4-DAG: ld [[REGF2:[0-9]+]], 224(1)
				; ASM64PWR4-DAG: addi 1, 1, 240
				; ASM64PWR4-DAG: ld 0, 16(1)
				; ASM64PWR4-DAG: mtlr 0
				; ASM64PWR4-NEXT: blr

Diff 244319