This is an archive of the discontinued LLVM Phabricator instance.

Differential D124836

[AArch64] Add support for -fzero-call-used-regs
ClosedPublic

Authored by void on May 3 2022, 2:55 AM.

Details

Reviewers

nickdesaulniers
danielkiss
MaskRay

Commits

rG6e00a34cdb49: [AArch64] Add support for -fzero-call-used-regs

Summary

Support the "-fzero-call-used-regs" option on AArch64. This involves much less
specialized code than the X86 version. Most of the checks can be done with
TableGen.
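
For readers unfamiliar with the feature: besides the command-line option, GCC and Clang also accept a per-function `zero_call_used_regs` attribute that takes the same values. The following is only a minimal sketch; the function `check_pin` and the helper `self_check` are hypothetical, and whether any registers are actually cleared depends on the compiler version and target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example function; the name and logic are illustrative only.
// The attribute asks the compiler to zero the call-used general-purpose
// registers this function touched before it returns, so that intermediate
// values do not linger in them. The same request can be made for a whole
// translation unit with -fzero-call-used-regs=used-gpr.
__attribute__((zero_call_used_regs("used-gpr")))
bool check_pin(std::uint32_t entered, std::uint32_t expected) {
  std::uint32_t diff = entered ^ expected; // intermediate value in a register
  return diff == 0;
}

// The attribute must not change observable behaviour.
inline void self_check() {
  assert(check_pin(1234, 1234));
  assert(!check_pin(1234, 4321));
}
```

The effect is only visible in the generated assembly (extra zeroing instructions before the return), not in the program's results.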

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

void created this revision. May 3 2022, 2:55 AM

Herald added a project: Restricted Project. May 3 2022, 2:56 AM

Herald added subscribers: StephenFan, pengfei, hiraditya, kristof.beyls.

void requested review of this revision. May 3 2022, 2:56 AM

Herald added projects: Restricted Project, Restricted Project. May 3 2022, 2:56 AM

Herald added subscribers: llvm-commits, cfe-commits.

kristof.beyls added inline comments. May 3 2022, 2:59 AM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
752	Just a drive-by comment: I'm wondering if SVE registers should also be listed here?

Harbormaster completed remote builds in B162406: Diff 426625. May 3 2022, 3:30 AM

nickdesaulniers added inline comments. May 3 2022, 12:01 PM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
682	What happens if `-fomit-frame-pointer` is specified? Is X29 used as a GPR then?
757	sink this closer to first use, L767
776–778	so for 32b registers, we clear the whole 64b register?
787–792	isn't it more canonical on ARM to move from the dedicated zero register XZR rather than use an immediate?
llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1404	is `i32` correct here?
llvm/lib/Target/X86/X86RegisterInfo.cpp
659	Does this allow us to clean up anything else in the body of this method? Consider making this and the tablegen related patch a distinct child patch.

nickdesaulniers added inline comments. May 3 2022, 1:14 PM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
776–778	Perhaps a more descriptive method name like `getWidestRegisterAlias` or the like? Perhaps we should simply assert if we get a non GPR rather than return 0, which might actually be a Register? Also, TargetRegisterClass has some notion of sub and super register classes. I wonder if we have existing machinery to say, given a register class, what's the equivalent/aliases super register class (if that's even what a super register is).

void added inline comments. May 3 2022, 3:17 PM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
776–778	That's what's happening with GCC. https://godbolt.org/z/W6b7zxYnK
> Perhaps we should simply assert if we get a non GPR rather than return 0, which might actually be a Register?
I'm also using it for the vector registers. And 0 can't be a register. (See `include/llvm/MC/Register.h`.) Might be able to use the TRC. But I see that X86 has `llvm::getX86SubSuperRegisterOrZero` in `X86MCTargetDesc.cpp`, which has a large table of registers so that you can get the register of the proper size.
787–792	GCC outputs the immediate move. I'm not familiar though with what's more canonical.
llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1404	That's what FPR32 is defined to be: def FPR32 : RegisterClass<"AArch64", [f32, i32], 32,(sequence "S%u", 0, 31)>;
llvm/lib/Target/X86/X86RegisterInfo.cpp
659	It's possible to simplify here, but it would take more work. I'll address that in a separate patch.
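
The "0 can't be a register" point above can be illustrated with a small self-contained sketch. The `Reg` type and the register numbers below are hypothetical stand-ins, not LLVM's real `MCRegister` or the AArch64 enum values; the only property borrowed from LLVM is that id 0 is reserved to mean "no register", which lets a lookup helper return 0 for "no mapping" and be tested directly in an `if`.

```cpp
#include <cassert>

// Hypothetical stand-in for a register handle: id 0 means "no register".
class Reg {
  unsigned Id;

public:
  constexpr Reg(unsigned Id = 0) : Id(Id) {}
  constexpr unsigned id() const { return Id; }
  constexpr bool isValid() const { return Id != 0; }
  constexpr explicit operator bool() const { return isValid(); }
};

// Made-up numbering: X0..X18 are 1..19 and W0..W18 are 101..119.
constexpr unsigned XBase = 1, WBase = 101, NumCallUsedGPRs = 19;

// Map a 32-bit or 64-bit GPR to its 64-bit alias, or return the invalid
// register (id 0) for anything that should not be zeroed.
inline Reg getWidestAliasOrZero(Reg R) {
  if (R.id() >= XBase && R.id() < XBase + NumCallUsedGPRs)
    return R; // already the 64-bit register
  if (R.id() >= WBase && R.id() < WBase + NumCallUsedGPRs)
    return Reg(R.id() - WBase + XBase); // Wn -> Xn
  return Reg(); // not a call-used GPR
}

// Usage pattern mirroring `if (MCRegister XReg = getRegisterOrZero(...))`.
inline unsigned widestIdOrZero(unsigned Id) {
  if (Reg X = getWidestAliasOrZero(Reg(Id))) {
    assert(X.isValid() && "a truthy Reg always has a non-zero id");
    return X.id();
  }
  return 0;
}
```

Because the sentinel is outside the valid id range, the helper needs no separate "found" flag or assertion for the fall-through case.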

void added inline comments. May 3 2022, 3:23 PM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
752	I'm not familiar with the SVE registers (I assume you mean the `Z#` and `P#` ones). Could you give an example program?

peterwaller-arm added a subscriber: peterwaller-arm. May 4 2022, 1:32 AM

peterwaller-arm added inline comments.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
752	SVE is slightly trickier here because the set of registers the caller must preserve depends on the signature of the function. This is described here: https://github.com/ARM-software/abi-aa/blob/8a7b266879c60ca1c76e94ebb279b2dac60ed6a5/aapcs64/aapcs64.rst#613scalable-vector-registers
The callee-preserved registers are z8-z23 and p4-p15 if the function is using the VARIANT_PCS. The code for that condition in the asm printer is here: https://github.com/llvm/llvm-project/blob/78fd413cf736953ac623cabf3d5f84c8219e31f8/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp#L864-L875
    if (MF->getFunction().getCallingConv() == CallingConv::AArch64_VectorCall ||
        MF->getFunction().getCallingConv() == CallingConv::AArch64_SVE_VectorCall ||
        STI->getRegisterInfo()->hasSVEArgsOrReturn(MF)) {
Hope that helps a little.

sdesmalen added a subscriber: sdesmalen. May 4 2022, 3:06 AM

sdesmalen added inline comments.

llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1398	Should this feature/attribute work with other calling conventions? If so, then it's probably best not to hard-code these values here, but rather to get them from the chosen calling convention for that particular function. The supported calling conventions are defined in AArch64CallingConvention.td. For example, you could iterate all registers in GPR64/FPR128/ZPR/PPR register classes and zero their values if they are not marked as callee saved. You can query this information from the call by looking at its callee-saved regmask (see for example `CSR_AArch64_AAPCS_RegMask` to see how those are defined).

void added inline comments. May 4 2022, 3:42 PM

llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1398	Ideally it could be retrieved from the `CallingConvention.td` files, but in reality it's difficult because those files have a lot of `CCIf...<>` constructs in them, making a simple query complex. I don't know about the `_RegMask` thing. Could you explain what it is and how it works?
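
As a rough illustration of the `_RegMask` idea raised above: in LLVM a register mask is an array of 32-bit words with one bit per physical register, and a set bit means the register is preserved across the call. The sketch below models that test with a hypothetical mask and made-up register numbers; it does not use LLVM headers or the real `CSR_AArch64_AAPCS_RegMask`, and the helper names are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One bit per register; a set bit means "preserved by the callee".
using RegMask = std::vector<std::uint32_t>;

// True if the register's bit is set, i.e. the callee must preserve it.
inline bool isPreserved(const RegMask &Mask, unsigned Reg) {
  assert(Reg / 32 < Mask.size() && "register number outside the mask");
  return (Mask[Reg / 32] & (1u << (Reg % 32))) != 0;
}

// Collect the registers in [First, Last] that are call-used (not preserved)
// and are therefore candidates for zeroing before return.
inline std::vector<unsigned> callUsedRegs(const RegMask &Mask, unsigned First,
                                          unsigned Last) {
  std::vector<unsigned> Result;
  for (unsigned Reg = First; Reg <= Last; ++Reg)
    if (!isPreserved(Mask, Reg))
      Result.push_back(Reg);
  return Result;
}

// Hypothetical mask for registers numbered 0..30 in which 19..28 are
// preserved (loosely modelled on the r19-r28 comment in the patch).
inline RegMask makeExampleMask() {
  RegMask Mask(1, 0);
  for (unsigned Reg = 19; Reg <= 28; ++Reg)
    Mask[0] |= 1u << Reg;
  return Mask;
}
```

With this mask, `callUsedRegs(makeExampleMask(), 0, 18)` yields all nineteen registers 0..18, which matches the hard-coded R0-R18 range discussed in this review.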

void added inline comments. May 7 2022, 3:37 AM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp

682

GCC only clears registers R0-R18 with or without -fomit-frame-pointer. (That's using -fzero-call-used-regs=all, so register usage isn't a consideration.) I assume that it's correct, or at least close to it.

752

Okay, so GCC does clear out SVE registers when using -march=armv8-a+sve:

mov     z0.h, #0
mov     z1.h, #0
mov     z2.h, #0
mov     z3.h, #0
mov     z4.h, #0
mov     z5.h, #0
mov     z6.h, #0
mov     z7.h, #0
mov     z16.h, #0
mov     z17.h, #0
mov     z18.h, #0
mov     z19.h, #0
mov     z20.h, #0
mov     z21.h, #0
mov     z22.h, #0
mov     z23.h, #0
mov     z24.h, #0
mov     z25.h, #0
mov     z26.h, #0
mov     z27.h, #0
mov     z28.h, #0
mov     z29.h, #0
mov     z30.h, #0
mov     z31.h, #0
pfalse  p0.b
pfalse  p1.b
pfalse  p2.b
pfalse  p3.b
pfalse  p4.b
pfalse  p5.b
pfalse  p6.b
pfalse  p7.b
pfalse  p8.b
pfalse  p9.b
pfalse  p10.b
pfalse  p11.b
pfalse  p12.b
pfalse  p13.b
pfalse  p14.b
pfalse  p15.b

Support SVE registers.
Initial feature to gather argument registers from the *CallingConv.td files.

Harbormaster completed remote builds in B163358: Diff 427909. May 8 2022, 1:34 AM

peterwaller-arm added inline comments. May 9 2022, 1:29 AM

llvm/test/CodeGen/AArch64/zero-call-used-regs.ll
234	Thanks for addressing the SVE case. Please can we have `target-features=+sve` tests as well?

void added inline comments. May 9 2022, 12:28 PM

llvm/test/CodeGen/AArch64/zero-call-used-regs.ll
234	> Thanks for addressing the SVE case. Please can we have `target-features=+sve` tests as well?
Yes. :-) I'm working on a few other clean-up things so they'll be coming soon.

@peterwaller-arm @sdesmalen Could you comment on what is considered the canonical way to zero Arm registers? Is mov x1, #0 the way or mov x1, xzr or some other way?

Used lists of argument registers generated from AArch64CallingConv.td.
Add more tests for floating point and SVE.

Harbormaster completed remote builds in B163586: Diff 428218. May 9 2022, 6:12 PM

I still think this would be easier to review if the isArgumentRegister TableGen changes were split out into a distinct parent patch, the existing x86 implementation were updated to use them, and this patch were then rebased on top as a child patch.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
798	Reuse `HasSVE` from L771?
llvm/test/CodeGen/AArch64/zero-call-used-regs.ll
3–4	If you use `--check-prefixes=CHECK,<unique>` (ie. `--check-prefixes=CHECK,DEFAULT` and `--check-prefixes=CHECK,SVE`) then when `DEFAULT` and `SVE` match, you can just use `CHECK`. That should help reduce the number of checks in this test significantly. Otherwise it's hard to tell what's different between the two cases, if anything at all. update_llc_test_checks should work with --check-prefixes IME.
260–263	N00b question about SVE: do we need `pfalse` for each of the numbered p registers corresponding to the z registers we zeroed? i.e. here we have pfalse for p0-3, yet we zero z0-7.

peterwaller-arm added inline comments. May 11 2022, 12:36 PM

llvm/test/CodeGen/AArch64/zero-call-used-regs.ll
260–263	No, the set of p registers are independent of the z registers. The calling convention states [0] that the predicate registers p0-p3 may be used for parameter passing (if you have an argument which belongs in a p register), so this looks reasonable. [0] https://github.com/ARM-software/abi-aa/blob/8a7b266879c60ca1c76e94ebb279b2dac60ed6a5/aapcs64/aapcs64.rst#scalable-predicate-registers

Fix think-o use of HasSVE. Use --check-prefixes in the testcase.

I'll split off the TableGen changes into a separate patch. It will supersede those changes here, so it shouldn't delay other reviews here. :-)

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
798	Doh! Changed.
llvm/test/CodeGen/AArch64/zero-call-used-regs.ll
3–4	What hath God wrought?! Done.

In D124836#3507268, @void wrote:

I'll split off the TableGen changes into a separate patch. It will supersede those changes here, so it shouldn't delay other reviews here. :-)

I'm referring to the changes to llvm/utils/TableGen/CallingConvEmitter.cpp and llvm/utils/TableGen/RegisterInfoEmitter.cpp. I would like to see those as a parent patch. I'm not sure what you're referring to, and it sounds like a child patch, not a parent patch.

In D124836#3507354, @nickdesaulniers wrote:

In D124836#3507268, @void wrote:

I'll split off the TableGen changes into a separate patch. It will supersede those changes here, so it shouldn't delay other reviews here. :-)

I'm referring to the changes to llvm/utils/TableGen/CallingConvEmitter.cpp and llvm/utils/TableGen/RegisterInfoEmitter.cpp. I would like to see those as a parent patch. I'm not sure what you're referring to, and it sounds like a child patch, not a parent patch.

That's what I meant. And I need those TableGen changes for this patch.

Harbormaster completed remote builds in B163975: Diff 428764. May 11 2022, 4:18 PM

void added a parent revision: D125421: [TableGen] Add generation of argument register lists. May 16 2022, 1:13 PM

Friendly ping.

Don't forget to rebase this on top of https://reviews.llvm.org/D125421.

This revision is now accepted and ready to land. May 19 2022, 12:19 PM

MaskRay accepted this revision. May 19 2022, 12:47 PM

Last update.

This revision was landed with ongoing or failed builds. May 19 2022, 4:58 PM

Closed by commit rG6e00a34cdb49: [AArch64] Add support for -fzero-call-used-regs (authored by void). · Explain Why

This revision was automatically updated to reflect the committed changes.

void added a commit: rG6e00a34cdb49: [AArch64] Add support for -fzero-call-used-regs.

Harbormaster completed remote builds in B165451: Diff 430858. May 19 2022, 6:05 PM

Looks like this commit breaks msvc build: https://lab.llvm.org/buildbot/#/builders/222/builds/532

Hi @void ,

the zero-call-used-regs.ll test fails on the llvm-clang-x86_64-expensive-checks-ubuntu builder with the following errors:

...
*** Bad machine code: Illegal physical register for instruction ***
- function:    all_arg
- basic block: %bb.0 entry (0x555be568bb88)
- instruction: $q0 = MOVID 0
- operand 0:   $q0
$q0 is not a FPR64 register.
...

see more details here https://lab.llvm.org/buildbot/#/builders/104/builds/7797/steps/6/logs/FAIL__LLVM__zero-call-used-regs_ll

Looks like you need to limit this test with REQUIRES: aarch64-registered-target or something similar.

The first failed build: https://lab.llvm.org/buildbot/#/builders/104/builds/7797

In D124836#3528109, @vvereschaka wrote:
Hi @void ,

the zero-call-used-regs.ll test fails on the llvm-clang-x86_64-expensive-checks-ubuntu builder with the following errors:
...
*** Bad machine code: Illegal physical register for instruction ***
- function:    all_arg
- basic block: %bb.0 entry (0x555be568bb88)
- instruction: $q0 = MOVID 0
- operand 0:   $q0
$q0 is not a FPR64 register.
...

I think this is actually the verifier highlighting a codegen issue with the patch. Looks like it just got fixed though.

got it. Yes, looks like it's fixed. The test passed during the last build: https://lab.llvm.org/buildbot/#/builders/104/builds/7812
Thank you.

In D124836#3528521, @vvereschaka wrote:

got it. Yes, looks like it's fixed. The test passed during the last build: https://lab.llvm.org/buildbot/#/builders/104/builds/7812
Thank you.

I'm sorry for the failure. I thought I had reverted the offending change, but didn't push it. :-/

Hi @void,

The msvc build is still broken. https://lab.llvm.org/buildbot/#/builders/222/builds/532

In D124836#3528529, @void wrote:

In D124836#3528521, @vvereschaka wrote:

got it. Yes, looks like it's fixed. The test passed during the last build: https://lab.llvm.org/buildbot/#/builders/104/builds/7812
Thank you.

I'm sorry for the failure. I thought I had reverted the offending change, but didn't push it. :-/

Allen added a subscriber: Allen. May 21 2022, 10:10 PM

Revision Contents

Path                                              Size
clang/include/clang/Driver/Options.td             2 lines
clang/lib/Driver/ToolChains/Clang.cpp             2 lines
llvm/lib/Target/AArch64/AArch64FrameLowering.h    4 lines
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp  132 lines
llvm/lib/Target/AArch64/AArch64RegisterInfo.h     3 lines
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp   64 lines
llvm/lib/Target/AArch64/AArch64RegisterInfo.td    9 lines
llvm/lib/Target/X86/X86RegisterInfo.cpp           2 lines
llvm/test/CodeGen/AArch64/zero-call-used-regs.ll  666 lines
llvm/utils/TableGen/RegisterInfoEmitter.cpp       16 lines

Diff 430860

clang/include/clang/Driver/Options.td

def fenable_matrix : Flag<["-"], "fenable-matrix">, Group<f_Group>,
Flags<[CC1Option]>,
HelpText<"Enable matrix data type and related builtin functions">,
MarshallingInfoFlag<LangOpts<"MatrixTypes">>;

def fzero_call_used_regs_EQ
: Joined<["-"], "fzero-call-used-regs=">, Group<f_Group>, Flags<[CC1Option]>,
-HelpText<"Clear call-used registers upon function return.">,
+HelpText<"Clear call-used registers upon function return (AArch64/x86 only)">,
Values<"skip,used-gpr-arg,used-gpr,used-arg,used,all-gpr-arg,all-gpr,all-arg,all">,
NormalizedValues<["Skip", "UsedGPRArg", "UsedGPR", "UsedArg", "Used",
"AllGPRArg", "AllGPR", "AllArg", "All"]>,
NormalizedValuesScope<"llvm::ZeroCallUsedRegs::ZeroCallUsedRegsKind">,
MarshallingInfoEnum<CodeGenOpts<"ZeroCallUsedRegs">, "Skip">;

def fdebug_types_section: Flag <["-"], "fdebug-types-section">, Group<f_Group>,
HelpText<"Place debug types in their own section (ELF Only)">;
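
The nine accepted values in the hunk above combine three independent filters (only used registers, only GPRs, only argument registers) plus `skip`. A minimal sketch of that decomposition follows; the `ZeroRegsPolicy` struct and `lookupPolicy` helper are hypothetical and are not how Clang or LLVM actually represent `ZeroCallUsedRegsKind`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical decomposition of each option value into independent filters.
struct ZeroRegsPolicy {
  const char *Name; // spelling accepted after -fzero-call-used-regs=
  bool Skip;        // do not zero anything
  bool OnlyUsed;    // restrict to registers the function actually used
  bool OnlyGPR;     // restrict to general-purpose registers
  bool OnlyArg;     // restrict to argument-passing registers
};

static const ZeroRegsPolicy Policies[] = {
    {"skip", true, false, false, false},
    {"used-gpr-arg", false, true, true, true},
    {"used-gpr", false, true, true, false},
    {"used-arg", false, true, false, true},
    {"used", false, true, false, false},
    {"all-gpr-arg", false, false, true, true},
    {"all-gpr", false, false, true, false},
    {"all-arg", false, false, false, true},
    {"all", false, false, false, false},
};

// Returns the matching policy, or nullptr for an unrecognised value.
inline const ZeroRegsPolicy *lookupPolicy(const char *Value) {
  assert(Value && "option value must not be null");
  for (const ZeroRegsPolicy &P : Policies)
    if (std::strcmp(P.Name, Value) == 0)
      return &P;
  return nullptr;
}
```

Under this reading, `all` applies no filter at all, which is why the review discussion uses `-fzero-call-used-regs=all` to see the full set of registers a target is willing to clear.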

clang/lib/Driver/ToolChains/Clang.cpp

Args.AddLastArg(CmdArgs, options::OPT_femulated_tls,
options::OPT_fno_emulated_tls);
Args.AddLastArg(CmdArgs, options::OPT_fzero_call_used_regs_EQ);

if (Arg *A = Args.getLastArg(options::OPT_fzero_call_used_regs_EQ)) {
// FIXME: There's no reason for this to be restricted to X86. The backend
// code needs to be changed to include the appropriate function calls
// automatically.
-if (!Triple.isX86())
+if (!Triple.isX86() && !Triple.isAArch64())
D.Diag(diag::err_drv_unsupported_opt_for_target)
<< A->getAsString(Args) << TripleStr;
}

// AltiVec-like language extensions aren't relevant for assembling.
if (!isa<PreprocessJobAction>(JA) || Output.getType() != types::TY_PP_Asm)
Args.AddLastArg(CmdArgs, options::OPT_fzvector);

llvm/lib/Target/AArch64/AArch64FrameLowering.h

private:
void emitCalleeSavedGPRLocations(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI) const;
void emitCalleeSavedSVELocations(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI) const;
void emitCalleeSavedGPRRestores(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI) const;
void emitCalleeSavedSVERestores(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI) const;

		/// Emit target zero call-used regs.
		void emitZeroCallUsedRegs(BitVector RegsToZero,
		MachineBasicBlock &MBB) const override;
};

} // End llvm namespace

#endif

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp

void AArch64FrameLowering::emitCalleeSavedGPRRestores(
emitCalleeSavedRestores(MBB, MBBI, false);
}

void AArch64FrameLowering::emitCalleeSavedSVERestores(
MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
emitCalleeSavedRestores(MBB, MBBI, true);
}

		static MCRegister getRegisterOrZero(MCRegister Reg, bool HasSVE) {
		switch (Reg.id()) {
		default:
		// The called routine is expected to preserve r19-r28
		// r29 and r30 are used as frame pointer and link register resp.
		return 0;

		// GPRs
		#define CASE(n) \
		case AArch64::W##n: \
		case AArch64::X##n: \
		return AArch64::X##n
		CASE(0);
		CASE(1);
		CASE(2);
		CASE(3);
		CASE(4);
		CASE(5);
		CASE(6);
		CASE(7);
		CASE(8);
		CASE(9);
		CASE(10);
		CASE(11);
		CASE(12);
		CASE(13);
		CASE(14);
		CASE(15);
		CASE(16);
		CASE(17);
		CASE(18);
		#undef CASE

		// FPRs
		#define CASE(n) \
		case AArch64::B##n: \
		case AArch64::H##n: \
		case AArch64::S##n: \
		case AArch64::D##n: \
		case AArch64::Q##n: \
		return HasSVE ? AArch64::Z##n : AArch64::Q##n
		CASE(0);
		CASE(1);
		CASE(2);
		CASE(3);
		CASE(4);
		CASE(5);
		CASE(6);
		CASE(7);
		CASE(8);
		CASE(9);
		CASE(10);
		CASE(11);
		CASE(12);
		CASE(13);
		CASE(14);
		CASE(15);
		CASE(16);
		CASE(17);
		CASE(18);
		CASE(19);
		CASE(20);
		CASE(21);
		CASE(22);
		CASE(23);
		CASE(24);
		CASE(25);
		CASE(26);
		CASE(27);
		CASE(28);
		CASE(29);
		CASE(30);
		CASE(31);
		#undef CASE
		}
		}

		void AArch64FrameLowering::emitZeroCallUsedRegs(BitVector RegsToZero,
		MachineBasicBlock &MBB) const {
		// Insertion point.
		MachineBasicBlock::iterator MBBI = MBB.getFirstTerminator();

		// Fake a debug loc.
		DebugLoc DL;
		if (MBBI != MBB.end())
		DL = MBBI->getDebugLoc();

		const MachineFunction &MF = *MBB.getParent();
		const AArch64Subtarget &STI = MF.getSubtarget<AArch64Subtarget>();
		const AArch64RegisterInfo &TRI = *STI.getRegisterInfo();

		BitVector GPRsToZero(TRI.getNumRegs());
		BitVector FPRsToZero(TRI.getNumRegs());
		bool HasSVE = STI.hasSVE();
		for (MCRegister Reg : RegsToZero.set_bits()) {
		if (TRI.isGeneralPurposeRegister(MF, Reg)) {
		// For GPRs, we only care to clear out the 64-bit register.
		if (MCRegister XReg = getRegisterOrZero(Reg, HasSVE))
		GPRsToZero.set(XReg);
		} else if (AArch64::FPR128RegClass.contains(Reg) ||
		AArch64::FPR64RegClass.contains(Reg) ||
		AArch64::FPR32RegClass.contains(Reg) ||
		AArch64::FPR16RegClass.contains(Reg) ||
		AArch64::FPR8RegClass.contains(Reg)) {
		// For FPRs,
		if (MCRegister XReg = getRegisterOrZero(Reg, HasSVE))
		FPRsToZero.set(XReg);
		}
		}

		const AArch64InstrInfo &TII = *STI.getInstrInfo();

		// Zero out GPRs.
		for (MCRegister Reg : GPRsToZero.set_bits())
		BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVi64imm), Reg).addImm(0);

		// Zero out FP/vector registers.
		for (MCRegister Reg : FPRsToZero.set_bits())
		BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVID), Reg).addImm(0);

		if (HasSVE) {
		for (MCRegister PReg :
		{AArch64::P0, AArch64::P1, AArch64::P2, AArch64::P3, AArch64::P4,
		AArch64::P5, AArch64::P6, AArch64::P7, AArch64::P8, AArch64::P9,
		AArch64::P10, AArch64::P11, AArch64::P12, AArch64::P13, AArch64::P14,
		AArch64::P15}) {
		if (RegsToZero[PReg])
		BuildMI(MBB, MBBI, DL, TII.get(AArch64::PFALSE), PReg);
		}
		}
		}

// Find a scratch register that we can use at the start of the prologue to
// re-align the stack pointer. We avoid using callee-save registers since they
// may appear to be free when this is called from canUseAsPrologue (during
// shrink wrapping), but then no longer be free when this is called from
// emitPrologue.
//
// FIXME: This is a bit conservative, since in the above case we could use one
// of the callee-save registers as a scratch temp to re-align the stack pointer,

llvm/lib/Target/AArch64/AArch64RegisterInfo.h

void eliminateFrameIndex(MachineBasicBlock::iterator II, int SPAdj,
unsigned FIOperandNum,
RegScavenger *RS = nullptr) const override;
bool cannotEliminateFrame(const MachineFunction &MF) const;

bool requiresVirtualBaseRegisters(const MachineFunction &MF) const override;
bool hasBasePointer(const MachineFunction &MF) const;
unsigned getBaseRegister() const;

		bool isArgumentRegister(const MachineFunction &MF,
		MCRegister Reg) const override;

// Debug information queries.
Register getFrameRegister(const MachineFunction &MF) const override;

unsigned getRegPressureLimit(const TargetRegisterClass *RC,
MachineFunction &MF) const override;

unsigned getLocalAddressRegister(const MachineFunction &MF) const;
bool regNeedsCFI(unsigned Reg, unsigned &RegToUseForCFI) const;

llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp

#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DiagnosticInfo.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetOptions.h"

using namespace llvm;

		#define GET_CC_REGISTER_LISTS
		#include "AArch64GenCallingConv.inc"
#define GET_REGINFO_TARGET_DESC
#include "AArch64GenRegisterInfo.inc"

AArch64RegisterInfo::AArch64RegisterInfo(const Triple &TT)
: AArch64GenRegisterInfo(AArch64::LR), TT(TT) {
AArch64_MC::initLLVMToCVRegMapping(this);
}

[369 lines not shown]
if (MFI.hasVarSizedObjects() || MF.hasEHFunclets()) {
// object; it's just suboptimal. Negative offsets use the unscaled
// load/store instructions, which have a 9-bit signed immediate.
return MFI.getLocalFrameSize() >= 256;
}

return false;
}

		bool AArch64RegisterInfo::isArgumentRegister(const MachineFunction &MF,
		MCRegister Reg) const {
		CallingConv::ID CC = MF.getFunction().getCallingConv();
		const AArch64Subtarget &STI = MF.getSubtarget<AArch64Subtarget>();
		bool IsVarArg = STI.isCallingConvWin64(MF.getFunction().getCallingConv());

		auto HasReg = [](ArrayRef<MCRegister> RegList, MCRegister Reg) {
		return llvm::any_of(RegList,
		[Reg](const MCRegister R) { return R == Reg; });
		};

		switch (CC) {
		default:
		report_fatal_error("Unsupported calling convention.");
		case CallingConv::WebKit_JS:
		return HasReg(CC_AArch64_WebKit_JS_ArgRegs, Reg);
		case CallingConv::GHC:
		return HasReg(CC_AArch64_GHC_ArgRegs, Reg);
		case CallingConv::C:
		case CallingConv::Fast:
		case CallingConv::PreserveMost:
		case CallingConv::CXX_FAST_TLS:
		case CallingConv::Swift:
		case CallingConv::SwiftTail:
		case CallingConv::Tail:
		if (STI.isTargetWindows() && IsVarArg)
		return HasReg(CC_AArch64_Win64_VarArg_ArgRegs, Reg);
		if (!STI.isTargetDarwin()) {
		switch (CC) {
		default:
		return HasReg(CC_AArch64_AAPCS_ArgRegs, Reg);
		case CallingConv::Swift:
		case CallingConv::SwiftTail:
		return HasReg(CC_AArch64_AAPCS_ArgRegs, Reg) ||
		HasReg(CC_AArch64_AAPCS_Swift_ArgRegs, Reg);
		}
		}
		if (!IsVarArg) {
		switch (CC) {
		default:
		return HasReg(CC_AArch64_DarwinPCS_ArgRegs, Reg);
		case CallingConv::Swift:
		case CallingConv::SwiftTail:
		return HasReg(CC_AArch64_DarwinPCS_ArgRegs, Reg) ||
		HasReg(CC_AArch64_DarwinPCS_Swift_ArgRegs, Reg);
		}
		}
		if (STI.isTargetILP32())
		return HasReg(CC_AArch64_DarwinPCS_ILP32_VarArg_ArgRegs, Reg);
		return HasReg(CC_AArch64_DarwinPCS_VarArg_ArgRegs, Reg);
		case CallingConv::Win64:
		if (IsVarArg)
		return HasReg(CC_AArch64_Win64_VarArg_ArgRegs, Reg);
		return HasReg(CC_AArch64_AAPCS_ArgRegs, Reg);
		case CallingConv::CFGuard_Check:
		return HasReg(CC_AArch64_Win64_CFGuard_Check_ArgRegs, Reg);
		case CallingConv::AArch64_VectorCall:
		case CallingConv::AArch64_SVE_VectorCall:
		return HasReg(CC_AArch64_AAPCS_ArgRegs, Reg);
		}
		}

Register		Register
AArch64RegisterInfo::getFrameRegister(const MachineFunction &MF) const {		AArch64RegisterInfo::getFrameRegister(const MachineFunction &MF) const {
const AArch64FrameLowering *TFI = getFrameLowering(MF);		const AArch64FrameLowering *TFI = getFrameLowering(MF);
return TFI->hasFP(MF) ? AArch64::FP : AArch64::SP;		return TFI->hasFP(MF) ? AArch64::FP : AArch64::SP;
}		}

bool AArch64RegisterInfo::requiresRegisterScavenging(		bool AArch64RegisterInfo::requiresRegisterScavenging(
const MachineFunction &MF) const {		const MachineFunction &MF) const {

llvm/lib/Target/AArch64/AArch64RegisterInfo.td

def svcr_op : Operand<i32> {		def svcr_op : Operand<i32> {
let PrintMethod = "printSVCROp";		let PrintMethod = "printSVCROp";
let DecoderMethod = "DecodeSVCROp";		let DecoderMethod = "DecodeSVCROp";
let MCOperandPredicate = [{		let MCOperandPredicate = [{
if (!MCOp.isImm())		if (!MCOp.isImm())
return false;		return false;
return AArch64SVCR::lookupSVCRByEncoding(MCOp.getImm()) != nullptr;		return AArch64SVCR::lookupSVCRByEncoding(MCOp.getImm()) != nullptr;
}];		}];
}		}

		//===----------------------------------------------------------------------===//
		// Register categories.
		//

		def GeneralPurposeRegisters : RegisterCategory<[GPR64, GPR32]>;

		def FIXED_REGS : RegisterClass<"AArch64", [i64], 64, (add FP, SP, VG, FFR)>;
		def FixedRegisters : RegisterCategory<[CCR, FIXED_REGS]>;
		nickdesaulniers: is `i32` correct here?
		void: That's what FPR32 is defined to be: `def FPR32 : RegisterClass<"AArch64", [f32, i32], 32, (sequence "S%u", 0, 31)>;`
		sdesmalen: Should this feature/attribute work with other calling conventions? If so, then it's probably best not to hard-code these values here, but rather to get them from the chosen calling convention for that particular function. The supported calling conventions are defined in AArch64CallingConvention.td. For example, you could iterate all registers in GPR64/FPR128/ZPR/PPR register classes and zero their values if they are not marked as callee saved. You can query this information from the call by looking at its callee-saved regmask (see for example `CSR_AArch64_AAPCS_RegMask` to see how those are defined).
		void: Ideally it could be retrieved from the `*CallingConvention.td` files, but in reality it's difficult because those files have a lot of `CCIf...<>` constructs in them, making a simple query complex. I don't know about the `_RegMask` thing. Could you explain what it is and how it works?
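As a rough illustration of the `_RegMask` idea raised in this thread, here is a minimal standalone sketch. It assumes, as far as I understand LLVM's convention, that a register mask is an array of 32-bit words with one bit per physical register, and that a set bit marks a register preserved across the call. The names `PhysReg`, `isPreserved`, and `callUsedRegs` are hypothetical and are not LLVM APIs; in LLVM itself a comparable test appears to be available as `MachineOperand::clobbersPhysReg`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an LLVM physical register number.
using PhysReg = unsigned;

// Assumed layout: one bit per physical register, packed into 32-bit words.
// A set bit means the register is preserved (callee-saved) across the call.
inline bool isPreserved(const uint32_t *RegMask, PhysReg Reg) {
  return (RegMask[Reg / 32] >> (Reg % 32)) & 1u;
}

// Return the candidates that are NOT preserved by the mask, i.e. the
// call-used registers that a zeroing pass would need to clear.
std::vector<PhysReg> callUsedRegs(const uint32_t *RegMask,
                                  const std::vector<PhysReg> &Candidates) {
  std::vector<PhysReg> Out;
  for (PhysReg R : Candidates)
    if (!isPreserved(RegMask, R))
      Out.push_back(R);
  return Out;
}
```

With this shape, the zeroing logic would walk the registers of the relevant classes and keep only those for which `isPreserved` is false, instead of hard-coding a list per calling convention.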

llvm/lib/Target/X86/X86RegisterInfo.cpp

bool X86RegisterInfo::isArgumentRegister(const MachineFunction &MF,		bool X86RegisterInfo::isArgumentRegister(const MachineFunction &MF,

if (ST.hasSSE1() &&		if (ST.hasSSE1() &&
llvm::any_of(SmallVector<MCRegister>{X86::XMM0, X86::XMM1, X86::XMM2,		llvm::any_of(SmallVector<MCRegister>{X86::XMM0, X86::XMM1, X86::XMM2,
X86::XMM3, X86::XMM4, X86::XMM5,		X86::XMM3, X86::XMM4, X86::XMM5,
X86::XMM6, X86::XMM7},		X86::XMM6, X86::XMM7},
[&](MCRegister &RegA) { return IsSubReg(RegA, Reg); }))		[&](MCRegister &RegA) { return IsSubReg(RegA, Reg); }))
return true;		return true;

return false;		return X86GenRegisterInfo::isArgumentRegister(MF, Reg);
		nickdesaulniers: Does this allow us to clean up anything else in the body of this method? Consider making this and the tablegen related patch a distinct child patch.
		void: It's possible to simplify here, but it would take more work. I'll address that in a separate patch.
}		}

bool X86RegisterInfo::isFixedRegister(const MachineFunction &MF,		bool X86RegisterInfo::isFixedRegister(const MachineFunction &MF,
MCRegister PhysReg) const {		MCRegister PhysReg) const {
const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();		const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();
const TargetRegisterInfo &TRI = *ST.getRegisterInfo();		const TargetRegisterInfo &TRI = *ST.getRegisterInfo();

// Stack pointer.		// Stack pointer.

llvm/test/CodeGen/AArch64/zero-call-used-regs.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=aarch64-unknown-unknown | FileCheck %s --check-prefixes=CHECK,DEFAULT
				; RUN: llc < %s -mtriple=aarch64-unknown-unknown -mattr=+sve | FileCheck %s --check-prefixes=CHECK,SVE

				nickdesaulniers: If you use `--check-prefixes=CHECK,<unique>` (i.e. `--check-prefixes=CHECK,DEFAULT` and `--check-prefixes=CHECK,SVE`) then when `DEFAULT` and `SVE` match, you can just use `CHECK`. That should help reduce the number of checks in this test significantly. Otherwise it's hard to tell what's different between the two cases, if anything at all. update_llc_test_checks should work with --check-prefixes IME.
				void: What hath God wrought?! Done.
				@result = dso_local global i32 0, align 4

				define dso_local i32 @skip(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 "zero-call-used-regs"="skip" {
				; CHECK-LABEL: skip:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: mul w8, w1, w0
				; CHECK-NEXT: orr w0, w8, w2
				; CHECK-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local i32 @used_gpr_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="used-gpr-arg" {
				; CHECK-LABEL: used_gpr_arg:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: mul w8, w1, w0
				; CHECK-NEXT: orr w0, w8, w2
				; CHECK-NEXT: mov x1, #0
				; CHECK-NEXT: mov x2, #0
				; CHECK-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local i32 @used_gpr(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="used-gpr" {
				; CHECK-LABEL: used_gpr:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: mul w8, w1, w0
				; CHECK-NEXT: orr w0, w8, w2
				; CHECK-NEXT: mov x1, #0
				; CHECK-NEXT: mov x2, #0
				; CHECK-NEXT: mov x8, #0
				; CHECK-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local i32 @used_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="used-arg" {
				; CHECK-LABEL: used_arg:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: mul w8, w1, w0
				; CHECK-NEXT: orr w0, w8, w2
				; CHECK-NEXT: mov x1, #0
				; CHECK-NEXT: mov x2, #0
				; CHECK-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local i32 @used(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="used" {
				; CHECK-LABEL: used:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: mul w8, w1, w0
				; CHECK-NEXT: orr w0, w8, w2
				; CHECK-NEXT: mov x1, #0
				; CHECK-NEXT: mov x2, #0
				; CHECK-NEXT: mov x8, #0
				; CHECK-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local i32 @all_gpr_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 "zero-call-used-regs"="all-gpr-arg" {
				; CHECK-LABEL: all_gpr_arg:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: mul w8, w1, w0
				; CHECK-NEXT: mov x1, #0
				; CHECK-NEXT: mov x3, #0
				; CHECK-NEXT: mov x4, #0
				; CHECK-NEXT: orr w0, w8, w2
				; CHECK-NEXT: mov x2, #0
				; CHECK-NEXT: mov x5, #0
				; CHECK-NEXT: mov x6, #0
				; CHECK-NEXT: mov x7, #0
				; CHECK-NEXT: mov x8, #0
				; CHECK-NEXT: mov x18, #0
				; CHECK-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local i32 @all_gpr(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 "zero-call-used-regs"="all-gpr" {
				; CHECK-LABEL: all_gpr:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: mul w8, w1, w0
				; CHECK-NEXT: mov x1, #0
				; CHECK-NEXT: mov x3, #0
				; CHECK-NEXT: mov x4, #0
				; CHECK-NEXT: orr w0, w8, w2
				; CHECK-NEXT: mov x2, #0
				; CHECK-NEXT: mov x5, #0
				; CHECK-NEXT: mov x6, #0
				; CHECK-NEXT: mov x7, #0
				; CHECK-NEXT: mov x8, #0
				; CHECK-NEXT: mov x9, #0
				; CHECK-NEXT: mov x10, #0
				; CHECK-NEXT: mov x11, #0
				; CHECK-NEXT: mov x12, #0
				; CHECK-NEXT: mov x13, #0
				; CHECK-NEXT: mov x14, #0
				; CHECK-NEXT: mov x15, #0
				; CHECK-NEXT: mov x16, #0
				; CHECK-NEXT: mov x17, #0
				; CHECK-NEXT: mov x18, #0
				; CHECK-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 "zero-call-used-regs"="all-arg" {
				; DEFAULT-LABEL: all_arg:
				; DEFAULT: // %bb.0: // %entry
				; DEFAULT-NEXT: mul w8, w1, w0
				; DEFAULT-NEXT: mov x1, #0
				; DEFAULT-NEXT: mov x3, #0
				; DEFAULT-NEXT: mov x4, #0
				; DEFAULT-NEXT: orr w0, w8, w2
				; DEFAULT-NEXT: mov x2, #0
				; DEFAULT-NEXT: mov x5, #0
				; DEFAULT-NEXT: mov x6, #0
				; DEFAULT-NEXT: mov x7, #0
				; DEFAULT-NEXT: mov x8, #0
				; DEFAULT-NEXT: mov x18, #0
				; DEFAULT-NEXT: movi q0, #0000000000000000
				; DEFAULT-NEXT: movi q1, #0000000000000000
				; DEFAULT-NEXT: movi q2, #0000000000000000
				; DEFAULT-NEXT: movi q3, #0000000000000000
				; DEFAULT-NEXT: movi q4, #0000000000000000
				; DEFAULT-NEXT: movi q5, #0000000000000000
				; DEFAULT-NEXT: movi q6, #0000000000000000
				; DEFAULT-NEXT: movi q7, #0000000000000000
				; DEFAULT-NEXT: ret
				;
				; SVE-LABEL: all_arg:
				; SVE: // %bb.0: // %entry
				; SVE-NEXT: mul w8, w1, w0
				; SVE-NEXT: mov x1, #0
				; SVE-NEXT: mov x3, #0
				; SVE-NEXT: mov x4, #0
				; SVE-NEXT: orr w0, w8, w2
				; SVE-NEXT: mov x2, #0
				; SVE-NEXT: mov x5, #0
				; SVE-NEXT: mov x6, #0
				; SVE-NEXT: mov x7, #0
				; SVE-NEXT: mov x8, #0
				; SVE-NEXT: mov x18, #0
				; SVE-NEXT: movi z0, #0000000000000000
				; SVE-NEXT: movi z1, #0000000000000000
				; SVE-NEXT: movi z2, #0000000000000000
				; SVE-NEXT: movi z3, #0000000000000000
				; SVE-NEXT: movi z4, #0000000000000000
				; SVE-NEXT: movi z5, #0000000000000000
				; SVE-NEXT: movi z6, #0000000000000000
				; SVE-NEXT: movi z7, #0000000000000000
				; SVE-NEXT: pfalse p0.b
				; SVE-NEXT: pfalse p1.b
				; SVE-NEXT: pfalse p2.b
				; SVE-NEXT: pfalse p3.b
				; SVE-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local i32 @all(i32 noundef %a, i32 noundef %b, i32 noundef %c) local_unnamed_addr #0 "zero-call-used-regs"="all" {
				; DEFAULT-LABEL: all:
				; DEFAULT: // %bb.0: // %entry
				; DEFAULT-NEXT: mul w8, w1, w0
				; DEFAULT-NEXT: mov x1, #0
				; DEFAULT-NEXT: mov x3, #0
				; DEFAULT-NEXT: mov x4, #0
				; DEFAULT-NEXT: orr w0, w8, w2
				; DEFAULT-NEXT: mov x2, #0
				; DEFAULT-NEXT: mov x5, #0
				; DEFAULT-NEXT: mov x6, #0
				; DEFAULT-NEXT: mov x7, #0
				; DEFAULT-NEXT: mov x8, #0
				; DEFAULT-NEXT: mov x9, #0
				; DEFAULT-NEXT: mov x10, #0
				; DEFAULT-NEXT: mov x11, #0
				; DEFAULT-NEXT: mov x12, #0
				; DEFAULT-NEXT: mov x13, #0
				; DEFAULT-NEXT: mov x14, #0
				; DEFAULT-NEXT: mov x15, #0
				; DEFAULT-NEXT: mov x16, #0
				; DEFAULT-NEXT: mov x17, #0
				; DEFAULT-NEXT: mov x18, #0
				; DEFAULT-NEXT: movi q0, #0000000000000000
				; DEFAULT-NEXT: movi q1, #0000000000000000
				; DEFAULT-NEXT: movi q2, #0000000000000000
				; DEFAULT-NEXT: movi q3, #0000000000000000
				; DEFAULT-NEXT: movi q4, #0000000000000000
				; DEFAULT-NEXT: movi q5, #0000000000000000
				; DEFAULT-NEXT: movi q6, #0000000000000000
				; DEFAULT-NEXT: movi q7, #0000000000000000
				; DEFAULT-NEXT: movi q8, #0000000000000000
				; DEFAULT-NEXT: movi q9, #0000000000000000
				; DEFAULT-NEXT: movi q10, #0000000000000000
				; DEFAULT-NEXT: movi q11, #0000000000000000
				; DEFAULT-NEXT: movi q12, #0000000000000000
				; DEFAULT-NEXT: movi q13, #0000000000000000
				; DEFAULT-NEXT: movi q14, #0000000000000000
				; DEFAULT-NEXT: movi q15, #0000000000000000
				; DEFAULT-NEXT: movi q16, #0000000000000000
				; DEFAULT-NEXT: movi q17, #0000000000000000
				; DEFAULT-NEXT: movi q18, #0000000000000000
				; DEFAULT-NEXT: movi q19, #0000000000000000
				peterwaller-arm: Thanks for addressing the SVE case. Please can we have `target-features=+sve` tests as well?
				void: > Thanks for addressing the SVE case. Please can we have `target-features=+sve` tests as well?
				Yes. :-) I'm working on a few other clean-up things so they'll be coming soon.
				; DEFAULT-NEXT: movi q20, #0000000000000000
				; DEFAULT-NEXT: movi q21, #0000000000000000
				; DEFAULT-NEXT: movi q22, #0000000000000000
				; DEFAULT-NEXT: movi q23, #0000000000000000
				; DEFAULT-NEXT: movi q24, #0000000000000000
				; DEFAULT-NEXT: movi q25, #0000000000000000
				; DEFAULT-NEXT: movi q26, #0000000000000000
				; DEFAULT-NEXT: movi q27, #0000000000000000
				; DEFAULT-NEXT: movi q28, #0000000000000000
				; DEFAULT-NEXT: movi q29, #0000000000000000
				; DEFAULT-NEXT: movi q30, #0000000000000000
				; DEFAULT-NEXT: movi q31, #0000000000000000
				; DEFAULT-NEXT: ret
				;
				; SVE-LABEL: all:
				; SVE: // %bb.0: // %entry
				; SVE-NEXT: mul w8, w1, w0
				; SVE-NEXT: mov x1, #0
				; SVE-NEXT: mov x3, #0
				; SVE-NEXT: mov x4, #0
				; SVE-NEXT: orr w0, w8, w2
				; SVE-NEXT: mov x2, #0
				; SVE-NEXT: mov x5, #0
				; SVE-NEXT: mov x6, #0
				; SVE-NEXT: mov x7, #0
				; SVE-NEXT: mov x8, #0
				; SVE-NEXT: mov x9, #0
				; SVE-NEXT: mov x10, #0
				; SVE-NEXT: mov x11, #0
				nickdesaulniers: N00b question about SVE: do we need `pfalse` for each of the numbered p registers corresponding to the x registers we zeroed? i.e. here we have pfalse for p0-3, yet we zero z0-7.
				peterwaller-arm: No, the set of p registers is independent of the z registers. The calling convention states [0] that the predicate registers p0-p3 may be used for parameter passing (if you have an argument which belongs in a p register), so this looks reasonable. [0] https://github.com/ARM-software/abi-aa/blob/8a7b266879c60ca1c76e94ebb279b2dac60ed6a5/aapcs64/aapcs64.rst#scalable-predicate-registers
				; SVE-NEXT: mov x12, #0
				; SVE-NEXT: mov x13, #0
				; SVE-NEXT: mov x14, #0
				; SVE-NEXT: mov x15, #0
				; SVE-NEXT: mov x16, #0
				; SVE-NEXT: mov x17, #0
				; SVE-NEXT: mov x18, #0
				; SVE-NEXT: movi z0, #0000000000000000
				; SVE-NEXT: movi z1, #0000000000000000
				; SVE-NEXT: movi z2, #0000000000000000
				; SVE-NEXT: movi z3, #0000000000000000
				; SVE-NEXT: movi z4, #0000000000000000
				; SVE-NEXT: movi z5, #0000000000000000
				; SVE-NEXT: movi z6, #0000000000000000
				; SVE-NEXT: movi z7, #0000000000000000
				; SVE-NEXT: movi z8, #0000000000000000
				; SVE-NEXT: movi z9, #0000000000000000
				; SVE-NEXT: movi z10, #0000000000000000
				; SVE-NEXT: movi z11, #0000000000000000
				; SVE-NEXT: movi z12, #0000000000000000
				; SVE-NEXT: movi z13, #0000000000000000
				; SVE-NEXT: movi z14, #0000000000000000
				; SVE-NEXT: movi z15, #0000000000000000
				; SVE-NEXT: movi z16, #0000000000000000
				; SVE-NEXT: movi z17, #0000000000000000
				; SVE-NEXT: movi z18, #0000000000000000
				; SVE-NEXT: movi z19, #0000000000000000
				; SVE-NEXT: movi z20, #0000000000000000
				; SVE-NEXT: movi z21, #0000000000000000
				; SVE-NEXT: movi z22, #0000000000000000
				; SVE-NEXT: movi z23, #0000000000000000
				; SVE-NEXT: movi z24, #0000000000000000
				; SVE-NEXT: movi z25, #0000000000000000
				; SVE-NEXT: movi z26, #0000000000000000
				; SVE-NEXT: movi z27, #0000000000000000
				; SVE-NEXT: movi z28, #0000000000000000
				; SVE-NEXT: movi z29, #0000000000000000
				; SVE-NEXT: movi z30, #0000000000000000
				; SVE-NEXT: movi z31, #0000000000000000
				; SVE-NEXT: pfalse p0.b
				; SVE-NEXT: pfalse p1.b
				; SVE-NEXT: pfalse p2.b
				; SVE-NEXT: pfalse p3.b
				; SVE-NEXT: pfalse p4.b
				; SVE-NEXT: pfalse p5.b
				; SVE-NEXT: pfalse p6.b
				; SVE-NEXT: pfalse p7.b
				; SVE-NEXT: pfalse p8.b
				; SVE-NEXT: pfalse p9.b
				; SVE-NEXT: pfalse p10.b
				; SVE-NEXT: pfalse p11.b
				; SVE-NEXT: pfalse p12.b
				; SVE-NEXT: pfalse p13.b
				; SVE-NEXT: pfalse p14.b
				; SVE-NEXT: pfalse p15.b
				; SVE-NEXT: ret

				entry:
				%mul = mul nsw i32 %b, %a
				%or = or i32 %mul, %c
				ret i32 %or
				}

				define dso_local double @skip_float(double noundef %a, float noundef %b) local_unnamed_addr #0 "zero-call-used-regs"="skip" {
				; CHECK-LABEL: skip_float:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: fcvt d1, s1
				; CHECK-NEXT: fmul d0, d1, d0
				; CHECK-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				define dso_local double @used_gpr_arg_float(double noundef %a, float noundef %b) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="used-gpr-arg" {
				; CHECK-LABEL: used_gpr_arg_float:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: fcvt d1, s1
				; CHECK-NEXT: fmul d0, d1, d0
				; CHECK-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				define dso_local double @used_gpr_float(double noundef %a, float noundef %b) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="used-gpr" {
				; CHECK-LABEL: used_gpr_float:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: fcvt d1, s1
				; CHECK-NEXT: fmul d0, d1, d0
				; CHECK-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				define dso_local double @used_arg_float(double noundef %a, float noundef %b) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="used-arg" {
				; DEFAULT-LABEL: used_arg_float:
				; DEFAULT: // %bb.0: // %entry
				; DEFAULT-NEXT: fcvt d1, s1
				; DEFAULT-NEXT: fmul d0, d1, d0
				; DEFAULT-NEXT: movi q1, #0000000000000000
				; DEFAULT-NEXT: ret
				;
				; SVE-LABEL: used_arg_float:
				; SVE: // %bb.0: // %entry
				; SVE-NEXT: fcvt d1, s1
				; SVE-NEXT: fmul d0, d1, d0
				; SVE-NEXT: movi z1, #0000000000000000
				; SVE-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				define dso_local double @used_float(double noundef %a, float noundef %b) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="used" {
				; DEFAULT-LABEL: used_float:
				; DEFAULT: // %bb.0: // %entry
				; DEFAULT-NEXT: fcvt d1, s1
				; DEFAULT-NEXT: fmul d0, d1, d0
				; DEFAULT-NEXT: movi q1, #0000000000000000
				; DEFAULT-NEXT: ret
				;
				; SVE-LABEL: used_float:
				; SVE: // %bb.0: // %entry
				; SVE-NEXT: fcvt d1, s1
				; SVE-NEXT: fmul d0, d1, d0
				; SVE-NEXT: movi z1, #0000000000000000
				; SVE-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				define dso_local double @all_gpr_arg_float(double noundef %a, float noundef %b) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="all-gpr-arg" {
				; CHECK-LABEL: all_gpr_arg_float:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: fcvt d1, s1
				; CHECK-NEXT: fmul d0, d1, d0
				; CHECK-NEXT: mov x0, #0
				; CHECK-NEXT: mov x1, #0
				; CHECK-NEXT: mov x2, #0
				; CHECK-NEXT: mov x3, #0
				; CHECK-NEXT: mov x4, #0
				; CHECK-NEXT: mov x5, #0
				; CHECK-NEXT: mov x6, #0
				; CHECK-NEXT: mov x7, #0
				; CHECK-NEXT: mov x8, #0
				; CHECK-NEXT: mov x18, #0
				; CHECK-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				define dso_local double @all_gpr_float(double noundef %a, float noundef %b) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="all-gpr" {
				; CHECK-LABEL: all_gpr_float:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: fcvt d1, s1
				; CHECK-NEXT: fmul d0, d1, d0
				; CHECK-NEXT: mov x0, #0
				; CHECK-NEXT: mov x1, #0
				; CHECK-NEXT: mov x2, #0
				; CHECK-NEXT: mov x3, #0
				; CHECK-NEXT: mov x4, #0
				; CHECK-NEXT: mov x5, #0
				; CHECK-NEXT: mov x6, #0
				; CHECK-NEXT: mov x7, #0
				; CHECK-NEXT: mov x8, #0
				; CHECK-NEXT: mov x9, #0
				; CHECK-NEXT: mov x10, #0
				; CHECK-NEXT: mov x11, #0
				; CHECK-NEXT: mov x12, #0
				; CHECK-NEXT: mov x13, #0
				; CHECK-NEXT: mov x14, #0
				; CHECK-NEXT: mov x15, #0
				; CHECK-NEXT: mov x16, #0
				; CHECK-NEXT: mov x17, #0
				; CHECK-NEXT: mov x18, #0
				; CHECK-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				define dso_local double @all_arg_float(double noundef %a, float noundef %b) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="all-arg" {
				; DEFAULT-LABEL: all_arg_float:
				; DEFAULT: // %bb.0: // %entry
				; DEFAULT-NEXT: fcvt d1, s1
				; DEFAULT-NEXT: fmul d0, d1, d0
				; DEFAULT-NEXT: mov x0, #0
				; DEFAULT-NEXT: mov x1, #0
				; DEFAULT-NEXT: mov x2, #0
				; DEFAULT-NEXT: mov x3, #0
				; DEFAULT-NEXT: mov x4, #0
				; DEFAULT-NEXT: mov x5, #0
				; DEFAULT-NEXT: mov x6, #0
				; DEFAULT-NEXT: mov x7, #0
				; DEFAULT-NEXT: mov x8, #0
				; DEFAULT-NEXT: mov x18, #0
				; DEFAULT-NEXT: movi q1, #0000000000000000
				; DEFAULT-NEXT: movi q2, #0000000000000000
				; DEFAULT-NEXT: movi q3, #0000000000000000
				; DEFAULT-NEXT: movi q4, #0000000000000000
				; DEFAULT-NEXT: movi q5, #0000000000000000
				; DEFAULT-NEXT: movi q6, #0000000000000000
				; DEFAULT-NEXT: movi q7, #0000000000000000
				; DEFAULT-NEXT: ret
				;
				; SVE-LABEL: all_arg_float:
				; SVE: // %bb.0: // %entry
				; SVE-NEXT: fcvt d1, s1
				; SVE-NEXT: fmul d0, d1, d0
				; SVE-NEXT: mov x0, #0
				; SVE-NEXT: mov x1, #0
				; SVE-NEXT: mov x2, #0
				; SVE-NEXT: mov x3, #0
				; SVE-NEXT: mov x4, #0
				; SVE-NEXT: mov x5, #0
				; SVE-NEXT: mov x6, #0
				; SVE-NEXT: mov x7, #0
				; SVE-NEXT: mov x8, #0
				; SVE-NEXT: mov x18, #0
				; SVE-NEXT: movi z1, #0000000000000000
				; SVE-NEXT: movi z2, #0000000000000000
				; SVE-NEXT: movi z3, #0000000000000000
				; SVE-NEXT: movi z4, #0000000000000000
				; SVE-NEXT: movi z5, #0000000000000000
				; SVE-NEXT: movi z6, #0000000000000000
				; SVE-NEXT: movi z7, #0000000000000000
				; SVE-NEXT: pfalse p0.b
				; SVE-NEXT: pfalse p1.b
				; SVE-NEXT: pfalse p2.b
				; SVE-NEXT: pfalse p3.b
				; SVE-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				define dso_local double @all_float(double noundef %a, float noundef %b) local_unnamed_addr #0 noinline optnone "zero-call-used-regs"="all" {
				; DEFAULT-LABEL: all_float:
				; DEFAULT: // %bb.0: // %entry
				; DEFAULT-NEXT: fcvt d1, s1
				; DEFAULT-NEXT: fmul d0, d1, d0
				; DEFAULT-NEXT: mov x0, #0
				; DEFAULT-NEXT: mov x1, #0
				; DEFAULT-NEXT: mov x2, #0
				; DEFAULT-NEXT: mov x3, #0
				; DEFAULT-NEXT: mov x4, #0
				; DEFAULT-NEXT: mov x5, #0
				; DEFAULT-NEXT: mov x6, #0
				; DEFAULT-NEXT: mov x7, #0
				; DEFAULT-NEXT: mov x8, #0
				; DEFAULT-NEXT: mov x9, #0
				; DEFAULT-NEXT: mov x10, #0
				; DEFAULT-NEXT: mov x11, #0
				; DEFAULT-NEXT: mov x12, #0
				; DEFAULT-NEXT: mov x13, #0
				; DEFAULT-NEXT: mov x14, #0
				; DEFAULT-NEXT: mov x15, #0
				; DEFAULT-NEXT: mov x16, #0
				; DEFAULT-NEXT: mov x17, #0
				; DEFAULT-NEXT: mov x18, #0
				; DEFAULT-NEXT: movi q1, #0000000000000000
				; DEFAULT-NEXT: movi q2, #0000000000000000
				; DEFAULT-NEXT: movi q3, #0000000000000000
				; DEFAULT-NEXT: movi q4, #0000000000000000
				; DEFAULT-NEXT: movi q5, #0000000000000000
				; DEFAULT-NEXT: movi q6, #0000000000000000
				; DEFAULT-NEXT: movi q7, #0000000000000000
				; DEFAULT-NEXT: movi q8, #0000000000000000
				; DEFAULT-NEXT: movi q9, #0000000000000000
				; DEFAULT-NEXT: movi q10, #0000000000000000
				; DEFAULT-NEXT: movi q11, #0000000000000000
				; DEFAULT-NEXT: movi q12, #0000000000000000
				; DEFAULT-NEXT: movi q13, #0000000000000000
				; DEFAULT-NEXT: movi q14, #0000000000000000
				; DEFAULT-NEXT: movi q15, #0000000000000000
				; DEFAULT-NEXT: movi q16, #0000000000000000
				; DEFAULT-NEXT: movi q17, #0000000000000000
				; DEFAULT-NEXT: movi q18, #0000000000000000
				; DEFAULT-NEXT: movi q19, #0000000000000000
				; DEFAULT-NEXT: movi q20, #0000000000000000
				; DEFAULT-NEXT: movi q21, #0000000000000000
				; DEFAULT-NEXT: movi q22, #0000000000000000
				; DEFAULT-NEXT: movi q23, #0000000000000000
				; DEFAULT-NEXT: movi q24, #0000000000000000
				; DEFAULT-NEXT: movi q25, #0000000000000000
				; DEFAULT-NEXT: movi q26, #0000000000000000
				; DEFAULT-NEXT: movi q27, #0000000000000000
				; DEFAULT-NEXT: movi q28, #0000000000000000
				; DEFAULT-NEXT: movi q29, #0000000000000000
				; DEFAULT-NEXT: movi q30, #0000000000000000
				; DEFAULT-NEXT: movi q31, #0000000000000000
				; DEFAULT-NEXT: ret
				;
				; SVE-LABEL: all_float:
				; SVE: // %bb.0: // %entry
				; SVE-NEXT: fcvt d1, s1
				; SVE-NEXT: fmul d0, d1, d0
				; SVE-NEXT: mov x0, #0
				; SVE-NEXT: mov x1, #0
				; SVE-NEXT: mov x2, #0
				; SVE-NEXT: mov x3, #0
				; SVE-NEXT: mov x4, #0
				; SVE-NEXT: mov x5, #0
				; SVE-NEXT: mov x6, #0
				; SVE-NEXT: mov x7, #0
				; SVE-NEXT: mov x8, #0
				; SVE-NEXT: mov x9, #0
				; SVE-NEXT: mov x10, #0
				; SVE-NEXT: mov x11, #0
				; SVE-NEXT: mov x12, #0
				; SVE-NEXT: mov x13, #0
				; SVE-NEXT: mov x14, #0
				; SVE-NEXT: mov x15, #0
				; SVE-NEXT: mov x16, #0
				; SVE-NEXT: mov x17, #0
				; SVE-NEXT: mov x18, #0
				; SVE-NEXT: movi z1, #0000000000000000
				; SVE-NEXT: movi z2, #0000000000000000
				; SVE-NEXT: movi z3, #0000000000000000
				; SVE-NEXT: movi z4, #0000000000000000
				; SVE-NEXT: movi z5, #0000000000000000
				; SVE-NEXT: movi z6, #0000000000000000
				; SVE-NEXT: movi z7, #0000000000000000
				; SVE-NEXT: movi z8, #0000000000000000
				; SVE-NEXT: movi z9, #0000000000000000
				; SVE-NEXT: movi z10, #0000000000000000
				; SVE-NEXT: movi z11, #0000000000000000
				; SVE-NEXT: movi z12, #0000000000000000
				; SVE-NEXT: movi z13, #0000000000000000
				; SVE-NEXT: movi z14, #0000000000000000
				; SVE-NEXT: movi z15, #0000000000000000
				; SVE-NEXT: movi z16, #0000000000000000
				; SVE-NEXT: movi z17, #0000000000000000
				; SVE-NEXT: movi z18, #0000000000000000
				; SVE-NEXT: movi z19, #0000000000000000
				; SVE-NEXT: movi z20, #0000000000000000
				; SVE-NEXT: movi z21, #0000000000000000
				; SVE-NEXT: movi z22, #0000000000000000
				; SVE-NEXT: movi z23, #0000000000000000
				; SVE-NEXT: movi z24, #0000000000000000
				; SVE-NEXT: movi z25, #0000000000000000
				; SVE-NEXT: movi z26, #0000000000000000
				; SVE-NEXT: movi z27, #0000000000000000
				; SVE-NEXT: movi z28, #0000000000000000
				; SVE-NEXT: movi z29, #0000000000000000
				; SVE-NEXT: movi z30, #0000000000000000
				; SVE-NEXT: movi z31, #0000000000000000
				; SVE-NEXT: pfalse p0.b
				; SVE-NEXT: pfalse p1.b
				; SVE-NEXT: pfalse p2.b
				; SVE-NEXT: pfalse p3.b
				; SVE-NEXT: pfalse p4.b
				; SVE-NEXT: pfalse p5.b
				; SVE-NEXT: pfalse p6.b
				; SVE-NEXT: pfalse p7.b
				; SVE-NEXT: pfalse p8.b
				; SVE-NEXT: pfalse p9.b
				; SVE-NEXT: pfalse p10.b
				; SVE-NEXT: pfalse p11.b
				; SVE-NEXT: pfalse p12.b
				; SVE-NEXT: pfalse p13.b
				; SVE-NEXT: pfalse p14.b
				; SVE-NEXT: pfalse p15.b
				; SVE-NEXT: ret

				entry:
				%conv = fpext float %b to double
				%mul = fmul double %conv, %a
				ret double %mul
				}

				; Don't emit zeroing registers in "main" function.
				define dso_local i32 @main() local_unnamed_addr #0 {
				; CHECK-LABEL: main:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: mov w0, wzr
				; CHECK-NEXT: ret

				entry:
				ret i32 0
				}

				attributes #0 = { mustprogress nofree norecurse nosync nounwind readnone willreturn uwtable "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+neon,+v8a" }

llvm/utils/TableGen/RegisterInfoEmitter.cpp

  [... 1,182 lines not shown ...]
    OS << " const RegClassWeight &getRegClassWeight("
       << " const int *getRegUnitPressureSets("
       << "unsigned RegUnit) const override;\n"
       << " ArrayRef<const char *> getRegMaskNames() const override;\n"
       << " ArrayRef<const uint32_t *> getRegMasks() const override;\n"
       << " bool isGeneralPurposeRegister(const MachineFunction &, "
       << "MCRegister) const override;\n"
       << " bool isFixedRegister(const MachineFunction &, "
       << "MCRegister) const override;\n"
+      << " bool isArgumentRegister(const MachineFunction &, "
+      << "MCRegister) const override;\n"
       << " /// Devirtualized TargetFrameLowering.\n"
       << " static const " << TargetName << "FrameLowering *getFrameLowering(\n"
       << " const MachineFunction &MF);\n"
       << "};\n\n";

    const auto &RegisterClasses = RegBank.getRegClasses();

    if (!RegisterClasses.empty()) {
  [... 458 lines not shown ...]
      if (Category.getName() == "FixedRegisters") {
        for (const CodeGenRegisterClass *RC : Category.getClasses())
          OS << " " << RC->getQualifiedName()
             << "RegClass.contains(PhysReg) ||\n";
        break;
      }
    OS << " false;\n";
    OS << "}\n\n";

+   OS << "bool " << ClassName << "::\n"
+      << "isArgumentRegister(const MachineFunction &MF, "
+      << "MCRegister PhysReg) const {\n"
+      << " return\n";
+   for (const CodeGenRegisterCategory &Category : RegCategories)
+     if (Category.getName() == "ArgumentRegisters") {
+       for (const CodeGenRegisterClass *RC : Category.getClasses())
+         OS << " " << RC->getQualifiedName()
+            << "RegClass.contains(PhysReg) ||\n";
+       break;
+     }
+   OS << " false;\n";
+   OS << "}\n\n";

    OS << "ArrayRef<const char *> " << ClassName
       << "::getRegMaskNames() const {\n";
    if (!CSRSets.empty()) {
      OS << " static const char *const Names[] = {\n";
      for (Record *CSRSet : CSRSets)
        OS << " " << '"' << CSRSet->getName() << '"' << ",\n";
      OS << " };\n";
      OS << " return makeArrayRef(Names);\n";
  [... 105 lines not shown ...]
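
For readers who want to see what the new isArgumentRegister emitter produces, the following is a minimal standalone sketch of the same pattern: find the category by name, print one RegClass.contains(PhysReg) test per class, and close the chain with "false". It does not use the TableGen API. RegisterCategory, emitCategoryPredicate, and the Demo:: class names are hypothetical stand-ins for illustration and are not part of this patch.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a TableGen register category: a category name
// plus the qualified names of the register classes it contains.
struct RegisterCategory {
  std::string Name;
  std::vector<std::string> Classes;
};

// Mirrors the shape of the emitter loop in the diff: locate the category
// called CategoryName, print one contains() test per register class, and
// terminate the "||" chain with "false" so the output stays valid even
// when the category is missing or empty.
std::string emitCategoryPredicate(
    const std::string &ClassName, const std::string &Predicate,
    const std::string &CategoryName,
    const std::vector<RegisterCategory> &Categories) {
  std::ostringstream OS;
  OS << "bool " << ClassName << "::\n"
     << Predicate << "(const MachineFunction &MF, "
     << "MCRegister PhysReg) const {\n"
     << "  return\n";
  for (const RegisterCategory &Category : Categories)
    if (Category.Name == CategoryName) {
      for (const std::string &RC : Category.Classes)
        OS << "      " << RC << "RegClass.contains(PhysReg) ||\n";
      break;
    }
  OS << "      false;\n";
  OS << "}\n";
  return OS.str();
}
```

Calling emitCategoryPredicate("DemoGenRegisterInfo", "isArgumentRegister", "ArgumentRegisters", Categories) with a category that lists Demo::ARG_GPR and Demo::ARG_FPR yields a function body that ORs the two contains() checks and ends in "false". A target that defines no such category gets a predicate that always returns false.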
