This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Set preferred function alignment
ClosedPublic

Authored by NickGuy on Aug 9 2023, 8:08 AM.

Details

Reviewers

dmgreen
samtebbs

Commits

rGd65feccb1262: [ARM] Set preferred function alignment

Summary

Aligning functions yields small performance gains on embedded cores, more so in code that makes many calls to small functions. As with loop alignment, if a function fits within a single cache line, the overhead of fetching additional instructions can be limited.
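
The cache-line argument can be illustrated with a small standalone sketch. It is not part of the patch: the helper names `cacheLinesTouched` and `alignUp` are hypothetical, and the 32-byte line size used in the note below is an assumption chosen for illustration, not a property of any particular core.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): number of cache lines touched by
// a function that starts at byte address `start` and occupies `size` bytes,
// for a cache whose lines are `lineSize` bytes (must be a power of two).
inline std::uint64_t cacheLinesTouched(std::uint64_t start, std::uint64_t size,
                                       std::uint64_t lineSize) {
  assert(size > 0 && lineSize > 0 && (lineSize & (lineSize - 1)) == 0);
  std::uint64_t firstLine = start / lineSize;
  std::uint64_t lastLine = (start + size - 1) / lineSize;
  return lastLine - firstLine + 1;
}

// Hypothetical helper: round `addr` up to the next multiple of `align`
// (must be a power of two), as an assembler or linker would when it honours
// a function alignment.
inline std::uint64_t alignUp(std::uint64_t addr, std::uint64_t align) {
  assert(align > 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}
```

Under the assumed 32-byte line, a 24-byte function placed at `0x1014` straddles two lines, while the same function moved to `alignUp(0x1014, 32)` occupies one. Smaller alignments reduce how often a function straddles a line without ruling it out.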

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	60 ms	x64 debian > LLVM.CodeGen/ARM::func-sanitizer.ll
	50 ms	x64 debian > LLVM.CodeGen/ARM::kcfi.ll
	60 ms	x64 debian > LLVM.CodeGen/ARM::thumb-alignment.ll
	50 ms	x64 debian > LLVM.CodeGen/ARM::thumb-function-section-reloc.ll
	50 ms	x64 debian > LLVM.CodeGen/Thumb::2010-07-01-FuncAlign.ll
		View Full Test Results (6 Failed)

Event Timeline

NickGuy created this revision. Aug 9 2023, 8:08 AM

Herald added a project: Restricted Project. Aug 9 2023, 8:08 AM

Herald added subscribers: hiraditya, kristof.beyls.

NickGuy requested review of this revision. Aug 9 2023, 8:08 AM

Can you add a test? Thanks.

Harbormaster completed remote builds in B251393: Diff 548629. Aug 9 2023, 10:24 AM

In D157514#4573318, @dmgreen wrote:

Can you add a test? Thanks.

Done and precommitted.

Harbormaster completed remote builds in B251616: Diff 548931. Aug 10 2023, 2:08 AM

Can you improve the summary to explain why this is being done? It's for the same reasons that we align loops.

Should this be done for all cpus? I can see how that would make sense, but as far as I understand you are only really aiming for M-class devices. And we haven't in the past aligned loops for v6m devices (or some of the higher end v7m devices).

llvm/test/CodeGen/ARM/preferred-function-alignment.ll
Line 1 (On Diff #548931): It might be better to deliberately make this an Arm CPU (as opposed to Thumb) rather than a generic one. I believe that is what this is testing.

I've set the function alignment to the same value as the loop alignment, as my testing showed that the results are "best" when the two are equal.

Can you improve the summary to explain why this is being done? It's for the same reasons that we align loops.

Words seem to be failing me today. Hopefully the new summary makes sense.

Harbormaster completed remote builds in B252939: Diff 550727. Aug 16 2023, 6:55 AM

I agree it makes sense to use the same alignments, especially for Cortex-M CPUs. Can you update the LoopAlignment? Maybe call it "CodeAlignment" now? I'm not sure whether changing the name is better or not. The documentation can be changed to: "/// What alignment is preferred for loop bodies <and functions>, in log2(bytes)."

Otherwise LGTM. Thanks
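
As a rough illustration of the suggestion above, the following standalone sketch derives both preferred alignments from one log2(bytes) value. It does not use LLVM headers and is not the committed change: `SubtargetSketch`, `LoweringSketch`, `PrefCodeLogAlignment`, and `alignmentInBytes` are hypothetical names, and only the `1ULL << log` conversion is taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a subtarget that stores a single preferred code
// alignment as log2(bytes), e.g. 2 means 4 bytes. The field name is assumed.
struct SubtargetSketch {
  unsigned PrefCodeLogAlignment;
};

// Convert a log2 alignment to bytes, mirroring the `1ULL << log` expression
// used for the loop alignment in the diff.
inline std::uint64_t alignmentInBytes(unsigned logAlignment) {
  assert(logAlignment < 64 && "shift would overflow a 64-bit value");
  return 1ULL << logAlignment;
}

// Hypothetical stand-in for the lowering object: loops and functions share
// the same preferred alignment, both derived from the one subtarget value.
struct LoweringSketch {
  std::uint64_t PrefLoopAlignment;
  std::uint64_t PrefFunctionAlignment;

  explicit LoweringSketch(const SubtargetSketch &ST)
      : PrefLoopAlignment(alignmentInBytes(ST.PrefCodeLogAlignment)),
        PrefFunctionAlignment(alignmentInBytes(ST.PrefCodeLogAlignment)) {}
};
```

Keeping a single source value means the two alignments cannot drift apart, which matches the observation in this review that results were best when they are equal.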

This revision is now accepted and ready to land. Aug 16 2023, 7:47 AM

This revision was landed with ongoing or failed builds. Aug 16 2023, 9:33 AM

Closed by commit rGd65feccb1262: [ARM] Set preferred function alignment (authored by NickGuy).

This revision was automatically updated to reflect the committed changes.

NickGuy added a commit: rGd65feccb1262: [ARM] Set preferred function alignment.

Revision Contents

	Path	Size
	llvm/lib/Target/ARM/ARMISelLowering.cpp	1 line

Diff 548629

llvm/lib/Target/ARM/ARMISelLowering.cpp

	  // On ARM arguments smaller than 4 bytes are extended, so all arguments
	  // are at least 4 bytes aligned.
	  setMinStackArgumentAlignment(Align(4));

	  // Prefer likely predicted branches to selects on out-of-order cores.
	  PredictableSelectIsExpensive = Subtarget->getSchedModel().isOutOfOrder();

	  setPrefLoopAlignment(Align(1ULL << Subtarget->getPrefLoopLogAlignment()));
	+ setPrefFunctionAlignment(Align(4));

	  setMinFunctionAlignment(Subtarget->isThumb() ? Align(2) : Align(4));

	  if (Subtarget->isThumb() || Subtarget->isThumb2())
	    setTargetDAGCombine(ISD::ABS);
	}

	bool ARMTargetLowering::useSoftFloat() const {