This is an archive of the discontinued LLVM Phabricator instance.

Thumb2: When applying branch optimizations, visit branches in reverse order.
Closed · Public

Authored by pcc on Apr 21 2015, 10:11 PM.

Details

Reviewers

rengolin
t.p.northover

Commits

rG167668f8c82f: Thumb2: When applying branch optimizations, visit branches in reverse order.
rL235640: Thumb2: When applying branch optimizations, visit branches in reverse order.

Summary

The order in which branches appear in ImmBranches is approximately their
order within the function body. By visiting later branches first, we reduce
the distance between earlier forward branches and their targets, making it
more likely that the cbn?z optimization, which can only apply to forward
branches, will succeed for those earlier branches.

Diff Detail

Repository: rL LLVM

Event Timeline

pcc updated this revision to Diff 24196. Apr 21 2015, 10:11 PM

pcc retitled this revision to "Thumb2: When applying branch optimizations, visit branches in reverse order."

pcc updated this object.

pcc edited the test plan for this revision.

pcc added a reviewer: t.p.northover.

pcc added a subscriber: Unknown Object (MLST).

Herald added a subscriber: rengolin. Apr 21 2015, 10:11 PM

I can't see anything wrong with this patch, but I'm curious why visiting the later branches first brings the earlier branches closer to their targets.

The test case provides a good illustration of this. In that case, we have:

branch to L if x != 0
64 bytes of instructions
branch to L if y != 0
60 bytes of instructions
L:

If we visit the first branch first, there are 64+4+60=128 bytes of instructions between it and the label (the 4 bytes being the second branch, which has not been shrunk yet), which is too far for the backend to apply the cbnz optimization. However, if we visit the second branch first, the backend can apply the cbnz optimization to that branch, shrinking it to 2 bytes. When we then visit the first branch, the calculation becomes 64+2+60=126 bytes, which is just small enough for the cbnz optimization to apply.
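The arithmetic above can be replayed with a small toy model. This is only a sketch, not code from ARMConstantIslandPass.cpp: the names ToyBranch, distanceToLabel and countShrunk are hypothetical, every branch is assumed to target the same label at the end, and the 126-byte limit and the 4-byte and 2-byte sizes are taken directly from the numbers in this example rather than from the real encoding rules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, not LLVM code): each branch targets one shared
// label placed after the last branch's trailing code.
struct ToyBranch {
  unsigned Size;       // current size of the branch in bytes (4, or 2 once shrunk)
  unsigned BytesAfter; // straight-line code between this branch and the next
                       // branch (or the label, for the last branch)
};

// Bytes between branch I and the shared label, counting later branches at
// their current size.
unsigned distanceToLabel(const std::vector<ToyBranch> &Bs, std::size_t I) {
  assert(I < Bs.size());
  unsigned D = Bs[I].BytesAfter;
  for (std::size_t J = I + 1; J < Bs.size(); ++J)
    D += Bs[J].Size + Bs[J].BytesAfter;
  return D;
}

// Visit every branch once, in forward or reverse order, shrinking a branch
// to 2 bytes when its distance is within the assumed 126-byte limit.
// Takes the vector by value so the caller's layout is left untouched.
unsigned countShrunk(std::vector<ToyBranch> Bs, bool Reverse) {
  unsigned Shrunk = 0;
  for (std::size_t N = 0; N < Bs.size(); ++N) {
    std::size_t I = Reverse ? Bs.size() - 1 - N : N;
    if (distanceToLabel(Bs, I) <= 126) {
      Bs[I].Size = 2;
      ++Shrunk;
    }
  }
  return Shrunk;
}
```

With the layout from the test case, {{4, 64}, {4, 60}}, a forward visit shrinks only the second branch, while a reverse visit shrinks both, because the first branch then sees the second one at its already reduced size.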

pcc updated this object. Apr 23 2015, 12:05 PM

Got it! So if you have multiple jumps, shrinking the later ones first increases the range in which the optimization can work for the earlier ones by a few bytes, but not much more than that. And the earlier branch has to be just barely out of range (but no further), with multiple branches in between. Quite the corner case. :)

But since the change makes no difference (at all) in compile time and is, AFAICT, semantically equivalent, LGTM, thanks!

This revision is now accepted and ready to land. Apr 23 2015, 12:23 PM

Closed by commit rL235640: Thumb2: When applying branch optimizations, visit branches in reverse order. (authored by pcc). Apr 23 2015, 1:35 PM

This revision was automatically updated to reflect the committed changes.

jevinskie added a subscriber: jevinskie. Nov 24 2015, 11:37 AM

Revision Contents

Path	Size
llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp	9 lines
llvm/trunk/test/CodeGen/Thumb2/cbnz.ll	54 lines

Diff 24331

llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp

[1,739 earlier lines not shown; context: bool ARMConstantIslands::optimizeThumb2Instructions() {]
   MadeChange |= optimizeThumb2Branches();
   MadeChange |= optimizeThumb2JumpTables();
   return MadeChange;
 }

 bool ARMConstantIslands::optimizeThumb2Branches() {
   bool MadeChange = false;

-  for (unsigned i = 0, e = ImmBranches.size(); i != e; ++i) {
-    ImmBranch &Br = ImmBranches[i];
+  // The order in which branches appear in ImmBranches is approximately their
+  // order within the function body. By visiting later branches first, we reduce
+  // the distance between earlier forward branches and their targets, making it
+  // more likely that the cbn?z optimization, which can only apply to forward
+  // branches, will succeed.
+  for (unsigned i = ImmBranches.size(); i != 0; --i) {
+    ImmBranch &Br = ImmBranches[i-1];
     unsigned Opcode = Br.MI->getOpcode();
     unsigned NewOpc = 0;
     unsigned Scale = 1;
     unsigned Bits = 0;
     switch (Opcode) {
     default: break;
     case ARM::t2B:
       NewOpc = ARM::tB;
[303 later lines not shown]

llvm/trunk/test/CodeGen/Thumb2/cbnz.ll

; RUN: llc -mtriple thumbv7-unknown-linux -o - %s | FileCheck %s

declare void @x()
declare void @y()

define void @f(i32 %x, i32 %y) {
; CHECK-LABEL: f:
; CHECK: cbnz
  %p = icmp eq i32 %x, 0
  br i1 %p, label %t, label %f

t:
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
; CHECK: cbnz
  %q = icmp eq i32 %y, 0
  br i1 %q, label %t2, label %f

t2:
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  call void @x()
  br label %f

f:
  call void @y()
  ret void
}