This is an archive of the discontinued LLVM Phabricator instance.

Is it expected for binary size increases to result from this? Between the commit for this patch and the commit before it, I'm seeing an increase in some fuchsia ZBIs by about 13 kB.

In D74825#1927641, @leonardchan wrote:

Is it expected for binary size increases to result from this? Between the commit for this patch and the commit before it, I'm seeing an increase in some fuchsia ZBIs by about 13 kB.

The pass isn't supposed to even run for CPU targets

In D74825#1927673, @arsenm wrote:

The pass isn't supposed to even run for CPU targets

I think this pass is also running for non-GPU targets. I'm seeing this pass run on x86_64, aarch64, and riscv64 when building a toolchain for those targets.

https://github.com/llvm/llvm-project/blob/a4cde9ad7b6f1a4cfef228f6cf2fc4911bf24c77/llvm/lib/Passes/PassBuilder.cpp#L436 seems to add it to the new PM default function pipeline that I think runs as long as optimizations are available.

In D74825#1927949, @leonardchan wrote:

I think this pass is also running for non-GPU targets. I'm seeing this pass run on x86_64, aarch64, and riscv64 when building a toolchain for those targets.

https://github.com/llvm/llvm-project/blob/a4cde9ad7b6f1a4cfef228f6cf2fc4911bf24c77/llvm/lib/Passes/PassBuilder.cpp#L436 seems to add it to the new PM default function pipeline that I think runs as long as optimizations are available.

That doesn't match what the comment says, or what the old PM does (which uses createSpeculativeExecutionIfHasBranchDivergencePass)

In D74825#1927964, @arsenm wrote:

That doesn't match what the comment says, or what the old PM does (which uses createSpeculativeExecutionIfHasBranchDivergencePass)

@chandlerc It seems like you added the pass to the new PM pipeline in https://reviews.llvm.org/rGe3f5064b7235. The main difference is that OnlyIfDivergentTarget is false there, whereas the old PM sets OnlyIfDivergentTarget to true through createSpeculativeExecutionIfHasBranchDivergencePass. Is this intentional?
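To make the difference concrete, here is a minimal sketch of the gating being discussed. It is an illustrative model, not the LLVM source: TargetModel and passWouldRun are hypothetical names, and the only thing taken from the thread is the OnlyIfDivergentTarget flag and the idea that the pass should be skipped on targets without branch divergence.

```cpp
#include <cassert>

// Hypothetical stand-in for the target property the pass consults.
struct TargetModel {
  bool HasBranchDivergence; // true for GPU-like targets, false for CPUs
};

// Illustrative model of the gate (assumption, not the actual pass code):
// when OnlyIfDivergentTarget is set, the pass is skipped on targets
// that do not have branch divergence.
bool passWouldRun(bool OnlyIfDivergentTarget, const TargetModel &Target) {
  if (OnlyIfDivergentTarget && !Target.HasBranchDivergence)
    return false; // skipped on CPU-like targets
  return true;    // otherwise the pass runs
}
```

Under this model, OnlyIfDivergentTarget = true skips x86_64-like targets, while OnlyIfDivergentTarget = false runs the pass everywhere, which would be consistent with the pass showing up on x86_64, aarch64, and riscv64.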

Hi!

I just filed a PR about a debug-info fault that starts occurring with this patch:
https://bugs.llvm.org/show_bug.cgi?id=46267

I've just received the bug report; I'll take a look.

Revision Contents

Path                                                           Size
llvm/lib/Transforms/Scalar/SpeculativeExecution.cpp            3 lines
llvm/test/CodeGen/AMDGPU/speculative-execution-freecasts.ll    30 lines

Diff 245612

llvm/lib/Transforms/Scalar/SpeculativeExecution.cpp

(273 earlier lines not shown; hunk context: if (Cost != UINT_MAX && isSafeToSpeculativelyExecute(&I) &&)

         return false; // too much to hoist
     } else {
       NotHoisted.insert(&I);
       if (NotHoisted.size() > SpecExecMaxNotHoisted)
         return false; // too much left behind
     }
   }

-  if (TotalSpeculationCost == 0)
-    return false; // nothing to hoist
-
   for (auto I = FromBlock.begin(); I != FromBlock.end();) {
     // We have to increment I before moving Current as moving Current
     // changes the list that I is iterating through.
     auto Current = I;
     ++I;
     if (!NotHoisted.count(&*Current)) {
       Current->moveBefore(ToBlock.getTerminator());
     }

(30 later lines not shown)
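The effect of the TotalSpeculationCost == 0 check can be illustrated with a small standalone model. This is a sketch under stated assumptions, not the pass itself: wouldHoist and RejectZeroTotal are hypothetical names, each vector entry stands for the speculation cost of one hoistable instruction, and free casts (such as the bitcast in the test) are assumed to have cost 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the hoisting decision. Costs holds the assumed
// speculation cost of each candidate instruction; RejectZeroTotal toggles
// the "nothing to hoist" check that appears only on one side of the diff.
bool wouldHoist(const std::vector<unsigned> &Costs,
                unsigned MaxSpeculationCost, bool RejectZeroTotal) {
  unsigned TotalSpeculationCost = 0;
  for (unsigned Cost : Costs) {
    TotalSpeculationCost += Cost;
    if (TotalSpeculationCost > MaxSpeculationCost)
      return false; // too much to hoist
  }
  if (RejectZeroTotal && TotalSpeculationCost == 0)
    return false; // a block of only free instructions is never hoisted
  return true;    // proceed to the hoisting loop
}
```

With the check enabled, a block containing a single cost-0 instruction is rejected (wouldHoist({0}, 1, true) is false); without it, the same block is hoisted (wouldHoist({0}, 1, false) is true). That matches what the freecasts test expects when it checks for the cast ahead of the branch.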

llvm/test/CodeGen/AMDGPU/speculative-execution-freecasts.ll

This file was added.

; RUN: opt < %s -S -mtriple=amdgcn-unknown-amdhsa -speculative-execution \
; RUN:   -spec-exec-max-speculation-cost 1 -spec-exec-max-not-hoisted 1 \
; RUN:   | FileCheck %s

; CHECK-LABEL: @ifThen_bitcast(
; CHECK: bitcast
; CHECK: br i1 true
define void @ifThen_bitcast(i32 %y) {
  br i1 true, label %a, label %b

a:
  %x = bitcast i32 %y to float
  br label %b

[Inline comment from arsenm (Done): Would probably be better to have real values, since someday something might conclude any undef operation is free or something]

b:
  ret void
}

; CHECK-LABEL: @ifThen_addrspacecast(
; CHECK: addrspacecast
; CHECK: br i1 true
define void @ifThen_addrspacecast(i32* %y) {
  br i1 true, label %a, label %b
a:
  %x = addrspacecast i32* %y to i32 addrspace(1)*
  br label %b

b:
  ret void
}

SpeculativeExecution: fixed ingoring free execution
ClosedPublic