This is an archive of the discontinued LLVM Phabricator instance.

Differential D119510

AMDGPU: Clamp min value of effective waves-per-eu instead of discarding
Abandoned (Public)

Authored by arsenm on Feb 10 2022, 8:00 PM.

Details

Reviewers

kzhuravl
t-tye
rampitec

Group Reviewers

Restricted Project

Summary

If the flat work group size implied a larger minimum, this was
ignoring the requested maximum. This was interfering with the logic to
propagate amdgpu-waves-per-eu when accounting for the inferred flat
workgroup size. Just clamp the minimum so we still preserve the
requested maximum.
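The difference between discarding and clamping can be sketched in isolation. This is a minimal standalone model, not the actual LLVM code: the alias `WavesRange`, the helper names `resolveDiscarding` and `resolveClamping`, and the numeric values in the note below are all hypothetical and chosen only for illustration.

```cpp
#include <utility>

// Hypothetical alias for the (min, max) waves-per-EU pair.
using WavesRange = std::pair<unsigned, unsigned>;

// Models the old behavior: if the requested minimum is below the minimum
// implied by the flat work group size, the whole request is dropped,
// including the requested maximum.
inline WavesRange resolveDiscarding(WavesRange Requested, WavesRange Default,
                                    unsigned MinImplied) {
  if (Requested.first < MinImplied)
    return Default;
  return Requested;
}

// Models the behavior proposed here: only the minimum is raised, so the
// requested maximum survives.
inline WavesRange resolveClamping(WavesRange Requested, unsigned MinImplied) {
  if (Requested.first < MinImplied)
    Requested.first = MinImplied;
  return Requested;
}
```

With a request of (1, 4), a default of (1, 10), and an implied minimum of 2, the discarding version returns (1, 10) and loses the requested maximum of 4, while the clamping version returns (2, 4). Note that clamping alone does not guard against an implied minimum larger than the requested maximum; in that case the resulting pair would have its minimum above its maximum.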

Plus, I'm not really sure what the point of the minimum is or what it
does. A few IR passes (AMDGPUPromoteAlloca and TTI) query it to get a
number of VGPRs, but everything else uses the maximum.

No test here since I don't think this is a directly observable
property, but fixes a future patch which propagates
amdgpu-waves-per-eu.

Diff Detail

Unit Tests: Failed

	Time	Test
	60 ms	x64 debian > LLVM.CodeGen/AMDGPU::default-flat-work-group-size-overrides-waves-per-eu.ll
	220 ms	x64 debian > LLVM.CodeGen/AMDGPU/GlobalISel::insertelement.large.ll
	60,030 ms	x64 debian > libFuzzer.libFuzzer::fuzzer-leak.test
	60,080 ms	x64 debian > libFuzzer.libFuzzer::large.test
	60,030 ms	x64 debian > libFuzzer.libFuzzer::out-of-process-fuzz.test
	(6 tests failed in total)

Event Timeline

arsenm created this revision. Feb 10 2022, 8:00 PM

Herald added subscribers: foad, kerbowa, hiraditya and 5 others. Feb 10 2022, 8:00 PM

arsenm requested review of this revision. Feb 10 2022, 8:00 PM

Herald added a project: Restricted Project. Feb 10 2022, 8:00 PM

Herald added a subscriber: wdng.

Harbormaster completed remote builds in B148897: Diff 407752. Feb 10 2022, 8:20 PM

foad added inline comments. Feb 11 2022, 12:11 AM

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
562–563	Just use std::max?
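The suggestion amounts to replacing the compare-and-assign with a single `std::max` call. A minimal sketch, assuming the same pair-of-unsigned representation as in the diff; the helper name `clampMinWaves` is hypothetical and does not exist in LLVM:

```cpp
#include <algorithm>
#include <utility>

// Hypothetical helper: raise the requested minimum to at least the minimum
// implied by the flat work group size, leaving the requested maximum as-is.
inline std::pair<unsigned, unsigned>
clampMinWaves(std::pair<unsigned, unsigned> Requested,
              unsigned MinImpliedByFlatWorkGroupSize) {
  Requested.first = std::max(Requested.first, MinImpliedByFlatWorkGroupSize);
  return Requested;
}
```

Both operands are `unsigned`, so `std::max` deduces its type without a cast, and the result is the same as the `if`/assignment form in the patch.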

Looks like several tests have failed.

arsenm abandoned this revision. Jun 5 2023, 8:02 AM

arsenm marked an inline comment as done.

Herald added a project: Restricted Project. Jun 5 2023, 8:02 AM

Revision Contents

Path	Size
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp	2 lines

Diff 407752

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp

 std::pair<unsigned, unsigned> AMDGPUSubtarget::getWavesPerEU(
   [...]

   // Make sure requested values do not violate subtarget's specifications.
   if (Requested.first < getMinWavesPerEU() ||
       Requested.second > getMaxWavesPerEU())
     return Default;

   // Make sure requested values are compatible with values implied by requested
   // minimum/maximum flat work group sizes.
   if (Requested.first < MinImpliedByFlatWorkGroupSize)
-    return Default;
+    Requested.first = MinImpliedByFlatWorkGroupSize;

   return Requested;
 }

 static unsigned getReqdWorkGroupSize(const Function &Kernel, unsigned Dim) {
   auto Node = Kernel.getMetadata("reqd_work_group_size");
   if (Node && Node->getNumOperands() == 3)
     return mdconst::extract<ConstantInt>(Node->getOperand(Dim))->getZExtValue();
   [...]