This is an archive of the discontinued LLVM Phabricator instance.


Differential D124253

[Clang][OpenMP] Fix the issue that temp cubin files are not removed after compilation when using new OpenMP driver
ClosedPublic

Authored by tianshilei1992 on Apr 22 2022, 5:42 AM.


Details

Reviewers

jdoerfert
jhuber6

Commits

rG20a9fb953e46: [Clang][OpenMP] Fix the issue that temp cubin files are not removed after…

Summary

The root cause is that, in NVPTX::Assembler::ConstructJob, the output file name may not match the file name of the Output passed into the function, because CudaToolChain::getInputFilename is a specialized version. As a result, the real output file is not added to the temp files list, whose entries are all removed in the destructor of Compilation. To work around this, NVPTX::OpenMPLinker::ConstructJob calls getToolChain().getInputFilename(II) before invoking clang-nvlink-wrapper to get the right output file name for each input and adds it to the temp files list, so the files can then be removed without issue. However, this logic does not work with the new OpenMP driver, because NVPTX::OpenMPLinker::ConstructJob is not called at all, so the cubin file generated for each compilation unit is never tracked.

In this patch, we add the real output file to the temp files list if its name doesn't match Output. We register the file when it is produced as an output, rather than when it is consumed as an input as was done in NVPTX::OpenMPLinker::ConstructJob, which makes more sense.
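The mechanism described in the summary can be sketched with a small standalone model. This is not the Clang driver code: ToyCompilation, FakeDisk, toyInputFilename, and runToyAssemble are hypothetical names introduced only for illustration, and the ".o" to ".cubin" rename is an assumed stand-in for whatever the specialized getInputFilename does. The sketch only shows why a file written under a name that was never registered survives cleanup, and how registering the real name at output time avoids that.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the file system: the set of file names that exist.
using FakeDisk = std::set<std::string>;

// Toy model of a compilation that removes every registered temp file
// when it is destroyed (an assumption modeled on the summary's description).
class ToyCompilation {
public:
  explicit ToyCompilation(FakeDisk &Disk) : Disk(Disk) {}
  ~ToyCompilation() {
    for (const std::string &F : TempFiles)
      Disk.erase(F);
  }
  void addTempFile(const std::string &Name) { TempFiles.push_back(Name); }

private:
  FakeDisk &Disk;
  std::vector<std::string> TempFiles;
};

// Toy stand-in for a toolchain-specific rename: a trailing ".o" becomes
// ".cubin"; any other name is returned unchanged.
std::string toyInputFilename(const std::string &Output) {
  const std::string Suffix = ".o";
  if (Output.size() >= Suffix.size() &&
      Output.compare(Output.size() - Suffix.size(), Suffix.size(), Suffix) == 0)
    return Output.substr(0, Output.size() - Suffix.size()) + ".cubin";
  return Output;
}

// Runs one toy "assemble" job and returns the files still on the fake disk
// after the compilation object has been destroyed.
FakeDisk runToyAssemble(const std::string &Output, bool ApplyFix) {
  FakeDisk Disk;
  {
    ToyCompilation C(Disk);
    C.addTempFile(Output); // only the nominal output name is registered
    const std::string Real = toyInputFilename(Output);
    if (ApplyFix && Real != Output)
      C.addTempFile(Real); // the idea of the patch: also track the real name
    Disk.insert(Real);     // the tool writes the file under the real name
  } // ~ToyCompilation erases the registered names here
  return Disk;
}
```

In this model, runToyAssemble("/tmp/a-123.o", false) leaves "/tmp/a-123.cubin" behind, while the same call with ApplyFix set to true leaves the fake disk empty. When the two names already match, nothing extra is registered and nothing is leaked.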

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tianshilei1992 created this revision. Apr 22 2022, 5:42 AM

Herald added a project: Restricted Project. Apr 22 2022, 5:42 AM

Herald added subscribers: guansong, yaxunl.

tianshilei1992 requested review of this revision. Apr 22 2022, 5:42 AM

Herald added subscribers: cfe-commits, sstefan1.

Is this really the cause? nvptx::assemble should call createOutputFile which makes a temp file to output to that is added to TempFiles.

In D124253#3467375, @jhuber6 wrote:

Is this really the cause? nvptx::assemble should call createOutputFile which makes a temp file to output to that is added to TempFiles.

Hmm, that's interesting. After adding that line, the temp file gets removed. Let me check that function.

The issue is in clang/lib/Driver/ToolChains/Cuda.cpp.

Harbormaster completed remote builds in B160837: Diff 424452. Apr 22 2022, 7:08 AM

update

Herald added a subscriber: MaskRay. Apr 22 2022, 1:07 PM

tianshilei1992 retitled this revision from "[Clang][OpenMP] Fix the issue that one temp cubin file is not removed after compilation" to "[Clang][OpenMP] Fix the issue that temp cubin files are not removed after compilation when using new OpenMP driver". Apr 22 2022, 1:17 PM

tianshilei1992 edited the summary of this revision.

This revision is now accepted and ready to land. Apr 22 2022, 1:51 PM

Harbormaster completed remote builds in B160941: Diff 424589. Apr 22 2022, 2:14 PM

Closed by commit rG20a9fb953e46: [Clang][OpenMP] Fix the issue that temp cubin files are not removed after… (authored by tianshilei1992). Apr 22 2022, 3:07 PM

This revision was automatically updated to reflect the committed changes.

tianshilei1992 added a commit: rG20a9fb953e46: [Clang][OpenMP] Fix the issue that temp cubin files are not removed after….

Revision Contents

Path: clang/lib/Driver/ToolChains/Cuda.cpp
Size: 9 lines

Diff 424628

clang/lib/Driver/ToolChains/Cuda.cpp

@@ void NVPTX::Assembler::ConstructJob(Compilation &C, const JobAction &JA,

   // Pass -v to ptxas if it was passed to the driver.
   if (Args.hasArg(options::OPT_v))
     CmdArgs.push_back("-v");

   CmdArgs.push_back("--gpu-name");
   CmdArgs.push_back(Args.MakeArgString(CudaArchToString(gpu_arch)));
   CmdArgs.push_back("--output-file");
-  CmdArgs.push_back(Args.MakeArgString(TC.getInputFilename(Output)));
+  const char *OutputFileName = Args.MakeArgString(TC.getInputFilename(Output));
+  if (std::string(OutputFileName) != std::string(Output.getFilename()))
+    C.addTempFile(OutputFileName);
+  CmdArgs.push_back(OutputFileName);
   for (const auto& II : Inputs)
     CmdArgs.push_back(Args.MakeArgString(II.getFilename()));

   for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_ptxas))
     CmdArgs.push_back(Args.MakeArgString(A));

   bool Relocatable = false;
   if (JA.isOffloading(Action::OFK_OpenMP))

@@ if (II.getType() == types::TY_LLVM_IR ||

       continue;
     }

     // Currently, we only pass the input files to the linker, we do not pass
     // any libraries that may be valid only for the host.
     if (!II.isFilename())
       continue;

-    const char *CubinF = C.addTempFile(
-        C.getArgs().MakeArgString(getToolChain().getInputFilename(II)));
+    const char *CubinF =
+        C.getArgs().MakeArgString(getToolChain().getInputFilename(II));

     CmdArgs.push_back(CubinF);
   }

   AddStaticDeviceLibsLinking(C, *this, JA, Inputs, Args, CmdArgs, "nvptx",
                              GPUArch, /*isBitCodeSDL=*/false,
                              /*postClangLink=*/false);