This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/
-
lib/Driver/ToolChains/
-
Driver/
-
ToolChains/
-
Arch/
-
RISCV.h
-
RISCV.cpp
-
CommonArgs.cpp
-
test/Driver/
-
Driver/
-
lto.c

Differential D132843

[RISCV] Ensure target features get passed to the LTO linker for RISC-V
Needs ReviewPublic

Authored by lewis-revill on Aug 29 2022, 3:49 AM.

Download Raw Diff

Details

Reviewers

efriedma
jrtc27
asb
kito-cheng
lenary

Summary

This patch fixes an issue whereby the LTO linker would not receive any information about target features, so things like architecture extensions would not be accounted for during codegen.

There is perhaps an outstanding question as to why this is required versus obtaining the target features from bitcode. This is the approach I implemented for now, but would be happy to look into whether I can make it work via bitcode if desired.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lewis-revill created this revision.Aug 29 2022, 3:49 AM

Herald added a project: Restricted Project. · View Herald TranscriptAug 29 2022, 3:49 AM

Herald added subscribers: sunshaoce, VincentWu, luke957 and 31 others. · View Herald Transcript

lewis-revill requested review of this revision.Aug 29 2022, 3:49 AM

Herald added a project: Restricted Project. · View Herald TranscriptAug 29 2022, 3:49 AM

Herald added subscribers: cfe-commits, • pcwang-thead, eopXD, MaskRay. · View Herald Transcript

lewis-revill added a parent revision: D71387: pass -mabi to LTO linker only in RISC-V targets, enable RISC-V LTO.Aug 29 2022, 3:50 AM

Harbormaster completed remote builds in B183892: Diff 456301.Aug 29 2022, 3:50 AM

lenary resigned from this revision.Aug 29 2022, 5:45 AM

I'd prefer to use an approach with less moving parts: the bitcode should contain enough information to generate code without specifying anything on the command-line that wouldn't normally be passed to the linker. Among other issues, it's hard to understand the intended meaning if the flags contradict information encoded in the bitcode.

craig.topper added a reviewer: kito-cheng.Aug 29 2022, 5:29 PM

This is dump from my mailbox, few month ago I written a offlist mail to describe about RISC-V LTO status:

LTO for RISC-V is really kind of a long long story. @khchen has been fighting for that for a long time, but he is no longer focusing on RISC-V.

A short summary is we still have issues with dealing with ABI, target features, and datalayout,

Datalayout is determined by target triple in most other targets, but RISC-V can't only use target triple to determine the datalayout.

So we need to keep ABI in somewhere and read that at LTO phase, the most ideal place is the module flags. We already did that[6], but that comes with a problem is it's too late to update datalayout when we start to read a module, because LLVM will use datalayout to infer some info like the alignment of loads[7], and that means we might re-read the whole LLVM IR again to get the IR with the right info, and that requires fixing multiple places in LLVM (see[2]. Still, I am not sure it's enough or not).

There is also an issue with how to store and determine the final LTO target features. For example, A object built with -march=rv64g and B object built with -march=rv64gc, so which arch should we use in the LTO code generation stage? see [5] for more discussion around this issue.

Currently, our downstream work-around is passing -march and -mabi into LTO by -Wl,-plugin-opt=, and ABI and ISA are determined by the linking stage to avoid (or work-around) the above issues.

Related open revision:
[1] https://reviews.llvm.org/D71387
[2] https://reviews.llvm.org/D72624 (This one is abandoned, but it
still necessary if we need store ABI info in module flags)
[3] https://reviews.llvm.org/D72245
[4] https://reviews.llvm.org/D102582
[5] https://reviews.llvm.org/D106347
[6] https://reviews.llvm.org/D72755

Other revision for ref.

[7] https://reviews.llvm.org/D78403

So we need to keep ABI in somewhere and read that at LTO phase, the most ideal place is the module flags. We already did that[6], but that comes with a problem is it's too late to update datalayout when we start to read a module, because LLVM will use datalayout to infer some info like the alignment of loads[7], and that means we might re-read the whole LLVM IR again to get the IR with the right info, and that requires fixing multiple places in LLVM (see[2]. Still, I am not sure it's enough or not).

I'm not sure how the issues with datalayout in particular end up being an issue in practice.

clang shouldn't be writing out object files without a datalayout.
The code to infer alignment on load/store/etc. only exists for compatibility with old bitcode/IR; current LLVM bitcode always marks up alignment for all operations.

Or do you mean something else when you say "datalayout"?

There is also an issue with how to store and determine the final LTO target features. For example, A object built with -march=rv64g and B object built with -march=rv64gc, so which arch should we use in the LTO code generation stage? see [5] for more discussion around this issue.

On other targets, the instruction set used is encoded at a per-function level. So on x86, for example, you can have an "AVX" codepath and an "SSE" codepath, and use runtime CPU detection to figure out which to use.

In D132843#3770936, @efriedma wrote:

So we need to keep ABI in somewhere and read that at LTO phase, the most ideal place is the module flags. We already did that[6], but that comes with a problem is it's too late to update datalayout when we start to read a module, because LLVM will use datalayout to infer some info like the alignment of loads[7], and that means we might re-read the whole LLVM IR again to get the IR with the right info, and that requires fixing multiple places in LLVM (see[2]. Still, I am not sure it's enough or not).

I'm not sure how the issues with datalayout in particular end up being an issue in practice.

Maybe for IR DataLayout upgrades, but those are rather rare.

clang shouldn't be writing out object files without a datalayout.

The code to infer alignment on load/store/etc. only exists for compatibility with old bitcode/IR; current LLVM bitcode always marks up alignment for all operations.

Or do you mean something else when you say "datalayout"?

There is also an issue with how to store and determine the final LTO target features. For example, A object built with -march=rv64g and B object built with -march=rv64gc, so which arch should we use in the LTO code generation stage? see [5] for more discussion around this issue.

On other targets, the instruction set used is encoded at a per-function level. So on x86, for example, you can have an "AVX" codepath and an "SSE" codepath, and use runtime CPU detection to figure out which to use.

The IR records that too, but the "default" set of extensions gets encoded in the output file so loaders can know what instruction sets are _required_ (as opposed to optional via IFUNCs). It's not a perfect match, but it's about as good as you can get. With GNU ld these get merged together, though currently LLD just picks the one from the first file and ignores the rest (I think?); ideally the LLVM module linker would do similar merging.

In D132843#3771029, @jrtc27 wrote:

In D132843#3770936, @efriedma wrote:

There is also an issue with how to store and determine the final LTO target features. For example, A object built with -march=rv64g and B object built with -march=rv64gc, so which arch should we use in the LTO code generation stage? see [5] for more discussion around this issue.

On other targets, the instruction set used is encoded at a per-function level. So on x86, for example, you can have an "AVX" codepath and an "SSE" codepath, and use runtime CPU detection to figure out which to use.

The IR records that too, but the "default" set of extensions gets encoded in the output file so loaders can know what instruction sets are _required_ (as opposed to optional via IFUNCs). It's not a perfect match, but it's about as good as you can get. With GNU ld these get merged together, though currently LLD just picks the one from the first file and ignores the rest (I think?); ideally the LLVM module linker would do similar merging.

Oh, I see. Most other targets don't have ELF attributes like that. 32-bit ARM does, but I'm not sure how the interaction with LTO works there off the top of my head.

We probably want to encode the architecture into the IR in some way that allows us to produce the same attributes whether or not LTO is enabled.

I'm not sure how the issues with datalayout in particular end up being an issue in practice.

clang shouldn't be writing out object files without a datalayout.
The code to infer alignment on load/store/etc. only exists for compatibility with old bitcode/IR; current LLVM bitcode always marks up alignment for all operations.
Or do you mean something else when you say "datalayout"?

ilp32/ilp32f/ilp32d and ilp32e(D70401) having different data layout, so when we try to merge those stuffs will run into trouble, although I admit build object file with ilp32/ilp32f/ilp32d then LTO with ilp32e is generally a bad idea.

Another issue is the LLVM isn't compatible between different ABI, e.g. ilp32 and ilp32f having different LLVM IR when passing a struct with a int and a float[2].

[1] https://reviews.llvm.org/D71387#1792169

On other targets, the instruction set used is encoded at a per-function level. So on x86, for example, you can have an "AVX" codepath and an "SSE" codepath, and use runtime CPU detection to figure out which to use.

Give an example to explain what problem we have now and what option we have:

$ clang -target riscv64-elf -flto a.c -o a.o -march=rv64gc            # a.o build with -march=rv64gc
$ clang -target riscv64-elf -flto b.c -o b.o -march=rv64g             # b.o build with -march=rv64g
$ clang -target riscv64-elf  a.o b.o -o a.out -flto -march=rv64gc_zba # and LTO phase use -march=rv64gc_zba, what target feature (or ISA extensions) should be used for all function

Possible solution/results:

All functions in a.o and b.o using same target features during the first build stage, -march=rv64gc for a.o, -march=rv64g for b.o, and -march option given in LTO CodeGen stage is ignored, it only used for ELF attribute use (this revision).
All functions in a.o and b.o using same target features during the first build stage, -march=rv64gc for a.o, -march=rv64g for b.o, and deduced arch info from those .o for ELF attribute use (D106347), -march
All functions in a.o and b.o re-compile with -march=rv64gc_zba and ELF attribute use rv64gc_zba.

Option 1: Require user use right -march option during LTO stage, and might fill wrong/unexpected ELF attribute if give wrong -march or not even not given in LTO stage.
Option 2: Should be more ideal, but D106347 seems no progress for a while.
Option 3: This option will break IFUNC usage.

Possible solution/results:

All functions in a.o and b.o using same target features during the first build stage, -march=rv64gc for a.o, -march=rv64g for b.o, and -march option given in LTO CodeGen stage is ignored, it only used for ELF attribute use (this revision).

All functions in a.o and b.o using same target features during the first build stage, -march=rv64gc for a.o, -march=rv64g for b.o, and deduced arch info from those .o for ELF attribute use (D106347), -march

All functions in a.o and b.o re-compile with -march=rv64gc_zba and ELF attribute use rv64gc_zba.

Option 1: Require user use right -march option during LTO stage, and might fill wrong/unexpected ELF attribute if give wrong -march or not even not given in LTO stage.
Option 2: Should be more ideal, but D106347 seems no progress for a while.
Option 3: This option will break IFUNC usage.

This patch (Option 1) is look good to me, but maybe we need to report a warning in linking stage if possible.
I think users may not easy to specific the right -march string when they're using external libraries.
We had discussed that before in here and here, it's why I proposed Option 2 which encodes a module scope arch features in IR. IIRC, it's similar to what gcc did.

anton-afanasyev added a subscriber: anton-afanasyev.Sep 15 2022, 2:13 AM

MTC added a subscriber: MTC.Sep 16 2022, 4:56 AM

@lewis-revill could you please provide the case where the LTO linker would not receive any information about target features? We met similar problems, it appears on runtime functions in our cases. I made a small fix for this problem https://reviews.llvm.org/D139704. Do you have any other problem cases?

khchen mentioned this in D72245: [PoC][RISCV][LTO] Pass target-abi via module flag metadata.May 23 2023, 9:32 AM

khchen mentioned this in D78035: [PoC][RISCV] enable LTO/ThinLTO on RISCV.

craig.topper mentioned this in D155326: [Clang][lld][RISCV] Passing 'mattr' to lld/llvmgold.Jul 14 2023, 1:09 PM

Revision Contents

Path

Size

clang/

lib/

Driver/

ToolChains/

Arch/

RISCV.h

4 lines

RISCV.cpp

12 lines

CommonArgs.cpp

1 line

test/

Driver/

lto.c

5 lines

Diff 456301

clang/lib/Driver/ToolChains/Arch/RISCV.h

	Show All 24 Lines
	StringRef getRISCVABI(const llvm::opt::ArgList &Args,			StringRef getRISCVABI(const llvm::opt::ArgList &Args,
	const llvm::Triple &Triple);			const llvm::Triple &Triple);
	StringRef getRISCVArch(const llvm::opt::ArgList &Args,			StringRef getRISCVArch(const llvm::opt::ArgList &Args,
	const llvm::Triple &Triple);			const llvm::Triple &Triple);

	void addRISCVTargetABIArgs(const ToolChain &ToolChain,			void addRISCVTargetABIArgs(const ToolChain &ToolChain,
	const llvm::opt::ArgList &Args,			const llvm::opt::ArgList &Args,
	llvm::opt::ArgStringList &CmdArgs);			llvm::opt::ArgStringList &CmdArgs);

				void addRISCVTargetFeatureArgs(const ToolChain &ToolChain,
				const llvm::opt::ArgList &Args,
				llvm::opt::ArgStringList &CmdArgs);
	} // end namespace riscv			} // end namespace riscv
	} // namespace tools			} // namespace tools
	} // end namespace driver			} // end namespace driver
	} // end namespace clang			} // end namespace clang

	#endif // LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_ARCH_RISCV_H			#endif // LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_ARCH_RISCV_H

clang/lib/Driver/ToolChains/Arch/RISCV.cpp

Show First 20 Lines • Show All 307 Lines • ▼ Show 20 Lines	void riscv::addRISCVTargetABIArgs(const ToolChain &ToolChain,
const llvm::opt::ArgList &Args,		const llvm::opt::ArgList &Args,
llvm::opt::ArgStringList &CmdArgs) {		llvm::opt::ArgStringList &CmdArgs) {
// We need to pass target-abi option to check it is equal to module's		// We need to pass target-abi option to check it is equal to module's
// target-abi information.		// target-abi information.
StringRef ABIName = getRISCVABI(Args, ToolChain.getTriple());		StringRef ABIName = getRISCVABI(Args, ToolChain.getTriple());
CmdArgs.push_back(		CmdArgs.push_back(
Args.MakeArgString(Twine("-plugin-opt=-target-abi=") + ABIName));		Args.MakeArgString(Twine("-plugin-opt=-target-abi=") + ABIName));
}		}

		void riscv::addRISCVTargetFeatureArgs(const ToolChain &ToolChain,
		const llvm::opt::ArgList &Args,
		llvm::opt::ArgStringList &CmdArgs) {
		const Driver &D = ToolChain.getDriver();

		// Pass a plugin-opt for each of the target features to the LTO linker.
		std::vector<StringRef> Features;
		riscv::getRISCVTargetFeatures(D, ToolChain.getTriple(), Args, Features);
		for (StringRef F : Features)
		CmdArgs.push_back(Args.MakeArgString("-plugin-opt=-mattr=" + F));
		}

clang/lib/Driver/ToolChains/CommonArgs.cpp

Show First 20 Lines • Show All 631 Lines • ▼ Show 20 Lines	#endif

// pass more options in specific target		// pass more options in specific target
switch (ToolChain.getArch()) {		switch (ToolChain.getArch()) {
default:		default:
break;		break;
case llvm::Triple::riscv32:		case llvm::Triple::riscv32:
case llvm::Triple::riscv64: {		case llvm::Triple::riscv64: {
riscv::addRISCVTargetABIArgs(ToolChain, Args, CmdArgs);		riscv::addRISCVTargetABIArgs(ToolChain, Args, CmdArgs);
		riscv::addRISCVTargetFeatureArgs(ToolChain, Args, CmdArgs);
break;		break;
}		}
}		}

// Handle remark diagnostics on screen options: '-Rpass-*'.		// Handle remark diagnostics on screen options: '-Rpass-*'.
renderRpassOptions(Args, CmdArgs);		renderRpassOptions(Args, CmdArgs);

// Handle serialized remarks options: '-fsave-optimization-record'		// Handle serialized remarks options: '-fsave-optimization-record'
▲ Show 20 Lines • Show All 1,510 Lines • Show Last 20 Lines

clang/test/Driver/lto.c

	Show First 20 Lines • Show All 149 Lines • ▼ Show 20 Lines
	// RUN: -mabi=lp64 -### 2>&1 \| FileCheck %s --check-prefix=RISCV-SPEC-ABI-3			// RUN: -mabi=lp64 -### 2>&1 \| FileCheck %s --check-prefix=RISCV-SPEC-ABI-3
	// RUN: %clang -target riscv64-unknown-linux-gnu %s -fuse-ld=lld -flto \			// RUN: %clang -target riscv64-unknown-linux-gnu %s -fuse-ld=lld -flto \
	// RUN: -mabi=lp64f -### 2>&1 \| FileCheck %s --check-prefix=RISCV-SPEC-ABI-4			// RUN: -mabi=lp64f -### 2>&1 \| FileCheck %s --check-prefix=RISCV-SPEC-ABI-4
	//			//
	// RISCV-SPEC-ABI-1: "-plugin-opt=-target-abi=ilp32f"			// RISCV-SPEC-ABI-1: "-plugin-opt=-target-abi=ilp32f"
	// RISCV-SPEC-ABI-2: "-plugin-opt=-target-abi=ilp32d"			// RISCV-SPEC-ABI-2: "-plugin-opt=-target-abi=ilp32d"
	// RISCV-SPEC-ABI-3: "-plugin-opt=-target-abi=lp64"			// RISCV-SPEC-ABI-3: "-plugin-opt=-target-abi=lp64"
	// RISCV-SPEC-ABI-4: "-plugin-opt=-target-abi=lp64f"			// RISCV-SPEC-ABI-4: "-plugin-opt=-target-abi=lp64f"

				// RUN: %clang -target riscv32-unknown-elf %s -fuse-ld=gold -flto -mrelax \
				// RUN: -### 2>&1 \| FileCheck %s --check-prefix=RV32-RELAX-ATTR
				//
				// RV32-RELAX-ATTR: "-plugin-opt=-mattr=+relax"