This is an archive of the discontinued LLVM Phabricator instance.

[clang] [Driver] More flexible rules for loading default configs
ClosedPublic

Authored by mgorny on Sep 21 2022, 1:03 AM.

Details

Summary

Change the default config file loading logic to be more flexible
and more readable at the same time. The new algorithm focuses on four
locations, in order:

  1. <triple>-<mode>.cfg using real driver mode
  2. <triple>-<mode>.cfg using executable suffix
  3. <triple>.cfg + <mode>.cfg using real driver mode
  4. <triple>.cfg + <mode>.cfg using executable suffix

This is meant to preserve reasonable level of compatibility with
the existing use, while introducing more flexibility and making the code
simpler. Notably:

  1. In this layout, the actual target triple is normally respected, and e.g. in -m32 build the x86_64-* configs will never be used.
  1. Both real driver mode (preferable) and executable suffix are supported. This permits correctly handling calls with explicit --driver-mode= while at the same time preserving compatibility with the existing code.
  1. The first two locations provide users with the ability to override configuration per specific target+mode combinaton, while the next two make it possible to independently specify per-target and per-mode configuration.
  1. All config file locations are applicable independently of whether clang is started via a prefixed executable, or bare clang.
  1. If the target is not explicitly specified and the executable prefix does not name a valid triple, it is used instead of the actual target triple for backwards compatibility.

This is particularly meant to address Gentoo's use case for
configuration files: to configure the default runtimes (i.e. -rtlib=,
-stdlib=) and --gcc-install-dir= for all the relevant drivers,
as well as to make it more convenient for users to override -W flags
to test compatibility with future versions of Clang easier.

Diff Detail

Event Timeline

mgorny created this revision.Sep 21 2022, 1:03 AM
Herald added a project: Restricted Project. · View Herald Transcript
mgorny requested review of this revision.Sep 21 2022, 1:03 AM

@sepavloff, I'd appreciate if you could take a look at the logic before I start updating tests.

To explain this better, let's take the following example call:

x86_64-pc-linux-gnu-clang --driver-mode=g++ -target i386

This maps to:

  • effective triple: i386 (from -target)
  • effective mode: clang++ (from --driver-mode)
  • name prefix: x86_64-pc-linux-gnu
  • "fixed" name prefix: i386-pc-linux-gnu (with arch substituted based on effective triple)
  • name suffix: clang

The algorithm will first try to use the first of the following configuration files:

  1. i386-clang++.cfg (effective triple + effective mode)
  2. i386-pc-linux-gnu-clang.cfg ("fixed" prefix + suffix)
  3. x86_64-pc-linux-gnu-clang.cfg (orig. prefix + suffix)

If neither of these files are found, it will try to load one config file for the target and another for the driver.

For target, it will try:

  1. i386.cfg (effective triple)
  2. i386-pc-linux-gnu.cfg ("fixed" prefix)
  3. x86_64-pc-linux-gnu.cfg (orig. prefix)

For driver, it will try:

  1. clang++.cfg (effective mode)
  2. clang.cfg (suffix)

This version doesn't include any further fallback, i.e. assumes we need to cover all triples and driver modes we care about. However, adding one should be trivial.

clang/lib/Driver/Driver.cpp
1070–1085

This part is mostly proof-of-concept, I need to look into how to do this properly.

mgorny updated this revision to Diff 461960.Sep 21 2022, 11:27 AM

Simplify getting driver mode — it turns out it's set for us already!

MaskRay added inline comments.Sep 21 2022, 12:44 PM
clang/lib/Driver/Driver.cpp
6192

If the switch covers all enum members, we typically use llvm_unreachable to work around a GCC -Wswitch warning.
Clang correctly knows all branches are covered and do not need llvm_unreachable

@sepavloff, gentle ping. I'm waiting for your ACK/NAK on the algo in https://reviews.llvm.org/D134337#3805368.

I would propose to slightly modify the config file search algorithm.

For the tool named as x86_64-pc-linux-gnu-clang, the existing algorithm would search for the files:

  • i386-clang.cfg
  • i386.cfg
  • x86_64-clang.cfg
  • x86_64.cfg

We could modify this algorithm to obtain more flexible one and still preserve backward compatibility as much as possible. Let's use the invocation x86_64-pc-linux-gnu-clang --driver-mode=g++ -target i386 as example and see what files the tool would search for in the modified algorithm.

First clang tries to find x86_64-pc-linux-gnu-clang.cfg. Just tool name with added suffix cfg. No target override, no attempt to split tool name into components. It provides possibility to have a separate configuration for any tool. The existing mechanism does not search for such file, so compatibility cannot be broken. This variant makes it possible to use configuration file even if tool name is arbitrary and does not follow usual pattern.

If x86_64-pc-linux-gnu-clang.cfg is found, it is loaded as configuration file and that's all. Otherwise the middle components are dropped and only target prefix and driver mode suffix are used for the search. It is similar to the search made by existing implementation, the following files that are looked for :

  • i386-clang.cfg
  • i386.cfg
  • x86_64-clang.cfg
  • x86_64.cfg

If the file found has form target.cfg or none of the files was found, clang tries to load driver-mode configuration file. The files are the same as in your algorithm:

  • clang++.cfg
  • clang.cfg

The driver-mode configuration is loaded prior to the target config, - as it may contain default settings that target config may override. In the current implementation such files are not, risk of breaking compatibility is minimal.

What do you think about this variant? Does it fit your use case?

First clang tries to find x86_64-pc-linux-gnu-clang.cfg. Just tool name with added suffix cfg. No target override, no attempt to split tool name into components. It provides possibility to have a separate configuration for any tool. The existing mechanism does not search for such file, so compatibility cannot be broken. This variant makes it possible to use configuration file even if tool name is arbitrary and does not follow usual pattern.

That's incorrect. The existing algorithm uses precisely this file if that's how clang is named:

$ ./bin/x86_64-pc-linux-gnu-clang --version
clang version 16.0.0 (git@github.com:llvm/llvm-project.git 17779984ca4441657071ea5abe6ef2bb32306dce)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /home/mgorny/git/llvm-project/build/./bin
Configuration file: /tmp/clang-config/x86_64-pc-linux-gnu-clang.cfg

Also note that it performs the arch substitution on this path:

$ ./bin/x86_64-pc-linux-gnu-clang --version -m32
clang version 16.0.0 (git@github.com:llvm/llvm-project.git 17779984ca4441657071ea5abe6ef2bb32306dce)
Target: i386-pc-linux-gnu
Thread model: posix
InstalledDir: /home/mgorny/git/llvm-project/build/./bin
Configuration file: /tmp/clang-config/i386-pc-linux-gnu-clang.cfg

So your proposal breaks backwards compatibility.

If x86_64-pc-linux-gnu-clang.cfg is found, it is loaded as configuration file and that's all. Otherwise the middle components are dropped and only target prefix and driver mode suffix are used for the search. It is similar to the search made by existing implementation, the following files that are looked for :

  • i386-clang.cfg
  • i386.cfg
  • x86_64-clang.cfg
  • x86_64.cfg

If the file found has form target.cfg or none of the files was found, clang tries to load driver-mode configuration file. The files are the same as in your algorithm:

  • clang++.cfg
  • clang.cfg

The driver-mode configuration is loaded prior to the target config, - as it may contain default settings that target config may override. In the current implementation such files are not, risk of breaking compatibility is minimal.

What do you think about this variant? Does it fit your use case?

Well, my immediate use case only needs per-driver mode configuration files, so it would work — presuming we're using the real driver mode and not the executable suffix.

However, I definitely do see potential use cases for per-target configuration files — for example, providing target-specific paths instead of hardcoding them in the driver. Please note that architecture part alone is insufficient here — there are ABIs (such as n32 ABI on mips or x32 ABI on x86_64) that can only be distinguished by -gnuabin32 or -gnux32 suffix rather than the architecture part. This is also relevant to hardfloat/softfloat targets, and the larger part of the triple is definitely needed for cross-toolchains that may target different operating systems.

Perhaps it would be easier if you described your use case to me — I presume you have one as author of the original patch.

First clang tries to find x86_64-pc-linux-gnu-clang.cfg. Just tool name with added suffix cfg. No target override, no attempt to split tool name into components. It provides possibility to have a separate configuration for any tool. The existing mechanism does not search for such file, so compatibility cannot be broken. This variant makes it possible to use configuration file even if tool name is arbitrary and does not follow usual pattern.

That's incorrect. The existing algorithm uses precisely this file if that's how clang is named:

I was wrong, the target prefix reported by ToolChain::getTargetAndModeFromProgramName is everything but the driver component, so indeed the first file tried for configuration is x86_64-pc-linux-gnu-clang.cfg. So the new algorithm would try files tried by current implementation and risk of breaking compatibility would be minimized.

For target, it will try:

  1. i386.cfg (effective triple)
  2. i386-pc-linux-gnu.cfg ("fixed" prefix)
  3. x86_64-pc-linux-gnu.cfg (orig. prefix)

Does it mean that i386.cfg would be tried before i386-pc-linux-gnu.cfg? It looks more natural to try specific variant first then generic. File x86_64.cfg is not used in this algorithm, what is the reason for this?

For target, it will try:

  1. i386.cfg (effective triple)
  2. i386-pc-linux-gnu.cfg ("fixed" prefix)
  3. x86_64-pc-linux-gnu.cfg (orig. prefix)

Does it mean that i386.cfg would be tried before i386-pc-linux-gnu.cfg? It looks more natural to try specific variant first then generic. File x86_64.cfg is not used in this algorithm, what is the reason for this?

Yes but this is only because the example explicitly passes -target i386, i.e. incompletely triple. If you passed -target i386-pc-linux-gnu, then it would try that one instead. Perhaps I should try and see what triple normalization does here.

MaskRay added a comment.EditedSep 26 2022, 10:40 AM

For target, it will try:

  1. i386.cfg (effective triple)
  2. i386-pc-linux-gnu.cfg ("fixed" prefix)
  3. x86_64-pc-linux-gnu.cfg (orig. prefix)

Does it mean that i386.cfg would be tried before i386-pc-linux-gnu.cfg? It looks more natural to try specific variant first then generic. File x86_64.cfg is not used in this algorithm, what is the reason for this?

Yes but this is only because the example explicitly passes -target i386, i.e. incompletely triple. If you passed -target i386-pc-linux-gnu, then it would try that one instead. Perhaps I should try and see what triple normalization does here.

Trying just i386.cfg for x86_64-pc-linux-gnu-clang --target=i386 makes sense to me. Trying x86_64*.cfg seems inappropriate.
I do not recommend i386-pc-linux-gnu.cfg since --target= is as if completely ignoring the executable prefix.

If x86_64-pc-linux-gnu-clang --target=i386-pc-linux-gnu is specified, trying i386-pc-linux-gnu-clang.cfg makes sense. I do not know whether i386-clang.cfg should be tried as well.

For target, it will try:

  1. i386.cfg (effective triple)
  2. i386-pc-linux-gnu.cfg ("fixed" prefix)
  3. x86_64-pc-linux-gnu.cfg (orig. prefix)

Does it mean that i386.cfg would be tried before i386-pc-linux-gnu.cfg? It looks more natural to try specific variant first then generic. File x86_64.cfg is not used in this algorithm, what is the reason for this?

Yes but this is only because the example explicitly passes -target i386, i.e. incompletely triple. If you passed -target i386-pc-linux-gnu, then it would try that one instead. Perhaps I should try and see what triple normalization does here.

Trying just i386.cfg for x86_64-pc-linux-gnu-clang --target=i386 makes sense to me. Trying x86_64*.cfg seems inappropriate.
I do not recommend i386-pc-linux-gnu.cfg since --target= is as if completely ignoring the executable prefix.

If x86_64-pc-linux-gnu-clang --target=i386-pc-linux-gnu is specified, trying i386-pc-linux-gnu-clang.cfg makes sense. I do not know whether i386-clang.cfg should be tried as well.

It make sense but the algorithm looks overcomplicated. The current implementation is already too complex and documenting the new algorithm is a challenge.

What about making minimal changes to the current algorithm so that the task of @mgorny can be done, without reasoning about possible profit for someone else? Our customers don't use target overloading and prefixes longer than target, so any change that preserve existing functionality is OK.

A part of this algorithm that loads driver mode configuration is worth implementing:

For driver, it will try:

  1. clang++.cfg (effective mode)
  2. clang.cfg (suffix)

It must not break compatibility with existing implementation and allows further extension in future. For more complex use cases driver configuration may implement mode complex syntax as proposed in https://discourse.llvm.org/t/rfc-adding-a-default-file-location-to-config-file-support/63606/41.

@mgorny what modifications to target mode file search are necessary for your task?

It make sense but the algorithm looks overcomplicated. The current implementation is already too complex and documenting the new algorithm is a challenge.

What about making minimal changes to the current algorithm so that the task of @mgorny can be done, without reasoning about possible profit for someone else? Our customers don't use target overloading and prefixes longer than target, so any change that preserve existing functionality is OK.

Do your customers need all of the current functions? Could you perhaps specify the absolutely minimal algorithm you'd need to work, and I'll try to make the implementation as simple as possible.

For driver, it will try:

  1. clang++.cfg (effective mode)
  2. clang.cfg (suffix)

It must not break compatibility with existing implementation and allows further extension in future. For more complex use cases driver configuration may implement mode complex syntax as proposed in https://discourse.llvm.org/t/rfc-adding-a-default-file-location-to-config-file-support/63606/41.

@mgorny what modifications to target mode file search are necessary for your task?

For my immediate task, I'm perfectly happy with leaving it as-is, as long as it won't get in the way of getting effective driver mode config in. I'm a bit worried that better target support may be useful in the future but I suppose that ship has sailed already.

mgorny updated this revision to Diff 463201.Sep 27 2022, 5:43 AM
mgorny edited the summary of this revision. (Show Details)

Ok, how about this variant? I think it's the simplest I can come up with that roughly matches the old behavior and adds what's necessary for the new.

The algorithm is to use:

  1. <triple>-<mode>.cfg if available, using the real <mode>,
  2. <triple>-<mode>.cfg if available, using <mode> inferred from executable name,
  3. <triple>.cfg + <mode>.cfg if either is available, using the real <mode>,
  4. <triple>.cfg + <mode>.cfg if either is available, using the <mode> inferred from executable.

The important features of this are:

  1. Triple accounts for -m32 and other options changing mode, so x86_64-specific configs won't be used when building 32-bit executables.
  2. Triple accounts for executable prefix, so the existing configs will continue to work (presuming that your customers do not use --target, as you noted).
  3. It accounts both for real mode and mode inferred from executable, so it will prefer the correct config when --driver-mode= is used but it will also work with existing configs that used different executable names.

Unlike the previous mode, there's no potential confusion between 32-bit and 64-bit configs, and the logic is definitely less confusing.

Ok, how about this variant? I think it's the simplest I can come up with that roughly matches the old behavior and adds what's necessary for the new.

The algorithm is to use:

  1. <triple>-<mode>.cfg if available, using the real <mode>,
  2. <triple>-<mode>.cfg if available, using <mode> inferred from executable name,
  3. <triple>.cfg + <mode>.cfg if either is available, using the real <mode>,
  4. <triple>.cfg + <mode>.cfg if either is available, using the <mode> inferred from executable.

This algorithm looks good. It is compatible with the existing algorithm if target is not overridden. In the case of overridden target behavior is different, but you a right, using x86_64.cfg in compilation for 32-bit target is not natural. Probably we should document the difference in ReleaseNotes.rst. Also documentation in UserManual.rst should be updated.

Thanks. Since the algorithm's good, I'll work on updating the docs and tests.

mgorny updated this revision to Diff 463274.Sep 27 2022, 10:32 AM

Update tests. Docs coming up next.

mgorny updated this revision to Diff 463288.Sep 27 2022, 11:16 AM

Added doc update. Would use some help with release notes.

mgorny edited the summary of this revision. (Show Details)Sep 27 2022, 1:46 PM
mgorny added a reviewer: thesamesam.

Adding @thesamesam to reviewers since he's been testing it on Gentoo, and I'd like to get his ACK too.

mgorny updated this revision to Diff 463413.Sep 27 2022, 9:20 PM
mgorny retitled this revision from [clang] [Driver] More flexible rules for loading default configs (WIP) to [clang] [Driver] More flexible rules for loading default configs.

Update release notes. I may have been a bit verbose/repeating myself.

Thanks. The previous behavior is too magical. I like this version.

clang/docs/ReleaseNotes.rst
243 ↗(On Diff #463413)

-target => --target=

avoid legacy spelling

clang/test/Driver/config-file3.c
33

Since %t is rebuilt. You can just use touch

47

Is // FULL1-NOT: Configuration file: sufficient?

62–82

Is // FULL1-NOT: Configuration file: sufficient?

pengfei added inline comments.Sep 27 2022, 10:43 PM
clang/test/Driver/config-file3.c
33

touch may not work on Windows?

mgorny marked 4 inline comments as done.Sep 27 2022, 10:59 PM
mgorny added inline comments.
clang/docs/ReleaseNotes.rst
243 ↗(On Diff #463413)

Ah, so it needs = (I really hate the option parsing in clang, you know?).

BTW that's what got me to use -target:

$ clang --target foo
clang-16: error: unsupported option '--target'; did you mean '-target'?
clang/test/Driver/config-file3.c
33

Well, my experience tells me to avoid external commands when the goal can be achieved via a builtin but touch is more popular in clang indeed.

47

Indeed. I forgot that FileCheck is weird-ish about retrying the matches ;-).

mgorny updated this revision to Diff 463424.Sep 27 2022, 11:00 PM
mgorny marked 2 inline comments as done.

Use --target= instead of -target. Use touch instead of echo >. Shorten FileCheck matches and guarantee that only expected configs appear.

MaskRay accepted this revision.Sep 27 2022, 11:03 PM
MaskRay added inline comments.
clang/lib/Driver/Driver.cpp
6192

llvm_unreachable

This revision is now accepted and ready to land.Sep 27 2022, 11:03 PM
mgorny updated this revision to Diff 463433.Sep 27 2022, 11:37 PM
mgorny marked 2 inline comments as done.

Use llvm_unreachable().

sepavloff added inline comments.Sep 28 2022, 12:45 AM
clang/docs/ReleaseNotes.rst
236–239 ↗(On Diff #463424)

I would say this paragraph is a combination of two provided below. IMHO.

240 ↗(On Diff #463424)

This paragraph is actually about two changes. One is about loading driver mode file, this is an extension. The other is about skipping files using original target prefix. It is potentially breaking change and is worth separate paragraph.

clang/docs/UsersManual.rst
918 ↗(On Diff #463424)

I would propose changing <target>-><triple> and target->triple to reduce misunderstanding.

949–950 ↗(On Diff #463424)

Maybe target->triple, <target.cfg>-><triplr.cfg>?

964 ↗(On Diff #463424)

I think we need to put something like:
It is not an error if any or both of these files are not found.

clang/lib/Driver/Driver.cpp
1089

Can the variable name be better? It looks like it means that driver mode is replaced (overridden).

clang/test/Driver/config-file3.c
33

IIRC, shell is not available on Windows.

mgorny marked 4 inline comments as done.Sep 28 2022, 12:59 AM
mgorny added inline comments.
clang/docs/ReleaseNotes.rst
236–239 ↗(On Diff #463424)

Yeah, it was intentional. I was thinking it would be cleaner to describe this in tandem but I won't insist. Just please confirm whether I should remove it or keep it.

240 ↗(On Diff #463424)

Updated.

clang/lib/Driver/Driver.cpp
1089

Can you suggest a better name? It's used to avoid searching for the same filename twice.

6192

Right, sorry, forgot about it.

mgorny updated this revision to Diff 463455.Sep 28 2022, 12:59 AM
mgorny marked an inline comment as done.

Updated docs per @sepavloff 's suggestions.

sepavloff added inline comments.Sep 28 2022, 1:23 AM
clang/docs/ReleaseNotes.rst
236–239 ↗(On Diff #463424)

No, I don't insist. It is OK.

clang/lib/Driver/Driver.cpp
1089

DriverModeReplaced? Or something like that.

mgorny marked 2 inline comments as done.Sep 28 2022, 2:11 AM
mgorny added inline comments.
clang/lib/Driver/Driver.cpp
1089

Ok, how about something more literal? TryModeSuffix.

mgorny updated this revision to Diff 463464.Sep 28 2022, 2:12 AM

Rename the bool to TryModeSuffix.

thesamesam accepted this revision.Sep 28 2022, 2:15 AM
sepavloff added inline comments.Sep 28 2022, 2:45 AM
clang/lib/Driver/Driver.cpp
1098

Loading driver-mode config should take place before loading target config, it allows to override default settings set by driver mode config in target config. Also a test is need for such case.

clang/test/Driver/config-file3.c
27

Tests must check the case when target prefix is not a real triple as in the original test (qqq-clang).

68

We also need a test that checks that in the case of --target=i386-unknown-linux-gnu and absence of i386-unknown-linux-gnu-clang++.cfg clang does not load x86_64-unknown-linux-gnu-clang-g++.cfg.

226–228

target and driver mode configs should be in reverse order.

arichardson added inline comments.Sep 28 2022, 2:52 AM
clang/test/Driver/config-file3.c
27

It would be quite important for me that this continues to work. I made use of that in the CheriBSD toolchain when creating symlinked binaries to easily build for different ABIs such as cheribsd-riscv64-hybrid-clang++ and cheribsd-riscv64-purecap-clang-cpp. It appears this previously only worked if the prefix did not start with a valid triple (which is why I put the OS before the architecture). I think it would also be nice if the whole prefix was checked even if it starts with a valid triple, but this does not need to be changed in this patch (haven't looked at it in detail so this might actually work).

mgorny added inline comments.Sep 28 2022, 3:24 AM
clang/lib/Driver/Driver.cpp
1098

Do you have any specific option in mind? The main idea was that you use full triple+driver config when you need overrides because — as we pretty much established before — the options accepted by different drivers are too different for them to be meaningfully used with a single config file shared between targets.

clang/test/Driver/config-file3.c
27

If the prefix is not a valid triple, then clang ignores it and uses host triple instead. And now we're back to square one. If I check both variants, it's too complex. If I don't, it's bad too.

sepavloff added inline comments.Sep 28 2022, 3:45 AM
clang/lib/Driver/Driver.cpp
1098

I mean we can have several driver-mode configs like tool.cfg, where some default settings are set (maybe by using @file constructs`) and several target.cfg which may override options defined in tool.cfg. Due to different target and driver configs loaded separately, we can avoid creating separate config for each target+driver combination. Not a killer feature, but it is convenient.

As I understand, it is enough to swap code that tries loading driver config below with this piece of code, no?

clang/test/Driver/config-file3.c
27

Our customers use this feature. Target prefix may designate, for example, debug build or build with specific framework.

mgorny added inline comments.Sep 28 2022, 4:03 AM
clang/test/Driver/config-file3.c
27

How are we supposed to avoid the "absurd" case where x86_64 configs are loaded for -m32 invocation then?

sepavloff added inline comments.Sep 28 2022, 4:14 AM
clang/test/Driver/config-file3.c
27

In this case overloading target does not makes sense. You need to analyze RealTriple in Driver::loadDefaultConfigFiles and if it is wrong, use target prefix as if it is real target.

mgorny added inline comments.Sep 28 2022, 4:30 AM
clang/test/Driver/config-file3.c
27

What do you mean by "wrong"? The target triple is always a correct triple.

sepavloff added inline comments.Sep 28 2022, 4:38 AM
clang/test/Driver/config-file3.c
27

Not triple, sorry. Target prefix taken from ClangNameParts.TargetPrefix.

mgorny added inline comments.Sep 28 2022, 5:01 AM
clang/test/Driver/config-file3.c
27

Now I'm confused. Are you suggesting that we use prefix instead of target if it's… incorrect?

sepavloff added inline comments.Sep 28 2022, 5:17 AM
clang/test/Driver/config-file3.c
27

Basically yes.

We need to support arbitrary target prefixes, because they are used now. If ClangNameParts.TargetPrefix is not a valid triple, use it to build config file names.

We also need to use config files like x86_64.cfg where middle components are dropped, they are also used. They also need to support target overloading.

mgorny updated this revision to Diff 463531.Sep 28 2022, 6:43 AM
mgorny marked 2 inline comments as done.

Swap driver and target config loading order.

arichardson added inline comments.Sep 28 2022, 6:43 AM
clang/test/Driver/config-file3.c
27

I can replace my approach with simple shell scripts that invoke clang --config=..., but IMO it would be nice if the logic was something like the first check being
if "explicit --target empty" and "ClangNameParts.TargetPrefix non-empty" and "ClangNameParts.TargetPrefix not a valid triple" -> load ClangNameParts.TargetPrefix + ".cfg" followed by the current logic.
I don't mind either way whether the first check should also load a driver-specific config file or not. I'd also be happy if the logic was "try loading ClangNameParts.TargetPrefix if explicit --target empty" regardless of whether it is a valid triple or not.
Does this approach sound reasonable?

mgorny marked an inline comment as done.Sep 28 2022, 6:46 AM
mgorny added inline comments.
clang/test/Driver/config-file3.c
27

Ok, so I guess that the logic would be: if name prefix's (non-normalized) triple does not have valid arch or valid os parts, we use that instead of real triple.

68

Isn't this what FULL2 + -m32 tests for?

mgorny updated this revision to Diff 463548.Sep 28 2022, 7:25 AM
mgorny edited the summary of this revision. (Show Details)

Implement the backwards compatible logic for using name prefix if 1) no target override options are used, and 2) name prefix is not a valid triple.

@sepavloff, @arichardson, does this approach look OK? I've included the cheribsd name in tests.

sepavloff accepted this revision.Sep 29 2022, 12:14 AM

LGTM.

Thanks!

MaskRay added inline comments.Sep 29 2022, 12:33 AM
clang/include/clang/Driver/Driver.h
758

This should be private

clang/lib/Driver/Driver.cpp
1094

"executable suffix" is less clear. An example will help.

1101

Mention "first part" ? (And what does "first part refer to")

mgorny edited the summary of this revision. (Show Details)Sep 29 2022, 12:35 AM
mgorny updated this revision to Diff 463825.Sep 29 2022, 3:26 AM
mgorny marked 2 inline comments as done.

Address @MaskRay 's comments.

clang/include/clang/Driver/Driver.h
758

No opinion but will do.

clang/lib/Driver/Driver.cpp
1094

Added for *clang-g++ below. Hope that's good enough.

1101

I don't understand this comment.

MaskRay added inline comments.Sep 29 2022, 9:44 AM
clang/lib/Driver/Driver.cpp
1070–1085

The next common mentions "second part" but it isn't very clear what "first part" and "second part" mean.
Perhaps repeat <triple>.cfg here and replace "second part" with "<mode>.cfg"

mgorny marked 2 inline comments as done.Sep 29 2022, 9:51 AM
mgorny added inline comments.
clang/lib/Driver/Driver.cpp
1070–1085

Oh, it's so weird because the comment was moved when I was swapping the logic but didn't notice it needs to be updated.

mgorny updated this revision to Diff 463944.Sep 29 2022, 9:51 AM
mgorny marked an inline comment as done.

Change confusing comments.

MaskRay accepted this revision.Sep 29 2022, 10:46 AM
MaskRay added inline comments.Sep 29 2022, 10:50 AM
clang/test/Driver/config-file3.c
114–124

This pattern can be replaced with FileCheck's --implicit-check-not='Configuration file:'

It might be nice to work with rustc folks on the format of these configs. A frequent difficulty between Rust and C interop is the totally different command line flags necessary to compile code meant to link together and match as much of the ABI as possible.

ojeda added a subscriber: ojeda.Sep 29 2022, 11:25 AM

Thanks Nick for the ping. Indeed, rustc allows to use a "target specification file" to create new targets that are not built-in, as a JSON file.

They are unstable though, since they are fairly tied to LLVM, and I think the Rust compiler team prefers to stabilize flags instead.

However, it would definitely be great for users and build systems if GCC, Clang and rustc could get together and decide on a new, common config file, even if it just covered the target specification.

This revision was landed with ongoing or failed builds.Sep 29 2022, 12:00 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 29 2022, 12:00 PM

Thanks Nick for the ping. Indeed, rustc allows to use a "target specification file" to create new targets that are not built-in, as a JSON file.

They are unstable though, since they are fairly tied to LLVM, and I think the Rust compiler team prefers to stabilize flags instead.

However, it would definitely be great for users and build systems if GCC, Clang and rustc could get together and decide on a new, common config file, even if it just covered the target specification.

Clang just uses a plain text file similar to a response file (which one can specify with clang ... @a.rsp before Clang got --config support).
It's unclear Clang wants to support GCC style specs file and whether GCC wants to adopt another system beside its specs.
I lean toward there isn't much cooperation as we might think.

mgorny marked an inline comment as done.Sep 29 2022, 11:09 PM
mgorny added inline comments.
clang/test/Driver/config-file3.c
114–124

Thanks. I've only read your comment after pushing but I've done that now, and it is a definite improvement:

[main 78c97b46e6e0] [clang] [test] Use --implicit-check-not in config-file3 tests
 Date: Fri Sep 30 07:59:41 2022 +0200
 1 file changed, 26 insertions(+), 68 deletions(-)

It's unclear Clang wants to support GCC style specs file and whether GCC wants to adopt another system beside its specs.
I lean toward there isn't much cooperation as we might think.

Perhaps there ought to be?

It's unclear Clang wants to support GCC style specs file and whether GCC wants to adopt another system beside its specs.
I lean toward there isn't much cooperation as we might think.

Perhaps there ought to be?

rust use very different compiler options from Clang/GCC, so it's clear what benefits a unified format can bring, if we consider that more established compilers want to prefer internal consistency and avoid introducing duplicate mechanism.
GCC uses specs files. There may be some obstacles introducing a new mechanism.
Clang Driver doesn't support specs files. We pick configuration files for now whose format isn't different from the traditional response files.

If anything, rust can make a decision to support Clang style configuration files.

tstellar added inline comments.
clang/docs/UsersManual.rst
954 ↗(On Diff #463983)

Is x86_64-pc-linux-gnu-clang-g++ being passed to --driver= in this example?

In what scenarios will clang load the clang.cfg file?

mgorny marked an inline comment as done.Oct 18 2022, 12:23 AM

In what scenarios will clang load the clang.cfg file?

It will load it if all of the following are true:

  1. --no-default-config is not passed.
  2. a more specific <triple>-clang.cfg is not found.
  3. --driver-mode=gcc is in effect or the executable is named clang or *-clang and --driver-mode= did not select another config file.
clang/docs/UsersManual.rst
954 ↗(On Diff #463983)

I don't understand your question. x86_64-pc-linux-gnu-clang-g++ is the executable (symlink) name here.

In what scenarios will clang load the clang.cfg file?

It will load it if all of the following are true:

  1. --no-default-config is not passed.
  2. a more specific <triple>-clang.cfg is not found.
  3. --driver-mode=gcc is in effect or the executable is named clang or *-clang and --driver-mode= did not select another config file.

I know I'm a little late here, but having a default config file that's always loaded makes triaging issues much harder, because now every time someone files a bug, we need to ask for the contents of their config directory.

clang/docs/UsersManual.rst
954 ↗(On Diff #463983)

OK, it's a symlink. That makes sense.

I know I'm a little late here, but having a default config file that's always loaded makes triaging issues much harder, because now every time someone files a bug, we need to ask for the contents of their config directory.

I don't see that as much worse than asking them for the values of *DEFAULT* CMake options or whether they have patched their clang because the driver code makes so many specific assumptions that it didn't work for them. At least clang -v tells us whether they are actually using any configs.

I know I'm a little late here, but having a default config file that's always loaded makes triaging issues much harder, because now every time someone files a bug, we need to ask for the contents of their config directory.

I don't see that as much worse than asking them for the values of *DEFAULT* CMake options or whether they have patched their clang because the driver code makes so many specific assumptions that it didn't work for them. At least clang -v tells us whether they are actually using any configs.

At least as a distro maintainer, I know the CMake options used for the packages I'm trying to support.

I was also wondering, would it be possible to move some of the targets specific driver code out of the codebase and into config files that are shipped with clang?

I know I'm a little late here, but having a default config file that's always loaded makes triaging issues much harder, because now every time someone files a bug, we need to ask for the contents of their config directory.

I don't see that as much worse than asking them for the values of *DEFAULT* CMake options or whether they have patched their clang because the driver code makes so many specific assumptions that it didn't work for them. At least clang -v tells us whether they are actually using any configs.

At least as a distro maintainer, I know the CMake options used for the packages I'm trying to support.

I was also wondering, would it be possible to move some of the targets specific driver code out of the codebase and into config files that are shipped with clang?

We covered the need for this, I feel, quite extensively in the linked discussions - especially wrt the Clang 15/16 -Werror changes.

As for bug reports: just ask people to use --no-default-config if something looks dodgy?

I know I'm a little late here, but having a default config file that's always loaded makes triaging issues much harder, because now every time someone files a bug, we need to ask for the contents of their config directory.

I don't see that as much worse than asking them for the values of *DEFAULT* CMake options or whether they have patched their clang because the driver code makes so many specific assumptions that it didn't work for them. At least clang -v tells us whether they are actually using any configs.

At least as a distro maintainer, I know the CMake options used for the packages I'm trying to support.

As a distro maintainer, you can decide to build clang without system/user config directories set (which is the default), in which case only explicit --config will work.

I know I'm a little late here, but having a default config file that's always loaded makes triaging issues much harder, because now every time someone files a bug, we need to ask for the contents of their config directory.

I don't see that as much worse than asking them for the values of *DEFAULT* CMake options or whether they have patched their clang because the driver code makes so many specific assumptions that it didn't work for them. At least clang -v tells us whether they are actually using any configs.

At least as a distro maintainer, I know the CMake options used for the packages I'm trying to support.

As a distro maintainer, you can decide to build clang without system/user config directories set (which is the default), in which case only explicit --config will work.

Strictly speaking in this case the default config file may be loaded from the binary directory but the latter is controlled by distro.

Users also may split configuration file into pieces and include them using @file construct, in this case single config file is not enough. I think clang could print driver options in output of -v. Another way is to implement a special command line option, say -print-args, which would print all driver options.

Arfrever added a subscriber: Arfrever.EditedNov 6 2022, 6:58 AM

After reading this discussion, I need some clarification.

If there were the following symlinks pointing to clang: i386-pc-linux-gnu-clang, i486-pc-linux-gnu-clang, i586-pc-linux-gnu-clang, i686-pc-linux-gnu-clang, x86_64-pc-linux-gnu-clang.
Then do I assume correctly that:

  1. x86_64-pc-linux-gnu-clang -m32 would try to load i386.cfg
  2. i386-pc-linux-gnu-clang would try to load i386.cfg
  3. i486-pc-linux-gnu-clang would try to load i486.cfg
  4. i586-pc-linux-gnu-clang would try to load i586.cfg
  5. i686-pc-linux-gnu-clang would try to load i686.cfg

?

In modern multilib x86_64 systems, only i686-pc-linux-gnu-clang and x86_64-pc-linux-gnu-clang symlinks (from above list) are likely to exist, but discrepancy in behavior between x86_64-pc-linux-gnu-clang -m32 and i686-pc-linux-gnu-clang will cause confusion of users...