This is an archive of the discontinued LLVM Phabricator instance.

[clangd][query-driver] Extract GCC version from the driver output
AbandonedPublic

Authored by ArcsinX on Aug 2 2021, 12:53 PM.

Details

Summary

Clang uses 4.2.1 as a default GCC version.
For projects that rely on the GCC version, this can be a problem, e.g.

#if __GNUC__ < 5
#error "Incompatible compiler"
#endif

This patch extracts the GCC version from the driver output and adds -fgnuc-version=<extracted version> compile option to ensure that the values of GCC macros are correct.

Diff Detail

Event Timeline

ArcsinX created this revision.Aug 2 2021, 12:53 PM
ArcsinX requested review of this revision.Aug 2 2021, 12:53 PM
ArcsinX edited the summary of this revision. (Show Details)Aug 2 2021, 1:02 PM
joerg added a subscriber: joerg.Aug 2 2021, 2:38 PM

What do you mean with "GCC macros are correct"? Clang is *not* GCC 5 and not 1:1 compatible with GCC 5. Project that have checks like that should arrive in 2010...

What do you mean with "GCC macros are correct"?

Imagine that we have a project and we build it with GCC, we have compile_commands.json file which contains all compile command used to build this project.
Now we are trying to use clangd for this project, but as it turns, __GNUC__ is equal to 4, but we expect it to be equal to compiler version which was used during the build.

Clang is *not* GCC

We are talking about clangd -- tool, which provides IDE features, I expect it to be compatible (as much as possible) with a project which was built with GCC

kadircet added a comment.EditedAug 3 2021, 1:21 AM

I am not sure this is a good idea. Surely we can go with this approach and pretend to have the same GNUC version as the toolchain, but we are still using clang underneath as @joerg pointed out and we might not be able to parse language/compiler extensions supported by GNU compilers, which will lead to confusion.

I think in such cases people should just have a .clangd file in their repo that adds -fgnuc-version=x.y.z to compile flags, i.e:

CompileFlags:
  Add: -fgnuc-version=x.y.z

This will be a more explicit step taken by the user, hence they'll be more likely to know why clangd is not behaving the way they want it to.

ArcsinX added a comment.EditedAug 3 2021, 1:56 AM

I am not sure this is a good idea. Surely we can go with this approach and pretend to have the same GNUC version as the toolchain, but we are still using clang underneath as @joerg pointed out and we might not be able to parse language/compiler extensions supported by GNU compilers, which will lead to confusion.

The most confusing thing for me that we already have this implicitly set -fgnuc-version=4.2.1. So, we already have implicitly set __GNUC__ and other GCC macros as a default behavior.

I think in such cases people should just have a .clangd file in their repo that adds -fgnuc-version=x.y.z to compile flags, i.e:

CompileFlags:
  Add: -fgnuc-version=x.y.z

This will be a more explicit step taken by the user, hence they'll be more likely to know why clangd is not behaving the way they want it to.

Sometimes it's not an easy task to understand that errors in diagnostics returned by clangd are the result of incorrect __GNUC__ value (in many projects it's not just #error ...). Also it's not convenient to use .clangd config in this case:

  • change compiler version => update .clangd
  • several build directories (e.g. native and cross with different GCC versions).
  • query-driver supports different compilers in the same compile_commands.json
  • idea was to make it in more automatic way.

The most confusing thing for me that we already have this implicitly set -fgnuc-version=4.2.1. So, we already have implicitly set GNUC and other GCC macros as a default behavior.

It might be because clang actually supports all the GNU extensions up until 4.2.1, I don't know the history here, but if clang can compile with that GNUC macro by default, clangd should be good as well. But there are no promises for anything higher than that.

Sometimes it's not an easy task to understand that errors in diagnostics returned by clangd are the result of incorrect GNUC value (in many projects it's not just #error ...). Also it's not convenient to use .clangd config in this case:
change compiler version => update .clangd
several build directories (e.g. native and cross with different GCC versions).
query-driver supports different compilers in the same compile_commands.json
idea was to make it in more automatic way.

I think these are the reasons why this needs to be set explicitly by the user. Because even though it can make things work, it might as well break things implicitly (e.g. you have a compiler version 4.2.1 everything works as expected, but once you update your compiler to gcc-5 clangd breaks all of a sudden)

The most confusing thing for me that we already have this implicitly set -fgnuc-version=4.2.1. So, we already have implicitly set GNUC and other GCC macros as a default behavior.

It might be because clang actually supports all the GNU extensions up until 4.2.1, I don't know the history here, but if clang can compile with that GNUC macro by default, clangd should be good as well. But there are no promises for anything higher than that.

Seem the background is that it was unable to build chromium with clang: https://reviews.llvm.org/D68055
And as I can see there: this flag does not enable or disable any GCC extensions implemented in Clang. Thus, it just defines some GCC macros. So, without this patch we guaranty to user that these macros values will be incompatible with his GCC version if it is not 4.2.1, which is not an expected behavior for --query-driver for me: I expect that it can extract as much information from the driver output as possible and help clang with files processing.

I think these are the reasons why this needs to be set explicitly by the user. Because even though it can make things work, it might as well break things implicitly (e.g. you have a compiler version 4.2.1 everything works as expected, but once you update your compiler to gcc-5 clangd breaks all of a sudden)

Can't agree that we have equal probabilities to break something in a project which was build with GCC x.y.z when specifying the correct version (x.y.z) and incorrect (4.2.1).

joerg added a comment.Aug 3 2021, 6:08 AM

Yes, there are quite a number of small differences in builtin support and even certain macros that just invite even more trouble than this. This is IMO begging for hard to trace down errors.

Sorry I think I've failed to explain my point. My concern is not about changing the way clang is parsing the code.

I am saying that people might make use of compiler extensions by guarding their code with GNUC macros. Today we know that having 4.2.1 for GNUC version in clang works for those use cases (one way or the other clang is able to compile majority of the code out there people put behind a guard of GNUC <= 4.2.1). But people might make use of different compiler extensions while guarding for higher GNUC versions and they are probably not tested as extensively today. Hence I am saying that those will get broken in weird ways (and implicitly), while they are working today.

@kadircet Thanks for clarification. Now your position is clear for me.

But clang was designed as a drop-in replacement for GCC:

I do not have any proves that clang is 100% compatible with any GCC version, but also I can't find any evidence that the most compatible version of GCC is 4.2.1

I really faced the problem with that default 4.2.1 on a project: strange errors in clangd and it was really hard to understand that the reason was __GNUC__ == 4. -fgnuc-version=<real GCC version here> addition helped me to solve this. That's why I think that it can be added automatically (and .clangd can be used to override this with 4.2.1 (or any other) if user doesn't want this automatic GCC version detection =) )

@kadircet Thanks for clarification. Now your position is clear for me.

But clang was designed as a drop-in replacement for GCC:

I do not have any proves that clang is 100% compatible with any GCC version, but also I can't find any evidence that the most compatible version of GCC is 4.2.1

The existence of query-driver mechanism is proof that clang is not 100% GCC compatible (anymore). Instead of going with the query-driver approach we wanted to update custom toolchain support to mimic the GCC behaviour, but we gave up because it would get out-of-date again.

I really faced the problem with that default 4.2.1 on a project: strange errors in clangd and it was really hard to understand that the reason was __GNUC__ == 4. -fgnuc-version=<real GCC version here> addition helped me to solve this. That's why I think that it can be added automatically (and .clangd can be used to override this with 4.2.1 (or any other) if user doesn't want this automatic GCC version detection =) )

I definitely get your point too. But I am not convinced that changing the default behaviour here is going to be beneficial for majority of the users.
In other words, if we keep things as-is today, people that want a specific GNUC version can set clangd to parse their code with that specific version they know about and face the consequences if clang doesn't provide the same extensions GCC does. Or we can change the behaviour to what you propose and make people that has a broken parse to change their GNUC version to something they have no idea about (they won't know that they should set it to 4.2.1 or any other specific version).

But I am not convinced that changing the default behaviour here is going to be beneficial for majority of the users.

I think that majority of users doesn't use --query-driver at all.

In other words, if we keep things as-is today, people that want a specific GNUC version can set clangd to parse their code with that specific version they know about and face the consequences if clang doesn't provide the same extensions GCC does. Or we can change the behaviour to what you propose and make people that has a broken parse to change their GNUC version to something they have no idea about (they won't know that they should set it to 4.2.1 or any other specific version).

I think that setting correct version of GCC wont' break anything for the most of users (but I do not know how to prove this), but can fix (I saw this by myself), that's why I can't agree with you =)
I bet that most of users which face the problem I described, can't solve this by their own, because of knowledge lack (nothing can be found in the documentation that can help):
Level 1: I see errors in my code with --query-driver specified. Why this happened?
Level 2: Ok, I know that this is because of __GNUC__ macro definition, what should I do to make it works?

Ok if you think that setting GCC version to higher values by default won't break anything, I think we should update this for all of clang instead. Apparently that was something done in the past already in https://github.com/llvm/llvm-project/commit/b5fe494a2cdd33a0b069939795043f538e9a492c. It is currently being set in https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L6127.

If we bumped the version number there for all of clang by default, clangd wouldn't be lying about the version and this will have a much wider visibility. That way we'll be sure that users won't see breakages specific to clangd and we'll also be able to get opinions from other stakeholders in clang that know about possible side effects of this bump. WDYT?

I think we should update this for all of clang instead.

But first you need to prove that clang matches gcc x.y.z.

joerg added a comment.Aug 4 2021, 6:01 AM

This discussion is becoming pointless. Clang is not a 1:1 replacement for later GCC versions. Easiest example is the support for __builtin_apply and friends.

Ok if you think that setting GCC version to higher values by default won't break anything,

I think that GCC version should be the same, which was used to build the project: it can be higher or lower than 4.2.1

This discussion is becoming pointless.

Yes, seems we can't get agreement here, so I am planning to abandon this.

Clang is not a 1:1 replacement for later GCC versions. Easiest example is the support for __builtin_apply and friends.

Still we have no example when GCC x.y.z can compile something, but clang with -fgnuc-version=x.y.z can't process this project with some critical errors (but works ok with -fgnuc-version=4.2.1). But it's easy to create an example when -fgnuc-version=4.2.1 doesn't work (example in the description).

I disagree that this is a pointless discussion, FWIW.

The __GNUC__ macro is used for compiler feature (or bug?) detection.
Clang's current answer is "yes we are GCC >=4, but no we are not GCC >=5".
This causes the codebase to reject the compiler, because it relies on features/absence of bugs from GCC 5. @ArcsinX any idea what these are, concretely?
If clang has those features/lacks those bugs, there's some value in making it pass the feature check.

Problem is, there are significant extensions in GCC 4.3 that clang doesn't implement. Per user's manual:

clang does not support __builtin_va_arg_pack/__builtin_va_arg_pack_len. This is used rarely, but in some potentially interesting places, like the glibc headers, so it may be implemented pending user demand. Note that because clang pretends to be like GCC 4.2, and this extension was introduced in 4.3, the glibc headers will not try to use this extension with clang at the moment.

See also this thread and D51011. So we can't simply bump clang's compatibilty claims.


For clangd, I think this illustrates why implicitly increasing __GNUC__ (e.g. via query-driver) isn't safe in general.
For files that optionally use e.g. __builtin_va_arg_pack (like glibc headers), we'll direct the preprocessor to take that path and then reject the code.

With codebases that refuse to build at all without large __GNUC__ but actually mostly parse fine, this is unfortunate, but the sort of thing that users probably should opt into.
The symptom #error You need GCC X.Y to build this code is much easier to understand than the results of trying to use unsupported extensions.
I think the best we can do here is document the workaround.

ArcsinX added a comment.EditedAug 4 2021, 12:13 PM

Clang's current answer is "yes we are GCC >=4, but no we are not GCC >=5".
This causes the codebase to reject the compiler, because it relies on features/absence of bugs from GCC 5. @ArcsinX any idea what these are, concretely?

The checks about GCC version that I saw were related with some features from C++ standard that GCC < 4.X.Y doesn't support and there was a try to implement these features for GCC < 4.X.Y, but clang (or system headers got from query driver?) has these features and some kind of conflict appears.

ArcsinX abandoned this revision.Aug 8 2021, 11:44 AM

As I can see, no chances for this to get approved. So. closing.