This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Remove ability to downgrade warning on the diagnostic for setting a non fixed enum to a value outside the range of the enumeration values
Needs ReviewPublic

Authored by shafik on May 9 2023, 2:22 PM.

Details

Reviewers
aaron.ballman
erichkeane
thakis
mstorsjo
rsmith
Group Reviewers
Restricted Project
Restricted Project
Summary

Previously we allowed the ability to downgrade this diagnostic based on feedback that time was needed to fix code: D131307

We believe sufficient time has passed and we will turn this back into a hard error.

This will also fix some bugs that were a consequence of the workaround:

https://github.com/llvm/llvm-project/issues/62355
https://github.com/llvm/llvm-project/issues/57176

Diff Detail

Event Timeline

shafik created this revision.May 9 2023, 2:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2023, 2:22 PM
shafik requested review of this revision.May 9 2023, 2:22 PM
shafik updated this revision to Diff 520880.May 9 2023, 6:32 PM
  • Apply clang-format
shafik added a comment.May 9 2023, 6:32 PM

I am wondering what section of the release notes to put this under.

shafik added a comment.May 9 2023, 8:45 PM

Build failures look unrelated to my changes, it looks like it was failing before this.

One of the buildbot failures is the test you modified (\dr1xx.c), so I suspect it is related.

I THINK this needs a notice on the 'breaking changes' mailing list as well.

For release notes, I'd put it in the 'Potentially Breaking Changes' section.

One of the buildbot failures is the test you modified (\dr1xx.c), so I suspect it is related.

I THINK this needs a notice on the 'breaking changes' mailing list as well.

For release notes, I'd put it in the 'Potentially Breaking Changes' section.

Interestingly, my downstream noticed both of those same two tests failing, so I think you're right. I have no idea why I thought you'd modified dr1xx.c? No idea what I was thinking, so disregard. So just the 'breaking changes' mailing list, and the release notes for me are needed.

shafik updated this revision to Diff 521507.May 11 2023, 5:19 PM
  • Adding release note
shafik added reviewers: Restricted Project, Restricted Project.May 11 2023, 5:20 PM

We're still using -Wno-enum-constexpr-conversion, although I'm not sure if we need that or if we just forgot to remove it after doing some cleanup. I'm trying it out now. (Sorry, I'm not sure we were aware that having a way to turn this off was just temporary).

BTW, LLVM itself still uses this flag in openmp: https://github.com/llvm/llvm-project/blob/c2ce2a509f74a85a3c0ef4b9d6d79fbacc7e8bdf/openmp/cmake/HandleOpenMPOptions.cmake#L34

We also use Wno-enum-constexpr-conversion in ChromeOS. There are many packages that break with this warning. One of them is boost which is used in many other packages.

The errors in boost were:

./boost/mpl/aux_/integral_wrapper.hpp:73:31: error: integer value -1 is outside the valid range of values [0, 3] for this enumeration type [-Wenum-constexpr-conversion]
./boost/mpl/aux_/static_cast.hpp:24:47: note: expanded from macro 'BOOST_MPL_AUX_STATIC_CAST'
#   define BOOST_MPL_AUX_STATIC_CAST(T, expr) static_cast<T>(expr)
                                              ^

I do not think boost upstream has updated these files, I see they were last updated 3 years back.
https://github.com/boostorg/mpl/blob/master/include/boost/mpl/aux_/static_cast.hpp
and https://github.com/boostorg/mpl/blob/master/include/boost/mpl/aux_/integral_wrapper.hpp

Adding to the concerns raised above, I don't think we're there yet. See https://bugs.gentoo.org/buglist.cgi?quicksearch=enum-constexpr-conversion&list_id=6843355 and keep in mind that a bunch of stuff isn't buildable with Clang 16 yet anyway so I expect a bunch more to be broken on top of that.

Also, e.g. gdb isn't fixed in a release yet.

We're still using -Wno-enum-constexpr-conversion, although I'm not sure if we need that or if we just forgot to remove it after doing some cleanup. I'm trying it out now. (Sorry, I'm not sure we were aware that having a way to turn this off was just temporary).

Please let me know how things look on your end.

BTW, LLVM itself still uses this flag in openmp: https://github.com/llvm/llvm-project/blob/c2ce2a509f74a85a3c0ef4b9d6d79fbacc7e8bdf/openmp/cmake/HandleOpenMPOptions.cmake#L34

Thank you for flagging that.

I followed up on the PR that put in this change to see if they can remediate this.

We also use Wno-enum-constexpr-conversion in ChromeOS. There are many packages that break with this warning. One of them is boost which is used in many other packages.

The errors in boost were:

./boost/mpl/aux_/integral_wrapper.hpp:73:31: error: integer value -1 is outside the valid range of values [0, 3] for this enumeration type [-Wenum-constexpr-conversion]
./boost/mpl/aux_/static_cast.hpp:24:47: note: expanded from macro 'BOOST_MPL_AUX_STATIC_CAST'
#   define BOOST_MPL_AUX_STATIC_CAST(T, expr) static_cast<T>(expr)
                                              ^

I do not think boost upstream has updated these files, I see they were last updated 3 years back.
https://github.com/boostorg/mpl/blob/master/include/boost/mpl/aux_/static_cast.hpp
and https://github.com/boostorg/mpl/blob/master/include/boost/mpl/aux_/integral_wrapper.hpp

I was under the impression from https://github.com/boostorg/mpl/issues/69 that this was fixed but there are a number of issues off of the main one and maybe I am confused.

Adding to the concerns raised above, I don't think we're there yet. See https://bugs.gentoo.org/buglist.cgi?quicksearch=enum-constexpr-conversion&list_id=6843355 and keep in mind that a bunch of stuff isn't buildable with Clang 16 yet anyway so I expect a bunch more to be broken on top of that.

Is there anything we can do to help move these along? We really want to fix this asap b/c we are not conforming at this point wrt constant expressions and this causes other undesirable issues w/ diagnostics and SFINAE that are easily fixable.

Also, e.g. gdb isn't fixed in a release yet.

Can you give me a reference for the gdb issue?

I was under the impression from https://github.com/boostorg/mpl/issues/69 that this was fixed but there are a number of issues off of the main one and maybe I am confused.

Seems like boost 1.81 has the mentioned fix. I can try it and see if the warning still fires.

I was under the impression from https://github.com/boostorg/mpl/issues/69 that this was fixed but there are a number of issues off of the main one and maybe I am confused.

Seems like boost 1.81 has the mentioned fix. I can try it and see if the warning still fires.

Building boost 1.81 itself worked. But I still need to check the whole codebase.

https://github.com/bminor/binutils-gdb/blob/master/include/diagnostics.h

gdb only suppresses the warning. So this patch will likely break gdb.

As per commit: https://github.com/bminor/binutils-gdb/commit/ae61525fcf456ab395d55c45492a106d1275873a

Since the current code does what we want, and I don't see any way of doing it
differently, ignore -Wenum-constexpr-conversion around it.

https://github.com/bminor/binutils-gdb/blob/master/include/diagnostics.h

gdb only suppresses the warning. So this patch will likely break gdb.

As per commit: https://github.com/bminor/binutils-gdb/commit/ae61525fcf456ab395d55c45492a106d1275873a

Since the current code does what we want, and I don't see any way of doing it
differently, ignore -Wenum-constexpr-conversion around it.

Hmmm, I'm not certain, but I *think* this does the right thing: std::is_signed_v<decltype(1 ? std::declval<Ty>() : std::declval<std::underlying_type_t<Ty>>())>;

https://godbolt.org/z/MWY4e1e9e

https://github.com/bminor/binutils-gdb/blob/master/include/diagnostics.h

gdb only suppresses the warning. So this patch will likely break gdb.

As per commit: https://github.com/bminor/binutils-gdb/commit/ae61525fcf456ab395d55c45492a106d1275873a

Since the current code does what we want, and I don't see any way of doing it
differently, ignore -Wenum-constexpr-conversion around it.

Hmmm, I'm not certain, but I *think* this does the right thing: std::is_signed_v<decltype(1 ? std::declval<Ty>() : std::declval<std::underlying_type_t<Ty>>())>;

https://godbolt.org/z/MWY4e1e9e

Can you please send a patch to gdb?

We're still using -Wno-enum-constexpr-conversion, although I'm not sure if we need that or if we just forgot to remove it after doing some cleanup. I'm trying it out now. (Sorry, I'm not sure we were aware that having a way to turn this off was just temporary).

Please let me know how things look on your end.

It's not that bad, and all remaining breakages have fixes sent out now, but still waiting to land. A week or two is probably good enough for us. But I don't know about the rest of OSS mentioned in other comments.

(FYI, I'll be out for a few weeks, and I might not have a chance to follow up before then)

When looking for errors in existing codebases, don't forget that this diagnostic is currently suppressed by default in system headers. So this patch is moving from "no diagnostics for code in system headers" to "unconditional hard error in system headers". Just removing -Wno-enum-constexpr-conversion from your build flags is insufficient to test that this patch won't break code!

I haven't done any tests, but I'm rather skeptical that this change is going to be viable without breaking stuff, still.

dim added a subscriber: dim.May 18 2023, 11:48 AM

I submitted a similar workaround for the FreeBSD devel/gdb port, via https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271045 that also applies -Wno-enum-constexpr-conversion. As the upstream commit message said, they didn't see any other good way to get rid of the warning, so if there is a workaround clang is OK with, that would be nice. Still, I don't really understand what the value is of making this warning into an error, that is not suppressible?

When looking for errors in existing codebases, don't forget that this diagnostic is currently suppressed by default in system headers. So this patch is moving from "no diagnostics for code in system headers" to "unconditional hard error in system headers". Just removing -Wno-enum-constexpr-conversion from your build flags is insufficient to test that this patch won't break code!

I haven't done any tests, but I'm rather skeptical that this change is going to be viable without breaking stuff, still.

+1 to needing to test against system headers, but also: if we find any system headers that would be broken by this, we should proactively alert the owners of those headers so that they understand there's urgency to getting the fixes into their headers so that the entire ecosystem isn't held back. Alternatively, if it's just one problematic system header in an LTS release somewhere, we could perhaps put in a compat hack for just that header so we can move forward.

I submitted a similar workaround for the FreeBSD devel/gdb port, via https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271045 that also applies -Wno-enum-constexpr-conversion. As the upstream commit message said, they didn't see any other good way to get rid of the warning, so if there is a workaround clang is OK with, that would be nice. Still, I don't really understand what the value is of making this warning into an error, that is not suppressible?

One of the major selling points to constexpr functions in C++ is that they cannot contain UB -- if your code compiles, it is correct. This bug that we've fixed was another instance of us accidentally allowing UB in a constant expression context when we shouldn't have. FWIW, I pointed out a reasonable workaround for the freebsd issue: https://reviews.llvm.org/D150226#4342516

I submitted a similar workaround for the FreeBSD devel/gdb port, via https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271045 that also applies -Wno-enum-constexpr-conversion. As the upstream commit message said, they didn't see any other good way to get rid of the warning, so if there is a workaround clang is OK with, that would be nice. Still, I don't really understand what the value is of making this warning into an error, that is not suppressible?

One of the major selling points to constexpr functions in C++ is that they cannot contain UB -- if your code compiles, it is correct. This bug that we've fixed was another instance of us accidentally allowing UB in a constant expression context when we shouldn't have. FWIW, I pointed out a reasonable workaround for the freebsd issue: https://reviews.llvm.org/D150226#4342516

Also note, one of the bugs I reference shows how this breaks SFINAE: https://github.com/llvm/llvm-project/issues/57176 and that is not easily fixable. So this is non-conforming since it breaks valid code.

There are two basic work arounds. One give the enum a fixed underlying type:

enum E : int {}

which will give the whole range of values to work with or if there is a one specific error value they want they can explicitly add an error enumerator and make the error value explicit.

Also note, one of the bugs I reference shows how this breaks SFINAE: https://github.com/llvm/llvm-project/issues/57176 and that is not easily fixable. So this is non-conforming since it breaks valid code.

You can mark a warning diagnostic SFINAEFailure to ensure it's treated as an error in contexts where SFINAE matters. It might make sense to do that as an intermediate step.

Also note, one of the bugs I reference shows how this breaks SFINAE: https://github.com/llvm/llvm-project/issues/57176 and that is not easily fixable. So this is non-conforming since it breaks valid code.

You can mark a warning diagnostic SFINAEFailure to ensure it's treated as an error in contexts where SFINAE matters. It might make sense to do that as an intermediate step.

I don't remember all the details but I noted in the bug report that b/c we are not using CCEDiag it seemed that won't work. At least I could not get it to work at the time and I believe that was the underlying issue but maybe I missed something. It may be worth looking into again.

Also note, one of the bugs I reference shows how this breaks SFINAE: https://github.com/llvm/llvm-project/issues/57176 and that is not easily fixable. So this is non-conforming since it breaks valid code.

You can mark a warning diagnostic SFINAEFailure to ensure it's treated as an error in contexts where SFINAE matters. It might make sense to do that as an intermediate step.

Oh, I guess it's actually a bit trickier because constant evaluation is involved... we don't currently have an equivalent there. Probably it's doable somehow, but maybe not worth the effort.

I guess to make that work, the constant evaluator would need to track whether we're in an SFINAE context (Sema::isSFINAEContext()). Based on that, we'd need to explicitly generate an error if we're in an SFINAE context, and a warning if we're not in such a context.

dim added a subscriber: bsdjhb.May 18 2023, 1:35 PM

One of the major selling points to constexpr functions in C++ is that they cannot contain UB -- if your code compiles, it is correct. This bug that we've fixed was another instance of us accidentally allowing UB in a constant expression context when we shouldn't have. FWIW, I pointed out a reasonable workaround for the freebsd issue: https://reviews.llvm.org/D150226#4342516

I tried making that work for gdb, but I failed; there are some complains about deleted operator~ that I didn't see before, and I'm going to have to let it rest for a bit. I could maybe convince @bsdjhb to submit a solution, since he's contributed to gdb and binutils before.

pirama added a subscriber: pirama.May 24 2023, 11:00 PM

When looking for errors in existing codebases, don't forget that this diagnostic is currently suppressed by default in system headers. So this patch is moving from "no diagnostics for code in system headers" to "unconditional hard error in system headers". Just removing -Wno-enum-constexpr-conversion from your build flags is insufficient to test that this patch won't break code!

I haven't done any tests, but I'm rather skeptical that this change is going to be viable without breaking stuff, still.

+1 to needing to test against system headers, but also: if we find any system headers that would be broken by this, we should proactively alert the owners of those headers so that they understand there's urgency to getting the fixes into their headers so that the entire ecosystem isn't held back. Alternatively, if it's just one problematic system header in an LTS release somewhere, we could perhaps put in a compat hack for just that header so we can move forward.

As a general question/feature request: is there a way to have specific warnings apply even for system headers? It would be nice if I could check what breaks when by adding -Wsystem-error=enum-constexpr-conversion to the global build flags. Rebuilding clang w/ this patched in also works, but is a little more difficult/noisy.

As a general question/feature request: is there a way to have specific warnings apply even for system headers? It would be nice if I could check what breaks when by adding -Wsystem-error=enum-constexpr-conversion to the global build flags. Rebuilding clang w/ this patched in also works, but is a little more difficult/noisy.

I think there's no per-warning flag (you can turn on _all_ of the enabled warnings in system headers with -Wsystem-headers), but by modifying the clang sources you can add ShowInSystemHeader to the diagnostic.
E.g.

 def warn_constexpr_unscoped_enum_out_of_range : Warning<
   "integer value %0 is outside the valid range of values [%1, %2] for this "
-  "enumeration type">, DefaultError, InGroup<DiagGroup<"enum-constexpr-conversion">>;
+  "enumeration type">, DefaultError, InGroup<DiagGroup<"enum-constexpr-conversion">>, ShowInSystemHeader;

As a general question/feature request: is there a way to have specific warnings apply even for system headers? It would be nice if I could check what breaks when by adding -Wsystem-error=enum-constexpr-conversion to the global build flags. Rebuilding clang w/ this patched in also works, but is a little more difficult/noisy.

I think there's no per-warning flag (you can turn on _all_ of the enabled warnings in system headers with -Wsystem-headers), but by modifying the clang sources you can add ShowInSystemHeader to the diagnostic.
E.g.

 def warn_constexpr_unscoped_enum_out_of_range : Warning<
   "integer value %0 is outside the valid range of values [%1, %2] for this "
-  "enumeration type">, DefaultError, InGroup<DiagGroup<"enum-constexpr-conversion">>;
+  "enumeration type">, DefaultError, InGroup<DiagGroup<"enum-constexpr-conversion">>, ShowInSystemHeader;

Thanks for that pointer. It's definitely a lot simpler/general to do that than to patch a change like this in, but still requires rebuilding clang. My most recent attempt didn't find anything -Wenum-constexpr-conversion related, but did find a bunch of other breakages that I assume are related to other recent clang changes (usually existing issues with the code that need to be cleaned up, but maybe a few are clang bugs). The fact that ShowInSystemHeader already exists makes me hopeful that it isn't too hard to support this, so I filed https://github.com/llvm/llvm-project/issues/63180.

I suppose including this warning in ShowInSystemHeader with your diff above would be a good first step anyway, right? If we're about to make it a hard error anyway, then making it a ShowInSystemHeader error first would ease that transition.

I suppose including this warning in ShowInSystemHeader with your diff above would be a good first step anyway, right? If we're about to make it a hard error anyway, then making it a ShowInSystemHeader error first would ease that transition.

Yes and no.

If we're going to turn something into a hard error, letting folks implementing system headers know about it is important, so from that perspective, it makes sense to enable the diagnostic in system headers. However, *users* have no choice in their system headers (oftentimes) which makes the diagnostic unactionable for them as they're not going to (and shouldn't have to) modify system headers, so from that perspective, enabling the diagnostic in a system header introduces friction.

If this diagnostic is showing up in system headers, I think we would likely need to consider adding a compatibility hack to allow Clang to still consume that system header while erroring when outside of that system header (we've done this before to keep STL implementations working, for example). Between this need and the friction it causes users to have an unactionable diagnostic, I think we probably should not enable ShowInSystemHeader.

I suppose including this warning in ShowInSystemHeader with your diff above would be a good first step anyway, right? If we're about to make it a hard error anyway, then making it a ShowInSystemHeader error first would ease that transition.

Yes and no.

If we're going to turn something into a hard error, letting folks implementing system headers know about it is important, so from that perspective, it makes sense to enable the diagnostic in system headers. However, *users* have no choice in their system headers (oftentimes) which makes the diagnostic unactionable for them as they're not going to (and shouldn't have to) modify system headers, so from that perspective, enabling the diagnostic in a system header introduces friction.

If this diagnostic is showing up in system headers, I think we would likely need to consider adding a compatibility hack to allow Clang to still consume that system header while erroring when outside of that system header (we've done this before to keep STL implementations working, for example). Between this need and the friction it causes users to have an unactionable diagnostic, I think we probably should not enable ShowInSystemHeader.

It seems to me that if our concern is breaking system headers, we need to do that with better testing. Some sort of 'diagnostic group' for "This is going to become an error *SOON*" mixed with us/vendors running that on platforms we consider significant enough to not break. But just diagnosing on arbitrary users with no choice on how to fix the headers doesn't seem appropriate.

I suppose including this warning in ShowInSystemHeader with your diff above would be a good first step anyway, right? If we're about to make it a hard error anyway, then making it a ShowInSystemHeader error first would ease that transition.

Yes and no.

If we're going to turn something into a hard error, letting folks implementing system headers know about it is important, so from that perspective, it makes sense to enable the diagnostic in system headers. However, *users* have no choice in their system headers (oftentimes) which makes the diagnostic unactionable for them as they're not going to (and shouldn't have to) modify system headers, so from that perspective, enabling the diagnostic in a system header introduces friction.

If this diagnostic is showing up in system headers, I think we would likely need to consider adding a compatibility hack to allow Clang to still consume that system header while erroring when outside of that system header (we've done this before to keep STL implementations working, for example). Between this need and the friction it causes users to have an unactionable diagnostic, I think we probably should not enable ShowInSystemHeader.

It seems to me that if our concern is breaking system headers, we need to do that with better testing. Some sort of 'diagnostic group' for "This is going to become an error *SOON*" mixed with us/vendors running that on platforms we consider significant enough to not break. But just diagnosing on arbitrary users with no choice on how to fix the headers doesn't seem appropriate.

I think that's the request here: https://github.com/llvm/llvm-project/issues/63180 - some way, initially, to opt-in to such a diagnostic group or mechanism (like "enable this warning in system headers") to assess how big the migration effort is/report back on whether it's necessary to implement a compatibility hack or whether things will be able to be cleaned up. (such a thing could be opt-in at first, vendors etc could use that to assess the fallout - if the answer is "we think we can clean this up" and folks plan some timeline, they reckon they're ready, then you switch to that feature being enabled by default (because we think we can make it an error - but there's always surprises, so shipping it as an error-by-default-but-with-a-way-to-opt-out is better than shipping it as a hard error that catches users by surprise) - then if there's no major fallout, remove the ability to opt-out)

I suppose including this warning in ShowInSystemHeader with your diff above would be a good first step anyway, right? If we're about to make it a hard error anyway, then making it a ShowInSystemHeader error first would ease that transition.

Yes and no.

If we're going to turn something into a hard error, letting folks implementing system headers know about it is important, so from that perspective, it makes sense to enable the diagnostic in system headers. However, *users* have no choice in their system headers (oftentimes) which makes the diagnostic unactionable for them as they're not going to (and shouldn't have to) modify system headers, so from that perspective, enabling the diagnostic in a system header introduces friction.

If this diagnostic is showing up in system headers, I think we would likely need to consider adding a compatibility hack to allow Clang to still consume that system header while erroring when outside of that system header (we've done this before to keep STL implementations working, for example). Between this need and the friction it causes users to have an unactionable diagnostic, I think we probably should not enable ShowInSystemHeader.

It seems to me that if our concern is breaking system headers, we need to do that with better testing. Some sort of 'diagnostic group' for "This is going to become an error *SOON*" mixed with us/vendors running that on platforms we consider significant enough to not break. But just diagnosing on arbitrary users with no choice on how to fix the headers doesn't seem appropriate.

I think that's the request here: https://github.com/llvm/llvm-project/issues/63180

+1, I think that request is a reasonable idea to help maintainers of system headers.

It sounds like we're in agreement that we should not enable ShowInSystemHeaders for this, but it's not clear whether we've got agreement yet that we can land this change right now. I think it's acceptable to kick the can down the road by another release or so if we feel we need to, but there is a hard stop to that at some point (some folks won't fix their code until they're forced to do so, and there's not much we can do about those cases).

I suppose including this warning in ShowInSystemHeader with your diff above would be a good first step anyway, right? If we're about to make it a hard error anyway, then making it a ShowInSystemHeader error first would ease that transition.

Yes and no.

If we're going to turn something into a hard error, letting folks implementing system headers know about it is important, so from that perspective, it makes sense to enable the diagnostic in system headers. However, *users* have no choice in their system headers (oftentimes) which makes the diagnostic unactionable for them as they're not going to (and shouldn't have to) modify system headers, so from that perspective, enabling the diagnostic in a system header introduces friction.

If this diagnostic is showing up in system headers, I think we would likely need to consider adding a compatibility hack to allow Clang to still consume that system header while erroring when outside of that system header (we've done this before to keep STL implementations working, for example). Between this need and the friction it causes users to have an unactionable diagnostic, I think we probably should not enable ShowInSystemHeader.

It seems to me that if our concern is breaking system headers, we need to do that with better testing. Some sort of 'diagnostic group' for "This is going to become an error *SOON*" mixed with us/vendors running that on platforms we consider significant enough to not break. But just diagnosing on arbitrary users with no choice on how to fix the headers doesn't seem appropriate.

I think that's the request here: https://github.com/llvm/llvm-project/issues/63180

+1, I think that request is a reasonable idea to help maintainers of system headers.

It sounds like we're in agreement that we should not enable ShowInSystemHeaders for this, but it's not clear whether we've got agreement yet that we can land this change right now. I think it's acceptable to kick the can down the road by another release or so if we feel we need to, but there is a hard stop to that at some point (some folks won't fix their code until they're forced to do so, and there's not much we can do about those cases).

FWIW, I still think it might be reasonable to ShowInSystemHeaders before turning something into an unconditional/hard error - ShowInSystemHeaders is strictly less intrusive than a hard error, and more intrusive than the warning as-is, so seems like a reasonable part of transitioning to a feature removal. (warning -> default-error -> show in system headers -> hard error) I don't have enough of an investment in this area to suggest that this /must/ be the way such a transition is done, but I have a hard time seeing why it would be better to avoid that intermediate step.

I suppose including this warning in ShowInSystemHeader with your diff above would be a good first step anyway, right? If we're about to make it a hard error anyway, then making it a ShowInSystemHeader error first would ease that transition.

Yes and no.

If we're going to turn something into a hard error, letting folks implementing system headers know about it is important, so from that perspective, it makes sense to enable the diagnostic in system headers. However, *users* have no choice in their system headers (oftentimes) which makes the diagnostic unactionable for them as they're not going to (and shouldn't have to) modify system headers, so from that perspective, enabling the diagnostic in a system header introduces friction.

If this diagnostic is showing up in system headers, I think we would likely need to consider adding a compatibility hack to allow Clang to still consume that system header while erroring when outside of that system header (we've done this before to keep STL implementations working, for example). Between this need and the friction it causes users to have an unactionable diagnostic, I think we probably should not enable ShowInSystemHeader.

It seems to me that if our concern is breaking system headers, we need to do that with better testing. Some sort of 'diagnostic group' for "This is going to become an error *SOON*" mixed with us/vendors running that on platforms we consider significant enough to not break. But just diagnosing on arbitrary users with no choice on how to fix the headers doesn't seem appropriate.

I think that's the request here: https://github.com/llvm/llvm-project/issues/63180

+1, I think that request is a reasonable idea to help maintainers of system headers.

It sounds like we're in agreement that we should not enable ShowInSystemHeaders for this, but it's not clear whether we've got agreement yet that we can land this change right now. I think it's acceptable to kick the can down the road by another release or so if we feel we need to, but there is a hard stop to that at some point (some folks won't fix their code until they're forced to do so, and there's not much we can do about those cases).

FWIW, I still think it might be reasonable to ShowInSystemHeaders before turning something into an unconditional/hard error - ShowInSystemHeaders is strictly less intrusive than a hard error, and more intrusive than the warning as-is, so seems like a reasonable part of transitioning to a feature removal. (warning -> default-error -> show in system headers -> hard error) I don't have enough of an investment in this area to suggest that this /must/ be the way such a transition is done, but I have a hard time seeing why it would be better to avoid that intermediate step.

My concern with ShowInSystemHeaders is that this seems like a bad user experience. If the system header triggers an error 1) some users aren't going to know how to fix that by downgrading the diagnostic, so that may cause them to go "Clang is buggy because <other compiler> accepts this fine" (not the end of the world, but frustrates both us and users). 2) the only recourse users have is to downgrade/disable the diagnostic (otherwise they'd have to change system header code), which they may likely do with a command line flag rather than something more targeted like diagnostic pragmas around the include, which increases the risk of users not seeing the issues in code they can control.

That said, I see your logic and kind of agree with it. I just worry if it's going to cause more harm than good (and I don't know of a particularly good way to try to find out aside from "try it and see").

My concern with ShowInSystemHeaders is that this seems like a bad user experience.

I guess you're comparing this user experience to the one when this feature was a normal warning/without ShowInSystemHeaders - I'm comparing this situation to the future where this becomes a hard error with no escape hatch. ShowInSystemHeaders is more aggressive than the former & yeah, isn't terribly nice for users - but it's less aggressive than the latter, and gives users some escape hatch for now until the thing becomes a hard error & at least they know it's coming, maybe? (with enough text in the warning, etc)

So, yeah, I'd be inclined to make the change to ShowInSystemHeaders close to when you were going to make it a hard error, pushing back the hard-error change a little bit (maybe just one clang release? Maybe a couple? Not sure).

If the system header triggers an error 1) some users aren't going to know how to fix that by downgrading the diagnostic, so that may cause them to go "Clang is buggy because <other compiler> accepts this fine" (not the end of the world, but frustrates both us and users). 2) the only recourse users have is to downgrade/disable the diagnostic (otherwise they'd have to change system header code), which they may likely do with a command line flag rather than something more targeted like diagnostic pragmas around the include, which increases the risk of users not seeing the issues in code they can control.

They'd already have had a chance to deal with their code when this was a warning-default-error without "ShowInSystemHeaders"? (or, if the yhaven't picked up a new compiler often enough, and they go from "a warning we didn't care about" to "warning-default-error-with-ShowInSystemHeaders" - they're still better off than if it'd gone straight to hard error, some chance to cleanup while disabling the warning/error before picking up a compiler version that makes it a hard error)

They'd already have had a chance to deal with their code when this was a warning-default-error without "ShowInSystemHeaders"? (or, if the yhaven't picked up a new compiler often enough, and they go from "a warning we didn't care about" to "warning-default-error-with-ShowInSystemHeaders" - they're still better off than if it'd gone straight to hard error, some chance to cleanup while disabling the warning/error before picking up a compiler version that makes it a hard error)

+1 for this. Having no intermediate step for developers to find/fix the warning in system headers (short of recompiling Clang with different settings) seems really rough for very downstream folks.

One other thought I had was whether this should have a different way to suppress a "severe" warning (i.e. -fno-really-I-know-I-should-fix-this-UB-hole a la the now defunct flag for automatic initialization), because it is far too common for downstream developers of all sorts to encounter a warning once, see that they just don't care about that specific instance, and then slap on a -Wno-foo to their project forever. It almost seems like there should be a strongly-worded opt-out for these severe warning suppressions that might be behavior-changing (even if we later remove the ability for users to suppress them). I know that compiler developers don't like adding flags (let alone confusing ones for behaviors that we know should have been strict from the start), but it seems like there might be some tradeoff here that is worth it, considering how much other downstream cleanup might be needed for some compiler vendors.

The fundamental problem here is the interaction with SFINAE; if we don't diagnose this as an error, we actually miscompile some cases. Because of that, a flag wouldn't really be "turning off the warning"; it would be enabling a non-compliant language mode that has different rules for whether enum casts are legal.

If it weren't for that, I don't think anyone would be in a hurry to turn the warning into a hard error.

The fundamental problem here is the interaction with SFINAE; if we don't diagnose this as an error, we actually miscompile some cases. Because of that, a flag wouldn't really be "turning off the warning"; it would be enabling a non-compliant language mode that has different rules for whether enum casts are legal.

If it weren't for that, I don't think anyone would be in a hurry to turn the warning into a hard error.

+1 to this. I'm worried about the miscompiles continuing in the wild. We did the best we could when landing the warning-as-error changes in D131307 by telling users explicitly "This diagnostic is expected to turn into an error-only diagnostic in the next Clang release." in the release notes. Obviously, that's not ideal because very few folks read release notes, but we don't have any better mechanism for communicating these kinds of changes.

What do folks think is a reasonable path forward here? Given that we're going to branch the Clang 17 release in a few weeks (AIUI), my suggestion is to kick the can down the road by landing these changes just after we branch. That way the changes land such that early adopters can test and see how disruptive we're being without time pressure or hassle of dealing with release branches. Do folks think that's a reasonable approach? (I'm open to other suggestions.)

The fundamental problem here is the interaction with SFINAE; if we don't diagnose this as an error, we actually miscompile some cases. Because of that, a flag wouldn't really be "turning off the warning"; it would be enabling a non-compliant language mode that has different rules for whether enum casts are legal.

If it weren't for that, I don't think anyone would be in a hurry to turn the warning into a hard error.

+1 to this. I'm worried about the miscompiles continuing in the wild. We did the best we could when landing the warning-as-error changes in D131307 by telling users explicitly "This diagnostic is expected to turn into an error-only diagnostic in the next Clang release." in the release notes. Obviously, that's not ideal because very few folks read release notes, but we don't have any better mechanism for communicating these kinds of changes.

What do folks think is a reasonable path forward here? Given that we're going to branch the Clang 17 release in a few weeks (AIUI), my suggestion is to kick the can down the road by landing these changes just after we branch. That way the changes land such that early adopters can test and see how disruptive we're being without time pressure or hassle of dealing with release branches. Do folks think that's a reasonable approach? (I'm open to other suggestions.)

I'm not quite following if/what the objection is to removing the "ignore in system headers" for the warning for a release or two prior to making this a hard error? That seems like a fairly low-cost change (it does kick the can down the road a release or two before it becomes a hard error - but isn't worse for users, for the most part - if they were going to get a hard error from a system header before, at least now instead they'd get a warning or warning-defaulted-to-error instead, they could turn it off (but they had been warned that their system headers were going to cause them problems soon) and then it becomes a hard error a release or two later).

The fundamental problem here is the interaction with SFINAE; if we don't diagnose this as an error, we actually miscompile some cases. Because of that, a flag wouldn't really be "turning off the warning"; it would be enabling a non-compliant language mode that has different rules for whether enum casts are legal.

If it weren't for that, I don't think anyone would be in a hurry to turn the warning into a hard error.

+1 to this. I'm worried about the miscompiles continuing in the wild. We did the best we could when landing the warning-as-error changes in D131307 by telling users explicitly "This diagnostic is expected to turn into an error-only diagnostic in the next Clang release." in the release notes. Obviously, that's not ideal because very few folks read release notes, but we don't have any better mechanism for communicating these kinds of changes.

What do folks think is a reasonable path forward here? Given that we're going to branch the Clang 17 release in a few weeks (AIUI), my suggestion is to kick the can down the road by landing these changes just after we branch. That way the changes land such that early adopters can test and see how disruptive we're being without time pressure or hassle of dealing with release branches. Do folks think that's a reasonable approach? (I'm open to other suggestions.)

I'm not quite following if/what the objection is to removing the "ignore in system headers" for the warning for a release or two prior to making this a hard error? That seems like a fairly low-cost change (it does kick the can down the road a release or two before it becomes a hard error - but isn't worse for users, for the most part - if they were going to get a hard error from a system header before, at least now instead they'd get a warning or warning-defaulted-to-error instead, they could turn it off (but they had been warned that their system headers were going to cause them problems soon) and then it becomes a hard error a release or two later).

Thank you for bringing that up, sorry for not being more explicit -- by "these changes", I meant "whatever form we decide they should take" as opposed to "the patch as it is now."

In terms of removing the "ignore in system headers" flag, I'm not strongly opposed to it, but I do worry it will throw the baby out with the bathwater. The specific use case I'm worried about is: user has a system header with the problematic code, they run into this diagnostic and they can't address it themselves so they disable the warning for the project. The user then doesn't learn about the problems in their own (non-system header) code until it becomes a hard error later.

The fundamental problem here is the interaction with SFINAE; if we don't diagnose this as an error, we actually miscompile some cases. Because of that, a flag wouldn't really be "turning off the warning"; it would be enabling a non-compliant language mode that has different rules for whether enum casts are legal.

If it weren't for that, I don't think anyone would be in a hurry to turn the warning into a hard error.

+1 to this. I'm worried about the miscompiles continuing in the wild. We did the best we could when landing the warning-as-error changes in D131307 by telling users explicitly "This diagnostic is expected to turn into an error-only diagnostic in the next Clang release." in the release notes. Obviously, that's not ideal because very few folks read release notes, but we don't have any better mechanism for communicating these kinds of changes.

What do folks think is a reasonable path forward here? Given that we're going to branch the Clang 17 release in a few weeks (AIUI), my suggestion is to kick the can down the road by landing these changes just after we branch. That way the changes land such that early adopters can test and see how disruptive we're being without time pressure or hassle of dealing with release branches. Do folks think that's a reasonable approach? (I'm open to other suggestions.)

I'm not quite following if/what the objection is to removing the "ignore in system headers" for the warning for a release or two prior to making this a hard error? That seems like a fairly low-cost change (it does kick the can down the road a release or two before it becomes a hard error - but isn't worse for users, for the most part - if they were going to get a hard error from a system header before, at least now instead they'd get a warning or warning-defaulted-to-error instead, they could turn it off (but they had been warned that their system headers were going to cause them problems soon) and then it becomes a hard error a release or two later).

Thank you for bringing that up, sorry for not being more explicit -- by "these changes", I meant "whatever form we decide they should take" as opposed to "the patch as it is now."

Ah, OK.

In terms of removing the "ignore in system headers" flag, I'm not strongly opposed to it, but I do worry it will throw the baby out with the bathwater. The specific use case I'm worried about is: user has a system header with the problematic code, they run into this diagnostic and they can't address it themselves so they disable the warning for the project. The user then doesn't learn about the problems in their own (non-system header) code until it becomes a hard error later.

OK, I think we're still talking past each other/making different comparisons.
I think you're comparing "make this a warning in system headers" to "not doing anything" (so, yes, it causes people to ignore the warning even in their own code when if they had done nothing/we had done nothing to force them, they would've kept getting the warning in their code)
Whereas I'm comparing "make this a warning in system headers" to "making this a hard error" (in which case a user for which it would've been a hard error and they'd have had no choice but to fix their system headers (might be hard/impossible) would have some relief valve for a time/some notice that this would be a problem in the future when it becomes a hard error)

If we picture a timeline:
Time 1) Code is valid/no problem
Time 2) Warning added (not in system headers)
Time 3) Warning becomes error by default (but could be disabled by a user) (still not in system headers)
Time 4) Becomes hard error

And if someone has a system header with a violation, they won't know about it until (4) and they won't be able to do much/anything about it.

I'm suggesting making the process longer.

Time 4) default-error warning becomes enabled in system headers

Users who, in the first timeline, would be stuck - at least have a release valve. I think they are better off than they were in the first timeline - even though they lose warning coverage over their own code (which is bad), at least they can still build their project and know they have problems in their system headers they should report/etc and try to have addressed.

then Time 5) make it a hard error.

So it takes a bit longer to get to (5), and users who would be OK in the original timeline are no worse off - they had there system headers fixed by (4) and don't need to disable the warning. But users who would've had a bad time at (4) have a less bad time/some chance of working through the bad time.

The fundamental problem here is the interaction with SFINAE; if we don't diagnose this as an error, we actually miscompile some cases. Because of that, a flag wouldn't really be "turning off the warning"; it would be enabling a non-compliant language mode that has different rules for whether enum casts are legal.

If it weren't for that, I don't think anyone would be in a hurry to turn the warning into a hard error.

+1 to this. I'm worried about the miscompiles continuing in the wild. We did the best we could when landing the warning-as-error changes in D131307 by telling users explicitly "This diagnostic is expected to turn into an error-only diagnostic in the next Clang release." in the release notes. Obviously, that's not ideal because very few folks read release notes, but we don't have any better mechanism for communicating these kinds of changes.

What do folks think is a reasonable path forward here? Given that we're going to branch the Clang 17 release in a few weeks (AIUI), my suggestion is to kick the can down the road by landing these changes just after we branch. That way the changes land such that early adopters can test and see how disruptive we're being without time pressure or hassle of dealing with release branches. Do folks think that's a reasonable approach? (I'm open to other suggestions.)

I'm not quite following if/what the objection is to removing the "ignore in system headers" for the warning for a release or two prior to making this a hard error? That seems like a fairly low-cost change (it does kick the can down the road a release or two before it becomes a hard error - but isn't worse for users, for the most part - if they were going to get a hard error from a system header before, at least now instead they'd get a warning or warning-defaulted-to-error instead, they could turn it off (but they had been warned that their system headers were going to cause them problems soon) and then it becomes a hard error a release or two later).

Thank you for bringing that up, sorry for not being more explicit -- by "these changes", I meant "whatever form we decide they should take" as opposed to "the patch as it is now."

Ah, OK.

In terms of removing the "ignore in system headers" flag, I'm not strongly opposed to it, but I do worry it will throw the baby out with the bathwater. The specific use case I'm worried about is: user has a system header with the problematic code, they run into this diagnostic and they can't address it themselves so they disable the warning for the project. The user then doesn't learn about the problems in their own (non-system header) code until it becomes a hard error later.

OK, I think we're still talking past each other/making different comparisons.
I think you're comparing "make this a warning in system headers" to "not doing anything" (so, yes, it causes people to ignore the warning even in their own code when if they had done nothing/we had done nothing to force them, they would've kept getting the warning in their code)
Whereas I'm comparing "make this a warning in system headers" to "making this a hard error" (in which case a user for which it would've been a hard error and they'd have had no choice but to fix their system headers (might be hard/impossible) would have some relief valve for a time/some notice that this would be a problem in the future when it becomes a hard error)

If we picture a timeline:
Time 1) Code is valid/no problem
Time 2) Warning added (not in system headers)
Time 3) Warning becomes error by default (but could be disabled by a user) (still not in system headers)
Time 4) Becomes hard error

And if someone has a system header with a violation, they won't know about it until (4) and they won't be able to do much/anything about it.

I'm suggesting making the process longer.

Time 4) default-error warning becomes enabled in system headers

Users who, in the first timeline, would be stuck - at least have a release valve. I think they are better off than they were in the first timeline - even though they lose warning coverage over their own code (which is bad), at least they can still build their project and know they have problems in their system headers they should report/etc and try to have addressed.

then Time 5) make it a hard error.

So it takes a bit longer to get to (5), and users who would be OK in the original timeline are no worse off - they had there system headers fixed by (4) and don't need to disable the warning. But users who would've had a bad time at (4) have a less bad time/some chance of working through the bad time.

Thank you for the detailed explanation, that's really helpful! I think we were approaching this from somewhat different angles, but I now better understand what you mean. The tradeoff boils down to a question of which is less bad for users: losing warning coverage in their own code or finding out they have to get new system headers (or modify the existing ones, but no easy switch to address the issue). I think getting new system headers is harder for projects to deal with, so I think I'm now in agreement with you that we should enable the warning as an error in system headers.

I think getting new system headers is harder for projects to deal with, so I think I'm now in agreement with you that we should enable the warning as an error in system headers.

Awesome - glad we're on the same page :)

carlosgalvezp added a subscriber: carlosgalvezp.EditedSep 23 2023, 12:15 AM

@shafik @aaron.ballman @dblaikie

Hello! Just wanted to check if there's any blockers for merging this patch? We are now on Clang 18, i.e. 2 releases after the warning was introduced, so IMO I believe it's a good time to turn it into a hard error and test it in the wild.

I read concerns about breaking code. Technically, UB is only invoked in C++17 and onwards (before, it's only unspecified behavior) - could a solution to mitigate risk/break less code be to only enable this hard error in C++17? This way, only people who use a relatively new C++ Standard and compiler would get the error.

I also wonder what is the way forward with respect to code reviews, since Phabricator is deprecated. @shafik do you intend to continue here, or will you move this into a Github PR?

Happy to help if there's anything I can do! Thanks for the great work :)

@shafik @aaron.ballman @dblaikie

Hello! Just wanted to check if there's any blockers for merging this patch? We are now on Clang 18, i.e. 2 releases after the warning was introduced, so IMO I believe it's a good time to turn it into a hard error and test it in the wild.

I read concerns about breaking code. Technically, UB is only invoked in C++17 and onwards (before, it's only unspecified behavior) - could a solution to mitigate risk/break less code be to only enable this hard error in C++17? This way, only people who use a relatively new C++ Standard and compiler would get the error.

I also wonder what is the way forward with respect to code reviews, since Phabricator is deprecated. @shafik do you intend to continue here, or will you move this into a Github PR?

Happy to help if there's anything I can do! Thanks for the great work :)

I still think even if we can subset this, for whatever we're going to turn into a hard error, it should be a warning-as-error in system headers first for at least a release. (so perhaps the transition should look like: null (no diagnostic) -> warning -> warning-default-to-error -> warning-default-to-error-even-in-system-headers -> hard error)

I still think even if we can subset this, for whatever we're going to turn into a hard error, it should be a warning-as-error in system headers first for at least a release. (so perhaps the transition should look like: null (no diagnostic) -> warning -> warning-default-to-error -> warning-default-to-error-even-in-system-headers -> hard error)

@dblaikie Thanks for the input, I believe I achieve the behavior you propose here, feel free to review! https://github.com/llvm/llvm-project/pull/67528