Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I agree with safety and grepable. However, I disagree with cast to void. I believe there is no final decision between cast to void and [[maybe_unused]].
I'm happy to drop the bit about casting to void (I have no particular thoughts on [[maybe_unused]] at this time). I only added it for clarity, because the section immediately above explicitly says to cast to void for assertions.
LGTM, but you should probably get a least another one.
I agree with safety and grepable. However, I disagree with cast to void. I believe there is no final decision between cast to void and [[maybe_unused]].
Are you arguing that casting to void should be forbidden right now?
The new documentation is not saying anything about a preference, it merely avoids introducing a new prohibition about the (void) pattern, which to me seems to match existing practices.
What about function style casts? I personally find them acceptable in limited cases, such as "constructing" int64_t from unsigned.
llvm/docs/CodingStandards.rst | ||
---|---|---|
1318 | This does not compile. |
It's a fair question, and I don't have a good answer for this. Strictly, we should probably forbid them in new code on the basis that they are just as bad as C-style casts (they are explicitly defined to be identical), with regards to safety etc, but I also have used them in the past for the same "constructing" style like you find. Any suggestion on how we might codify the latter.
@mehdi_amini, do you have a preference?
llvm/docs/CodingStandards.rst | ||
---|---|---|
1318 | Oops /facepalm. (Fixed now) |
I'm a bit puzzled with we would forbid one form of the C-style cast and not the other if they are equivalent? Is the intent to call an exception for integer casting?
I would prefer to say nothing about cast to void. Preference for named casts is an improvement.
I would prefer to say nothing about cast to void.
You’re not addressing my remark: the sentence that preceded would be akin to forbid it.
llvm/docs/CodingStandards.rst | ||
---|---|---|
1312 | const_cast for completeness? And perhaps a good place to reiterate that dynamic_cast is explicitly *not* accepted, in favor of LLVM's own casting infrastructure. |
Maybe adopt this clang-tidy check? https://clang.llvm.org/extra/clang-tidy/checks/cppcoreguidelines/pro-type-cstyle-cast.html
That is, allow C/functional-style cast when it is known to be safe (like unsigned -> int64_t or enum -> int), and prefer named casts otherwise.
The check also provides fix-its, which is nice.
"Are you arguing that casting to void should be forbidden right now? "
I cannot. I believe the discussion on discourse ended without a decision.
I disagree. We have a *lot* of casts to (void) in the code base (approx 4000 times) and those should not be expanded to use a much more visually distracting static_cast. IMO, we should be explicit that C-style casting to void is acceptable.
llvm/docs/CodingStandards.rst | ||
---|---|---|
1313 | FWIW, I think it would be reasonable to also mention use of [[maybe_unused]] as a potential alternative to casting to void, but without prejudice as to which approach to use. | |
1317–1319 | How about a more realistic example along the lines of: int check_value = doSomeWorkWithSideEffects(); assert(check_value == 42 && "the assert is removed in NDEBUG builds"); (void)check_value; |
I am talking about cast to void or [[maybe_used]]. If you can use a named cast, then you should. I would never argue to use static_cast for a cast to void.
The issue here is that if a C-style/functional-style cast is added here, and then one of the types changes, the cast will prevent a potential build breakage. C++-style casts don't completely fix that, but should prevent some mistakes at least.
llvm/docs/CodingStandards.rst | ||
---|---|---|
1312 | Makes sense. | |
1313 | I feel like any reference to [[maybe_unused]] belongs either in its own section (which I don't propose to add at this time) or in the section above where asserts are discussed, as that talks about casting to void to suppress the warnings. What I want to achieve here is to ensure the rule for preferring C++-style doesn't apply to that specific case. [[maybe_unused]] is an alternative solution, but isn't relevant to casting. | |
1317–1319 | I want to avoid essentially repeating the example or explanation from the previous section about asserts. I'm tempted to drop the void cast example entirely. |
I don't think this is an issue. static_cast can't catch a change in size of an integer or a change between floating point types. In fact, for scalar types it works exactly the same as static_cast.
C-style cast can only be dangerous when casting pointers (downcasting and unrelated class hierarchies). Chances that one of the types changes to pointer is miserable.
Moreover, I find the difference between C-style casts and named casts more of a stylistic issue rather than a safety issue.
This does not make the issue less important though -- consistency is very important. And for that matter:
$ egrep -r "\(u?int[0-9]+_t\)" llvm | wc -l
1943
$ egrep -r "static_cast<u?int[0-9]+_t>" llvm | wc -l
1078
C-style cast wins :)
This is pretty nice! If it was possible to make it a clang warning instead of a tidy check (not clear to me why this can't be?), then we could make it a Werror and actually enforce it!
llvm/docs/CodingStandards.rst | ||
---|---|---|
1313 | What about The sole exception to this is when casting to void to suppress warnings about unused variables (as an alternative to [[maybe_unused]], when C-style casts should be preferred instead. (the example is likely unnecessary). |
Someone would have to test over a large corpus of code to see what the false positive vs true positive rate looks like. My intuition is that it would have too many false positives to be enabled by default, so it wouldn't meet the bar for a clang diagnostic.
Someone would have to test over a large corpus of code to see what the false positive vs true positive rate looks like. My intuition is that it would have too many false positives to be enabled by default, so it wouldn't meet the bar for a clang diagnostic.
Don't we have many clang-diagnostics that aren't enabled by default? I thought that was a given that people can opt-in specific warnings in this category.
What if we got LLVM entirely clean for such a warning: would it be good enough signal that it can be useful?
I wondering what you mean by false positives? You know the type on the left and right of the cast. There should be some rules to find fishy casts. You want to cast a struct to a bool?
You picked not the most modern part of the codebase though, I can cherry-pick differently with more recent project to get signal for "code written in the last 5 years" and get a very opposite result:
$ egrep -r "\(u?int[0-9]+_t\)" bolt flang lld mlir | wc -l 186 $ egrep -r "static_cast<u?int[0-9]+_t>" bolt flang lld mlir | wc -l 525
(I'm not making any claim on safety of integer-to-integer casting here)
We do have off-by-default warnings but those tend to mostly be: historical warnings we likely would never add today, pedantic warnings for extensions/future standards compatibility, or they're for extremely one-off situations (like serious performance concerns that aren't also correctness concerns). We have evident that off-by-default warnings basically do not get enabled often enough to warrant adding them to the tool. So getting LLVM warning free for this wouldn't actually move the needle all that much.
I mean: a situation where the existing C-style cast is as correct as the suggested named cast replacement (basically, where the switch between cast styles would be a noop and so users are likely to find the diagnostic too opinionated).
(void)some_value_to_ignore; is identical to static_cast<void>(some_value_to_ignore); per spec, so any diagnostic of that variety would be a false positive, for example.
For the first iteration, I would not ask clang provide fixits to use named casts. I just want reports about fishy casts.
The concern isn't with the fix-it to use a named cast, it's with the definition of "fishy cast". The language already has constraints for super fishy constructs. You mentioned casting a struct to a bool -- that's already not allowed and you get errors in both C and C++: https://godbolt.org/z/P8WaPv8zT The tricky part will be the the constructs for which the language allows the cast as it might be reasonable or might not be reasonable, depending on what types and values are involved.
We do have off-by-default warnings but those tend to mostly be: historical warnings we likely would never add today, pedantic warnings for extensions/future standards compatibility, or they're for extremely one-off situations (like serious performance concerns that aren't also correctness concerns). We have evident that off-by-default warnings basically do not get enabled often enough to warrant adding them to the tool. So getting LLVM warning free for this wouldn't actually move the needle all that much.
We're off-topic, but is there a discussion/reference about this I can read?( this seems actually pretty unfortunate to me: clang-tidy is nice but hardly a replacement since it isn't really in the day-to-day developer flow).
Add reference to [[maybe_unused]] and remove example.
I haven't put anything about functional-style casts at this point, but I'm happy to, if there's consensus one way or another (I feel like there isn't currently).
Once upon a time, someone did some digging to see just how often the various off-by-default warnings were enabled across a significant corpus of code and found it was a very low percentage pretty much across the board. However, that was *years* ago at this point and I can't find the discussion in my archives.
However, as an example of some off-by-default warnings, we have -Wzero-as-null-pointer-constant as an off-by-default warning, and at first it looks like it's heavily used: https://sourcegraph.com/search?q=context:global+-Wzero-as-null-pointer-constant+-file:.*test.*&patternType=standard&sm=1&groupBy=repo but when you look more closely, you see that the vast majority of "uses" are trying to disable the warning if it's already enabled. We have -Wshadow-field off-by-default, same situation: https://sourcegraph.com/search?q=context:global+-Wshadow-field+-file:.*test.*&patternType=standard&sm=1&groupBy=repo
clang-tidy exists specifically because we wanted a tool people could integrate into their CI pipeline for more opinionated diagnostics. It's particularly good for enforcing coding standards because of how often those also make style choices. It can be used in the day-to-day workflow (for example, I know of at least one person who integrates it into VS Code through clangd to get IDE integration and then runs clang-tidy ahead of clang within their builds), but perhaps there are ways we can make it more ergonomic so that's easier.
The discussion seems to have died a bit. It seems like there are no particular objections to this going in as-is now? If so, could I get another LGTM or two from the other people who looked at this, before I land it.
The only topic I think that doesn't have a 100% consensus on is whether functional style casts should be permitted in some limited cases. I don't have a strong preference, so I'm happy to go with whatever others prefer. It seems like @barannikov88 is the only proponent for permitting them.
-1 from my side. I consider static_cast<> to be worse for integer casts by dint of adding unnecessary noise. I could get on board with this if integer/enum casts are excluded.
For the cases where it really matters, we already use our own cast/dyn_cast anyway. Encouraging use of const_cast<> is fine as well, and I believe we already do use that fairly consistently in the cases where it is applicable.
I don't have strong feelings about using static_cast with integer or enumeration types. Personally, I tend to use C-style casts for those, but I don't have an objection to preferring a named cast instead. Someone could put together a quick and dirty clang-tidy check to see if we use C-style casts to integer types with far greater frequency than named casts and use that to make a decision. Barring that amount of effort, I would be fine excluding int/enum casts from this requirement on the assumption that we have more of those casts (just based on my gut feeling, if others have a different gut feeling, definitely speak up).
I am not at all familiar with implementing things in clang-tidy, so don't have the time or skills to implement said dirty hack.
If we were to exclude integers and enums, then I would suggest the following wording:
When casting, use static_cast, reinterpret_cast, and const_cast rather than C-style casts. There are two exceptions to this:
- When casting to void to suppress warnings about unused variables (as an alternative to `[[maybe_unused]]`). Prefer C-style casts in this instance.
- When casting integral types (including enums), prefer the prevailing style in the area being worked on.
No worries, it's a tall ask. :-)
If we were to exclude integers and enums, then I would suggest the following wording:
When casting, use static_cast, reinterpret_cast, and const_cast rather than C-style casts. There are two exceptions to this:
- When casting to void to suppress warnings about unused variables (as an alternative to `[[maybe_unused]]`). Prefer C-style casts in this instance.
- When casting integral types (including enums), prefer the prevailing style in the area being worked on.
I like that phrasing, thank you! Do others have opinions on using this wording over the existing wording?
! In D151187#4420876, @aaron.ballman wrote:
I like that phrasing, thank you! Do others have opinions on using this wording over the existing wording?
Yup, it looks kind of redundant because at the very beginning of the coding standards it is already spelled, in bold:
If you are extending, enhancing, or bug fixing already implemented code, use the style that is already being used so that the source is uniform and easy to follow.
And it still forbids functional style casts. I'd like them to explicitly be allowed for constructing objects, as if the type being cast to was a class.
In D152098 I introduce a typedef MCRegUnit and use functional style cast to construct it from integer, like this:
Val = MCRegUnit(*++I);
This violates the suggested wording; I'm supposed to use static_cast here. But I plan to turn the typedef into a class in the future, and don't want to use static_cast for calling class' constructor, it is unnatural.
I'd also allow functional-style casts for constructing enums from integers. static_cast isn't safer here, only more noisy.
I'm fine with static_cast in all other cases, and I also support forbidding C-style casts completely.
Unfortunately, my English is too poor to suggest a wording that would take these cases into account.
Essentially, the second bullet point is saying that integrals are specifically excluded from the rule, so fall back to the default. Whilst it's redundant, I think it's important to say it explicitly here. Plus, if you are writing some new code, or it's in an area where there is no prevailing style, and this basically leaves the door open to "choose what you want", rather than prescribing that static_cast is used. At least, that's how I'd interpret it.
And it still forbids functional style casts. I'd like them to explicitly be allowed for constructing objects, as if the type being cast to was a class.
In D152098 I introduce a typedef MCRegUnit and use functional style cast to construct it from integer, like this:
Val = MCRegUnit(*++I);
This violates the suggested wording; I'm supposed to use static_cast here. But I plan to turn the typedef into a class in the future, and don't want to use static_cast for calling class' constructor, it is unnatural.
I'd also allow functional-style casts for constructing enums from integers. static_cast isn't safer here, only more noisy.
I'm fine with static_cast in all other cases, and I also support forbidding C-style casts completely.Unfortunately, my English is too poor to suggest a wording that would take these cases into account.
A typdefed integral is still an integral type, so the rule is to follow the prevailing style. If functional style casts are already in use elsewhere in that area of code, then the code you've highlighted would be fine. Once you turn it into a class, it's no longer a functional-style cast - it's simply calling the class constructor, so this coding standard isn't applicable. (If the issue is more that you'd be forced to use a static_cast initially and then the transition to a class results in an unnecessary static_cast, I think it would be fine to argue that you're ignoring the style guide because a pending change will work better, and I think that rule applies for all parts of the style guide).
+1
Plus, if you are writing some new code, or it's in an area where there is no prevailing style, and this basically leaves the door open to "choose what you want", rather than prescribing that static_cast is used. At least, that's how I'd interpret it.
I think new code is generally expected to adhere to the coding guideline (so we add new code that's increasing consistency rather than decreasing it), so I read it as requiring static_cast (and I'm fine with that, but I'm also fine rewording it to "prefer the prevailing style in the area being worked on, or use <blah> if there is no prevailing style").
And it still forbids functional style casts. I'd like them to explicitly be allowed for constructing objects, as if the type being cast to was a class.
In D152098 I introduce a typedef MCRegUnit and use functional style cast to construct it from integer, like this:
Val = MCRegUnit(*++I);
This violates the suggested wording; I'm supposed to use static_cast here. But I plan to turn the typedef into a class in the future, and don't want to use static_cast for calling class' constructor, it is unnatural.
I'd also allow functional-style casts for constructing enums from integers. static_cast isn't safer here, only more noisy.
I'm fine with static_cast in all other cases, and I also support forbidding C-style casts completely.Unfortunately, my English is too poor to suggest a wording that would take these cases into account.
A typdefed integral is still an integral type, so the rule is to follow the prevailing style. If functional style casts are already in use elsewhere in that area of code, then the code you've highlighted would be fine. Once you turn it into a class, it's no longer a functional-style cast - it's simply calling the class constructor, so this coding standard isn't applicable. (If the issue is more that you'd be forced to use a static_cast initially and then the transition to a class results in an unnecessary static_cast, I think it would be fine to argue that you're ignoring the style guide because a pending change will work better, and I think that rule applies for all parts of the style guide).
+1
I find the "prefer the prevailing style" mantra unsatisfying in that it does not provide a path forward to evolve the codebase toward a unified style, which seems like an ideal state from a "style guide" point of view.
The thinking is that the path naturally converges as more and more new code gets written using the new style out of habit, and only when the prevailing style makes it incredibly obvious you're doing something different does code stick with the older style. It takes a *long* time to migrate to the new style this way, but it does work (at least in my experience on the Clang side of things).
Sounds fine to me, although I read it as "ignoring the style guide is allowed if you can convince the reviewer".
OK then.
I realised this dropped off my radar for a while, for various reasons, so I'm picking this back up in an effort to get a consensus and to land something. I'm not bothering uploading a diff at this point, but my most recent proposal was:
When casting, use static_cast, reinterpret_cast, and const_cast rather than C-style casts. There are two exceptions to this:
- When casting to void to suppress warnings about unused variables (as an alternative to `[[maybe_unused]]`). Prefer C-style casts in this instance.
- When casting integral types (including enums), prefer the prevailing style in the area being worked on.
@mehdi_amini pointed out that this doesn't provide a proper convergence point. Would there be broad acceptance if we permitted functional-style casts for safe integral types, but not regular C-style casts? This would allow code to (eventually) converge towards at least two points. Concrete proposal:
When casting, use static_cast, reinterpret_cast, and const_cast rather than C-style casts. There are two exceptions to this:
- When casting to void to suppress warnings about unused variables (as an alternative to `[[maybe_unused]]`). Prefer C-style casts in this instance.
- When casting between integral types (including enums that are not strongly-typed), functional-style casts are permitted as an alternative to static_cast.
I also think it makes sense to move this block to the paragraph below https://llvm.org/docs/CodingStandards.html#do-not-use-rtti-or-exceptions, so that dynamic_cast is referenced in close proximity.
Some thoughts about the second bullet point in this proposal:
- I've explicitly excluded strongly-typed enums from functional-style casts. In my experience, the point of strongly typed enums is that they're always referred to by name. Deviating from this may be appropriate in limited cases, but those cases probably want highlighting by virtue of a static_cast.
- I considered adding a clause to it to only permit functional-style casts for safe casts where there is no risk of loss of precision (i.e. as outlined here). However, those cases are typically cases where I personally DO use functional-style casts, to show the loss-of-precision is intentional. I don't oppose using static_cast in those cases though, so if there is preference to change this, I'm happy to do so.
Thoughts?
Assuming that we're still following the golden rule at the top of the style guide ("If you are extending, enhancing, or bug fixing already implemented code, use the style that is already being used so that the source is uniform and easy to follow."), I'm happy with this formulation.
I also think it makes sense to move this block to the paragraph below https://llvm.org/docs/CodingStandards.html#do-not-use-rtti-or-exceptions, so that dynamic_cast is referenced in close proximity.
Seems reasonable to me.
Some thoughts about the second bullet point in this proposal:
- I've explicitly excluded strongly-typed enums from functional-style casts. In my experience, the point of strongly typed enums is that they're always referred to by name. Deviating from this may be appropriate in limited cases, but those cases probably want highlighting by virtue of a static_cast.
Agreed.
- I considered adding a clause to it to only permit functional-style casts for safe casts where there is no risk of loss of precision (i.e. as outlined here). However, those cases are typically cases where I personally DO use functional-style casts, to show the loss-of-precision is intentional. I don't oppose using static_cast in those cases though, so if there is preference to change this, I'm happy to do so.
My weak opinion is to use static_cast in those cases, but I can live with function-style casts there as well because we've limited it to integer types.
Thoughts?
Thank you for picking this discussion back up! :-)
When casting, use static_cast, reinterpret_cast, and const_cast rather than C-style casts. There are two exceptions to this:
- When casting to void to suppress warnings about unused variables (as an alternative to `[[maybe_unused]]`). Prefer C-style casts in this instance.
- When casting between integral types (including enums that are not strongly-typed), functional-style casts are permitted as an alternative to static_cast.
Reads good to me as well.
The changes look different from the proposal. Am I missing something?
@barannikov88 / @tschuett / @nhaehnle / @nikic are you all happy with the current version? I think I've addressed everything (apart from @aaron.ballman's syntax comments).