Analogous to https://reviews.llvm.org/D93758, this patch uses
unsigned char for all enums in FormatStyle.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Is there a clang-tidy check for this? I mean if we are doing this because it helps reduce the size of FormatStyle and its recommended practice for small enums, then we should have a clang-tidy check that catches us on this. (no?)
Looks innocent but I have a few questions.
What's the size difference that you observe?
Any runtime penalty? (Usually compilers choose a faster underlying type for enums, not smaller)
Also, is this change very relevant to clang-format? We should not copy FormatStyle a lot, no?
The decision is made in the header (so I think it will always be int until this doesn't fit). To decide what is faster the compiler would need to know all the usages. Which it can't.
We have multiple times bit fields in clang-format to reduce the object size. I think it's only consistent to decrease the FormatStyle size.
The only reason I didn't do it for the enums introduced by me is consistency.
None that I'm aware of.
Any runtime penalty? (Usually compilers choose a faster underlying type for enums, not smaller)
The decision is made in the header (so I think it will always be int until this doesn't fit). To decide what is faster the compiler would need to know all the usages. Which it can't.
Yes, Clang chooses int by default. In fact, C++11 and later mandate that a new-style scoped enum enum class Foo { means the same thing as enum class Foo : int { — see https://timsong-cpp.github.io/cppwp/n3337/dcl.enum#5 . For unscoped old-style enums, Clang/GCC/MSVC all choose int by default, unless -fshort-enums is passed on the command line and/or the enum is annotated with __attribute__((packed)), in which case it chooses the smallest possible type. -fshort-enums and __attribute__((packed)) have no effect on new-style scoped enums nor on enums with explicit underlying types.
Before this lands can we have a discussion about what clarity this gives us?, because I think it makes the code more unreadable, but surely we are saving just 7x(3 bytes) (the difference between and int and a unsigned char for 7 enums)
Is saving 21 bytes valuable?
Peanut gallery says: I think giving every enum a concrete underlying type is a good practice, so I don't think this makes anything less readable. (That is, I'd even favor adding : int to each of these enums, over the status quo where the type is merely implied — we've seen that people can be unsure of the exact behavior.)
I can imagine that someone might object that hard-coding one byte for each enum might one day cause problems if we want more than 256 possible settings for any given enum, but that's not likely, is it?
Re space savings, OP says that they're saving 21 bytes per object and that they have "lots" of objects in their particular use-case, even if clang-format itself doesn't.
In my own area I gave FormatStyle a copy constructor and made it private, it hardly caused any issues, this isn't something that we tend copy about very much, we tend copy construct it only when we do getLLVMStyle() to make another style say like google style sure we'll do that a handful of times but I very much doubt it would make an impact.. can we measure?
I have no problems adding the base class if that's the LLVM coding style recommendation, otherwise I don't see it gives us anything.
I also like a defined type for enums (I even have multiple enums based on bool).
I can imagine that someone might object that hard-coding one byte for each enum might one day cause problems if we want more than 256 possible settings for any given enum, but that's not likely, is it?
I think an option with more than 256 variants would be a horror. But even if we need that, we can just increase the type, since we have no API or ABI stability in clang-format. We regulary change a bool to an enum.
That being said, I don't care that much that we declare the enums with a type. But I care about consistency. Either land this, or revert D93758.
That being said, I don't care that much that we declare the enums with a type. But I care about consistency. Either land this, or revert D93758.
Consistency was more or less the only reason why I made this change in the first place and I didn't expect so many comments on a change that is almost identical to D93758. Sure I didn't have a good argument for or against it so far but... consistency... I'm glad this discussion came up anyways. Will look at each comment tomorrow.
Can we use int8_t instead of unsigned char to keep the enum types signed (and update the changes made in D93758)?