Page MenuHomePhabricator

[SemanticTypedef] Provide a semantic typedef class and operators
Needs ReviewPublic

Authored by greened on Aug 13 2019, 9:40 AM.

Details

Summary

A semanic typedef is a kind of "strong typedef" that prevents implicit
conversion between itself and its underlying type. But rather than a
straight typedef of the base type with all operators available, a
semantic typedef allows the new type to choose the operators available
to it, increasing type safety.

An example:

// It doesn't make sense to add two CacheLineSize objects together, so
// don't define add-assignment.
struct CacheLineSize : public SemanticTypedef<CacheLineSize, unsigned>,
                     : public EqualityCompare<CacheLineSize>,
  		       : public LessThanCompare<CacheLineSize> {
  using SemanticTypedef::SemanticTypedef;
};

CacheLineSize ASize(32);
CacheLineSize BSize(64);

assert(ASize < BSize);

static_assert(!std::is_convertible<decltype(ASize),
                                   UnderlyingType<decltype(ASize)>>::value,
              "Converted CacheLineSize to its underlying type!");

static_assert(!std::is_convertible<UnderlyingType<decltype(ASize)>,
                                   decltype(ASize)>::value,
              "Converted underlying type to CacheLineSize!");

This commit defines the basic SemanticTypedef class along with classes
to make the most common operators available: ==, <, >=, =, +=, &&, ||,
!, << (output streaming).

Diff Detail

Event Timeline

greened created this revision.Aug 13 2019, 9:40 AM
greened edited the summary of this revision. (Show Details)Aug 13 2019, 9:41 AM

A strong motivation for using this over enum class should be provided. The default behavior is too minimalistic.

In my experience, the reason people use bool, int, string, etc. is because they provide a rich set of operations out of the box: assignment, comparison, implicit conversions, etc. enum class also provides most of those operations (other than implicit conversion) and has two benefits: it introduces a new type, and it makes it easy to name constants. SemanticTypedef, on the other hand, provides almost no operations without a lot more declaration.

If you use derivation (as opposed to using/typedef), you don't even get the converting constructor without an extra declaration. While derivation does have a benefit (you don't have to invent a tag type), it has a fairly high cost. Also, public derivation is used (not strictly necessary for introducing friend functions but necessary for everything else) even though this does not model an is-a relationship. The idea for using CRTP for this has been here for over two decades (e.g. Boost.Operators is one of the oldest Boost libraries) and yet, for the most part, hasn't caught on.

It is rare to have a value type that does not have assignment or equality comparisons (and hashing, as that goes along with equality comparisons). The default should minimally provide this behavior.

llvm/include/llvm/ADT/SemanticTypedef.h
112

This line gives you copy construction. Because there are no other declarations for copy/move constructor/assignment operator or destructor, you also get copy assignment but move construction/assignment is effectively deleted. Also deriving from Assign<Typedef> does not change this.

Recommendation: remove this line of code.

126

This needs an exception specification.

184

The downside to using macros here is that it makes it harder for users to debug code going through these macros. It doesn't seem worth writing eight macros just to write six classes.

207

Why the casts?

239

This is not true in general. The canonical example is floating point NaNs. See std;:optional for how comparisons when wrapping a single value should work when comparing the underlying value.

labath added a subscriber: labath.Aug 14 2019, 12:49 AM
labath added inline comments.
llvm/include/llvm/ADT/SemanticTypedef.h
2

I know this has been discussed on the thread, but I feel obligated to point out that there is already precedent for using the term "strong typedef" for this kind of thing: LLVM_YAML_STRONG_TYPEDEF. Might not matter if the goal is to eventually replace LLVM_YAML_STRONG_TYPEDEF with this, but otherwise, it would be nice to maintain consistent naming within the project.

A strong motivation for using this over enum class should be provided. The default behavior is too minimalistic.

There will be follow-on commits that add types with rich semantics. A StrongBoolean is one of them. This is just the bare minimum to get feedback and set the foundation. Doing this in a modular way provides the most flexibility which in my mind is what you want from foundation classes.

In my experience, the reason people use bool, int, string, etc. is because they provide a rich set of operations out of the box: assignment, comparison, implicit conversions, etc.

I don't think we want implicit conversions in the base SemanticTypedef class. We can add flexibility to interoperate with other types later on. I have prototyped almost all of this.

enum class also provides most of those operations (other than implicit conversion) and has two benefits: it introduces a new type, and it makes it easy to name constants. SemanticTypedef, on the other hand, provides almost no operations without a lot more declaration.

enum class is certainly useful in many situations where this would be an inferior solution. SemanticTypedef is useful in places where enum class is less so. For example, enum class won't work with any non-integral type.

If you use derivation (as opposed to using/typedef), you don't even get the converting constructor without an extra declaration. While derivation does have a benefit (you don't have to invent a tag type), it has a fairly high cost.

Can you elaborate on what you see as the high cost? I'm just trying to understand your perspective. As noted in the comments, nothing here precludes making use of using aliases rather than CRTP, though you lose some type safety. I find CRTP much more convenient.

Also, public derivation is used (not strictly necessary for introducing friend functions but necessary for everything else) even though this does not model an is-a relationship.

I agree this is one area where the model isn't great. One can't state that the mixins should always be inherited privately because assignment, for example, requires the operator to be a member of the class and not just a friend.

The idea for using CRTP for this has been here for over two decades (e.g. Boost.Operators is one of the oldest Boost libraries) and yet, for the most part, hasn't caught on.

Boost.Operators is overkill for this kind of thing and I suspect the same is true for other use-cases. With Boost.Operators you still have to define what the base operators do. For SemanticTypedef we know we want to operators to forward to the operators for the underlying type, so the client has no need to provide definitions of the operators.

It is rare to have a value type that does not have assignment or equality comparisons (and hashing, as that goes along with equality comparisons). The default should minimally provide this behavior.

I disagree. Boolean flags, for example, often don't need any of that.

Can you sketch out what you'd like to see in a strong typedef facility?

greened marked 6 inline comments as done.Aug 14 2019, 8:30 AM
greened added inline comments.
llvm/include/llvm/ADT/SemanticTypedef.h
2

I'm totally fine changing the name. Other people objected to "strong typedef" and "semantic typedef" seemed the best proposed option to me.

The C++ committee also rejected the same "strong typdef," opting for "opaque typedef" for reasons that are not entirely clear to me. I didn't want to name this OpaqueTypedef to avoid confusion with the language proposal of that name.

112

Deriving from Assign<Typedef> works. It's in the unit tests. Perhaps I've missed something.

I can add a move constructor. That was an oversight, thanks for pointing it out!

126

What should it be? is_nothrow_swappable is only available in C++ 17. Mkaing it unconditionally noexcept would be wrong.

184

If people feel strongly about it, I'm fine not using macros. It will be a lot more code, though.

207

That's a mistake. Will fix.

239

Good point. Will fix.

mehdi_amini added inline comments.
llvm/include/llvm/ADT/SemanticTypedef.h
28

The problem exposed above is compelling, but I have hard time connecting the explanation that follows to the problem statement. Can you provide the final solution for expressing the original API void foo(bool Feature1, bool Feature2); and how it compares to an enum class?

greened marked an inline comment as done.Aug 14 2019, 12:09 PM
greened added inline comments.
llvm/include/llvm/ADT/SemanticTypedef.h
28

For booleans enum class may be better in some cases, so perhaps I should restate the problem. Assuming a StrongBoolean is available:

cl::opt<bool> EnableFeature1("enable-feature1", cl::init(false), cl::desc(...));
cl::opt<bool> EnableFeature2("enable-feature2", cl::init(false), cl::desc(...));

struct Feature1Flag : public StrongBoolean<Feature1Flag> {
  using StrongBoolean::StrongBoolean;
};
struct Feature2Flag : public StrongBoolean<Feature2Flag> {
  using StrongBoolean::StrongBoolean;
};

void foo(Feature1Flag Feature1, Feature2Flag Feature2);

Feature1Flag F1(EnableFeature1);
Feature2Flag F2(EnableFeature2);

foo(F1, F2);

Using enum class:

enum class Feature1Flag : bool {
  Disabled = false,
  Enabled = true
};
enum class Feature2Flag : bool {
  Disabled = false,
  Enabled = true
}

cl::opt<Feature1Flag> EnableFeature1(
  cl::values(
    clEnumValN("enable-feature1", Feature1Flag::Enabled, "..."),
   clEnumValN("disable-feature1", Feature1Flag::Disabled, "...")));

cl::opt<Feature2Flag> EnableFeature2(
  cl::values(
    clEnumValN("enable-feature2", Feature2Flag::Enabled , "..."),
   clEnumValN("disable-feature2", Feature2Flag::Disabled, "...")));

void foo(Feature1Flag Feature1, Feature2Flag Feature2);

Feature1Flag F1(EnableFeature1);
Feature2Flag F2(EnableFeature2);

foo(F1, F2);

At least I think that's right. It's a bit more burdensome to create cl::opt flags with enum class.

Of course many Booleans won't use a cl::opt. Still, SemanticTypedef may be easier than enum class:

template<typename Tag>
struct MyFeatureFlags : public StrongBoolean<Tag> {
  using StrongBoolean<Tag>::StrongBoolean;
};

struct Feature1Flag : public MyFeatureFlags<Feature1Flag> {
  using MyFeatureFlags::MyFeatureFlags;
};
struct Feature2Flag : public MyFeatureFlags<Feature2Flag> {
  using MyFeatureFlags::MyFeatureFlags;
};

void foo(Feature1Flag Feature1, Feature2Flag Feature2) {
  if (Feature1 || Feature2) {
    ...
  }
}

Here, I'm assuming the ability to have "related" typedefs interoperate, so that logical disjunction between types with the same wrapper template but different tags can be allowed. I have prototyped this. It involves some complicated SFINAE magic and may not be worth the effort. But it is possible.

We could instead do:

// Feature1 and Feature2 declared as originally

void foo(Feature1Flag Feature1, Feature2Flag Feature2) {
  if (Feature1.isTrue() || Feature2.isTrue()) {
    ...
  }
}

Then we wouldn't need the messy SFINAE stuff but it's slightly less type-safe. The tradeoff is probably fine, though.

With enum class:

// Feature1 and Feature2 declared as before

void foo(Feature1Flag Feature1, Feature2Flag Feature2) {
  if (bool(Feature1) || bool(Feature2)) {
    ...
  }
}

Not that much of a win for SemanticTypedef but it is a bit cleaner.

One use for SemanticTypedef that would be helpful and not as conveniently achieved with enum class:

template<typename Tag, typename Int = unsigned>
struct Bits : StrongInteger<Tag, Int> {
  using SemanticTypedef<Tag, Int>::SemanticTypedef;
  Int toBytes() { return Int(*this) / 8; }
};

template<typename Tag, typename Int = unsigned>
struct Bytests : StrongInteger<Tag, Int> {
  using SemanticTypedef<Tag, Int>::SemanticTypedef;
  Int toBits() { return Int(*this) * 8; }
};

Then the interfaces that take/return bits/bytes would instead take/return these types and we'd reduce the number of bits-or-bytes errors in the code.

greened marked an inline comment as done.Aug 14 2019, 2:09 PM
greened added inline comments.
llvm/include/llvm/ADT/SemanticTypedef.h
112

Ok, I think I see. I misunderstood what you were getting at. Deriving from Assign doesn't provide move construction. It does provide move assignment though. Wouldn't adding a move constructor fix the issue?

greened marked an inline comment as done.Aug 15 2019, 8:13 AM
greened added inline comments.
llvm/include/llvm/ADT/SemanticTypedef.h
112

After thinking about this some more, I think you are right. I will delete the default copy constructor and also the Assign operator since it's hidden by the default assignment operator anyway.

greened updated this revision to Diff 215921.EditedAug 19 2019, 9:13 AM
greened marked an inline comment as done.

Removed casts, default constructors, added all operators where it makes sense (unary *, ->, ->*, unary &, comma, [] and () not added because a default return type is not obvious), placed operators in their own namespace to avoid collisions, rewrote example, updated tests.

greened marked 4 inline comments as done.Aug 19 2019, 9:14 AM
greened marked an inline comment as done.
greened added inline comments.
llvm/include/llvm/ADT/SemanticTypedef.h
184

As a data point, I build with gcc 8.1 and errors were pretty easy to track down, even through the macros.

nliber added a comment.Sep 5 2019, 3:52 PM

[Sorry it took so long to get back to this...]

  • Consider putting swap into a separate "operator" class (it isn't really an operator, but it is close...), which would be inherited like everything else.
  • Move construction, move assignment and swap are the three which really need noexcept on them, and you already get it for free with move construction/assignment because you are applying the "Rule of Zero". As was pointed out, is_nothrow_swappable didn't show up until C++17. However, I do not believe its implementation relies on anything in C++17. Consider implementing the trait yourself and using it (or equivalent) on swap.
  • Instead of default construction," copy the value" construction and "moving the value" construction, consider having a templated constructor which does emplace construction into Value. Not only does this cover the three construction methods already mentioned, it can be more efficient if you do not have the value constructed already, and it also enables wrapping of non-copyable non-moveable types (such as atomic<T>). It would look something like (caution: untested):
template<typename... Us, typename = std::enable_if_t<std::is_constructible<T, Us...>::value>>
constexpr SemanticTypedef(Us&&... us)
noexcept(noexcept(T(std::forward<Us>(us)...)))
: Value(std::forward<Us>(us)...)
{}

Note: even if you don't want to constrain it to is_constructible, you still need to constrain it so that non-const instances of SemanticTypedef bind to the copy constructor instead of this templated constructor. IIRC this doesn't cover std::initializer_list constructors, so you would need to make forward constructors if you wished to support forwarding those.

  • Consider adding a named getter member function (such as .get() or .ref()) that returns a (const) reference as well as the conversion operator. It is useful to have a short name when you need access to the underlying value.
  • Consider adding a typedef (such as value_type) inside SemanticTypedef for the underlying type (and I would have a tag_type for good measure). You know what it is, and users shouldn't have to go through hoops to get it. Heck, UnderlyingType and TagType shouldn't have to go through hoops to get it.
  • Consider eliminating Sortable. Types which only define operator< are weird, and we shouldn't be encouraging weird types.
  • Consider adding heterogeneous operations with SemanticTypedef<Tag, T> and T (in either order). For instance, one shouldn't have to construct a SemanticTypedef<Tag, string> if they just want to compare a string with a SemanticTypedef<Tag, string>.
  • Consider defining all operators except assignment as friends. Anthony Williams's blog post on hidden friends explains why.