This is an archive of the discontinued LLVM Phabricator instance.

[ADT] Add LLVM_MARK_AS_BITMASK_ENUM, used to enable bitwise operations on enums without static_cast.
ClosedPublic

Authored by jlebar on Jul 12 2016, 2:56 PM.

Details

Summary

Normally when you do a bitwise operation on an enum value, you
get back an instance of the underlying type (e.g. int). But using this
macro, bitwise ops on your enum will return you back instances of the
enum. This is particularly useful for enums which represent a
combination of flags.

Suppose you have a function which takes an int and a set of flags. One
way to do this would be to take two numeric params:

enum SomeFlags { F1 = 1, F2 = 2, F3 = 4, ... };
void Fn(int Num, int Flags);

void foo() {
  Fn(42, F2 | F3);
}

But now if you get the order of arguments wrong, you won't get an error.

You might try to fix this by changing the signature of Fn so it accepts
a SomeFlags arg:

enum SomeFlags { F1 = 1, F2 = 2, F3 = 4, ... };
void Fn(int Num, SomeFlags Flags);

void foo() {
  Fn(42, static_cast<SomeFlags>(F2 | F3));
}

But now we need a static cast after doing "F2 | F3" because the result
of that computation is the enum's underlying type.

This patch adds a mechanism which gives us the safety of the second
approach with the brevity of the first.

enum SomeFlags {
  F1 = 1, F2 = 2, F3 = 4, ..., F_MAX = 128,
  LLVM_MARK_AS_BITMASK_ENUM(F_MAX)
};

void Fn(int Num, SomeFlags Flags);

void foo() {
  Fn(42, F2 | F3);  // No static_cast.
}

The LLVM_MARK_AS_BITMASK_ENUM macro enables overloads for bitwise
operators on SomeFlags. Critically, these operators return the enum
type, not its underlying type, so you don't need any static_casts.

An advantage of this solution over the previously-proposed BitMask class
[0, 1] is that we don't need any wrapper classes -- we can operate
directly on the enum itself.

The approach here is somewhat similar to OpenOffice's typed_flags_set
[2]. But we skirt the need for a wrapper class (and a good deal of
complexity) by judicious use of enable_if. We SFINAE on the presence of
a particular enumerator (added by the LLVM_MARK_AS_BITMASK_ENUM macro)
instead of using a traits class so that it's impossible to use the enum
before the overloads are present. The solution here also seamlessly
works across multiple namespaces.

[0] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150622/283369.html
[1] http://lists.llvm.org/pipermail/llvm-commits/attachments/20150623/073434b6/attachment.obj
[2] https://cgit.freedesktop.org/libreoffice/core/tree/include/o3tl/typed_flags_set.hxx

Diff Detail

Event Timeline

jlebar updated this revision to Diff 63731.Jul 12 2016, 2:56 PM
jlebar retitled this revision from to [ADT] Add FlagsEnum, used to enable bitwise operations on enums without static_cast..
jlebar updated this object.
jlebar added reviewers: chandlerc, rsmith.
jlebar added a subscriber: llvm-commits.
jlebar removed a reviewer: sberg.
jlebar added subscribers: sberg, compnerd.
jlebar updated this revision to Diff 63735.Jul 12 2016, 3:08 PM

Get rid of some unnecessary namespace qualifications.

chandlerc edited edge metadata.Jul 12 2016, 3:59 PM

This patch has no detailed descrpition. Can you add some context, such as problem trying to solve, link to discussions, and maybe why this approach is better than the other two proposed? (looking at the code now, but just wanted to make the initial comment)

jlebar updated this object.Jul 12 2016, 4:15 PM
jlebar edited edge metadata.
jlebar updated this object.
jlebar updated this object.
chandlerc added inline comments.Jul 12 2016, 4:38 PM
include/llvm/ADT/FlagsEnum.h
21 ↗(On Diff #63735)

Doxygen?

Is "flags" more or less descriptive than "bitset"?

39–41 ↗(On Diff #63735)

This doesn't actually work AFAIK.

In C++11 we still have to specialize templates in the namespace they are declared.

So, either this needs to instead be two macros, one in the llvm namespace and one in the namespace of the enum, or we need something much, much more crafty.

I have a terrible idea. Instead of specializing a template, we can declare an overload of a function in the same namespace as the enum which accepts an enum as the argument and returns a reference to an array of characters the size of the largest value. Then you can call this function using unqualified lookup and ADL will find the one in the enum's namespace, and sizeof that call will tell you the largest value.

Too clever?

61 ↗(On Diff #63735)

Part of this can be a static_assert.

62 ↗(On Diff #63735)

You could pre-compute the NextPowerOf2 bit as a constexpr and use it both in the static_assert above, here, and below. Maybe compute it as the mask and use <=?

68 ↗(On Diff #63735)

I would make two traits classes I think. One which is just a predicate and uses the idiomatic inheritance from true_type, and one which has the computed values etc.

I would also mildly prefer using std::enable_if on the return type rather than the template argument technique where we can. It ends up being a bit shorter IMO.

jlebar updated this revision to Diff 63761.Jul 12 2016, 6:05 PM
jlebar marked 5 inline comments as done.

Address chandlerc's comments.

include/llvm/ADT/FlagsEnum.h
21 ↗(On Diff #63735)

I considered this, but preferred "flags" over "bitset", because "bitset" sounds to me like a class that's like vector<bool>, where I can set bit N, where N is a runtime variable. Not unlike std::bitset.

I added some doxygen bits. Wanted to check to see how it's rendered, but it's spun 20 minutes so far, so I now have difficulty caring...

39–41 ↗(On Diff #63735)

I like the idea; it's pretty clean, aside from requiring a long name on the function we declare (since we'll be polluting others' namespaces with it).

61 ↗(On Diff #63735)

Beg pardon, but which part? This may be resolved now that I've rejiggered things.

68 ↗(On Diff #63735)

I added a new traits class, but left the enable_if logic in the template for now. I'll change it if you want, but compare

template <typename E, bool = IsFlagsEnum<E>::value>
E &operator|=(E &Lhs, E Rhs) {

to

template <typename E>
typename std::enable_if<IsFlagsEnum<E>::value, E &>::type operator|=(E &Lhs,
                                                                     E Rhs) {
jlebar updated this revision to Diff 63764.Jul 12 2016, 6:30 PM

Get rid of constexpr.

jlebar updated this revision to Diff 63765.Jul 12 2016, 6:34 PM

...and add missing inline now that we no longer have constexpr.

chandlerc added inline comments.Jul 12 2016, 6:41 PM
include/llvm/ADT/FlagsEnum.h
22 ↗(On Diff #63761)

My concern is that this is useful when the enum does not describe flags per-se but where we would like the bit set operations to Just Work. We have a lot of these in LLVM, typically due to lattices in analyses. I feel like flags will be confusing in that context.

I hear your concern that it is a fixed set of bits. Suggestions on a more generic name?

64 ↗(On Diff #63761)

I would name this in the STL's style (we do that a lot but on a completely ad-hoc basis in ADT when something "looks like" an STL-style thing. In this case, this looks like a type trait so I'd use the type trait style.

69 ↗(On Diff #63761)

Your suggested syntax is actually my personal preference to write, however I remembered that Clang has special, very nice diagnostics for enable_if patterns that won't fire with it and will fire with the second version. =/

I guess I would suggest:

template <typename E,
          typename = typename std::enable_if<IsFlagsEnum<E>::value>::type>
E &operator|=(E &LHS, E &RHS) {

It's a lot more typing, but Richard Smith and I are conspiring on a way to make this much more brief.

I would request 'RHS' and 'LHS' instead of 'Rhs' and 'Lhs' here and elsewhere.

jlebar updated this revision to Diff 63767.Jul 12 2016, 6:49 PM
jlebar marked 5 inline comments as done.

Use different SFINAE pattern.

include/llvm/ADT/FlagsEnum.h
23 ↗(On Diff #63765)

BitwiseEnum, BitwiseEnabledEnum?

Really the only name that's part of the API is that of the macro. ENABLE_BITWISE_OPS_ON_ENUM?

70 ↗(On Diff #63765)

sgtm. If we do it this way we can also inherit from false_type in the base case, which is nice.

jlebar marked 2 inline comments as done.Jul 12 2016, 6:49 PM
jlebar updated this revision to Diff 63768.Jul 12 2016, 6:51 PM

Change capitalization of Lhs/Rhs.

(important bit at the bottom)

include/llvm/ADT/FlagsEnum.h
24 ↗(On Diff #63768)

At a high level, these things are "bitmask enums" based on the terminology used in the standard (it talks about bitmask types).

After a long, and protracted discussion about the rest of the name, Justin, Richard, and myself realized that the challenge of naming this also pointed at the challenge of communicating *what it does*. The fact that you put this outside the class definition, and between the enum and the macro, the behavior of these operators is *different* is really confusing and subtle.

We really wanted a way to do this so that the moment you declare the enum, things work the way you expect. Sadly, this is pretty much impossible to do in general. Of the things we have to sacrifice, the least-bad sacrifice seems to be name lookup perfection. So our idea was to give up perfect ADL-based lookup of using declarations of the operator overloads, and instead are thinking to just define them in the 'llvm' namespace, provide using declarations in other common project namespaces like 'lld' and 'clang' via a macro and the LLVM.h headers in those projects, and then trigger this behavior in a much simpler way.

The idea is then to just declare a special *enumerator* that triggers these operator overloads to work. This has the advantage of forcing it to reside *inside* the enum's definition, working correctly with both enums nested within classes and those not nested within classes, and not requiring any interesting code to derive from a macro expansion (mucking up diagnostics and such).

The result with hypothetical names would look like this when used:

enum E {
  ...,
  MyLargest = 1 << N,

  MARK_AS_BITMASK_ENUM(MyLargest)
};

And would be defined something along the lines of:

#define MARK_AS_BITMASK_ENUM(Largest) \
    LLVM_BITMASK_LARGEST_ENUMERATOR = Largest

#define EXPAND_BITMASK_ENUM_USING_DECLS \
    using ::llvm::operator?; \
    ...

namespace llvm {
  template <typename E, typename Enabled = void> struct is_bitmask_enum
      : false_type {};
  template <typename E>
  struct is_bitmask_enum<E, std::enable_if<std::is_enum<decltype(
      E::LLVM_BITMASK_LARGEST_ENUMERATOR)>::value>::type> : true_type {};

  ...
}
59–60 ↗(On Diff #63768)

The ADL part might make more sense below where we actually do the specialization as an implementation comment.

As part of splitting that out, I'd write a (super short) doxygen comment for the trait here.

70–71 ↗(On Diff #63768)

Doxygen here as well.

73–74 ↗(On Diff #63768)

We don't use braced init syntax like this really anywhere in LLVM, I'd use a more common syntax for writing this or just write it more in prose ("returns zero with the type uint64_t, so ...").

jlebar updated this revision to Diff 63815.Jul 13 2016, 9:33 AM
jlebar marked 3 inline comments as done.

Update to newest methodology.

I may be missing some important subtlety of C++ name lookup, but I think we don't actually need these shenanigans with the usings if we just declare our operator overloads in the global namespace. So I've done that in this version of the patch.

I've also added tests for enum class.

jlebar updated this object.Jul 13 2016, 9:36 AM
jlebar updated this object.Jul 13 2016, 9:46 AM
jlebar updated this revision to Diff 63817.Jul 13 2016, 9:47 AM

Add test for enum inside a class.

Oh, I understand why we have to use ADL to pick up these operator overloads. As written, if anyone declares any overload of, say, operator|, in the global namespace, then our overloads won't be found.

I will change this again. Maybe finally it'll be right...

jlebar updated this revision to Diff 63823.Jul 13 2016, 10:38 AM

Use ADL, sigh.

chandlerc accepted this revision.Jul 13 2016, 11:18 AM
chandlerc edited edge metadata.

Nit picks below, but LGTM. I think we should ship this. We can iterate on it further if folks want in post-commit, this already seems really nice.

include/llvm/ADT/BitmaskEnum.h
19

We have auto-brief, so no '\brief' needed.

40

As mentioned in IRC, maybe put the last sentence in its own paragraph so it stands out a bit more.

67–75

Any particular reason to not provide the is_bitmask_enum in the llvm namespace for general use?

This revision is now accepted and ready to land.Jul 13 2016, 11:18 AM
jlebar retitled this revision from [ADT] Add FlagsEnum, used to enable bitwise operations on enums without static_cast. to [ADT] Add LLVM_MARK_AS_BITMASK_ENUM, used to enable bitwise operations on enums without static_cast..Jul 13 2016, 11:20 AM
jlebar edited edge metadata.
This revision was automatically updated to reflect the committed changes.
jlebar marked 3 inline comments as done.
rnk added a subscriber: rnk.Jul 13 2016, 11:33 AM

It sounds like we should use this instead of CV_DEFINE_ENUM_CLASS_FLAGS_OPERATORS in include/llvm/DebugInfo/CodeView/CodeView.h

rnk added a subscriber: zturner.Jul 13 2016, 11:34 AM