This is an archive of the discontinued LLVM Phabricator instance.

"Fancy" ARBs
Needs ReviewPublic

Authored by hfinkel on May 12 2015, 1:03 PM.

Details

Summary

When ARBs (arrays of runtime bound) were discussed in Lenexa last week, the idea that gained consensus to move forward was a "magic" class, one that could only be constructed on the stack, perhaps using an ARB(-like) syntax. We did not have a concrete proposal for such a facility, nor are there any implementations that follow that model. However, there is not consensus for ARBs as-is, nor particularly for dynarray, and some hybrid seems like the only viable path forward (aside from doing nothing). And I'd prefer we do something...

This patch implements a mode in Clang for "fancy" ARBs, controlled by a feature flag: -ffancy-arbs -- While still rough, I'd like early feedback on this. The basic idea is that declaring:

int x[n]

where n is not a constant, not only creates a stack array to hold n objects of type int, but also wraps that array in a container-like class. Here I've named it std::arb (I know this makes it a non-conforming extension; we can certainly name it something else), and it is kind of like std::initializer_list on steroids (it looks more like an array wrapper, has more of the standard typedefs, is convertable to a pointer type, etc.).

The implementation wraps the VarDecl creation and causes two variables to be declared: an internal variable (x.ARB) that actually manages the storage and object lifetimes, and an std::arb wrapper (which gets the declared name x), through which the surrounding code interacts with the storage. Some plumbing changes in Sema allow for the capturing of this additional declaration so it can be added to the DeclGroup with the primary one.

One of the design requirement of the design is that the class be "magic" (unable to be created manually), and to meet this requirement, the class is non-copyable, non-movable, and has no public constructors. A new "magic" init type is added, like CallInit, but bypassing access control checks, to cause the ARB syntax to trigger construction of std::arb via an otherwise inaccessible private constructor. I'm not sure if, from a standardization perspective, this access-control-overriding is necessary. If we don't do it, any code that uses the constructor would be implementation-specific and have UB.

This should, hopefully, satisfy the consensus requirements from EWG, while matching the current syntax for ARBs. While you can pass x to a function requiring int*, x will also support begin(), end(), size(), etc. and can be treated like a container. This using (pointer,size) on interface boundaries is not required (which is part of the reason I elected to make it a type with a "standardized" name).

Currently, the definition of std::arb is in an internal header that is included before normal source parsing. To limit expense, I did not include any features in std::arb that would be complicated to implement, or require pulling in other library headers (no reverse iterators, no at() because no exceptions, no fill()). An alternative design is to make the implementation more fully-featured but require the header be included in every TU in which ARBs are used.

The current implementation does not work with non-POD types because the underlying VLA support rejects them. That can be extended separately (from this patch), and then non-POD types should also work with this feature.

Please review.

Diff Detail

Event Timeline

hfinkel updated this revision to Diff 25609.May 12 2015, 1:03 PM
hfinkel retitled this revision from to "Fancy" ARBs.
hfinkel updated this object.
hfinkel edited the test plan for this revision. (Show Details)
hfinkel added a subscriber: Unknown Object (MLST).
rsmith edited edge metadata.May 12 2015, 1:27 PM

First off: I'm not happy about having this extension in upstream clang until we have a strong indication that this is the direction that will end up standardized. For now, I'd recommend maintaining this as a clang fork on github or similar. With that said, I'm going to review this as if for upstream clang.

I don't like that you create two variables here. We try to maintain as much source fidelity as we can, and I think we can do better here -- how about instead introducing a new form of expression that represents "stack-allocate a certain amount of memory" (with a subexpression for the initialization, if you allow these variables to have an initializer), much like MaterializeTemporaryExpr does for a SD_Automatic temporary, but parameterized by an expression specifying the array size? Then create just a single expression of the std::arb type, initialized by that expression.

You should also introduce a Type subclass to represent type sugar for the ARB type, so that we can model that int[n] desugars to std::arb<int> but should be pretty-printed as an array type.

First off: I'm not happy about having this extension in upstream clang until we have a strong indication that this is the direction that will end up standardized. For now, I'd recommend maintaining this as a clang fork on github or similar.

Will do.

With that said, I'm going to review this as if for upstream clang.

Thanks! (The point of doing this is to get feedback on implementation-related issues)

I don't like that you create two variables here. We try to maintain as much source fidelity as we can, and I think we can do better here -- how about instead introducing a new form of expression that represents "stack-allocate a certain amount of memory" (with a subexpression for the initialization, if you allow these variables to have an initializer), much like MaterializeTemporaryExpr does for a SD_Automatic temporary, but parameterized by an expression specifying the array size? Then create just a single expression of the std::arb type, initialized by that expression.

You should also introduce a Type subclass to represent type sugar for the ARB type, so that we can model that int[n] desugars to std::arb<int> but should be pretty-printed as an array type.

That makes sense (and this is very similar to how we currently handle std::initializer_list).

Any opinion on:

  1. Should we bother with the access-control-overriding init type? We could just make the constructor public but make it UB if the user calls it (it would be implementation-specific anyhow).
  2. Should I make std::arb manage the lifetime of the objects directly? If it just takes a special allocation expression maybe that's more natural? I'd like to not force extra work for POD types (but I imagine I could use some enable_if/is_pod logic to leave PODs uninitialized).
  3. The automatically-included header (or similar) with a simpler class, or just requiring the header and making the class more fully-featured?

Thanks again!

First off: I'm not happy about having this extension in upstream clang until we have a strong indication that this is the direction that will end up standardized. For now, I'd recommend maintaining this as a clang fork on github or similar.

Will do.

With that said, I'm going to review this as if for upstream clang.

Thanks! (The point of doing this is to get feedback on implementation-related issues)

I don't like that you create two variables here. We try to maintain as much source fidelity as we can, and I think we can do better here -- how about instead introducing a new form of expression that represents "stack-allocate a certain amount of memory" (with a subexpression for the initialization, if you allow these variables to have an initializer), much like MaterializeTemporaryExpr does for a SD_Automatic temporary, but parameterized by an expression specifying the array size? Then create just a single expression of the std::arb type, initialized by that expression.

You should also introduce a Type subclass to represent type sugar for the ARB type, so that we can model that int[n] desugars to std::arb<int> but should be pretty-printed as an array type.

That makes sense (and this is very similar to how we currently handle std::initializer_list).

Any opinion on:

  1. Should we bother with the access-control-overriding init type? We could just make the constructor public but make it UB if the user calls it (it would be implementation-specific anyhow).
  2. Should I make std::arb manage the lifetime of the objects directly? If it just takes a special allocation expression maybe that's more natural? I'd like to not force extra work for POD types (but I imagine I could use some enable_if/is_pod logic to leave PODs uninitialized).

Or to put it another way, should the aforementioned new "stack-allocate a certain amount of memory" expression also be responsible for calling the constructors of non-POD array elements and register calling their destructors as cleanup, or should that logic be embedded in std::arb?

  1. The automatically-included header (or similar) with a simpler class, or just requiring the header and making the class more fully-featured?

Thanks again!

aaron.ballman resigned from this revision.Oct 13 2015, 5:56 AM
aaron.ballman removed a reviewer: aaron.ballman.