This is an archive of the discontinued LLVM Phabricator instance.

Arrays of unknown bound in constant expressions
ClosedPublic

Authored by Arcoth on Apr 21 2017, 2:01 PM.

Details

Reviewers
rsmith
Summary

Arrays of unknown bound are subject to some bugs in constant expressions.

const extern int arr[];
constexpr int f(int i) {return arr[i];}
constexpr int arr[] {1, 2, 3};
int main() {constexpr int x = f(2);}

This is spuriously rejected. On the other hand,

extern const int arr[];
constexpr int const* p = arr + 2;

compiles. The standard will presumably incorporate a rule that forbids array-to-pointer conversions on arrays of unknown bound (unless they are completed at the point of evaluation, as in the first
example). The proposed changes reflect this idea.

Event Timeline

Arcoth created this revision.Apr 21 2017, 2:01 PM
rsmith edited edge metadata.Apr 21 2017, 2:30 PM

This needs testcases (the one from your summary plus the ones in my comments above would be good).

lib/AST/ExprConstant.cpp
2636

I think this should be named ArrayBound or similar.

2639–2640

Use line comments, start with capital letter, end with period, please.

2641

This will not be the correct bound in general. This field is the array size of the *innermost* non-base-class subobject described by the designator, not the outermost one.

You could compute the correct bound here by going back to the ValueDecl* in the CompleteObject and checking its most recent declaration, but I think it would be preferable to do that when computing the type in findCompleteObject instead.

(It might seem possible to map to the most recent VarDecl when evaluating the DeclRefExpr instead, but that actually won't work in general, for cases like:

extern int arr[];
constexpr int *p = arr; // computes APValue referring to first declaration of 'arr'
int arr[3];
constexpr int *q = p + 1; // should be accepted

... but findCompleteObject will happen late enough.)

5661–5662

Per LLVM coding style, else goes on the same line as }.

5672–5678

We should produce a CCEDiag along this path too. (This happens for VLAs.)

5675

Comments should start with a capital letter.

5679–5682

Please suppress this diagnostic when Info.checkingPotentialConstantExpression(), since the expression is still potentially a constant expression. Example:

extern int arr[];
constexpr int *f() { return arr + 3; }
int arr[5];
constexpr int *p = f();

Here, when we check that the body of f() is valid, we need to ignore the fact that we don't know the array bound, since it could become known later.

Arcoth updated this revision to Diff 96594.Apr 25 2017, 10:21 AM
Arcoth marked 6 inline comments as done.

Adjusted.

Implemented the changes suggested by Richard. The array-to-pointer decay
is now well-formed; the check was instead moved to the pointer
arithmetic facility.

Arcoth updated this revision to Diff 96603.Apr 25 2017, 10:53 AM

Small fix.

The change in direction from diagnosing the lvalue-to-rvalue conversion to diagnosing the pointer arithmetic seems fine to me (and is likely a better approach overall), but this means we should now treat a designator referring to element 0 of an array of unknown / runtime bound as being valid. We have minimal support for this already for __builtin_object_size evaluation, which you should be able to generalize to support arbitrary designators ending with such a value.

lib/AST/ExprConstant.cpp
2970

This path fails without producing a diagnostic (a default-constructed CompleteObject() is an error return). I think we should instead treat this as a valid CompleteObject with a type that simply can't be further decomposed.

5570–5578

I think this should be handled in SubobjectDesignator::adjustIndex instead; there are other ways to get to the pointer arithmetic logic (such as array indexing and the implicit indexing we do in some builtins).

5680–5681

We should never set a designator invalid without issuing a diagnostic. If you want to defer the diagnostic until pointer arithmetic happens, you need to be able to represent that situation in a valid designator. Perhaps generalizing the existing support for isMostDerivedAnUnsizedArray would be a path forward.

Arcoth updated this revision to Diff 97105.EditedApr 28 2017, 9:00 AM

Updated SubobjectDesignator to hold MostDerivedIsAnUnsizedArray instead of FirstEntryIsAnUnsizedArray. Consequently, we cannot know whether the first element is an unsized array; please see line 7352 in particular, where the check for isMostDerivedAnUnsizedArrayhad to be removed (is this fine? All regression tests pass), due to a testcase in CogeGen/alloc-size.c:

void *my_malloc(size_t) __attribute__((alloc_size(1)));

struct Data {
  int t[10];
};

void test5() {
  Data *const data = (Data*)my_malloc(sizeof(*data));
  __builtin_object_size(&data->t[1], 1);
}

The first entry in the Designator for &data->t[1] is an unsized array, since data points to what has been allocated via an allocation function, but the most derived object is not a member of an unsized array; it's a member of t. I.e. my change is not a strict extension.

Furthermore, I addressed the previous review's comments. No invalid designators are returned in the array-to-pointer decay case anymore, we simply use addUnsizedArray instead. This also enables stuff like arr + 0 out of the box.

Arcoth updated this revision to Diff 97114.Apr 28 2017, 10:09 AM

Simplified array-to-pointer conversion (simply add the array as unsized; let findCompleteObject and the SubobjectDesignator ctor do the work).

Arcoth updated this revision to Diff 97116.Apr 28 2017, 10:12 AM

(Just fixed a slip after typing with the wrong window in focus...)

Thanks, this looks good, just a couple of minor things and then it should be ready to land.

Do you have commit access or will you need someone else to commit this for you?

lib/AST/ExprConstant.cpp
152

Please start variable names with an uppercase letter.

168

The other 'most derived' paths through here should set this back to false (a sized array in a field in an element of an unsized array does not give an unsized array).

1301

We should call checkSubobject here, for cases like:

void f(int n) {
  int arr[2][n];
  constexpr int *p = &arr[2][0];
}

... which we should reject because arr[2] is a one-past-the-end lvalue, so cannot be indexed (we reject this if n is instead a constant).

Arcoth updated this revision to Diff 97180.Apr 28 2017, 6:40 PM

Applied the last review's suggestions.

rsmith accepted this revision.May 1 2017, 12:03 PM

Committed as r301822.

lib/AST/ExprConstant.cpp
1301

Turns out this case is already failing before we get here: forming arr[2] requires determining the size of the array element type int[n], which fails.

This revision is now accepted and ready to land.May 1 2017, 12:03 PM
rsmith closed this revision.May 1 2017, 12:03 PM