This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Don't apply an lvalue-to-rvalue conversion to a discarded-value expression if it has an array type
ClosedPublic

Authored by ahatanak on Apr 14 2020, 10:46 AM.

Details

Summary

This fixes a bug which causes clang to emit incorrect IR for the following code in C++:

int foo(void) {
  unsigned a = get_size();
  volatile char b[a];
  b;
  return b[0];
}

The code crashes because the destination of the memcpy that is generated to do the lvalue-to-rvalue conversion of discarded-value expression b is incorrect. To fix this bug, I've added a check to Sema which prevents an lvalue cast expression to be emitted if the expression has an array type. My understanding is that lvalue-to-rvalue conversion isn't applied to array type expressions, so I think we should detect this in Sema rather than fixing the code in IRGen that emits the lvalue for the destination of the lvalue-to-rvalue conversion. I've also made changes to Sema::DiagnoseUnusedExprResult so that clang doesn't tell users to assign a reference to a volatile array to a variable to force its value to be loaded.

Diff Detail

Event Timeline

ahatanak created this revision.Apr 14 2020, 10:46 AM

C++ says:

[expr.context]p2:

In some contexts, an expression only appears for its side effects. Such an expression is called a discarded-value expression. The array-to-pointer (7.3.2) and function-to-pointer (7.3.3) standard conversions are not applied. The lvalue-to-rvalue conversion (7.3.1) is applied if and only if the expression is a glvalue of volatile-qualified type and it is one of the following:

  • (list of special forms)

[conv.lval]p1:

A glvalue of a non-function, non-array type T can be converted to a prvalue.

So I think the right fix is probably to make sure that Sema::DefaultLvalueConversion ignores gl-values of array type.

My first patch just did that and it seems to work. But the comment for DefaultLvalueConversion says "This is DefaultFunctionArrayLvalueConversion, except that it assumes the operand isn't of function or array type". It doesn't even have assertions to check that though.

That doesn't seem to be how we use it in practice; I'd just change the comment.

It seems to me that there's as bug in the C++ specification here. The rules for discarded-value expressions say the lvalue-to-rvalue conversion is applied, whereas the rules for lvalue-to-rvalue conversions say one cannot be applied in this case. It's hard to guess what the correct behavior is (do we copy from the array or not?). But the rules do explicitly say "The array-to-pointer (7.3.2) and function-to-pointer (7.3.3) standard conversions are not applied." and it's unclear what effect that would have other than to cause a volatile copy of the entire array in a case such as this. I'm going to take this to CWG.

I'm going to take this to CWG.

So far, the direction the wind is blowing is that attempting to perform an lvalue-to-rvalue conversion on an array should be a no-op. (That's weirdly different from the behavior on a class, but so be it.)

I'm going to take this to CWG.

So far, the direction the wind is blowing is that attempting to perform an lvalue-to-rvalue conversion on an array should be a no-op. (That's weirdly different from the behavior on a class, but so be it.)

Okay. It's probably easiest and most correct to do that check in DefaultLvalueConversion, since otherwise callers would also have to do placeholder conversions. We use DefaultLvalueConversion a lot in Sema, apparently mostly in places where subsequent type restrictions disallow both arrays and decayed arrays; I'll admit to running out of energy after checking the first half-dozen matches, though.

I'm going to take this to CWG.

So far, the direction the wind is blowing is that attempting to perform an lvalue-to-rvalue conversion on an array should be a no-op. (That's weirdly different from the behavior on a class, but so be it.)

Is this the final decision? Or should I wait until we know for sure that this is a no-op and doesn't cause an lvalue-to-rvalue conversion?

So far, the direction the wind is blowing is that attempting to perform an lvalue-to-rvalue conversion on an array should be a no-op. (That's weirdly different from the behavior on a class, but so be it.)

Is this the final decision? Or should I wait until we know for sure that this is a no-op and doesn't cause an lvalue-to-rvalue conversion?

We won't know for months, but it seems reasonable to assume that this will be the final decision.

ahatanak updated this revision to Diff 259081.Apr 21 2020, 12:37 PM

Make Sema::DefaultLvalueConversion ignore gl-values of array type

ahatanak updated this revision to Diff 263309.May 11 2020, 4:52 PM

Rebase and ping.

I audited the calls to DefaultLvalueConversion. As far as I can tell, either an array type is never passed to the function call or there are checks that reject array types after the function is called, so the changes in this patch seem safe to me.

rsmith accepted this revision.May 11 2020, 6:48 PM
rsmith added inline comments.
clang/lib/Sema/SemaExpr.cpp
601–603

Given this comment, should we be checking for function types below too?

This revision is now accepted and ready to land.May 11 2020, 6:48 PM
rjmccall accepted this revision.May 11 2020, 7:03 PM

Alright, let's go with this.

ahatanak updated this revision to Diff 263792.May 13 2020, 11:01 AM
ahatanak marked an inline comment as done.

Check function types in DefaultLvalueConversion.

rjmccall accepted this revision.May 13 2020, 11:41 AM
This revision was automatically updated to reflect the committed changes.