Page MenuHomePhabricator

[clang] NFC: change uses of `Expr->getValueKind` into `is?Value`
ClosedPublic

Authored by mizvekov on Apr 18 2021, 1:40 PM.

Details

Summary

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Diff Detail

Event Timeline

mizvekov created this revision.Apr 18 2021, 1:40 PM
mizvekov requested review of this revision.Apr 18 2021, 1:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2021, 1:40 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

The change seems to be correct, but I'm wondering if x.getValueKind() == VK_*Value doesn't have one advantage over x.is*Value(): it's obvious that this is exclusive with the other values. Especially with isRValue() it might not be so obvious, because Clang doesn't follow the C++11 terminology with this.

But it's admittedly shorter, so I'd be willing to approve this.

clang/lib/Sema/SemaExpr.cpp
5555

There might be a certain benefit to using LHSExp->getValueKind() above when we use it here again: that makes it more obvious what we're trying to achieve in that if. (Namely changing the value category.)

The change seems to be correct, but I'm wondering if x.getValueKind() == VK_*Value doesn't have one advantage over x.is*Value(): it's obvious that this is exclusive with the other values. Especially with isRValue() it might not be so obvious, because Clang doesn't follow the C++11 terminology with this.

But it's admittedly shorter, so I'd be willing to approve this.

This came up in a patch where I am experimenting with a new value category.
The helpers 'help' a lot more when you are changing some of these tests from testing just one category, to testing a combination of categories (like GLValue).

But this is just the easy pickings of the patch, which is still WIP, and I may need in the future for some more drastic change to deal with the code that just stores the kind in a variable and tests that later.

It would be useful if you could do:

VK = Expr->getValueKind();
if (VK.isGLValue()) {

I may need to do something like that later.

clang/lib/Sema/SemaExpr.cpp
5555

Or just making the same kind of change here again:

if (!LHSExp->isRValue())
  OK = OK_VectorComponent;
VK = LHSExp->getValueKind();

@aaron.ballman, what do you think about this? We can't really prevent anyone from writing .getValueKind() == VK_*Value instead of .is*Value(), so this change will make things consistent only for now.

The change seems to be correct, but I'm wondering if x.getValueKind() == VK_*Value doesn't have one advantage over x.is*Value(): it's obvious that this is exclusive with the other values. Especially with isRValue() it might not be so obvious, because Clang doesn't follow the C++11 terminology with this.

But it's admittedly shorter, so I'd be willing to approve this.

This came up in a patch where I am experimenting with a new value category.

Not sure how to feel about this, the value categories are already hard to grasp for most C++ programmers. To solve the implicit move concerns with a new value category seems like cracking a nut with a sledgehammer. But that's not directly related to this change, we can discuss this when the change is there.

@aaron.ballman, what do you think about this? We can't really prevent anyone from writing .getValueKind() == VK_*Value instead of .is*Value(), so this change will make things consistent only for now.

I'm not certain this is a big win in all cases, especially because the standard has one set of terminology and we use another. e.g., the standard makes a distinction between rvalue and prvalue that we don't make, same for lvalue and glvalue. When I see the equality expression, I know that only one category is being tested, but when I see the function call expression, I have to think more about "does isRValue() return true for a prvalue and an xvalue, or just a prvalue?"

mizvekov added a comment.EditedMay 14 2021, 3:07 PM

Not sure how to feel about this, the value categories are already hard to grasp for most C++ programmers. To solve the implicit move concerns with a new value category seems like cracking a nut with a sledgehammer. But that's not directly related to this change, we can discuss this when the change is there.

Yeah this is unrelated to this patch, but hear me out here, I agree with all you said about the complexity of value categories, but:

  • I still think a new value category is simpler than the two phase overload resolution it replaces :D
  • This new value category would (so far as this is only an experiment I am running), only apply to standards older than c++23, where from then on it's just the current categories and the rules as in P2266.
  • Even in the old standards it replaces, it does not introduce any drastic changes. This new category is the same as XValues as far as overload resolution is concerned (with the small addition that it binds less preferentially to non-const l-value references), and as far as type deduction goes, it behaves exactly as LValues.

I'm not certain this is a big win in all cases, especially because the standard has one set of terminology and we use another. e.g., the standard makes a distinction between rvalue and prvalue that we don't make, same for lvalue and glvalue. When I see the equality expression, I know that only one category is being tested, but when I see the function call expression, I have to think more about "does isRValue() return true for a prvalue and an xvalue, or just a prvalue?"

I understand. But consider that it's not complicated right now:

bool isLValue() const { return getValueKind() == VK_LValue; }
bool isRValue() const { return getValueKind() == VK_RValue; }
bool isXValue() const { return getValueKind() == VK_XValue; }
bool isGLValue() const { return getValueKind() != VK_RValue; }

They are all testing exactly one enum, except the GLValue one. I understand however that I worked in very small subset of clang, this is as easy to remember as it gets for me, but this is not the same for others with different workloads. I am sure you have way more stuff to remember :)

I get @aaronpuchert 's point about how this only makes it consistent for now. I think we should do something about that and still make it more consistent.

How about another approach: we remove all the helpers that test exactly one enum (X, L and R, leaving only GL).
What do you think?

Another approach would be to document, or make more official, that the helpers that test a category with just one letter are just testing one enum value, and the ones with two or more letters are testing something more complex.

I don't think out-of-tree experiments on possibilities for move semantics are especially motivating for this, one way or the other, but I do think it would be nice to make some kind of change here.

What do we think about renaming isRValue() to isPRValue() and renaming VK_RValue to VK_PRValue, adding a "real" isRValue(), and then performing this cleanup? I think the current state of treating "rvalue" as sometimes meaning rvalue and sometimes meaning "prvalue" is unhelpful and confusing, and this change makes it worse because isRValue *could* be checking for an rvalue whereas a comparison against VK_RValue more clearly is only looking for one specific value category rather than two.

What do we think about renaming isRValue() to isPRValue() and renaming VK_RValue to VK_PRValue, adding a "real" isRValue(), and then performing this cleanup? I think the current state of treating "rvalue" as sometimes meaning rvalue and sometimes meaning "prvalue" is unhelpful and confusing, and this change makes it worse because isRValue *could* be checking for an rvalue whereas a comparison against VK_RValue more clearly is only looking for one specific value category rather than two.

Sounds good. For some reason I thought this would be more controversial...

Is there any special reason we ended up in this state of affairs with the swapped out names?

  • I still think a new value category is simpler than the two phase overload resolution it replaces :D

Not sure if it's simpler, but it'll produce less confusing results.

  • This new value category would (so far as this is only an experiment I am running), only apply to standards older than c++23, where from then on it's just the current categories and the rules as in P2266.
  • Even in the old standards it replaces, it does not introduce any drastic changes. This new category is the same as XValues as far as overload resolution is concerned (with the small addition that it binds less preferentially to non-const l-value references), and as far as type deduction goes, it behaves exactly as LValues.

A new value category feels like a global change for a local problem. We can explain the behavior that we want without introducing a new value category: either as a “combined” overload resolution for xvalue and lvalue, where all candidates for the xvalue are strictly better than any candidate for the lvalue, or as a two-phase resolution that falls back only if there are no candidates. These explanations are equivalent.

What do we think about renaming isRValue() to isPRValue() and renaming VK_RValue to VK_PRValue, adding a "real" isRValue(), and then performing this cleanup?

One problem might be that C and C++ have different terminology, right? I like the idea of going with prvalue instead of rvalue, then it's the right thing for C++ and if you're coming from the C world it's at least not confusing. But perhaps I would not introduce for now an isRValue function. (Although C doesn't have xvalues, so maybe it doesn't matter?)

Is there any special reason we ended up in this state of affairs with the swapped out names?

My impression is that some people didn't like the changes that came with C++11 and that it made the terminology of C and C++ somewhat inconsistent. (Though I think you can work with the C++ terminology in C, there are just no xvalues, but you can't work the other way around.)

I created a DR which proposes the renaming as rsmith suggested: https://reviews.llvm.org/D103720

A new value category feels like a global change for a local problem. We can explain the behavior that we want without introducing a new value category: either as a “combined” overload resolution for xvalue and lvalue, where all candidates for the xvalue are strictly better than any candidate for the lvalue, or as a two-phase resolution that falls back only if there are no candidates. These explanations are equivalent.

Sure, I had it in mind that is not necessarily how that would be explained to the users, but perhaps as the standard text is more geared towards the toolchain folks, the original explanation might be more suited there, and this alternative one is more didactic, as you pointed out it can be explained just in the context of implicit moves.

My impression is that some people didn't like the changes that came with C++11 and that it made the terminology of C and C++ somewhat inconsistent. (Though I think you can work with the C++ terminology in C, there are just no xvalues, but you can't work the other way around.)

Thanks. That makes sense!

But perhaps I would not introduce for now an isRValue function. (Although C doesn't have xvalues, so maybe it doesn't matter?)

What do we think about renaming isRValue() to isPRValue() and renaming VK_RValue to VK_PRValue, adding a "real" isRValue(), and then performing this cleanup?

We might want to hold on to the second step there (adding back another isRValue) for a little while, to give time for people to rebase patches and such, so they get a friendlier build breakage instead of some more subtle issue.

mizvekov updated this revision to Diff 351123.Jun 10 2021, 3:58 AM

Rebase.

Now that we have isPRValue, this is hopefully less controversial now :)

Also by the way D100713 is not really a dependency here, this DR can land on it's own if that is not clear.

rsmith accepted this revision.Jul 27 2021, 5:03 PM
This revision is now accepted and ready to land.Jul 27 2021, 5:03 PM