This is an archive of the discontinued LLVM Phabricator instance.

[clang][ConstExprEmitter] handle ArrayToPointerDecay ImplicitCastExpr of StringLiterals
ClosedPublic

Authored by nickdesaulniers on Jul 24 2023, 4:10 PM.

Details

Summary

Consider the following statement:

const char* foo = "foo";

For the sub-AST:

`-ImplicitCastExpr <col:19> 'const char *' <NoOp>
  `-ImplicitCastExpr <col:19> 'char *' <ArrayToPointerDecay>
    `-StringLiteral <col:19> 'char[4]' lvalue "foo"

The address of the StringLiteral can be emitted as the Constant.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJul 24 2023, 4:10 PM
nickdesaulniers requested review of this revision.Jul 24 2023, 4:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 24 2023, 4:10 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

We can do a whole bunch of these, but at some point we hit diminishing returns... bailing out here isn't *that* expensive. Do you have some idea how far you want to go with this? Adding a couple obvious checks like this isn't a big deal, but I don't want to build up a complete parallel infrastructure for scalar expressions here.

We can do a whole bunch of these, but at some point we hit diminishing returns... bailing out here isn't *that* expensive. Do you have some idea how far you want to go with this? Adding a couple obvious checks like this isn't a big deal, but I don't want to build up a complete parallel infrastructure for scalar expressions here.

My goal here is to land https://reviews.llvm.org/D76096. If I have to build a whole complete parallel infra to do so, I will. It might even be nice to start converging or deleting some of the duplication.

Some of these seem like very low hanging fruit, IMO, and probably still do speed up constant evaluation.

My (mis?)interpretation of https://reviews.llvm.org/D76096#4523828 was "find places where Evaluate() from clang/lib/AST/ExprConstant.cpp is called instead of the fast path very recently committed in D151587. I guess I lost focus on the "struct or array part."

For this case in this patch in particular:

extern const char* bar;
const char* foo [] = {
    "hello",
    "goodbye",
    bar
};
TranslationUnitDecl
|-VarDecl <line:1:1, col:20> col:20 used bar 'const char *' extern
`-VarDecl <line:2:1, line:6:1> line:2:13 foo 'const char *[3]' cinit
  `-InitListExpr <col:22, line:6:1> 'const char *[3]'
    |-ImplicitCastExpr <line:3:5> 'const char *' <NoOp>
    | `-ImplicitCastExpr <col:5> 'char *' <ArrayToPointerDecay>
    |   `-StringLiteral <col:5> 'char[6]' lvalue "hello"
    |-ImplicitCastExpr <line:4:5> 'const char *' <NoOp>
    | `-ImplicitCastExpr <col:5> 'char *' <ArrayToPointerDecay>
    |   `-StringLiteral <col:5> 'char[8]' lvalue "goodbye"
    `-ImplicitCastExpr <line:5:5> 'const char *' <LValueToRValue>
      `-DeclRefExpr <col:5> 'const char *' lvalue Var 0x55b64410c270 'bar' 'const char *'

so if you were concerned with arrays of string literals, we need this patch.

I guess I lost focus on the "struct or array part."

Right... scalars weren't the concern for D76096. They have much less overhead going through ExprConstant, and any evaluation of scalars isn't a regression because we currently don't have a codepath for that anyway.

Some of these seem like very low hanging fruit, IMO, and probably still do speed up constant evaluation.

Sure.

so if you were concerned with arrays of string literals, we need this patch.

Ohhh.... that's not what I meant by StringLiterals. I meant cases where the string literal is an rvalue, like char foo[100000] = "x";. If it's an lvalue, we go through ConstantLValueEmitter, which is still reasonably fast.

so if you were concerned with arrays of string literals, we need this patch.

Ohhh.... that's not what I meant by StringLiterals. I meant cases where the string literal is an rvalue, like char foo[100000] = "x";. If it's an lvalue, we go through ConstantLValueEmitter, which is still reasonably fast.

This patch is addressing the case of the string literal rvalue; see the commit description:

const char* foo = "foo";

No, that's taking the address of a string literal lvalue.

No, that's taking the address of a string literal lvalue.

ah! sorry, yeah
const char* foo = "foo";
vs
char foo[100000] = "x";

That AST looks like:

`-VarDecl 0x55eaf6075ba8 <x.c:1:1, col:20> col:6 foo 'char[100000]' cinit
  `-StringLiteral 0x55eaf6075c98 <col:20> 'char[100000]' "x"

That case is handled today by the fast path today. The case I refer to which is 1000% more common is not, hence this patch. WDYT?

efriedma accepted this revision.Jul 25 2023, 3:04 PM

I don't have any concern with this specific patch; LGTM

This revision is now accepted and ready to land.Jul 25 2023, 3:04 PM
This revision was landed with ongoing or failed builds.Jul 25 2023, 3:27 PM
This revision was automatically updated to reflect the committed changes.