This is an archive of the discontinued LLVM Phabricator instance.

Fix IRGen for referencing a static local before emitting its decl
ClosedPublic

Authored by rnk on Aug 4 2014, 7:31 PM.

Details

Summary

Previously CodeGen assumed that static locals were emitted before they
could be accessed, which is true for automatic storage duration locals.
However, it is possible to have CodeGen emit a nested function that uses
a static local before emitting the function that defines the static
local, breaking that assumption.

Fix it by splitting static local creation up into a getAddrOf phase and
a definition phase, as is done for other globals.

Fixes PR18020. See also previous attempts to fix static locals in
PR6769 and PR7101.

Diff Detail

Repository
rL LLVM

Event Timeline

rnk updated this revision to Diff 12185.Aug 4 2014, 7:31 PM
rnk retitled this revision from to Fix IRGen for referencing a static local before emitting its decl.
rnk updated this object.
rnk added a reviewer: rsmith.
rnk added a subscriber: Unknown Object (MLST).
rsmith edited edge metadata.Aug 7 2014, 11:04 AM

Hmm, I think you need to emit a definition, not just a declaration, when you see the first use of a static local variable. Here's a valid C++1y program that I think the current patch would reject (with a link error):

static auto f() {
  static int n;
  struct S { int &operator()() { return n; } };
  return S();
}

int main() { return decltype(f())()(); }
rnk updated this revision to Diff 13291.Sep 4 2014, 4:19 PM
rnk edited edge metadata.
  • Use getOrCreate* instead of GetAddr* and Create*
rnk added a comment.Sep 4 2014, 4:19 PM
In D4787#4, @rsmith wrote:

Hmm, I think you need to emit a definition, not just a declaration, when you see the first use of a static local variable. Here's a valid C++1y program that I think the current patch would reject (with a link error):

static auto f() {
  static int n;
  struct S { int &operator()() { return n; } };
  return S();
}

int main() { return decltype(f())()(); }

Hopefully the new patch addresses this.

rsmith added inline comments.Sep 4 2014, 5:09 PM
lib/CodeGen/CGDecl.cpp
174 ↗(On Diff #13291)

Does it still make sense for this to be on CodeGenFunction since it can now be called while emitting the "wrong" function? It seems to risk using local state of the CodeGenFunction object, which would be bad.

191 ↗(On Diff #13291)

Case in point: this will presumably produce the wrong mangled name if the static local's emission is triggered by a function other than the containing one.

199 ↗(On Diff #13291)

This doesn't look right. If the static local has a constant initializer, you need to emit it here, not as part of emitting the surrounding function. Given:

static auto f() {
  static int n = 1;
  struct S { int &operator()() { return n; } };
  return S();
}

int main() { return decltype(f())()(); }

... main should return 1, and I think with this patch it returns 0.

rnk added a comment.Sep 5 2014, 3:20 PM

This is turning out to be really, really difficult.

lib/CodeGen/CGDecl.cpp
174 ↗(On Diff #13291)

It seems there's two uses of local state:

  • CGF affects name mangling for static locals in non-C++ languages, but in those languages, the name is presumably not significant
  • CGF matters to CGExprConstant, in particular for taking the address of a label
191 ↗(On Diff #13291)

No, in C++ mode it just mangles the VarDecl, which has all the info needed.

199 ↗(On Diff #13291)

OK. This can be done, but it has considerable complexity because the initializer can reference itself circularly, apparently.

rnk updated this revision to Diff 13732.Sep 15 2014, 2:49 PM
  • Use getOrCreate* instead of GetAddr* and Create*
  • Emit constant initializers of local statics
rsmith added inline comments.Sep 29 2014, 11:39 AM
lib/CodeGen/CGDecl.cpp
170 ↗(On Diff #13732)

Would be useful for this function name to mention something about static local variables.

307–309 ↗(On Diff #13732)

Do we ever need to re-emit the initializer for correctness? The only cases I can think of are like:

extern const int n;
auto f() {
  static const int k = n;
  return [] { return k; }
}
auto x = f();
const int n = 3;

... where we can choose to use either static or dynamic initialization for f()::k.

lib/CodeGen/CGExprConstant.cpp
921–922 ↗(On Diff #13732)

Oh, I see. This is why EmitInitializerForStaticVarDecl might need to re-emit the initializer. It seems a little scary for this to silently return 0 in other cases, but I suppose it's not too bad. A comment here would be useful (and also one in EmitInitializerForStaticVarDecl explaining why we re-emit).

I'm not convinced this is correct, though. Consider an inline function that has a static local with an initializer that uses the address of a label, and exposes a local lambda that uses the static local. Even if that function is not used in *this* TU, we must still make the label addresses be correct, in case the function is used in a different TU. For instance:

inline auto foo(void *p) {
  if (p) goto *p;
  static void *q = &&label;
  while (1) {
    struct S { static void *get() { return q; } };
    return S();
  label:
    puts("hello world");
  }
}

TU1:

void *global = decltype(foo(0))::get();

TU2:

int main() { extern void *global; foo(global); }

You might say that you don't care about this case. I'd be inclined to agree =) Perhaps we should ErrorUnsupported on an AddrLabelExpr with no CGF?

rnk updated this revision to Diff 14535.Oct 7 2014, 3:37 PM
  • Back up and do this the easy way
rsmith accepted this revision.Oct 7 2014, 5:12 PM
rsmith edited edge metadata.

LGTM, subject to handling of ObjCMethodDecl.

lib/CodeGen/CGDecl.cpp
231 ↗(On Diff #14535)

Should we handle ObjCMethodDecls here?

This revision is now accepted and ready to land.Oct 7 2014, 5:12 PM
rnk added a comment.Oct 7 2014, 5:35 PM

Thanks!

lib/CodeGen/CGDecl.cpp
231 ↗(On Diff #14535)

I thought about this a lot and I don't think this bug is reachable with an objective C method decl. They always appear at global scope, and you can't actually refer to one without constructing an object of the ObjC class type and calling the method indirectly via string lookup. If you've done that, that implies we've emitted the code.

That overflowed my objective C thinking budget, so I avoided crashing on them and moved on. I'll try to comment on that.

rnk closed this revision.Oct 7 2014, 6:18 PM
rnk updated this revision to Diff 14545.

Closed by commit rL219265 (authored by @rnk).