This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Relax assertion on generating destructor call
AbandonedPublic

Authored by Hahnfeld on Nov 10 2022, 7:34 AM.

Details

Summary

If the Decl pointers are not identical, declaresSameEntity will check the canonical Decls.

Diff Detail

Event Timeline

Hahnfeld created this revision.Nov 10 2022, 7:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 10 2022, 7:34 AM
Hahnfeld requested review of this revision.Nov 10 2022, 7:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 10 2022, 7:34 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

This fixes the assertion for a downstream case in ROOT/Cling with the involvement of modules. If anyone has ideas how to test this, please let me know...

ChuanqiXu added a comment.EditedNov 10 2022, 6:12 PM

I think it might be necessary/better to add a test for this. Otherwise, it looks not good to change this.

This fixes the assertion for a downstream case in ROOT/Cling with the involvement of modules. If anyone has ideas how to test this, please let me know...

Yeah, I know, this is pretty hard. And it would generally take a lot of time doing it. The method I used is to:

  • preprocess the failure files
  • Remove the unnecessary information, like a lot of line markers.
  • reduce them - repeating remove some part of the codes and see if the crash exists
  • Rename variables (if need)

Generally, it would take half a day or even a whole day for me to reduce an example from the real world test case. But it worths since it blocks the problem happen again.

I've trimmed the failing code down to

#include <string>
#include <string_view>
#include <vector>

template <typename T>
struct SO {
  void a() {
    struct SI {
      std::vector<int> v;
    };
    SI s;
    SI m(std::move(s));
  }

  void g() {
    std::vector<std::string_view> v{"a"};
  }
};

in a header / module and

SO<int> s;
s.a();
s.g();

in the calling code. Sadly this works fine in standalone Clang...

All of the above code seems to be important, starting from the outer template, having two functions, moving a std::vector from a default generated move constructor and then constructing a std::vector<std::string_view> with at least one element. If this rings a bell for anybody or anybody has an idea where to go from here, please let me know. I'm out of depth how to produce the exact failing conditions in a test. I would argue that relaxing the assert is fine regardless because it still tests that the DtorDecl belongs to this type, but I can't articulate why an exact pointer comparison fails in very rare circumstances...

I've trimmed the failing code down to

#include <string>
#include <string_view>
#include <vector>

template <typename T>
struct SO {
  void a() {
    struct SI {
      std::vector<int> v;
    };
    SI s;
    SI m(std::move(s));
  }

  void g() {
    std::vector<std::string_view> v{"a"};
  }
};

in a header / module and

SO<int> s;
s.a();
s.g();

in the calling code. Sadly this works fine in standalone Clang...

All of the above code seems to be important, starting from the outer template, having two functions, moving a std::vector from a default generated move constructor and then constructing a std::vector<std::string_view> with at least one element. If this rings a bell for anybody or anybody has an idea where to go from here, please let me know. I'm out of depth how to produce the exact failing conditions in a test. I would argue that relaxing the assert is fine regardless because it still tests that the DtorDecl belongs to this type, but I can't articulate why an exact pointer comparison fails in very rare circumstances...

So the code look like:

// foo.h
#include <string>
#include <string_view>
#include <vector>

template <typename T>
struct SO {
  void a() {
    struct SI {
      std::vector<int> v;
    };
    SI s;
    SI m(std::move(s));
  }

  void g() {
    std::vector<std::string_view> v{"a"};
  }
};

// foo.cpp
import "foo.h";
void func() {
    SO<int> s;
    s.a();
    s.g();
}

Is it the reproducer? If it is, my suggestion would be:

  • Preprocess foo.h into foo.ii then replace import "foo.h";
  • Then reduce foo.ii step by step. There should be many things that can be removed.
  • Then we found there is no things we can reduce, then it should be the test.

I know it sounds too hard at first. But I tried and it works although it would take one whole day. Here is an example for the reduced test case: https://github.com/llvm/llvm-project/commit/1aaba40dcbe8fdc93d825d1f4e22edaa3e9aa5b1, which is reduced from 2 system headers which contains 30,000+ lines of code.

I would argue that relaxing the assert is fine regardless

Yes, I understood. But we hope it can land with a test case really. Otherwise, the same use case may be broken some time again without being noticed by any one.

Is it the reproducer?

No, as I wrote:

Sadly this works fine in standalone Clang...

Is it the reproducer?

No, as I wrote:

Sadly this works fine in standalone Clang...

Sorry, I don't understand well. Could you rewrite the reproducer in the style I wrote? And in what cases it works fine?

Sorry, I don't understand well. Could you rewrite the reproducer in the style I wrote? And in what cases it works fine?

Okay, let me try to restate: I can reproduce the problem in ROOT/Cling which is an *interpreter* built on top of LLVM and CLang. The moment I put the same code (basically in the style that you wrote it) into standalone Clang, all is fine and I cannot reproduce the crash. Sorry if that was confusing in the beginning, but I've tried to make this clear from the start that I cannot actually reproduce this in pure Clang.

Sorry, I don't understand well. Could you rewrite the reproducer in the style I wrote? And in what cases it works fine?

Okay, let me try to restate: I can reproduce the problem in ROOT/Cling which is an *interpreter* built on top of LLVM and CLang. The moment I put the same code (basically in the style that you wrote it) into standalone Clang, all is fine and I cannot reproduce the crash. Sorry if that was confusing in the beginning, but I've tried to make this clear from the start that I cannot actually reproduce this in pure Clang.

Thanks, I got your point. But a test is really necessary here. Since it is completely possible that the problem happens in earlier steps and it only shows in this assertion. And it will be pretty bad in this case.

Yes, I fully agree that having a test is desirable, I just didn't manage so far. Maybe it takes somebody with deep AST knowledge to explain under what circumstances DtorDecl->getParent() is *not* the canonical Decl. Maybe this could help crafting a test case, even outside of modules maybe

Yes, I fully agree that having a test is desirable, I just didn't manage so far. Maybe it takes somebody with deep AST knowledge to explain under what circumstances DtorDecl->getParent() is *not* the canonical Decl. Maybe this could help crafting a test case, even outside of modules maybe

Modules can be "activated" at a random point of the translation unit. This would have influence of which Decl* becomes the canonical declaration as this is usually the first seen declaration. We have seen such issues over time with the modules system where depending on time when we load the module; the version of the standard library; the modulemap organization, etc. I am inclined to say this change somewhat makes sense to me as it seems somewhat consistent with what we've seen over the years but there are several questions I'd like to ask:

  • Is the failure also not reproducible with clang9 (on which is based cling)?
  • Is the failure also not reproducible with clang9 built on top of the downstream patches?

I believe if you can reduce the headers further it would make it easier to figure out in which context this assertion needs to be relaxed. Sorry for not being very helpful here.

v.g.vassilev added a subscriber: rsmith.

Usually @rsmith has something up in the sleeve in these situations ;)

  • Is the failure also not reproducible with clang9 (on which is based cling)?
  • Is the failure also not reproducible with clang9 built on top of the downstream patches?

The failure *only* happens with downstream llvm13 (not with clang9) on exactly one test for exactly one version of libstdc++. However, all attempts to reproduce it with clang13 + our downstream patches have failed so far...

Hahnfeld abandoned this revision.Jul 3 2023, 1:48 AM

apparently not needed anymore downstream...

The problem is back, but has been fixed upstream by https://reviews.llvm.org/rG61c7a9140becb19c5b1bc644e54452c6f782f5d5. I managed to reduce a test case triggering the assert touched in this revision, see https://reviews.llvm.org/D156806