This is an archive of the discontinued LLVM Phabricator instance.

[GlobalOpt] Remove valgrind specific hacks (revert r160529)
Needs ReviewPublic

Authored by evgeny777 on Oct 25 2019, 3:56 AM.

Details

Summary

The r160529 introduced alternative way of removing globals to appease leak detectors (e.g valgrind). However this commit made nearly impossible removing global if it is of structure or array type and contains element of pointer type. Consider following example:

static int bar() { return 42; }
static struct S {
   int (*F)();
   int A;
} Obj = { bar, 0 };

int main() {
   Obj.A = 42;
   return 0;
}

In the example above global variable Obj will not be eliminated by globalopt, because it contains a pointer, although that pointer is a pointer to function and is never really accessed. The more serious problem is that function bar is not being DCE'd, although it's obviously dead. This could be especially problematic for LTO builds - I'm observing significant amount of dead stuff left in the image, which would otherwise have been eliminated.

Another issues with r160529 are:

  • Very poor test coverage. The only test case cleanup-pointer-root-users.ll covers tiny part of added functionality
  • While preventing optimizations in many cases it strangely allows some of global stores to be eliminated. For instance this one will be optimized out:
static void *p;
int main() { p = malloc(100); return 0; }

And this one not:

static void *p;
__attribute__((noinline)) int set(void *_p) { p = _p; }
int main() { set(malloc(100)); return 0; }

An alternative to removal of r160529 could probably be converting global to private with zero initializer.

Diff Detail

Event Timeline

evgeny777 created this revision.Oct 25 2019, 3:56 AM

Added the author of r160529 as a reviewer for thoughts.

Not sure you are still interested in this patch. If so, I would suggest you get in touch with some Google folks and check with them if their codebase is ready for this patch. They were the only reason for this workaround.

@nlopes

Not sure you are still interested in this patch. If so, I would suggest you get in touch with some Google folks and check with them if their codebase is ready for this patch. They were the only reason for this workaround.

Yeah, it would be nice to get rid of this valgrind legacy (btw, there is already D70006 which eliminates such kind of unneeded globals for thin LTO). However I don't whom to contact.

@nlopes

Not sure you are still interested in this patch. If so, I would suggest you get in touch with some Google folks and check with them if their codebase is ready for this patch. They were the only reason for this workaround.

Yeah, it would be nice to get rid of this valgrind legacy (btw, there is already D70006 which eliminates such kind of unneeded globals for thin LTO). However I don't whom to contact.

@tejohnson @chandlerc @hans should be able to point out someone that can comment on this and/or test if this patch regresses google's leak tests.

test/Transforms/GlobalOpt/2009-11-16-BrokenPerformHeapAllocSRoA.ll
20 ↗(On Diff #226398)

is this a regression? The internal @chartypes global is never read, so the store can go away (well, the whole function is a nop).

@nlopes

Not sure you are still interested in this patch. If so, I would suggest you get in touch with some Google folks and check with them if their codebase is ready for this patch. They were the only reason for this workaround.

Yeah, it would be nice to get rid of this valgrind legacy (btw, there is already D70006 which eliminates such kind of unneeded globals for thin LTO). However I don't whom to contact.

@tejohnson @chandlerc @hans should be able to point out someone that can comment on this and/or test if this patch regresses google's leak tests.

Added @chandlerc as reviewer here since he was involved in the original discussion at https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html.

evgeny777 added inline comments.Mar 30 2021, 9:06 AM
test/Transforms/GlobalOpt/2009-11-16-BrokenPerformHeapAllocSRoA.ll
20 ↗(On Diff #226398)

Doubtful. instcombine run after global opt converts the whole function to ret void

nlopes accepted this revision.Apr 12 2021, 11:13 AM

Please go ahead. The relevant stakeholders didn't reply, so let's assume they are not interested in this functionality anymore.
Anyway, these days people can use @llvm.used if needed.

This revision is now accepted and ready to land.Apr 12 2021, 11:13 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 13 2021, 9:11 AM

I don't know why the various people you added didn't respond. This is unfortunate.

As it happens, this change does regress a bunch of Google internal leak tests with some internal tools. As it also happens, we are working to replace these tools with lsan, but that won't happen for a bit longer, and I can't promise a timeline.

Not sure the best thing to do from here. From my perspective, in a perfect world, if we could revert it, that would be great. Leaks do happen in the cases that the original change mentions:

If we delete the variable, but don't eliminate the corresponding allocation, then that memory does leak.

To me that looks like a problem of "as-if" and definition of what a "leak" is: do we have a reference to a standard (or better LangRef) that would guide this here?
(to me it always looked like these were leaks in the first place, even if the leak checker was happy about them because technically "reachable")

In the meantime, we should revert: if this is making clang entirely incompatible with Valgrind it'd be nice to have a plan. Can we bring this up on llvm-dev@ maybe?

I went ahead and reverted. We can discuss a plan--hopefully the internal cleanup to obviate this will go smoothly.

Let me know if there are more issues.

MaskRay added a subscriber: MaskRay.EditedApr 13 2021, 7:16 PM
// a.cc
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void *g;

void set(char *a);
void foo(void *a) { g = a; } // add a store from a different function to suppress global internalization

int main() {
  char a[10];
  set(a);
  char *b = strdup(a);
  printf("%p %s\n", b, b);
  g = b;
}
// b.cc
#include <string.h>

void set(char *a) {
  strcpy(a, "hello");
}

clang++ -O1 a.cc b.cc -fsanitize=leak had no leak before and reported a leak with this patch. You can also change -O1 to -O2 or -O3. -fsanitize=leak can be changed to -fsanitize=address. You may remove -fsanitize= and use valgrind.

(Note I kept a store from a different function foo, because otherwise the GlobalOpt internalization can cause a false positive before this patch.... I don't know whether it is worth a fix. I think in practice for interesting cases for such a private global variable retaining a heap allocation, there are always more than one storer...)

I think there is value retaining the original test case cleanup-pointer-root-users.ll. We can teach isLeakCheckerRoot that function pointer should not need the pessimization.

clang++ -O1 a.cc b.cc -fsanitize=leak had no leak before and reported a leak with this patch. You can also change -O1 to -O2 or -O3. -fsanitize=leak can be changed to -fsanitize=address. You may remove -fsanitize= and use valgrind.

I'd question whether this is a feature or a bug though: if a user really want to intentionally "leak" this way, they could mark the static void *g; with __attribute__((__used__)) to prevent the optimization, or make the store volatile.

clang++ -O1 a.cc b.cc -fsanitize=leak had no leak before and reported a leak with this patch. You can also change -O1 to -O2 or -O3. -fsanitize=leak can be changed to -fsanitize=address. You may remove -fsanitize= and use valgrind.

I'd question whether this is a feature or a bug though: if a user really want to intentionally "leak" this way, they could mark the static void *g; with __attribute__((__used__)) to prevent the optimization, or make the store volatile.

The loads can be optimized out, so I don't think the user needs annotation.

clang++ -O1 a.cc b.cc -fsanitize=leak had no leak before and reported a leak with this patch. You can also change -O1 to -O2 or -O3. -fsanitize=leak can be changed to -fsanitize=address. You may remove -fsanitize= and use valgrind.

I'd question whether this is a feature or a bug though: if a user really want to intentionally "leak" this way, they could mark the static void *g; with __attribute__((__used__)) to prevent the optimization, or make the store volatile.

The loads can be optimized out, so I don't think the user needs annotation.

Can you expand on why the store shouldn't be optimized? I understand it breaks leak checker, but that does not make it a bug...

I think it is a valid transformation.

On the other hand, all three leak checkers that we have tried don't cope with the transformation well at all, and having those work is a useful feature from a QOI point of view. One possibility would be to only do the optimization at, say, -O3, rather than at -O1. Suboptimal, but better than the current situation. Having asan to report a failure at O3 that it doesn't at -O0 is a little weird, but better than nothing.

Another question to ask is:

Just how much do we gain by eliminating these locations? Are we talking 10%, 1%, or what?

On the other hand, all three leak checkers that we have tried don't cope with the transformation well at all

Well, again it depends on your point of view: from my point of view they work perfectly well here and the problem could very well be seen as to be fixed in the user code if this is what they intend (otherwise it will find more unintended leak potentially).

I went ahead and reverted. We can discuss a plan--hopefully the internal cleanup to obviate this will go smoothly.

Let me know if there are more issues.

I think it's unfair what just happened here. Google had over a year to comment and ended up wasting people's time.

Can you please comment on why you can't use __attribute__((__used__))? If you want to keep this hack in LLVM, I think it would be helpful if we understood the reason. Thanks.

Just how much do we gain by eliminating these locations? Are we talking 10%, 1%, or what?

I don't know, but preventing optimization to appease valgrind looks .. hm.. strange. I agree with @nlopes that __attribute__((__used__)) should be used for such purpose (probably added from some instrumentation pass)

+1.
I find it quite worrying/unacceptable that $BIGCORP believes they get to have the final say as to what happens in $PROJ.

Hot take: immediately revert the revert.
If someone needs that hack, they can submit a new review, and argument *there* why llvm must have this hack,
not their llvm fork/their codebase.

xbolva00 added a subscriber: xbolva00.EditedApr 14 2021, 4:06 AM

I think it's unfair what just happened here. Google had over a year to comment and ended up wasting people's time.

True. Why the LLVM project should suffer because Google is unable to take action for such long time?

Googler 1: In the meantime, we should revert
Googler 2: I went ahead and reverted.

Rather than provide any reasonable action plan (maybe temporarily provide option for google/others to leave this hack on for them), they just reverted it without proper review/discussion to possibly solve this "issue" with a different way :/ Just OK from google-colleague is not enough.

I find it quite worrying/unacceptable that $BIGCORP believes they get to have the final say as to what happens in $PROJ.

+1.

If someone needs that hack, they can submit a new review, and argument *there* why llvm must have this hack, not their llvm fork/their codebase.

+1. Anybody needs some hacks? Maintain it in own internal fork - should not be a problem, right?

(Disclaimer: I'm also employed by Google).

I think it's unfair what just happened here. Google had over a year to comment and ended up wasting people's time.

True. Why the LLVM project should suffer because Google is unable to take action for such long time?

It was certainly quite unfortunate that nobody responded to questions posed here. It could've avoided unnecessary strife if this conversation had started sooner. Yet, in the end, everyone's goal is to have a good compiler -- nobody wants the LLVM project to suffer.

Rather than provide any reasonable action plan (maybe temporarily provide option for google/others to leave this hack on for them), they just reverted it without proper review/discussion to possibly solve this "issue" with a different way :/ Just OK from google-colleague is not enough.

I believe the revert was consistent with the long-standing LLVM reversion policy (which was recently codified in https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy). There were real problems identified with this change -- it breaks both clang -fsanitize=leak and external leak-checkers, and a current user of external leak-checking (Google) came forward to say that they were broken by it.

As far as I can tell, the breakage of -fsanitize=leak was not known prior to the commit, nor was it known that this change would cause active breakage for Google's current use of an external leak-checker. (It's unfortunate and frustrating that the question was raised, yet not answered in the affirmative until now, but I don't think that changes the ultimate outcome.)

So, now we have two new pieces of information, which were not known at the time of commit. That is reasonable grounds for a revert while discussion takes place as to what the next step ("action plan") should be.

I find it quite worrying/unacceptable that $BIGCORP believes they get to have the final say as to what happens in $PROJ.

+1.

I don't think it's helpful to look at it that way. _Everyone_ gets to point out problems, even after commits have gone in. If there are issues, the first answer to revert while discussion on the proper resolution to the issue takes place. A patch reversion in the LLVM process is not a "veto" or a "final say", it is a step to take during discussion. One possible outcome of the discussion could be consensus that the commit was good after-all and should be reinstated as is, despite the issues identified. If so, it will be a more-educated conclusion, properly taking into account those now known issues.

I hope/expect that further discussion on this review can result in an outcome to everyone's satisfaction without any need for escalation. Yet, if disagreement on a review is not resolving, there are escalation mechanisms in place to help resolve the issue. First, the question can be brought up on llvm-dev for wider input. That's usually sufficient. If that also fails to resolve the issue, it can be escalated to the formal decision-making process (https://github.com/llvm/llvm-www/blob/main/proposals/LP0001-LLVMDecisionMaking.md). (I mention that mostly for completeness, I seriously doubt this could get that far.)

As far as I can tell, the breakage of -fsanitize=leak was not known prior to the commit

What kind of breakage? According to @MaskRay (I've just verified his code) LSAN detects memory leak when valgrind hack is reversed, so there is at least some improvement, not breakage. Even if LSAN needs to keep pointer globals for some reason then it can do so from instrumentation pass, because LSAN is part of LLVM. I see no reason why hacks should reside in LLVM trunk to appease some third party tools.

As far as I can tell, the breakage of -fsanitize=leak was not known prior to the commit

What kind of breakage? According to @MaskRay (I've just verified his code) LSAN detects memory leak when valgrind hack is reversed, so there is at least some improvement, not breakage.

MaskRay's example does not show an improvement in LSAN behavior, but rather a new false positive. Memory which is still accessible at process termination is not a leak, so the source -- as written -- had no leak, and should ideally not be reported as such.

Even if LSAN needs to keep pointer globals for some reason then it can do so from instrumentation pass, because LSAN is part of LLVM.

Perhaps that's the best answer in the end.

I see no reason why hacks should reside in LLVM trunk to appease some third party tools.

Well, there is a reason. It's pretty generic: if (some) users expect to be able to be able to use such tools, then breaking them -- with no possible recourse -- hurts such users. Of course, even knowing that you're breaking users, sometimes it can turn out that other factors are ultimately deemed more important. For example, the change might help many other users, or reduce complexity in the compiler significantly. Those factors could be judged to outweigh the breakage.

Yet, even when you judge other considerations to have a higher value, it is useful to recognize that this is a judgement on relative values, and not an absolute truth: the thing being broken had some non-zero value. So, it could be more accurate and less confrontational to say something like "I don't think it's worth reducing the optimization potential of LLVM to appease third-party tools like Valgrind or Google's leak checker, when LSAN can do the same job these days", instead of saying there is "no reason". (Assuming LSAN is fixed, of course.)

I have opened the discussion on llvm-dev. I do want to find a solution that everyone can live with, and I apologize that it felt like Google was stepping over its bounds.

I agree that this patch is the right direction. The original one was quite unprincipled. I appreciate the deleted code as well :-)

llvm/test/Transforms/GlobalOpt/dead-store-status.ll
0

Please don't add a dependency on instcombine here, this should be a unit test for globalopt alone.

evgeny777 updated this revision to Diff 337725.Apr 15 2021, 5:39 AM

Rebased and addressed comments from @lattner (I've removed the dead-store-status.ll because it is specific to CleanupPointerRootUsers)

evgeny777 reopened this revision.Apr 15 2021, 5:39 AM
This revision is now accepted and ready to land.Apr 15 2021, 5:39 AM
evgeny777 requested review of this revision.Apr 15 2021, 5:41 AM

Let's see what happens in the discussion on llvm-dev before moving forward with this patch.

For anyone reviewing this patch, one really good thing to do would be to run the leak checker of their choice on their own code base both with and without the patch, and evaluate if the change in behavior is acceptable.

MaskRay's example does not show an improvement in LSAN behavior, but rather a new false positive. Memory which is still accessible at process termination is not a leak, so the source -- as written -- had no leak

This seems an unusual definition of memory leak.

Is it predicated on the assumption of a hosted environment which will perform the memory free on behalf of the application at termination, where said host+application in some sense can never leak memory?

The assumption is that memory still reachable at program termination, but not deallocated, has not leaked. And it is how all three of LSAN, Valgrind, and HeapLeakChecker define it.

That may not make sense in embedded environments, but that is how most leak checkers work (and in nonhosted environments, things are quite different anyway). FWIW, HeapLeakChecker does have a mode that checks for this, but it has been unused for many years.

At a previous company I worked for that did deeply embedded bare-metal development, there were absolutely cleanup functions that could deallocate all memory, available to be called at any time the program deemed it appropriate. I don't think that is all that unusual.

MaskRay added a comment.EditedApr 20 2021, 10:11 AM

MaskRay's example does not show an improvement in LSAN behavior, but rather a new false positive. Memory which is still accessible at process termination is not a leak, so the source -- as written -- had no leak

This seems an unusual definition of memory leak.

Is it predicated on the assumption of a hosted environment which will perform the memory free on behalf of the application at termination, where said host+application in some sense can never leak memory?

I've provided some context why we need to retain global pointer variable retained objects: https://lists.llvm.org/pipermail/llvm-dev/2021-April/149857.html
Actually I just applied this pattern yesterday for lldb: D100806.

The global variable may have readers but that readers can be optimized out. Then with this patch, such global variables will be optimized away and cause a leak.

If we ignore such patterns, we would cause a huge list of false positives in many projects, with all of valgrind, -fsanitize=leak, -fsanitize=address. "Remove valgrind specific hacks" the subject is imprecise: folks might think this is valgrind specific, and likely no-longer-needed, workaround. However, this piece of code applies to various leak checkers and is actively useful.

As of how to retain the potential missing optimization (to the best of my knowledge it is unmeasured), I've suggested that we can ignore function pointers as root-set if deemed useful.

The global variable may have readers but that readers can be optimized out. Then with this patch, such global variables will be optimized away and cause a leak.

FWIW, this example from @MaskRay seems pretty reasonable to me. I could see users annotating globals that catch intentional leaks, but adding annotations for this scenario (optimized-out reader) might make users feel like they're fighting the compiler. At least, seems like it might need some discussion outside of patch review.

davide removed a reviewer: davide.Apr 20 2021, 11:55 AM

This patch will break https://lab.llvm.org/buildbot/#/builders/sanitizer-x86_64-linux-bootstrap
E.g. llvm::BuryPointer is not compatible with this change.

This patch will break https://lab.llvm.org/buildbot/#/builders/sanitizer-x86_64-linux-bootstrap
E.g. llvm::BuryPointer is not compatible with this change.

The global is marked with an attribute, is it because the store to the GraveYard isn't volatile?

llvm/test/ThinLTO/X86/import-constant.ll