Page MenuHomePhabricator

[clang] Fix CallExpr dependence bit may not respecting all its arguments.

Authored by hokein on Jun 10 2021, 12:01 PM.



Before this patch, the dependence of CallExpr was only computed in the
constructor, the dependence bits might not reflect truth -- some arguments might
be not set (nullptr) during this time, e.g. CXXDefaultArgExpr will be set via
the setArg method in the later parsing stage, so we need to recompute the
dependence bits.

Diff Detail

Unit TestsFailed

1,720 msx64 debian > libFuzzer.libFuzzer::dataflow.test
Script: -- : 'RUN: at line 5'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/lib/fuzzer -m64 -c -fno-sanitize=all -fsanitize=dataflow /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/fuzzer/../../lib/fuzzer/dataflow/DataFlow.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/fuzzer/X86_64DefaultLinuxConfig/Output/dataflow.test.tmp-DataFlow.o
127,870 msx64 debian > libFuzzer.libFuzzer::only-some-bytes-fork.test
Script: -- : 'RUN: at line 5'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/lib/fuzzer -m64 -c -fno-sanitize=all -fsanitize=dataflow /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/fuzzer/../../lib/fuzzer/dataflow/DataFlow.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/fuzzer/X86_64DefaultLinuxConfig/Output/only-some-bytes-fork.test.tmp-DataFlow.o
7,740 msx64 debian > libFuzzer.libFuzzer::only-some-bytes.test
Script: -- : 'RUN: at line 5'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/lib/fuzzer -m64 -c -fno-sanitize=all -fsanitize=dataflow /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/fuzzer/../../lib/fuzzer/dataflow/DataFlow.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/fuzzer/X86_64DefaultLinuxConfig/Output/only-some-bytes.test.tmp-DataFlow.o

Event Timeline

hokein requested review of this revision.Jun 10 2021, 12:01 PM
hokein created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2021, 12:01 PM

This seems correct but it also seems like it makes setting all the args of a CallExpr into a quadratic operation.
e.g. ConvertArgumentForCall and BuildResolvedCallExpr in SemaExpr seem to do this (not just missing/changed args) and are probably on common code paths.
It's hard to see how to fix this without touching the API or being fiddly with the dependence bits (e.g. setArg only needs to rescan all args if we're removing at least one dep bit vs the old value, which should be rare).

Also this function just calls through to the (mutable) getArgs() which returns a mutable Expr**. This allows mutating args, bypassing setArg(), and ISTM there are cases where this really happens e.g. in TreeTransform::TransformCallExpr.
Again, it seems like changing the API might be the answer here: we can separate out the getArgs() which only need read access from those that are doing something tricky.

A tempting API would be something like getMutableArgs() -> some smart object that exposes the mutable arg array and also recalculates dependence when destroyed.
I guess the way to make this ergonomic is to actually inherit from MutableArrayRef, like:

class CallExpr::MutableArgs : public MutableArrayRef<Expr> {
  CallExpr *Parent;
  // Note that "slicing" copy to MutableArrayRef is still allowed.
  MutableArgs(const MutableArgs&) = delete;
  MutableArgs &operator=(const MutableArgs&) = delete;
  MutableArgs(MutableArgs&&) = delete;
  MutableArgs &operator=(MutableArgs&&) = delete;

  ~MutableArgs() { Parent->recomputeDependence(); }

I'm not sure if you see this as overengineering.
I think I can also live with adding explicit recomputeDependence() calls in the right places, and slapping a warning on setArg() that it might be needed.
But i'm not sure that making setArg() implicitly recalculate is a happy in-between point. WDYT?

Ping - curious about your thoughts here!

hokein updated this revision to Diff 355469.Wed, Jun 30, 1:20 AM

refine the solution: introduce a separate method, and call it when needed.

sorry for the delay, I lost the track.

I think I can also live with adding explicit recomputeDependence() calls in the right places, and slapping a warning on setArg() that it might be needed.

I'm inclined with this option.

As you mentioned, there are two major places (ConvertArgumentForCall and BuildResolvedCallExpr in SemaExpr) calling setArg, and it is trivial to update them.
There are a lot of setArg usage in SemaChecking.cpp, but I think we're not interested to them -- because there are invalid checks before calling setArg and we mostly care about the error-bit.

sammccall accepted this revision.Wed, Jun 30, 8:22 AM
sammccall added inline comments.

by calling computeDependence()

This revision is now accepted and ready to land.Wed, Jun 30, 8:22 AM
hokein closed this revision.Thu, Jul 1, 5:41 AM
hokein marked an inline comment as done.