This generalizes the hardcoded profitability check that was just added in D104156.
To my surprize, this does not catch any new cases.
|1,230 ms||x64 debian > Clang.OpenMP::atomic_ast_print.cpp|
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -verify -fopenmp -ast-print /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/atomic_ast_print.cpp | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --allow-unused-prefixes /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/atomic_ast_print.cpp
|1,190 ms||x64 debian > Clang.OpenMP::atomic_codegen.cpp|
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -verify -triple x86_64-apple-darwin10 -fopenmp -fexceptions -fcxx-exceptions -x c++ -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/atomic_codegen.cpp -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --allow-unused-prefixes /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/atomic_codegen.cpp
|1,220 ms||x64 debian > Clang.OpenMP::atomic_messages.c|
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -verify=expected,omp45 -fopenmp -fopenmp-version=45 -ferror-limit 100 /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/atomic_messages.c -Wuninitialized
|100 ms||x64 debian > LLVM.Bindings/Go::go.test|
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llvm-go go=/usr/bin/go test llvm.org/llvm/bindings/go/llvm
We do need an X86 test for this.
The values here make it seem like replacing a non-cheap operation with two cheap ones yields the same performance (1+1 = 2). contrary to the comment. Is this intentional?
So here, it seems like even if the extract is "free" (cost 0) it'll be counted as "cheap" (cost 1).
I wonder if it would be better to combine these functions into a single "getExtractSubvectorCost" target hook.
Sorry, i do not have an x86 test for this..
Not only does it make seem so, that is exactly what the comment is saying.
Right. Because even free extractions can be done for cheap, if that makes sense.
We could do that, there's already *TTIImpl::getShuffleCost(), the problem being, what will be the "cheap budget",
It's not what the comment is saying. The comment says "a win" which to my mind means, "performance improvement." If the cost is the same, where's the improvement? Maybe the comment is confusing me. This is used to calculate Budget which I guess limits how many extracts we can do (so two at most?). Perhaps the comment could be expanded to explain a bit more about what is going on.
In some contexts I suppose. A "free" extract is "cheap" if all you care about is whether an extract is "cheap." A "free" extract is not "cheap" if you care about the relative cost of the two. I could see the above code being very confusing in some circumstances.
Not knowing all of the users of that function, I can't say. It's probably context-dependent.