
Today

ABataev added a comment to D102107: [OpenMP] Codegen aggregate for outlined function captures.

We used this kind of codegen initially but later found out that it causes a large overhead when gathering pointers into a record. What about a hybrid scheme where the first args are passed as arguments and the others (if any) are gathered into a record?

I'm confused, maybe I misunderstand the problem. The parallel function arguments need to go from the main thread to the workers somehow, I don't see how this is done w/o a record. This patch makes it explicit though.

Pass it in a record for workers only? And use a hybrid scheme for all other parallel regions.

I still do not follow. What does it mean for workers only? What is a hybrid scheme? And, probably most importantly, how would we not eventually put everything into a record anyway?
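
The two conventions discussed in this thread can be sketched in plain C++, without the OpenMP runtime. This is only an illustration under stated assumptions: every name below (AllCaptures, RestCaptures, outlined_aggregate, outlined_hybrid, run_aggregate, run_hybrid) is hypothetical and not taken from Clang or from D102107, and the choice of "two captures as arguments, the rest in a record" is arbitrary.

```cpp
#include <cassert>

// Hypothetical names throughout; not Clang's or the OpenMP runtime's API.

// Scheme A: every captured variable is gathered into one record.
struct AllCaptures {
  int *a;
  int *b;
  int *c;
};

void outlined_aggregate(void *raw) {
  assert(raw != nullptr);
  auto *caps = static_cast<AllCaptures *>(raw);
  *caps->c = *caps->a + *caps->b;
}

// Scheme B (hybrid): the first captures travel as plain arguments,
// and only the remaining ones are gathered into a record.
struct RestCaptures {
  int *c;
};

void outlined_hybrid(int *a, int *b, void *rest_raw) {
  assert(rest_raw != nullptr);
  auto *rest = static_cast<RestCaptures *>(rest_raw);
  *rest->c = *a + *b;
}

int run_aggregate(int a, int b) {
  int c = 0;
  AllCaptures caps{&a, &b, &c};  // the caller stores three pointers
  outlined_aggregate(&caps);
  return c;
}

int run_hybrid(int a, int b) {
  int c = 0;
  RestCaptures rest{&c};  // the caller stores only the overflow capture
  outlined_hybrid(&a, &b, &rest);
  return c;
}
```

Both variants compute the same result. The difference is only in how many pointers the caller has to write into memory before the call, which is the overhead the first comment refers to.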

Tue, Jun 22, 3:26 AM · Restricted Project, Restricted Project

Yesterday

ABataev updated the diff for D101109: [SLP]Improve multinode analysis..

Rebase

Mon, Jun 21, 1:54 PM · Restricted Project
ABataev committed rGc5bbc737e8c6: [SLP][NFC]Rename functions in the tests, NFC. (authored by ABataev).
[SLP][NFC]Rename functions in the tests, NFC.
Mon, Jun 21, 1:52 PM
ABataev updated the diff for D100486: [COST]Improve cost model for shuffles in SLP..

Rebase

Mon, Jun 21, 1:31 PM · Restricted Project
ABataev added a comment to D100486: [COST]Improve cost model for shuffles in SLP..

No, I didn't.
I think that while this may be somewhat correct, it is not really correct. For example, before AVX, there are no non-32-bit shuffles, only unpacks.

Mon, Jun 21, 1:17 PM · Restricted Project
ABataev updated the diff for D104122: [SLP]Improve vectorization of stores..

Rebase

Mon, Jun 21, 12:32 PM · Restricted Project
ABataev committed rG908b7536615e: [SLP]Improve vectorization of PHI instructions. (authored by ABataev).
[SLP]Improve vectorization of PHI instructions.
Mon, Jun 21, 12:27 PM
ABataev closed D103638: [SLP]Improve vectorization of PHI instructions..
Mon, Jun 21, 12:27 PM · Restricted Project

Thu, Jun 17

ABataev accepted D103793: [Clang][OpenMP] Monotonic does not apply to SIMD.

LG

Thu, Jun 17, 6:58 AM · Restricted Project
ABataev added a comment to D102107: [OpenMP] Codegen aggregate for outlined function captures.

We used this kind of codegen initially but later found out that it causes a large overhead when gathering pointers into a record. What about a hybrid scheme where the first args are passed as arguments and the others (if any) are gathered into a record?

I'm confused, maybe I misunderstand the problem. The parallel function arguments need to go from the main thread to the workers somehow, I don't see how this is done w/o a record. This patch makes it explicit though.

Thu, Jun 17, 6:40 AM · Restricted Project, Restricted Project

Wed, Jun 16

ABataev added a comment to D102107: [OpenMP] Codegen aggregate for outlined function captures.

We used this kind of codegen initially but later found out that it causes a large overhead when gathering pointers into a record. What about a hybrid scheme where the first args are passed as arguments and the others (if any) are gathered into a record?

Wed, Jun 16, 7:22 AM · Restricted Project, Restricted Project
ABataev added inline comments to D104122: [SLP]Improve vectorization of stores..
Wed, Jun 16, 4:07 AM · Restricted Project
ABataev added inline comments to D101109: [SLP]Improve multinode analysis..
Wed, Jun 16, 4:03 AM · Restricted Project

Tue, Jun 15

ABataev updated the diff for D103638: [SLP]Improve vectorization of PHI instructions..

Rebase + address a comment

Tue, Jun 15, 5:07 AM · Restricted Project
ABataev committed rG45ae766e78e0: [OPENMP]Fix PR50699: capture locals in combine directrives for aligned clause. (authored by ABataev).
[OPENMP]Fix PR50699: capture locals in combine directrives for aligned clause.
Tue, Jun 15, 4:59 AM
ABataev closed D104258: [OPENMP]Fix PR50699: capture locals in combine directrives for aligned clause..
Tue, Jun 15, 4:58 AM · Restricted Project

Mon, Jun 14

ABataev requested review of D104258: [OPENMP]Fix PR50699: capture locals in combine directrives for aligned clause..
Mon, Jun 14, 12:35 PM · Restricted Project
ABataev committed rG4e155608796b: [OPENMP][C++20]Add support for CXXRewrittenBinaryOperator in ranged for loops. (authored by ABataev).
[OPENMP][C++20]Add support for CXXRewrittenBinaryOperator in ranged for loops.
Mon, Jun 14, 11:59 AM
ABataev closed D104240: [OPENMP][C++20]Add support for CXXRewrittenBinaryOperator in ranged for loops..
Mon, Jun 14, 11:59 AM · Restricted Project
ABataev committed rG44f197e94b83: [OpenMP] Fix C-only clang assert on parsing use_allocator clause of target… (authored by ABataev).
[OpenMP] Fix C-only clang assert on parsing use_allocator clause of target…
Mon, Jun 14, 10:39 AM
ABataev closed D103899: [OpenMP] Fix C-only clang assert on parsing use_allocator clause of target directive.
Mon, Jun 14, 10:39 AM · Restricted Project
ABataev requested review of D104240: [OPENMP][C++20]Add support for CXXRewrittenBinaryOperator in ranged for loops..
Mon, Jun 14, 10:04 AM · Restricted Project
ABataev accepted D103899: [OpenMP] Fix C-only clang assert on parsing use_allocator clause of target directive.

LG

Mon, Jun 14, 8:30 AM · Restricted Project

Sat, Jun 12

ABataev added inline comments to D103638: [SLP]Improve vectorization of PHI instructions..
Sat, Jun 12, 3:55 AM · Restricted Project

Fri, Jun 11

ABataev added a comment to D102834: [SLPVectorizer] WIP Implement initial memory versioning (WIP!).

I would add an option to control whether to allow it or not.

Fri, Jun 11, 11:49 AM · Restricted Project
ABataev updated the diff for D100486: [COST]Improve cost model for shuffles in SLP..

Rebase + fixed gathering cost calculation.

Fri, Jun 11, 10:06 AM · Restricted Project
ABataev added a comment to D101109: [SLP]Improve multinode analysis..

There are still regressions, even after we allowed reordering of insertelements. It is because the reordering is not quite effective. I have an idea of how to improve it (and avoid rebuilding the tree for the second time and improve compile time), will try to implement it next week.

Fri, Jun 11, 9:25 AM · Restricted Project
ABataev updated the diff for D101109: [SLP]Improve multinode analysis..

Rebase

Fri, Jun 11, 9:23 AM · Restricted Project
ABataev committed rGa010d4230e13: [SLP]Allow reordering of insertelements. (authored by ABataev).
[SLP]Allow reordering of insertelements.
Fri, Jun 11, 8:48 AM
ABataev closed D104057: [SLP]Allow reordering of insertelements..
Fri, Jun 11, 8:48 AM · Restricted Project
ABataev committed rG74af4bb1f471: [SLP]Remove unnecessary UndefValue in CreateShuffle. (authored by ABataev).
[SLP]Remove unnecessary UndefValue in CreateShuffle.
Fri, Jun 11, 8:11 AM
ABataev closed D104113: [SLP]Remove unnecessary UndefValue in CreateShuffle..
Fri, Jun 11, 8:10 AM · Restricted Project
ABataev requested review of D104122: [SLP]Improve vectorization of stores..
Fri, Jun 11, 8:07 AM · Restricted Project
ABataev committed rGcd2bb16d563e: [SLP][NFC]Add a test for unordered stores, NFC. (authored by ABataev).
[SLP][NFC]Add a test for unordered stores, NFC.
Fri, Jun 11, 8:03 AM
ABataev requested review of D104113: [SLP]Remove unnecessary UndefValue in CreateShuffle..
Fri, Jun 11, 5:45 AM · Restricted Project
ABataev accepted D104064: [SLP][NFC] Fix condition that was supposed to save a bit of compile time..

LG

Fri, Jun 11, 4:52 AM · Restricted Project

Thu, Jun 10

ABataev updated the diff for D103638: [SLP]Improve vectorization of PHI instructions..

Rebase

Thu, Jun 10, 1:49 PM · Restricted Project
ABataev accepted D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..

LG, but add an extra description in summary

Thu, Jun 10, 1:49 PM
ABataev requested review of D104057: [SLP]Allow reordering of insertelements..
Thu, Jun 10, 1:44 PM · Restricted Project
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Thu, Jun 10, 12:40 PM
ABataev accepted D99459: [OpenMP] Implement '#pragma omp unroll'..

LG

Thu, Jun 10, 11:51 AM · Restricted Project, Restricted Project, Restricted Project
ABataev committed rGa893b441873d: [SLP]Disable scheduling of insertelements. (authored by ABataev).
[SLP]Disable scheduling of insertelements.
Thu, Jun 10, 10:27 AM
ABataev closed D104026: [SLP]Disable scheduling of insertelements..
Thu, Jun 10, 10:26 AM · Restricted Project
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Thu, Jun 10, 8:01 AM
ABataev accepted D104029: [Analysis] Pass RecurrenceDescriptor as const reference. NFCI..

LG

Thu, Jun 10, 7:17 AM · Restricted Project
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Thu, Jun 10, 6:06 AM
ABataev requested review of D104026: [SLP]Disable scheduling of insertelements..
Thu, Jun 10, 5:54 AM · Restricted Project
ABataev added inline comments to D99459: [OpenMP] Implement '#pragma omp unroll'..
Thu, Jun 10, 5:35 AM · Restricted Project, Restricted Project, Restricted Project
ABataev accepted D103995: [OpenMP] Add type to firstprivate symbol for const firstprivate values.

LG

Thu, Jun 10, 5:29 AM · Restricted Project
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Thu, Jun 10, 5:27 AM

Wed, Jun 9

ABataev added a comment to D101109: [SLP]Improve multinode analysis..

rebase?

Wed, Jun 9, 6:07 AM · Restricted Project
ABataev updated the diff for D99436: [OPENMP]Fix PR49366: crash on VLAs in task untied regions..

Ping

Wed, Jun 9, 5:54 AM · Restricted Project
ABataev updated the diff for D103638: [SLP]Improve vectorization of PHI instructions..

Rebase

Wed, Jun 9, 5:30 AM · Restricted Project
ABataev committed rGa0086add2e52: [SLP]Improve gathering of scalar elements. (authored by ABataev).
[SLP]Improve gathering of scalar elements.
Wed, Jun 9, 5:24 AM
ABataev closed D103458: [SLP]Improve gathering of scalar elements..
Wed, Jun 9, 5:24 AM · Restricted Project
ABataev accepted D103954: [SLP] Incorrect handling of external scalar values.

LG

Wed, Jun 9, 4:55 AM · Restricted Project

Tue, Jun 8

ABataev added a comment to D103899: [OpenMP] Fix C-only clang assert on parsing use_allocator clause of target directive.

Tests?

Tue, Jun 8, 8:23 AM · Restricted Project

Mon, Jun 7

ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Mon, Jun 7, 1:40 PM
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Mon, Jun 7, 1:06 PM
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Mon, Jun 7, 12:42 PM
ABataev added a comment to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..

I assume the proper fix is to handle declare variants in checkUndefinedButUsed in Sema.cpp. This should fix both PRs mentioned here.

Mon, Jun 7, 8:32 AM
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Mon, Jun 7, 8:18 AM
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Mon, Jun 7, 7:21 AM
ABataev added a comment to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..

Review anyone please? Thanks.

Add more description to the summary. It would be good to see an analysis of why we get this error message.

The warning "warning: function 'foo' has internal linkage but is not defined [-Wundefined-internal]" is generated at https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/Sema.cpp#L814

In ActOnStartOfFunctionDefinitionInOpenMPDeclareVariantScope a new Decl is created here: https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaOpenMP.cpp#L6685. This Decl is added to the UndefinedButUsed map here: https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaExpr.cpp#L17118
This newly created Decl was marked 'implicit' but not marked 'used' as expected here: https://github.com/llvm/llvm-project/blob/main/clang/include/clang/AST/DeclBase.h#L578.

Mon, Jun 7, 6:58 AM
ABataev added a comment to D102834: [SLPVectorizer] WIP Implement initial memory versioning (WIP!).

You can easily get the first instruction - VectorizableTree.front(). Stores can be only in the front of the vectorizable tree. As to the last instruction, I think we need to scan all tree entries and find all gather nodes with memaccess which may alias with stores. The main problem is that the versioning is better performed before we try to build the tree; otherwise we may decide to gather some instructions instead of trying to vectorize them because of possible aliasing. And it may not be profitable.

I updated the patch to only collect possible bounds for versioning first and queue basic blocks for which we found bounds for re-processing. When re-processing those blocks, we create a versioned block with the appropriate !noalias metadata and re-run vectorization (for now just seeded by stores). As is, this should be correct, but may not be optimal, because either:
a) there were issues preventing vectorization other than aliasing, which may cause us to not vectorize anything in the versioned block
b) the runtime checks are too expensive and offset the gain from additional vectorization.

We should be able to solve both by comparing the cost of the versioned & non-versioned BB after vectorization. If versioning is not profitable, we can remove the conditional branch again. What do you think of this direction?
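
The general idea of versioning on a runtime alias check can be sketched at the source level. This is a hand-written analogy, not what the patch emits: the names no_overlap and add_one are hypothetical, and __restrict is a compiler extension (accepted by Clang, GCC and MSVC) standing in for the !noalias metadata mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when [a, a+n) and [b, b+n) do not overlap.
bool no_overlap(const int *a, const int *b, std::size_t n) {
  auto pa = reinterpret_cast<std::uintptr_t>(a);
  auto pb = reinterpret_cast<std::uintptr_t>(b);
  std::uintptr_t bytes = n * sizeof(int);
  return pa + bytes <= pb || pb + bytes <= pa;
}

// Writes src[i] + 1 into dst[i], choosing a code version at run time.
void add_one(int *dst, const int *src, std::size_t n) {
  assert(n == 0 || (dst != nullptr && src != nullptr));
  if (no_overlap(dst, src, n)) {
    // Versioned path: the check proved the ranges are disjoint, so the
    // compiler may treat the accesses as non-aliasing and vectorize.
    int *__restrict d = dst;
    const int *__restrict s = src;
    for (std::size_t i = 0; i < n; ++i)
      d[i] = s[i] + 1;
  } else {
    // Fallback path: original scalar semantics, safe under aliasing.
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = src[i] + 1;
  }
}
```

The runtime check is the cost referred to in point b): if the versioned path does not end up vectorized, or the check is too expensive, the branch only adds overhead and is better removed.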

Mon, Jun 7, 5:55 AM · Restricted Project
ABataev added inline comments to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Mon, Jun 7, 4:55 AM
ABataev retitled D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning. from Referencing a static function defined in an opnemp clause is generating an erroneous warning. to [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..
Mon, Jun 7, 4:54 AM
ABataev added a comment to D103529: [OPENMP]Referencing a static function defined in declare variant is generating an erroneous warning..

Review anyone please? Thanks.

Mon, Jun 7, 4:54 AM

Fri, Jun 4

ABataev added inline comments to D103458: [SLP]Improve gathering of scalar elements..
Fri, Jun 4, 8:53 AM · Restricted Project
ABataev updated the diff for D103458: [SLP]Improve gathering of scalar elements..

Address comments.

Fri, Jun 4, 8:52 AM · Restricted Project
ABataev added inline comments to D99459: [OpenMP] Implement '#pragma omp unroll'..
Fri, Jun 4, 8:36 AM · Restricted Project, Restricted Project, Restricted Project
ABataev committed rGc84a5448b5ac: [OPENMP]Fix PR50129: omp cancel parallel not working as expected. (authored by ABataev).
[OPENMP]Fix PR50129: omp cancel parallel not working as expected.
Fri, Jun 4, 8:28 AM
ABataev closed D103646: [OPENMP]Fix PR50129: omp cancel parallel not working as expected..
Fri, Jun 4, 8:28 AM · Restricted Project, Restricted Project
ABataev updated the diff for D103458: [SLP]Improve gathering of scalar elements..

Address comments, better analysis of gathering order.

Fri, Jun 4, 8:23 AM · Restricted Project
ABataev committed rG827b5c21545a: [OPENMP]Fix PR49790: Constexpr values not handled in `omp declare mapper`… (authored by ABataev).
[OPENMP]Fix PR49790: Constexpr values not handled in `omp declare mapper`…
Fri, Jun 4, 7:35 AM
ABataev closed D103642: [OPENMP]Fix PR49790: Constexpr values not handled in `omp declare mapper` clause..
Fri, Jun 4, 7:35 AM · Restricted Project
ABataev updated the diff for D103646: [OPENMP]Fix PR50129: omp cancel parallel not working as expected..
  1. Emit __kmpc_cancel_barrier only for parallel cancellation.
  2. Synced CreateCancel.
Fri, Jun 4, 7:05 AM · Restricted Project, Restricted Project
ABataev accepted D103666: [Clang][OpenMP] Refactor checking for mutually exclusive clauses. NFC..

LG

Fri, Jun 4, 4:58 AM · Restricted Project, Restricted Project
ABataev accepted D103665: [Clang][OpenMP] Add static version of getSingleClause<ClauseT>. NFC..

LG

Fri, Jun 4, 4:57 AM · Restricted Project, Restricted Project

Thu, Jun 3

ABataev requested review of D103646: [OPENMP]Fix PR50129: omp cancel parallel not working as expected..
Thu, Jun 3, 2:15 PM · Restricted Project, Restricted Project
ABataev requested review of D103642: [OPENMP]Fix PR49790: Constexpr values not handled in `omp declare mapper` clause..
Thu, Jun 3, 12:14 PM · Restricted Project
ABataev updated the diff for D99436: [OPENMP]Fix PR49366: crash on VLAs in task untied regions..

Rebase

Thu, Jun 3, 11:46 AM · Restricted Project
ABataev updated the diff for D100486: [COST]Improve cost model for shuffles in SLP..

Rebase

Thu, Jun 3, 11:35 AM · Restricted Project
ABataev requested review of D103638: [SLP]Improve vectorization of PHI instructions..
Thu, Jun 3, 11:18 AM · Restricted Project
ABataev updated the diff for D103458: [SLP]Improve gathering of scalar elements..

Rebase

Thu, Jun 3, 10:37 AM · Restricted Project
ABataev committed rG8c48d77cdfe5: [SLP]Improve cost estimation/emission of externally used extractelements. (authored by ABataev).
[SLP]Improve cost estimation/emission of externally used extractelements.
Thu, Jun 3, 10:28 AM
ABataev closed D102933: [SLP]Improve cost estimation/emission of externally used extractelements..
Thu, Jun 3, 10:28 AM · Restricted Project
ABataev committed rG89f3bc7698c5: [SLP]Allow to reorder nodes with >2 scalar values. (authored by ABataev).
[SLP]Allow to reorder nodes with >2 scalar values.
Thu, Jun 3, 10:03 AM
ABataev closed D103247: [SLP]Allow to reorder nodes with >2 scalar values..
Thu, Jun 3, 10:03 AM · Restricted Project
ABataev added inline comments to D103247: [SLP]Allow to reorder nodes with >2 scalar values..
Thu, Jun 3, 8:53 AM · Restricted Project

Wed, Jun 2

ABataev added inline comments to D99459: [OpenMP] Implement '#pragma omp unroll'..
Wed, Jun 2, 7:25 AM · Restricted Project, Restricted Project, Restricted Project
ABataev accepted D102180: [Clang][OpenMP] Emit dependent PreInits before directive..

LG

Wed, Jun 2, 7:03 AM · Restricted Project, Restricted Project

Tue, Jun 1

ABataev accepted D103479: [SLP] Ignore unreachable blocks.

LG

Tue, Jun 1, 12:20 PM · Restricted Project
ABataev added inline comments to D103479: [SLP] Ignore unreachable blocks.
Tue, Jun 1, 11:57 AM · Restricted Project
ABataev added a comment to D102920: [SLP]Better detection of perfect/shuffles matches for gather nodes..

FYI there seems to be some (but not very large) compile-time impact on ClamAV: https://llvm-compile-time-tracker.com/compare.php?from=e60f147324b64f7740de58e6b936cdc0e26daadd&to=36911971a58d1ba8b15e97790ac816eaadb0603e&stat=instructions

The file with the largest impact is libclamav_htmlnorm.c with 3.4% regression in NewPM-O3 (and 5.6% in NewPM-ReleaseLTO-g). Might be worth taking a look.

Tue, Jun 1, 9:38 AM · Restricted Project
ABataev added a comment to D102920: [SLP]Better detection of perfect/shuffles matches for gather nodes..

FYI there seems to be some (but not very large) compile-time impact on ClamAV: https://llvm-compile-time-tracker.com/compare.php?from=e60f147324b64f7740de58e6b936cdc0e26daadd&to=36911971a58d1ba8b15e97790ac816eaadb0603e&stat=instructions

The file with the largest impact is libclamav_htmlnorm.c with 3.4% regression in NewPM-O3 (and 5.6% in NewPM-ReleaseLTO-g). Might be worth taking a look.

Tue, Jun 1, 8:52 AM · Restricted Project
ABataev committed rG36911971a58d: [SLP]Better detection of perfect/shuffles matches for gather nodes. (authored by ABataev).
[SLP]Better detection of perfect/shuffles matches for gather nodes.
Tue, Jun 1, 7:09 AM
ABataev closed D102920: [SLP]Better detection of perfect/shuffles matches for gather nodes..
Tue, Jun 1, 7:09 AM · Restricted Project
ABataev requested review of D103458: [SLP]Improve gathering of scalar elements..
Tue, Jun 1, 6:53 AM · Restricted Project

Thu, May 27

ABataev updated the diff for D102920: [SLP]Better detection of perfect/shuffles matches for gather nodes..

Rebase

Thu, May 27, 10:50 AM · Restricted Project