This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Disable early vectorization of loads/stores in the runtime
ClosedPublic

Authored by jdoerfert on Aug 23 2023, 11:49 AM.

Details

Reviewers

jhuber6
tianshilei1992
sstefan1

Commits

rG80906ce48d5b: [OpenMP] Disable early vectorization of loads/stores in the runtime

Summary

We are having a hard time optimizing some vectorized loads/stores later
on, which causes this optimization to degrade performance.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jdoerfert created this revision. Aug 23 2023, 11:49 AM

Herald added a project: Restricted Project. Aug 23 2023, 11:49 AM

Herald added subscribers: sunshaoce, guansong, bollu, yaxunl.

jdoerfert requested review of this revision. Aug 23 2023, 11:49 AM

Herald added a reviewer: sstefan1. Aug 23 2023, 11:49 AM

Herald added subscribers: wangpc, jplehr, sstefan1.

Harbormaster completed remote builds in B254424: Diff 552829. Aug 23 2023, 11:53 AM

This pass will get run eventually when it's linked into the user code, so this should be fine.

openmp/libomptarget/DeviceRTL/CMakeLists.txt
112–119	This new flag's presence is a little esoteric, maybe add a comment saying why it's there.

This revision is now accepted and ready to land. Aug 23 2023, 11:54 AM

jdoerfert added inline comments. Aug 23 2023, 11:56 AM

openmp/libomptarget/DeviceRTL/CMakeLists.txt

112–119

Will add:

# We disable the slp vectorizer during the runtime optimization to avoid
# vectorized accesses to the shared state. Generally, those are "good" but
# the optimizer pipeline (esp. Attributor) does not fully support vectorized
# instructions yet and we end up missing out on way more important constant
# propagation. That said, we will run the vectorizer again during LTO.
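
As a minimal sketch of how the flag ends up in both flag lists: the snippet below is a standalone CMake script (runnable with `cmake -P`), not the patch itself. The helper variable `slp_off` is hypothetical; the flag values are copied from this revision. It illustrates why the same LLVM option is spelled with a `-mllvm` prefix in `clang_opt_flags` but bare in `link_opt_flags`, which, like the existing `-openmp-opt-disable` entry there, already uses the unprefixed form.

```cmake
# Sketch only: `slp_off` is a hypothetical helper variable, not part of the patch.
cmake_minimum_required(VERSION 3.20)

set(slp_off -vectorize-slp=false)

# clang does not parse LLVM-internal options itself, so each one is
# forwarded with its own -mllvm prefix (as -openmp-opt-disable already is).
set(clang_opt_flags -O3 -mllvm -openmp-opt-disable -DSHARED_SCRATCHPAD_SIZE=512
    -mllvm ${slp_off})

# This list already carries the bare LLVM spelling of -openmp-opt-disable,
# so the new option is appended without a prefix.
set(link_opt_flags -O3 -openmp-opt-disable -attributor-enable=module ${slp_off})

# Print both lists so the final spelling can be checked by eye.
list(JOIN clang_opt_flags " " clang_opt_str)
list(JOIN link_opt_flags " " link_opt_str)
message(STATUS "clang_opt_flags: ${clang_opt_str}")
message(STATUS "link_opt_flags:  ${link_opt_str}")
```

Running the script prints both lists, which makes it easy to confirm that the option appears exactly once in each.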

Closed by commit rG80906ce48d5b: [OpenMP] Disable early vectorization of loads/stores in the runtime (authored by jdoerfert). Aug 23 2023, 3:14 PM

This revision was automatically updated to reflect the committed changes.

jdoerfert added a commit: rG80906ce48d5b: [OpenMP] Disable early vectorization of loads/stores in the runtime.

Herald added a project: Restricted Project. Aug 23 2023, 3:14 PM

Herald added a subscriber: openmp-commits.

Revision Contents

Path: openmp/libomptarget/DeviceRTL/CMakeLists.txt
Size: 10 lines

Diff 552901

openmp/libomptarget/DeviceRTL/CMakeLists.txt

(103 preceding lines not shown; hunk context: set(src_files)
 ${source_directory}/Reduction.cpp
 ${source_directory}/State.cpp
 ${source_directory}/Synchronization.cpp
 ${source_directory}/Tasking.cpp
 ${source_directory}/Utils.cpp
 ${source_directory}/Workshare.cpp
 )
 
-set(clang_opt_flags -O3 -mllvm -openmp-opt-disable -DSHARED_SCRATCHPAD_SIZE=512)
-set(link_opt_flags -O3 -openmp-opt-disable -attributor-enable=module)
+# We disable the slp vectorizer during the runtime optimization to avoid
+# vectorized accesses to the shared state. Generally, those are "good" but
+# the optimizer pipeline (esp. Attributor) does not fully support vectorized
+# instructions yet and we end up missing out on way more important constant
+# propagation. That said, we will run the vectorizer again after the runtime
+# has been linked into the user program.
+set(clang_opt_flags -O3 -mllvm -openmp-opt-disable -DSHARED_SCRATCHPAD_SIZE=512 -mllvm -vectorize-slp=false )
+set(link_opt_flags -O3 -openmp-opt-disable -attributor-enable=module -vectorize-slp=false )
 set(link_export_flag -passes=internalize -internalize-public-api-file=${source_directory}/exports)
 
 # Prepend -I to each list element
 set (LIBOMPTARGET_LLVM_INCLUDE_DIRS_DEVICERTL "${LIBOMPTARGET_LLVM_INCLUDE_DIRS}")
 list(TRANSFORM LIBOMPTARGET_LLVM_INCLUDE_DIRS_DEVICERTL PREPEND "-I")
 
 # Set flags for LLVM Bitcode compilation.
 set(bc_flags -c -foffload-lto -std=c++17 -fvisibility=hidden
(177 following lines not shown)
