We are having a hard time optimizing some vectorized loads/stores later
on, which causes this optimization to degrade performance.
Event Timeline
This pass will get run eventually when it's linked into the user code, so this should be fine.
openmp/libomptarget/DeviceRTL/CMakeLists.txt | ||
---|---|---|
112 | This new flag's presence is a little esoteric, maybe add a comment saying why it's there. |
openmp/libomptarget/DeviceRTL/CMakeLists.txt | ||
---|---|---|
112 | Will add: |

    # We disable the SLP vectorizer during the runtime optimization to avoid
    # vectorized accesses to the shared state. Generally, those are "good" but
    # the optimizer pipeline (esp. Attributor) does not fully support vectorized
    # instructions yet and we end up missing out on way more important constant
    # propagation. That said, we will run the vectorizer again during LTO.
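
For readers without the diff at hand, here is a rough sketch of how the commented flag might sit in the CMake file. The thread does not show the changed line itself, so both the variable name `devicertl_opt_flags` and the flag spelling `-vectorize-slp=false` are assumptions for illustration; the real file may differ.

```cmake
# Sketch only: `devicertl_opt_flags` is a hypothetical variable name and
# `-vectorize-slp=false` is an assumed spelling of the flag under review.

# We disable the SLP vectorizer during the runtime optimization to avoid
# vectorized accesses to the shared state. The vectorizer runs again
# during LTO, once the runtime is linked into the user code.
set(devicertl_opt_flags -O3 -vectorize-slp=false)

# Print the flag list so the sketch can be checked with `cmake -P`.
message(STATUS "opt flags: ${devicertl_opt_flags}")
```

Keeping the explanation directly above the flag means the rationale travels with the line if it is later moved or the flag list is refactored.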