This is an archive of the discontinued LLVM Phabricator instance.

[VectorCombine] Don't vectorize scalar load under asan/hwasan/memtag/tsan
ClosedPublic

Authored by MaskRay on Sep 11 2020, 12:00 PM.

Details

Summary

Similar to the tsan suppression in
Utils/VNCoercion.cpp:getLoadLoadClobberFullWidthSize (rL175034; load widening used by GVN),
the D81766 optimization should be suppressed under tsan due to potential
spurious data race reports:

struct A {
  int i;
  const short s; // the load cannot be vectorized because
  int modify;    // it overlaps with bytes being concurrently modified
  long pad1, pad2;
};
// __tsan_read16 does not know that some bytes are undef and accessing is safe

Similarly, under asan, users can mark memory regions with
__asan_poison_memory_region. A widened load can lead to a spurious
use-after-poison error. hwasan/memtag should be similarly suppressed.

mustSuppressSpeculation suppresses asan/hwasan/tsan but not memtag, so
we need to exclude memtag in vectorizeLoadInsert.

Note, memtag suppression can be relaxed if the load is aligned to the
its granule (usually 16), but that is out of scope of this patch.

Diff Detail

Event Timeline

MaskRay created this revision.Sep 11 2020, 12:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 11 2020, 12:00 PM
MaskRay requested review of this revision.Sep 11 2020, 12:00 PM
MaskRay edited the summary of this revision. (Show Details)Sep 11 2020, 12:46 PM

Making general combines dependent on whether we're running sanitizers doesn't sound like a great idea - won't the impact on codegen make it trickier for sanitizers to correctly assist with identifying specific issues?

MaskRay added a comment.EditedSep 11 2020, 12:59 PM

Making general combines dependent on whether we're running sanitizers doesn't sound like a great idea - won't the impact on codegen make it trickier for sanitizers to correctly assist with identifying specific issues?

Searching for Attribute::SanitizeThread can reveal a few other places where certain optimizations are disabled.
For example, in Utils/VNCoercion.cpp getLoadLoadClobberFullWidthSize (load widening used by GVN), there is a very similar pattern (rL175034; the test became stale; the large chunk of code in analyzeLoadFromClobberingLoad is untested now...).

MaskRay updated this revision to Diff 291341.EditedSep 11 2020, 2:50 PM
MaskRay retitled this revision from [VectorCombine] Don't vectorize scalar load under tsan to [VectorCombine] Don't vectorize scalar load under asan/tsan.
MaskRay edited the summary of this revision. (Show Details)

suppress asan as well

Regarding msan, I guess it is safe, because traps are only caused on control-flow dependent instructions. The IR passes know that the widened load has undef elements...

MaskRay updated this revision to Diff 291343.Sep 11 2020, 2:51 PM

Fix CanWiden

MaskRay edited the summary of this revision. (Show Details)Sep 11 2020, 2:52 PM
Harbormaster completed remote builds in B71435: Diff 291343.
MaskRay added a reviewer: kcc.Sep 11 2020, 3:31 PM
MaskRay edited the summary of this revision. (Show Details)
vitalybuka added inline comments.Sep 11 2020, 3:58 PM
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
152

Shouldn't we disable only this case?

157

If we have the mask maybe we can use it for proper check on Asan side?

698

It should apply to SanitizeHWAddress and SanitizeMemTag as well

MaskRay updated this revision to Diff 291356.Sep 11 2020, 4:30 PM
MaskRay retitled this revision from [VectorCombine] Don't vectorize scalar load under asan/tsan to [VectorCombine] Don't vectorize scalar load under asan/hwasan/memtag/tsan.
MaskRay edited the summary of this revision. (Show Details)

Suppress hwasan/memtag

MaskRay marked 2 inline comments as done.Sep 11 2020, 4:33 PM
MaskRay added inline comments.
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
152

This branch performs (1) load widening followed by (2) widening/narrowing the result with shufflevector. Due to the first step (load widening), the whole optimization (D86160) is unsafe.

Added a specific test for it.

698

Thanks! (I know little about hwasan/memtag. I will learn about them)

MaskRay marked 2 inline comments as done.Sep 11 2020, 4:35 PM
MaskRay added inline comments.
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
157

Yeah, if the information can be conveyed to the asan side then it can probably be retained. (But the optimization may not be worth the additional efforts...)

eugenis added inline comments.Sep 14 2020, 11:52 AM
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
157

Right, I wonder if we could come up with some sort of metadata to promise that only the first N bytes (or first N vector elements) of the loaded value will be used. ASan can use it to relax the check, and MSan can validate it.

spatel added a comment.EditedSep 14 2020, 12:09 PM

I have no experience with the sanitizer requirements, but IIUC we can't accurately sanitize any code that uses this transform with this restriction.
If that's what is happening/intended in other passes, I guess there's no other way around it.
But if it is happening in other passes, shouldn't we use some standard function to bail out? For example llvm::mustSuppressSpeculation()?

MaskRay updated this revision to Diff 291647.Sep 14 2020, 12:23 PM
MaskRay edited the summary of this revision. (Show Details)

Use mustSuppressSpeculation

eugenis added inline comments.Sep 14 2020, 12:28 PM
llvm/lib/Analysis/ValueTracking.cpp
4422 ↗(On Diff #291647)

MemTag allows extending memory access up to the next 16-byte granule. Memtag is a production thing, so it's important that we do not lose performance there when possible.

spatel added inline comments.Sep 14 2020, 1:07 PM
llvm/lib/Analysis/ValueTracking.cpp
4414 ↗(On Diff #291647)

Definitely get a 2nd reviewer opinion if we are making changes here (I don't know enough about this)...
But if I'm seeing it correctly, isSimple() is not a superset of isUnordered(). Ie, a load can have AtomicOrdering::Unordered but still be simple?

MaskRay added inline comments.Sep 14 2020, 1:53 PM
llvm/lib/Analysis/ValueTracking.cpp
4414 ↗(On Diff #291647)

isSimple is a subset of isUnordered. An AtomicOrdering::Unordered load is isUnordered but not isSimple.

4422 ↗(On Diff #291647)

Do you suggest that we should remove SanitizeMemTag here and exclude SanitizeMemTag from VectorCombine.cpp instead?

(VectorCombine.cpp can check alignment and allow SanitizeMemTag as an optimization but that will take more code that I don't want to take risk.

I am posting the patch to fix a tsan spurious report.
)

vitalybuka accepted this revision.Sep 14 2020, 2:59 PM
vitalybuka added inline comments.
llvm/lib/Analysis/ValueTracking.cpp
4414 ↗(On Diff #291647)

Should this be a separate patch?

This revision is now accepted and ready to land.Sep 14 2020, 2:59 PM
vitalybuka requested changes to this revision.Sep 14 2020, 3:00 PM

Actually we need decide on SanitizeMemTag. I don't have answer to your question.

This revision now requires changes to proceed.Sep 14 2020, 3:00 PM
MaskRay updated this revision to Diff 291701.Sep 14 2020, 3:12 PM
MaskRay edited the summary of this revision. (Show Details)

Revert changes from ValueTracking.cpp

OK, I want to play safe and have removed ValueTracking.cpp changes. Added the following notes to the description:

mustSuppressSpeculation suppresses asan/hwasan/tsan but not memtag, so we need to exclude memtag in vectorizeLoadInsert.

Note, memtag suppression can be relaxed if the load is aligned to the its granule (usually 16), but that is out of scope of this patch.

vitalybuka accepted this revision.Sep 14 2020, 4:31 PM

If I understand @eugenis comment, for memtag it's better to allow loads inside of 16byte granules.
But I guess patch as-is is already improvement and 16byte fix can be a followup optimization.

This revision is now accepted and ready to land.Sep 14 2020, 4:31 PM

If I understand @eugenis comment, for memtag it's better to allow loads inside of 16byte granules.
But I guess patch as-is is already improvement and 16byte fix can be a followup optimization.

Thanks! Yeah, memtag performance improvement can be done as a follow-up. The isUnordered() condition in mustSuppressSpeculation may need some thoughts (I left a comment in D66688)

I'd like an approval from @spatel or @RKSimon

spatel added inline comments.Sep 15 2020, 4:35 AM
llvm/test/Transforms/VectorCombine/X86/load.ll
459

Does this test a different code path than the first added test (gep10_load_i16_insert_v8i16_asan)?

MaskRay added inline comments.Sep 15 2020, 9:01 AM
llvm/test/Transforms/VectorCombine/X86/load.ll
459

It is for D86160. If you think it is unneeded I can delete it.

Do you have more concerns?

spatel accepted this revision.Sep 15 2020, 9:05 AM
spatel added inline comments.
llvm/test/Transforms/VectorCombine/X86/load.ll
459

Nope - let's keep it to verify that we are fully disabling the load transforms for the sanitizer cases.