Create more specific memory locations for masked load/store intrinsics.
Recognize these intrinsics in memory dependence analysis.
Differential D87061
Handle masked loads and stores in MemoryLocation/Dependence
Authored by kparzysz on Sep 2 2020, 3:29 PM.
Event Timeline

Testcase? It should be possible to show that GVN is more precise in cases involving masked load/store.
Can you pre-commit the test so we can see the changes. Make it an NFC change w/o review.

I think this introduces the following crash when running opt -gvn on the IR below:

    Assertion failed: (isa<X>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file /llvm-project/llvm/include/llvm/Support/Casting.h, line 269.
    PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
    Stack dump:
    0. Program arguments: bin/opt -gvn bugpoint-reduced-simplified.ll
    1. Running pass 'Function Pass Manager' on module 'bugpoint-reduced-simplified.ll'.
    2. Running pass 'Global Value Numbering' on function '@test'
    0 opt 0x000000010d35b688 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 40
    1 opt 0x000000010d35a438 llvm::sys::RunSignalHandlers() + 248
    2 opt 0x000000010d35bc76 SignalHandler(int) + 262
    3 libsystem_platform.dylib 0x00007fff6d8fd5fd _sigtramp + 29
    4 libsystem_platform.dylib 000000000000000000 _sigtramp + 18446603338678020640
    5 libsystem_c.dylib 0x00007fff6d7d3808 abort + 120
    6 libsystem_c.dylib 0x00007fff6d7d2ac6 err + 0
    7 opt 0x000000010e6da773 llvm::GVN::ValueTable::lookupOrAddCall(llvm::CallInst*) (.cold.7) + 35
    8 opt 0x000000010d031435 llvm::GVN::ValueTable::lookupOrAddCall(llvm::CallInst*) + 3381
    9 opt 0x000000010d02f3c0 llvm::GVN::ValueTable::lookupOrAdd(llvm::Value*) + 896
    10 opt 0x000000010d03aeb7 llvm::GVN::processInstruction(llvm::Instruction*) + 695
    11 opt 0x000000010d03c044 llvm::GVN::processBlock(llvm::BasicBlock*) + 564
    12 opt 0x000000010d03b750 llvm::GVN::iterateOnFunction(llvm::Function&) + 112
    13 opt 0x000000010d0329ab llvm::GVN::runImpl(llvm::Function&, llvm::AssumptionC

    @file_mask = external global [8 x i64], align 32

    define dso_local fastcc void @test() unnamed_addr #0 {
    entry:
      %wide.masked.load.1.i = tail call <4 x i64> @llvm.masked.load.v4i64.p0v4i64(<4 x i64>* nonnull bitcast (i64* getelementptr inbounds ([8 x i64], [8 x i64]* @file_mask, i64 0, i64 7) to <4 x i64>*), i32 8, <4 x i1> <i1 true, i1 false, i1 false, i1 false>, <4 x i64> undef) #2
      %.pre392.i = load i64, i64* getelementptr inbounds ([8 x i64], [8 x i64]* @file_mask, i64 0, i64 7), align 8
      %or156.4.i = or i64 %.pre392.i, undef
      %wide.masked.load614.1.i = tail call <4 x i64> @llvm.masked.load.v4i64.p0v4i64(<4 x i64>* nonnull bitcast (i64* getelementptr inbounds ([8 x i64], [8 x i64]* @file_mask, i64 0, i64 7) to <4 x i64>*), i32 8, <4 x i1> <i1 true, i1 false, i1 false, i1 false>, <4 x i64> undef) #2
      unreachable
    }

    ; Function Attrs: argmemonly nounwind readonly willreturn
    declare <4 x i64> @llvm.masked.load.v4i64.p0v4i64(<4 x i64>*, i32 immarg, <4 x i1>, <4 x i64>)
What benefit are you getting here that's better than just falling through to the call to getModRefInfo()?
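
For readers following the thread, here is a rough, self-contained sketch of why a bounded memory location for a masked access can be more useful to a dependence query than an access of unknown extent. It is not the patch's code and does not use any LLVM API: `ToyLocation`, `toyMaskedLocation`, and `toyMayOverlap` are hypothetical names invented for this illustration, and the upper-bound rule (number of lanes times element size) is an assumption about how such a location could be sized. The numbers mirror the reproducer above: element 7 of an `[8 x i64]` array sits at byte offset 56, and a `<4 x i64>` access spans at most 32 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy stand-in for a memory location within one object: a byte offset plus
// an optional upper bound on the number of bytes accessed. std::nullopt
// models an access about which nothing is known.
// Hypothetical type for illustration; this is NOT llvm::MemoryLocation.
struct ToyLocation {
  std::uint64_t Offset;
  std::optional<std::uint64_t> UpperBound;
};

// Assumed sizing rule: a masked load/store of NumElts lanes of EltBytes each
// touches at most NumElts * EltBytes bytes starting at Offset, whichever
// lanes the mask happens to enable.
ToyLocation toyMaskedLocation(std::uint64_t Offset, std::uint64_t NumElts,
                              std::uint64_t EltBytes) {
  assert(NumElts > 0 && EltBytes > 0 && "expected a non-empty vector type");
  return {Offset, NumElts * EltBytes};
}

// Conservative overlap test: answers "may overlap" unless both extents are
// bounded and the two byte ranges are provably disjoint.
bool toyMayOverlap(const ToyLocation &A, const ToyLocation &B) {
  if (!A.UpperBound || !B.UpperBound)
    return true; // Unknown extent: cannot rule anything out.
  return A.Offset < B.Offset + *B.UpperBound &&
         B.Offset < A.Offset + *A.UpperBound;
}
```

With these definitions, `toyMaskedLocation(56, 4, 8)` may overlap an 8-byte scalar load at offset 56, but is provably disjoint from an 8-byte access at offset 0. If the masked access is modeled with an unknown extent instead, the second query also has to answer "may overlap". Whether that extra precision justifies a dedicated code path over the generic call handling is exactly the question raised above.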