
[X86] Teach X86FloatingPoint's handleCall to only erase the FP stack if there is a regmask operand that clobbers the FP stack.
ClosedPublic

Authored by craig.topper on Jul 10 2021, 11:59 AM.

Details

Summary

There are some calls to functions like __alloca that are missing
a regmask operand. Lack of a regmask operand means that all
registers that aren't mentioned by def operands are preserved.
__alloca only updates EAX and ESP and has def operands for
them, so this is ok. Because there is no regmask, the register
allocator won't spill the FP registers across the call. Assuming
we want to keep the FP stack untouched across these calls, we
need to handle this in the FP stackifier.

We might want to add a proper regmask operand to the code that
creates these calls to indicate all registers are preserved, but we'd
still need this change to the FP stackifier to know to preserve the
FP stack for such a regmask.
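
As a rough illustration of the rule described above, the following self-contained C++ sketch models the decision. It does not use LLVM's headers: the names `CallModel`, `maskClobbers`, `callClobbersFPStack`, `liveSlotsAfterCall`, and the register index `kFP0` are hypothetical stand-ins, and the handling of FP return values is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical register numbering for this sketch only. Real register
// numbers come from the X86 target description.
constexpr unsigned kFP0 = 8; // FP0..FP7 occupy indices 8..15 here

// A register mask has one bit per physical register.
//   bit set   => the register is preserved across the call
//   bit clear => the register is clobbered by the call
bool maskClobbers(const std::vector<uint32_t> &mask, unsigned reg) {
  assert(reg / 32 < mask.size() && "register index outside the mask");
  return (mask[reg / 32] & (1u << (reg % 32))) == 0;
}

// Simplified model of a call: it may or may not carry a regmask operand.
struct CallModel {
  bool hasRegMask = false;
  std::vector<uint32_t> regMask;
};

// With no regmask operand, every register not named by a def operand is
// preserved, so the FP stack is treated as untouched. With a regmask,
// the FP stack is clobbered only if the mask says FP0 is clobbered.
bool callClobbersFPStack(const CallModel &call) {
  if (!call.hasRegMask)
    return false;
  return maskClobbers(call.regMask, kFP0);
}

// Number of FP stack slots still live after the call in this model.
unsigned liveSlotsAfterCall(const CallModel &call, unsigned liveBefore) {
  return callClobbersFPStack(call) ? 0u : liveBefore;
}
```

In this model a call without a regmask (the __alloca case) leaves all live slots in place, while a call whose mask has the FP bits cleared empties the modeled stack.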

The test is kind of long, but bugpoint wasn't able to reduce it
any further.

Fixes PR50782


Event Timeline

craig.topper created this revision. Jul 10 2021, 11:59 AM
craig.topper requested review of this revision. Jul 10 2021, 11:59 AM

__alloca only updates EAX and ESP so this isn't wrong.

"is" or "isn't"?

llvm/lib/Target/X86/X86FloatingPoint.cpp
988–989

I found that we don't clobber the FP stack by default, e.g. for void foo(void). The interesting thing is that even if we are passing registers to/from foo, e.g. long double foo(void) or long double foo(long double), the regmask bits for the FP registers are still 0.
Did I misunderstand something here?

995

Why can't the callee only clobber one or a few registers?

craig.topper edited the summary of this revision. Jul 10 2021, 7:24 PM
craig.topper edited the summary of this revision.
craig.topper edited the summary of this revision. Jul 10 2021, 7:28 PM

__alloca only updates EAX and ESP so this isn't wrong.

"is" or "isn't"?

"isn't wrong" was correct, but I've rewritten it to hopefully be more readable.

llvm/lib/Target/X86/X86FloatingPoint.cpp
988–989

The regmask bits indicate which registers aren't clobbered. So a 0 means clobbered.
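
To make the polarity concrete, here is a small standalone C++ sketch, not LLVM code. The helpers `makePreservedMask` and `isClobbered` and the register count `kNumRegs` are hypothetical. The mask is built from the set of preserved registers, so any register that is not listed ends up with a 0 bit and reads as clobbered.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

constexpr unsigned kNumRegs = 64; // hypothetical register count

// Build a mask from the registers the callee preserves. Every register
// that is not listed keeps a 0 bit.
std::vector<uint32_t> makePreservedMask(std::initializer_list<unsigned> preserved) {
  std::vector<uint32_t> mask((kNumRegs + 31) / 32, 0u);
  for (unsigned reg : preserved) {
    assert(reg < kNumRegs && "register index out of range");
    mask[reg / 32] |= 1u << (reg % 32);
  }
  return mask;
}

// A 0 bit means the call clobbers the register.
bool isClobbered(const std::vector<uint32_t> &mask, unsigned reg) {
  assert(reg < kNumRegs && "register index out of range");
  return (mask[reg / 32] & (1u << (reg % 32))) == 0;
}
```

With this encoding, a register that never appears in the preserved set, as is the case for the FP registers in the question above, shows up as 0 and therefore as clobbered.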

995

The later code that calls popReg until the stack is empty would need to be updated in a non-trivial way to handle that.
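
One way to picture the all-or-nothing assumption is the standalone C++ sketch below. It classifies a mask by how many of the eight FP registers it clobbers. The enum `FPClobber`, the function `classifyFPClobber`, and the indices `kFP0` and `kNumFP` are hypothetical and only serve this illustration. A pass that empties the whole stack can act on `None` and `All`; `Partial` is the case that would need the non-trivial update mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical numbering: FP0..FP7 occupy indices 8..15 in this sketch.
constexpr unsigned kFP0 = 8;
constexpr unsigned kNumFP = 8;

enum class FPClobber { None, All, Partial };

// Count how many FP registers have a 0 (clobbered) bit in the mask.
FPClobber classifyFPClobber(const std::vector<uint32_t> &mask) {
  unsigned clobbered = 0;
  for (unsigned i = 0; i < kNumFP; ++i) {
    unsigned reg = kFP0 + i;
    assert(reg / 32 < mask.size() && "register index outside the mask");
    if ((mask[reg / 32] & (1u << (reg % 32))) == 0)
      ++clobbered;
  }
  if (clobbered == 0)
    return FPClobber::None;
  if (clobbered == kNumFP)
    return FPClobber::All;
  return FPClobber::Partial;
}
```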

craig.topper added inline comments. Jul 10 2021, 7:33 PM
llvm/lib/Target/X86/X86FloatingPoint.cpp
988–989

It's created from the callee saved register mask.

pengfei added inline comments. Jul 10 2021, 7:59 PM
llvm/lib/Target/X86/X86FloatingPoint.cpp
988–989

Yes, I see. They are caller-saved registers.
By the way, does Clang/LLVM currently have a mechanism to label functions, such as known library functions, that won't clobber some caller-saved registers?
We have the same requirement for AMX registers, but haven't handled it yet.

This revision is now accepted and ready to land. Jul 10 2021, 8:01 PM
post.kadirselcuk commandeered this revision. Jul 10 2021, 8:06 PM
post.kadirselcuk added child revisions: a long list of revisions from across the LLVM project (numbered between D105677 and D105769).
post.kadirselcuk edited reviewers, added: craig.topper; removed: post.kadirselcuk.
This revision now requires review to proceed. Jul 10 2021, 8:08 PM
post.kadirselcuk planned changes to this revision. Jul 10 2021, 8:20 PM
post.kadirselcuk added commits: a long list of commits from across the LLVM project.
post.kadirselcuk marked an inline comment as done.
post.kadirselcuk added inline comments.
llvm/lib/Target/X86/X86FloatingPoint.cpp
995

finish?

craig.topper commandeered this revision. Jul 10 2021, 9:46 PM
craig.topper requested review of this revision.
craig.topper edited reviewers, added: post.kadirselcuk; removed: craig.topper.
craig.topper removed child revisions: a long list of revisions from across the LLVM project (numbered between D105677 and D105769), plus D34362: [LNT] Support for different DataSet usage in Polybench for "lnt runtest nt".
craig.topper removed commits: a long list of commits from across the LLVM project.
pengfei accepted this revision. Jul 12 2021, 7:31 AM

LGTM if the failed tests are not related to this patch.

This revision is now accepted and ready to land. Jul 12 2021, 7:31 AM
This revision was landed with ongoing or failed builds. Jul 12 2021, 10:16 AM
This revision was automatically updated to reflect the committed changes.