As pointed out in D70913, fpexcept.strict nodes must not be optimized away even if their result is unused. To ensure that, we need to chain them into the block's terminator nodes, as is already done for PendingExports.
This patch adds two new lists of pending chains, PendingConstrainedFP and PendingConstrainedFPStrict, to hold constrained FP intrinsic nodes without and with fpexcept.strict markers, respectively. This not only solves the above problem, but also relaxes chains a bit further, because FP nodes no longer all have to be flushed before a store or other memory access. (They are still flushed before nodes with other side effects.)
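To make the intended flushing rules concrete, here is a rough toy model of the scheme described above. It is only a sketch: strings stand in for DAG nodes, std::vector stands in for the real list type, and ToyChainBuilder together with its flushFor* helpers are hypothetical names for illustration, not the builder's actual entry points.

```cpp
#include <string>
#include <vector>

// Toy model only: node names stand in for DAG nodes, and the flushFor*
// helpers are hypothetical names, not the real builder interface.
struct ToyChainBuilder {
  std::vector<std::string> PendingLoads;
  std::vector<std::string> PendingConstrainedFP;       // no fpexcept.strict
  std::vector<std::string> PendingConstrainedFPStrict; // fpexcept.strict
  std::vector<std::string> Chained; // nodes already tied into the root chain

  // Queue a constrained FP node on the list matching its exception mode.
  void addConstrainedFP(const std::string &Node, bool ExceptStrict) {
    auto &List =
        ExceptStrict ? PendingConstrainedFPStrict : PendingConstrainedFP;
    List.push_back(Node);
  }

  // Before a store or other memory access: only pending loads are flushed,
  // so both FP lists stay pending.
  void flushForMemoryAccess() { chainAndClear(PendingLoads); }

  // Before a node with other side effects: flush both FP lists as well.
  void flushForSideEffect() {
    chainAndClear(PendingLoads);
    chainAndClear(PendingConstrainedFP);
    chainAndClear(PendingConstrainedFPStrict);
  }

  // At the block terminator: strict nodes are chained in even if their
  // results are unused, so they cannot be dropped as dead.
  void flushForTerminator() { chainAndClear(PendingConstrainedFPStrict); }

private:
  // Append everything in From to the chain and empty From.
  void chainAndClear(std::vector<std::string> &From) {
    Chained.insert(Chained.end(), From.begin(), From.end());
    From.clear();
  }
};
```

In this model a memory access leaves both FP lists untouched, the terminator picks up only the strict list, and a side-effecting node drains everything.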
Note that this patch currently introduces test case failures:
LLVM :: CodeGen/X86/vec-strict-cmp-128.ll
LLVM :: CodeGen/X86/vec-strict-cmp-256.ll
LLVM :: CodeGen/X86/vec-strict-cmp-512.ll
LLVM :: CodeGen/X86/vec-strict-fptoint-128.ll
LLVM :: CodeGen/X86/vec-strict-fptoint-256.ll
LLVM :: CodeGen/X86/vec-strict-fptoint-512.ll
These appear to be caused by problems handling the outgoing chains of some strictfp nodes in the X86 back-end; the problems went unnoticed before this patch because, in those simple tests, the outgoing chains were otherwise unused.
It may be that this is actually the same problem that D72224 is meant to address; @craig.topper, maybe you can have a look?
Can we just use PendingExports.append here? I think it will take care of the reserve, and I think even insert would probably do so. It might make sense for the code in getRoot() too, unless you're concerned about 2 reserves in 2 separate append calls.
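Roughly what I have in mind, as a sketch only: std::vector stands in for the pending lists here, and appendAndClear / appendBothAndClear are hypothetical helper names, not existing functions. A range insert grows the destination on its own, so the explicit reserve is only worth keeping where two appends land on the same list and a second reallocation is a concern.

```cpp
#include <vector>

// Hypothetical helper; std::vector stands in for the pending lists.
// Range insert grows Dest as needed, so no reserve() call is required.
template <typename T>
void appendAndClear(std::vector<T> &Dest, std::vector<T> &Src) {
  Dest.insert(Dest.end(), Src.begin(), Src.end());
  Src.clear();
}

// Variant for two sources appended to one destination: a single up-front
// reserve() avoids a possible second reallocation between the two appends.
template <typename T>
void appendBothAndClear(std::vector<T> &Dest, std::vector<T> &A,
                        std::vector<T> &B) {
  Dest.reserve(Dest.size() + A.size() + B.size());
  appendAndClear(Dest, A);
  appendAndClear(Dest, B);
}
```

The first form would be enough for the single append into PendingExports; the second mirrors the two-list case in getRoot().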