This is an archive of the discontinued LLVM Phabricator instance.

[X86] Mutate fceil/ffloor/ftrunc/fnearbyint/frint into X86ISD::RNDSCALE during PreProcessIselDAG to cut down on pattern permutations
ClosedPublic

Authored by craig.topper on May 31 2019, 3:34 PM.

Details

Summary

We already need patterns for X86ISD::RNDSCALE to support software intrinsics, but we currently also have 5 sets of patterns for the 5 rounding operations. For each of these 6 pattern sets we have to support 3 vector widths, 2 element sizes, SSE/VEX/EVEX encodings, load folding, and broadcast load folding. This results in a fair amount of bytes in the isel table.

This patch adds code to PreProcessIselDAG to morph fceil/ffloor/ftrunc/fnearbyint/frint into X86ISD::RNDSCALE. This way we can remove everything but the intrinsic patterns while still allowing the operations to be considered Legal for DAGCombine and Legalization. This shrinks the DAGISel by somewhere between 9K and 10K.
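The morphing step described above can be sketched as a toy model. This is not the code from the patch: `Opcode`, `Node`, `roundingImmediate`, and `morphToRndScale` are hypothetical names invented for illustration, and the immediates are an assumption based on the ROUNDPS/VRNDSCALE imm8 layout (bits 1:0 select the rounding mode, bit 2 defers to MXCSR, bit 3 suppresses the precision exception).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for the generic opcodes named in the summary,
// plus one unrelated opcode (FAdd) and the target node (RndScale).
enum class Opcode { FCeil, FFloor, FTrunc, FNearbyInt, FRint, FAdd, RndScale };

// Toy node: an opcode and an optional 8-bit rounding-control immediate.
struct Node {
  Opcode Op;
  std::optional<std::uint8_t> Imm;
};

// Assumed imm8 layout, following the ROUNDPS encoding.
constexpr std::uint8_t RoundDown = 0x1;   // toward negative infinity
constexpr std::uint8_t RoundUp = 0x2;     // toward positive infinity
constexpr std::uint8_t RoundToZero = 0x3; // truncate
constexpr std::uint8_t UseMXCSR = 0x4;    // use the current rounding mode
constexpr std::uint8_t NoInexact = 0x8;   // suppress the precision exception

constexpr std::uint8_t combine(std::uint8_t Mode, std::uint8_t Flags) {
  return static_cast<std::uint8_t>(Mode | Flags);
}

// Returns the immediate for a rounding opcode, or nullopt for anything else.
std::optional<std::uint8_t> roundingImmediate(Opcode Op) {
  switch (Op) {
  case Opcode::FCeil:      return combine(RoundUp, NoInexact);     // 0xA
  case Opcode::FFloor:     return combine(RoundDown, NoInexact);   // 0x9
  case Opcode::FTrunc:     return combine(RoundToZero, NoInexact); // 0xB
  case Opcode::FNearbyInt: return combine(UseMXCSR, NoInexact);    // 0xC
  case Opcode::FRint:      return UseMXCSR;                        // 0x4
  default:                 return std::nullopt;
  }
}

// Rewrites a rounding node in place so that only RndScale patterns are
// needed afterwards. Returns true if the node was changed.
bool morphToRndScale(Node &N) {
  std::optional<std::uint8_t> Imm = roundingImmediate(N.Op);
  if (!Imm)
    return false;
  assert(!N.Imm && "generic rounding nodes carry no immediate in this model");
  N.Op = Opcode::RndScale;
  N.Imm = Imm;
  return true;
}
```

In this model the five operations collapse onto a single opcode before pattern matching, which is why only one set of patterns has to cover the vector widths, element sizes, encodings, and load-folding variants.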

There is one complication: the STRICT versions of these nodes are currently mutated to their non-strict equivalents at isel time when the node is visited. This won't be true in the future, since that loses the chain ordering information. For now I've also added support for the non-STRICT nodes to Select so we can change the STRICT versions there after they've been mutated to their non-STRICT versions. We'll probably need a STRICT version of RNDSCALE or something similar to handle this in the future, which will take us back to needing 2 sets of patterns for strict and non-strict, but that's still better than the 11 or 12 sets of patterns we'd need otherwise.

We can probably do something similar for scalar, but I haven't looked at it yet.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. May 31 2019, 3:34 PM
Herald added a project: Restricted Project. May 31 2019, 3:34 PM
Herald added a subscriber: hiraditya.

I've no objection to this - what are you intending to do with the scalar versions? And where is the select code for the strict versions?

I think for scalar we would need to add patterns for a scalar X86ISD::RNDSCALE in addition to the X86ISD::RNDSCALES we use for intrinsics. Maybe we could use extractelement and movss/movsd matching to remove RNDSCALES?

StrictFP nodes don't currently have proper selection code. Just before calling Select, SelectionDAGISel mutates them to remove the chain input/output, which was a temporary solution. That mutation will ultimately be removed.

RKSimon accepted this revision. Jun 4 2019, 5:47 AM

LGTM

This revision is now accepted and ready to land. Jun 4 2019, 5:47 AM
This revision was automatically updated to reflect the committed changes.