Page MenuHomePhabricator

[X86] VRNDSCALE* folding from masked and scalar ffloor and fceil patterns

Authored by mike.dvoretsky on Apr 3 2018, 2:13 AM.



This patch handles back-end folding of generic patterns created by D48067 in cases where the instruction isn't a straightforward packed values rounding operation, but a masked operation or a scalar operation.

Diff Detail


Event Timeline

mike.dvoretsky created this revision.Apr 3 2018, 2:13 AM
mike.dvoretsky edited the summary of this revision. (Show Details)

Added tests for folding. Also made changes to enable the new version of D45202.

craig.topper added inline comments.Apr 4 2018, 9:57 PM
30511 ↗(On Diff #140973)

Can we just do this with isel patterns like we do for ADDSS?

873 ↗(On Diff #140973)

Can you generate %k from a compare instruction rather than passing in a X x i1 type. It will make the code a little cleaner since we won't have to extend and split the mask in such crazy ways.

mike.dvoretsky added inline comments.Apr 5 2018, 7:09 AM
30511 ↗(On Diff #140973)

I've considered that, but decided to fold it here. To do it in .td patterns we'd need to add 4 new patterns in 2 separate files. 32 and 64 bit patterns would need to be added for VROUNDS* on AVX and ROUNDS* on SSE4.1. Writing this pattern here both makes it easier to track and produces less check complexity.

craig.topper added inline comments.Apr 5 2018, 9:55 AM
30505 ↗(On Diff #140973)

There's a signed vs unsigned comparison warning on this line.

Changed the folding to use isel patterns from D47012.

craig.topper added inline comments.Jun 3 2018, 3:20 PM
8606 ↗(On Diff #149464)

Do we have test cases covering this pattern? I can't find any zero extend instructions

mike.dvoretsky marked an inline comment as done.

Added zero extension of mask to i32 in the masked scalar tests and added more ways to represent the mask, testing the 8-bit mask pattern among others. 16-bit mask patterns removed due to scalar_to_vector errors.

craig.topper added inline comments.Jun 4 2018, 10:15 AM
8605 ↗(On Diff #149716)

Why HasVLX? Shouldn't scalar instructions be valid under HasAVX512?

9777 ↗(On Diff #149716)

What about zero masking?

mike.dvoretsky marked 2 inline comments as done.

Corrected the scalar pattern predicates, added packed zero-masked instruction patterns and tests to cover zero-masking. Changed the RUN line of vec_floor.ll to give different results for AVX512F and AVX512VL where needed (e.g. in 128- and 256-bit masked operations).

Added tests for floor intrinsics and masked scalar double patterns to cover all introduced isel patterns.

This revision is now accepted and ready to land.Jun 8 2018, 9:28 AM
mike.dvoretsky retitled this revision from [X86] VRNDSCALE* folding from masked and single-value ffloor and fceil patterns to [X86] VRNDSCALE* folding from masked and scalar ffloor and fceil patterns.
mike.dvoretsky edited the summary of this revision. (Show Details)

Updated to support D48067 instead of D45202. Removed changes since the lowering is now in InstCombine, changed the masked scalar pattern tests to exclude mask extension that was in the previous versions.

mike.dvoretsky requested review of this revision.Jun 15 2018, 1:36 AM

Updated the AVX512 isel pattern to account for the instruction record name change.

This revision is now accepted and ready to land.Jun 15 2018, 4:24 PM
This revision was automatically updated to reflect the committed changes.