Page MenuHomePhabricator

[X86] Lowering X86 avx512 sqrt intrinsics to IR
ClosedPublic

Authored by tkrupa on Dec 13 2017, 3:50 AM.

Details

Summary

seperates the non-round version from the round version of sqrt builtins
and catching them in CGBuiltin.cpp to replace builtin with IR.

there are two types of intrinsics here, packed and scalar, for the scalar instruction there is an unnecessary move intruction after
which includes the masking.
should be taken cared of in a different patch.

this patch goes together with another patch on the llvm side.

Diff Detail

Repository
rL LLVM

Event Timeline

uriel.k created this revision.Dec 13 2017, 3:50 AM
craig.topper added inline comments.Dec 13 2017, 1:10 PM
lib/CodeGen/CGBuiltin.cpp
8145 ↗(On Diff #126719)

I would suggest just using __builtin_i32_sqrtpd512_mask here and select different intrinsics based on the value of the rounding mode operand. The user should get the same behavior if they use the explicit rounding mode intrinsic, but pass CUR_DIRECTION.

uriel.k updated this revision to Diff 128232.Dec 27 2017, 6:52 AM
uriel.k retitled this revision from [X86][avx512] Lowering X86 avx512 sqrt intrinsics to IR to [X86] Lowering X86 avx512 sqrt intrinsics to IR.
uriel.k edited the summary of this revision. (Show Details)

now takes care of sqrt with round mode CURR_DIRECTION

tkrupa commandeered this revision.Mar 30 2018, 4:56 AM
tkrupa added a reviewer: uriel.k.
tkrupa added a subscriber: tkrupa.

I was assigned to finish this task.

tkrupa updated this revision to Diff 140409.Mar 30 2018, 4:57 AM

Removed renaming of the builtins (also in corresponding llvm patch).

craig.topper added inline comments.Apr 2 2018, 9:52 PM
lib/CodeGen/CGBuiltin.cpp
8904 ↗(On Diff #140409)

80 columns?

8907 ↗(On Diff #140409)

Drop the curly braces

8909 ↗(On Diff #140409)

There's a signature for CreateExtractElement that takes a uint64_t for the index. No need to create a ConstantInt explicitly

8930 ↗(On Diff #140409)

Drop the curly braces

craig.topper added inline comments.Apr 2 2018, 9:57 PM
lib/CodeGen/CGBuiltin.cpp
8901 ↗(On Diff #140409)

What about builtin_ia32_sqrtsd and builtin_ia32_sqrtss?

tkrupa marked 4 inline comments as done.Apr 4 2018, 4:46 AM
tkrupa added inline comments.
lib/CodeGen/CGBuiltin.cpp
8901 ↗(On Diff #140409)

There was a misunderstanding earlier - Uriel and me both took up this task simultaneously. We found out during the implementation phase and decided that he'd do this part (since it already was in the phabricator) and I'd do builtin_ia32_sqrtsd and builtin_ia32_sqrtss in another patch (lowering them require handling some dependencies) The other patch is currently going through internal review.

tkrupa updated this revision to Diff 140937.Apr 4 2018, 4:47 AM

I did the suggested corrections and improved the tests.

This revision is now accepted and ready to land.Apr 4 2018, 10:17 AM

I'll wait with upstreaming this patch until there's an agreement on mask scalar approach.

tkrupa added a comment.Jun 1 2018, 1:57 AM

Mask scalar case is closed and doesn't have any effects on this revision. Besides, I resolved issues connected to lowering scalar sqrt intrinsics without rounding (that is, if D47621 is accepted). Should I add them here to have everything sqrt in one place or upstream this and add them to a new revision?

RKSimon requested changes to this revision.Jun 1 2018, 2:38 AM

Mask scalar case is closed and doesn't have any effects on this revision. Besides, I resolved issues connected to lowering scalar sqrt intrinsics without rounding (that is, if D47621 is accepted). Should I add them here to have everything sqrt in one place or upstream this and add them to a new revision?

Add them here please

This revision now requires changes to proceed.Jun 1 2018, 2:38 AM
tkrupa updated this revision to Diff 149419.Jun 1 2018, 2:52 AM

Added missing scalar intrinsics without rounding.

RKSimon added inline comments.Jun 3 2018, 4:45 AM
test/CodeGen/sse2-builtins.c
1199 ↗(On Diff #149419)

You've added CHECK-NOT to some (scalar only?) tests but not others - please be consistent.

tkrupa updated this revision to Diff 149693.Jun 4 2018, 2:27 AM

Removed CHECK-NOTs for consistency.

tkrupa marked an inline comment as done.Jun 4 2018, 2:28 AM
craig.topper added inline comments.Jun 14 2018, 8:14 AM
lib/CodeGen/CGBuiltin.cpp
9222 ↗(On Diff #149693)

What does returning nullptr here do?

9246 ↗(On Diff #149693)

Again don't know what the caller is going to do with nullptr.

tkrupa updated this revision to Diff 151382.Jun 14 2018, 11:16 AM

Fixed rounding mode calls.

tkrupa marked 2 inline comments as done.Jun 14 2018, 11:20 AM
This revision was not accepted when it landed; it landed in state Needs Review.Jun 15 2018, 11:12 AM
Closed by commit rL334850: [X86] Lowering sqrt intrinsics to native IR (authored by tkrupa, committed by ). · Explain Why
This revision was automatically updated to reflect the committed changes.

LGTM as well