This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Use v_pk_max_f16 for fcanonicalize
ClosedPublic

Authored by rampitec on Aug 30 2017, 10:13 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

rampitec created this revision.Aug 30 2017, 10:13 PM
arsenm added inline comments.Sep 1 2017, 3:43 PM
lib/Target/AMDGPU/SIInstructions.td
1289 ↗(On Diff #113355)

This won't work. For now it's probably easier to just throw an S_MOV_B32 of the constant. This won't encode correctly as a direct immediate because you need to manipulate op_sel

rampitec added inline comments.Sep 1 2017, 3:58 PM
lib/Target/AMDGPU/SIInstructions.td
1289 ↗(On Diff #113355)

It was here before, this is the old code and the fix does not belong to the current change.

rampitec added inline comments.Sep 5 2017, 11:01 AM
lib/Target/AMDGPU/SIInstructions.td
1289 ↗(On Diff #113355)

For some reason S_MOV_B32 does not work here. It is really a separate issue and has to be addressed separately.

arsenm added inline comments.Sep 6 2017, 8:30 AM
lib/Target/AMDGPU/SIInstructions.td
1289 ↗(On Diff #113355)

Why doesn't it work? What about V_MOV_B32? The constant needs to be materialized in some way.

Actually I think setting this to be FP16 zero is fine, as long as you remove the SRCMODS.OP_SEL_1 from it.

rampitec added inline comments.Sep 6 2017, 8:37 AM
lib/Target/AMDGPU/SIInstructions.td
1289 ↗(On Diff #113355)

As I said this is really a subject for the separate patch.

arsenm added inline comments.Sep 6 2017, 9:04 AM
lib/Target/AMDGPU/SIInstructions.td
1289 ↗(On Diff #113355)

It's not, because what is here is incorrect and will not work.

rampitec retitled this revision from [AMDGPU] Use v_pm_max_f16 for fcanonicalize to [AMDGPU] Use v_pk_max_f16 for fcanonicalize.Sep 6 2017, 11:00 AM
rampitec added inline comments.
lib/Target/AMDGPU/SIInstructions.td
1289 ↗(On Diff #113355)

It is really different, one is fix the other is optimization.
Please see D37522.

rampitec updated this revision to Diff 114040.Sep 6 2017, 11:53 AM
rampitec marked 7 inline comments as done.

Rebased.

arsenm accepted this revision.Sep 6 2017, 12:08 PM

LGTM

This revision is now accepted and ready to land.Sep 6 2017, 12:08 PM
This revision was automatically updated to reflect the committed changes.