Page MenuHomePhabricator

[GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR
ClosedPublic

Authored by paquette on Jan 30 2019, 4:19 PM.

Details

Summary

This introduces a generic opcode for floating point floor, working towards selecting @llvm.floor.

Diff Detail

Repository
rL LLVM

Event Timeline

paquette created this revision.Jan 30 2019, 4:19 PM
aemerson accepted this revision.Feb 1 2019, 4:04 PM

LGTM

This revision is now accepted and ready to land.Feb 1 2019, 4:04 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2019, 9:10 AM
Herald added a subscriber: kristina. · View Herald Transcript
paquette added a subscriber: arsenm.Feb 4 2019, 9:34 AM

@arsenm, adding this opcode breaks AMDGPU somehow. Do you have any idea why that might be?

arsenm added a comment.Feb 4 2019, 9:41 AM

@arsenm, adding this opcode breaks AMDGPU somehow. Do you have any idea why that might be?

Breaks what?

@arsenm, adding this opcode breaks AMDGPU somehow. Do you have any idea why that might be?

Breaks what?

Here's an example:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/24100/steps/build%20stage%201/logs/stdio

Some tablegen seems unhappy?

arsenm added a comment.Feb 4 2019, 9:46 AM

@arsenm, adding this opcode breaks AMDGPU somehow. Do you have any idea why that might be?

Breaks what?

Here's an example:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/24100/steps/build%20stage%201/logs/stdio

Some tablegen seems unhappy?

I have no idea. I haven't looked at the DAG compatibility stuff

@arsenm, adding this opcode breaks AMDGPU somehow. Do you have any idea why that might be?

Breaks what?

Here's an example:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/24100/steps/build%20stage%201/logs/stdio

Some tablegen seems unhappy?

I have no idea. I haven't looked at the DAG compatibility stuff

It looks like the definition uncovered an unusual case the importer can't handle yet. These two rules are the cause:

// Convert (x - floor(x)) to fract(x)
def : GCNPat <
  (f32 (fsub (f32 (VOP3Mods f32:$x, i32:$mods)),
             (f32 (ffloor (f32 (VOP3Mods f32:$x, i32:$mods)))))),
  (V_FRACT_F32_e64 $mods, $x, DSTCLAMP.NONE, DSTOMOD.NONE)
>;

// Convert (x + (-floor(x))) to fract(x)
def : GCNPat <
  (f64 (fadd (f64 (VOP3Mods f64:$x, i32:$mods)),
             (f64 (fneg (f64 (ffloor (f64 (VOP3Mods f64:$x, i32:$mods)))))))),
  (V_FRACT_F64_e64 $mods, $x, DSTCLAMP.NONE, DSTOMOD.NONE)
>;

The importer doesn't have any code to handle same-operand constraints in combination with the (foo $x, $y) style of matching complex pattern foo at the moment. I had a quick look for workarounds but there doesn't seem to be a variant (e.g. naming the overall complex-operand and matching that) that works at the moment. This:

(f32 (fsub (f32 VOP3Mods:$a),
           (f32 (ffloor (f32 VOP3Mods:$a)))))

would probably work but then you wouldn't be able to reverse the sub-operands.

I think that this is safe to recommit after the changes in https://reviews.llvm.org/D57980.

Relanded in r353589. This works fine after r353586.