Details
- Reviewers
arsenm foad - Commits
- rGd8258508d498: [AMDGPU][GISel] Update `isCanonicalized`
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
60,050 ms | x64 debian > MLIR.Examples/standalone::test.toy |
Event Timeline
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10202 | It seems wrong for FNEG to fall into the "supportsMinMaxDenormModes" check below. |
Fix unary ops
Also fixes the foldable-fneg test (no longer needs pre/post check lines)
Yes, that's what I meant - we had a split in the ieee/flush codegen but this fixes it :p
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10202 | Indeed it was. Fixed it and also added the remaining unary operations while I was at it. |
LGTM. (I'm not 100% sure about G_FREM but I assume it expands to something that will flush denormals if appropriate?)
I think it's fine as this mimics the DAG's implementation. I can still remove FREM if you're not sure about it as I don't really need it added right now (I just added it for completeness).
I agree it's fine. It looks like it normally expands to some combination of div/mul/trunc/fma, i.e. normal fp instructions which will canonicalize.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10217–10218 | I just realized this is broken since it was copied from the DAG with FP value types. In another patch can you explicitly handle the custom intrinsics? Plus this isn't handling some of the less common generic opcodes, like all the different rounding functions |
I'm not sure I understand, which intrinsics need to be handled? The ones in INTRINSIC_WO_CHAIN in the DAG version?
And what are the opcodes you're talking about? Things like CVT_F32?
Some of them are obscured by custom nodes in the DAG. Some examples are AMDGPUISD::FMAD_FTZ, AMDGPUISD::RCP (corresponding to llvm.amdgcn.rcp), and the cases in INTRINSIC_WO_CHAIN. Plus this is missing a number of the generic opcodes like G_FSQRT and G_FFLOOR
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10195 | G_FRINT, G_NEARBYINT, G_INTRINSIC_FPTRUNC_ROUND, G_INTRINSIC_TRUNC, G_INTRINSIC_ROUNDEVEN, | |
10223 | amdgcn_sqrt, fmed3, fmad_ftz, sin, cos, log, log_clamp | |
10231 | cubema, cubesc, cubetc? | |
10238 | Should do separately and also introduce for the DAG path, but we're missing some of the newer operations like _amdgcn_cvt_f32_bf8 |
Add more missing instructions/intrinsics
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10223 | fmad_ftz was already there, log doesn't seem to exit as an intrinsic but it's an instruction so I it + the variants: case AMDGPU::G_FLOG: case AMDGPU::G_FLOG2: case AMDGPU::G_FLOG10: |
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10206–10207 | Also should handle G_FMINNUM/G_FMAXNUM. Also, we should really stop ascribing target instruction behavior to the generic opcodes. This is a separate patch though since this is a big mess | |
10217 | Should default to return false |
G_FRINT, G_NEARBYINT, G_INTRINSIC_FPTRUNC_ROUND, G_INTRINSIC_TRUNC, G_INTRINSIC_ROUNDEVEN,