Page MenuHomePhabricator

[AMDGPU] Add gfx1013 target
ClosedPublic

Authored by bcahoon on Thu, Jun 3, 7:04 PM.

Diff Detail

Event Timeline

bcahoon created this revision.Thu, Jun 3, 7:04 PM
bcahoon requested review of this revision.Thu, Jun 3, 7:04 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptThu, Jun 3, 7:04 PM
foad added a comment.Fri, Jun 4, 1:05 AM

Please also update llvm/test/Object/AMDGPU/elf-header-flags-mach.yaml.

foad added inline comments.Fri, Jun 4, 1:38 AM
llvm/docs/AMDGPUUsage.rst
389

Is it dGPU or APU?

Every other entry with *TBA* also has a TODO:: message

llvm/lib/Target/AMDGPU/AMDGPU.td
468

What is this new encoding? It doesn't seem to be used for anything.

llvm/lib/Target/AMDGPU/GCNSubtarget.h
879

Stray whitespace on this line.

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1453

Stray whitespace.

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4

This test surely should not pass for gfx1012, since it does not have these instructions. And with your patch as written it should fail for gfx1013 too, since they are predicated on HasGFX10_BEncoding.

@rampitec any idea what is wrong here? Apparently the backend will happily generate image_bvh_intersect_ray instructions even for gfx900!

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll
3

Likewise.

You need to replace HasGFX10_BEncoding with HasGFX10_AEncoding in the BVH and IMAGE_MSAA_LOAD_X. You also need to update llvm.amdgcn.image.msaa.load.x.ll test to include gfx1013.

llvm/lib/Target/AMDGPU/AMDGPU.td
1106

gfx1030 should now include FeatureGFX10_AEncoding as well. 10_B is an extension above 10_A.

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4

Indeed. MIMG_IntersectRay has this:

let SubtargetPredicate = HasGFX10_BEncoding,
    AssemblerPredicate = HasGFX10_BEncoding,

but apparently SubtargetPredicate did not work. It needs to be fixed.

gfx1012 does not have it, gfx1013 does though. That is what GFX10_A encoding is about, 10_B it has to be replaced with 10_A in BVH and MSAA load.

rampitec added inline comments.Fri, Jun 4, 12:27 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4

Image lowering and selection is not really done like everything else. For BVH it just lowers intrinsic to opcode. I think the easiest fix is to add to SIISelLowering.cpp where we lower Intrinsic::amdgcn_image_bvh_intersect_ray something like this:

if (!Subtarget->hasGFX10_AEncoding())
  report_fatal_error(
      "requested image instruction is not supported on this GPU");
bcahoon updated this revision to Diff 350075.Sat, Jun 5, 1:09 PM

Addressed review comments. Updated the patch to use the new AEncoding target feature
correctly. Added code to report an error for the image intersect intrinsics for
unsupported targets.

bcahoon marked 8 inline comments as done.Sat, Jun 5, 1:22 PM
bcahoon added inline comments.
llvm/docs/AMDGPUUsage.rst
389

It is APU.

llvm/lib/Target/AMDGPU/AMDGPU.td
468

Fixed this. The BVH raytracing instructions use the encoding.

1106

I had added FeatureGFX10_AEncoding as an Implies feature for FeatureGFX10_BEncoding in the previous patch. But, I've changed the patch so that it's no longer an Implies feature.

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4

I ended up using emitRemovedIntrinsicError, which uses DiagnosticInfoUnsupported. This way the failure isn't a crash dump.

rampitec added inline comments.Mon, Jun 7, 10:11 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4690

80 chars per line.

4693

Just return false like in other places.

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7345

return emitRemovedIntrinsicError();

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4

not --crash llc ...

4

I ended up using emitRemovedIntrinsicError, which uses DiagnosticInfoUnsupported. This way the failure isn't a crash dump.

Diagnostics is a good thing, but we still have to fail the compilation.

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll
3

not --crash llc

bcahoon marked 4 inline comments as done.Mon, Jun 7, 11:31 AM
bcahoon added inline comments.
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4

The diagnostic is marked as an error, so the compilation fails in that llc returns a non-zero return code. This mechanism is used in other places in the back-end to report similar types of errors. The alternative, if I understand correctly, is that a crash occurs with an error message that indicates that the bug is in LLVM (rather the the input source file).

rampitec added inline comments.Mon, Jun 7, 11:56 AM
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4

We do not seem to be consistent here and return either undef or SDValue(), but as far as I can see we never continue selecting code though, like here in SIISelLowering and always return false from the AMDGPUInstructionSelector.

bcahoon updated this revision to Diff 350423.Mon, Jun 7, 2:54 PM

Addressed review comments

bcahoon marked an inline comment as done.Mon, Jun 7, 3:00 PM
bcahoon added inline comments.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7345

I've changed this to return. Thanks for catching that. But, it returns a UNDEF value instead of SDValue() so that it doesn't crash. I can change the behavior if that's preferred.

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4

I've left the patch so that it doesn't crash. But, let me know if you think we should return false and crash, and I'll make that change.

rampitec added inline comments.Mon, Jun 7, 3:24 PM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4693

Just return false. I see that is like this in the whole file.

bcahoon updated this revision to Diff 350448.Mon, Jun 7, 4:50 PM

Changed legalizer to return false for raytracing intrinsics that are not supported by the subtarget. I changed both GlobalISel and regular ISel to work similarly. A crash occurs with a message that the intrinsic cannot be legalized, and only the first instance is reported.

bcahoon added inline comments.Mon, Jun 7, 4:51 PM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4693

Changed this to false, and also changed SIISelLlowering to return SDValue so that both fail in a similar way.

rampitec added inline comments.Mon, Jun 7, 5:13 PM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4694

You can just omit undef and erase.

bcahoon updated this revision to Diff 350468.Mon, Jun 7, 7:06 PM

Addressed review comment.

bcahoon marked an inline comment as done.Mon, Jun 7, 7:06 PM
rampitec accepted this revision.Mon, Jun 7, 11:43 PM
This revision is now accepted and ready to land.Mon, Jun 7, 11:43 PM
foad added inline comments.Tue, Jun 8, 3:12 AM
llvm/lib/Target/AMDGPU/AMDGPU.td
471

I realise you're just following the precedent set by GFX10_B, but is this terminology actually used in any documentation anywhere? And if not could we describe it a little better here?

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7345

Personally I would follow all the existing precedents and "return emitRemovedIntrinsicError(...)". I don't see any value in deliberately trying to make the compiler crash harder.

foad accepted this revision.Tue, Jun 8, 3:39 AM

LGTM anyway, with or without any action on my last couple of comments.

This revision was landed with ongoing or failed builds.Tue, Jun 8, 9:56 AM
This revision was automatically updated to reflect the committed changes.
bcahoon marked an inline comment as done.Tue, Jun 8, 10:21 AM
bcahoon added inline comments.
llvm/lib/Target/AMDGPU/AMDGPU.td
471

I changed the description to be specific w.r.t what the target feature enables.

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7345

The handling of diagnostic errors is a little inconsistent in ISelLowering. Sometimes SDValue() is returned and other times it's Undef. I have it return SDValue() so that the failure mode is consistent with how GlobaISel handles these intrinsics. It's probably worth a discussion to decide how best to handle diagnostic errors. I'm happy to submit a follow-on patch as needed.

As an aside, return emitRemovedIntrinsicError() isn't enough because the intrinsic has both a return value and a chain edge. So, something like return DAG.getMergeValues({emitRemovedIntrinsicError(), Op.getValue(0)}, DL) is needed.

foad added inline comments.Wed, Jun 9, 1:24 AM
llvm/lib/Target/AMDGPU/AMDGPU.td
471

Thank you. I think that is much more useful.