Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/docs/AMDGPUUsage.rst | ||
---|---|---|
389 | Is it dGPU or APU? Every other entry with *TBA* also has a TODO:: message | |
llvm/lib/Target/AMDGPU/AMDGPU.td | ||
468 | What is this new encoding? It doesn't seem to be used for anything. | |
llvm/lib/Target/AMDGPU/GCNSubtarget.h | ||
879 | Stray whitespace on this line. | |
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | ||
1453 | Stray whitespace. | |
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll | ||
4 | This test surely should not pass for gfx1012, since it does not have these instructions. And with your patch as written it should fail for gfx1013 too, since they are predicated on HasGFX10_BEncoding. @rampitec any idea what is wrong here? Apparently the backend will happily generate image_bvh_intersect_ray instructions even for gfx900! | |
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll | ||
3 | Likewise. |
You need to replace HasGFX10_BEncoding with HasGFX10_AEncoding in the BVH and IMAGE_MSAA_LOAD_X. You also need to update llvm.amdgcn.image.msaa.load.x.ll test to include gfx1013.
llvm/lib/Target/AMDGPU/AMDGPU.td | ||
---|---|---|
1106 | gfx1030 should now include FeatureGFX10_AEncoding as well. 10_B is an extension above 10_A. | |
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll | ||
4 | Indeed. MIMG_IntersectRay has this: let SubtargetPredicate = HasGFX10_BEncoding, AssemblerPredicate = HasGFX10_BEncoding, but apparently SubtargetPredicate did not work. It needs to be fixed. gfx1012 does not have it, gfx1013 does though. That is what GFX10_A encoding is about, 10_B it has to be replaced with 10_A in BVH and MSAA load. |
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll | ||
---|---|---|
4 | Image lowering and selection is not really done like everything else. For BVH it just lowers intrinsic to opcode. I think the easiest fix is to add to SIISelLowering.cpp where we lower Intrinsic::amdgcn_image_bvh_intersect_ray something like this: if (!Subtarget->hasGFX10_AEncoding()) report_fatal_error( "requested image instruction is not supported on this GPU"); |
Addressed review comments. Updated the patch to use the new AEncoding target feature
correctly. Added code to report an error for the image intersect intrinsics for
unsupported targets.
llvm/docs/AMDGPUUsage.rst | ||
---|---|---|
389 | It is APU. | |
llvm/lib/Target/AMDGPU/AMDGPU.td | ||
468 | Fixed this. The BVH raytracing instructions use the encoding. | |
1106 | I had added FeatureGFX10_AEncoding as an Implies feature for FeatureGFX10_BEncoding in the previous patch. But, I've changed the patch so that it's no longer an Implies feature. | |
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll | ||
4 | I ended up using emitRemovedIntrinsicError, which uses DiagnosticInfoUnsupported. This way the failure isn't a crash dump. |
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | ||
---|---|---|
4690 | 80 chars per line. | |
4693 | Just return false like in other places. | |
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
7345 | return emitRemovedIntrinsicError(); | |
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll | ||
4 | not --crash llc ... | |
4 |
Diagnostics is a good thing, but we still have to fail the compilation. | |
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll | ||
3 | not --crash llc |
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll | ||
---|---|---|
4 | The diagnostic is marked as an error, so the compilation fails in that llc returns a non-zero return code. This mechanism is used in other places in the back-end to report similar types of errors. The alternative, if I understand correctly, is that a crash occurs with an error message that indicates that the bug is in LLVM (rather the the input source file). |
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll | ||
---|---|---|
4 | We do not seem to be consistent here and return either undef or SDValue(), but as far as I can see we never continue selecting code though, like here in SIISelLowering and always return false from the AMDGPUInstructionSelector. |
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
7345 | I've changed this to return. Thanks for catching that. But, it returns a UNDEF value instead of SDValue() so that it doesn't crash. I can change the behavior if that's preferred. | |
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll | ||
4 | I've left the patch so that it doesn't crash. But, let me know if you think we should return false and crash, and I'll make that change. |
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | ||
---|---|---|
4693 | Just return false. I see that is like this in the whole file. |
Changed legalizer to return false for raytracing intrinsics that are not supported by the subtarget. I changed both GlobalISel and regular ISel to work similarly. A crash occurs with a message that the intrinsic cannot be legalized, and only the first instance is reported.
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | ||
---|---|---|
4693 | Changed this to false, and also changed SIISelLlowering to return SDValue so that both fail in a similar way. |
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | ||
---|---|---|
4694 | You can just omit undef and erase. |
llvm/lib/Target/AMDGPU/AMDGPU.td | ||
---|---|---|
471 | I realise you're just following the precedent set by GFX10_B, but is this terminology actually used in any documentation anywhere? And if not could we describe it a little better here? | |
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
7345 | Personally I would follow all the existing precedents and "return emitRemovedIntrinsicError(...)". I don't see any value in deliberately trying to make the compiler crash harder. |
llvm/lib/Target/AMDGPU/AMDGPU.td | ||
---|---|---|
471 | I changed the description to be specific w.r.t what the target feature enables. | |
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
7345 | The handling of diagnostic errors is a little inconsistent in ISelLowering. Sometimes SDValue() is returned and other times it's Undef. I have it return SDValue() so that the failure mode is consistent with how GlobaISel handles these intrinsics. It's probably worth a discussion to decide how best to handle diagnostic errors. I'm happy to submit a follow-on patch as needed. As an aside, return emitRemovedIntrinsicError() isn't enough because the intrinsic has both a return value and a chain edge. So, something like return DAG.getMergeValues({emitRemovedIntrinsicError(), Op.getValue(0)}, DL) is needed. |
llvm/lib/Target/AMDGPU/AMDGPU.td | ||
---|---|---|
471 | Thank you. I think that is much more useful. |
Is it dGPU or APU?
Every other entry with *TBA* also has a TODO:: message