This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][OpenCL] Remove "printf and hostcall" diagnostic
ClosedPublic

Authored by scott.linder on Mar 17 2022, 2:05 PM.

Details

Summary

The diagnostic is unreliable, and triggers even for dead uses of
hostcall that may exist when linking the device-libs at lower
optimization levels.

Eliminate the diagnostic, and directly document the limitation for
OpenCL before code object V5.

Make some NFC changes to clarify the related code in the
MetadataStreamer.

Add a clang test to tie OCL sources containing printf to the backend IR
tests for this situation.

Diff Detail

Event Timeline

scott.linder created this revision.Mar 17 2022, 2:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2022, 2:05 PM
scott.linder requested review of this revision.Mar 17 2022, 2:05 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 17 2022, 2:05 PM

For reference, the error I'm changing to a warning was added as part of https://reviews.llvm.org/D70038, and the issue that made me consider this approach appears when building OCL conformance tests at -O0

Another approach would be to do some more work in Clang or early in the backend to strip out the unused device-libs functions which are calling __ockl_hostcall_internal. I started with that thought, but backed off as it seemed the less we could do at -O0 the better. The limitation is also being resolved in later HSA ABI versions, so this would be a legacy path almost as soon as we added it.

scott.linder added inline comments.Mar 17 2022, 2:16 PM
llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
565–566

I'm also not certain this is the the best option to classify the diagnostic, but it is what was used for the error so I just left it as-is. Suggestions welcome!

565–566

I will land this separately, as it is unrelated and NFC

arsenm accepted this revision.Mar 17 2022, 2:33 PM

A spurious warning still isn't ideal but this is an improvement

This revision is now accepted and ready to land.Mar 17 2022, 2:33 PM

The check for "__ockl_hostcall_internal" is not longer relevant with the new hostcall attribute. Can we simply remove this check? What is the situation where the warning is still useful?

The check for "__ockl_hostcall_internal" is not longer relevant with the new hostcall attribute. Can we simply remove this check? What is the situation where the warning is still useful?

I wasn't aware of the new attribute, I'm happy to just delete it.

This comment was removed by sameerds.

The check for "__ockl_hostcall_internal" is not longer relevant with the new hostcall attribute. Can we simply remove this check? What is the situation where the warning is still useful?

I wasn't aware of the new attribute, I'm happy to just delete it.

Yeah, it happened in D119216.

But I am still curious about the check itself. Do we need to worry about a situation where it is important to check whether OpenCL printf (non-hostcall) and some other hostcall service are active in the same application?

The check for "__ockl_hostcall_internal" is not longer relevant with the new hostcall attribute. Can we simply remove this check? What is the situation where the warning is still useful?

I wasn't aware of the new attribute, I'm happy to just delete it.

Yeah, it happened in D119216.

But I am still curious about the check itself. Do we need to worry about a situation where it is important to check whether OpenCL printf (non-hostcall) and some other hostcall service are active in the same application?

I don't know how important it is to notify the user, but my understanding is pre-code-object-v5 we will use the same kernarg for both the printf and hostcall pointers. I don't know what actually happens (does the runtime notice it can't actually construct the kernarg and fail? does the code just fail when one of the two uses of the kernarg are incorrect?). Should I try to nail down exactly what currently happens?

scott.linder retitled this revision from [AMDGPU] Only warn when mixing printf and hostcall to [AMDGPU][OpenCL] Add "amdgpu-no-hostcall-ptr" in Clang codegen pre-COV_5.
scott.linder edited the summary of this revision. (Show Details)

Eliminate diagnostic, and instead add function attribute in Clang for pre-COV_5
OpenCL codegen.

sameerds added inline comments.Mar 24 2022, 11:15 PM
clang/lib/CodeGen/TargetInfo.cpp
9381 ↗(On Diff #418135)

The frontend does not need to worry about this attribute. See the comment in the MetadataStreamer. A worthwhile check would be to generate an error if we are able to detect that some hostcall service is being used in OpenCL on code-object-v4 or lower. None exists right now, but we should add the check if such services show up. But those checks are likely to be in a different place. For example, enabling asan on OpenCL for code-object-v4 should result in an error in the place where asan commandline options are parsed.

llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
408–410

I would structure this differently: If this is code-object-v4 or lower, then if "llvm.printf.fmts" is present, then this kernel clearly contains OpenCL bits, and cannot use hostcall. So it's okay to just assume that the no-hostcall-ptr attribute is always present in this situation, which means the only metadata generated is for ValueKind::HiddenPrintfBuffer. Else if this is code-object-v5, then proceed to emit both metadata.

arsenm added inline comments.Mar 25 2022, 6:10 AM
clang/lib/CodeGen/TargetInfo.cpp
9381 ↗(On Diff #418135)

Should be all opencl, not just kernels. Also < instead of !=?

scott.linder added inline comments.Mar 25 2022, 11:03 AM
clang/lib/CodeGen/TargetInfo.cpp
9381 ↗(On Diff #418135)

The frontend does not need to worry about this attribute. See the comment in the MetadataStreamer.

The reason I went this route is that otherwise the MetadataStreamer can only go off of the presence of the OCL printf metadata to infer that hostcall argument metadata is invalid and will be ignored by the runtime. Ideally, the MetadataStreamer or Verifier or something would actually require that a module does not have both OCL printf metadata and functions which are not "amdgpu-no-hostcall-ptr", but I didn't go that far as it would break old IR/bitcode that relies on the implicit behavior.

I'm OK with removing this code, and the rest of the patch remains essentially unchanged. We will still have an implicit rule based on code-object-version and presence of printf metadata, and at least conceptually you will still be able to compile OCL for pre-V5 and end up with hostcall argument metadata. That case is benign if the runtime just ignores it, but it still seems needlessly relaxed when we can just add the correct attribute in Clang codegen.

Should be all opencl, not just kernels. Also < instead of !=?

Yes, my mistake, I'll update this

llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
408–410

I'm not sure I follow; doesn't the code as-is implement what you're describing?

If the printf metadata is present, this will only ever emit the HiddenPrintfBuffer, irrespective of the hostcall attribute. Otherwise, this respects amdgpu-no-hostcall-ptr (i.e. for HIP and other languages).

The "if this is code-object-v4 or lower" portion is implicit, as this is just the MetadataStreamerV2 impl. The MetadataStreamerV3 (below) and MetadataStreamerV4 (inherits from V3) impls below are similar. The MetadataStreamerV5 impl is already correct for V5 (i.e. supports both argument kinds for the same kernel).

yaxunl added inline comments.Mar 25 2022, 12:52 PM
clang/lib/CodeGen/TargetInfo.cpp
9381 ↗(On Diff #418135)

+1 for using < instead of !=

However, device library is written in OpenCL language. There are functions using hostcall buffer. We cannot mark all OpenCL functions with this attribute.

Do not add the no-hostcall attribute for "COV_None" modules, to allow for
the OpenCL implementation of hostcall in the device-libs.

@yaxunl Does excluding device-libs via COV_None make sense?

@sameerds Would you still rather we just not add this attribute in the frontend at all? I'm OK with it if that is the consensus

yaxunl accepted this revision.Mar 29 2022, 9:07 AM

@yaxunl Does excluding device-libs via COV_None make sense?

That should work. Hopefully, it can cover most spurious warnings.

@yaxunl Does excluding device-libs via COV_None make sense?

@sameerds Would you still rather we just not add this attribute in the frontend at all? I'm OK with it if that is the consensus

Yes, I still think there is no need to emit that attribute in the frontend. It will always be inferred by the Attributor when optimization is enabled. This also eliminates the check for COV_None and there seems to be some uncertainty about COV_None anyway. This also eliminates the updates to all the tests where the no-hostcall-ptr attribute does not actually matter. If ever we need to check if hostcall is being used on OpenCL and COV < 5, we should do it per feature and inform the user appropriately.

llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
408–410

Your last para about different streamers cleared the confusion. So, this looks good.

scott.linder retitled this revision from [AMDGPU][OpenCL] Add "amdgpu-no-hostcall-ptr" in Clang codegen pre-COV_5 to [AMDGPU][OpenCL] Remove "printf and hostcall" diagnostic.
scott.linder edited the summary of this revision. (Show Details)

Remove frontend changes, and just drop the diagnostic in the backend. Update
some comments and naming.

@yaxunl Does excluding device-libs via COV_None make sense?

@sameerds Would you still rather we just not add this attribute in the frontend at all? I'm OK with it if that is the consensus

Yes, I still think there is no need to emit that attribute in the frontend. It will always be inferred by the Attributor when optimization is enabled. This also eliminates the check for COV_None and there seems to be some uncertainty about COV_None anyway. This also eliminates the updates to all the tests where the no-hostcall-ptr attribute does not actually matter. If ever we need to check if hostcall is being used on OpenCL and COV < 5, we should do it per feature and inform the user appropriately.

OK, that makes sense to me. I think I was not interpreting the purpose of the attributes correctly, but if they are essentially just optimization remarks then I don't have any issue with not ensuring they are present at -O0.

I updated the patch, let me know what you think.

sameerds accepted this revision.Mar 29 2022, 11:15 AM

LGTM! But not sure if @yaxunl and/or @arsenm should have another look at the final version.

yaxunl accepted this revision.Apr 1 2022, 8:36 AM

LGTM.

This revision was landed with ongoing or failed builds.Apr 5 2022, 12:11 PM
This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll