This is an archive of the discontinued LLVM Phabricator instance.

[CUDA]Delayed diagnostics for the asm instructions.
ClosedPublic

Authored by ABataev on Feb 20 2019, 10:54 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

ABataev created this revision.Feb 20 2019, 10:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2019, 10:54 AM
Herald added a subscriber: jdoerfert. · View Herald Transcript
tra accepted this revision.Feb 20 2019, 2:06 PM

Thank you.

lib/Sema/Sema.cpp
1494–1496 ↗(On Diff #187621)

Nit: i'd use ternary op here or explicit if/else to indicate that CUDADiagIfDeviceCode/CUDADiagIfHostCode are treated the same and that CUDADiagIfHostCode() is *not* a catch-all of some kind.

return getLangOpts().CUDAIsDevice 
           ? CUDADiagIfDeviceCode(Loc, DiagID) 
           : CUDADiagIfHostCode(Loc, DiagID)
This revision is now accepted and ready to land.Feb 20 2019, 2:06 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 21 2019, 7:52 AM
tra added a comment.EditedFeb 26 2019, 10:39 AM

There's a new quirk we've ran into after this patch landed. Consider this code:

int foo() {
  int prev;
  __asm__ __volatile__("mov %0, 0" : "=a" (prev)::);
  return prev;
}

When we compile for device, asm constraint is not valid for NVPTX, we emit delayed diag and move on. The function is never code-gen'ed so the diag never shows up. So far so good.

Now we add -Werror -Wininitialized and things break -- because we bail out early, prev is left uninitialized and is reported as such.

$ bin/clang++ -c --cuda-gpu-arch=sm_35 asm.cu -nocudainc --cuda-device-only -Wuninitialized -Werror
asm.cu:4:10: error: variable 'prev' is uninitialized when used here [-Werror,-Wuninitialized]
  return prev;
         ^~~~
asm.cu:2:11: note: initialize the variable 'prev' to silence this warning
  int prev;
          ^
           = 0
1 error generated when compiling for sm_35.

I think this should also show up in the test case in this patch, too, if you add -Wuninitialized

In D58463#1410900, @tra wrote:

There's a new quirk we've ran into after this patch landed. Consider this code:

int foo() {
  int prev;
  __asm__ __volatile__("mov %0, 0" : "=a" (prev)::);
  return prev;
}

When we compile for device, asm constraint is not valid for NVPTX, we emit delayed diag and move on. The function is never code-gen'ed so the diag never shows up. So far so good.

Now we add -Werror -Wininitialized and things break -- because we bail out early, prev is left uninitialized and is reported as such.

$ bin/clang++ -c --cuda-gpu-arch=sm_35 asm.cu -nocudainc --cuda-device-only -Wuninitialized -Werror
asm.cu:4:10: error: variable 'prev' is uninitialized when used here [-Werror,-Wuninitialized]
  return prev;
         ^~~~
asm.cu:2:11: note: initialize the variable 'prev' to silence this warning
  int prev;
          ^
           = 0
1 error generated when compiling for sm_35.

I think this should also show up in the test case in this patch, too, if you add -Wuninitialized

Hi Artem, I think we can just delay emission of this warning to solve this problem.

tra added a comment.Feb 26 2019, 11:36 AM

Hi Artem, I think we can just delay emission of this warning to solve this problem.

I'm not sure we can always tell whether the warning is real or if it's the consequence of failing to parse inline asm.

E.g.:

namespace {
__host__ __device__ a() {
  int prev;
  __asm__ __volatile__("mov %0, 0" : "=a" (prev)::);
  return prev;
}

__host__ __device__ b() {
  int prev;
  return prev;
}

} //namespace

Ideally we should always emit uninitialized diagnostics for b, but never for a in both host and device compilation modes.
I think we may want to propagate assignment from the inline asm statement -- we may not know the meaning of the constraint, but we do know which argument gets used/modified by the asm statement. Perhaps we can construct a fake GCCAsmStmt but bail out before we attempt to validate the asm string.

In D58463#1411039, @tra wrote:

Hi Artem, I think we can just delay emission of this warning to solve this problem.

I'm not sure we can always tell whether the warning is real or if it's the consequence of failing to parse inline asm.

E.g.:

namespace {
__host__ __device__ a() {
  int prev;
  __asm__ __volatile__("mov %0, 0" : "=a" (prev)::);
  return prev;
}

__host__ __device__ b() {
  int prev;
  return prev;
}

} //namespace

Ideally we should always emit uninitialized diagnostics for b, but never for a in both host and device compilation modes.
I think we may want to propagate assignment from the inline asm statement -- we may not know the meaning of the constraint, but we do know which argument gets used/modified by the asm statement. Perhaps we can construct a fake GCCAsmStmt but bail out before we attempt to validate the asm string.

But it is going to be emitted for b() if b() is really used on the host or on the device. For a() the warning is going to be emitted only if it is really used on device, otherwise it is not.
Instead, we can try to do what we did before: construct GCCAsmStmt object, just like you said. What option do you prefer?

tra added a comment.Feb 26 2019, 12:03 PM

E.g.:

namespace {
__host__ __device__ a() {
  int prev;
  __asm__ __volatile__("mov %0, 0" : "=a" (prev)::);
  return prev;
}

__host__ __device__ b() {
  int prev;
  return prev;
}

} //namespace

Ideally we should always emit uninitialized diagnostics for b, but never for a in both host and device compilation modes.
I think we may want to propagate assignment from the inline asm statement -- we may not know the meaning of the constraint, but we do know which argument gets used/modified by the asm statement. Perhaps we can construct a fake GCCAsmStmt but bail out before we attempt to validate the asm string.

But it is going to be emitted for b() if b() is really used on the host or on the device.

Clang also emits the uninitialized warnings for b when it is not used -- as in the example above.
I'm OK with that as b is a valid function on both sides.

Suppressing uninitialized warning in this case would be wrong, IMO -- that would diverge from what clang would do if b didn't have __host__ __device__ attributes.

For a() the warning is going to be emitted only if it is really used on device, otherwise it is not.

Instead, we can try to do what we did before: construct GCCAsmStmt object, just like you said. What option do you prefer?

I think creating a GCCAsmStmt() is the right way to deal with this as it gives compiler the correct (well, as correct as we can at that point) info about the code, as opposed to giving compiler broken pieces and trying to suppress the fallout.

In D58463#1411086, @tra wrote:

E.g.:

namespace {
__host__ __device__ a() {
  int prev;
  __asm__ __volatile__("mov %0, 0" : "=a" (prev)::);
  return prev;
}

__host__ __device__ b() {
  int prev;
  return prev;
}

} //namespace

Ideally we should always emit uninitialized diagnostics for b, but never for a in both host and device compilation modes.
I think we may want to propagate assignment from the inline asm statement -- we may not know the meaning of the constraint, but we do know which argument gets used/modified by the asm statement. Perhaps we can construct a fake GCCAsmStmt but bail out before we attempt to validate the asm string.

But it is going to be emitted for b() if b() is really used on the host or on the device.

Clang also emits the uninitialized warnings for b when it is not used -- as in the example above.
I'm OK with that as b is a valid function on both sides.

Suppressing uninitialized warning in this case would be wrong, IMO -- that would diverge from what clang would do if b didn't have __host__ __device__ attributes.

For a() the warning is going to be emitted only if it is really used on device, otherwise it is not.

Instead, we can try to do what we did before: construct GCCAsmStmt object, just like you said. What option do you prefer?

I think creating a GCCAsmStmt() is the right way to deal with this as it gives compiler the correct (well, as correct as we can at that point) info about the code, as opposed to giving compiler broken pieces and trying to suppress the fallout.

Ok, will prepare a fix shortly.