This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Don't crash when trying to printf a non-scalar object.
ClosedPublic

Authored by jlebar on Feb 10 2016, 2:40 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

jlebar updated this revision to Diff 47533.Feb 10 2016, 2:40 PM
jlebar retitled this revision from to [CUDA] Don't crash when trying to printf a non-scalar object..
jlebar updated this object.
jlebar added reviewers: majnemer, rnk.
jlebar added subscribers: tra, jhen, cfe-commits.
tra added a comment.Feb 10 2016, 3:00 PM

Erasing an argument would only complicate the problem.
I guess for consistency we need to match clang's behavior for regular C++ code.
For optimized builds it just seems to pass NULL pointer instead.

Yeah, I have no idea what's the right thing to do here. We can always pass a null pointer, that's easy. David, Reid, do you know what is the correct behavior?

Yeah, I have no idea what's the right thing to do here. We can always pass a null pointer, that's easy. David, Reid, do you know what is the correct behavior?

I think we need to diagnose / reject this during semantic analysis (and then put a reasonable assert in the backend).

Yeah, I have no idea what's the right thing to do here. We can always pass a null pointer, that's easy. David, Reid, do you know what is the correct behavior?

I think we need to diagnose / reject this during semantic analysis (and then put a reasonable assert in the backend).

Two things.

a) That doesn't seem to be what we do in regular C++. It will happily let you pass a Struct in with only a warning.
b) At the moment, we don't have the capability to do a proper semantic analysis of this. The issue is, when doing sema checking of host device functions, we don't know whether the function will end up being codegen'ed for device. And the semantics of cuda are that it's OK to do things that are illegal in device mode from host device functions, so long as you never codegen those functions for the device.

We have a plan to address (b) (basically, when doing sema checking, buffer any errors we would emit if we were to codegen for device; then we can emit all those errors right before codegen), but it's a much bigger thing. Until then, we need to do *something* other than crash here, even if we add additional sema checking for plain device fns.

rnk edited edge metadata.Feb 10 2016, 3:34 PM

Ultimately, Sema should be responsible for rejecting this, correct? In the meantime we can have CodeGen reject this and emit a null value to avoid crashing.

lib/CodeGen/CGCUDABuiltin.cpp
105 ↗(On Diff #47533)

I assume this is what's asserting. Probably this code should do something like:

if (Args[I].RV.isScalar()) {
  Arg = Args[I].RV.getScalarVal();
} else {
  ErrorUnsupported(E, "non-scalar variadic argument");
  Arg = CGM.getNullValue(...);
}

Yeah, I have no idea what's the right thing to do here. We can always pass a null pointer, that's easy. David, Reid, do you know what is the correct behavior?

I think we need to diagnose / reject this during semantic analysis (and then put a reasonable assert in the backend).

Two things.

a) That doesn't seem to be what we do in regular C++. It will happily let you pass a Struct in with only a warning.

Yes, but it also can be legally lowered and does not crash.

b) At the moment, we don't have the capability to do a proper semantic analysis of this. The issue is, when doing sema checking of host device functions, we don't know whether the function will end up being codegen'ed for device. And the semantics of cuda are that it's OK to do things that are illegal in device mode from host device functions, so long as you never codegen those functions for the device.

We have a plan to address (b) (basically, when doing sema checking, buffer any errors we would emit if we were to codegen for device; then we can emit all those errors right before codegen), but it's a much bigger thing. Until then, we need to do *something* other than crash here, even if we add additional sema checking for plain device fns.

Interesting dilemma. In the mean time, you can call CGM.ErrorUnsupported instead of removing arguments.

Ultimately, Sema should be responsible for rejecting this, correct?

I guess this is the part I'm unsure of. If it's legal to pass a struct to printf in regular C++ (seems to be?), I'd guess it should be legal in CUDA, too? I'm just not sure what it's supposed to do (in either case).

Ultimately, Sema should be responsible for rejecting this, correct?

I guess this is the part I'm unsure of. If it's legal to pass a struct to printf in regular C++ (seems to be?), I'd guess it should be legal in CUDA, too? I'm just not sure what it's supposed to do (in either case).

Is this because PTX does not have a way to represent va_arg structs?

I guess this is the part I'm unsure of. If it's legal to pass a struct to printf in regular C++ (seems to be?), I'd guess it should be legal in CUDA, too? I'm just not sure what it's supposed to do (in either case).

Is this because PTX does not have a way to represent va_arg structs?

We do build up something that looks an awful lot like a va_arg struct in this function. (It's a struct with N members, one for each of the varargs.) Exactly what printf expects is not particularly carefully specified in the nvvm documentation.

If an arg to printf is non-scalar, we could pass the whole thing into the struct we build here, but that doesn't seem to be what regular C++ does (it seems to take the first 64 bits of the struct -- I have no idea if this is specified somewhere or just UB).

rnk added inline comments.Feb 10 2016, 4:09 PM
lib/CodeGen/CGCUDABuiltin.cpp
105 ↗(On Diff #47533)

Under the assumption that the implementation of vprintf expects an old-school va_list byte array, then it's probably easier to implement this behavior rather than try to diagnose it. All you have to do is memcpy the bytes of the aggregate.

You also might want to handle _Complex values.

I guess this is the part I'm unsure of. If it's legal to pass a struct to printf in regular C++ (seems to be?), I'd guess it should be legal in CUDA, too? I'm just not sure what it's supposed to do (in either case).

Is this because PTX does not have a way to represent va_arg structs?

We do build up something that looks an awful lot like a va_arg struct in this function. (It's a struct with N members, one for each of the varargs.) Exactly what printf expects is not particularly carefully specified in the nvvm documentation.

If an arg to printf is non-scalar, we could pass the whole thing into the struct we build here, but that doesn't seem to be what regular C++ does (it seems to take the first 64 bits of the struct -- I have no idea if this is specified somewhere or just UB).

It takes the first 64 bits of the struct in your example because the struct is only 64 bits in size (two 32-bit ints). If you're example was:

$ cat /tmp/p.cpp 
#include <stdio.h>

struct Struct {
  int x;
  int y;
  int z;
  int w;
};

void PrintfNonScalar() {
  Struct S = { 1, 2, 3, 4 };
  printf("%d", S);
}

then you'd get:

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@.str = private unnamed_addr constant [3 x i8] c"%d\00", align 1

; Function Attrs: nounwind uwtable
define void @_Z15PrintfNonScalarv() #0 {
  %1 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i64 8589934593, i64 17179869187)
  ret void
}

and so on. The target ABI code decides how to handle this (by coercing the types to a series of ints in this case).

If you were to do this on ppc64, for example, the target ABI code there does a slightly different thing:

target datalayout = "E-m:e-i64:64-n32:64"
target triple = "powerpc64-unknown-linux-gnu"

@.str = private unnamed_addr constant [3 x i8] c"%d\00", align 1

; Function Attrs: nounwind
define void @_Z15PrintfNonScalarv() #0 {
  %1 = tail call signext i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), [2 x i64] [i64 4294967298, i64 12884901892])
  ret void
}

it looks like maybe you just need some more sophisticated code in NVPTXABIInfo in lib/CodeGen/TargetInfo.cpp to produce something the backend will accept?

OK, talked to Reid irl. Since this is just printf, not general varargs handling, the Simplest Thing That Could Possibly Work is to error-unsupported. Once we fix sema as described above, we can move the check there. Will update the patch, thanks everyone.

jlebar updated this revision to Diff 47569.Feb 10 2016, 5:39 PM
jlebar edited edge metadata.

Error out with CGM.ErrorUnsupported when we receive a non-scalar arg.

rnk accepted this revision.Feb 10 2016, 5:57 PM
rnk edited edge metadata.

lgtm

lib/CodeGen/CGCUDABuiltin.cpp
90 ↗(On Diff #47569)

Doesn't printf return int? Maybe return RValue::get(llvm::ConstantInt::get(IntTy, 0))?

This revision is now accepted and ready to land.Feb 10 2016, 5:57 PM
This revision was automatically updated to reflect the committed changes.
jlebar marked an inline comment as done.