Add definitions for the KernelLoaderSpec and MultiKernelLoaderSpec
classes to StreamExecutor. Instances of these classes are generated by the
compiler in order to provide host code with a handle to device code.
Details
Diff Detail
Event Timeline
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
12 | Nit: "A Foo object" or "Foo is a class". | |
13 | suggest s/single/particular/ | |
16 | Oh, hm, I see why you said "A Foo class". I think maybe you want to say "FileLoaderSpec is the base class for types that know where to find the code." | |
29 | I think this would be much easier to read if it were in second person. As-is the subject of the first phrase is MultiKernelLoaderSpec, which isn't right. | |
43 | Would it be helpful to see how this function is implemented? | |
51 | UserKernelLoaderSpec()? | |
52 | This won't compile because MultiKernelLoaderSpec doesn't have this function. | |
92 | I'm not sure we need this comment; it's pretty clear. | |
99 | operator= usually returns the KernelLoaderSpec&. I think it will work as you've written it here [1], but maybe changing it will let others avoid having to look it up. [1] http://en.cppreference.com/w/cpp/language/copy_assignment | |
102 | Maybe "resides in memory as a null-terminated string." | |
103 | Should this have "Spec" in the name? | |
107 | A tuple as opposed to std::pair? pair also defines operator<, and seems simpler. | |
122 | I don't really understand this -- we're only giving one PTX string, so of course it's going to be the default? | |
126 | Since PTX source code usually (always?) has a directive at the beginning specifying the compute capability, it doesn't seem unreasonable for someone to assume that if they call this constructor, we will infer the CC from the given string. It also seems weird that this constructor is the only way to specify a minimum-CC PTX string. I wonder if we we need this constructor at all? If we had a public way to get the minimum CC, then we could just have the one constructor? | |
135 | I wonder if it would make more sense to put the name first. | |
135 | Using an initializer_list as opposed to, say, a vector, seems kind of limiting, since you can't dynamically construct an initializer_list. (llvm::ArrayRef is really what we want, but I don't know if you want to force the compiler to pull that in.) | |
150 | Since PTX is backwards-compatible, we want to return the PTX with the largest version <= the given compute-capability, right? Not clear from the comment if that's what we're doing. | |
155 | Can we write this as (cc_major, cc_minor) and (2, 0), etc? | |
158 | Not sure this comment is necessary here. It's kind of obvious, but if necessary maybe we can place it next to the code that uses this assumption? | |
178 | Do you think it would make sense to take the size of the cubin data? I guess we don't ever need it so it doesn't matter. Maybe I was mislead by the comments saying that we don't make a copy -- clearly we can't if we don't even know the length of the thing! | |
188 | It seems kind of inconsistent that the PTX spec encompasses multiple PTX strings compiled for different archs, while the cubin one doesn't. Unless you mean s/cubin/fatbin/? AIUI a cubin encompasses SASS code for a particular sm, while a fatbin is a combination of one or more cubins and ptx strings. | |
190 | If this is null-terminated, maybe we should say that? | |
205 | It looks like we do make a copy? That seems inconsistent though. | |
223 | It seems like we could at least assert() that they have the same name? | |
264 | extern "C" doesn't need a scope: extern "C" void foo() so maybe this is better said as 'if it is not declared extern "C".' | |
284 | Same comments as on the constructors for those types. | |
294 | Not sure these comments are necessary; seems obvious. | |
streamexecutor/lib/KernelSpec.cpp | ||
28 | Seems obvious from the code? | |
72 | For chaining purposes it's more common to return *this (see e.g. operator=). |
- Responding to comments from jlebar
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
29 | I didn't switch to second person, but I do think I cleaned up the language to make it much more understandable. | |
43 | Oops, I shouldn't have put the parens here. This is supposed to be instantiating a MultiKernelLoaderSpec called UserKernelLoaderSpec. | |
51 | I think the confusion here is the same as the point above. UserKernelLoaderSpec is a class instance, not a function to create a class instance. | |
52 | I'm not sure what's causing the confusion here. It looks to me like I did declare MultiKernelLoaderSpec::addCUDACubinInMemory. Is that the function you are talking about? | |
122 | Yes, this whole method doesn't really make sense. I removed it. I also removed the method for returning the "default" code. We can add that in later if we have a use case, but it's not needed for now. | |
126 | We could add code to make this interface safer at the expense of more complexity, but I think the original design is the right trade-off because this is only meant to be called by the compiler. | |
150 | I added to the comment to explain that there is no fallback logic in this function. If the exact compute capability is not found, it returns an error value. If we need that other functionality, we can add in another method later, but this is the only functionality that is present right now. | |
158 | I removed the comment because the code that used it is gone now. | |
178 | I don't think it would be good to take the size because we would never use it. I'm not sure what's best to put in the comment to reduce confusion. If the user thought it was a null-terminated string (which it isn't, but they might think it was), then they might be imagining that a copy is made even though no length is provided. | |
188 | Good catch! What this code has been calling "cubin" is really "fatbin". That's how it is treated when it is loaded. So I changed the names to fatbin. Now they are accurate. | |
205 | Oops, thanks, I missed that. I removed the copy to make it consistent with the others. |
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
29 | lgtm | |
52 | I have no idea what that comment was about, sorry. I think I was just very confused by that most vexing parse. | |
106 | Not sure we need this line anymore. | |
120 | Here and elsewhere, I think we should change "holds a reference to". For example, if I have a std::shared_ptr<Foo>, I "hold a (strong) reference to Foo", which is the opposite of what you mean. Maybe just "Does not take ownership of the PTXCode pointers in SpecList's elements." or something would make it clear what we're going after. | |
126 |
sgtm | |
144 | It's not a null-terminated string, I think. I thought it was at first, but I think I was being mislead by "char" -- it's more likely that the first 4 bytes are the length (or something). | |
178 | Hm...what if we used char* for text strings and uint8_t* for byte data? | |
streamexecutor/lib/unittests/KernelSpecTest.cpp | ||
26 | ASSERT has an implicit "return", so we usually call EXPECT unless the test is going to crash if the condition fails. | |
74 | EXPECT_DEBUG_DEATH and you don't need the ifdefs. Same below. |
- Respond to jlebar's comments, round 2
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
144 | Oops. I just wrote that comment in the wrong place. You're right that the char types are making things confusing. | |
178 | Yes, that is much better. How do you feel about void* rather than uint8_t? I think that would be good because the data is a formless blob. | |
streamexecutor/lib/unittests/KernelSpecTest.cpp | ||
74 | Ooooooo, that's nice. Thanks! |
Nit: "A Foo object" or "Foo is a class".