This introduces the kernel environment, which contains information passed
by the compiler to a GPU kernel. For now it mostly encapsulates the
ident_t object and the execution configuration, that is, information we
previously passed explicitly. We will add more content later, including
mutable content similar to the debug indentation level.
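A minimal sketch of the idea, not the layout used by the patch: the type and field names below (`IdentTy`, `ConfigurationTy`, `KernelEnvironmentTy`, `initExplicit`, `initWithEnvironment`) are hypothetical and only illustrate how values that were separate explicit arguments can travel in one object.

```cpp
#include <cassert>

// Hypothetical stand-in for the runtime's ident_t source-location object.
struct IdentTy {
  const char *SourceLocation;
};

// Hypothetical execution configuration; field names are illustrative only.
struct ConfigurationTy {
  bool IsSPMD;
  bool UseGenericStateMachine;
};

// The kernel environment bundles what used to be separate arguments.
struct KernelEnvironmentTy {
  ConfigurationTy Configuration;
  const IdentTy *Ident;
};

// Before: every piece of information is an explicit parameter.
// Returns 0 for SPMD, 1 for generic mode with the generic state machine,
// and 2 for generic mode without it.
int initExplicit(const IdentTy *Ident, bool IsSPMD,
                 bool UseGenericStateMachine) {
  assert(Ident && "expected a source location");
  if (IsSPMD)
    return 0;
  return UseGenericStateMachine ? 1 : 2;
}

// After: the same information arrives in a single object.
int initWithEnvironment(const KernelEnvironmentTy &Env) {
  return initExplicit(Env.Ident, Env.Configuration.IsSPMD,
                      Env.Configuration.UseGenericStateMachine);
}
```

Adding a field later (for example mutable state) then changes only the struct, not every call site.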
Details
- Reviewers
jhuber6 tianshilei1992 JonChesterfield
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
openmp/libomptarget/DeviceRTL/src/Debug.cpp, line 76
Comment Actions
It sounds like such info will be passed from host to the device once per kernel and the performance impact is negligible. Right?
Comment Actions
The information is baked directly into the device image (in the form of globals).
The transfer happens when the image is loaded: basically a few extra bytes per kernel, and most of this already existed as separate globals anyway.
There is no kernel start cost to speak of, at least I don't expect any.
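To illustrate the point about globals, here is a host-side sketch under stated assumptions: `KernelEnvironmentTy`, `FooKernelEnvironment`, and `fooKernelRunsSPMD` are hypothetical names, and plain C++ stands in for device code. The environment is a constant-initialized global that ships with the image, so the kernel reads it directly and nothing extra is passed at launch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout; the real structure may differ.
struct KernelEnvironmentTy {
  uint8_t IsSPMD;
  uint8_t UseGenericStateMachine;
  const char *SourceLocation;
};

// One constant-initialized global per kernel, as a compiler might emit it.
// It is part of the image's data, so it is transferred when the image loads.
constexpr KernelEnvironmentTy FooKernelEnvironment = {1, 0,
                                                      ";foo.cpp;foo;1;1;;"};

// The footprint is small: two flags plus one pointer, including padding.
static_assert(sizeof(KernelEnvironmentTy) <= 2 * sizeof(void *),
              "expected only a few bytes per kernel");

// Stand-in for the kernel body: it reads the global directly, so no
// per-launch argument or host-to-device copy is involved.
inline bool fooKernelRunsSPMD() {
  assert(FooKernelEnvironment.SourceLocation != nullptr);
  return FooKernelEnvironment.IsSPMD != 0;
}
```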