If the target does not support the .asciz and .ascii directives,
strings are emitted as raw bytes, with each byte placed on its own line
as a separate byte directive: .b8 <data>. The NVPTX target can also
represent a sequence of values of the same type as a single list, with
the values separated by commas: .b8 <data1>,<data2>,.... Emitting the
bytes this way reduces the size of the final PTX file. Because ptxas
embeds the PTX file in the resulting binary object, keeping the PTX
file small is important.
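To make the difference between the two output forms concrete, here is a minimal, self-contained sketch. It is not the code from this patch and does not use the LLVM API: `renderBytes`, its parameters, and the use of `std::ostringstream` in place of the streamer's output are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM function): renders `data` as byte
// directives. With a null `separator`, every byte gets its own directive
// line. Otherwise all bytes share one directive and are joined by `separator`.
std::string renderBytes(const std::string &data, const char *directive,
                        const char *separator) {
  assert(directive && "a byte directive is required");
  std::ostringstream os;
  if (data.empty())
    return os.str();

  if (separator) {
    // List form: ".b8 <data1>,<data2>,..." on a single line.
    os << directive;
    for (std::size_t i = 0; i + 1 < data.size(); ++i)
      os << static_cast<unsigned>(static_cast<unsigned char>(data[i]))
         << separator;
    os << static_cast<unsigned>(static_cast<unsigned char>(data.back()))
       << '\n';
  } else {
    // Per-byte form: one ".b8 <data>" line for each byte.
    for (unsigned char c : data)
      os << directive << static_cast<unsigned>(c) << '\n';
  }
  return os.str();
}
```

For the three bytes of "abc", the per-byte form produces three `.b8` lines, while the list form produces the single line `.b8 97,98,99`. Each value is cast to `unsigned` so that it is printed as a number rather than as a raw character.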
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
860–869 | That's a bit too convoluted, IMO. I think two separate loops would be easier to read: if (DirectiveSeparator) { for(const auto C : Data.drop_back(1).bytes()) OS << C << DirectiveSeparator; OS << Data.back(); EmitEOL(); } else { for (const unsigned char C : Data.bytes()) { OS << Directive << (unsigned)C; EmitEOL(); } } |
test/DebugInfo/NVPTX/cu-range-hole.ll | ||
---|---|---|
151–152 | I wonder whether ptxas has a limit on the line length. |
test/DebugInfo/NVPTX/cu-range-hole.ll | ||
---|---|---|
151–152 | Probably: it seems to me that it uses flex/bison to parse PTX files, and I have already run into a buffer overflow problem with the debug info. However, I compiled this test with ptxas and it compiled correctly. |
test/DebugInfo/NVPTX/cu-range-hole.ll | ||
---|---|---|
151–152 | Perhaps we should split the input into reasonably-sized chunks then. |
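One way the chunking idea could look is sketched below. This is an illustration only: `renderBytesChunked` and the `maxPerLine` parameter are hypothetical names, and the thread does not establish what limit, if any, ptxas actually enforces, so the chunk size is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM function): emits the bytes of `data` as
// separator-joined lists, starting a new directive line after `maxPerLine`
// values so that no single line grows without bound.
std::string renderBytesChunked(const std::string &data, const char *directive,
                               const char *separator, std::size_t maxPerLine) {
  assert(directive && separator && maxPerLine > 0);
  std::ostringstream os;
  for (std::size_t start = 0; start < data.size(); start += maxPerLine) {
    const std::size_t end = std::min(start + maxPerLine, data.size());
    os << directive;
    for (std::size_t i = start; i < end; ++i) {
      if (i != start)
        os << separator;
      os << static_cast<unsigned>(static_cast<unsigned char>(data[i]));
    }
    os << '\n';
  }
  return os.str();
}
```

With `maxPerLine` set to 2, the five bytes of "abcde" are split across three directive lines, the last of which holds the single remaining value. Most of the size reduction is preserved as long as the chunk size is reasonably large.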
lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
861–869 | I'd prefer all of this (and basically the function) be sunk down into a target function for EmitBytes. |
lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
861–869 | It is not possible at the moment. We have the same problem for all the overloaded functions in MCTargetStreamer. The main problem is that not all targets define their own target streamer and sometimes getTargetStreamer() may return nullptr. But we still need some default behavior here. |
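The constraint described here can be illustrated with a small stand-alone sketch: the generic streamer delegates to a target hook when one exists and falls back to a default otherwise. The types and names below (`TargetStreamerSketch`, `emitBytesHook`, `ListTargetStreamer`, and the `.byte` default directive) are hypothetical stand-ins, not the real MCTargetStreamer interface or the code in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a target streamer interface.
struct TargetStreamerSketch {
  virtual ~TargetStreamerSketch() = default;
  virtual void emitBytesHook(std::ostream &os, const std::string &data) = 0;
};

// Hypothetical target that prefers a single comma-separated .b8 list.
struct ListTargetStreamer : TargetStreamerSketch {
  void emitBytesHook(std::ostream &os, const std::string &data) override {
    assert(!data.empty() && "the caller filters out empty data");
    os << ".b8 ";
    for (std::size_t i = 0; i < data.size(); ++i) {
      if (i != 0)
        os << ',';
      os << static_cast<unsigned>(static_cast<unsigned char>(data[i]));
    }
    os << '\n';
  }
};

// Generic emitter: `ts` may be null because not every target defines a
// target streamer, so a default per-byte path must remain available.
std::string emitBytes(TargetStreamerSketch *ts, const std::string &data) {
  std::ostringstream os;
  if (data.empty())
    return os.str();
  if (ts) {
    ts->emitBytesHook(os, data);
    return os.str();
  }
  for (unsigned char c : data)
    os << ".byte " << static_cast<unsigned>(c) << '\n';
  return os.str();
}
```

Passing a null pointer exercises the default path, and passing a `ListTargetStreamer` exercises the target-specific one. The null check is the part that cannot be dropped while some targets have no target streamer.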