This is an archive of the discontinued LLVM Phabricator instance.

[DEBUGINFO, NVPTX] Try to pack bytes data into a single string.

Authored by ABataev on Apr 19 2018, 8:59 AM.



If the target does not support .asciz and .ascii directives, the
strings are represented as bytes and each byte is placed on the new line
as a separate byte directive .b8 <data>. NVPTX target allows to
represent the vector of the data of the same type as a vector, where
values are separated using , symbol: .b8 <data1>,<data2>,.... This
allows to reduce the size of the final PTX file. Ptxas tool includes ptx
files into the resulting binary object, so reducing the size of the PTX
file is important.

Diff Detail


Event Timeline

ABataev created this revision.Apr 19 2018, 8:59 AM
tra added inline comments.Apr 19 2018, 9:55 AM
819–838 ↗(On Diff #143105)

That's a bit too convoluted, IMO. I think two separate loops would be easier to read:

if (DirectiveSeparator) {
  for(const auto C : Data.drop_back(1).bytes())
     OS << C << DirectiveSeparator;
  OS << Data.back();
} else {
   for (const unsigned char C : Data.bytes()) {
     OS << Directive << (unsigned)C;
ABataev updated this revision to Diff 143129.Apr 19 2018, 11:02 AM

Address comments from @tra

tra accepted this revision.Apr 19 2018, 11:13 AM
tra added inline comments.
149 ↗(On Diff #143129)

I wonder whether ptxas has a limit on the line length.

This revision is now accepted and ready to land.Apr 19 2018, 11:13 AM
ABataev added inline comments.Apr 19 2018, 11:20 AM
149 ↗(On Diff #143129)

Probably, as (it seems to me) it uses flex/bison to parse the PTX files and I already ran into buffer overflow problem with the debug info. But I tried to compile this test with the ptxas and it was compiled correctly

tra added inline comments.Apr 19 2018, 11:24 AM
149 ↗(On Diff #143129)

Perhaps we should split the input into reasonably-sized chunks then.
I'd look at what nvcc generates and if they do split large inputs, we should probably do so, too.

ABataev updated this revision to Diff 157695.Jul 27 2018, 7:51 AM

Added maximal length of the merged directives.

echristo added inline comments.Jul 30 2018, 3:05 PM
862–883 ↗(On Diff #157695)

I'd prefer all of this (and basically the function) be sunk down into a target function for EmitBytes.

ABataev updated this revision to Diff 158750.Aug 2 2018, 7:18 AM

Reworked according to Eric's comments.

Some inline comments.

862–883 ↗(On Diff #157695)

This now has duplicated code between here and emitRawBytes in the MCStreamer. Can you unify this please?

101 ↗(On Diff #158750)

Can you add some more parens for readability please?

ABataev marked an inline comment as done.Sep 5 2018, 10:30 AM
ABataev added inline comments.
862–883 ↗(On Diff #157695)

It is not possible at the moment. We have the same problem for all the overloaded functions in MCTargetStreamer. The main problem is that not all targets define their own target streamer and sometimes getTargetStreamer() may return nullptr. But we still need some default behavior here.

ABataev updated this revision to Diff 164080.Sep 5 2018, 11:03 AM

Added some more parens in expression.

echristo accepted this revision.Oct 18 2018, 6:11 PM


This revision was automatically updated to reflect the committed changes.