This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] IR/Bitcode changes for DISubprogram flags
ClosedPublic

Authored by probinson on Nov 20 2018, 8:00 AM.

Details

Summary

Packing the flags into one bitcode word will simplify adding new flags in the future.

Diff Detail

Event Timeline

probinson created this revision.Nov 20 2018, 8:00 AM

I'm assuming we already have plenty of DISubprograms with flags in the old format as .bc tests that cover the bitcode upgrade?

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
1644–1645

Am I right in assuming that these dead fields take up virtually no extra space in the bitcode? Otherwise we might want to add an extra bit distinguish old vs. a newer, more compact record layout.

llvm/lib/IR/DebugInfoMetadata.cpp
554

nit: comment should go on the line before.

probinson marked an inline comment as done.Nov 26 2018, 1:08 PM

I'm assuming we already have plenty of DISubprograms with flags in the old format as .bc tests that cover the bitcode upgrade?

That was my thinking, but perhaps it would do no harm to add another specifically for this. Basically it would be a pre-patch Assembler/disubprogram.ll run through a pre-patch llvm-as which the test would run through llvm-dis and verify the flags came out right.

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
1644–1645

Each field in Record is 64-bit, so as written this is wasting 24 bytes per DISubprogram in the bitcode. Compressing those out would mean the reader would pretty much have to handle the old/new formats separately, as the indexes of all fields after this one would vary too much to handle cleanly any other way.
I'm happy to make that change if you think the size is worthwhile.

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
1644–1645

I don't /think/ that's how the bitcode format works, though, right? I believe it's a more compact encoding than using all 64 bits regardless of the value encoded... but I don't know the specifics/not very sure about it.

dexonsmith added inline comments.Nov 26 2018, 1:36 PM
llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
1644–1645

@dblaikie should be right. IIRC the default (in absence of an explicit abbreviation) is vbr-6, so that's 6 bits times three for 18 bits.

That said, I'm not convinced this would be hard to deal with in the reader. There's probably a way to reorganize the logic to avoid forking.

probinson updated this revision to Diff 175581.Nov 27 2018, 2:24 PM

Don't emit unused fields to the bitcode. Accommodate the new variant in the reader.

This does not address all comments, but I wanted to show @dexonsmith the change for bitcode format that he asked for.

aprantl accepted this revision.Nov 28 2018, 8:56 AM

I think it would be nice to have one small explicit "DISubprogram version 4" .bc upgrade test, otherwise this LGTM now. Thanks!

This revision is now accepted and ready to land.Nov 28 2018, 8:56 AM

I think it would be nice to have one small explicit "DISubprogram version 4" .bc upgrade test, otherwise this LGTM now. Thanks!

In plan. Also pls look at Clang test updates in D54756.

I think it would be nice to have one small explicit "DISubprogram version 4" .bc upgrade test, otherwise this LGTM now. Thanks!

In plan. Also pls look at Clang test updates in D54756.

Nice!

Nice!

Oops, I had meant to respond to this one:

This does not address all comments, but I wanted to show @dexonsmith the change for bitcode format that he asked for.

This revision was automatically updated to reflect the committed changes.

While I was updating some internal testcases I wondered: why is DIFlagPrototype not a DISPFlag? Does this flag make sense on something other than functions?

While I was updating some internal testcases I wondered: why is DIFlagPrototype not a DISPFlag? Does this flag make sense on something other than functions?

Because I was not interested in moving existing flags from the old word to the new word in the first round. There are others that probably apply only to subprograms:
NoReturn, MainSubprogram; probably Thunk, All CallsDescribed; maybe Trivial, Explicit.
Really we should do some kind of audit, and do a bulk move in one go so we can minimize the bitcode-upgrade pain.

Nice!

Thanks--by clever choice of where to place the new field, it was possible to minimize the variants. I'm not super happy about having to track two different offsets, but it's not much worse than the old code.

While I was updating some internal testcases I wondered: why is DIFlagPrototype not a DISPFlag? Does this flag make sense on something other than functions?

Because I was not interested in moving existing flags from the old word to the new word in the first round. There are others that probably apply only to subprograms:
NoReturn, MainSubprogram; probably Thunk, All CallsDescribed; maybe Trivial, Explicit.
Really we should do some kind of audit, and do a bulk move in one go so we can minimize the bitcode-upgrade pain.

One additional problem is that the existing flags are exposed as part of the public API, i.e. to front-ends, so we'll need to work out how to avoid Bad Stuff(tm) if we repurpose any existing flag bits.

While I was updating some internal testcases I wondered: why is DIFlagPrototype not a DISPFlag? Does this flag make sense on something other than functions?

Because I was not interested in moving existing flags from the old word to the new word in the first round. There are others that probably apply only to subprograms:
NoReturn, MainSubprogram; probably Thunk, All CallsDescribed; maybe Trivial, Explicit.
Really we should do some kind of audit, and do a bulk move in one go so we can minimize the bitcode-upgrade pain.

One additional problem is that the existing flags are exposed as part of the public API, i.e. to front-ends, so we'll need to work out how to avoid Bad Stuff(tm) if we repurpose any existing flag bits.

Another option here is not to merge them at all in bitcode (I didn't consider this before). You can add an abbreviation for a fixed-width field of 1 bit each:
https://llvm.org/docs/BitCodeFormat.html#fixed-width-value

While I was updating some internal testcases I wondered: why is DIFlagPrototype not a DISPFlag? Does this flag make sense on something other than functions?

Because I was not interested in moving existing flags from the old word to the new word in the first round. There are others that probably apply only to subprograms:
NoReturn, MainSubprogram; probably Thunk, All CallsDescribed; maybe Trivial, Explicit.
Really we should do some kind of audit, and do a bulk move in one go so we can minimize the bitcode-upgrade pain.

One additional problem is that the existing flags are exposed as part of the public API, i.e. to front-ends, so we'll need to work out how to avoid Bad Stuff(tm) if we repurpose any existing flag bits.

Another option here is not to merge them at all in bitcode (I didn't consider this before). You can add an abbreviation for a fixed-width field of 1 bit each:
https://llvm.org/docs/BitCodeFormat.html#fixed-width-value

Wouldn't that cause every new flag to be a bitcode format change? The value of having a flags word is that new flags in the same word don't affect the bitcode format.